Sample records for analysis identifies genes

  1. A Strategy for Identifying Quantitative Trait Genes Using Gene Expression Analysis and Causal Analysis.

    PubMed

    Ishikawa, Akira

    2017-11-27

    Large numbers of quantitative trait loci (QTL) affecting complex diseases and other quantitative traits have been reported in humans and model animals. However, the genetic architecture of these traits remains elusive due to the difficulty in identifying causal quantitative trait genes (QTGs) for common QTL with relatively small phenotypic effects. A traditional strategy based on techniques such as positional cloning does not always enable identification of a single candidate gene for a QTL of interest because it is difficult to narrow down a target genomic interval of the QTL to a very small interval harboring only one gene. A combination of gene expression analysis and statistical causal analysis can greatly reduce the number of candidate genes. This integrated approach provides causal evidence that one of the candidate genes is a putative QTG for the QTL. Using this approach, I have recently succeeded in identifying a single putative QTG for resistance to obesity in mice. Here, I outline the integration approach and discuss its usefulness using my studies as an example.

  2. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    PubMed

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA)-related genes based on microarray data using the weighted gene co-expression network analysis (WGCNA) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance, defined as the average gene significance of all genes in a module, was used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subjected to functional annotation and pathway enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
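
    The module significance described in this abstract (the average gene significance over the genes in a module) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes gene significance is the absolute Pearson correlation between a gene's expression and disease status, which is one common choice, and the gene names and expression values are hypothetical toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def gene_significance(expression, trait):
    """Assumed definition: |correlation| between expression and the trait."""
    return abs(pearson(expression, trait))


def module_significance(module_expression, trait):
    """Average gene significance over all genes in one module."""
    scores = [gene_significance(e, trait) for e in module_expression.values()]
    return sum(scores) / len(scores)


# Hypothetical toy data: 1 = RA sample, 0 = healthy control.
trait = [1, 1, 1, 0, 0, 0]
toy_module = {
    "geneA": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],  # higher in RA samples
    "geneB": [1.0, 2.0, 3.0, 5.0, 6.0, 7.0],  # lower in RA samples
}

ms = module_significance(toy_module, trait)
print(round(ms, 4))
```

    Taking the absolute value means that up- and down-regulated genes both contribute to a module's association with the trait.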

  3. Gene-based rare allele analysis identified a risk gene of Alzheimer's disease.

    PubMed

    Kim, Jong Hun; Song, Pamela; Lim, Hyunsun; Lee, Jae-Hyung; Lee, Jun Hong; Park, Sun Ah

    2014-01-01

    Alzheimer's disease (AD) has a strong propensity to run in families. However, the known risk genes excluding APOE are not clinically useful. In various complex diseases, gene studies have targeted rare alleles for unsolved heritability. Our study aims to elucidate previously unknown risk genes for AD by targeting rare alleles. We used data from five publicly available genetic studies from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the database of Genotypes and Phenotypes (dbGaP). A total of 4,171 cases and 9,358 controls were included. The genotype information of rare alleles was imputed using 1000 Genomes data. We performed gene-based analysis of rare alleles (minor allele frequency ≤ 3%). The genome-wide significance level was defined as meta P < 1.8×10⁻⁶ (0.05/number of genes in human genome = 0.05/28,517). ZNF628, which is located at chromosome 19q13.42, showed a genome-wide significant association with AD. The association of ZNF628 with AD was not dependent on APOE ε4. APOE and TREM2 were also significantly associated with AD, although not at genome-wide significance levels. Other genes identified by targeting common alleles could not be replicated in our gene-based rare allele analysis. We identified that rare variants in ZNF628 are associated with AD. The protein encoded by ZNF628 is known as a transcription factor. Furthermore, the associations of APOE and TREM2 with AD were highly significant, even in gene-based rare allele analysis, which implies that further deep sequencing of these genes is required in AD heritability studies.
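
    The genome-wide significance level quoted in this abstract is a Bonferroni-style correction: the nominal level 0.05 divided by the number of genes tested. A short sketch of the arithmetic, using only the two figures given in the abstract (the function name is illustrative):

```python
ALPHA = 0.05      # nominal significance level, from the abstract
N_GENES = 28_517  # number of genes in the human genome used by the authors


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_GENES)
print(f"{threshold:.2e}")  # 1.75e-06, reported as 1.8e-06 in the abstract
```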

  4. Gene expression meta-analysis identifies chromosomal regions and candidate genes involved in breast cancer metastasis.

    PubMed

    Thomassen, Mads; Tan, Qihua; Kruse, Torben A

    2009-01-01

    Breast cancer cells exhibit complex karyotypic alterations causing deregulation of numerous genes. Some of these genes are probably causal for cancer formation and local growth whereas others are causal for the various steps of metastasis. In a fraction of tumors deregulation of the same genes might be caused by epigenetic modulations, point mutations or the influence of other genes. We have investigated the relation of gene expression and chromosomal position, using eight datasets including more than 1200 breast tumors, to identify chromosomal regions and candidate genes possibly causal for breast cancer metastasis. By use of "Gene Set Enrichment Analysis" we have ranked chromosomal regions according to their relation to metastasis. Overrepresentation analysis identified regions with increased expression for chromosome 1q41-42, 8q24, 12q14, 16q22, 16q24, 17q12-21.2, 17q21-23, 17q25, 20q11, and 20q13 among metastasizing tumors and reduced gene expression at 1p31-21, 8p22-21, and 14q24. By analysis of genes with extremely imbalanced expression in these regions we identified DIRAS3 at 1p31, PSD3, LPL, EPHX2 at 8p21-22, and FOS at 14q24 as candidate metastasis suppressor genes. Potential metastasis-promoting genes include RECQL4 at 8q24, PRMT7 at 16q22, GINS2 at 16q24, and AURKA at 20q13.

  5. Systematic analysis of microarray datasets to identify Parkinson's disease‑associated pathways and genes.

    PubMed

    Feng, Yinling; Wang, Xuefeng

    2017-03-01

    In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.
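
    A hypergeometric distribution-based test of the kind mentioned in this abstract asks how likely it is that two gene sets share at least k genes by chance. The sketch below is a generic upper-tail calculation under stated assumptions, not the authors' exact procedure: the background size, pathway sizes, and overlap are hypothetical toy numbers.

```python
from math import comb


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when drawing n genes from N, of which K belong to a pathway.

    N: background (universe) size
    K: genes in the first pathway
    n: genes in the second pathway
    k: genes shared by both pathways
    """
    total = comb(N, n)
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical example: 20 background genes, two pathways of 5 genes
# each, sharing 3 genes.
p = hypergeom_upper_tail(k=3, N=20, K=5, n=5)
print(round(p, 4))
```

    A small p-value suggests the overlap is larger than expected by chance; in practice it would be corrected for the number of pathway pairs tested.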

  6. Haplotype Analysis in Multiple Crosses to Identify a QTL Gene

    PubMed Central

    Wang, Xiaosong; Korstanje, Ron; Higgins, David; Paigen, Beverly

    2004-01-01

    Identifying quantitative trait locus (QTL) genes is a challenging task. Herein, we report using a two-step process to identify Apoa2 as the gene underlying Hdlq5, a QTL for plasma high-density lipoprotein cholesterol (HDL) levels on mouse chromosome 1. First, we performed a sequence analysis of the Apoa2 coding region in 46 genetically diverse mouse strains and found five different APOA2 protein variants, which we named APOA2a to APOA2e. Second, we conducted a haplotype analysis of the strains in 21 crosses that have so far detected HDL QTLs; we found that Hdlq5 was detected only in the nine crosses where one parent had the APOA2b protein variant characterized by an Ala61-to-Val61 substitution. We then found that strains with the APOA2b variant had significantly higher (P ≤ 0.002) plasma HDL levels than those with either the APOA2a or the APOA2c variant. These findings support Apoa2 as the underlying Hdlq5 gene and suggest the Apoa2 polymorphisms responsible for the Hdlq5 phenotype. Therefore, haplotype analysis in multiple crosses can be used to support a candidate QTL gene. PMID:15310659

  7. Haplotype analysis in multiple crosses to identify a QTL gene.

    PubMed

    Wang, Xiaosong; Korstanje, Ron; Higgins, David; Paigen, Beverly

    2004-09-01

    Identifying quantitative trait locus (QTL) genes is a challenging task. Herein, we report using a two-step process to identify Apoa2 as the gene underlying Hdlq5, a QTL for plasma high-density lipoprotein cholesterol (HDL) levels on mouse chromosome 1. First, we performed a sequence analysis of the Apoa2 coding region in 46 genetically diverse mouse strains and found five different APOA2 protein variants, which we named APOA2a to APOA2e. Second, we conducted a haplotype analysis of the strains in 21 crosses that have so far detected HDL QTLs; we found that Hdlq5 was detected only in the nine crosses where one parent had the APOA2b protein variant characterized by an Ala61-to-Val61 substitution. We then found that strains with the APOA2b variant had significantly higher (P ≤ 0.002) plasma HDL levels than those with either the APOA2a or the APOA2c variant. These findings support Apoa2 as the underlying Hdlq5 gene and suggest the Apoa2 polymorphisms responsible for the Hdlq5 phenotype. Therefore, haplotype analysis in multiple crosses can be used to support a candidate QTL gene.

  8. Gene expression patterns combined with bioinformatics analysis identify genes associated with cholangiocarcinoma.

    PubMed

    Li, Chen; Shen, Weixing; Shen, Sheng; Ai, Zhilong

    2013-12-01

    To explore the molecular mechanisms of cholangiocarcinoma (CC), microarray technology was used to find biomarkers for early detection and diagnosis. The gene expression profiles from 6 patients with CC and 5 normal controls were downloaded from Gene Expression Omnibus and compared. As a result, 204 differentially co-expressed genes (DCGs) in CC patients compared to normal controls were identified using a computational bioinformatics analysis. These genes were mainly involved in coenzyme metabolic process, peptidase activity and oxidation reduction. A regulatory network was constructed by mapping the DCGs to known regulation data. Four transcription factors, FOXC1, ZIC2, NKX2-2 and GCGR, were hub nodes in the network. In conclusion, this study provides a set of targets useful for future investigations into molecular biomarker studies. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Integrative Analysis of GWASs, Human Protein Interaction, and Gene Expression Identified Gene Modules Associated With BMDs

    PubMed Central

    He, Hao; Zhang, Lei; Li, Jian; Wang, Yu-Ping; Zhang, Ji-Gang; Shen, Jie; Guo, Yan-Fang

    2014-01-01

    Context: To date, few systems genetics studies in the bone field have been performed. We designed our study from a systems-level perspective by integrating genome-wide association studies (GWASs), human protein-protein interaction (PPI) network, and gene expression to identify gene modules contributing to osteoporosis risk. Methods: First we searched for modules significantly enriched with bone mineral density (BMD)-associated genes in human PPI network by using 2 large meta-analysis GWAS datasets through a dense module search algorithm. One included 7 individual GWAS samples (Meta7). The other was from the Genetic Factors for Osteoporosis Consortium (GEFOS2). One was assigned as a discovery dataset and the other as an evaluation dataset, and vice versa. Results: In total, 42 modules and 129 modules were identified significantly in both Meta7 and GEFOS2 datasets for femoral neck and spine BMD, respectively. There were 3340 modules identified for hip BMD only in Meta7. As candidate modules, they were assessed for the biological relevance to BMD by gene set enrichment analysis in 2 expression profiles generated from circulating monocytes in subjects with low versus high BMD values. Interestingly, there were 2 modules significantly enriched in monocytes from the low BMD group in both gene expression datasets (nominal P value <.05). Two modules had 16 nonredundant genes. Functional enrichment analysis revealed that both modules were enriched for genes involved in Wnt receptor signaling and osteoblast differentiation. Conclusion: We highlighted 2 modules and novel genes playing important roles in the regulation of bone mass, providing important clues for therapeutic approaches for osteoporosis. PMID:25119315

  10. Gene expression profiles analysis identifies key genes for acute lung injury in patients with sepsis.

    PubMed

    Guo, Zhiqiang; Zhao, Chuncheng; Wang, Zheng

    2014-09-26

    To identify critical genes and biological pathways in acute lung injury (ALI), a comparative analysis of gene expression profiles of patients with ALI + sepsis and patients with sepsis alone was performed with bioinformatic tools. GSE10474 was downloaded from Gene Expression Omnibus, comprising 13 whole blood samples with ALI + sepsis and 21 whole blood samples with sepsis alone. After preprocessing with the robust multichip averaging (RMA) method, differential analysis was conducted using the simpleaffy package based on t-test and fold change. Hierarchical clustering was also performed using the hclust function from the stats package. In addition, functional enrichment analysis was conducted using iGepros. Moreover, the gene regulatory network was constructed with information from Kyoto Encyclopedia of Genes and Genomes (KEGG) and then visualized by Cytoscape. A total of 128 differentially expressed genes (DEGs) were identified, including 47 up- and 81 down-regulated genes. The significantly enriched functions included negative regulation of cell proliferation, regulation of response to stimulus and cellular component morphogenesis. A total of 27 DEGs were significantly enriched in 16 KEGG pathways, such as protein digestion and absorption, fatty acid metabolism, amoebiasis, etc. Furthermore, the regulatory network of these 27 DEGs was constructed, which involved several key genes, including protein tyrosine kinase 2 (PTK2), v-src avian sarcoma (SRC) and Caveolin 2 (CAV2). PTK2, SRC and CAV2 may be potential markers for diagnosis and treatment of ALI. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5865162912987143.

  11. Gene expression profiling combined with bioinformatics analysis identify biomarkers for Parkinson disease.

    PubMed

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result.

  12. Gene Expression Profiling Combined with Bioinformatics Analysis Identify Biomarkers for Parkinson Disease

    PubMed Central

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result. PMID:23284986

  13. A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking

    PubMed Central

    Huan, Tianxiao; Joehanes, Roby; Schurmann, Claudia; Schramm, Katharina; Pilling, Luke C.; Peters, Marjolein J.; Mägi, Reedik; DeMeo, Dawn; O'Connor, George T.; Ferrucci, Luigi; Teumer, Alexander; Homuth, Georg; Biffar, Reiner; Völker, Uwe; Herder, Christian; Waldenberger, Melanie; Peters, Annette; Zeilinger, Sonja; Metspalu, Andres; Hofman, Albert; Uitterlinden, André G.; Hernandez, Dena G.; Singleton, Andrew B.; Bandinelli, Stefania; Munson, Peter J.; Lin, Honghuang; Benjamin, Emelia J.; Esko, Tõnu; Grabe, Hans J.; Prokisch, Holger; van Meurs, Joyce B.J.; Melzer, David; Levy, Daniel

    2016-01-01

    Cigarette smoking is a leading modifiable cause of death worldwide. We hypothesized that cigarette smoking induces extensive transcriptomic changes that lead to target-organ damage and smoking-related diseases. We performed a meta-analysis of transcriptome-wide gene expression using whole blood-derived RNA from 10,233 participants of European ancestry in six cohorts (including 1421 current and 3955 former smokers) to identify associations between smoking and altered gene expression levels. At a false discovery rate (FDR) <0.1, we identified 1270 differentially expressed genes in current vs. never smokers, and 39 genes in former vs. never smokers. Expression levels of 12 genes remained elevated up to 30 years after smoking cessation, suggesting that the molecular consequence of smoking may persist for decades. Gene ontology analysis revealed enrichment of smoking-related genes for activation of platelets and lymphocytes, immune response, and apoptosis. Many of the top smoking-related differentially expressed genes, including LRRN3 and GPR15, have DNA methylation loci in promoter regions that were recently reported to be hypomethylated among smokers. By linking differential gene expression with smoking-related disease phenotypes, we demonstrated that stroke and pulmonary function show enrichment for smoking-related gene expression signatures. Mediation analysis revealed the expression of several genes (e.g. ALAS2) to be putative mediators of the associations between smoking and inflammatory biomarkers (IL6 and C-reactive protein levels). Our transcriptomic study provides potential insights into the effects of cigarette smoking on gene expression in whole blood and their relations to smoking-related diseases. The results of such analyses may highlight attractive targets for treating or preventing smoking-related health effects. PMID:28158590
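
    Selecting genes at a false discovery rate below 0.1, as reported in this abstract, is often done with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR method was used, so the sketch below is an assumption for illustration only, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return booleans marking which hypotheses are rejected at the given FDR."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values.
toy_p = [0.001, 0.04, 0.03, 0.5, 0.2]
calls = benjamini_hochberg(toy_p, fdr=0.1)
print(calls)  # [True, True, True, False, False]
```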

  14. Preferential Allele Expression Analysis Identifies Shared Germline and Somatic Driver Genes in Advanced Ovarian Cancer

    PubMed Central

    Halabi, Najeeb M.; Martinez, Alejandra; Al-Farsi, Halema; Mery, Eliane; Puydenus, Laurence; Pujol, Pascal; Khalak, Hanif G.; McLurcan, Cameron; Ferron, Gwenael; Querleu, Denis; Al-Azwani, Iman; Al-Dous, Eman; Mohamoud, Yasmin A.; Malek, Joel A.; Rafii, Arash

    2016-01-01

    Identifying genes where a variant allele is preferentially expressed in tumors could lead to a better understanding of cancer biology and optimization of targeted therapy. However, tumor sample heterogeneity complicates standard approaches for detecting preferential allele expression. We therefore developed a novel approach combining genome and transcriptome sequencing data from the same sample that corrects for sample heterogeneity and identifies significant preferentially expressed alleles. We applied this analysis to epithelial ovarian cancer samples consisting of matched primary ovary and peritoneum and lymph node metastasis. We find that preferentially expressed variant alleles include germline and somatic variants, are shared at a relatively high frequency between patients, and are in gene networks known to be involved in cancer processes. Analysis at a patient level identifies patient-specific preferentially expressed alleles in genes that are targets for known drugs. Analysis at a site level identifies patterns of site specific preferential allele expression with similar pathways being impacted in the primary and metastasis sites. We conclude that genes with preferentially expressed variant alleles can act as cancer drivers and that targeting those genes could lead to new therapeutic strategies. PMID:26735499

  15. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    PubMed

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

    PubMed

    de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

    2016-08-01

    Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment-wide significance (corrected p<0.05), highly ranked gene-sets reaching suggestive significance included the dopamine receptor antagonists metoclopramide and trifluoperazine and the tyrosine kinase inhibitor neratinib. This is a proof-of-principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.

  17. De novo Transcriptome Analysis of Miscanthus lutarioriparius Identifies Candidate Genes in Rhizome Development

    PubMed Central

    Hu, Ruibo; Yu, Changjiang; Wang, Xiaoyu; Jia, Chunlin; Pei, Shengqiang; He, Kang; He, Guo; Kong, Yingzhen; Zhou, Gongke

    2017-01-01

    Highlight: De novo transcriptome profiling of five tissues reveals candidate genes putatively involved in rhizome development in M. lutarioriparius. Miscanthus lutarioriparius is a promising lignocellulosic feedstock for second-generation bioethanol production. However, the genomic resource for this species is relatively limited, which hampers our understanding of the molecular mechanisms underlying many important biological processes. In this study, we performed the first de novo transcriptome analysis of five tissues (leaf, stem, root, lateral bud and rhizome bud) of M. lutarioriparius with an emphasis on identifying putative genes involved in rhizome development. Approximately 66 gigabases (Gb) of paired-end clean reads were obtained and assembled into 169,064 unigenes with an average length of 759 bp. Among these unigenes, 103,899 (61.5%) were annotated in seven public protein databases. Differential gene expression profiling analysis revealed that 4,609, 3,188, 1,679, 1,218, and 1,077 genes were predominantly expressed in root, leaf, stem, lateral bud, and rhizome bud, respectively. Their expression patterns were further classified into 12 distinct clusters. Pathway enrichment analysis revealed that genes predominantly expressed in rhizome bud were mainly involved in primary metabolism and hormone signaling and transduction pathways. Notably, 19 transcription factors (TFs) and 16 hormone signaling pathway-related genes were identified as predominantly expressed in rhizome bud compared with the other tissues, suggesting putative roles in rhizome formation and development. In addition, a predictive regulatory network was constructed between four TFs and six auxin- and abscisic acid (ABA)-related genes. Furthermore, the expression of 24 rhizome-specific genes was validated by quantitative real-time RT-PCR (qRT-PCR) analysis. Taken together, this study provides a global portrait of gene expression across five different tissues and reveals preliminary insights

  18. Weighted gene co-expression network analysis of expression data of monozygotic twins identifies specific modules and hub genes related to BMI.

    PubMed

    Wang, Weijing; Jiang, Wenjie; Hou, Lin; Duan, Haiping; Wu, Yili; Xu, Chunsheng; Tan, Qihua; Li, Shuxia; Zhang, Dongfeng

    2017-11-13

    The therapeutic management of obesity is challenging; hence, further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis and weighted gene co-expression network analysis (WGCNA) to identify significant genes and specific modules related to BMI based on gene expression profile data of 7 discordant monozygotic twins. In the differential gene expression analysis, 32 differentially expressed genes (DEGs) showed a trend toward up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within the GO database, and the NF-kappa B signaling pathway within the KEGG database. The DEGs NAMPT, TLR9, PTGS2, HBD, and PCSK1N might be associated with obesity. In the WGCNA, among the total 20 distinct co-expression modules identified, the coral1 module (68 genes) had the strongest positive correlation with BMI (r = 0.56, P = 0.04) and disease status (r = 0.56, P = 0.04). Categories of positive regulation of phospholipase activity, high-density lipoprotein particle clearance, chylomicron remnant clearance, reverse cholesterol transport, intermediate-density lipoprotein particle, chylomicron, low-density lipoprotein particle, very-low-density lipoprotein particle, voltage-gated potassium channel complex, cholesterol transporter activity, and neuropeptide hormone activity were significantly enriched within the GO database for this module. The alcoholism and cell adhesion molecules pathways were significantly enriched within the KEGG database. Several hub genes, such as GAL, ASB9, NPPB, TBX2, IL17C, APOE, ABCG4, and APOC2, were also identified. The module eigengene of the saddlebrown module (212 genes) was also significantly

  19. Genome-wide transcriptional analysis of flagellar regeneration in Chlamydomonas reinhardtii identifies orthologs of ciliary disease genes

    NASA Technical Reports Server (NTRS)

    Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Marshall, Wallace F.

    2005-01-01

    The important role that cilia and flagella play in human disease creates an urgent need to identify genes involved in ciliary assembly and function. The strong and specific induction of flagellar-coding genes during flagellar regeneration in Chlamydomonas reinhardtii suggests that transcriptional profiling of such cells would reveal new flagella-related genes. We have conducted a genome-wide analysis of RNA transcript levels during flagellar regeneration in Chlamydomonas by using DNA oligonucleotide microarrays, produced by a maskless photolithography method, with unique probe sequences for all exons of the 19,803 predicted genes. This analysis represents a previously uncharacterized whole-genome transcriptional activity profiling study in this important model organism. Analysis of strongly induced genes reveals a large set of known flagellar components and also identifies a number of important disease-related proteins as being involved with cilia and flagella, including the zebrafish polycystic kidney genes Qilin, Reptin, and Pontin, as well as the testis-expressed tubby-like protein TULP2.

  20. Transcriptomic meta-analysis identifies gene expression characteristics in various samples of HIV-infected patients with nonprogressive disease.

    PubMed

    Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong

    2017-09-12

    A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38, lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment in DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, the number of downregulated DEGs was more than the upregulated DEGs, and almost all genes were downregulated DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle in the three types of samples. Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new

  1. Analysis of global gene expression profiles to identify differentially expressed genes critical for embryo development in Brassica rapa.

    PubMed

    Zhang, Yu; Peng, Lifang; Wu, Ya; Shen, Yanyue; Wu, Xiaoming; Wang, Jianbo

    2014-11-01

    Embryo development represents a crucial developmental period in the life cycle of flowering plants. To gain insights into the genetic programs that control embryo development in Brassica rapa L., RNA sequencing technology was used to perform transcriptome profiling analysis of B. rapa developing embryos. The results generated 42,906,229 sequence reads aligned with 32,941 genes. In total, 27,760, 28,871, 28,384, and 25,653 genes were identified from embryos at globular, heart, early cotyledon, and mature developmental stages, respectively, and analysis between stages revealed a subset of stage-specific genes. We next investigated 9,884 differentially expressed genes with more than fivefold changes in expression and false discovery rate ≤ 0.001 from three adjacent-stage comparisons; 1,514, 3,831, and 6,633 genes were detected between globular and heart stage embryo libraries, heart stage and early cotyledon stage, and early cotyledon and mature stage, respectively. Large numbers of genes related to cellular process, metabolism process, response to stimulus, and biological process were expressed during the early and middle stages of embryo development. Fatty acid biosynthesis, biosynthesis of secondary metabolites, and photosynthesis-related genes were expressed predominantly in embryos at the middle stage. Genes for lipid metabolism and storage proteins were highly expressed in the middle and late stages of embryo development. We also identified 911 transcription factor genes that show differential expression across embryo developmental stages. These results increase our understanding of the complex molecular and cellular events during embryo development in B. rapa and provide a foundation for future studies on other oilseed crops.
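
    The selection rule quoted in this abstract (more than fivefold change and false discovery rate ≤ 0.001) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment is an assumption here, and the function names `benjamini_hochberg` and `select_degs` and the input layout are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used only as a common FDR procedure; the abstract
    does not name the method behind its FDR threshold.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(records, min_fold=5.0, max_fdr=0.001):
    """Pick genes passing both thresholds.

    records: list of (gene, fold_change, p_value), where fold_change is a
    positive expression ratio between two stages (hypothetical layout).
    """
    qvalues = benjamini_hochberg([p for _, _, p in records])
    selected = []
    for (gene, fold_change, _), q in zip(records, qvalues):
        # Treat up- and down-regulation symmetrically.
        magnitude = max(fold_change, 1.0 / fold_change)
        if magnitude > min_fold and q <= max_fdr:
            selected.append(gene)
    return selected
```

    With made-up input such as `select_degs([("g1", 8.0, 1e-6), ("g2", 2.0, 1e-6)])`, only `g1` passes, because `g2` fails the fold-change cutoff even though its adjusted p-value is small.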

  2. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines

    PubMed Central

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours’ biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription–quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes. PMID:29263807

  3. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    PubMed

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  4. Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes.

    PubMed

    Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun

    2017-09-01

    Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
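
    The quantities named in this abstract can be written out explicitly. The notation below is an assumption for illustration (X for the observation matrix, L for the low-rank part, S for the sparse part, λ for a trade-off weight, r for the number of largest singular values left out of the sum); the abstract itself gives no symbols or solver details.

```latex
% Standard RPCA relaxation: nuclear norm as a surrogate for rank
\min_{L,\,S}\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad X = L + S,
\qquad \|L\|_{*} = \sum_{i=1}^{\min(m,n)} \sigma_i(L)

% Truncated nuclear norm: sum of the smaller singular values only
\|L\|_{r} = \sum_{i=r+1}^{\min(m,n)} \sigma_i(L)
          = \|L\|_{*} - \sum_{i=1}^{r} \sigma_i(L)

% TRPCA as described: the nuclear norm is replaced by the truncated one
\min_{L,\,S}\; \|L\|_{r} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad X = L + S
```

    Here σ_1(L) ≥ σ_2(L) ≥ … are the singular values of the m × n matrix L. Leaving the r largest out of the penalty means the dominant components are not shrunk, which is the sense in which the truncated norm may approximate the rank function more closely.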

  5. Featured Article: Transcriptional landscape analysis identifies differently expressed genes involved in follicle-stimulating hormone induced postmenopausal osteoporosis.

    PubMed

    Maasalu, Katre; Laius, Ott; Zhytnik, Lidiia; Kõks, Sulev; Prans, Ele; Reimann, Ene; Märtson, Aare

    2017-01-01

    Osteoporosis is a disorder associated with bone tissue reorganization, bone mass, and mineral density. Osteoporosis can severely affect postmenopausal women, causing bone fragility and osteoporotic fractures. The aim of the current study was to compare blood mRNA profiles of postmenopausal women with and without osteoporosis, with the aim of finding different gene expressions and thus targets for future osteoporosis biomarker studies. Our study consisted of transcriptome analysis of whole blood serum from 12 elderly female osteoporotic patients and 12 non-osteoporotic elderly female controls. The transcriptome analysis was performed with RNA sequencing technology. For data analysis, the edgeR package of R Bioconductor was used. Two hundred and fourteen genes were expressed differently in osteoporotic compared with non-osteoporotic patients. Statistical analysis revealed 20 differently expressed genes with a false discovery rate of less than 1.47 × 10⁻⁴ among osteoporotic patients. The expression of 10 genes was up-regulated and that of 10 was down-regulated. Further statistical analysis identified a potential osteoporosis mRNA biomarker pattern consisting of six genes: CACNA1G, ALG13, SBK1, GGT7, MBNL3, and RIOK3. Functional ingenuity pathway analysis identified the strongest candidate genes with regard to potential involvement in a follicle-stimulating hormone activated network of increased osteoclast activity and hypogonadal bone loss. The differentially expressed genes identified in this study may contribute to future research of postmenopausal osteoporosis blood biomarkers.

  6. Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

    PubMed Central

    2011-01-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3: p < 10⁻³³; LPA: p < 10⁻¹⁹; 1p13.3: p < 10⁻¹⁷) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p < 5 × 10⁻⁷). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06–1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and

  7. Large-scale gene-centric analysis identifies novel variants for coronary artery disease.

    PubMed

    2011-09-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3: p < 10⁻³³; LPA: p < 10⁻¹⁹; 1p13.3: p < 10⁻¹⁷) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p < 5 × 10⁻⁷). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and

  8. Parallel analysis of tagged deletion mutants efficiently identifies genes involved in endoplasmic reticulum biogenesis.

    PubMed

    Wright, Robin; Parrish, Mark L; Cadera, Emily; Larson, Lynnelle; Matson, Clinton K; Garrett-Engele, Philip; Armour, Chris; Lum, Pek Yee; Shoemaker, Daniel D

    2003-07-30

    Increased levels of HMG-CoA reductase induce cell type- and isozyme-specific proliferation of the endoplasmic reticulum. In yeast, the ER proliferations induced by Hmg1p consist of nuclear-associated stacks of smooth ER membranes known as karmellae. To identify genes required for karmellae assembly, we compared the composition of populations of homozygous diploid S. cerevisiae deletion mutants following 20 generations of growth with and without karmellae. Using an initial population of 1,557 deletion mutants, 120 potential mutants were identified as a result of three independent experiments. Each experiment produced a largely non-overlapping set of potential mutants, suggesting that differences in specific growth conditions could be used to maximize the comprehensiveness of similar parallel analysis screens. Only two genes, UBC7 and YAL011W, were identified in all three experiments. Subsequent analysis of individual mutant strains confirmed that each experiment was identifying valid mutations, based on the mutant's sensitivity to elevated HMG-CoA reductase and inability to assemble normal karmellae. The largest class of HMG-CoA reductase-sensitive mutations was a subset of genes that are involved in chromatin structure and transcriptional regulation, suggesting that karmellae assembly requires changes in transcription or that the presence of karmellae may interfere with normal transcriptional regulation. Copyright 2003 John Wiley & Sons, Ltd.

  9. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data.

    PubMed

    Paisitkriangkrai, Sakrapee; Quek, Kelly; Nievergall, Eva; Jabbour, Anissa; Zannettino, Andrew; Kok, Chung Hoow

    2018-06-07

    Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy to use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes, without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272 primary glioma sample RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which, targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse .

  10. Microarray analysis and scale-free gene networks identify candidate regulators in drought-stressed roots of loblolly pine (P. taeda L.)

    PubMed Central

    2011-01-01

    Background: Global transcriptional analysis of loblolly pine (Pinus taeda L.) is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results: Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes). Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01). Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs) including those with significant homology (E-values ≤ 2 × 10⁻³⁰) to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion: PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in roots. Many of the

  11. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence.

    PubMed

    Sniekers, Suzanne; Stringer, Sven; Watanabe, Kyoko; Jansen, Philip R; Coleman, Jonathan R I; Krapohl, Eva; Taskesen, Erdogan; Hammerschlag, Anke R; Okbay, Aysu; Zabaneh, Delilah; Amin, Najaf; Breen, Gerome; Cesarini, David; Chabris, Christopher F; Iacono, William G; Ikram, M Arfan; Johannesson, Magnus; Koellinger, Philipp; Lee, James J; Magnusson, Patrik K E; McGue, Matt; Miller, Mike B; Ollier, William E R; Payton, Antony; Pendleton, Neil; Plomin, Robert; Rietveld, Cornelius A; Tiemeier, Henning; van Duijn, Cornelia M; Posthuma, Danielle

    2017-07-01

    Intelligence is associated with important economic and health-related life outcomes. Despite intelligence having substantial heritability (0.54) and a confirmed polygenic nature, initial genetic studies were mostly underpowered. Here we report a meta-analysis for intelligence of 78,308 individuals. We identify 336 associated SNPs (METAL P < 5 × 10⁻⁸) in 18 genomic loci, of which 15 are new. Around half of the SNPs are located inside a gene, implicating 22 genes, of which 11 are new findings. Gene-based analyses identified an additional 30 genes (MAGMA P < 2.73 × 10⁻⁶), of which all but one had not been implicated previously. We show that the identified genes are predominantly expressed in brain tissue, and pathway analysis indicates the involvement of genes regulating cell development (MAGMA competitive P = 3.5 × 10⁻⁶). Despite the well-known difference in twin-based heritability for intelligence in childhood (0.45) and adulthood (0.80), we show substantial genetic correlation (r_g = 0.89, LD score regression P = 5.4 × 10⁻²⁹). These findings provide new insight into the genetic architecture of intelligence.

  12. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes.

    PubMed

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-05-26

    Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes.

  13. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes

    PubMed Central

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-01-01

    Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414

  14. Integrative Analysis of DNA Methylation and Gene Expression Data Identifies EPAS1 as a Key Regulator of COPD

    PubMed Central

    Yoo, Seungyeul; Takikawa, Sachiko; Geraghty, Patrick; Argmann, Carmen; Campbell, Joshua; Lin, Luan; Huang, Tao; Tu, Zhidong; Feronjy, Robert; Spira, Avrum; Schadt, Eric E.; Powell, Charles A.; Zhu, Jun

    2015-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple genes sets associated with COPD disease severity. EPAS1 is distinct in comparison with other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue including hypoxia responsive genes, and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a ‘causal’ role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology. PMID:25569234

  15. Co-expression network analysis identified six hub genes in association with metastasis risk and prognosis in hepatocellular carcinoma

    PubMed Central

    Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu

    2017-01-01

    Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663

  16. Integrative analysis of DNA methylation and gene expression data identifies EPAS1 as a key regulator of COPD.

    PubMed

    Yoo, Seungyeul; Takikawa, Sachiko; Geraghty, Patrick; Argmann, Carmen; Campbell, Joshua; Lin, Luan; Huang, Tao; Tu, Zhidong; Foronjy, Robert F; Feronjy, Robert; Spira, Avrum; Schadt, Eric E; Powell, Charles A; Zhu, Jun

    2015-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple genes sets associated with COPD disease severity. EPAS1 is distinct in comparison with other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue including hypoxia responsive genes, and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a 'causal' role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology.

  17. Exome Sequencing and Linkage Analysis Identified Novel Candidate Genes in Recessive Intellectual Disability Associated with Ataxia.

    PubMed

    Jazayeri, Roshanak; Hu, Hao; Fattahi, Zohreh; Musante, Luciana; Abedini, Seyedeh Sedigheh; Hosseini, Masoumeh; Wienker, Thomas F; Ropers, Hans Hilger; Najmabadi, Hossein; Kahrizi, Kimia

    2015-10-01

    Intellectual disability (ID) is a neuro-developmental disorder which causes considerable socio-economic problems. Some ID individuals are also affected by ataxia, and the condition includes different mutations affecting several genes. We used whole exome sequencing (WES) in combination with homozygosity mapping (HM) to identify the genetic defects in five consanguineous families among our cohort study, with two affected children with ID and ataxia as major clinical symptoms. We identified three novel candidate genes, RIPPLY1, MRPL10, SNX14, and a new mutation in known gene SURF1. All are autosomal genes, except RIPPLY1, which is located on the X chromosome. Two are housekeeping genes, implicated in transcription and translation regulation and intracellular trafficking, and two encode mitochondrial proteins. The pathogenesis of these variants was evaluated by mutation classification, bioinformatic methods, review of medical and biological relevance, co-segregation studies in the particular family, and a normal population study. Linkage analysis and exome sequencing of a small number of affected family members is a powerful new technique which can be used to decrease the number of candidate genes in heterogenic disorders such as ID, and may even identify the responsible gene(s).

  18. Identifying biomarkers of papillary renal cell carcinoma associated with pathological stage by weighted gene co-expression network analysis.

    PubMed

    He, Zhongshi; Sun, Min; Ke, Yuan; Lin, Rongjie; Xiao, Youde; Zhou, Shuliang; Zhao, Hong; Wang, Yan; Zhou, Fuxiang; Zhou, Yunfeng

    2017-04-25

    Although papillary renal cell carcinoma (PRCC) accounts for 10%-15% of renal cell carcinoma (RCC), no predictive molecular biomarker is currently applicable to guiding disease stage of PRCC patients. The mRNASeq data of PRCC and adjacent normal tissue in The Cancer Genome Atlas was analyzed to identify 1148 differentially expressed genes, on which weighted gene co-expression network analysis was performed. Then 11 co-expressed gene modules were identified. The highest association was found between blue module and pathological stage (r = 0.45) by Pearson's correlation analysis. Functional enrichment analysis revealed that biological processes of blue module focused on nuclear division, cell cycle phase, and spindle (all P < 1e-10). All 40 hub genes in blue module can distinguish localized (pathological stage I, II) from non-localized (pathological stage III, IV) PRCC (P < 0.01). A good molecular biomarker for pathological stage of RCC must be a prognostic gene in clinical practice. Survival analysis was performed to reversely validate if hub genes were associated with pathological stage. Survival analysis unveiled that all hub genes were associated with patient prognosis (P < 0.01). The validation cohort GSE2748 verified that 30 hub genes can differentiate localized from non-localized PRCC (P < 0.01), and 18 hub genes are prognosis-associated (P < 0.01). ROC curve indicated that the 17 hub genes exhibited excellent diagnostic efficiency for localized and non-localized PRCC (AUC > 0.7). These hub genes may serve as a biomarker and help to distinguish different pathological stages for PRCC patients.
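
    The module-to-stage association reported above (a Pearson correlation between a co-expression module and pathological stage) can be illustrated with a minimal sketch. It is not the authors' pipeline: WGCNA summarizes a module by its eigengene (first principal component), whereas this sketch substitutes the mean of z-scored expression as a simpler stand-in. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def zscore(values):
    """Standardize one gene's expression across samples (sample SD, n - 1)."""
    n = len(values)
    m = sum(values) / n
    sd = sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]


def module_summary(expr_rows):
    """Mean z-scored expression per sample (assumed stand-in for an eigengene).

    expr_rows: one list per gene, each holding that gene's value per sample.
    """
    z = [zscore(row) for row in expr_rows]
    n_samples = len(z[0])
    return [sum(row[j] for row in z) / len(z) for j in range(n_samples)]


# Hypothetical toy data: two genes measured in four samples, stages I-IV.
module_expr = [
    [1.0, 1.2, 2.9, 3.1],
    [0.5, 0.7, 1.6, 1.4],
]
stage = [1, 2, 3, 4]
r = pearson_r(module_summary(module_expr), stage)
print(f"module-stage correlation r = {r:.2f}")
```

    Treating stage as a numeric variable is itself an assumption; a rank-based coefficient may suit ordinal stages better.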

  19. Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns.

    PubMed

    Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S

    2012-05-08

    The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were

  20. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture.

    PubMed

    González-Plaza, Juan J; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species.

  1. Correlational analysis for identifying genes whose regulation contributes to chronic neuropathic pain

    PubMed Central

    Persson, Anna-Karin; Gebauer, Mathias; Jordan, Suzana; Metz-Weidmann, Christiane; Schulte, Anke M; Schneider, Hans-Christoph; Ding-Pfennigdorff, Danping; Thun, Jonas; Xu, Xiao-Jun; Wiesenfeld-Hallin, Zsuzsanna; Darvasi, Ariel; Fried, Kaj; Devor, Marshall

    2009-01-01

    Background: Nerve injury-triggered hyperexcitability in primary sensory neurons is considered a major source of chronic neuropathic pain. The hyperexcitability, in turn, is thought to be related to transcriptional switching in afferent cell somata. Analysis using expression microarrays has revealed that many genes are regulated in the dorsal root ganglion (DRG) following axotomy. But which contribute to pain phenotype versus other nerve injury-evoked processes such as nerve regeneration? Using the L5 spinal nerve ligation model of neuropathy we examined differential changes in gene expression in the L5 (and L4) DRGs in five mouse strains with contrasting susceptibility to neuropathic pain. We sought genes for which the degree of regulation correlates with strain-specific pain phenotype. Results: In an initial experiment six candidate genes previously identified as important in pain physiology were selected for in situ hybridization to DRG sections. Among these, regulation of the Na+ channel α subunit Scn11a correlated with levels of spontaneous pain behavior, and regulation of the cool receptor Trpm8 correlated with heat hypersensibility. In a larger scale experiment, mRNA extracted from individual mouse DRGs was processed on Affymetrix whole-genome expression microarrays. Overall, 2552 ± 477 transcripts were significantly regulated in the axotomized L5DRG 3 days postoperatively. However, in only a small fraction of these was the degree of regulation correlated with pain behavior across strains. Very few genes in the "uninjured" L4DRG showed altered expression (24 ± 28). Conclusion: Correlational analysis based on in situ hybridization provided evidence that differential regulation of Scn11a and Trpm8 contributes to across-strain variability in pain phenotype. This does not, of course, constitute evidence that the others are unrelated to pain. Correlational analysis based on microarray data yielded a larger "look-up table" of genes whose regulation likely

  2. Genome-wide gene by lead exposure interaction analysis identifies UNC5D as a candidate gene for neurodevelopment.

    PubMed

    Wang, Zhaoxi; Claus Henn, Birgit; Wang, Chaolong; Wei, Yongyue; Su, Li; Sun, Ryan; Chen, Han; Wagner, Peter J; Lu, Quan; Lin, Xihong; Wright, Robert; Bellinger, David; Kile, Molly; Mazumdar, Maitreyi; Tellez-Rojo, Martha Maria; Schnaas, Lourdes; Christiani, David C

    2017-07-28

    Neurodevelopment is a complex process involving both genetic and environmental factors. Prenatal exposure to lead (Pb) has been associated with lower performance on neurodevelopmental tests. Adverse neurodevelopmental outcomes are more frequent and/or more severe when toxic exposures interact with genetic susceptibility. To explore possible loci associated with increased susceptibility to prenatal Pb exposure, we performed a genome-wide gene-environment interaction study (GWIS) in young children from Mexico (n = 390) and Bangladesh (n = 497). Prenatal Pb exposure was estimated by cord blood Pb concentration. Neurodevelopment was assessed using the Bayley Scales of Infant Development. We identified a locus on chromosome 8, containing UNC5D, and demonstrated evidence of its genome-wide significance with mental composite scores (rs9642758, p_meta = 4.35 × 10^-6). Within this locus, the joint effects of two independent single nucleotide polymorphisms (SNPs, rs9642758 and rs10503970) had a p-value of 4.38 × 10^-9 for mental composite scores. Correlating GWIS results with in vitro transcriptomic profiles identified one common gene, SLC1A5, which is involved in synaptic function, neuronal development, and excitotoxicity. Further analysis revealed interconnected interactions that formed a large network of 52 genes enriched with oxidative stress genes and neurodevelopmental genes. Our findings suggest that certain genetic polymorphisms within/near genes relevant to neurodevelopment might modify the toxic effects of Pb exposure via oxidative stress.

  3. Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers.

    PubMed

    Irigoyen, Antonio; Jimenez-Luna, Cristina; Benavides, Manuel; Caba, Octavio; Gallego, Javier; Ortuño, Francisco Manuel; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Torres, Carolina; Prados, Jose

    2018-01-01

    Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due to its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses ('gained' genes) were discovered. Several of these gained genes have already been related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.
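
    The idea of removing platform-specific shifts before pooling datasets can be sketched as follows. This is deliberately not ComBat: ComBat adjusts both location and scale per batch with empirical Bayes shrinkage and can protect covariates such as disease status, whereas the sketch below only re-centers each batch of one gene on the overall mean. The function name and the toy values are hypothetical.

```python
from statistics import mean


def center_by_batch(values, batches):
    """Shift each batch so its mean equals the overall mean (location only).

    values:  expression of a single gene, one value per sample
    batches: batch label (e.g. platform name) for each sample
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    grand_mean = mean(values)
    adjusted = [0.0] * len(values)
    for label in set(batches):
        idx = [i for i, b in enumerate(batches) if b == label]
        batch_mean = mean(values[i] for i in idx)
        for i in idx:
            adjusted[i] = values[i] - batch_mean + grand_mean
    return adjusted


# Hypothetical toy data: one gene measured on two platforms.
values = [5.0, 6.0, 7.0, 10.0, 11.0, 12.0]
batches = ["affy", "affy", "affy", "illumina", "illumina", "illumina"]
adjusted = center_by_batch(values, batches)
print(adjusted)
```

    If batch is confounded with the biological groups being compared, this kind of centering also removes real signal, which is one reason covariate-aware methods are preferred in practice.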

  4. Novel linkage disequilibrium clustering algorithm identifies new lupus genes on meta-analysis of GWAS datasets.

    PubMed

    Saeed, Mohammad

    2017-05-01

    Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.

  5. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture

    PubMed Central

    González-Plaza, Juan J.; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F.; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R.; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R.

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species. PMID:26973682

  6. Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns

    PubMed Central

    2012-01-01

    Background: The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave way for the development of functional genomics approaches. Results: Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that

  7. Computational Gene Expression Modeling Identifies Salivary Biomarker Analysis that Predict Oral Feeding Readiness in the Newborn

    PubMed Central

    Maron, Jill L.; Hwang, Jooyeon S.; Pathak, Subash; Ruthazer, Robin; Russell, Ruby L.; Alterovitz, Gil

    2014-01-01

    Objective: To combine mathematical modeling of salivary gene expression microarray data and systems biology annotation with RT-qPCR amplification to identify (phase I) and validate (phase II) salivary biomarker analysis for the prediction of oral feeding readiness in preterm infants. Study design: Comparative whole transcriptome microarray analysis from 12 preterm newborns pre- and post-oral feeding success was used for computational modeling and systems biology analysis to identify potential salivary transcripts associated with oral feeding success (phase I). Selected gene expression biomarkers (15 from computational modeling; 6 evidence-based; and 3 reference) were evaluated by RT-qPCR amplification on 400 salivary samples from successful (n=200) and unsuccessful (n=200) oral feeders (phase II). Genes, alone and in combination, were evaluated by a multivariate analysis controlling for sex and post-conceptional age (PCA) to determine the probability that newborns achieved successful oral feeding. Results: Advancing post-conceptional age (p < 0.001) and female sex (p = 0.05) positively predicted an infant’s ability to feed orally. A combination of five genes, NPY2R (hunger signaling), AMPK (energy homeostasis), PLXNA1 (olfactory neurogenesis), NPHP4 (visual behavior) and WNT3 (facial development), in addition to PCA and sex, demonstrated good accuracy for determining feeding success (AUROC = 0.78). Conclusions: We have identified objective and biologically relevant salivary biomarkers that noninvasively assess a newborn’s developing brain, sensory and facial development as they relate to oral feeding success. Understanding the mechanisms that underlie the development of oral feeding readiness through translational and computational methods may improve clinical decision making while decreasing morbidities and health care costs. PMID:25620512
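
    The AUROC quoted above measures how often a model scores a successful feeder higher than an unsuccessful one. A minimal sketch of that pairwise definition follows; it is a generic illustration, not the authors' analysis, and the function name, scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probability (or any ranking score) per subject
    labels: 1 for the positive class, 0 for the negative class
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six subjects, three in each class.
scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(f"AUROC = {auroc(scores, labels):.3f}")
```

    The double loop is quadratic in the number of subjects, which is fine for small cohorts; a rank-based formulation would scale better.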

  8. Regulatory network analysis of Epstein-Barr virus identifies functional modules and hub genes involved in infectious mononucleosis.

    PubMed

    Poorebrahim, Mansour; Salarian, Ali; Najafi, Saeideh; Abazari, Mohammad Foad; Aleagha, Maryam Nouri; Dadras, Mohammad Nasr; Jazayeri, Seyed Mohammad; Ataei, Atousa; Poortahmasebi, Vahdat

    2017-05-01

    Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis (IM) and establishes lifetime infection associated with a variety of cancers and autoimmune diseases. The aim of this study was to develop an integrative gene regulatory network (GRN) approach and overlying gene expression data to identify the representative subnetworks for IM and EBV latent infection (LI). After identifying differentially expressed genes (DEGs) in both IM and LI gene expression profiles, functional annotations were applied using gene ontology (GO) and BiNGO tools, and construction of GRNs, topological analysis and identification of modules were carried out using several plugins of Cytoscape. In parallel, a human-EBV GRN was generated using the Hu-Vir database for further analyses. Our analysis revealed that the majority of DEGs in both IM and LI were involved in cell-cycle and DNA repair processes. However, these genes showed a significant negative correlation in the IM and LI states. Furthermore, cyclin-dependent kinase 2 (CDK2) - a hub gene with the highest centrality score - appeared to be the key player in cell cycle regulation in IM disease. The most significant functional modules in the IM and LI states were involved in the regulation of the cell cycle and apoptosis, respectively. Human-EBV network analysis revealed several direct targets of EBV proteins during IM disease. Our study provides an important first report on the response to IM/LI EBV infection in humans. An important aspect of our data was the upregulation of genes associated with cell cycle progression and proliferation.

  9. Integrating Genetic and Gene Co-expression Analysis Identifies Gene Networks Involved in Alcohol and Stress Responses

    PubMed Central

    Luo, Jie; Xu, Pei; Cao, Peijian; Wan, Hongjian; Lv, Xiaonan; Xu, Shengchun; Wang, Gangjun; Cook, Melloni N.; Jones, Byron C.; Lu, Lu; Wang, Xusheng

    2018-01-01

    Although the link between stress and alcohol is well recognized, the underlying mechanisms of how they interplay at the molecular level remain unclear. The purpose of this study is to identify molecular networks underlying the effects of alcohol and stress responses, as well as their interaction on anxiety behaviors in the hippocampus of mice using a systems genetics approach. Here, we applied a gene co-expression network approach to transcriptomes of 41 BXD mouse strains under four conditions: stress, alcohol, stress-induced alcohol and control. The co-expression analysis identified 14 modules and characterized four expression patterns across the four conditions. The four expression patterns include up-regulation in no restraint stress and given an ethanol injection (NOE) but restoration in restraint stress followed by an ethanol injection (RSE; pattern 1), down-regulation in NOE but rescue in RSE (pattern 2), up-regulation in both restraint stress followed by a saline injection (RSS) and NOE, and further amplification in RSE (pattern 3), and up-regulation in RSS but reduction in both NOE and RSE (pattern 4). We further identified four functional subnetworks by superimposing protein-protein interactions (PPIs) to the 14 co-expression modules, including γ-aminobutyric acid receptor (GABA) signaling, glutamate signaling, neuropeptide signaling, cAMP-dependent signaling. We further performed module specificity analysis to identify modules that are specific to stress, alcohol, or stress-induced alcohol responses. Finally, we conducted causality analysis to link genetic variation to these identified modules, and anxiety behaviors after stress and alcohol treatments. This study underscores the importance of integrative analysis and offers new insights into the molecular networks underlying stress and alcohol responses. PMID:29674951

  10. Association Analysis Suggests SOD2 as a Newly Identified Candidate Gene Associated With Leprosy Susceptibility.

    PubMed

    Ramos, Geovana Brotto; Salomão, Heloisa; Francio, Angela Schneider; Fava, Vinícius Medeiros; Werneck, Renata Iani; Mira, Marcelo Távora

    2016-08-01

    Genetic studies have identified several genes and genomic regions contributing to the control of host susceptibility to leprosy. Here, we test variants of the positional and functional candidate gene SOD2 for association with leprosy in 2 independent population samples. Family-based analysis revealed an association between leprosy and allele G of marker rs295340 (P = .042) and borderline evidence of an association between leprosy and alleles C and A of markers rs4880 (P = .077) and rs5746136 (P = .071), respectively. Findings were validated in an independent case-control sample for markers rs295340 (P = .049) and rs4880 (P = .038). These results suggest SOD2 as a newly identified gene conferring susceptibility to leprosy. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.

  11. Pathway-based analysis of GWAs data identifies association of sex determination genes with susceptibility to testicular germ cell tumors.

    PubMed

    Koster, Roelof; Mitra, Nandita; D'Andrea, Kurt; Vardhanabhuti, Saran; Chung, Charles C; Wang, Zhaoming; Loren Erickson, R; Vaughn, David J; Litchfield, Kevin; Rahman, Nazneen; Greene, Mark H; McGlynn, Katherine A; Turnbull, Clare; Chanock, Stephen J; Nathanson, Katherine L; Kanetsky, Peter A

    2014-11-15

    Genome-wide association (GWA) studies of testicular germ cell tumor (TGCT) have identified 18 susceptibility loci, some containing genes encoding proteins important in male germ cell development. Deletions of one of these genes, DMRT1, lead to male-to-female sex reversal and are associated with development of gonadoblastoma. To further explore genetic association with TGCT, we undertook a pathway-based analysis of SNP marker associations in the Penn GWAs (349 TGCT cases and 919 controls). We analyzed a custom-built sex determination gene set consisting of 32 genes using three different methods of pathway-based analysis. The sex determination gene set ranked highly compared with canonical gene sets, and it was associated with TGCT (FDRG = 2.28 × 10^-5, FDRM = 0.014 and FDRI = 0.008 for Gene Set Analysis-SNP (GSA-SNP), Meta-Analysis Gene Set Enrichment of Variant Associations (MAGENTA) and Improved Gene Set Enrichment Analysis for Genome-wide Association Study (i-GSEA4GWAS) analysis, respectively). The association remained after removal of DMRT1 from the gene set (FDRG = 0.0002, FDRM = 0.055 and FDRI = 0.009). Using data from the NCI GWA scan (582 TGCT cases and 1056 controls) and UK scan (986 TGCT cases and 4946 controls), we replicated these findings (NCI: FDRG = 0.006, FDRM = 0.014, FDRI = 0.033, and UK: FDRG = 1.04 × 10^-6, FDRM = 0.016, FDRI = 0.025). After removal of DMRT1 from the gene set, the sex determination gene set remains associated with TGCT in the NCI (FDRG = 0.039, FDRM = 0.050 and FDRI = 0.055) and UK scans (FDRG = 3.00 × 10^-5, FDRM = 0.056 and FDRI = 0.044). With the exception of DMRT1, genes in the sex determination gene set have not previously been identified as TGCT susceptibility loci in these GWA scans, demonstrating the complementary nature of a pathway-based approach for genome-wide analysis of TGCT. © The Author 2014. Published by Oxford University Press. All rights reserved.

  12. Candidate genes for panhypopituitarism identified by gene expression profiling

    PubMed Central

    Mortensen, Amanda H.; MacDonald, James W.; Ghosh, Debashis

    2011-01-01

    Mutations in the transcription factors PROP1 and PIT1 (POU1F1) lead to pituitary hormone deficiency and hypopituitarism in mice and humans. The dysmorphology of developing Prop1 mutant pituitaries readily distinguishes them from those of Pit1 mutants and normal mice. This and other features suggest that Prop1 controls the expression of genes besides Pit1 that are important for pituitary cell migration, survival, and differentiation. To identify genes involved in these processes we used microarray analysis of gene expression to compare pituitary RNA from newborn Prop1 and Pit1 mutants and wild-type littermates. Significant differences in gene expression were noted between each mutant and their normal littermates, as well as between Prop1 and Pit1 mutants. Otx2, a gene critical for normal eye and pituitary development in humans and mice, exhibited elevated expression specifically in Prop1 mutant pituitaries. We report the spatial and temporal regulation of Otx2 in normal mice and Prop1 mutants, and the results suggest Otx2 could influence pituitary development by affecting signaling from the ventral diencephalon and regulation of gene expression in Rathke's pouch. The discovery that Otx2 expression is affected by Prop1 deficiency provides support for our hypothesis that identifying molecular differences in mutants will contribute to understanding the molecular mechanisms that control pituitary organogenesis and lead to human pituitary disease. PMID:21828248

  13. Identifying potential maternal genes of Bombyx mori using digital gene expression profiling

    PubMed Central

    Xu, Pingzhen

    2018-01-01

    Maternal genes present in mature oocytes play a crucial role in the early development of the silkworm. Although maternal genes have been widely studied in many other species, there has been limited research in Bombyx mori. High-throughput next generation sequencing provides a practical method for gene discovery on a genome-wide level. Herein, a transcriptome study was used to identify maternal-related genes from silkworm eggs. Unfertilized eggs from five different stages of early development were used to track changes in gene expression. The expressed genes showed different patterns over time. Seventy-six maternal genes were annotated according to homology analysis with Drosophila melanogaster. More than half of the differentially expressed maternal genes fell into four expression patterns, and the expression patterns showed a downward trend over time. The functional annotation of these maternal genes was mainly related to transcription factor activity, growth factor activity, nucleic acid binding, RNA binding, ATP binding, and ion binding. Additionally, twenty-two gene clusters including maternal genes were identified from 18 scaffolds. Altogether, we plotted a profile for the maternal genes of Bombyx mori using a digital gene expression profiling method. This will provide the basis for maternal-specific signature research and improve the understanding of the early development of the silkworm. PMID:29462160

  14. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi

    PubMed Central

    Agaba, Morris; Cavener, Douglas R.

    2017-01-01

    Background The capacity of visually oriented species to perceive and respond to visual signal is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe’s visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe’s unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. Methods The recent availability of giraffe and okapi genomes provided opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Results Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. Discussion By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with

  15. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi.

    PubMed

    Ishengoma, Edson; Agaba, Morris; Cavener, Douglas R

    2017-01-01

    The capacity of visually oriented species to perceive and respond to visual signal is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe's visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe's unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. The recent availability of giraffe and okapi genomes provided opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with giraffe and okapi vision adaptations. At

  16. Meta-analysis identifies a MECOM gene as a novel predisposing factor of osteoporotic fracture

    PubMed Central

    Hwang, Joo-Yeon; Lee, Seung Hun; Go, Min Jin; Kim, Beom-Jun; Kou, Ikuyo; Ikegawa, Shiro; Guo, Yan; Deng, Hong-Wen; Raychaudhuri, Soumya; Kim, Young Jin; Oh, Ji Hee; Kim, Youngdoe; Moon, Sanghoon; Kim, Dong-Joon; Koo, Heejo; Cha, My-Jung; Lee, Min Hye; Yun, Ji Young; Yoo, Hye-Sook; Kang, Young-Ah; Cho, Eun-Hee; Kim, Sang-Wook; Oh, Ki Won; Kang, Moo II; Son, Ho Young; Kim, Shin-Yoon; Kim, Ghi Su; Han, Bok-Ghee; Cho, Yoon Shin; Cho, Myeong-Chan; Lee, Jong-Young; Koh, Jung-Min

    2014-01-01

    Background Osteoporotic fracture (OF) as a clinical endpoint is a major complication of osteoporosis. To screen for OF susceptibility genes, we performed a genome-wide association study and carried out de novo replication analysis of an East Asian population. Methods Association was tested using a logistic regression analysis. A meta-analysis was performed on the combined results using effect size and standard errors estimated for each study. Results In a combined meta-analysis of a discovery cohort (288 cases and 1139 controls), three hospital based sets in replication stage I (462 cases and 1745 controls), and an independent ethnic group in replication stage II (369 cases and 560 controls), we identified a new locus associated with OF (rs784288 in the MECOM gene) that showed genome-wide significance (p = 3.59 × 10^-8; OR 1.39). RNA interference revealed that a MECOM knockdown suppresses osteoclastogenesis. Conclusions Our findings provide new insights into the genetic architecture underlying OF in East Asians. PMID:23349225
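
    The abstract above says only that the meta-analysis combined per-study effect sizes and standard errors. One common way to do that is inverse-variance fixed-effect pooling, sketched below. This is an assumption about the general technique, not the authors' stated method, and the function name and the input numbers in the usage note are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect pooling of per-study estimates.

    betas: per-study effect sizes (for logistic regression, log odds ratios)
    ses:   per-study standard errors of those effect sizes
    Returns (pooled_beta, pooled_se, z, two_sided_p, pooled_odds_ratio).
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p, math.exp(pooled_beta)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not data from the study).
    print(fixed_effect_meta([0.2, 0.4], [0.1, 0.1]))
```

    With two equally precise hypothetical studies, the pooled estimate is the plain average of the two effect sizes and the pooled standard error shrinks by a factor of √2.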

  17. Integrated microarray and ChIP analysis identifies multiple Foxa2 dependent target genes in the notochord.

    PubMed

    Tamplin, Owen J; Cox, Brian J; Rossant, Janet

    2011-12-15

    The node and notochord are key tissues required for patterning of the vertebrate body plan. Understanding the gene regulatory network that drives their formation and function is therefore important. Foxa2 is a key transcription factor at the top of this genetic hierarchy and finding its targets will help us to better understand node and notochord development. We performed an extensive microarray-based gene expression screen using sorted embryonic notochord cells to identify early notochord-enriched genes. We validated their specificity to the node and notochord by whole mount in situ hybridization. This provides the largest available resource of notochord-expressed genes, and therefore candidate Foxa2 target genes in the notochord. Using existing Foxa2 ChIP-seq data from adult liver, we were able to identify a set of genes expressed in the notochord that had associated regions of Foxa2-bound chromatin. Given that Foxa2 is a pioneer transcription factor, we reasoned that these sites might represent notochord-specific enhancers. Candidate Foxa2-bound regions were tested for notochord specific enhancer function in a zebrafish reporter assay and 7 novel notochord enhancers were identified. Importantly, sequence conservation or predictive models could not have readily identified these regions. Mutation of putative Foxa2 binding elements in two of these novel enhancers abrogated reporter expression and confirmed their Foxa2 dependence. The combination of highly specific gene expression profiling and genome-wide ChIP analysis is a powerful means of understanding developmental pathways, even for small cell populations such as the notochord. Copyright © 2011 Elsevier Inc. All rights reserved.

  18. Integrated Analysis of Mutation Data from Various Sources Identifies Key Genes and Signaling Pathways in Hepatocellular Carcinoma

    PubMed Central

    Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Background Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. Principal Findings In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Conclusions Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers. PMID:24988079
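
    The network analysis above ranks mutated genes by betweenness, that is, by how often a node lies on shortest paths between other nodes. A minimal sketch of that measure for an unweighted, undirected graph follows, using Brandes' algorithm. The toy graph in the usage note and the function name are hypothetical; the abstract does not say which software or graph weighting the authors used.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    adj: dict mapping each node to the set of its neighbours
         (undirected, unweighted graph).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in score.items()}


if __name__ == "__main__":
    # Hypothetical star-shaped toy network: one hub and three leaves.
    toy = {"hub": {"g1", "g2", "g3"}, "g1": {"hub"}, "g2": {"hub"}, "g3": {"hub"}}
    print(betweenness(toy))
```

    In the toy star network every shortest path between two leaves passes through the hub, so the hub gets the highest score and the leaves score zero.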

  19. Integrated analysis of mutation data from various sources identifies key genes and signaling pathways in hepatocellular carcinoma.

    PubMed

    Zhang, Yuannv; Qiu, Zhaoping; Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers.

  20. Identifying key genes associated with acute myocardial infarction.

    PubMed

    Cheng, Ming; An, Shoukuan; Li, Junquan

    2017-10-01

    This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs.

  1. Identifying osteosarcoma metastasis associated genes by weighted gene co-expression network analysis (WGCNA).

    PubMed

    Tian, Honglai; Guan, Donghui; Li, Jianmin

    2018-06-01

    Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS is usually associated with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. From the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to identify the module most strongly correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. Based on these genes, WGCNA identified 142 genes in the module most strongly correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that the genes significantly associated with OS metastasis were involved in pathways related to insulin-like growth factor binding. Our research identified several molecules that potentially participate in the metastasis process and factors that may act as biomarkers. This study should help to further explore the mechanism of OS metastasis and to discover more therapeutic targets.
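
    For readers unfamiliar with WGCNA, its starting point can be sketched in a few lines: pairwise correlations between gene expression profiles are raised to a soft-thresholding power to give a weighted adjacency. The sketch below assumes an unsigned network and a power of 6, a commonly used value. The abstract does not report the power or network type used, and the expression values in the usage note are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, power=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** power.

    expr:  dict mapping gene name -> list of expression values across samples
    power: soft-thresholding exponent (assumed here, not taken from the study)
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical expression profiles for three genes across four samples.
    toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}
    print(soft_threshold_adjacency(toy))
```

    Raising correlations to a power keeps strong co-expression links close to 1 while pushing weak ones toward 0, which is what lets modules of tightly co-expressed genes stand out in later clustering steps.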

  2. A stratified transcriptomics analysis of polygenic fat and lean mouse adipose tissues identifies novel candidate obesity genes.

    PubMed

    Morton, Nicholas M; Nelson, Yvonne B; Michailidou, Zoi; Di Rollo, Emma M; Ramage, Lynne; Hadoke, Patrick W F; Seckl, Jonathan R; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J; Dunbar, Donald R

    2011-01-01

    Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. To enrich for adipose tissue obesity genes a 'snap-shot' pooled-sample transcriptome comparison of key fat depots and non adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity. A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes contributing to obesity.

  3. Gene-centric Meta-analysis in 87,736 Individuals of European Ancestry Identifies Multiple Blood-Pressure-Related Loci

    PubMed Central

    Tragante, Vinicius; Barnes, Michael R.; Ganesh, Santhi K.; Lanktree, Matthew B.; Guo, Wei; Franceschini, Nora; Smith, Erin N.; Johnson, Toby; Holmes, Michael V.; Padmanabhan, Sandosh; Karczewski, Konrad J.; Almoguera, Berta; Barnard, John; Baumert, Jens; Chang, Yen-Pei Christy; Elbers, Clara C.; Farrall, Martin; Fischer, Mary E.; Gaunt, Tom R.; Gho, Johannes M.I.H.; Gieger, Christian; Goel, Anuj; Gong, Yan; Isaacs, Aaron; Kleber, Marcus E.; Leach, Irene Mateo; McDonough, Caitrin W.; Meijs, Matthijs F.L.; Melander, Olle; Nelson, Christopher P.; Nolte, Ilja M.; Pankratz, Nathan; Price, Tom S.; Shaffer, Jonathan; Shah, Sonia; Tomaszewski, Maciej; van der Most, Peter J.; Van Iperen, Erik P.A.; Vonk, Judith M.; Witkowska, Kate; Wong, Caroline O.L.; Zhang, Li; Beitelshees, Amber L.; Berenson, Gerald S.; Bhatt, Deepak L.; Brown, Morris; Burt, Amber; Cooper-DeHoff, Rhonda M.; Connell, John M.; Cruickshanks, Karen J.; Curtis, Sean P.; Davey-Smith, George; Delles, Christian; Gansevoort, Ron T.; Guo, Xiuqing; Haiqing, Shen; Hastie, Claire E.; Hofker, Marten H.; Hovingh, G. Kees; Kim, Daniel S.; Kirkland, Susan A.; Klein, Barbara E.; Klein, Ronald; Li, Yun R.; Maiwald, Steffi; Newton-Cheh, Christopher; O’Brien, Eoin T.; Onland-Moret, N. Charlotte; Palmas, Walter; Parsa, Afshin; Penninx, Brenda W.; Pettinger, Mary; Vasan, Ramachandran S.; Ranchalis, Jane E.; Ridker, Paul M.; Rose, Lynda M.; Sever, Peter; Shimbo, Daichi; Steele, Laura; Stolk, Ronald P.; Thorand, Barbara; Trip, Mieke D.; van Duijn, Cornelia M.; Verschuren, W. Monique; Wijmenga, Cisca; Wyatt, Sharon; Young, J. Hunter; Zwinderman, Aeilko H.; Bezzina, Connie R.; Boerwinkle, Eric; Casas, Juan P.; Caulfield, Mark J.; Chakravarti, Aravinda; Chasman, Daniel I.; Davidson, Karina W.; Doevendans, Pieter A.; Dominiczak, Anna F.; FitzGerald, Garret A.; Gums, John G.; Fornage, Myriam; Hakonarson, Hakon; Halder, Indrani; Hillege, Hans L.; Illig, Thomas; Jarvik, Gail P.; Johnson, Julie A.; Kastelein, John J.P.; Koenig, Wolfgang; Kumari, Meena; März, Winfried; Murray, Sarah S.; O’Connell, Jeffery R.; Oldehinkel, Albertine J.; Pankow, James S.; Rader, Daniel J.; Redline, Susan; Reilly, Muredach P.; Schadt, Eric E.; Kottke-Marchant, Kandice; Snieder, Harold; Snyder, Michael; Stanton, Alice V.; Tobin, Martin D.; Uitterlinden, André G.; van der Harst, Pim; van der Schouw, Yvonne T.; Samani, Nilesh J.; Watkins, Hugh; Johnson, Andrew D.; Reiner, Alex P.; Zhu, Xiaofeng; de Bakker, Paul I.W.; Levy, Daniel; Asselbergs, Folkert W.; Munroe, Patricia B.; Keating, Brendan J.

    2014-01-01

    Blood pressure (BP) is a heritable risk factor for cardiovascular disease. To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP), and pulse pressure (PP), we genotyped ∼50,000 SNPs in up to 87,736 individuals of European ancestry and combined these in a meta-analysis. We replicated findings in an independent set of 68,368 individuals of European ancestry. Our analyses identified 11 previously undescribed associations in independent loci containing 31 genes including PDE1A, HLA-DQB1, CDK6, PRKAG2, VCL, H19, NUCB2, RELA, HOXC@ complex, FBN1, and NFAT5 at the Bonferroni-corrected array-wide significance threshold (p < 6 × 10^-7) and confirmed 27 previously reported associations. Bioinformatic analysis of the 11 loci provided support for a putative role in hypertension of several genes, such as CDK6 and NUCB2. Analysis of potential pharmacological targets in databases of small molecules showed that ten of the genes are predicted to be a target for small molecules. In summary, we identified previously unknown loci associated with BP. Our findings extend our understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification. PMID:24560520
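
    The Bonferroni correction mentioned above divides the desired family-wise error rate by the number of tests. The sketch below shows that arithmetic. It assumes a family-wise rate of 0.05, which the abstract does not state, and the function names are hypothetical. The abstract gives only the approximate SNP count and the final threshold, so the exact number of tests the authors corrected for is not recoverable from it.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of tests that a given Bonferroni threshold corresponds to at level alpha."""
    return alpha / threshold


if __name__ == "__main__":
    # Assumed alpha = 0.05; roughly 50,000 SNPs as stated in the abstract.
    print(bonferroni_threshold(0.05, 50_000))
    # The reported threshold is stricter than that, so it corresponds to more tests.
    print(round(implied_tests(0.05, 6e-7)))
```

    Under the assumed 0.05 level, correcting for 50,000 tests alone would give a threshold near 1 × 10^-6, so the stricter reported threshold suggests the authors accounted for more tests than the SNP count alone.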

  4. Quantitative analysis of bristle number in Drosophila mutants identifies genes involved in neural development

    NASA Technical Reports Server (NTRS)

    Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.

    2003-01-01

    BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.

  5. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total number of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expressions of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an isoflavone synthase gene) is

  6. Transcriptome analysis identifies genes involved in ethanol response of Saccharomyces cerevisiae in Agave tequilana juice.

    PubMed

    Ramirez-Córdova, Jesús; Drnevich, Jenny; Madrigal-Pulido, Jaime Alberto; Arrizon, Javier; Allen, Kirk; Martínez-Velázquez, Moisés; Alvarez-Maya, Ikuri

    2012-08-01

    During ethanol fermentation, yeast cells are exposed to stress due to the accumulation of ethanol; cell growth is altered and the output of the target product is reduced. For Agave beverages, like tequila, no reports have been published on the global gene expression under ethanol stress. In this work, we used microarray analysis to identify Saccharomyces cerevisiae genes involved in the ethanol response. Gene expression of a tequila yeast strain of S. cerevisiae (AR5) was explored by comparing global gene expression with that of laboratory strain S288C, both after ethanol exposure. Additionally, we used two different culture conditions: cells grown in Agave tequilana juice as a natural fermentation medium or grown in yeast-extract peptone dextrose as an artificial medium. Of the 6368 S. cerevisiae genes in the microarray, 657 genes were identified that had different expression responses to ethanol stress due to strain and/or media. A cluster of 28 genes was found over-expressed specifically in the AR5 tequila strain that could be involved in the adaptation to tequila yeast fermentation, 14 of which are unknown, such as yor343c, ylr162w, ygr182c, ymr265c, yer053c-a or ydr415c. These could be the most suitable genes for transforming tequila yeast to increase ethanol tolerance in the tequila fermentation process. Other genes involved in response to stress (RFC4, TSA1, MLH1, PAU3, RAD53) or transport (CYB2, TIP20, QCR9) were expressed in the same cluster. Unknown genes could be good candidates for the development of recombinant yeasts with ethanol tolerance for use in industrial tequila fermentation.

  7. Identifying key genes associated with acute myocardial infarction

    PubMed Central

    Cheng, Ming; An, Shoukuan; Li, Junquan

    2017-01-01

    Background: This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Methods: Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. Result: A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. Conclusion: A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs. PMID:29049183

  8. Genes for seed longevity in barley identified by genomic analysis on Near Isogenic Lines.

    PubMed

    Wozny, Dorothee; Kramer, Katharina; Finkemeier, Iris; Acosta, Ivan F; Koornneef, Maarten

    2018-05-09

    Genes controlling differences in seed longevity between two barley (Hordeum vulgare) accessions were identified by combining quantitative genetics 'omics' technologies in Near Isogenic Lines (NILs). The NILs were derived from crosses between the spring barley landraces L94 from Ethiopia and Cebada Capa from Argentina. A combined transcriptome and proteome analysis on mature, non-aged seeds of the two parental lines and the L94 NILs by RNA-sequencing and total seed proteomic profiling identified the UDP-glycosyltransferase MLOC_11661.1 as candidate gene for the QTL on 2H, and the NADP-dependent malic enzyme (NADP-ME) MLOC_35785.1 as possible downstream target gene. To validate these candidates, they were expressed in Arabidopsis under the control of constitutive promoters to attempt complementing the T-DNA knock-out line nadp-me1. Both the NADP-ME MLOC_35785.1 and the UDP-glycosyltransferase MLOC_11661.1 were able to rescue the nadp-me1 seed longevity phenotype. In the case of the UDP-glycosyltransferase, with high accumulation in NILs, only the coding sequence of Cebada Capa had a rescue effect.

  9. A Stratified Transcriptomics Analysis of Polygenic Fat and Lean Mouse Adipose Tissues Identifies Novel Candidate Obesity Genes

    PubMed Central

    Morton, Nicholas M.; Nelson, Yvonne B.; Michailidou, Zoi; Di Rollo, Emma M.; Ramage, Lynne; Hadoke, Patrick W. F.; Seckl, Jonathan R.; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J.; Dunbar, Donald R.

    2011-01-01

    Background Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome-wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. Results To enrich for adipose tissue obesity genes a ‘snap-shot’ pooled-sample transcriptome comparison of key fat depots and non-adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity. Conclusions A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes

  10. Gene-centric meta-analysis in 87,736 individuals of European ancestry identifies multiple blood-pressure-related loci.

    PubMed

    Tragante, Vinicius; Barnes, Michael R; Ganesh, Santhi K; Lanktree, Matthew B; Guo, Wei; Franceschini, Nora; Smith, Erin N; Johnson, Toby; Holmes, Michael V; Padmanabhan, Sandosh; Karczewski, Konrad J; Almoguera, Berta; Barnard, John; Baumert, Jens; Chang, Yen-Pei Christy; Elbers, Clara C; Farrall, Martin; Fischer, Mary E; Gaunt, Tom R; Gho, Johannes M I H; Gieger, Christian; Goel, Anuj; Gong, Yan; Isaacs, Aaron; Kleber, Marcus E; Mateo Leach, Irene; McDonough, Caitrin W; Meijs, Matthijs F L; Melander, Olle; Nelson, Christopher P; Nolte, Ilja M; Pankratz, Nathan; Price, Tom S; Shaffer, Jonathan; Shah, Sonia; Tomaszewski, Maciej; van der Most, Peter J; Van Iperen, Erik P A; Vonk, Judith M; Witkowska, Kate; Wong, Caroline O L; Zhang, Li; Beitelshees, Amber L; Berenson, Gerald S; Bhatt, Deepak L; Brown, Morris; Burt, Amber; Cooper-DeHoff, Rhonda M; Connell, John M; Cruickshanks, Karen J; Curtis, Sean P; Davey-Smith, George; Delles, Christian; Gansevoort, Ron T; Guo, Xiuqing; Haiqing, Shen; Hastie, Claire E; Hofker, Marten H; Hovingh, G Kees; Kim, Daniel S; Kirkland, Susan A; Klein, Barbara E; Klein, Ronald; Li, Yun R; Maiwald, Steffi; Newton-Cheh, Christopher; O'Brien, Eoin T; Onland-Moret, N Charlotte; Palmas, Walter; Parsa, Afshin; Penninx, Brenda W; Pettinger, Mary; Vasan, Ramachandran S; Ranchalis, Jane E; Ridker, Paul M; Rose, Lynda M; Sever, Peter; Shimbo, Daichi; Steele, Laura; Stolk, Ronald P; Thorand, Barbara; Trip, Mieke D; van Duijn, Cornelia M; Verschuren, W Monique; Wijmenga, Cisca; Wyatt, Sharon; Young, J Hunter; Zwinderman, Aeilko H; Bezzina, Connie R; Boerwinkle, Eric; Casas, Juan P; Caulfield, Mark J; Chakravarti, Aravinda; Chasman, Daniel I; Davidson, Karina W; Doevendans, Pieter A; Dominiczak, Anna F; FitzGerald, Garret A; Gums, John G; Fornage, Myriam; Hakonarson, Hakon; Halder, Indrani; Hillege, Hans L; Illig, Thomas; Jarvik, Gail P; Johnson, Julie A; Kastelein, John J P; Koenig, Wolfgang; Kumari, Meena; März, Winfried; Murray, Sarah S; O'Connell, Jeffery R; Oldehinkel, Albertine J; Pankow, James S; Rader, Daniel J; Redline, Susan; Reilly, Muredach P; Schadt, Eric E; Kottke-Marchant, Kandice; Snieder, Harold; Snyder, Michael; Stanton, Alice V; Tobin, Martin D; Uitterlinden, André G; van der Harst, Pim; van der Schouw, Yvonne T; Samani, Nilesh J; Watkins, Hugh; Johnson, Andrew D; Reiner, Alex P; Zhu, Xiaofeng; de Bakker, Paul I W; Levy, Daniel; Asselbergs, Folkert W; Munroe, Patricia B; Keating, Brendan J

    2014-03-06

    Blood pressure (BP) is a heritable risk factor for cardiovascular disease. To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP), and pulse pressure (PP), we genotyped ~50,000 SNPs in up to 87,736 individuals of European ancestry and combined these in a meta-analysis. We replicated findings in an independent set of 68,368 individuals of European ancestry. Our analyses identified 11 previously undescribed associations in independent loci containing 31 genes including PDE1A, HLA-DQB1, CDK6, PRKAG2, VCL, H19, NUCB2, RELA, HOXC@ complex, FBN1, and NFAT5 at the Bonferroni-corrected array-wide significance threshold (p < 6 × 10^-7) and confirmed 27 previously reported associations. Bioinformatic analysis of the 11 loci provided support for a putative role in hypertension of several genes, such as CDK6 and NUCB2. Analysis of potential pharmacological targets in databases of small molecules showed that ten of the genes are predicted to be a target for small molecules. In summary, we identified previously unknown loci associated with BP. Our findings extend our understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  11. Meta-analysis identifies gene-by-environment interactions as demonstrated in a study of 4,965 mice.

    PubMed

    Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha J; Shih, Diana; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar

    2014-01-01

    Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in High-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta-analysis
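
    The idea that between-study heterogeneity can stand in for a gene-by-environment signal can be illustrated with a standard random-effects calculation. The sketch below uses the textbook DerSimonian-Laird estimator rather than the authors' own method, and the function name and inputs (per-study effect sizes and standard errors for one locus) are assumptions made for illustration only.

```python
def random_effects_meta(betas, ses):
    """DerSimonian-Laird random-effects meta-analysis for one locus.

    betas: per-study effect estimates; ses: their standard errors.
    Returns the fixed-effect estimate, Cochran's Q, its degrees of freedom,
    the between-study variance tau^2, and the random-effects estimate.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Inverse-variance weights under the fixed-effect model.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fixed = sum(wi * bi for wi, bi in zip(w, betas)) / sum_w

    # Cochran's Q measures how much the studies disagree with each other.
    q = sum(wi * (bi - beta_fixed) ** 2 for wi, bi in zip(w, betas))
    df = len(betas) - 1

    # Method-of-moments estimate of the between-study variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta_random = sum(wi * bi for wi, bi in zip(w_star, betas)) / sum(w_star)

    return {"beta_fixed": beta_fixed, "q": q, "df": df,
            "tau2": tau2, "beta_random": beta_random}


# Hypothetical locus whose effect flips sign between two environments.
result = random_effects_meta([0.5, -0.5], [0.1, 0.1])
print(result)
```

    A large tau^2 (or Q well above its degrees of freedom) at a locus means the effect differs across studies, which is the pattern read as environment-dependent in this setting.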

  12. Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice

    PubMed Central

    Joo, Jong Wha J.; Shih, Diana; Davis, Richard C.; Lusis, Aldons J.; Eskin, Eleazar

    2014-01-01

    Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in High-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta-analysis

  13. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    PubMed

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with the summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value = 2.27 × 10^-10), MGC57346 (p value = 6.92 × 10^-7), BLK (p value = 1.01 × 10^-6), XKR6 (p value = 1.11 × 10^-6), C17ORF69 (p value = 1.12 × 10^-6) and KIAA1267 (p value = 4.00 × 10^-6). Gene set enrichment analysis observed a significant association for the Chr8p23 gene set (false discovery rate = 0.033). Our results provide novel clues for studies of the genetic mechanism of neuroticism. Copyright © 2017. Published by Elsevier Inc.
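
    For readers unfamiliar with summary data-based Mendelian randomization, the following is a minimal sketch of the single-SNP SMR statistic in its commonly described form: the effect of expression on the trait is the ratio of the SNP-trait and SNP-expression effects, and the test statistic combines the two z-scores. It does not call the SMR software, and the function name and input values are hypothetical.

```python
import math


def smr_test(b_zx, se_zx, b_zy, se_zy):
    """Approximate SMR test from summary statistics of one top cis-eQTL SNP.

    b_zx, se_zx: SNP effect on gene expression (eQTL) and its standard error.
    b_zy, se_zy: SNP effect on the trait (GWAS) and its standard error.
    Returns (b_xy, t_smr, p_value).
    """
    if b_zx == 0 or se_zx <= 0 or se_zy <= 0:
        raise ValueError("eQTL effect must be non-zero and SEs positive")

    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy

    # Estimated effect of expression on the trait (ratio estimator).
    b_xy = b_zy / b_zx

    # Test statistic, approximately chi-square with 1 degree of freedom.
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)

    # Upper tail of chi-square(1): P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value


# Hypothetical summary statistics, for illustration only.
b_xy, t_smr, p = smr_test(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
print(b_xy, t_smr, p)
```

    Because the statistic is bounded by the smaller of the two squared z-scores, a gene reaches significance only when both the eQTL and the GWAS association are strong.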

  14. ENU Mutagenesis in Mice Identifies Candidate Genes For Hypogonadism

    PubMed Central

    Weiss, Jeffrey; Hurley, Lisa A.; Harris, Rebecca M.; Finlayson, Courtney; Tong, Minghan; Fisher, Lisa A.; Moran, Jennifer L.; Beier, David R.; Mason, Christopher; Jameson, J. Larry

    2012-01-01

    Genome-wide mutagenesis was performed in mice to identify candidate genes for male infertility, for which the predominant causes remain idiopathic. Mice were mutagenized using N-ethyl-N-nitrosourea (ENU), bred, and screened for phenotypes associated with the male urogenital system. Fifteen heritable lines were isolated and chromosomal loci were assigned using low density genome-wide SNP arrays. Ten of the fifteen lines were pursued further using higher resolution SNP analysis to narrow the candidate gene regions. Exon sequencing of candidate genes identified mutations in mice with cystic kidneys (Bicc1), cryptorchidism (Rxfp2), restricted germ cell deficiency (Plk4), and severe germ cell deficiency (Prdm9). In two other lines with severe hypogonadism candidate sequencing failed to identify mutations, suggesting defects in genes with previously undocumented roles in gonadal function. These genomic intervals were sequenced in their entirety and a candidate mutation was identified in SnrpE in one of the two lines. The line harboring the SnrpE variant retains substantial spermatogenesis despite small testis size, an unusual phenotype. In addition to the reproductive defects, heritable phenotypes were observed in mice with ataxia (Myo5a), tremors (Pmp22), growth retardation (unknown gene), and hydrocephalus (unknown gene). These results demonstrate that the ENU screen is an effective tool for identifying potential causes of male infertility. PMID:22258617

  15. Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis.

    PubMed

    Zhu, Hong; Xia, Wei; Mo, Xing-Bo; Lin, Xiang; Qiu, Ying-Hua; Yi, Neng-Jun; Zhang, Yong-Hong; Deng, Fei-Yan; Lei, Shu-Feng

    2016-01-01

    Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interaction analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected 'highly verified' genes were measured by ELISA among our in-house RA cases and controls. A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 'overlapped', 76 'European-specific' and 74 'Asian-specific' genes. Among them, 105 genes had significant differential expression between RA patients and healthy controls in at least one dataset, especially 20 genes including 11 'overlapped' (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 'European-specific' (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 'Asian-specific' (RNASET2, HFE, BTN2A2, MAPK13) genes whose differential expression was significant in at least three datasets. The protein expression of two selected genes, FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02), in plasma was significantly different in our in-house samples. Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes in RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and detected a long list of overlapped or ethnic-specific RA genes. The

  16. Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

    PubMed

    Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

    2017-10-01

    The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on the benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3), were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The inference method with the best performance was selected to construct the GRN. Subsequently, topological centrality analysis (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6, EPHA2 and GSTT1, and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
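
    As a rough illustration of how topological centrality can rank genes in a network, the sketch below computes degree and closeness centrality on a small undirected graph (betweenness is omitted for brevity). The edge list and node names G1 to G5 are invented placeholders, not genes or interactions from the study, and the network is treated as undirected for simplicity.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable-node count divided by the summed shortest-path distances.

    Distances come from a breadth-first search; in a disconnected graph the
    score only reflects the node's own component.
    """
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: G1 is a hub, G5 hangs off G4.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]
adj = build_adjacency(edges)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)

# Rank candidate key genes by degree, breaking ties by closeness.
ranked = sorted(adj, key=lambda g: (degree[g], closeness[g]), reverse=True)
print(ranked)
```

    Genes that score highly on several centrality measures at once are the natural candidates for "key genes" in this kind of analysis.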

  17. Suppression subtractive hybridization and comparative expression analysis to identify developmentally regulated genes in filamentous fungi.

    PubMed

    Gesing, Stefan; Schindler, Daniel; Nowrousian, Minou

    2013-09-01

    Ascomycetes differentiate four major morphological types of fruiting bodies (apothecia, perithecia, pseudothecia and cleistothecia) that are derived from an ancestral fruiting body. Thus, fruiting body differentiation is most likely controlled by a set of common core genes. One way to identify such genes is to search for genes with evolutionary conserved expression patterns. Using suppression subtractive hybridization (SSH), we selected differentially expressed transcripts in Pyronema confluens (Pezizales) by comparing two cDNA libraries specific for sexual and for vegetative development, respectively. The expression patterns of selected genes from both libraries were verified by quantitative real time PCR. Expression of several corresponding homologous genes was found to be conserved in two members of the Sordariales (Sordaria macrospora and Neurospora crassa), a derived group of ascomycetes that is only distantly related to the Pezizales. Knockout studies with N. crassa orthologues of differentially regulated genes revealed a functional role during fruiting body development for the gene NCU05079, encoding a putative MFS peptide transporter. These data indicate conserved gene expression patterns and a functional role of the corresponding genes during fruiting body development; such genes are candidates of choice for further functional analysis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Meta-analysis identifies five novel loci associated with endometriosis highlighting key genes involved in hormone metabolism

    PubMed Central

    Sapkota, Yadav; Steinthorsdottir, Valgerdur; Morris, Andrew P.; Fassbender, Amelie; Rahmioglu, Nilufer; De Vivo, Immaculata; Buring, Julie E.; Zhang, Futao; Edwards, Todd L.; Jones, Sarah; O, Dorien; Peterse, Daniëlle; Rexrode, Kathryn M.; Ridker, Paul M.; Schork, Andrew J.; MacGregor, Stuart; Martin, Nicholas G.; Becker, Christian M.; Adachi, Sosuke; Yoshihara, Kosuke; Enomoto, Takayuki; Takahashi, Atsushi; Kamatani, Yoichiro; Matsuda, Koichi; Kubo, Michiaki; Thorleifsson, Gudmar; Geirsson, Reynir T.; Thorsteinsdottir, Unnur; Wallace, Leanne M.; Werge, Thomas M.; Thompson, Wesley K.; Yang, Jian; Velez Edwards, Digna R.; Nyegaard, Mette; Low, Siew-Kee; Zondervan, Krina T.; Missmer, Stacey A.; D'Hooghe, Thomas; Montgomery, Grant W.; Chasman, Daniel I.; Stefansson, Kari; Tung, Joyce Y.; Nyholt, Dale R.

    2017-01-01

    Endometriosis is a heritable hormone-dependent gynecological disorder, associated with severe pelvic pain and reduced fertility; however, its molecular mechanisms remain largely unknown. Here we perform a meta-analysis of 11 genome-wide association case-control data sets, totalling 17,045 endometriosis cases and 191,596 controls. In addition to replicating previously reported loci, we identify five novel loci significantly associated with endometriosis risk (P < 5 × 10^-8), implicating genes involved in sex steroid hormone pathways (FN1, CCDC170, ESR1, SYNE1 and FSHB). Conditional analysis identified five secondary association signals, including two at the ESR1 locus, resulting in 19 independent single nucleotide polymorphisms (SNPs) robustly associated with endometriosis, which together explain up to 5.19% of variance in endometriosis. These results highlight novel variants in or near specific genes with important roles in sex steroid hormone signalling and function, and offer unique opportunities for more targeted functional research efforts. PMID:28537267

  19. An EST-based analysis identifies new genes and reveals distinctive gene expression features of Coffea arabica and Coffea canephora

    PubMed Central

    2011-01-01

    Background Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora are responsible for 70 and 30% of commercial production, respectively. C. arabica is an allotetraploid from a recent hybridization of the diploid species, C. canephora and C. eugenioides. C. arabica has lower genetic diversity and results in a higher quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency. Results Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. Conclusion We present the first comprehensive genome-wide transcript
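
    Two of the measures mentioned in this abstract are simple enough to sketch directly: GC3 (the GC content at third codon positions) and the usual reading of a KA/KS ratio, where values below 1 suggest purifying selection and values above 1 suggest positive selection. The function names, the example sequence and the tolerance band around 1 are assumptions for illustration, not values taken from the study.

```python
def gc3(cds):
    """Fraction of complete codons whose third base is G or C."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third_bases = [seq[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence has no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def selection_regime(ka, ks, tol=0.05):
    """Classify a gene pair by its KA/KS ratio.

    tol is a hypothetical tolerance: ratios within 1 +/- tol are called
    neutral instead of being forced into one of the other two classes.
    """
    if ks <= 0:
        raise ValueError("KS must be positive to form the ratio")
    ratio = ka / ks
    if ratio > 1 + tol:
        return "positive"
    if ratio < 1 - tol:
        return "purifying"
    return "neutral"


# Made-up coding sequence: codons ATG, GCC, AAA.
print(gc3("ATGGCCAAA"))
print(selection_regime(ka=0.1, ks=0.5))
```

    In practice KA and KS are estimated from codon alignments by dedicated tools; the classification step shown here only interprets the resulting ratio.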

  20. Comparative Analysis of the Full Genome of Helicobacter pylori Isolate Sahul64 Identifies Genes of High Divergence

    PubMed Central

    Lu, Wei; Wise, Michael J.; Tay, Chin Yen; Windsor, Helen M.; Marshall, Barry J.; Peacock, Christopher

    2014-01-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publically available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains. PMID:24375107
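
    The core-genome and pan-genome figures quoted above follow from plain set operations: the core genome is the intersection of the gene sets of all strains, and the pan-genome is their union. The sketch below shows why adding one divergent strain shrinks the core and grows the pan-genome; the strain and gene identifiers are invented placeholders.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene sets for a mapping of strain -> gene IDs."""
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = [set(genes) for genes in genomes.values()]
    core = set.intersection(*gene_sets)  # genes present in every strain
    pan = set.union(*gene_sets)          # genes present in any strain
    return core, pan


# Hypothetical gene content for two reference strains.
genomes = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
}
core_before, pan_before = core_and_pan(genomes)

# Adding a divergent strain can only shrink the core and grow the pan-genome.
genomes["strainC"] = {"g1", "g5"}
core_after, pan_after = core_and_pan(genomes)

print(len(core_before), len(pan_before))
print(len(core_after), len(pan_after))
```

    Real analyses first cluster genes into orthologous groups, so the identifiers compared here would be orthologue cluster IDs rather than raw gene names.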

  1. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    PubMed

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publically available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.

  2. Candidate chemosensory genes identified in the endoparasitoid Meteorus pulchricornis (Hymenoptera: Braconidae) by antennal transcriptome analysis.

    PubMed

    Sheng, Sheng; Liao, Cheng-Wu; Zheng, Yu; Zhou, Yu; Xu, Yan; Song, Wen-Miao; He, Peng; Zhang, Jian; Wu, Fu-An

    2017-06-01

    Meteorus pulchricornis is an endoparasitoid wasp which attacks the larvae of various lepidopteran pests. We present the first antennal transcriptome dataset for M. pulchricornis. A total of 48,845,072 clean reads were obtained and 34,967 unigenes were assembled. Of these, 15,458 unigenes showed a significant similarity (E-value < 10^-5) to known proteins in the NCBI non-redundant protein database. Gene ontology (GO) and cluster of orthologous groups (COG) analyses were used to classify the functions of M. pulchricornis antennae genes. We identified 16 putative odorant-binding protein (OBP) genes, eight chemosensory protein (CSP) genes, 99 olfactory receptor (OR) genes, 19 ionotropic receptor (IR) genes and one sensory neuron membrane protein (SNMP) gene. BLASTx best hit results and phylogenetic analysis both indicated that these chemosensory genes were most closely related to those found in other hymenopteran species. Real-time quantitative PCR assays showed that 14 MpulOBP genes were antennae-specific. Of these, MpulOBP6, MpulOBP9, MpulOBP10, MpulOBP12, MpulOBP15 and MpulOBP16 were found to have greater expression in the antennae than in other body parts, while MpulOBP2 and MpulOBP3 were expressed predominately in the legs and abdomens, respectively. These results might provide a foundation for future studies of olfactory genes and chemoreception in M. pulchricornis. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. NIH Researchers Identify OCD Risk Gene

    MedlinePlus

    ... News From NIH NIH Researchers Identify OCD Risk Gene Past Issues / Summer 2006 Table of Contents For ... and Alcoholism (NIAAA) have identified a previously unknown gene variant that doubles an individual's risk for obsessive- ...

  4. [Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

    PubMed

    Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

    2012-07-01

    In order to screen out important genes from the large gene datasets produced by microarrays after nerve injury, we combined the gene ontology (GO) method with computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of the screened-out genes. Data mining and gene ontology analysis of the gene chip dataset GSE26350 were carried out in MATLAB. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' GO terms and their positions on the score map of principal components. Functional interference was used to disrupt the normal binding of Cd44 to one of its ligands, chondroitin sulfate C (CSC), and neurite extension was observed. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity and protein binding. Cd44 is one of six effector protein genes and attracted our attention because of its functional diversity. After different reagents were added to the medium to interfere with the normal binding of CSC and Cd44, CSC's inhibition of neurite extension was relieved to varying degrees. CSC can inhibit neurite extension by binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.

  5. Comparative phylogenetic analysis and transcriptional profiling of MADS-box gene family identified DAM and FLC-like genes in apple (Malus × domestica)

    PubMed Central

    Kumar, Gulshan; Arya, Preeti; Gupta, Khushboo; Randhawa, Vinay; Acharya, Vishal; Singh, Anil Kumar

    2016-01-01

    The MADS-box transcription factors play essential roles in various processes of plant growth and development. In the present study, phylogenetic analysis of 142 apple MADS-box proteins with that of other dicotyledonous species identified six putative Dormancy-Associated MADS-box (DAM) and four putative Flowering Locus C-like (FLC-like) proteins. In order to study the expression of apple MADS-box genes, RNA-seq analysis of 3 apical and 5 spur bud stages during dormancy, 6 flower stages and 7 fruit development stages was performed. The dramatic reduction in expression of two MdDAMs, MdMADS063 and MdMADS125 and two MdFLC-like genes, MdMADS135 and MdMADS136 during dormancy release suggests their role as flowering-repressors in apple. Apple orthologs of Arabidopsis genes, FLOWERING LOCUS T, FRIGIDA, SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 and LEAFY exhibit similar expression patterns as reported in Arabidopsis, suggesting functional conservation in floral signal integration and meristem determination pathways. Gene ontology enrichment analysis of predicted targets of DAM revealed their involvement in regulation of reproductive processes and meristematic activities, indicating functional conservation of SVP orthologs (DAM) in apple. This study provides valuable insights into the functions of MADS-box proteins during apple phenology, which may help in devising strategies to improve important traits in apple. PMID:26856238

  6. Comparative phylogenetic analysis and transcriptional profiling of MADS-box gene family identified DAM and FLC-like genes in apple (Malus × domestica).

    PubMed

    Kumar, Gulshan; Arya, Preeti; Gupta, Khushboo; Randhawa, Vinay; Acharya, Vishal; Singh, Anil Kumar

    2016-02-09

    The MADS-box transcription factors play essential roles in various processes of plant growth and development. In the present study, phylogenetic analysis of 142 apple MADS-box proteins with that of other dicotyledonous species identified six putative Dormancy-Associated MADS-box (DAM) and four putative Flowering Locus C-like (FLC-like) proteins. In order to study the expression of apple MADS-box genes, RNA-seq analysis of 3 apical and 5 spur bud stages during dormancy, 6 flower stages and 7 fruit development stages was performed. The dramatic reduction in expression of two MdDAMs, MdMADS063 and MdMADS125 and two MdFLC-like genes, MdMADS135 and MdMADS136 during dormancy release suggests their role as flowering-repressors in apple. Apple orthologs of Arabidopsis genes, FLOWERING LOCUS T, FRIGIDA, SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 and LEAFY exhibit similar expression patterns as reported in Arabidopsis, suggesting functional conservation in floral signal integration and meristem determination pathways. Gene ontology enrichment analysis of predicted targets of DAM revealed their involvement in regulation of reproductive processes and meristematic activities, indicating functional conservation of SVP orthologs (DAM) in apple. This study provides valuable insights into the functions of MADS-box proteins during apple phenology, which may help in devising strategies to improve important traits in apple.

  7. Epidermal growth factor gene is a newly identified candidate gene for gout.

    PubMed

    Han, Lin; Cao, Chunwei; Jia, Zhaotong; Liu, Shiguo; Liu, Zhen; Xin, Ruosai; Wang, Can; Li, Xinde; Ren, Wei; Wang, Xuefeng; Li, Changgui

    2016-08-10

    Chromosome 4q25 has been identified as a genomic region associated with gout. However, the associations of gout with the genes in this region have not yet been confirmed. Here, we performed a two-stage analysis to determine whether variations in candidate genes in the 4q25 region are associated with gout in a male Chinese Han population. We first evaluated 96 tag single nucleotide polymorphisms (SNPs) in eight inflammatory/immune pathway- or glucose/lipid metabolism-related genes in the 4q25 region in 480 male gout patients and 480 controls. The SNP rs12504538, located in the elongation of very-long-chain-fatty-acid-like family member 6 gene (Elovl6), was found to be associated with gout susceptibility (Padjusted = 0.00595). In the second stage of analysis, we performed fine mapping analysis of 93 tag SNPs in Elovl6 and in the epidermal growth factor gene (EGF) and its flanking regions in 1017 male gout patients and 1897 healthy male controls. We observed a significant association between the T allele of EGF rs2298999 and gout (odds ratio = 0.77, 95% confidence interval = 0.67-0.88, Padjusted = 6.42 × 10⁻³). These results provide the first evidence for an association between the EGF rs2298999 C/T polymorphism and gout. Our findings should be validated in additional populations.
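
    The odds ratio and 95% confidence interval quoted in this record can be illustrated with a minimal sketch. It assumes a plain 2 × 2 table of allele counts and a Wald interval on the log odds ratio; the abstract does not state which method or adjustment the authors used, and the counts below are hypothetical, not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts, chosen only to illustrate the arithmetic.
or_value, lower, upper = odds_ratio_ci(300, 700, 400, 600)
print(f"OR = {or_value:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

    An interval that lies entirely below 1, as in the record above, indicates that the allele is less frequent in cases than in controls.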

  8. Novel Myopia Genes and Pathways Identified From Syndromic Forms of Myopia

    PubMed Central

    Loughman, James; Wildsoet, Christine F.; Williams, Cathy; Guggenheim, Jeremy A.

    2018-01-01

    Purpose: To test the hypothesis that genes known to cause clinical syndromes featuring myopia also harbor polymorphisms contributing to nonsyndromic refractive errors. Methods: Clinical phenotypes and syndromes that have refractive errors as a recognized feature were identified using the Online Mendelian Inheritance in Man (OMIM) database. One hundred fifty-four unique causative genes were identified, of which 119 were specifically linked with myopia and 114 represented syndromic myopia (i.e., myopia and at least one other clinical feature). Myopia was the only refractive error listed for 98 genes and hyperopia was the only refractive error noted for 28 genes, with the remaining 28 genes linked to phenotypes with multiple forms of refractive error. Pathway analysis was carried out to find biological processes overrepresented within these sets of genes. Genetic variants located within 50 kb of the 119 myopia-related genes were evaluated for involvement in refractive error by analysis of summary statistics from genome-wide association studies (GWAS) conducted by the CREAM Consortium and 23andMe, using both single-marker and gene-based tests. Results: Pathway analysis identified several biological processes already implicated in refractive error development through prior GWAS analyses and animal studies, including extracellular matrix remodeling, focal adhesion, and axon guidance, supporting the research hypothesis. Novel pathways also implicated in myopia development included mannosylation, glycosylation, lens development, gliogenesis, and Schwann cell differentiation. Hyperopia was found to be linked to a different pattern of biological processes, mostly related to organogenesis. Comparison with GWAS findings further confirmed that syndromic myopia genes were enriched for genetic variants that influence refractive errors in the general population. Gene-based analyses implicated 21 novel candidate myopia genes (ADAMTS18, ADAMTS2, ADAMTSL4, AGK, ALDH18A1, ASXL1, COL4A1

  9. Genome-wide methylation analysis identifies a core set of hypermethylated genes in CIMP-H colorectal cancer.

    PubMed

    McInnes, Tyler; Zou, Donghui; Rao, Dasari S; Munro, Francesca M; Phillips, Vicky L; McCall, John L; Black, Michael A; Reeve, Anthony E; Guilford, Parry J

    2017-03-28

    Aberrant DNA methylation profiles are a characteristic of all known cancer types, epitomized by the CpG island methylator phenotype (CIMP) in colorectal cancer (CRC). Hypermethylation has been observed at CpG islands throughout the genome, but it is unclear which factors determine whether an individual island becomes methylated in cancer. DNA methylation in CRC was analysed using the Illumina HumanMethylation450K array. Differentially methylated loci were identified using Significance Analysis of Microarrays (SAM) and the Wilcoxon Signed Rank (WSR) test. Unsupervised hierarchical clustering was used to identify methylation subtypes in CRC. In this study we characterized the DNA methylation profiles of 94 CRC tissues and their matched normal counterparts. Consistent with previous studies, unsupervised hierarchical clustering of genome-wide methylation data identified three subtypes within the tumour samples, designated CIMP-H, CIMP-L and CIMP-N, that showed high, low and very low methylation levels, respectively. Differential methylation between normal and tumour samples was analysed at the individual CpG level and at the gene level. The distribution of hypermethylation in CIMP-N tumours showed high inter-tumour variability and appeared to be highly stochastic in nature, whereas CIMP-H tumours exhibited consistent hypermethylation at a subset of genes, in addition to a highly variable background of hypermethylated genes. EYA4, TFPI2 and TLX1 were hypermethylated in more than 90% of all tumours examined. One hundred thirty-two genes were hypermethylated in 100% of CIMP-H tumours studied and these were highly enriched for functions relating to skeletal system development (Bonferroni adjusted p value = 2.88E-15), segment specification (adjusted p value = 9.62E-11), embryonic development (adjusted p value = 1.52E-04), mesoderm development (adjusted p value = 1.14E-20), and ectoderm development (adjusted p value = 7.94E-16). Our genome-wide characterization of DNA

  10. Network inference analysis identifies an APRR2-like gene linked to pigment accumulation in tomato and pepper fruits.

    PubMed

    Pan, Yu; Bradley, Glyn; Pyke, Kevin; Ball, Graham; Lu, Chungui; Fray, Rupert; Marshall, Alexandra; Jayasuta, Subhalai; Baxter, Charles; van Wijk, Rik; Boyden, Laurie; Cade, Rebecca; Chapman, Natalie H; Fraser, Paul D; Hodgman, Charlie; Seymour, Graham B

    2013-03-01

    Carotenoids represent some of the most important secondary metabolites in the human diet, and tomato (Solanum lycopersicum) is a rich source of these health-promoting compounds. In this work, a novel and fruit-related regulator of pigment accumulation in tomato has been identified by artificial neural network inference analysis and its function validated in transgenic plants. A tomato fruit gene regulatory network was generated using artificial neural network inference analysis and transcription factor gene expression profiles derived from fruits sampled at various points during development and ripening. One of the transcription factor gene expression profiles with a sequence related to an Arabidopsis (Arabidopsis thaliana) ARABIDOPSIS PSEUDO RESPONSE REGULATOR2-LIKE gene (APRR2-Like) was up-regulated at the breaker stage in wild-type tomato fruits and, when overexpressed in transgenic lines, increased plastid number, area, and pigment content, enhancing the levels of chlorophyll in immature unripe fruits and carotenoids in red ripe fruits. Analysis of the transcriptome of transgenic lines overexpressing the tomato APRR2-Like gene revealed up-regulation of several ripening-related genes in the overexpression lines, providing a link between the expression of this tomato gene and the ripening process. A putative ortholog of the tomato APRR2-Like gene in sweet pepper (Capsicum annuum) was associated with pigment accumulation in fruit tissues. We conclude that the function of this gene is conserved across taxa and that it encodes a protein that has an important role in ripening.

  11. Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

    PubMed

    Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

    2016-01-01

    Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.

  12. A comparative gene analysis with rice identified orthologous group II HKT genes and their association with Na(+) concentration in bread wheat.

    PubMed

    Ariyarathna, H A Chandima K; Oldach, Klaus H; Francki, Michael G

    2016-01-19

    Although the HKT transporter genes ascertain some of the key determinants of crop salt tolerance mechanisms, the diversity and functional role of group II HKT genes are not clearly understood in bread wheat. The advanced knowledge of rice HKT genes and the whole genome sequence was, therefore, used in comparative gene analysis to identify orthologous wheat group II HKT genes and their role in trait variation under different saline environments. The four group II HKTs in rice identified two orthologous gene families from bread wheat, including the known TaHKT2;1 gene family and a new, distinctly different gene family designated as TaHKT2;2. A single copy of TaHKT2;2 was found on each homeologous chromosome arm 7AL, 7BL and 7DL and each gene was expressed in leaf blade, sheath and root tissues under non-stressed and 200 mM salt-stressed conditions. The proteins encoded by genes of the TaHKT2;2 family showed more than 93% amino acid sequence identity with one another but ≤52% amino acid identity with the proteins encoded by the TaHKT2;1 family. Specifically, variations in known critical domains predicted functional differences between the two protein families. Similar to orthologous rice genes on chromosome 6L, TaHKT2;1 and TaHKT2;2 genes were located approximately 3 kb apart on wheat chromosomes 7AL, 7BL and 7DL, forming a static syntenic block in the two species. The chromosomal region on 7AL containing TaHKT2;1 7AL-1 co-located with QTL for shoot Na(+) concentration and yield in some saline environments. The differences in copy number, gene sequences and encoded proteins between TaHKT2;2 homeologous genes and other group II HKT gene families within and across species likely reflect functional diversity for ion selectivity and transport in plants. Evidence indicated that neither TaHKT2;2 nor TaHKT2;1 was associated with primary root Na(+) uptake but TaHKT2;1 may be associated with trait variation for Na(+) exclusion and yield in some but not all saline environments.

  13. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa

    PubMed Central

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included ‘response to heat’, ‘response to reactive oxygen species (ROS)’, ‘response to temperature stimulus’, ‘response to abiotic stimulus’, and ‘MAPKKK cascade’. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa as a basal thermotolerance (BT) and short-term acquired thermotolerance (SAT) gene. CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data

  14. Joint genetic analysis of hippocampal size in mouse and human identifies a novel gene linked to neurodegenerative disease.

    PubMed

    Ashbrook, David G; Williams, Robert W; Lu, Lu; Stein, Jason L; Hibar, Derrek P; Nichols, Thomas E; Medland, Sarah E; Thompson, Paul M; Hager, Reinmar

    2014-10-03

    Variation in hippocampal volume has been linked to significant differences in memory, behavior, and cognition among individuals. To identify genetic variants underlying such differences and associated disease phenotypes, multinational consortia such as ENIGMA have used large magnetic resonance imaging (MRI) data sets in human GWAS studies. In addition, mapping studies in mouse model systems have identified genetic variants for brain structure variation with great power. A key challenge is to understand how genetically based differences in brain structure lead to the propensity to develop specific neurological disorders. We combine the largest human GWAS of brain structure with the largest mammalian model system, the BXD recombinant inbred mouse population, to identify novel genetic targets influencing brain structure variation that are linked to increased risk for neurological disorders. We first use a novel cross-species, comparative analysis using mouse and human genetic data to identify a candidate gene, MGST3, associated with adult hippocampus size in both systems. We then establish the coregulation and function of this gene in a comprehensive systems-analysis. We find that MGST3 is associated with hippocampus size and is linked to a group of neurodegenerative disorders, such as Alzheimer's.

  15. Common Marker Genes Identified from Various Sample Types for Systemic Lupus Erythematosus.

    PubMed

    Bing, Peng-Fei; Xia, Wei; Wang, Lan; Zhang, Yong-Hong; Lei, Shu-Feng; Deng, Fei-Yan

    2016-01-01

    Systemic lupus erythematosus (SLE) is a complex auto-immune disease. Gene expression studies have been conducted to identify SLE-related genes in various types of samples. It is unknown whether there are common marker genes that are significant for SLE but independent of sample type, which may have potential for follow-up translational research. The aim of this study is to identify common marker genes across various sample types for SLE. Based on four public microarray gene expression datasets for SLE covering three representative types of blood-borne samples (monocyte; peripheral blood mononuclear cell, PBMC; whole blood), we utilized three statistics (fold-change, FC; t-test p value; false discovery rate adjusted p value) to scrutinize genes simultaneously regulated with SLE across various sample types. For common marker genes, we conducted Gene Ontology enrichment analysis and Protein-Protein Interaction analysis to gain insights into their functions. We identified 10 common marker genes associated with SLE (IFI6, IFI27, IFI44L, OAS1, OAS2, EIF2AK2, PLSCR1, STAT1, RNASE2, and GSTO1). Significant up-regulation of IFI6, IFI27, and IFI44L with SLE was observed in all the studied sample types, though the FC was most striking in monocyte, compared with PBMC and whole blood (8.82-251.66 vs. 3.73-74.05 vs. 1.19-1.87). Eight of the above 10 genes (all except RNASE2 and GSTO1) interact with each other and with known SLE susceptibility genes, and participate in immune response, RNA and protein catabolism, and cell death. Our data suggest that there exist common marker genes across various sample types for SLE. The 10 common marker genes identified herein deserve follow-up studies to dissect their potential as diagnostic or therapeutic markers to predict SLE or treatment response.
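
    Two of the three statistics named in this record, fold-change and the false discovery rate adjusted p value, can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure for the FDR adjustment (the abstract does not say which procedure was used) and takes the raw t-test p values as given; all numbers are hypothetical.

```python
def fold_change(case_mean, control_mean):
    """Ratio of mean expression in cases to mean expression in controls."""
    return case_mean / control_mean


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four genes, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
print(fold_change(8.0, 2.0))
```

    A gene would then be retained only if it passes the chosen thresholds on all statistics in every sample type, which is the sense of "common marker genes" in the record.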

  16. Comprehensive Analysis of Gene Expression Profiles of Sepsis-Induced Multiorgan Failure Identified Its Valuable Biomarkers.

    PubMed

    Wang, Yumei; Yin, Xiaoling; Yang, Fang

    2018-02-01

    Sepsis is an inflammatory-related disease, and severe sepsis can induce multiorgan dysfunction, which is the most common cause of death of patients in noncoronary intensive care units. Progress in novel therapeutic strategies has had little impact on the mortality of severe sepsis, and unfortunately, its mechanisms still remain poorly understood. In this study, we analyzed gene expression profiles of severe sepsis with failure of lung, kidney, and liver for the identification of potential biomarkers. We first downloaded the gene expression profiles from the Gene Expression Omnibus and performed preprocessing of raw microarray data sets and identification of differentially expressed genes (DEGs) using the R programming software; then, significantly enriched functions of DEGs in lung, kidney, and liver failure sepsis samples were obtained from the Database for Annotation, Visualization, and Integrated Discovery; finally, a protein-protein interaction network was constructed for DEGs based on the STRING database, and network modules were obtained through the MCODE cluster method. As a result, lung failure sepsis had the highest number of DEGs (859), whereas kidney and liver failure sepsis samples had 178 and 175 DEGs, respectively. In addition, 17 overlaps were obtained among the three lists of DEGs. Biological processes related to immune and inflammatory response were found to be significantly enriched in DEGs. Network and module analysis identified four gene clusters in which all or most of the genes were upregulated. The expression changes of Icam1 and Socs3 were further validated through quantitative PCR analysis. This study should shed light on the development of sepsis and provide potential therapeutic targets for sepsis-induced multiorgan failure.

  17. Frameshift mutational target gene analysis identifies similarities and differences in constitutional mismatch repair-deficiency and Lynch syndrome.

    PubMed

    Maletzki, Claudia; Huehns, Maja; Bauer, Ingrid; Ripperger, Tim; Mork, Maureen M; Vilar, Eduardo; Klöcking, Sabine; Zettl, Heike; Prall, Friedrich; Linnebacher, Michael

    2017-07-01

    Mismatch-repair deficient (MMR-D) malignancies include Lynch Syndrome (LS), which is secondary to germline mutations in one of the MMR genes, and the rare childhood form of constitutional mismatch repair-deficiency (CMMR-D), caused by bi-allelic MMR gene mutations. A hallmark of LS-associated cancers is microsatellite instability (MSI), characterized by coding frameshift mutations (cFSM) in target genes. By contrast, tumors arising in CMMR-D patients are thought to display a somatic mutation pattern differing from LS. The main goal of this study was to identify cFSM in MSI target genes relevant in CMMR-D and to compare the spectrum of common somatic mutations, including alterations in the DNA polymerases POLE and POLD1, between LS and CMMR-D. CMMR-D-associated tumors harbored more somatic mutations compared to LS cases, especially in the TP53 gene and in POLE and POLD1, where novel mutations were additionally identified. Strikingly, MSI in classical mononucleotide markers BAT40 and CAT25 was frequent in CMMR-D cases. MSI-target gene analysis revealed mutations in CMMR-D-associated tumors, some of them known to be frequently hit in LS, such as RNaseT2, HT001, and TGFβR2. Our results imply a general role for these cFSM as potential new drivers of MMR-D tumorigenesis. © 2017 Wiley Periodicals, Inc.

  18. Epidermal growth factor gene is a newly identified candidate gene for gout

    PubMed Central

    Han, Lin; Cao, Chunwei; Jia, Zhaotong; Liu, Shiguo; Liu, Zhen; Xin, Ruosai; Wang, Can; Li, Xinde; Ren, Wei; Wang, Xuefeng; Li, Changgui

    2016-01-01

    Chromosome 4q25 has been identified as a genomic region associated with gout. However, the associations of gout with the genes in this region have not yet been confirmed. Here, we performed a two-stage analysis to determine whether variations in candidate genes in the 4q25 region are associated with gout in a male Chinese Han population. We first evaluated 96 tag single nucleotide polymorphisms (SNPs) in eight inflammatory/immune pathway- or glucose/lipid metabolism-related genes in the 4q25 region in 480 male gout patients and 480 controls. The SNP rs12504538, located in the elongation of very-long-chain-fatty-acid-like family member 6 gene (Elovl6), was found to be associated with gout susceptibility (Padjusted = 0.00595). In the second stage of analysis, we performed fine mapping analysis of 93 tag SNPs in Elovl6 and in the epidermal growth factor gene (EGF) and its flanking regions in 1017 male gout patients and 1897 healthy male controls. We observed a significant association between the T allele of EGF rs2298999 and gout (odds ratio = 0.77, 95% confidence interval = 0.67–0.88, Padjusted = 6.42 × 10⁻³). These results provide the first evidence for an association between the EGF rs2298999 C/T polymorphism and gout. Our findings should be validated in additional populations. PMID:27506295

  19. RNA-Seq Meta-analysis identifies genes in skeletal muscle associated with gain and intake across a multi-season study of crossbred beef steers.

    PubMed

    Keel, Brittney N; Zarek, Christina M; Keele, John W; Kuehn, Larry A; Snelling, Warren M; Oliver, William T; Freetly, Harvey C; Lindholm-Perry, Amanda K

    2018-06-04

    Feed intake and body weight gain are economically important inputs and outputs of beef production systems. The purpose of this study was to discover differentially expressed genes that will be robust for feed intake and gain across a large segment of the cattle industry. Transcriptomic studies often suffer from issues with reproducibility and cross-validation. One way to improve reproducibility is by integrating multiple datasets via meta-analysis. RNA sequencing (RNA-Seq) was performed on longissimus dorsi muscle from 80 steers (5 cohorts, each with 16 animals) selected from the outside fringe of a bivariate gain and feed intake distribution to understand the genes and pathways involved in feed efficiency. In each cohort, 16 steers were selected from one of four gain and feed intake phenotypes (n = 4 per phenotype) in a 2 × 2 factorial arrangement with gain and feed intake as main effect variables. Each cohort was analyzed as a single experiment using a generalized linear model and results from the 5 cohort analyses were combined in a meta-analysis to identify differentially expressed genes (DEG) across the cohorts. A total of 51 genes were differentially expressed for the main effect of gain, 109 genes for the intake main effect, and 11 genes for the gain × intake interaction (corrected P < 0.05). A jackknife sensitivity analysis showed that, in general, the meta-analysis produced robust DEGs for the two main effects and their interaction. Pathways identified from over-represented genes included mitochondrial energy production and oxidative stress pathways for the main effect of gain due to DEG including GPD1, NDUFA6, UQCRQ, ACTC1, and MGST3. For intake, metabolic pathways including amino acid biosynthesis and degradation were identified, and for the interaction analysis the pathways identified included GADD45, pyridoxal 5'-phosphate salvage, and caveolar-mediated endocytosis signaling. Variation among DEG identified by cohort suggests that

  20. Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

    PubMed Central

    Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

    2016-01-01

    Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides. PMID:27625674

  1. Microarray analysis identifies candidate genes for key roles in coral development

    PubMed Central

    Grasso, Lauretta C; Maindonald, John; Rudd, Stephen; Hayward, David C; Saint, Robert; Miller, David J; Ball, Eldon E

    2008-01-01

    Background Anthozoan cnidarians are amongst the simplest animals at the tissue level of organization, but are surprisingly complex and vertebrate-like in terms of gene repertoire. As major components of tropical reef ecosystems, the stony corals are anthozoans of particular ecological significance. To better understand the molecular bases of both cnidarian development in general and coral-specific processes such as skeletogenesis and symbiont acquisition, microarray analysis was carried out through the period of early development – when skeletogenesis is initiated, and symbionts are first acquired. Results Of 5081 unique peptide coding genes, 1084 were differentially expressed (P ≤ 0.05) in comparisons between four different stages of coral development, spanning key developmental transitions. Genes of likely relevance to the processes of settlement, metamorphosis, calcification and interaction with symbionts were characterised further and their spatial expression patterns investigated using whole-mount in situ hybridization. Conclusion This study is the first large-scale investigation of developmental gene expression for any cnidarian, and has provided candidate genes for key roles in many aspects of coral biology, including calcification, metamorphosis and symbiont uptake. One surprising finding is that some of these genes have clear counterparts in higher animals but are not present in the closely-related sea anemone Nematostella. Secondly, coral-specific processes (i.e. traits which distinguish corals from their close relatives) may be analogous to similar processes in distantly related organisms. This first large-scale application of microarray analysis demonstrates the potential of this approach for investigating many aspects of coral biology, including the effects of stress and disease. PMID:19014561

  2. Applying Multivariate Adaptive Splines to Identify Genes With Expressions Varying After Diagnosis in Microarray Experiments.

    PubMed

    Duan, Fenghai; Xu, Ye

    2017-01-01

    The aim was to analyze a microarray experiment to identify the genes with expressions varying after the diagnosis of breast cancer. A total of 44 928 probe sets in an Affymetrix microarray data set, publicly available on Gene Expression Omnibus, from 249 patients with breast cancer were analyzed by nonparametric multivariate adaptive splines. Then, the identified genes with turning points were grouped by K-means clustering, and their network relationship was subsequently analyzed by Ingenuity Pathway Analysis. In total, 1640 probe sets (genes) were reliably identified to have turning points along with the age at diagnosis in their expression profiling, of which 927 were expressed at lower levels after the turning points and 713 were expressed at higher levels after the turning points. K-means clustered them into 3 groups with turning points centering at 54, 62.5, and 72, respectively. The pathway analysis showed that the identified genes were actively involved in various cancer-related functions or networks. In this article, we applied the nonparametric multivariate adaptive splines method to a publicly available gene expression data set and successfully identified genes with expressions varying before and after breast cancer diagnosis.

  3. Similarity of markers identified from cancer gene expression studies: observations from GEO.

    PubMed

    Shi, Xingjie; Shen, Shihao; Liu, Jin; Huang, Jian; Zhou, Yong; Ma, Shuangge

    2014-09-01

    Gene expression profiling has been extensively conducted in cancer research. The analysis of multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct multi-dataset analysis and are interested in evaluating the similarity of cancer-associated genes identified from different datasets. The first objective of this study is to briefly review some statistical methods that can be used for such evaluation. Both marginal analysis and joint analysis methods are reviewed. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancers. Our analysis suggests that for the same cancer, the marker identification results may vary significantly across datasets, and different datasets share few common genes. In addition, datasets on different cancers share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  4. Identifying candidate genes for Type 2 Diabetes Mellitus and obesity through gene expression profiling in multiple tissues or cells.

    PubMed

    Chen, Junhui; Meng, Yuhuan; Zhou, Jinghui; Zhuo, Min; Ling, Fei; Zhang, Yu; Du, Hongli; Wang, Xiaoning

    2013-01-01

    Type 2 Diabetes Mellitus (T2DM) and obesity have become increasingly prevalent in recent years. Recent studies have focused on identifying causal variations or candidate genes for obesity and T2DM via analysis of expression quantitative trait loci (eQTL) within a single tissue. T2DM and obesity are affected by comprehensive sets of genes in multiple tissues. In the current study, gene expression levels in multiple human tissues from GEO datasets were analyzed, and 21 candidate genes displaying high percentages of differential expression were filtered out. Specifically, DENND1B, LYN, MRPL30, POC1B, PRKCB, RP4-655J12.3, HIBADH, and TMBIM4 were identified from the T2DM-control study, and BCAT1, BMP2K, CSRNP2, MYNN, NCKAP5L, SAP30BP, SLC35B4, SP1, BAP1, GRB14, HSP90AB1, ITGA5, and TOMM5 were identified from the obesity-control study. The majority of these genes are known to be involved in T2DM and obesity. Therefore, analysis of gene expression in various tissues using GEO datasets may be an effective and feasible method to determine novel or causal genes associated with T2DM and obesity.

  5. Transcriptome meta-analysis reveals common differential and global gene expression profiles in cystic fibrosis and other respiratory disorders and identifies CFTR regulators.

    PubMed

    Clarke, Luka A; Botelho, Hugo M; Sousa, Lisete; Falcao, Andre O; Amaral, Margarida D

    2015-11-01

    A meta-analysis of 13 independent microarray data sets was performed and gene expression profiles from cystic fibrosis (CF), similar disorders (COPD: chronic obstructive pulmonary disease, IPF: idiopathic pulmonary fibrosis, asthma), environmental conditions (smoking, epithelial injury), related cellular processes (epithelial differentiation/regeneration), and non-respiratory "control" conditions (schizophrenia, dieting), were compared. Similarity among differentially expressed (DE) gene lists was assessed using a permutation test, and a clustergram was constructed, identifying common gene markers. Global gene expression values were standardized using a novel approach, revealing that similarities between independent data sets run deeper than shared DE genes. Correlation of gene expression values identified putative gene regulators of the CF transmembrane conductance regulator (CFTR) gene, of potential therapeutic significance. Our study provides a novel perspective on CF epithelial gene expression in the context of other lung disorders and conditions, and highlights the contribution of differentiation/EMT and injury to gene signatures of respiratory disease. Copyright © 2015 Elsevier Inc. All rights reserved.
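The abstract above says that similarity among DE gene lists was assessed with a permutation test but gives no details. The following is only a minimal sketch of one common way to do this, under stated assumptions: the overlap between two lists is compared with overlaps obtained when the second list is redrawn at random from a shared gene universe. The function name `permutation_overlap_p`, the gene identifiers, and the add-one correction are illustrative choices, not the authors' procedure.

```python
import random


def permutation_overlap_p(list_a, list_b, universe, n_perm=2000, seed=0):
    """Estimate how often a random gene list overlaps list_a at least as
    much as list_b does (hypothetical helper, not the published method)."""
    rng = random.Random(seed)           # fixed seed so the sketch is repeatable
    set_a = set(list_a)
    observed = len(set_a & set(list_b))
    universe = list(universe)
    hits = 0
    for _ in range(n_perm):
        # Draw a random list of the same size as list_b from the universe.
        shuffled = rng.sample(universe, len(list_b))
        if len(set_a & set(shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Made-up gene identifiers, for illustration only.
genes = [f"g{i}" for i in range(1000)]
p_same = permutation_overlap_p(genes[:20], genes[:20], genes)
p_disjoint = permutation_overlap_p(genes[:20], genes[20:40], genes)
```

With identical lists the estimate approaches its floor of 1/(n_perm + 1); with disjoint lists every random draw matches or exceeds the observed overlap of zero, so the estimate is 1.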

  6. Exome sequencing coupled with mRNA analysis identifies NDUFAF6 as a Leigh gene.

    PubMed

    Bianciardi, Laura; Imperatore, Valentina; Fernandez-Vizarra, Erika; Lopomo, Angela; Falabella, Micol; Furini, Simone; Galluzzi, Paolo; Grosso, Salvatore; Zeviani, Massimo; Renieri, Alessandra; Mari, Francesca; Frullanti, Elisa

    2016-11-01

    We report here the case of a young male who started to show verbal fluency disturbance, clumsiness and gait anomalies at the age of 3.5 years and presented bilateral striatal necrosis. Clinically, the diagnosis was compatible with Leigh syndrome but the underlying molecular defect remained elusive even after exome analysis using autosomal/X-linked recessive or de novo models. Dosage of respiratory chain activity on fibroblasts, but not in muscle, underlined a deficit in complex I. Re-analysis of heterozygous probably pathogenic variants, inherited from one healthy parent, identified the p.Ala178Pro in NDUFAF6, a complex I assembly factor. RNA analysis showed an almost mono-allelic expression of the mutated allele in blood and fibroblasts, and puromycin treatment on cultured fibroblasts did not lead to the rescue of the maternal allele expression, not supporting the involvement of the nonsense-mediated RNA decay mechanism. Complementation assay underlined a recovery of complex I activity after transduction of the wild-type gene. Since the second mutation was not detected and promoter methylation analysis was normal, we hypothesized a non-exonic event in the maternal allele affecting a regulatory element that, in conjunction with the paternal mutation, leads to the autosomal recessive disorder and the different allele expression in various tissues. This paper confirms NDUFAF6 as a genuine morbid gene and proposes the coupling of exome sequencing with mRNA analysis as a method useful for enhancing the exome sequencing detection rate when the simple application of classical inheritance models fails. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Gene network-based analysis identifies two potential subtypes of small intestinal neuroendocrine tumors.

    PubMed

    Kidd, Mark; Modlin, Irvin M; Drozdov, Ignat

    2014-07-15

    Tumor transcriptomes contain information of critical value to understanding the different capacities of a cell at both a physiological and pathological level. In terms of clinical relevance, they provide information regarding the cellular "toolbox", e.g., pathways associated with malignancy and metastasis or drug dependency. Exploration of this resource can therefore be leveraged as a translational tool to better manage and assess neoplastic behavior. The availability of public genome-wide expression datasets provides an opportunity to reassess neuroendocrine tumors at a more fundamental level. We hypothesized that stringent analysis of expression profiles as well as regulatory networks of the neoplastic cell would provide novel information that facilitates further delineation of the genomic basis of small intestinal neuroendocrine tumors. We re-analyzed two publicly available small intestinal tumor transcriptomes using stringent quality control parameters and network-based approaches and validated expression of core secretory regulatory elements, e.g., CPE, PCSK1, secretogranins, including genes involved in depolarization, e.g., SCN3A, as well as transcription factors associated with neurodevelopment (NKX2-2, NeuroD1, INSM1) and glucose homeostasis (APLP1). The candidate metastasis-associated transcription factor, ST18, was highly expressed (>14-fold, p < 0.004). Genes previously associated with neoplasia, CEBPA and SDHD, were decreased in expression (-1.5 to -2, p < 0.02). Genomic interrogation indicated that intestinal tumors may consist of two different subtypes, serotonin-producing neoplasms and serotonin/substance P/tachykinin lesions. QPCR validation in an independent dataset (n = 13 neuroendocrine tumors) confirmed up-regulated expression of 87% of genes (13/15). An integrated cellular transcriptomic analysis of small intestinal neuroendocrine tumors identified that they are regulated at a developmental level, have key activation of hypoxic pathways (a known

  8. Analysis of genomic aberrations and gene expression profiling identifies novel lesions and pathways in myeloproliferative neoplasms

    PubMed Central

    Rice, K L; Lin, X; Wolniak, K; Ebert, B L; Berkofsky-Fessler, W; Buzzai, M; Sun, Y; Xi, C; Elkin, P; Levine, R; Golub, T; Gilliland, D G; Crispino, J D; Licht, J D; Zhang, W

    2011-01-01

    Polycythemia vera (PV), essential thrombocythemia and primary myelofibrosis, are myeloproliferative neoplasms (MPNs) with distinct clinical features and are associated with the JAK2V617F mutation. To identify genomic anomalies involved in the pathogenesis of these disorders, we profiled 87 MPN patients using Affymetrix 250K single-nucleotide polymorphism (SNP) arrays. Aberrations affecting chr9 were the most frequently observed and included 9pLOH (n=16), trisomy 9 (n=6) and amplifications of 9p13.3–23.3 (n=1), 9q33.1–34.13 (n=1) and 9q34.13 (n=6). Patients with trisomy 9 were associated with elevated JAK2V617F mutant allele burden, suggesting that gain of chr9 represents an alternative mechanism for increasing JAK2V617F dosage. Gene expression profiling of patients with and without chr9 abnormalities (+9, 9pLOH), identified genes potentially involved in disease pathogenesis including JAK2, STAT5B and MAPK14. We also observed recurrent gains of 1p36.31–36.33 (n=6), 17q21.2–q21.31 (n=5) and 17q25.1–25.3 (n=5) and deletions affecting 18p11.31–11.32 (n=8). Combined SNP and gene expression analysis identified aberrations affecting components of a non-canonical PRC2 complex (EZH1, SUZ12 and JARID2) and genes comprising a ‘HSC signature' (MLLT3, SMARCA2 and PBX1). We show that NFIB, which is amplified in 7/87 MPN patients and upregulated in PV CD34+ cells, protects cells from apoptosis induced by cytokine withdrawal. PMID:22829077

  9. A Multiomics Approach to Identify Genes Associated with Childhood Asthma Risk and Morbidity.

    PubMed

    Forno, Erick; Wang, Ting; Yan, Qi; Brehm, John; Acosta-Perez, Edna; Colon-Semidey, Angel; Alvarez, Maria; Boutaoui, Nadia; Cloutier, Michelle M; Alcorn, John F; Canino, Glorisa; Chen, Wei; Celedón, Juan C

    2017-10-01

    Childhood asthma is a complex disease. In this study, we aim to identify genes associated with childhood asthma through a multiomics "vertical" approach that integrates multiple analytical steps using linear and logistic regression models. In a case-control study of childhood asthma in Puerto Ricans (n = 1,127), we used adjusted linear or logistic regression models to evaluate associations between several analytical steps of omics data, including genome-wide (GW) genotype data, GW methylation, GW expression profiling, cytokine levels, asthma-intermediate phenotypes, and asthma status. At each point, only the top genes/single-nucleotide polymorphisms/probes/cytokines were carried forward for subsequent analysis. In step 1, asthma modified the gene expression-protein level association for 1,645 genes; pathway analysis showed an enrichment of these genes in the cytokine signaling system (n = 269 genes). In steps 2-3, expression levels of 40 genes were associated with intermediate phenotypes (asthma onset age, forced expiratory volume in 1 second, exacerbations, eosinophil counts, and skin test reactivity); of those, methylation of seven genes was also associated with asthma. Of these seven candidate genes, IL5RA was also significant in analytical steps 4-8. We then measured plasma IL-5 receptor α levels, which were associated with asthma age of onset and moderate-severe exacerbations. In addition, in silico database analysis showed that several of our identified IL5RA single-nucleotide polymorphisms are associated with transcription factors related to asthma and atopy. This approach integrates several analytical steps and is able to identify biologically relevant asthma-related genes, such as IL5RA. It differs from other methods that rely on complex statistical models with various assumptions.

  10. Major carcinogenic pathways identified by gene expression analysis of peritoneal mesotheliomas following chemical treatment in F344 rats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Yongbaek; Thai-Vu Ton; De Angelo, Anthony B.

    2006-07-15

    This study was performed to characterize the gene expression profile and to identify the major carcinogenic pathways involved in rat peritoneal mesothelioma (RPM) formation following treatment of Fischer 344 rats with o-nitrotoluene (o-NT) or bromochloroacetic acid (BCA). Oligo arrays, with over 20,000 target genes, were used to evaluate o-NT- and BCA-induced RPMs, when compared to a non-transformed mesothelial cell line (Fred-PE). Analysis using Ingenuity Pathway Analysis software revealed 169 cancer-related genes that were categorized into binding activity, growth and proliferation, cell cycle progression, apoptosis, and invasion and metastasis. The microarray data were validated by positive correlation with quantitative real-time RT-PCR on 16 selected genes including igf1, tgfb3 and nov. Important carcinogenic pathways involved in RPM formation included insulin-like growth factor 1 (IGF-1), p38 MAP kinase, Wnt/β-catenin and integrin signaling pathways. This study demonstrated that mesotheliomas in rats exposed to o-NT and BCA were similar to mesotheliomas in humans, at least at the cellular and molecular level.

  11. Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici

    PubMed Central

    McDonald, Megan C.; McGinness, Lachlan; Hane, James K.; Williams, Angela H.; Milgate, Andrew; Solomon, Peter S.

    2016-01-01

    Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 wks, and is followed by extensive host cell death, and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified < 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene. PMID:26837952

  12. Time-course microarray analysis for identifying candidate genes involved in obesity-associated pathological changes in the mouse colon.

    PubMed

    Bae, Yun Jung; Kim, Sung-Eun; Hong, Seong Yeon; Park, Taesun; Lee, Sang Gyu; Choi, Myung-Sook; Sung, Mi-Kyung

    2016-01-01

    Obesity is known to increase the risk of colorectal cancer. However, mechanisms underlying the pathogenesis of obesity-induced colorectal cancer are not completely understood. The purposes of this study were to identify differentially expressed genes in the colon of mice with diet-induced obesity and to select candidate genes as early markers of obesity-associated abnormal cell growth in the colon. C57BL/6N mice were fed normal diet (11% fat energy) or high-fat diet (40% fat energy) and were euthanized at different time points. Genome-wide expression profiles of the colon were determined at 2, 4, 8, and 12 weeks. Cluster analysis was performed using expression data of genes showing log2 fold change of ≥1 or ≤-1 (twofold change), based on time-dependent expression patterns, followed by virtual network analysis. High-fat diet-fed mice showed significant increase in body weight and total visceral fat weight over 12 weeks. Time-course microarray analysis showed that 50, 47, 36, and 411 genes were differentially expressed at 2, 4, 8, and 12 weeks, respectively. Ten cluster profiles representing distinguishable patterns of genes differentially expressed over time were determined. Cluster 4, which consisted of genes showing the most significant alterations in expression in response to high-fat diet over 12 weeks, included Apoa4 (apolipoprotein A-IV), Ppap2b (phosphatidic acid phosphatase type 2B), Cel (carboxyl ester lipase), and Clps (colipase, pancreatic), which interacted strongly with surrounding genes associated with colorectal cancer or obesity. Our data indicate that Apoa4, Ppap2b, Cel, and Clps are candidate early marker genes associated with obesity-related pathological changes in the colon. Genome-wide analyses performed in the present study provide new insights on selecting novel genes that may be associated with the development of diseases of the colon.
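The fold-change cutoff described in the abstract above (log2 fold change of at least 1 or at most -1, i.e. a twofold change in either direction) can be sketched in a few lines. This is only an illustration: the gene names and expression values are invented, and the helper names `log2_fold_change` and `select_de_genes` are hypothetical, not part of the study's pipeline.

```python
import math


def log2_fold_change(treated, control):
    """log2 of the ratio of mean expression (treated over control)."""
    return math.log2(treated / control)


def select_de_genes(expr, threshold=1.0):
    """Keep genes whose |log2 fold change| meets the threshold.

    expr maps a gene name to (high_fat_mean, normal_diet_mean).
    """
    selected = {}
    for gene, (high_fat, normal) in expr.items():
        lfc = log2_fold_change(high_fat, normal)
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected


# Invented values for illustration; not data from the study.
example = {
    "geneA": (8.0, 2.0),   # fourfold up, log2 FC = 2
    "geneB": (3.0, 2.5),   # 1.2-fold up, below the cutoff
    "geneC": (1.0, 4.0),   # fourfold down, log2 FC = -2
    "geneD": (2.0, 1.0),   # exactly twofold up, log2 FC = 1
}
hits = select_de_genes(example)
```

A gene exactly at the twofold boundary is kept because the abstract states the cutoff with inclusive inequalities.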

  13. Pathway-driven gene stability selection of two rheumatoid arthritis GWAS identifies and validates new susceptibility genes in receptor mediated signalling pathways.

    PubMed

    Eleftherohorinou, Hariklia; Hoggart, Clive J; Wright, Victoria J; Levin, Michael; Coin, Lachlan J M

    2011-09-01

    Rheumatoid arthritis (RA) is the commonest chronic, systemic, inflammatory disorder, affecting ∼1% of the world population. It has a strong genetic component and a growing number of associated genes have been discovered in genome-wide association studies (GWAS), which nevertheless only account for 23% of the total genetic risk. We aimed to identify additional susceptibility loci through the analysis of GWAS in the context of biological function. We bridge the gap between pathway and gene-oriented analyses of GWAS by introducing a pathway-driven gene stability-selection methodology that identifies potential causal genes in the top-associated disease pathways that may be driving the pathway association signals. We analysed the WTCCC and the NARAC studies of ∼5000 and ∼2000 subjects, respectively. We examined 700 pathways comprising ∼8000 genes. Ranking pathways by significance revealed that the NARAC top-ranked ∼6% lay within the top 10% of WTCCC. Gene selection on those pathways identified 58 genes in WTCCC and 61 in NARAC; 21 of those were common (P(overlap) < 10^-21), of which 16 were novel discoveries. Among the identified genes, we validated 10 known RA associations in WTCCC and 13 in NARAC, not discovered using single-SNP approaches on the same data. Gene ontology functional enrichment analysis on the identified genes showed significant over-representation of signalling activity (P < 10^-29) in both studies. Our findings suggest a novel model of RA genetic predisposition, which involves cell-membrane receptors and genes in second messenger signalling systems, in addition to genes that regulate immune responses, which have been the focus of interest previously.
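The abstract above reports an overlap P value for the 21 genes shared between the two selections but does not say how it was computed. One standard way to assess such an overlap is an upper-tail hypergeometric test. The sketch below assumes that test and uses the rounded figures from the abstract (about 8000 genes, 58 and 61 selected, 21 shared). The function name `overlap_p_value` is hypothetical, and the result is not expected to reproduce the authors' exact value.

```python
from math import comb


def overlap_p_value(universe, n_a, n_b, shared):
    """Upper-tail hypergeometric probability of seeing at least `shared`
    common genes when n_a and n_b genes are drawn from `universe` genes.
    (Assumed test; the abstract does not name the one actually used.)"""
    total = comb(universe, n_b)
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(shared, upper + 1)
    )
    # Exact integer arithmetic until the final division.
    return tail / total


# Rounded counts taken from the abstract; the gene universe is approximate.
p = overlap_p_value(8000, 58, 61, 21)
```

Because the expected overlap under random selection is below one gene, an overlap of 21 yields a vanishingly small tail probability under this assumption.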

  14. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    PubMed

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease that can result in such outcomes as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. For deciphering the survival mechanism of Brucella spp. in vivo, 42 Brucella complete genomes from NCBI were analyzed for the pan-genome and core genome by identifying the composition and function of the Brucella genomes. The results showed that the total 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44% of the core genes were devoted to metabolism, which were mainly responsible for energy production and conversion (COG category C), and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were in positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the process of growth and reproduction in Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of the intracellular survival in Brucella spp.

  15. Gene-Trap Mutagenesis Identifies Mammalian Genes Contributing to Intoxication by Clostridium perfringens ε-Toxin

    PubMed Central

    Ivie, Susan E.; Fennessey, Christine M.; Sheng, Jinsong; Rubin, Donald H.; McClain, Mark S.

    2011-01-01

    The Clostridium perfringens ε-toxin is an extremely potent toxin associated with lethal toxemias in domesticated ruminants and may be toxic to humans. Intoxication results in fluid accumulation in various tissues, most notably in the brain and kidneys. Previous studies suggest that the toxin is a pore-forming toxin, leading to dysregulated ion homeostasis and ultimately cell death. However, mammalian host factors that likely contribute to ε-toxin-induced cytotoxicity are poorly understood. A library of insertional mutant Madin Darby canine kidney (MDCK) cells, which are highly susceptible to the lethal effects of ε-toxin, was used to select clones of cells resistant to ε-toxin-induced cytotoxicity. The genes mutated in 9 surviving resistant cell clones were identified. We focused additional experiments on one of the identified genes as a means of validating the experimental approach. Gene expression microarray analysis revealed that one of the identified genes, hepatitis A virus cellular receptor 1 (HAVCR1, KIM-1, TIM1), is more abundantly expressed in human kidney cell lines than it is expressed in human cells known to be resistant to ε-toxin. One human kidney cell line, ACHN, was found to be sensitive to the toxin and expresses a larger isoform of the HAVCR1 protein than the HAVCR1 protein expressed by other, toxin-resistant human kidney cell lines. RNA interference studies in MDCK and in ACHN cells confirmed that HAVCR1 contributes to ε-toxin-induced cytotoxicity. Additionally, ε-toxin was shown to bind to HAVCR1 in vitro. The results of this study indicate that HAVCR1 and the other genes identified through the use of gene-trap mutagenesis and RNA interference strategies represent important targets for investigation of the process by which ε-toxin induces cell death and new targets for potential therapeutic intervention. PMID:21412435

  16. Gene-trap mutagenesis identifies mammalian genes contributing to intoxication by Clostridium perfringens ε-toxin.

    PubMed

    Ivie, Susan E; Fennessey, Christine M; Sheng, Jinsong; Rubin, Donald H; McClain, Mark S

    2011-03-11

    The Clostridium perfringens ε-toxin is an extremely potent toxin associated with lethal toxemias in domesticated ruminants and may be toxic to humans. Intoxication results in fluid accumulation in various tissues, most notably in the brain and kidneys. Previous studies suggest that the toxin is a pore-forming toxin, leading to dysregulated ion homeostasis and ultimately cell death. However, mammalian host factors that likely contribute to ε-toxin-induced cytotoxicity are poorly understood. A library of insertional mutant Madin Darby canine kidney (MDCK) cells, which are highly susceptible to the lethal effects of ε-toxin, was used to select clones of cells resistant to ε-toxin-induced cytotoxicity. The genes mutated in 9 surviving resistant cell clones were identified. We focused additional experiments on one of the identified genes as a means of validating the experimental approach. Gene expression microarray analysis revealed that one of the identified genes, hepatitis A virus cellular receptor 1 (HAVCR1, KIM-1, TIM1), is more abundantly expressed in human kidney cell lines than it is expressed in human cells known to be resistant to ε-toxin. One human kidney cell line, ACHN, was found to be sensitive to the toxin and expresses a larger isoform of the HAVCR1 protein than the HAVCR1 protein expressed by other, toxin-resistant human kidney cell lines. RNA interference studies in MDCK and in ACHN cells confirmed that HAVCR1 contributes to ε-toxin-induced cytotoxicity. Additionally, ε-toxin was shown to bind to HAVCR1 in vitro. The results of this study indicate that HAVCR1 and the other genes identified through the use of gene-trap mutagenesis and RNA interference strategies represent important targets for investigation of the process by which ε-toxin induces cell death and new targets for potential therapeutic intervention.

  17. Identifying conserved gene clusters in the presence of homology families.

    PubMed

    He, Xin; Goldwasser, Michael H

    2005-01-01

    The study of conserved gene clusters is important for understanding the forces behind genome organization and evolution, as well as the function of individual genes or gene groups. In this paper, we present a new model and algorithm for identifying conserved gene clusters from pairwise genome comparison. This generalizes a recent model called "gene teams." A gene team is a set of genes that appear homologously in two or more species, possibly in a different order yet with the distance of adjacent genes in the team for each chromosome always no more than a certain threshold. We remove the constraint in the original model that each gene must have a unique occurrence in each chromosome and thus allow the analysis on complex prokaryotic or eukaryotic genomes with extensive paralogs. Our algorithm analyzes a pair of chromosomes in O(mn) time and uses O(m+n) space, where m and n are the number of genes in the respective chromosomes. We demonstrate the utility of our methods by studying two bacterial genomes, E. coli K-12 and B. subtilis. Many of the teams identified by our algorithm correlate with documented E. coli operons, while several others match predicted operons, previously suggested by computational techniques. Our implementation and data are publicly available at euler.slu.edu/~goldwasser/homologyteams/.
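The distance condition in the gene-team definition above can be made concrete with a small check. The sketch below covers only the original, simpler model in which each gene occurs once per chromosome. It tests whether a given gene set satisfies the threshold on two chromosomes; it does not search for teams, does not enforce maximality, and is not the O(mn) algorithm of the paper. Positions, gene names, and the helper names `is_delta_chain` and `is_gene_team` are all hypothetical.

```python
def is_delta_chain(positions, delta):
    """True if consecutive positions, once sorted, are at most delta apart."""
    ordered = sorted(positions)
    return all(b - a <= delta for a, b in zip(ordered, ordered[1:]))


def is_gene_team(genes, chrom_a, chrom_b, delta):
    """Check the distance condition for `genes` on both chromosomes.

    chrom_a and chrom_b map a gene name to its single position
    (the unique-occurrence assumption of the original model).
    """
    for chrom in (chrom_a, chrom_b):
        if any(gene not in chrom for gene in genes):
            return False
        if not is_delta_chain([chrom[gene] for gene in genes], delta):
            return False
    return True


# Invented positions: the gene order differs between the chromosomes.
chrom_a = {"a": 1, "b": 3, "c": 4, "d": 10}
chrom_b = {"c": 2, "a": 3, "b": 5, "d": 6}

small_team = is_gene_team({"a", "b", "c"}, chrom_a, chrom_b, delta=2)
with_d = is_gene_team({"a", "b", "c", "d"}, chrom_a, chrom_b, delta=2)
```

Adding gene d breaks the condition because on the first chromosome it sits six positions past its nearest team member, even though it is adjacent on the second.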

  18. Gonad Transcriptome Analysis of the Pacific Oyster Crassostrea gigas Identifies Potential Genes Regulating the Sex Determination and Differentiation Process.

    PubMed

    Yue, Chenyang; Li, Qi; Yu, Hong

    2018-04-01

    The Pacific oyster Crassostrea gigas is a commercially important bivalve in aquaculture worldwide. C. gigas has a fascinating sexual reproduction system consisting of dioecism, sex change, and occasional hermaphroditism, while knowledge of the molecular mechanisms of sex determination and differentiation is still limited. In this study, the transcriptomes of male and female gonads at different gametogenesis stages were characterized by RNA-seq. Hierarchical clustering based on differentially expressed genes revealed that 1269 genes were expressed specifically in female gonads and 817 genes were expressed increasingly over the course of spermatogenesis. In addition, we identified two gene modules related to female gonad development and one related to male gonad development, using weighted gene correlation network analysis (WGCNA). Interestingly, GO and KEGG enrichment analysis showed that neurotransmitter-related terms were significantly enriched in genes related to ovary development, suggesting that neurotransmitters were likely to regulate female sex differentiation. In addition, two hub genes related to testis development, lncRNA LOC105321313 and Cg-Sh3kbp1, and one hub gene related to ovary development, Cg-Malrd1-like, were investigated for the first time. This study points out the role of neurotransmitter and non-coding RNA regulation during gonad development and produces lists of novel relevant candidate genes for further studies. All of these provide valuable information for understanding the molecular mechanisms of C. gigas sex determination and differentiation.

  19. Axon Regeneration Genes Identified by RNAi Screening in C. elegans

    PubMed Central

    Nix, Paola; Hammarlund, Marc; Hauth, Linda; Lachnit, Martina; Jorgensen, Erik M.

    2014-01-01

    Axons of the mammalian CNS lose the ability to regenerate soon after development due to both an inhibitory CNS environment and the loss of cell-intrinsic factors necessary for regeneration. The complex molecular events required for robust regeneration of mature neurons are not fully understood, particularly in vivo. To identify genes affecting axon regeneration in Caenorhabditis elegans, we performed both an RNAi-based screen for defective motor axon regeneration in unc-70/β-spectrin mutants and a candidate gene screen. From these screens, we identified at least 50 conserved genes with growth-promoting or growth-inhibiting functions. Through our analysis of mutants, we shed new light on certain aspects of regeneration, including the role of β-spectrin and membrane dynamics, the antagonistic activity of MAP kinase signaling pathways, and the role of stress in promoting axon regeneration. Many gene candidates had not previously been associated with axon regeneration and implicate new pathways of interest for therapeutic intervention. PMID:24403161

  20. An Integrative Genetics Approach to Identify Candidate Genes Regulating BMD: Combining Linkage, Gene Expression, and Association

    PubMed Central

    Farber, Charles R; van Nas, Atila; Ghazalpour, Anatole; Aten, Jason E; Doss, Sudheer; Sos, Brandon; Schadt, Eric E; Ingram-Drake, Leslie; Davis, Richard C; Horvath, Steve; Smith, Desmond J; Drake, Thomas A; Lusis, Aldons J

    2009-01-01

    Numerous quantitative trait loci (QTLs) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene, we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling, and genetic association in outbred mice. In C57BL/6J × C3H/HeJ (BXH) F2 mice, nine QTLs regulating femoral BMD were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTLs. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining whether their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTLs, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1, or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification. PMID:18767929

  1. A cross-species bi-clustering approach to identifying conserved co-regulated genes.

    PubMed

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-06-15

    A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective at dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on synthetic data and compared
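
    The joint sparse rank-one factorization mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the alternating update rule, the soft-threshold penalty `lam`, the function names, and the toy matrices are all assumptions chosen for illustration. Each species matrix X_s (genes by stages) is approximated by u v_s^T, where the gene vector u is shared across species and forced to be sparse, so its nonzero entries mark one candidate gene cluster.

```python
import math


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def normalize(vec):
    """Scale a vector to unit Euclidean length (all-zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def joint_sparse_rank_one(matrices, lam, n_iter=50):
    """Alternating updates for X_s ~ u v_s^T with one shared sparse u.

    matrices: list of gene-by-stage matrices (lists of rows), same row count.
    Returns (u, [v_s]); nonzero entries of u mark the shared gene cluster.
    """
    n_genes = len(matrices[0])
    u = normalize([1.0] * n_genes)
    vs = []
    for _ in range(n_iter):
        # Per-species stage vectors: v_s = X_s^T u, normalized.
        vs = []
        for X in matrices:
            n_stages = len(X[0])
            v = [sum(X[g][t] * u[g] for g in range(n_genes)) for t in range(n_stages)]
            vs.append(normalize(v))
        # Shared gene vector: sum over species of X_s v_s, then soft-thresholded.
        raw = [
            sum(sum(X[g][t] * v[t] for t in range(len(v))) for X, v in zip(matrices, vs))
            for g in range(n_genes)
        ]
        u = normalize([soft_threshold(x, lam) for x in raw])
    return u, vs


# Toy data (made up): genes 0 and 1 share a pattern in both species, genes 2 and 3 are noise.
species_a = [[2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [0.1, -0.1, 0.0], [0.0, 0.1, -0.1]]
species_b = [[3.0, 1.0], [3.0, 1.0], [0.1, 0.0], [-0.1, 0.1]]
u, vs = joint_sparse_rank_one([species_a, species_b], lam=0.5)
print("shared gene indicator:", u)
```

    Sharing u across species is what makes the clustering joint: a gene stays in the cluster only if its summed evidence across all species survives the threshold, rather than being clustered per species and intersected afterwards.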

  2. Microarray analysis identified Puccinia striiformis f. sp. tritici genes involved in infection and sporulation.

    USDA-ARS's Scientific Manuscript database

    Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, one of the most important diseases of wheat worldwide. To identify Pst genes involved in infection and sporulation, a custom oligonucleotide Genechip was made using sequences of 442 genes selected from Pst cDNA libraries. Microarray analy...

  3. Identifying Cancer Driver Genes Using Replication-Incompetent Retroviral Vectors

    PubMed Central

    Bii, Victor M.; Trobridge, Grant D.

    2016-01-01

    Identifying novel genes that drive tumor metastasis and drug resistance has significant potential to improve patient outcomes. High-throughput sequencing approaches have identified cancer genes, but distinguishing driver genes from passengers remains challenging. Insertional mutagenesis screens using replication-incompetent retroviral vectors have emerged as a powerful tool to identify cancer genes. Unlike replicating retroviruses and transposons, replication-incompetent retroviral vectors lack additional mutagenesis events that can complicate the identification of driver mutations from passenger mutations. They can also be used for almost any human cancer due to the broad tropism of the vectors. Replication-incompetent retroviral vectors have the ability to dysregulate nearby cancer genes via several mechanisms including enhancer-mediated activation of gene promoters. The integrated provirus acts as a unique molecular tag for nearby candidate driver genes which can be rapidly identified using well established methods that utilize next generation sequencing and bioinformatics programs. Recently, retroviral vector screens have been used to efficiently identify candidate driver genes in prostate, breast, liver and pancreatic cancers. Validated driver genes can be potential therapeutic targets and biomarkers. In this review, we describe the emergence of retroviral insertional mutagenesis screens using replication-incompetent retroviral vectors as a novel tool to identify cancer driver genes in different cancer types. PMID:27792127

  4. Genome-Wide and Gene-Based Meta-Analyses Identify Novel Loci Influencing Blood Pressure Response to Hydrochlorothiazide.

    PubMed

    Salvi, Erika; Wang, Zhiying; Rizzi, Federica; Gong, Yan; McDonough, Caitrin W; Padmanabhan, Sandosh; Hiltunen, Timo P; Lanzani, Chiara; Zaninello, Roberta; Chittani, Martina; Bailey, Kent R; Sarin, Antti-Pekka; Barcella, Matteo; Melander, Olle; Chapman, Arlene B; Manunta, Paolo; Kontula, Kimmo K; Glorioso, Nicola; Cusi, Daniele; Dominiczak, Anna F; Johnson, Julie A; Barlassina, Cristina; Boerwinkle, Eric; Cooper-DeHoff, Rhonda M; Turner, Stephen T

    2017-01-01

    This study aimed to identify novel loci influencing the antihypertensive response to hydrochlorothiazide monotherapy. A genome-wide meta-analysis of blood pressure (BP) response to hydrochlorothiazide was performed in 1739 white hypertensives from 6 clinical trials within the International Consortium for Antihypertensive Pharmacogenomics Studies, making it the largest study to date of its kind. No signals reached genome-wide significance (P<5×10⁻⁸), and the suggestive regions (P<10⁻⁵) were cross-validated in 2 black cohorts treated with hydrochlorothiazide. In addition, a gene-based analysis was performed on candidate genes with previous evidence of involvement in diuretic response, in BP regulation, or in hypertension susceptibility. Using the genome-wide meta-analysis approach, with validation in blacks, we identified 2 suggestive regulatory regions linked to gap junction protein α1 gene (GJA1) and forkhead box A1 gene (FOXA1), relevant for cardiovascular and kidney function. With the gene-based approach, we identified hydroxy-delta-5-steroid dehydrogenase, 3β- and steroid δ-isomerase 1 gene (HSD3B1) as significantly associated with BP response (P<2.28×10⁻⁴). HSD3B1 encodes the 3β-hydroxysteroid dehydrogenase enzyme and plays a crucial role in the biosynthesis of aldosterone and endogenous ouabain. By amassing all of the available pharmacogenomic studies of BP response to hydrochlorothiazide, and using 2 different analytic approaches, we identified 3 novel loci influencing BP response to hydrochlorothiazide. The gene-based analysis, never before applied to pharmacogenomics of antihypertensive drugs to our knowledge, provided a powerful strategy to identify a locus of interest, which was not identified in the genome-wide meta-analysis because of high allelic heterogeneity. These data pave the way for future investigations on new pathways and drug targets to enhance the current understanding of personalized antihypertensive treatment. © 2016

  5. A computational approach to identify cellular heterogeneity and tissue-specific gene regulatory networks.

    PubMed

    Jambusaria, Ankit; Klomp, Jeff; Hong, Zhigang; Rafii, Shahin; Dai, Yang; Malik, Asrar B; Rehman, Jalees

    2018-06-07

    The heterogeneity of cells across tissue types represents a major challenge for studying biological mechanisms as well as for therapeutic targeting of distinct tissues. Computational prediction of tissue-specific gene regulatory networks may provide important insights into the mechanisms underlying the cellular heterogeneity of cells in distinct organs and tissues. Using three pathway analysis techniques, gene set enrichment analysis (GSEA), parametric analysis of gene set enrichment (PGSEA), alongside our novel model (HeteroPath), which assesses heterogeneously upregulated and downregulated genes within the context of pathways, we generated distinct tissue-specific gene regulatory networks. We analyzed gene expression data derived from freshly isolated heart, brain, and lung endothelial cells and populations of neurons in the hippocampus, cingulate cortex, and amygdala. In both datasets, we found that HeteroPath segregated the distinct cellular populations by identifying regulatory pathways that were not identified by GSEA or PGSEA. Using simulated datasets, HeteroPath demonstrated robustness that was comparable to what was seen using existing gene set enrichment methods. Furthermore, we generated tissue-specific gene regulatory networks involved in vascular heterogeneity and neuronal heterogeneity by performing motif enrichment of the heterogeneous genes identified by HeteroPath and linking the enriched motifs to regulatory transcription factors in the ENCODE database. HeteroPath assesses contextual bidirectional gene expression within pathways and thus allows for transcriptomic assessment of cellular heterogeneity. Unraveling tissue-specific heterogeneity of gene expression can lead to a better understanding of the molecular underpinnings of tissue-specific phenotypes.

  6. Transcriptome and proteome analysis of tyrosine kinase inhibitor treated canine mast cell tumour cells identifies potentially kit signaling-dependent genes

    PubMed Central

    2012-01-01

    Background Canine mast cell tumour proliferation depends to a large extent on the activity of KIT, a tyrosine kinase receptor. Inhibitors of the KIT tyrosine kinase have recently been introduced and successfully applied as a therapeutic agent for this tumour type. However, little is known about the downstream target genes of this signaling pathway and the molecular changes after inhibition. Results Transcriptome analysis of the canine mast cell tumour cell line C2 treated for up to 72 hours with the tyrosine kinase inhibitor masitinib identified significant changes in the expression levels of approximately 3500 genes or 16% of the canine genome. Approximately 40% of these genes had increased mRNA expression levels including genes associated with the pro-proliferative pathways of B- and T-cell receptors, chemokine receptors, steroid hormone receptors and EPO-, RAS and MAP kinase signaling. Proteome analysis of C2 cells treated for 72 hours identified 24 proteins with changed expression levels, most of which are involved in gene transcription (e.g. EIA3, EIA4, TARDBP), protein folding (e.g. HSP90, UCHL3, PDIA3) and protection from oxidative stress (e.g. GSTT3, SELENBP1). Conclusions Transcriptome and proteome analysis of neoplastic canine mast cells treated with masitinib confirmed the important and complex role of KIT in these cells. Approximately 16% of the total canine genome and thus the majority of the active genes were significantly transcriptionally regulated. Most of these changes were associated with reduced proliferation and metabolism of treated cells. Interestingly, several pro-proliferative pathways were up-regulated, which may represent attempts of masitinib treated cells to activate alternative pro-proliferative pathways. These pathways may contain hypothetical targets for a combination therapy with masitinib to further improve its therapeutic effect. PMID:22747577

  7. Combining Genome-Scale Experimental and Computational Methods To Identify Essential Genes in Rhodobacter sphaeroides

    DOE PAGES

    Burger, Brian T.; Imam, Saheed; Scarborough, Matthew J.; ...

    2017-06-06

    Rhodobacter sphaeroides is one of the best-studied alphaproteobacteria from biochemical, genetic, and genomic perspectives. To gain a better systems-level understanding of this organism, we generated a large transposon mutant library and used transposon sequencing (Tn-seq) to identify genes that are essential under several growth conditions. Using newly developed Tn-seq analysis software (TSAS), we identified 493 genes as essential for aerobic growth on a rich medium. We then used the mutant library to identify conditionally essential genes under two laboratory growth conditions, identifying 85 additional genes required for aerobic growth in a minimal medium and 31 additional genes required for photosynthetic growth. In all instances, our analyses confirmed essentiality for many known genes and identified genes not previously considered to be essential. We used the resulting Tn-seq data to refine and improve a genome-scale metabolic network model (GEM) for R. sphaeroides. Together, we demonstrate how genetic, genomic, and computational approaches can be combined to obtain a systems-level understanding of the genetic framework underlying metabolic diversity in bacterial species.

  8. Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

    PubMed

    Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

    2017-08-30

    To better understand the molecular mechanisms and gene expression characteristics associated with the development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) genes have protein-coding annotation, were identified as phloem enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR agreed well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to metabolic process, catalytic activity and binding category were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Transcriptome and metabolite analysis identifies nitrogen utilization genes in tea plant (Camellia sinensis).

    PubMed

    Li, Wei; Xiang, Fen; Zhong, Micai; Zhou, Lingyun; Liu, Hongyan; Li, Saijun; Wang, Xuewen

    2017-05-10

    Applied nitrogen (N) fertilizer significantly increases the leaf yield. However, most N is not utilized by the plant, negatively impacting the environment. To date, little is known regarding N utilization genes and mechanisms in the leaf production. To understand this, we investigated transcriptomes using RNA-seq and amino acid levels with N treatment in tea (Camellia sinensis), the most popular beverage crop. We identified 196 and 29 common differentially expressed genes in roots and leaves, respectively, in response to ammonium in two tea varieties. Among those genes, AMT, NRT and AQP for N uptake and GOGAT and GS for N assimilation were the key genes, validated by RT-qPCR, which expressed in a network manner with tissue specificity. Importantly, only AQP and three novel DEGs associated with stress, manganese binding, and gibberellin-regulated transcription factor were common in N responses across all tissues and varieties. A hypothesized gene regulatory network for N was proposed. A strong statistical correlation between key genes' expression and amino acid content was revealed. The key genes and regulatory network improve our understanding of the molecular mechanism of N usage and offer gene targets for plant improvement.

  10. Microarray and differential display identify genes involved in jasmonate-dependent anther development.

    PubMed

    Mandaokar, Ajin; Kumar, V Dinesh; Amway, Matt; Browse, John

    2003-07-01

    Jasmonate (JA) is a signaling compound essential for anther development and pollen fertility in Arabidopsis. Mutations that block the pathway of JA synthesis result in male sterility. To understand the processes of anther and pollen maturation, we used microarray and differential display approaches to compare gene expression patterns in anthers of wild-type Arabidopsis and the male-sterile mutant, opr3. The microarray experiment revealed 25 genes that were up-regulated more than 1.8-fold in wild-type anthers as compared to mutant anthers. Experiments based on differential display identified 13 additional genes up-regulated in wild-type anthers compared to opr3, for a total of 38 differentially expressed genes. Searches of the Arabidopsis and non-redundant databases disclosed known or likely functions for 28 of the 38 genes identified, while 10 genes encode proteins of unknown function. Northern blot analysis using eight representative clones as probes confirmed low expression in opr3 anthers compared with wild-type anthers. JA responsiveness of these same genes was also investigated by northern blot analysis of anther RNA isolated from wild-type and opr3 plants. In these experiments, four genes were induced in opr3 anthers within 0.5-1 h of JA treatment while the remaining genes were up-regulated only 1-8 h after JA application. None of these genes was induced by JA in anthers of the coi1 mutant that is deficient in JA responsiveness. The four early-induced genes in opr3 encode lipoxygenase, a putative bHLH transcription factor, epithiospecifier protein and an unknown protein. We propose that these and other early components may be involved in JA signaling and in the initiation of developmental processes. The four late genes encode an extensin-like protein, a peptide transporter and two unknown proteins, which may represent components required later in anther and pollen maturation. Transcript profiling has provided a successful approach to identify genes involved in

  11. MMTV insertional mutagenesis identifies genes, gene families and pathways involved in mammary cancer.

    PubMed

    Theodorou, Vassiliki; Kimm, Melanie A; Boer, Mandy; Wessels, Lodewyk; Theelen, Wendy; Jonkers, Jos; Hilkens, John

    2007-06-01

    We performed a high-throughput retroviral insertional mutagenesis screen in mouse mammary tumor virus (MMTV)-induced mammary tumors and identified 33 common insertion sites, of which 17 genes were previously not known to be associated with mammary cancer and 13 had not previously been linked to cancer in general. Although members of the Wnt and fibroblast growth factors (Fgf) families were frequently tagged, our exhaustive screening for MMTV insertion sites uncovered a new repertoire of candidate breast cancer oncogenes. We validated one of these genes, Rspo3, as an oncogene by overexpression in a p53-deficient mammary epithelial cell line. The human orthologs of the candidate oncogenes were frequently deregulated in human breast cancers and associated with several tumor parameters. Computational analysis of all MMTV-tagged genes uncovered specific gene families not previously associated with cancer and showed a significant overrepresentation of protein domains and signaling pathways mainly associated with development and growth factor signaling. Comparison of all tagged genes in MMTV and Moloney murine leukemia virus-induced malignancies showed that both viruses target mostly different genes that act predominantly in distinct pathways.

  12. Use of RNA-seq to identify cardiac genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy

    PubMed Central

    Friedenberg, Steven G.; Chdid, Lhoucine; Keene, Bruce; Sherry, Barbara; Motsinger-Reif, Alison; Meurs, Kathryn M.

    2017-01-01

    OBJECTIVE To identify cardiac tissue genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy (DCM). ANIMALS 8 dogs with and 5 dogs without DCM. PROCEDURES Following euthanasia, samples of left ventricular myocardium were collected from each dog. Total RNA was extracted from tissue samples, and RNA sequencing was performed on each sample. Samples from dogs with and without DCM were grouped to identify genes that were differentially regulated between the 2 populations. Overrepresentation analysis was performed on upregulated and downregulated gene sets to identify altered molecular pathways in dogs with DCM. RESULTS Genes involved in cellular energy metabolism, especially metabolism of carbohydrates and fats, were significantly downregulated in dogs with DCM. Expression of cardiac structural proteins was also altered in affected dogs. CONCLUSIONS AND CLINICAL RELEVANCE Results suggested that RNA sequencing may provide important insights into the pathogenesis of DCM in dogs and highlight pathways that should be explored to identify causative mutations and develop novel therapeutic interventions. PMID:27347821

  13. Use of RNA-seq to identify cardiac genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy.

    PubMed

    Friedenberg, Steven G; Chdid, Lhoucine; Keene, Bruce; Sherry, Barbara; Motsinger-Reif, Alison; Meurs, Kathryn M

    2016-07-01

    OBJECTIVE To identify cardiac tissue genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy (DCM). ANIMALS 8 dogs with and 5 dogs without DCM. PROCEDURES Following euthanasia, samples of left ventricular myocardium were collected from each dog. Total RNA was extracted from tissue samples, and RNA sequencing was performed on each sample. Samples from dogs with and without DCM were grouped to identify genes that were differentially regulated between the 2 populations. Overrepresentation analysis was performed on upregulated and downregulated gene sets to identify altered molecular pathways in dogs with DCM. RESULTS Genes involved in cellular energy metabolism, especially metabolism of carbohydrates and fats, were significantly downregulated in dogs with DCM. Expression of cardiac structural proteins was also altered in affected dogs. CONCLUSIONS AND CLINICAL RELEVANCE Results suggested that RNA sequencing may provide important insights into the pathogenesis of DCM in dogs and highlight pathways that should be explored to identify causative mutations and develop novel therapeutic interventions.

  14. Genome-wide DNA methylation analysis identifies MEGF10 as a novel epigenetically repressed candidate tumor suppressor gene in neuroblastoma.

    PubMed

    Charlet, Jessica; Tomari, Ayumi; Dallosso, Anthony R; Szemes, Marianna; Kaselova, Martina; Curry, Thomas J; Almutairi, Bader; Etchevers, Heather C; McConville, Carmel; Malik, Karim T A; Brown, Keith W

    2017-04-01

    Neuroblastoma is a childhood cancer in which many children still have poor outcomes, emphasising the need to better understand its pathogenesis. Despite recent genome-wide mutation analyses, many primary neuroblastomas do not contain recognizable driver mutations, implicating alternate molecular pathologies such as epigenetic alterations. To discover genes that become epigenetically deregulated during neuroblastoma tumorigenesis, we took the novel approach of comparing neuroblastomas to neural crest precursor cells, using genome-wide DNA methylation analysis. We identified 93 genes that were significantly differentially methylated of which 26 (28%) were hypermethylated and 67 (72%) were hypomethylated. Concentrating on hypermethylated genes to identify candidate tumor suppressor loci, we found the cell engulfment and adhesion factor gene MEGF10 to be epigenetically repressed by DNA hypermethylation or by H3K27/K9 methylation in neuroblastoma cell lines. MEGF10 showed significantly down-regulated expression in neuroblastoma tumor samples; furthermore patients with the lowest-expressing tumors had reduced relapse-free survival. Our functional studies showed that knock-down of MEGF10 expression in neuroblastoma cell lines promoted cell growth, consistent with MEGF10 acting as a clinically relevant, epigenetically deregulated neuroblastoma tumor suppressor gene. © 2016 The Authors. Molecular Carcinogenesis Published by Wiley Periodicals, Inc.

  15. Genome‐wide DNA methylation analysis identifies MEGF10 as a novel epigenetically repressed candidate tumor suppressor gene in neuroblastoma

    PubMed Central

    Charlet, Jessica; Tomari, Ayumi; Dallosso, Anthony R.; Szemes, Marianna; Kaselova, Martina; Curry, Thomas J.; Almutairi, Bader; Etchevers, Heather C.; McConville, Carmel; Malik, Karim T. A.

    2016-01-01

    Neuroblastoma is a childhood cancer in which many children still have poor outcomes, emphasising the need to better understand its pathogenesis. Despite recent genome‐wide mutation analyses, many primary neuroblastomas do not contain recognizable driver mutations, implicating alternate molecular pathologies such as epigenetic alterations. To discover genes that become epigenetically deregulated during neuroblastoma tumorigenesis, we took the novel approach of comparing neuroblastomas to neural crest precursor cells, using genome‐wide DNA methylation analysis. We identified 93 genes that were significantly differentially methylated of which 26 (28%) were hypermethylated and 67 (72%) were hypomethylated. Concentrating on hypermethylated genes to identify candidate tumor suppressor loci, we found the cell engulfment and adhesion factor gene MEGF10 to be epigenetically repressed by DNA hypermethylation or by H3K27/K9 methylation in neuroblastoma cell lines. MEGF10 showed significantly down‐regulated expression in neuroblastoma tumor samples; furthermore patients with the lowest‐expressing tumors had reduced relapse‐free survival. Our functional studies showed that knock‐down of MEGF10 expression in neuroblastoma cell lines promoted cell growth, consistent with MEGF10 acting as a clinically relevant, epigenetically deregulated neuroblastoma tumor suppressor gene. © 2016 The Authors. Molecular Carcinogenesis Published by Wiley Periodicals, Inc. PMID:27862318

  16. Analysis of bHLH coding genes using gene co-expression network approach.

    PubMed

    Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

    2016-07-01

    Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These could be used to identify modules of highly connected genes showing variation in the co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were considered to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes were identified showing multiple connections with other genes in the same cluster. Seed genes were found to vary across different time periods of stress. These analyzed seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
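
    The abstract does not say which ranking rule the authors used, so the following is only a sketch of one common rank-based construction: two genes are linked when each appears among the other's `top_k` most strongly correlated partners (absolute Pearson correlation). The function names, the `top_k` and `min_degree` parameters, and the expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def rank_based_network(expr, top_k):
    """Link two genes when each is among the other's top_k partners by |correlation|.

    expr: dict mapping gene name -> list of expression values over time points.
    Returns a set of frozenset({gene_a, gene_b}) edges.
    """
    genes = sorted(expr)
    top = {}
    for g in genes:
        others = [h for h in genes if h != g]
        others.sort(key=lambda h: abs(pearson(expr[g], expr[h])), reverse=True)
        top[g] = set(others[:top_k])
    return {frozenset((g, h)) for g in genes for h in top[g] if g in top[h]}


def seed_genes(edges, min_degree=2):
    """Genes with at least min_degree connections, sorted by name."""
    degree = {}
    for edge in edges:
        for g in edge:
            degree[g] = degree.get(g, 0) + 1
    return sorted(g for g, d in degree.items() if d >= min_degree)


# Made-up expression profiles over five time points of a stress treatment.
expr = {
    "geneH": [-2, -1, 0, 1, 2],
    "geneA": [-2, -1, 0, 2, 1],
    "geneB": [-1, -2, 0, 1, 2],
    "geneC": [2, -2, 0, -1, 1],
}
edges = rank_based_network(expr, top_k=2)
print(sorted(tuple(sorted(e)) for e in edges))
print(seed_genes(edges))
```

    Working on ranks rather than a fixed correlation cutoff keeps the network density comparable across time periods, which is one reason rank-based constructions are often described as robust.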

  17. Effect of the absolute statistic on gene-sampling gene-set analysis methods.

    PubMed

    Nam, Dougu

    2017-06-01

    Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
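
    The mechanics of a gene-sampling test with an absolute gene statistic can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the set score is taken as the mean per-gene statistic (absolute or signed), the null distribution comes from random gene sets of the same size, and the gene names and statistic values are made up. The sketch only shows how the absolute statistic captures bidirectional changes; it does not reproduce the false-positive-rate results reported in the abstract.

```python
import random


def gene_sampling_pvalue(gene_stats, gene_set, n_samples=2000, use_absolute=True, seed=0):
    """Gene-sampling p-value for a gene set.

    gene_stats: dict gene -> per-gene statistic (e.g. a t-statistic).
    gene_set: iterable of gene names; names missing from gene_stats are ignored.
    The set score is the mean (absolute or signed) statistic of its members,
    compared against random gene sets of the same size.
    """
    rng = random.Random(seed)
    scores = {g: (abs(s) if use_absolute else s) for g, s in gene_stats.items()}
    members = [g for g in gene_set if g in scores]
    k = len(members)
    if k == 0:
        raise ValueError("gene set has no genes with a statistic")
    observed = sum(scores[g] for g in members) / k
    all_genes = sorted(scores)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_genes, k)
        if sum(scores[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_samples + 1)


# Made-up statistics: 94 background genes with small effects,
# plus a set of 3 strongly up- and 3 strongly down-regulated genes.
stats = {f"bg{i}": ((i % 7) - 3) * 0.5 for i in range(94)}
bidirectional_set = []
for i in range(3):
    stats[f"up{i}"] = 3.0
    stats[f"down{i}"] = -3.0
    bidirectional_set += [f"up{i}", f"down{i}"]

print("absolute statistic:", gene_sampling_pvalue(stats, bidirectional_set, use_absolute=True))
print("signed statistic:  ", gene_sampling_pvalue(stats, bidirectional_set, use_absolute=False))
```

    With the signed statistic the up- and down-regulated members cancel out, so the set looks unremarkable; with the absolute statistic the same set stands out against random sets.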

  18. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI): neurofuzzy modelling (NFM) and artificial neural networks (ANN) and validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminate than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminate than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and that we did not compare regression identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  19. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment

    PubMed Central

    Uddin, Raihan; Singh, Shiva M.

    2017-01-01

    As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. Taken

  20. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment.

    PubMed

    Uddin, Raihan; Singh, Shiva M

    2017-01-01

    As humans age, many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI-related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in "learning and memory" related functions and pathways. Subsequent differential network analysis of this "learning and memory" module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. Taken together, they

  1. Transcriptome Analysis of Mango (Mangifera indica L.) Fruit Epidermal Peel to Identify Putative Cuticle-Associated Genes

    PubMed Central

    Tafolla-Arellano, Julio C.; Zheng, Yi; Sun, Honghe; Jiao, Chen; Ruiz-May, Eliel; Hernández-Oñate, Miguel A.; González-León, Alberto; Báez-Sañudo, Reginaldo; Fei, Zhangjun; Domozych, David; Rose, Jocelyn K. C.; Tiznado-Hernández, Martín E.

    2017-01-01

    Mango fruit (Mangifera indica L.) are highly perishable and have a limited shelf life, due to postharvest desiccation and senescence, which limits their global distribution. Recent studies of tomato fruit suggest that these traits are influenced by the expression of genes that are associated with cuticle metabolism. However, studies of these phenomena in mango fruit are limited by the lack of genome-scale data. To gain insight into mango cuticle biogenesis and identify putative cuticle-associated genes, we analyzed the transcriptomes of peels from ripe and overripe mango fruit using RNA-Seq. Approximately 400 million reads were generated and de novo assembled into 107,744 unigenes with a mean length of 1,717 bp. With this information, an online Mango RNA-Seq Database (http://bioinfo.bti.cornell.edu/cgi-bin/mango/index.cgi) was created as a valuable genomic resource for molecular research into the biology of mango fruit. RNA-Seq analysis suggested that the pathway leading to biosynthesis of the cuticle component, cutin, is up-regulated during overripening. These data were supported by analysis of the expression of several putative cuticle-associated genes and by gravimetric and microscopic studies of cuticle deposition, revealing a complex continuous pattern of cuticle deposition during fruit development and involving substantial accumulation during ripening/overripening. PMID:28425468

  2. Transcriptome Analysis of Mango (Mangifera indica L.) Fruit Epidermal Peel to Identify Putative Cuticle-Associated Genes.

    PubMed

    Tafolla-Arellano, Julio C; Zheng, Yi; Sun, Honghe; Jiao, Chen; Ruiz-May, Eliel; Hernández-Oñate, Miguel A; González-León, Alberto; Báez-Sañudo, Reginaldo; Fei, Zhangjun; Domozych, David; Rose, Jocelyn K C; Tiznado-Hernández, Martín E

    2017-04-20

    Mango fruit (Mangifera indica L.) are highly perishable and have a limited shelf life, due to postharvest desiccation and senescence, which limits their global distribution. Recent studies of tomato fruit suggest that these traits are influenced by the expression of genes that are associated with cuticle metabolism. However, studies of these phenomena in mango fruit are limited by the lack of genome-scale data. To gain insight into mango cuticle biogenesis and identify putative cuticle-associated genes, we analyzed the transcriptomes of peels from ripe and overripe mango fruit using RNA-Seq. Approximately 400 million reads were generated and de novo assembled into 107,744 unigenes with a mean length of 1,717 bp. With this information, an online Mango RNA-Seq Database (http://bioinfo.bti.cornell.edu/cgi-bin/mango/index.cgi) was created as a valuable genomic resource for molecular research into the biology of mango fruit. RNA-Seq analysis suggested that the pathway leading to biosynthesis of the cuticle component, cutin, is up-regulated during overripening. These data were supported by analysis of the expression of several putative cuticle-associated genes and by gravimetric and microscopic studies of cuticle deposition, revealing a complex continuous pattern of cuticle deposition during fruit development and involving substantial accumulation during ripening/overripening.

  3. Transcriptome Analysis of Mango (Mangifera indica L.) Fruit Epidermal Peel to Identify Putative Cuticle-Associated Genes

    NASA Astrophysics Data System (ADS)

    Tafolla-Arellano, Julio C.; Zheng, Yi; Sun, Honghe; Jiao, Chen; Ruiz-May, Eliel; Hernández-Oñate, Miguel A.; González-León, Alberto; Báez-Sañudo, Reginaldo; Fei, Zhangjun; Domozych, David; Rose, Jocelyn K. C.; Tiznado-Hernández, Martín E.

    2017-04-01

    Mango fruit (Mangifera indica L.) are highly perishable and have a limited shelf life, due to postharvest desiccation and senescence, which limits their global distribution. Recent studies of tomato fruit suggest that these traits are influenced by the expression of genes that are associated with cuticle metabolism. However, studies of these phenomena in mango fruit are limited by the lack of genome-scale data. To gain insight into mango cuticle biogenesis and identify putative cuticle-associated genes, we analyzed the transcriptomes of peels from ripe and overripe mango fruit using RNA-Seq. Approximately 400 million reads were generated and de novo assembled into 107,744 unigenes with a mean length of 1,717 bp. With this information, an online Mango RNA-Seq Database (http://bioinfo.bti.cornell.edu/cgi-bin/mango/index.cgi) was created as a valuable genomic resource for molecular research into the biology of mango fruit. RNA-Seq analysis suggested that the pathway leading to biosynthesis of the cuticle component, cutin, is up-regulated during overripening. These data were supported by analysis of the expression of several putative cuticle-associated genes and by gravimetric and microscopic studies of cuticle deposition, revealing a complex continuous pattern of cuticle deposition during fruit development and involving substantial accumulation during ripening/overripening.

  4. Gene co-expression analysis identifies gene clusters associated with isotropic and polarized growth in Aspergillus fumigatus conidia.

    PubMed

    Baltussen, Tim J H; Coolen, Jordy P M; Zoll, Jan; Verweij, Paul E; Melchers, Willem J G

    2018-04-26

    Aspergillus fumigatus is a saprophytic fungus that extensively produces conidia. These microscopic asexually reproductive structures are small enough to reach the lungs. Germination of conidia followed by hyphal growth inside human lungs is a key step in the establishment of infection in immunocompromised patients. RNA-Seq was used to analyze the transcriptome of dormant and germinating A. fumigatus conidia. Construction of a gene co-expression network revealed four gene clusters (modules) correlated with a growth phase (dormant, isotropic growth, polarized growth). Transcripts levels of genes encoding for secondary metabolites were high in dormant conidia. During isotropic growth, transcript levels of genes involved in cell wall modifications increased. Two modules encoding for growth and cell cycle/DNA processing were associated with polarized growth. In addition, the co-expression network was used to identify highly connected intermodular hub genes. These genes may have a pivotal role in the respective module and could therefore be compelling therapeutic targets. Generally, cell wall remodeling is an important process during isotropic and polarized growth, characterized by an increase of transcripts coding for hyphal growth and cell cycle/DNA processing when polarized growth is initiated. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Network Inference Analysis Identifies an APRR2-Like Gene Linked to Pigment Accumulation in Tomato and Pepper Fruits

    PubMed Central

    Pan, Yu; Bradley, Glyn; Pyke, Kevin; Ball, Graham; Lu, Chungui; Fray, Rupert; Marshall, Alexandra; Jayasuta, Subhalai; Baxter, Charles; van Wijk, Rik; Boyden, Laurie; Cade, Rebecca; Chapman, Natalie H.; Fraser, Paul D.; Hodgman, Charlie; Seymour, Graham B.

    2013-01-01

    Carotenoids represent some of the most important secondary metabolites in the human diet, and tomato (Solanum lycopersicum) is a rich source of these health-promoting compounds. In this work, a novel and fruit-related regulator of pigment accumulation in tomato has been identified by artificial neural network inference analysis and its function validated in transgenic plants. A tomato fruit gene regulatory network was generated using artificial neural network inference analysis and transcription factor gene expression profiles derived from fruits sampled at various points during development and ripening. One of the transcription factor gene expression profiles with a sequence related to an Arabidopsis (Arabidopsis thaliana) ARABIDOPSIS PSEUDO RESPONSE REGULATOR2-LIKE gene (APRR2-Like) was up-regulated at the breaker stage in wild-type tomato fruits and, when overexpressed in transgenic lines, increased plastid number, area, and pigment content, enhancing the levels of chlorophyll in immature unripe fruits and carotenoids in red ripe fruits. Analysis of the transcriptome of transgenic lines overexpressing the tomato APRR2-Like gene revealed up-regulation of several ripening-related genes in the overexpression lines, providing a link between the expression of this tomato gene and the ripening process. A putative ortholog of the tomato APRR2-Like gene in sweet pepper (Capsicum annuum) was associated with pigment accumulation in fruit tissues. We conclude that the function of this gene is conserved across taxa and that it encodes a protein that has an important role in ripening. PMID:23292788

  6. A meta-analysis of public microarray data identifies biological regulatory networks in Parkinson's disease.

    PubMed

    Su, Lining; Wang, Chunjie; Zheng, Chenqing; Wei, Huiping; Song, Xiaoqing

    2018-04-13

    Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation. We examined common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. We then screened the PD-specific genes from the common DEGs using GCBI. Next, we used a series of bioinformatics tools to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identified the mTF-miRNA-gene-gTF network. Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs in our study. Analysis of the mTF-miRNA-gene-gTF network for PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs. Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNA and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD. The SNPs identified in our

  7. A Sleeping Beauty forward genetic screen identifies new genes and pathways driving osteosarcoma development and metastasis

    PubMed Central

    Moriarity, Branden S; Otto, George M; Rahrmann, Eric P; Rathe, Susan K; Wolf, Natalie K; Weg, Madison T; Manlove, Luke A; LaRue, Rebecca S; Temiz, Nuri A; Molyneux, Sam D; Choi, Kwangmin; Holly, Kevin J; Sarver, Aaron L; Scott, Milcah C; Forster, Colleen L; Modiano, Jaime F; Khanna, Chand; Hewitt, Stephen M; Khokha, Rama; Yang, Yi; Gorlick, Richard; Dyer, Michael A; Largaespada, David A

    2016-01-01

    Osteosarcomas are sarcomas of the bone, derived from osteoblasts or their precursors, with a high propensity to metastasize. Osteosarcoma is associated with massive genomic instability, making it problematic to identify driver genes using human tumors or prototypical mouse models, many of which involve loss of Trp53 function. To identify the genes driving osteosarcoma development and metastasis, we performed a Sleeping Beauty (SB) transposon-based forward genetic screen in mice with and without somatic loss of Trp53. Common insertion site (CIS) analysis of 119 primary tumors and 134 metastatic nodules identified 232 sites associated with osteosarcoma development and 43 sites associated with metastasis, respectively. Analysis of CIS-associated genes identified numerous known and new osteosarcoma-associated genes enriched in the ErbB, PI3K-AKT-mTOR and MAPK signaling pathways. Lastly, we identified several oncogenes involved in axon guidance, including Sema4d and Sema6d, which we functionally validated as oncogenes in human osteosarcoma. PMID:25961939

  8. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for interpreting large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
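    The sensitivity to initial values noted in the abstract above can be illustrated with a minimal K-means sketch. It is not taken from the review: the four two-condition "expression profiles" and both sets of starting centroids are hypothetical toy values, and the function names are illustrative only. Run twice on the same data, the algorithm converges to two different partitions.

```python
import math

def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's K-means with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical profiles: expression of four genes under two conditions.
profiles = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

labels_a, cent_a = kmeans(profiles, [(0.0, 0.5), (4.0, 0.5)])
labels_b, cent_b = kmeans(profiles, [(2.0, 0.0), (2.0, 1.0)])

print(labels_a, within_cluster_ss(profiles, labels_a, cent_a))
print(labels_b, within_cluster_ss(profiles, labels_b, cent_b))
```

    Both runs are stable fixed points of the algorithm, but the second start yields a much larger within-cluster sum of squares, which is one reason repeated restarts are commonly used in practice.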

  9. Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome.

    PubMed

    Morine, Melissa J; McMonagle, Jolene; Toomey, Sinead; Reynolds, Clare M; Moloney, Aidan P; Gormley, Isobel C; Gaora, Peadar O; Roche, Helen M

    2010-10-07

    constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease.
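    The abstract above does not say how Fisher's exact test was set up. A common form in pathway analysis is the one-sided overrepresentation test, sketched here under that assumption; the function name and the counts are hypothetical.

```python
from math import comb

def fisher_enrichment_p(overlap, selected, pathway, total):
    """One-sided Fisher's exact test for overrepresentation.

    Returns P(X >= overlap), where X is the number of pathway genes found
    in a random draw of `selected` genes from `total` genes, of which
    `pathway` belong to the pathway (hypergeometric upper tail).
    """
    denom = comb(total, selected)
    upper = min(selected, pathway)
    tail = sum(
        comb(pathway, i) * comb(total - pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / denom

# Hypothetical counts: 10 genes measured, 3 in the pathway,
# 3 selected as correlated with a clinical marker, all 3 in the pathway.
p_value = fisher_enrichment_p(overlap=3, selected=3, pathway=3, total=10)
print(p_value)
```

    A small p-value indicates that the selected genes fall in the pathway more often than expected by chance. With many pathways tested, a multiple-testing correction would normally follow.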

  10. Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

    PubMed Central

    2010-01-01

    -sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Conclusion: Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease. PMID:20929581

  11. A framework to identify gene expression profiles in a model of inflammation induced by lipopolysaccharide after treatment with thalidomide

    PubMed Central

    2012-01-01

    Background: Thalidomide is an anti-inflammatory and anti-angiogenic drug currently used for the treatment of several diseases, including erythema nodosum leprosum, which occurs in patients with lepromatous leprosy. In this research, we use DNA microarray analysis to identify the impact of thalidomide on gene expression responses in human cells after lipopolysaccharide (LPS) stimulation. We employed a two-stage framework. Initially, we identified 1584 altered genes in response to LPS. Modulation of this set of genes was then analyzed in the LPS stimulated cells treated with thalidomide. Results: We identified 64 genes with altered expression induced by thalidomide using the rank product method. In addition, the lists of up-regulated and down-regulated genes were investigated by means of bioinformatics functional analysis, which allowed for the identification of biological processes affected by thalidomide. Confirmatory analysis was done in five of the identified genes using real time PCR. Conclusions: The results showed some genes that can further our understanding of the biological mechanisms in the action of thalidomide. Of the five genes evaluated with real time PCR, three were down regulated and two were up regulated confirming the initial results of the microarray analysis. PMID:22695124
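    The rank product statistic named in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it computes only the statistic (the geometric mean of a gene's ranks across replicates), omits the permutation step usually used to assess significance, and ignores ties. The gene names and log fold changes are hypothetical.

```python
import math

def rank_products(log_fold_changes):
    """Geometric mean of each gene's rank across replicates.

    log_fold_changes maps gene -> list of values, one per replicate.
    Rank 1 is the most strongly up-regulated gene in a replicate, so a
    small rank product marks consistent up-regulation.
    """
    genes = list(log_fold_changes)
    n_rep = len(log_fold_changes[genes[0]])
    log_sum = {g: 0.0 for g in genes}
    for r in range(n_rep):
        ordered = sorted(genes, key=lambda g: log_fold_changes[g][r], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)  # sum logs to avoid large products
    return {g: math.exp(log_sum[g] / n_rep) for g in genes}

# Hypothetical data: four genes, three replicate comparisons.
data = {
    "geneA": [3.0, 2.5, 2.8],
    "geneB": [0.1, 2.9, -0.2],
    "geneC": [1.0, 0.5, 0.9],
    "geneD": [-1.0, -0.5, 0.0],
}
rp = rank_products(data)
for gene in sorted(rp, key=rp.get):
    print(gene, round(rp[gene], 3))
```

    Here geneA ranks near the top in every replicate and so receives the smallest rank product, while geneB, which is high in only one replicate, does not. Down-regulated genes would be ranked the same way with the sort order reversed.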

  12. Gene Expression Analysis Reveals New Possible Mechanisms of Vancomycin-Induced Nephrotoxicity and Identifies Gene Markers Candidates

    PubMed Central

    Dieterich, Christine; Puey, Angela; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C.; Ng, Hanna H.

    2009-01-01

    Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription–polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury. PMID:18930951

  13. Gene expression analysis reveals new possible mechanisms of vancomycin-induced nephrotoxicity and identifies gene markers candidates.

    PubMed

    Dieterich, Christine; Puey, Angela; Lin, Sylvia; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C; Ng, Hanna H

    2009-01-01

    Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription-polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury.

  14. ADAGE signature analysis: differential expression analysis with data-defined gene sets.

    PubMed

    Tan, Jie; Huyck, Matthew; Hu, Dongbo; Zelaya, René A; Hogan, Deborah A; Greene, Casey S

    2017-11-22

    Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data. Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server ( http://adage.greenelab.com ) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr. We designed

  15. GeneCOST: a novel scoring-based prioritization framework for identifying disease causing genes.

    PubMed

    Ozer, Bugra; Sağıroğlu, Mahmut; Demirci, Hüseyin

    2015-11-15

    Due to the big data produced by next-generation sequencing studies, there is an evident need for methods to extract the valuable information gathered from these experiments. In this work, we propose GeneCOST, a novel scoring-based method to evaluate every gene for its disease association. Without any prior filtering and any prior knowledge, we assign a disease likelihood score to each gene in correspondence with its variations. Then, we rank all genes based on frequency, conservation, pedigree and detailed variation information to find the cause of the disease state. We demonstrate the usage of GeneCOST with public and real life Mendelian disease cases including recessive, dominant, compound heterozygous and sporadic models. As a result, we were able to identify the cause of the disease state in the top rankings of our list, showing that this novel prioritization framework provides a powerful environment for analysis in genetic disease studies as an alternative to filtering-based approaches. Availability: GeneCOST software is freely available at www.igbam.bilgem.tubitak.gov.tr/en/softwares/genecost-en/index.html. Contact: buozer@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Allele specific expression analysis identifies regulatory variation associated with stress-related genes in the Mexican highland maize landrace Palomero Toluqueño

    PubMed Central

    González-Segovia, Eric; Ross-Ibarra, Jeffrey; Simpson, June K.

    2017-01-01

    Background: Gene regulatory variation has been proposed to play an important role in the adaptation of plants to environmental stress. In the central highlands of Mexico, farmer selection has generated a unique group of maize landraces adapted to the challenges of the highland niche. In this study, gene expression in Mexican highland maize and a reference maize breeding line were compared to identify evidence of regulatory variation in stress-related genes. It was hypothesised that local adaptation in Mexican highland maize would be associated with a transcriptional signature observable even under benign conditions. Methods: Allele specific expression analysis was performed using the seedling-leaf transcriptome of an F1 individual generated from the cross between the highland adapted Mexican landrace Palomero Toluqueño and the reference line B73, grown under benign conditions. Results were compared with a published dataset describing the transcriptional response of B73 seedlings to cold, heat, salt and UV treatments. Results: A total of 2,386 genes were identified to show allele specific expression. Of these, 277 showed an expression difference between Palomero Toluqueño and B73 alleles under benign conditions that anticipated the response of B73 to cold, heat, salt and/or UV treatments, and, as such, were considered to display a prior stress response. Prior stress response candidates included genes associated with plant hormone signaling and a number of transcription factors. Construction of a gene co-expression network revealed further signaling and stress-related genes to be among the potential targets of the transcription factor candidates. Discussion: Prior activation of responses may represent the best strategy when stresses are severe but predictable. Expression differences observed here between Palomero Toluqueño and B73 alleles indicate the presence of cis-acting regulatory variation linked to stress-related genes in Palomero Toluqueño. Considered alongside

  17. Identification of gene expression profiles and key genes in subchondral bone of osteoarthritis using weighted gene coexpression network analysis.

    PubMed

    Guo, Sheng-Min; Wang, Jian-Xiong; Li, Jin; Xu, Fang-Yuan; Wei, Quan; Wang, Hai-Ming; Huang, Hou-Qiang; Zheng, Si-Lin; Xie, Yu-Jie; Zhang, Chi

    2018-06-15

    Osteoarthritis (OA) significantly influences the quality of life of people around the world. It is urgent to find an effective way to understand the genetic etiology of OA. We used weighted gene coexpression network analysis (WGCNA) to explore the key genes involved in the subchondral bone pathological process of OA. Fifty gene expression profiles of GSE51588 were downloaded from the Gene Expression Omnibus database. The OA-associated genes and gene ontologies were acquired from JuniorDoc. Weighted gene coexpression network analysis was used to find disease-related networks based on 21756 gene expression correlation coefficients, hub-genes with the highest connectivity in each module were selected, and the correlation between module eigengene and clinical traits was calculated. The genes in the traits-related gene coexpression modules were subject to functional annotation and pathway enrichment analysis using ClusterProfiler. A total of 73 gene modules were identified, of which 12 modules were found with high connectivity with clinical traits. Five modules were found with enriched OA-associated genes. Moreover, 310 OA-associated genes were found, and 34 of them were among hub-genes in each module. Consequently, enrichment results indicated some key metabolic pathways, such as extracellular matrix (ECM)-receptor interaction (hsa04512), focal adhesion (hsa04510), the phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway (PI3K-AKT) (hsa04151), transforming growth factor beta pathway, and Wnt pathway. We intended to identify some core genes, collagen (COL)6A3, COL6A1, ITGA11, BAMBI, and HCK, which could influence downstream signaling pathways once they were activated. In this study, we identified important genes within key coexpression modules, which associate with a pathological process of subchondral bone in OA. Functional analysis results could provide important information to understand the mechanism of OA. © 2018 Wiley Periodicals, Inc.
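    The selection of hub genes by connectivity, as described in the abstract above, can be sketched in a few lines. This is a simplified illustration of the unsigned weighted-network idea (adjacency as the absolute Pearson correlation raised to a soft-threshold power), not the WGCNA R package and not the authors' analysis. The power of 6, the gene names and the expression values are assumptions chosen for the example, and module detection is omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity(expr, beta=6):
    """Weighted connectivity k_i = sum over j != i of |cor(i, j)| ** beta."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            adjacency = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += adjacency
            k[gj] += adjacency
    return k

# Hypothetical expression of four genes across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.5],   # tracks geneA
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.2],    # mirrors geneA (counts in an unsigned network)
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],    # weakly related to the others
}
k = connectivity(expr)
hub = max(k, key=k.get)
print(hub, {g: round(v, 3) for g, v in k.items()})
```

    Raising correlations to a power suppresses weak links, so geneD contributes almost nothing to any gene's connectivity, and the gene most strongly correlated with the rest emerges as the hub. In a full analysis the same ranking would be applied within each detected module.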

  18. Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets

    PubMed Central

    Vinayagam, Arunachalam; Gibson, Travis E.; Lee, Ho-Joon; Yilmazel, Bahar; Roesel, Charles; Hu, Yanhui; Kwon, Young; Sharma, Amitabh; Liu, Yang-Yu; Perrimon, Norbert; Barabási, Albert-László

    2016-01-01

    The protein–protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as “indispensable,” “neutral,” or “dispensable,” which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network’s control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets. PMID:27091990
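
    The node classification described above can be sketched on a toy network. The sketch assumes the standard structural-controllability result that the minimum number of driver nodes equals the number of nodes left unmatched by a maximum matching of the directed graph (at least one). The function names and toy graphs are illustrative and are not the authors' code.

```python
def max_matching(nodes, edges):
    """Size of a maximum matching between out-copies and in-copies of nodes."""
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_to = {}  # in-copy v -> out-copy u currently matched to it

    def try_augment(u, seen):
        # Depth-first search for an augmenting path starting at out-copy u.
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_to or try_augment(matched_to[v], seen):
                matched_to[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in nodes)

def n_driver_nodes(nodes, edges):
    """Minimum number of driver nodes: unmatched nodes, but never fewer than 1."""
    return max(len(nodes) - max_matching(nodes, edges), 1)

def classify(nodes, edges, target):
    """Label a node by how its removal changes the number of driver nodes."""
    before = n_driver_nodes(nodes, edges)
    rest = [n for n in nodes if n != target]
    kept = [(u, v) for u, v in edges if target not in (u, v)]
    after = n_driver_nodes(rest, kept)
    if after > before:
        return "indispensable"
    if after < before:
        return "dispensable"
    return "neutral"

# Toy examples: a directed chain A -> B -> C -> D and a star A -> B, A -> C.
chain_nodes = ["A", "B", "C", "D"]
chain_edges = [("A", "B"), ("B", "C"), ("C", "D")]
star_nodes = ["A", "B", "C"]
star_edges = [("A", "B"), ("A", "C")]
```

    On the chain, removing an interior node splits the path and raises the driver-node count, so that node is indispensable; on the star, removing a leaf lowers the count, so the leaf is dispensable.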

  19. Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity.

    PubMed

    Miao, Xiangyang; Luo, Qingmiao; Qin, Xiaoyu

    2016-05-10

    Goats are widely kept as livestock throughout the world. Two excellent domestic breeds in China, the Laiwu Black and Jining Grey goats, have different fecundities and prolificacies. Although the goat genome sequences have been resolved recently, little is known about the gene regulations at the transcriptional level in goat. To understand the molecular and genetic mechanisms related to the fecundities and prolificacies, we performed genome-wide sequencing of the mRNAs from two breeds of goat using the next-generation RNA-Seq technology and used functional annotation to identify pathways of interest. Digital gene expression analysis showed 338 genes were up-regulated in the Jining Grey goats and 404 were up-regulated in the Laiwu Black goats. Quantitative real-time PCR verified the reliability of the RNA-Seq data. This study suggests that multiple genes responsible for various biological functions and signaling pathways are differentially expressed in the two different goat breeds, and these genes might be involved in the regulation of goat fecundity and prolificacy. Taken together, our study provides insight into the transcriptional regulation in the ovaries of two breeds of goats that might serve as a key resource for understanding goat fecundity, prolificacy and genetic diversity between breeds. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. Random forests-based differential analysis of gene sets for gene expression data.

    PubMed

    Hsueh, Huey-Miin; Zhou, Da-Wei; Tsai, Chen-An

    2013-04-10

    In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients. In this study, we propose a method of gene set analysis, in which gene sets are used to develop classifications of patients based on the Random Forest (RF) algorithm. The corresponding empirical p-value of an observed out-of-bag (OOB) error rate of the classifier is introduced to identify differentially expressed gene sets using an adequate resampling method. In addition, we discuss the impacts and correlations of genes within each gene set based on the measures of variable importance in the RF algorithm. Significant classifications are reported and visualized together with the underlying gene sets and their contribution to the phenotypes of interest. Numerical studies using both synthesized data and a series of publicly available gene expression data sets are conducted to evaluate the performance of the proposed methods. Compared with other hypothesis testing approaches, our proposed methods are reliable and successful in identifying enriched gene sets and in discovering the contributions of genes within a gene set. The classification results of identified gene sets can provide a valuable alternative to gene set testing to reveal the unknown, biologically relevant classes of samples or patients. In summary, our proposed method allows one to simultaneously assess the discriminatory ability of gene sets and the importance of genes for
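
    The empirical p-value of an observed OOB error rate can be sketched as follows. The abstract does not specify the resampling scheme, so this sketch assumes that the classifier is refitted on data with permuted phenotype labels and that the resulting OOB error rates are supplied as a list. The add-one correction is a common convention and may differ from the authors' choice.

```python
def empirical_p_value(observed_error, permuted_errors):
    """Share of resampled error rates at least as low as the observed one.

    observed_error: OOB error rate of the classifier on the real labels.
    permuted_errors: OOB error rates obtained after resampling (assumed here
    to come from label permutation; fitting the forests is out of scope).
    """
    as_good = sum(1 for e in permuted_errors if e <= observed_error)
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + as_good) / (1 + len(permuted_errors))

# Hypothetical error rates, for illustration only.
p = empirical_p_value(0.10, [0.50, 0.40, 0.05, 0.60])
```

    A gene set would then be called differentially expressed when this p-value falls below a chosen significance threshold.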

  1. Male specific genes from dioecious white campion identified by fluorescent differential display.

    PubMed

    Scutt, Charles P; Jenkins, Tom; Furuya, Masaki; Gilmartin, Philip M

    2002-05-01

    Fluorescent differential display (FDD) has been used to screen for cDNAs that are differentially up-regulated in male flowers of the dioecious plant Silene latifolia in which an X/Y chromosome system of sex determination operates. To adapt FDD to the cloning of large numbers of differential cDNAs, a novel method of confirming the differential expression of these has been devised. FDD gels were Southern electro-blotted and probed with mixtures of individual cDNA clones derived from different FDD product ligation reactions. These Southern blots were then stripped and re-probed with further mixtures of individual cloned FDD products to identify the maximum number of recombinant clones carrying the true differential amplification products. Of 135 differential bands identified by FDD, 56 differential amplification products were confirmed; these represent 23 unique differentially expressed genes as determined by virtual Northern analysis and two genes expressed at or below the level of detection by virtual Northern analysis. These two low expressed genes show bands of hybridization on genomic Southern blots that are specific to male plants, indicating that they are derived from, or closely related to, Y chromosome genes.

  2. A transposon-based genetic screen in mice identifies genes altered in colorectal cancer.

    PubMed

    Starr, Timothy K; Allaei, Raha; Silverstein, Kevin A T; Staggs, Rodney A; Sarver, Aaron L; Bergemann, Tracy L; Gupta, Mihir; O'Sullivan, M Gerard; Matise, Ilze; Dupuy, Adam J; Collier, Lara S; Powers, Scott; Oberg, Ann L; Asmann, Yan W; Thibodeau, Stephen N; Tessarollo, Lino; Copeland, Neal G; Jenkins, Nancy A; Cormier, Robert T; Largaespada, David A

    2009-03-27

    Human colorectal cancers (CRCs) display a large number of genetic and epigenetic alterations, some of which are causally involved in tumorigenesis (drivers) and others that have little functional impact (passengers). To help distinguish between these two classes of alterations, we used a transposon-based genetic screen in mice to identify candidate genes for CRC. Mice harboring mutagenic Sleeping Beauty (SB) transposons were crossed with mice expressing SB transposase in gastrointestinal tract epithelium. Most of the offspring developed intestinal lesions, including intraepithelial neoplasia, adenomas, and adenocarcinomas. Analysis of over 16,000 transposon insertions identified 77 candidate CRC genes, 60 of which are mutated and/or dysregulated in human CRC and thus are most likely to drive tumorigenesis. These genes include APC, PTEN, and SMAD4. The screen also identified 17 candidate genes that had not previously been implicated in CRC, including POLI, PTPRK, and RSPO2.

  3. Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.

    PubMed

    Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas

    2017-01-21

    We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses at various tissues, respectively. We tested differential gene expression by empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also presented that 12.8% and 59.0% pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% pairwise studies had independent expression association for genes, but no studies were observed significantly different for expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to the gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes at different tissues.

  4. Time-Course Gene Set Analysis for Longitudinal Gene Expression Data

    PubMed Central

    Hejblum, Boris P.; Skinner, Jason; Thiébaut, Rodolphe

    2015-01-01

    Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates if the time patterns of gene sets significantly differ according to these conditions. The interest of the method is illustrated by its application to two real life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets that were found to be linked with the influenza vaccine too, although previous analyses had found them to be associated with the pneumococcal vaccine only. In our simulation study TcGSA exhibits good statistical properties, and an increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available for the community through an R package. PMID:26111374

  5. Identifying Candidate Reprogramming Genes in Mouse Induced Pluripotent Stem Cells.

    PubMed

    Gao, Fang; Li, Jingyu; Zhang, Heng; Yang, Xu; An, Tiezhu

    2017-08-01

    Factor-based induced reprogramming approaches have tremendous potential for human regenerative medicine, but the efficiencies of these approaches are still low. In this study, we analyzed the global transcriptional profiles of mouse induced pluripotent stem cells (miPSCs) and mouse embryonic stem cells (mESCs) from seven different labs and present here the first successful clustering according to cell type, not by lab of origin. We identified 2131 differentially expressed genes (DEs) as candidate pluripotency-associated genes by comparing mESCs/miPSCs with somatic cells and 720 DEs between miPSCs and mESCs. Interestingly, there was a significant overlap between the two DE sets. Therefore, we defined the overlap DEs as "consensus DEs" including 313 miPSC-specific genes expressed at a higher level in miPSCs versus mESCs and 184 mESC-specific genes in total and reasoned that these may contribute to the differences in pluripotency between mESCs and miPSCs. A classification of "consensus DEs" according to their different expression levels between somatic cells and mESCs/miPSCs shows that 86% of the miPSC-specific genes are more highly expressed in somatic cells, while 73% of mESC-specific genes are highly expressed in mESCs/miPSCs, indicating that the miPSCs have not efficiently silenced the expression pattern of the somatic cells from which they are derived and have failed to completely induce the genes with high expression levels in mESCs. We further revealed a strong correlation between oocyte-enriched factors and insufficiently induced mESC-specific genes and identified 11 hub genes via network analysis. In light of these findings, we postulated that these key hub genes might not only drive somatic cell nuclear transfer (SCNT) reprogramming but also augment the efficiency and quality of miPSC reprogramming.

  6. Identifying gene networks underlying the neurobiology of ethanol and alcoholism.

    PubMed

    Wolen, Aaron R; Miles, Michael F

    2012-01-01

    For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.

  7. High-Resolution Melting Curve Analysis of the 16S Ribosomal Gene to Detect and Identify Pathogenic and Saprophytic Leptospira species in Colombian Isolates

    PubMed Central

    Peláez Sánchez, Ronald G.; Quintero, Juan Álvaro López; Pereira, Martha María; Agudelo-Flórez, Piedad

    2017-01-01

    It is important to identify the circulating Leptospira agent to enhance the performance of serodiagnostic tests by incorporating specific antigens of native species, develop vaccines that take into account the species/serovars circulating in different regions, and optimize prevention and control strategies. The objectives of this study were to develop a polymerase chain reaction (PCR)–high-resolution melting (HRM) assay for differentiating between species of the genus Leptospira and to verify its usefulness in identifying unknown samples to species level. A set of primers from the initial region of the 16S ribosomal gene was designed to detect and differentiate the 22 species of Leptospira. Eleven reference strains were used as controls to establish the reference species and differential melting curves. Twenty-five Colombian Leptospira isolates were studied to evaluate the usefulness of the PCR–HRM assay in identifying unknown samples to species level. This identification was confirmed by sequencing and phylogenetic analysis of the 16S ribosomal gene. Eleven Leptospira species were successfully identified, except for Leptospira meyeri/Leptospira yanagawae because the sequences were 100% identical. The 25 isolates from humans, animals, and environmental water sources were identified as Leptospira santarosai (twelve), Leptospira interrogans (nine), and L. meyeri/L. yanagawae (four). The species verification was 100% concordant between PCR–HRM and phylogenetic analysis of the 16S ribosomal gene. The PCR–HRM assay designed in this study is a useful tool for identifying Leptospira species from isolates. PMID:28500802

  8. Whole-genome transcription and DNA methylation analysis of peripheral blood mononuclear cells identified aberrant gene regulation pathways in systemic lupus erythematosus.

    PubMed

    Zhu, Honglin; Mi, Wentao; Luo, Hui; Chen, Tao; Liu, Shengxi; Raman, Indu; Zuo, Xiaoxia; Li, Quan-Zhen

    2016-07-13

    Recent achievements in genetics and epigenetics have led to the exploration of the pathogenesis of systemic lupus erythematosus (SLE). Identification of differentially expressed genes and their regulatory mechanism(s) at whole-genome level will provide a comprehensive understanding of the development of SLE and its devastating complication, lupus nephritis (LN). We performed whole-genome transcription and DNA methylation analysis in PBMC of 30 SLE patients, including 15 with LN (SLE LN(+)) and 15 without LN (SLE LN(-)), and 25 normal controls (NC) using HumanHT-12 Beadchips and Illumina Human Methy450 chips. The serum proinflammatory cytokines were quantified using Bio-plex Human Cytokine 27-plex assay. Differentially expressed genes and differentially methylated CpG were analyzed with GenomeStudio, R, and SAM software. The association between DNA methylation and gene expression was tested. Gene interaction pathways of the differentially expressed genes were analyzed by IPA software. We identified 552 upregulated genes and 550 downregulated genes in PBMC of SLE. Integration of DNA methylation and gene expression profiling showed that 334 upregulated genes were hypomethylated, and 479 downregulated genes were hypermethylated. Pathway analysis on the differential genes in SLE revealed significant enrichment in interferon (IFN) signaling and toll-like receptor (TLR) signaling pathways. Nine IFN- and seven TLR-related genes were identified and displayed step-wise increase in SLE LN(-) and SLE LN(+). Hypomethylated CpG sites were detected on these genes. The gene expressions for MX1, GPR84, and E2F2 were increased in SLE LN(+) as compared to SLE LN(-) patients. The serum levels of inflammatory cytokines, including IL17A, IP-10, bFGF, TNF-α, IL-6, IL-15, GM-CSF, IL-1RA, IL-5, and IL-12p70, were significantly elevated in SLE compared with NC. The levels of IL-15 and IL-1RA correlated with their mRNA expression. The upregulation of IL-15 may be regulated by hypomethylated

  9. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    PubMed

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  10. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes

    PubMed Central

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-01-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information. PMID:22384404

  11. Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice

    PubMed Central

    Leduc, Magalie S.; Hageman, Rachael S.; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly

    2011-01-01

    To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits. PMID:21622629

  12. Analysis of the nucleoprotein gene identifies three distinct lineages of viral haemorrhagic septicemia virus (VHSV) within the European marine environment

    USGS Publications Warehouse

    Snow, M.; Cunningham, C.O.; Melvin, W.T.; Kurath, G.

    1999-01-01

    A ribonuclease (RNase) protection assay (RPA) has been used to detect nucleotide sequence variation within the nucleoprotein gene of 39 viral haemorrhagic septicaemia virus (VHSV) isolates of European marine origin. The classification of VHSV isolates based on RPA cleavage patterns permitted the identification of ten distinct groups of viruses based on differences at the molecular level. The nucleotide sequence of representatives of each of these groupings was determined and subjected to phylogenetic analysis. This revealed grouping of the European marine isolates of VHSV into three genotypes circulating within distinct geographic areas. A fourth genotype was identified comprising isolates originating from North America. Phylogenetic analyses indicated that VHSV isolates recovered from wild caught fish around the British Isles were genetically related to isolates responsible for losses in farmed turbot. Furthermore, a relationship between naturally occurring marine isolates and VHSV isolates causing mortality among rainbow trout in continental Europe was demonstrated. Analysis of the nucleoprotein gene identifies distinct lineages of viral haemorrhagic septicaemia virus within the European marine environment. Virus Res. 63, 35-44.

  13. A general method for identifying major hybrid male sterility genes in Drosophila.

    PubMed

    Zeng, L W; Singh, R S

    1995-10-01

    The genes responsible for hybrid male sterility in species crosses are usually identified by introgressing chromosome segments, monitored by visible markers, between closely related species by continuous backcrosses. This commonly used method, however, suffers from two problems. First, it relies on the availability of markers to monitor the introgressed regions and so the portion of the genome examined is limited to the marked regions. Secondly, the introgressed regions are usually large and it is impossible to tell if the effects of the introgressed regions are the result of single (or few) major genes or many minor genes (polygenes). Here we introduce a simple and general method for identifying putative major hybrid male sterility genes which is free of these problems. In this method, the actual hybrid male sterility genes (rather than markers), or tightly linked gene complexes with large effects, are selectively introgressed from one species into the background of another species by repeated backcrosses. This is performed by selectively backcrossing heterozygous (for hybrid male sterility gene or genes) females producing fertile and sterile sons in roughly equal proportions to males of either parental species. As no marker gene is required for this procedure, this method can be used with any species pairs that produce unisexual sterility. With the application of this method, a small X chromosome region of Drosophila mauritiana which produces complete hybrid male sterility (aspermic testes) in the background of D. simulans was identified. Recombination analysis reveals that this region contains a second major hybrid male sterility gene linked to the forked locus located at either 62.7 +/- 0.66 map units or at the centromere region of the X chromosome of D. mauritiana.

  14. Gene interactions in the DNA damage-response pathway identified by genome-wide RNA-interference analysis of synthetic lethality

    PubMed Central

    van Haaften, Gijs; Vastenhouw, Nadine L.; Nollen, Ellen A. A.; Plasterk, Ronald H. A.; Tijsterman, Marcel

    2004-01-01

    Here, we describe a systematic search for synthetic gene interactions in a multicellular organism, the nematode Caenorhabditis elegans. We established a high-throughput method to determine synthetic gene interactions by genome-wide RNA interference and identified genes that are required to protect the germ line against DNA double-strand breaks. Besides known DNA-repair proteins such as the C. elegans orthologs of TopBP1, RPA2, and RAD51, eight genes previously unassociated with a double-strand-break response were identified. Knockdown of these genes increased sensitivity to ionizing radiation and camptothecin and resulted in increased chromosomal nondisjunction. All genes have human orthologs that may play a role in human carcinogenesis. PMID:15326288

  15. Whole exome sequencing identifies novel candidate genes that modify chronic obstructive pulmonary disease susceptibility.

    PubMed

    Bruse, Shannon; Moreau, Michael; Bromberg, Yana; Jang, Jun-Ho; Wang, Nan; Ha, Hongseok; Picchi, Maria; Lin, Yong; Langley, Raymond J; Qualls, Clifford; Klensney-Tait, Julia; Zabner, Joseph; Leng, Shuguang; Mao, Jenny; Belinsky, Steven A; Xing, Jinchuan; Nyunoya, Toru

    2016-01-07

    Chronic obstructive pulmonary disease (COPD) is characterized by an irreversible airflow limitation in response to inhalation of noxious stimuli, such as cigarette smoke. However, only 15-20% of smokers manifest COPD, suggesting a role for genetic predisposition. Although genome-wide association studies have identified common genetic variants that are associated with susceptibility to COPD, effect sizes of the identified variants are modest, as is the total heritability accounted for by these variants. In this study, an extreme phenotype exome sequencing study was combined with in vitro modeling to identify COPD candidate genes. We performed whole exome sequencing of 62 highly susceptible smokers and 30 exceptionally resistant smokers to identify rare variants that may contribute to disease risk or resistance to COPD. This was a cross-sectional case-control study without therapeutic intervention or longitudinal follow-up information. We identified candidate genes based on rare variant analyses and evaluated exonic variants to pinpoint individual genes whose function was computationally established to be significantly different between susceptible and resistant smokers. Top scoring candidate genes from these analyses were further filtered by requiring that each gene be expressed in human bronchial epithelial cells (HBECs). A total of 81 candidate genes were thus selected for in vitro functional testing in cigarette smoke extract (CSE)-exposed HBECs. Using small interfering RNA (siRNA)-mediated gene silencing experiments, we showed that silencing of several candidate genes augmented CSE-induced cytotoxicity in vitro. Our integrative analysis through both genetic and functional approaches identified two candidate genes (TACC2 and MYO1E) that augment cigarette smoke (CS)-induced cytotoxicity and, potentially, COPD susceptibility.

  16. Diametrical clustering for identifying anti-correlated gene clusters.

    PubMed

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
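
    The two alternating steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each profile is mean-centred and scaled to unit length, that genes are assigned by the squared projection onto a cluster prototype (so the sign of the correlation is ignored), and that the dominant singular vector is approximated by power iteration. The function names, the toy profiles and the starting partition are all hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(profile):
    """Mean-centre an expression profile and scale it to unit length."""
    mean = sum(profile) / len(profile)
    centred = [x - mean for x in profile]
    norm = math.sqrt(dot(centred, centred))
    return [x / norm for x in centred]


def dominant_vector(rows, iters=200):
    """Power iteration for the dominant right singular vector of `rows`."""
    v = list(rows[0])  # a cluster member is a convenient non-zero start
    for _ in range(iters):
        w = [0.0] * len(v)
        for r in rows:  # w = X^T (X v), accumulated row by row
            c = dot(r, v)
            w = [wj + c * rj for wj, rj in zip(w, r)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def diametrical_clustering(profiles, k, initial_labels, max_rounds=50):
    """Alternate prototype computation and re-partitioning until stable."""
    data = [normalize(p) for p in profiles]
    labels = list(initial_labels)
    for _ in range(max_rounds):
        # Step (ii): one prototype per non-empty cluster.
        prototypes = {}
        for c in range(k):
            members = [d for d, lab in zip(data, labels) if lab == c]
            if members:
                prototypes[c] = dominant_vector(members)
        # Step (i): re-partition; the squared projection ignores the sign,
        # so anti-correlated profiles can share a cluster.
        new_labels = [
            max(prototypes, key=lambda c: dot(d, prototypes[c]) ** 2)
            for d in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical toy data: two pairs of mutually anti-correlated profiles.
profiles = [
    [1.0, 2.0, 3.0, 4.0],  # rising
    [4.0, 3.0, 2.0, 1.0],  # falling, anti-correlated with the first
    [1.0, 3.0, 1.0, 3.0],  # oscillating
    [3.0, 1.0, 3.0, 1.0],  # oscillating in opposite phase
]
labels = diametrical_clustering(profiles, k=2, initial_labels=[0, 0, 0, 1])
print(labels)
```

    In this toy run each anti-correlated pair ends up in the same cluster, which a clustering based on positive correlation alone would not produce.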

  17. MAGMA: Generalized Gene-Set Analysis of GWAS Data

    PubMed Central

    de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle

    2015-01-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710

  18. MAGMA: generalized gene-set analysis of GWAS data.

    PubMed

    de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

    2015-04-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.

  19. Comparative gene expression analysis between coronary arteries and internal mammary arteries identifies a role for the TES gene in endothelial cell functions relevant to coronary artery disease.

    PubMed

    Archacki, Stephen R; Angheloiu, George; Moravec, Christine S; Liu, Hui; Topol, Eric J; Wang, Qing Kenneth

    2012-03-15

    Coronary artery disease (CAD) is the leading cause of death worldwide. It has been established that internal mammary arteries (IMA) are resistant to the development of atherosclerosis, whereas left anterior descending (LAD) coronary arteries are athero-prone. The contrasting properties of these two arteries provide an innovative strategy to identify the genes that play important roles in the development of atherosclerosis. We carried out microarray analysis to identify genes differentially expressed between IMA and LAD. Twenty-nine genes showed significant differences in their expression levels between IMA and LAD, which included the TES gene encoding Testin. The role of TES in the cardiovascular system is unknown. Here we show that TES is involved in endothelial cell (EC) functions relevant to atherosclerosis. Western blot analysis showed higher TES expression in IMA than in LAD. Reverse transcription polymerase chain reaction and western blot analyses showed that TES was consistently and markedly down-regulated by more than 6-fold at both mRNA and protein levels in patients with CAD compared with controls without CAD (P = 0.000049). The data suggest that reduced TES expression is associated with the development of CAD. Knockdown of TES expression by small-interfering RNA promoted oxidized-LDL-mediated monocyte adhesion to ECs, EC migration and the transendothelial migration of monocytes, while the over-expression of TES in ECs blunted these processes. These results demonstrate an association between reduced TES expression and CAD, establish a novel role for TES in EC functions and raise the possibility that reduced TES expression increases susceptibility to the development of CAD.

  20. Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

    PubMed

    Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

    2018-03-01

    Bipolar disorder is a mental illness with a lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1 (Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Whole genome population genetics analysis of Sudanese goats identifies regions harboring genes associated with major traits.

    PubMed

    Rahmatalla, Siham A; Arends, Danny; Reissmann, Monika; Said Ahmed, Ammar; Wimmers, Klaus; Reyer, Henry; Brockmann, Gudrun A

    2017-10-23

    Sudan is endowed with a variety of indigenous goat breeds which are used for meat and milk production and which are well adapted to the local environment. The aim of the present study was to determine the genetic diversity and relationship within and between the four main Sudanese breeds of Nubian, Desert, Taggar and Nilotic goats. Using the 50 K SNP chip, 24 animals of each breed were genotyped. More than 96% of high quality SNPs were polymorphic with an average minor allele frequency of 0.3. In all breeds, no significant difference between observed (0.4) and expected (0.4) heterozygosity was found and the inbreeding coefficients (F_IS) did not differ from zero. F_ST coefficients for the genetic distance between breeds also did not significantly deviate from zero. In addition, the analysis of molecular variance revealed that 93% of the total variance in the examined population can be explained by differences among individuals, while only 7% result from differences between the breeds. These findings provide evidence for high genetic diversity and little inbreeding within breeds on one hand, and low diversity between breeds on the other hand. Further examinations using Nei's genetic distance and STRUCTURE analysis clustered Taggar goats distinct from the other breeds. In a principal component (PC) analysis, PC1 could separate Taggar, Nilotic and a mix of Nubian and Desert goats into three groups. The SNPs that contributed strongly to PC1 showed high F_ST values in Taggar goat versus the other goat breeds. PCA allowed us to identify target genomic regions which contain genes known to influence growth, development, bone formation and the immune system. The information on the genetic variability and diversity in this study confirmed that Taggar goat is genetically different from the other goat breeds in Sudan. The SNPs identified by the first principal components show high F_ST values in Taggar goat and allowed us to identify candidate genes which can be used in the
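
    The quantities compared in the record above (observed heterozygosity, expected heterozygosity and the inbreeding coefficient) can be illustrated for a single biallelic SNP. This is a minimal sketch under the textbook definitions, expected heterozygosity 2p(1 - p) and inbreeding coefficient 1 - Ho/He; the study itself averaged over many SNPs with its own software, and the genotype counts below are hypothetical, chosen only to match a group of 24 animals.

```python
def heterozygosity_stats(n_aa, n_ab, n_bb):
    """Return (observed, expected, f_is) for one biallelic SNP.

    n_aa, n_ab, n_bb are genotype counts. The SNP must be polymorphic,
    otherwise the expected heterozygosity is zero and f_is is undefined.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    observed = n_ab / n               # share of heterozygous animals
    expected = 2 * p * (1 - p)        # Hardy-Weinberg expectation
    f_is = 1 - observed / expected    # zero when observed equals expected
    return observed, expected, f_is


# Hypothetical counts for 24 animals of one breed.
observed, expected, f_is = heterozygosity_stats(n_aa=6, n_ab=12, n_bb=6)
print(observed, expected, f_is)
```

    When observed and expected heterozygosity agree, as in these counts, the coefficient is zero, which corresponds to the "did not differ from zero" statement in the abstract.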

  2. Identifying candidate driver genes by integrative ovarian cancer genomics data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The final list of modulators identified are well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, which can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  3. Metabolomic profiling and genomic analysis of wheat aneuploid lines to identify genes controlling biochemical pathways in mature grain.

    PubMed

    Francki, Michael G; Hayton, Sarah; Gummer, Joel P A; Rawlinson, Catherine; Trengove, Robert D

    2016-02-01

    Metabolomics is becoming an increasingly important tool in plant genomics to decipher the function of genes controlling biochemical pathways responsible for trait variation. Although theoretical models can integrate genes and metabolites for trait variation, biological networks require validation using appropriate experimental genetic systems. In this study, we applied an untargeted metabolite analysis to mature grain of wheat homoeologous group 3 ditelosomic lines, selected compounds that showed significant variation between wheat line Chinese Spring and at least one ditelosomic line, tracked the genes encoding enzymes of their biochemical pathway using the wheat genome survey sequence and determined the genetic components underlying metabolite variation. A total of 412 analytes were resolved in the wheat grain metabolome, and principal component analysis indicated significant differences in metabolite profiles between Chinese Spring and each ditelosomic line. The grain metabolome identified 55 compounds positively matched against a mass spectral library where the majority showed significant differences between Chinese Spring and at least one ditelosomic line. Trehalose and branched-chain amino acids were selected for detailed investigation, and it was expected that if genes encoding enzymes directly related to their biochemical pathways were located on homoeologous group 3 chromosomes, then corresponding ditelosomic lines would have a significant reduction in metabolites compared with Chinese Spring. Although a proportion showed a reduction, some lines showed significant increases in metabolites, indicating that genes directly and indirectly involved in biosynthetic pathways likely regulate the metabolome. Therefore, this study demonstrated that wheat aneuploid lines are a suitable experimental genetic system to validate metabolomics-genomics networks. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  4. LNDriver: identifying driver genes by integrating mutation and expression data based on gene-gene interaction network.

    PubMed

    Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou

    2016-12-23

    Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data, can be measured from each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contribute to tumorigenesis, from millions of functionally neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, a greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. Also, it can identify not only frequently mutated drivers, but also rare candidate driver genes.

  5. Analysis of SOX10 mutations identified in Waardenburg-Hirschsprung patients: Differential effects on target gene regulation.

    PubMed

    Chan, Kwok Keung; Wong, Corinne Kung Yen; Lui, Vincent Chi Hang; Tam, Paul Kwong Hang; Sham, Mai Har

    2003-10-15

    SOX10 is a member of the SOX gene family related by homology to the high-mobility group (HMG) box region of the testis-determining gene SRY. Mutations of the transcription factor gene SOX10 lead to Waardenburg-Hirschsprung syndrome (Waardenburg-Shah syndrome, WS4) in humans. A number of SOX10 mutations have been identified in WS4 patients who suffer from different extents of intestinal aganglionosis, pigmentation, and hearing abnormalities. Some patients also exhibit signs of myelination deficiency in the central and peripheral nervous systems. Although the molecular bases for the wide range of symptoms displayed by the patients are still not clearly understood, a few target genes for SOX10 have been identified. We have analyzed the impact of six different SOX10 mutations on the activation of SOX10 target genes by yeast one-hybrid and mammalian cell transfection assays. To investigate the transactivation activities of the mutant proteins, three different SOX target binding sites were introduced into luciferase reporter gene constructs and examined in our series of transfection assays: consensus HMG domain protein binding sites; SOX10 binding sites identified in the RET promoter; and Sox10 binding sites identified in the P0 promoter. We found that the same mutation could have different transactivation activities when tested with different target binding sites and in different cell lines. The differential transactivation activities of the SOX10 mutants appeared to correlate with the intestinal and/or neurological symptoms presented in the patients. Among the six mutant SOX10 proteins tested, much reduced transactivation activities were observed when tested on the SOX10 binding sites from the RET promoter. Of the two similar mutations X467K and 1400del12, only the 1400del12 mutant protein exhibited an increase of transactivation through the P0 promoter. While the lack of normal SOX10 mediated activation of RET transcription may lead to intestinal aganglionosis

  6. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

    PubMed Central

    Ruan, Xiyun; Li, Hongyun; Liu, Bo; Chen, Jie; Zhang, Shibao; Sun, Zeqiang; Liu, Shuangqing; Sun, Fahai; Liu, Qingyong

    2015-01-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
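
    Of the approaches listed in the record above, the Pearson's correlation step is the simplest to illustrate. The sketch below is an assumption-laden illustration rather than the published pipeline: it scores each gene pair by the difference between its correlation in patients and in controls and keeps pairs whose absolute difference reaches a cutoff. The EB, WGCNA and rank-based merging steps are not covered, and the gene names, expression values and cutoff are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def differential_coexpression(cases, controls, cutoff):
    """Return (gene, gene, delta_r) for pairs whose correlation changes.

    `cases` and `controls` map a gene name to its expression values;
    `cutoff` is a hypothetical threshold on |r_cases - r_controls|.
    """
    links = []
    for g, h in combinations(sorted(cases), 2):
        delta = pearson(cases[g], cases[h]) - pearson(controls[g], controls[h])
        if abs(delta) >= cutoff:
            links.append((g, h, delta))
    return links


# Hypothetical expression values for three genes in four samples per group.
controls = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [1.0, 3.0, 2.0, 4.0],
}
cases = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [8.0, 6.0, 4.0, 2.0],  # relationship with GENE_A is reversed
    "GENE_C": [1.0, 3.0, 2.0, 4.0],
}
links = differential_coexpression(cases, controls, cutoff=1.8)
print(links)
```

    A sign reversal like the one between GENE_A and GENE_B is the kind of change that a per-gene differential expression test can miss when the mean expression levels stay similar.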

  7. Genome-Wide Transcriptome Analysis of Cotton (Gossypium hirsutum L.) Identifies Candidate Gene Signatures in Response to Aflatoxin Producing Fungus Aspergillus flavus.

    PubMed

    Bedre, Renesh; Rajasekaran, Kanniah; Mangu, Venkata Ramanarao; Sanchez Timm, Luis Eduardo; Bhatnagar, Deepak; Baisakh, Niranjan

    2015-01-01

    Aflatoxins are toxic and potent carcinogenic metabolites produced from the fungi Aspergillus flavus and A. parasiticus. Aflatoxins can contaminate cottonseed under conducive preharvest and postharvest conditions. United States federal regulations restrict the use of aflatoxin-contaminated cottonseed at >20 ppb for animal feed. Several strategies have been proposed for controlling aflatoxin contamination, and much success has been achieved by the application of an atoxigenic strain of A. flavus in cotton, peanut and maize fields. Development of cultivars resistant to aflatoxin through overexpression of resistance-associated genes and/or knocking down aflatoxin biosynthesis of A. flavus will be an effective strategy for controlling aflatoxin contamination in cotton. In this study, genome-wide transcriptome profiling was performed to identify differentially expressed genes in response to infection with both toxigenic and atoxigenic strains of A. flavus on cotton (Gossypium hirsutum L.) pericarp and seed. The genes involved in antifungal response, oxidative burst, transcription factors, defense signaling pathways and stress response were highly differentially expressed in pericarp and seed tissues in response to A. flavus infection. The cell-wall modifying genes and genes involved in the production of antimicrobial substances were more active in pericarp as compared to seed. The genes involved in auxin and cytokinin signaling were also induced. Most of the genes involved in defense response in cotton were more highly induced in pericarp than in seed. The global gene expression analysis in response to fungal invasion in cotton will serve as a source for identifying biomarkers for breeding, potential candidate genes for transgenic manipulation, and will help in understanding complex plant-fungal interaction for future downstream research.

  8. In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer

    PubMed Central

    Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-01

    Like other cancer diseases, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drive malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that microRNAs also have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer poorly overlap. The identification of co-expressed genes that are functionally related can identify a core network of genes associated with PC with better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723

  9. Gene-based meta-analysis of genome-wide association study data identifies independent single-nucleotide polymorphisms in ANXA6 as being associated with systemic lupus erythematosus in Asian populations.

    PubMed

    Zhang, Jing; Zhang, Lu; Zhang, Yan; Yang, Jing; Guo, Mengbiao; Sun, Liangdan; Pan, Hai-Feng; Hirankarn, Nattiya; Ying, Dingge; Zeng, Shuai; Lee, Tsz Leung; Lau, Chak Sing; Chan, Tak Mao; Leung, Alexander Moon Ho; Mok, Chi Chiu; Wong, Sik Nin; Lee, Ka Wing; Ho, Marco Hok Kung; Lee, Pamela Pui Wah; Chung, Brian Hon-Yin; Chong, Chun Yin; Wong, Raymond Woon Sing; Mok, Mo Yin; Wong, Wilfred Hing Sang; Tong, Kwok Lung; Tse, Niko Kei Chiu; Li, Xiang-Pei; Avihingsanon, Yingyos; Rianthavorn, Pornpimol; Deekajorndej, Thavatchai; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk; Ying, Shirley King Yee; Fung, Samuel Ka Shun; Lai, Wai Ming; Garcia-Barceló, Maria-Mercè; Cherny, Stacey S; Sham, Pak Chung; Cui, Yong; Yang, Sen; Ye, Dong Qing; Zhang, Xue-Jun; Lau, Yu Lung; Yang, Wanling

    2015-11-01

    Previous genome-wide association studies (GWAS), which were mainly based on single-variant analysis, have identified many systemic lupus erythematosus (SLE) susceptibility loci. However, the genetic architecture of this complex disease is far from being understood. The aim of this study was to investigate whether using a gene-based analysis may help to identify novel loci, by considering global evidence of association from a gene or a genomic region rather than focusing on evidence for individual variants. Based on the results of a meta-analysis of 2 GWAS of SLE conducted in 2 Asian cohorts, we performed an in-depth gene-based analysis followed by replication in a total of 4,626 patients and 7,466 control subjects of Asian ancestry. Differential allelic expression was measured by pyrosequencing. More than one-half of the reported SLE susceptibility loci showed evidence of independent effects, and this finding is important for understanding the mechanisms of association and explaining disease heritability. ANXA6 was detected as a novel SLE susceptibility gene, with several single-nucleotide polymorphisms (SNPs) contributing independently to the association with disease. The risk allele of rs11960458 correlated significantly with increased expression of ANXA6 in peripheral blood mononuclear cells from heterozygous healthy control subjects. Several other associated SNPs may also regulate ANXA6 expression, according to data obtained from public databases. Higher expression of ANXA6 in patients with SLE was also reported previously. Our study demonstrated the merit of using gene-based analysis to identify novel susceptibility loci, especially those with independent effects, and also demonstrated the widespread presence of loci with independent effects in SLE susceptibility genes. © 2015, American College of Rheumatology.

  10. Genomic analysis of human lung fibroblasts exposed to vanadium pentoxide to identify candidate genes for occupational bronchitis

    PubMed Central

    Ingram, Jennifer L; Antao-Menezes, Aurita; Turpin, Elizabeth A; Wallace, Duncan G; Mangum, James B; Pluta, Linda J; Thomas, Russell S; Bonner, James C

    2007-01-01

    Background: Exposure to vanadium pentoxide (V2O5) is a cause of occupational bronchitis. We evaluated gene expression profiles in cultured human lung fibroblasts exposed to V2O5 in vitro in order to identify candidate genes that could play a role in inflammation, fibrosis, and repair during the pathogenesis of V2O5-induced bronchitis. Methods: Normal human lung fibroblasts were exposed to V2O5 in a time course experiment. Gene expression was measured at various time points over a 24 hr period using the Affymetrix Human Genome U133A 2.0 Array. Selected genes that were significantly changed in the microarray experiment were validated by RT-PCR. Results: V2O5 altered more than 1,400 genes, of which ~300 were induced while >1,100 genes were suppressed. Gene ontology (GO) categories unique to induced genes included inflammatory response and immune response, while GO categories unique to suppressed genes included ubiquitin cycle and cell cycle. A dozen genes were validated by RT-PCR, including growth factors (HBEGF, VEGF, CTGF), chemokines (IL8, CXCL9, CXCL10), oxidative stress response genes (SOD2, PIPOX, OXR1), and DNA-binding proteins (GAS1, STAT1). Conclusion: Our study identified a variety of genes that could play pivotal roles in inflammation, fibrosis and repair during V2O5-induced bronchitis. The induction of genes that mediate inflammation and immune responses, as well as suppression of genes involved in growth arrest, appear to be important to the lung fibrotic reaction to V2O5. PMID:17459161

  11. A genomic approach to identify hybrid incompatibility genes.

    PubMed

    Cooper, Jacob C; Phadnis, Nitin

    2016-07-02

    Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids.

  12. A genomic approach to identify hybrid incompatibility genes

    PubMed Central

    Cooper, Jacob C.; Phadnis, Nitin

    2016-01-01

    Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids. PMID:27230814

  13. Identifying key genes, pathways and screening therapeutic agents for manganese-induced Alzheimer disease using bioinformatics analysis.

    PubMed

    Ling, JunJun; Yang, Shengyou; Huang, Yi; Wei, Dongfeng; Cheng, Weidong

    2018-06-01

    Alzheimer disease (AD) is a progressive neurodegenerative disease, the etiology of which remains largely unknown. Accumulating evidence indicates that elevated manganese (Mn) in brain exerts toxic effects on neurons and contributes to AD development. Thus, we aimed to explore the gene and pathway variations through analysis of high-throughput data in this process. To screen the differentially expressed genes (DEGs) that may play critical roles in Mn-induced AD, public microarray data regarding Mn-treated neurocytes versus controls (GSE70845), and AD versus controls (GSE48350), were downloaded and the DEGs were screened out, respectively. The intersection of the DEGs of each dataset was obtained by using Venn analysis. Then, gene ontology (GO) function analysis and KEGG pathway analysis were carried out. For screening hub genes, a protein-protein interaction network was constructed. At last, DEGs were analyzed in Connectivity Map (CMAP) for identification of small molecules that overcome Mn-induced neurotoxicity or AD development. The intersection of the DEGs obtained 140 upregulated and 267 downregulated genes. The top 5 items of biological processes of GO analysis were taxis, chemotaxis, cell-cell signaling, regulation of cellular physiological process, and response to wounding. The top 5 items of KEGG pathway analysis were cytokine-cytokine receptor interaction, apoptosis, oxidative phosphorylation, Toll-like receptor signaling pathway, and insulin signaling pathway. Afterwards, several hub genes such as INSR, VEGFA, PRKACB, DLG4, and BCL2 that might play key roles in Mn-induced AD were further screened out. Interestingly, tyrphostin AG-825, an inhibitor of tyrosine phosphorylation, was predicted to be a potential agent for overcoming Mn-induced neurotoxicity or AD development. The present study provided a novel insight into the molecular mechanisms of Mn-induced neurotoxicity or AD development and screened out several small molecular candidates that might be

  14. Knowledge-Driven Analysis Identifies a Gene–Gene Interaction Affecting High-Density Lipoprotein Cholesterol Levels in Multi-Ethnic Populations

    PubMed Central

    Ma, Li; Brautbar, Ariel; Boerwinkle, Eric; Sing, Charles F.

    2012-01-01

    Total cholesterol, low-density lipoprotein cholesterol, triglyceride, and high-density lipoprotein cholesterol (HDL-C) levels are among the most important risk factors for coronary artery disease. We tested for gene–gene interactions affecting the level of these four lipids based on prior knowledge of established genome-wide association study (GWAS) hits, protein–protein interactions, and pathway information. Using genotype data from 9,713 European Americans from the Atherosclerosis Risk in Communities (ARIC) study, we identified an interaction between HMGCR and a locus near LIPC in their effect on HDL-C levels (Bonferroni corrected Pc = 0.002). Using an adaptive locus-based validation procedure, we successfully validated this gene–gene interaction in the European American cohorts from the Framingham Heart Study (Pc = 0.002) and the Multi-Ethnic Study of Atherosclerosis (MESA; Pc = 0.006). The interaction between these two loci is also significant in the African American sample from ARIC (Pc = 0.004) and in the Hispanic American sample from MESA (Pc = 0.04). Both HMGCR and LIPC are involved in the metabolism of lipids, and genome-wide association studies have previously identified LIPC as associated with levels of HDL-C. However, the effect on HDL-C of the novel gene–gene interaction reported here is twice as pronounced as that predicted by the sum of the marginal effects of the two loci. In conclusion, based on a knowledge-driven analysis of epistasis, together with a new locus-based validation method, we successfully identified and validated an interaction affecting a complex trait in multi-ethnic populations. PMID:22654671
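
    The idea of an interaction that departs from the sum of the marginal effects can be illustrated with a small sketch. It is not the procedure used in the study: it assumes each locus is reduced to a binary carrier status (0 or 1), uses a large-sample normal approximation for the test, and the function names `interaction_contrast` and `bonferroni` are hypothetical.

```python
import math
from statistics import mean, variance


def interaction_contrast(cells):
    """Estimate a 2x2 interaction contrast and a two-sided z-test p-value.

    `cells` maps (a, b) carrier status at two loci (assumed binary here)
    to a list of trait values. The contrast m11 - m10 - m01 + m00 is zero
    when the two loci act additively.
    """
    signs = {(1, 1): 1, (1, 0): -1, (0, 1): -1, (0, 0): 1}
    estimate = sum(sign * mean(cells[key]) for key, sign in signs.items())
    # Variance of a signed sum of independent cell means.
    var = sum(variance(cells[key]) / len(cells[key]) for key in signs)
    z = estimate / math.sqrt(var)
    # Two-sided p-value under a normal approximation (assumes large cells).
    p = math.erfc(abs(z) / math.sqrt(2))
    return estimate, p


def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p * n_tests)


# Toy trait values, invented purely for illustration.
toy_cells = {
    (0, 0): [1.0, 1.2, 0.8],
    (1, 0): [1.1, 1.3, 0.9],
    (0, 1): [1.1, 1.3, 0.9],
    (1, 1): [1.6, 1.8, 1.4],
}
estimate, p_raw = interaction_contrast(toy_cells)
p_corrected = bonferroni(p_raw, n_tests=10)
```

    In the toy data each locus alone raises the trait by 0.1, so additivity would predict 1.2 for double carriers; the observed mean of 1.6 gives a contrast of 0.4.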

  15. A method to identify differential expression profiles of time-course gene data with Fourier transformation.

    PubMed

    Kim, Jaehee; Ogden, Robert Todd; Kim, Haseong

    2013-10-18

    Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. The proposed method is general and can be
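
    Two of the ingredients named in this abstract, a Fourier representation of each profile and FDR-controlled screening, can be sketched as follows. This is an illustration only, not the authors' procedure: it assumes equally spaced time points, it does not model autocorrelation, it uses the Benjamini-Hochberg step-up rule as one common way to control the FDR, and it takes per-gene p-values as given because the abstract does not say how they are derived. The function names are hypothetical.

```python
import cmath
import math


def fourier_amplitudes(profile, n_harmonics):
    """Amplitudes of the first `n_harmonics` Fourier harmonics of a profile.

    Assumes equally spaced samples and n_harmonics < len(profile) / 2.
    The mean (zero-frequency) term is skipped.
    """
    n = len(profile)
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(profile))
        amplitudes.append(2 * abs(coeff) / n)
    return amplitudes


def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each hypothesis rejected at FDR level `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Step-up rule: find the largest rank whose p-value is under its threshold.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# A synthetic profile with one full cycle over 12 time points.
toy_profile = [math.sin(2 * math.pi * t / 12) for t in range(12)]
toy_amplitudes = fourier_amplitudes(toy_profile, n_harmonics=2)

# Hypothetical per-gene p-values, invented for illustration.
toy_calls = benjamini_hochberg([0.001, 0.02, 0.03, 0.5], alpha=0.05)
```

    For the synthetic sine profile, essentially all of the signal falls on the first harmonic, which is the kind of compact summary a Fourier-domain screen or clustering step would work from.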

  16. A method to identify differential expression profiles of time-course gene data with Fourier transformation

    PubMed Central

    2013-01-01

    Background: Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. Results: This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Conclusions: Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. The

  17. Bacterial reference genes for gene expression studies by RT-qPCR: survey and analysis.

    PubMed

    Rocha, Danilo J P; Santos, Carolina S; Pacheco, Luis G C

    2015-09-01

    The appropriate choice of reference genes is essential for accurate normalization of gene expression data obtained by the method of reverse transcription quantitative real-time PCR (RT-qPCR). In 2009, a guideline called the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) highlighted the importance of the selection and validation of more than one suitable reference gene for obtaining reliable RT-qPCR results. Herein, we searched the recent literature in order to identify the bacterial reference genes that have been most commonly validated in gene expression studies by RT-qPCR (in the first 5 years following publication of the MIQE guidelines). Through a combination of different search parameters with the text mining tool MedlineRanker, we identified 145 unique bacterial genes that were recently tested as candidate reference genes. Of these, 45 genes were experimentally validated and, in most of the cases, their expression stabilities were verified using the software tools geNorm and NormFinder. It is noteworthy that only 10 of these reference genes had been validated in two or more of the studies evaluated. An enrichment analysis using Gene Ontology classifications demonstrated that genes belonging to the functional categories of DNA Replication (GO: 0006260) and Transcription (GO: 0006351) rendered a proportionally higher number of validated reference genes. Three genes in the former functional class were also among the top five most stable genes identified through an analysis of gene expression data obtained from the Pathosystems Resource Integration Center. These results may provide a guideline for the initial selection of candidate reference genes for RT-qPCR studies in several different bacterial species.

  18. Transcriptome Analysis and Its Application in Identifying Genes Associated with Fruiting Body Development in Basidiomycete Hypsizygus marmoreus

    PubMed Central

    Chen, Hui; Zhao, Mingwen; Shi, Liang; Chen, Mingjie; Wang, Hong; Feng, Zhiyong

    2015-01-01

    To elucidate the mechanisms of fruit body development in H. marmoreus, a total of 43609521 high-quality RNA-seq reads were obtained from four developmental stages, including the mycelial knot (H-M), mycelial pigmentation (H-V), primordium (H-P) and fruiting body (H-F) stages. These reads were assembled to obtain 40568 unigenes with an average length of 1074 bp. A total of 26800 (66.06%) unigenes were annotated and analyzed with the Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Eukaryotic Orthologous Group (KOG) databases. Differentially expressed genes (DEGs) from the four transcriptomes were analyzed. The KEGG enrichment analysis revealed that the mycelium pigmentation stage was associated with the MAPK, cAMP, and blue light signal transduction pathways. In addition, expression of the two-component system members changed with the transition from H-M to H-V, suggesting that light affected the expression of genes related to fruit body initiation in H. marmoreus. During the transition from H-V to H-P, stress signals associated with MAPK, cAMP and ROS signals might be the most important inducers. Our data suggested that nitrogen starvation might be one of the most important factors in promoting fruit body maturation, and nitrogen metabolism and mTOR signaling pathway were associated with this process. In addition, 30 genes of interest were analyzed by quantitative real-time PCR to verify their expression profiles at the four developmental stages. This study advances our understanding of the molecular mechanism of fruiting body development in H. marmoreus by identifying a wealth of new genes that may play important roles in mushroom morphogenesis. PMID:25837428

  19. Mutation analysis in a German family identified a new cataract-causing allele in the CRYBB2 gene

    PubMed Central

    Pauli, Silke; Söker, Torben; Klopp, Norman; Illig, Thomas; Engel, Wolfgang

    2007-01-01

    Purpose: The study demonstrates the functional candidate gene analysis in a cataract family of German descent. Methods: We screened a German family, clinically documented to have congenital cataracts, for mutation in the candidate genes CRYG (A to D) and CRYBB2 through polymerase chain reaction analyses and sequencing. Results: Congenital cataract was first observed in a daughter of healthy parents. Her two children (a boy and a girl) also suffer from congenital cataracts and were operated on within the first weeks after birth. Morphologically, the cataract is characterized as nuclear with an additional ring-shaped cortical opacity. Molecular analysis revealed no causative mutation in any of the CRYG genes. However, sequencing of the exons of the CRYBB2 gene identified a sequence variation in exon 5 (383 A>T) with a substitution of Asp to Val at position 128. All three affected family members revealed this change but it was not observed in any of the unaffected persons of the family. The putative mutation creates a restriction site for the enzyme TaiI. This mutation was checked for in controls of randomly selected DNA samples from ophthalmologically normal individuals from the population-based KORA S4 study (n=96) and no mutation was observed. Moreover, the Asp at position 128 is within a stretch of 12 amino acids, which are highly conserved throughout the animal kingdom. For the mutant protein, the isoelectric point is raised from pH 6.50 to 6.75. Additionally, the random coil structure of the protein between the amino acids 126-139 is interrupted by a short extended strand structure. In addition, this region becomes hydrophobic (from neutral to +1) and the electrostatic potential in the region surrounding the exchanged amino acid alters from a mainly negative potential to an enlarged positive potential. Conclusions: The D128V mutation segregates only in affected family members and is not seen in representative controls. It represents the first mutation outside exon 6

  20. Ancestry-based stratified analysis of Immunochip data identifies novel associations with celiac disease.

    PubMed

    Garcia-Etxebarria, Koldo; Jauregi-Miguel, Amaia; Romero-Garmendia, Irati; Plaza-Izurieta, Leticia; Legarda, Maria; Irastorza, Iñaki; Bilbao, Jose Ramon

    2016-12-01

    To identify candidate genes in celiac disease (CD), we reanalyzed the whole Immunochip CD cohort using a different approach that clusters individuals based on immunoancestry prior to disease association analysis, rather than by geographical origin. We detected 636 new associated SNPs (P < 7.02 × 10^-7) and identified 5 novel genomic regions, extended 8 others previously identified and also detected 18 isolated signals defined by one or very few significant SNPs. To test whether we could identify putative candidate genes, we performed expression analyses of several genes from the top novel region (chr2:134533564-136169524), from a previously identified locus that is now extended, and a gene marked by an isolated SNP, in duodenum biopsies of active and treated CD patients, and non-celiac controls. In the largest novel region, CCNT2 and R3HDM1 were constitutively underexpressed in disease, even after gluten removal. Moreover, several genes within this region were coexpressed in patients, but not in controls. Other novel genes like KIF21B, REL and SORD also showed altered expression in active disease. Apart from the identification of novel CD loci, these results suggest that ancestry-based stratified analysis is an efficient strategy for association studies in complex diseases.

  1. Gene expression analysis identifies new candidate genes associated with the development of black skin spots in Corriedale sheep.

    PubMed

    Peñagaricano, Francisco; Zorrilla, Pilar; Naya, Hugo; Robello, Carlos; Urioste, Jorge I

    2012-02-01

    The white coat colour of sheep is an important economic trait. For unknown reasons, some animals are born with, and others develop with time, black skin spots that can also produce pigmented fibres. The presence of pigmented fibres in the white wool significantly decreases the fibre quality. The aim of this work was to study gene expression in black spots (with and without pigmented fibres) and white skin by microarray techniques, in order to identify the possible genes involved in the development of this trait. Five unrelated Corriedale sheep were used and, for each animal, the three possible comparisons (three different hybridisations) between the three samples of interest were performed. Differential gene expression patterns were analysed using different t-test approaches. Most of the major genes with well-known roles in skin pigmentation, e.g. ASIP, MC1R and C-KIT, showed no significant difference in the gene expression between white skin and black spots. On the other hand, many of the differentially expressed genes (raw P-value < 0.005) detected in this study, e.g. C-FOS, KLF4 and UFC1, fulfil biological functions that are plausible to be involved in the formation of black spots. The gene expression of C-FOS and KLF4, transcription factors involved in the cellular response to external factors such as ultraviolet light, was validated by quantitative polymerase chain reaction (PCR). This exploratory study provides a list of candidate genes that could be associated with the development of black skin spots that should be studied in more detail. Characterisation of these genes will enable us to discern the molecular mechanisms involved in the development of this feature and, hence, increase our understanding of melanocyte biology and skin pigmentation. In sheep, understanding this phenomenon is a first step towards developing molecular tools to assist in the selection against the presence of pigmented fibres in white wool.

  2. Systemic bioinformatics analysis of skeletal muscle gene expression profiles of sepsis

    PubMed Central

    Yang, Fang; Wang, Yumei

    2018-01-01

    Sepsis is a type of systemic inflammatory response syndrome with high morbidity and mortality. Skeletal muscle dysfunction is one of the major complications of sepsis that may also influence the outcome of sepsis. The aim of the present study was to explore and identify potential mechanisms and therapeutic targets of sepsis. Systemic bioinformatics analysis of skeletal muscle gene expression profiles from the Gene Expression Omnibus was performed. Differentially expressed genes (DEGs) in samples from patients with sepsis and control samples were screened out using the limma package. Differential co-expression and coregulation (DCE and DCR, respectively) analysis was performed based on the Differential Co-expression Analysis package to identify differences in gene co-expression and coregulation patterns between the control and sepsis groups. Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways of DEGs were identified using the Database for Annotation, Visualization and Integrated Discovery, and inflammatory, cancer and skeletal muscle development-associated biological processes and pathways were identified. DCE and DCR analysis revealed several potential therapeutic targets for sepsis, including genes and transcription factors. The results of the present study may provide a basis for the development of novel therapeutic targets and treatment methods for sepsis. PMID:29805480

  3. Meta-Analysis of Placental Transcriptome Data Identifies a Novel Molecular Pathway Related to Preeclampsia.

    PubMed

    van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B

    2015-01-01

    Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.
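
    The pooling step described here, combining per-study effect sizes with an inverse-variance random-effects model, can be sketched for a single gene as follows. The abstract does not say which estimator of between-study variance was used; the sketch assumes the DerSimonian-Laird estimator, and the function name `random_effects_pool` and the input numbers are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with inverse-variance weights.

    Between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method, which is an assumption of this sketch.
    Returns (pooled effect, standard error, tau^2).
    """
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Two invented studies that disagree strongly about one gene.
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.01, 0.01])
```

    When the studies disagree, tau^2 grows and the pooled standard error widens relative to a fixed-effect analysis, which is why a random-effects model suits data drawn from heterogeneous experiments.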

  4. Transcriptome analysis identifies genes involved in sex determination and development of Xenopus laevis gonads.

    PubMed

    Piprek, Rafal P; Damulewicz, Milena; Kloc, Malgorzata; Kubiak, Jacek Z

    Development of the gonads is a complex process, which starts with a period of undifferentiated, bipotential gonads. During this period the expression of sex-determining genes is initiated. Sex determination is a process triggering differentiation of the gonads into the testis or ovary. Sex determination period is followed by sexual differentiation, i.e. appearance of the first testis- and ovary-specific features. In Xenopus laevis W-linked DM-domain gene (DM-W) had been described as a master determinant of the gonadal female sex. However, the data on the expression and function of other genes participating in gonad development in X. laevis, and in anurans, in general, are very limited. We applied microarray technique to analyze the expression pattern of a subset of X. laevis genes previously identified to be involved in gonad development in several vertebrate species. We also analyzed the localization and the expression level of proteins encoded by these genes in developing X. laevis gonads. These analyses pointed to the set of genes differentially expressed in developing testes and ovaries. Gata4, Sox9, Dmrt1, Amh, Fgf9, Ptgds, Pdgf, Fshr, and Cyp17a1 expression was upregulated in developing testes, while DM-W, Fst, Foxl2, and Cyp19a1 were upregulated in developing ovaries. We discuss the possible roles of these genes in development of X. laevis gonads. Copyright © 2018 International Society of Differentiation. Published by Elsevier B.V. All rights reserved.

  5. Identifying a gene expression signature of cluster headache in blood

    PubMed Central

    Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.

    2017-01-01

    Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859

  6. A novel approach to identify genes that determine grain protein deviation in cereals.

    PubMed

    Mosleth, Ellen F; Wan, Yongfang; Lysenko, Artem; Chope, Gemma A; Penson, Simon P; Shewry, Peter R; Hawkesford, Malcolm J

    2015-06-01

    Grain yield and protein content were determined for six wheat cultivars grown over 3 years at multiple sites and at multiple nitrogen (N) fertilizer inputs. Although grain protein content was negatively correlated with yield, some grain samples had higher protein contents than expected based on their yields, a trait referred to as grain protein deviation (GPD). We used novel statistical approaches to identify gene transcripts significantly related to GPD across environments. The yield and protein content were initially adjusted for nitrogen fertilizer inputs and then adjusted for yield (to remove the negative correlation with protein content), resulting in a parameter termed corrected GPD. Significant genetic variation in corrected GPD was observed for six cultivars grown over a range of environmental conditions (a total of 584 samples). Gene transcript profiles were determined in a subset of 161 samples of developing grain to identify transcripts contributing to GPD. Principal component analysis (PCA), analysis of variance (ANOVA) and means of scores regression (MSR) were used to identify individual principal components (PCs) correlating with GPD alone. Scores of the selected PCs, which were significantly related to GPD and protein content but not to the yield and significantly affected by cultivar, were identified as reflecting a multivariate pattern of gene expression related to genetic variation in GPD. Transcripts with consistent variation along the selected PCs were identified by an approach hereby called one-block means of scores regression (one-block MSR). © 2014 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  7. De novo sequencing and analysis of the cranberry fruit transcriptome to identify putative genes involved in flavonoid biosynthesis, transport and regulation.

    PubMed

    Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong

    2015-09-02

    Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.

  8. Comprehensive Analysis of the COBRA-Like (COBL) Gene Family in Gossypium Identifies Two COBLs Potentially Associated with Fiber Quality

    PubMed Central

    Niu, Erli; Shang, Xiaoguang; Cheng, Chaoze; Bao, Jianghao; Zeng, Yanda; Cai, Caiping; Du, Xiongming; Guo, Wangzhen

    2015-01-01

    COBRA-Like (COBL) genes, which encode a plant-specific glycosylphosphatidylinositol (GPI) anchored protein, have been proven to be key regulators in the orientation of cell expansion and cellulose crystallinity status. Genome-wide analysis has been performed in A. thaliana, O. sativa, Z. mays and S. lycopersicum, but little in Gossypium. Here we identified 19, 18 and 33 candidate COBL genes from three sequenced cotton species, diploid cotton G. raimondii, G. arboreum and tetraploid cotton G. hirsutum acc. TM-1, respectively. These COBL members were anchored onto 10 chromosomes in G. raimondii and could be divided into two subgroups. Expression patterns of COBL genes showed highly developmental and spatial regulation in G. hirsutum acc. TM-1. Of them, GhCOBL9 and GhCOBL13 were preferentially expressed at the secondary cell wall stage of fiber development and had significantly co-upregulated expression with cellulose synthase genes GhCESA4, GhCESA7 and GhCESA8. Besides, GhCOBL9 Dt and GhCOBL13 Dt were co-localized with previously reported cotton fiber quality quantitative trait loci (QTLs) and the favorable allele types of GhCOBL9 Dt had significantly positive correlations with fiber quality traits, indicating that these two genes might play an important role in fiber development. PMID:26710066

  9. Systems approach identifies an organic nitrogen-responsive gene network that is regulated by the master clock control gene CCA1.

    PubMed

    Gutiérrez, Rodrigo A; Stokes, Trevor L; Thum, Karen; Xu, Xiaodong; Obertello, Mariana; Katari, Manpreet S; Tanurdzic, Milos; Dean, Alexis; Nero, Damion C; McClung, C Robertson; Coruzzi, Gloria M

    2008-03-25

    Understanding how nutrients affect gene expression will help us to understand the mechanisms controlling plant growth and development as a function of nutrient availability. Nitrate has been shown to serve as a signal for the control of gene expression in Arabidopsis. There is also evidence, on a gene-by-gene basis, that downstream products of nitrogen (N) assimilation such as glutamate (Glu) or glutamine (Gln) might serve as signals of organic N status that in turn regulate gene expression. To identify genome-wide responses to such organic N signals, Arabidopsis seedlings were transiently treated with ammonium nitrate in the presence or absence of MSX, an inhibitor of glutamine synthetase, resulting in a block of Glu/Gln synthesis. Genes that responded to organic N were identified as those whose response to ammonium nitrate treatment was blocked in the presence of MSX. We showed that some genes previously identified to be regulated by nitrate are under the control of an organic N-metabolite. Using an integrated network model of molecular interactions, we uncovered a subnetwork regulated by organic N that included CCA1 and target genes involved in N-assimilation. We validated some of the predicted interactions and showed that regulation of the master clock control gene CCA1 by Glu or a Glu-derived metabolite in turn regulates the expression of key N-assimilatory genes. Phase response curve analysis shows that distinct N-metabolites can advance or delay the CCA1 phase. Regulation of CCA1 by organic N signals may represent a novel input mechanism for N-nutrients to affect plant circadian clock function.

  10. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features

    PubMed Central

    2011-01-01

    Background: Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods: Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results: Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion: This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast

  11. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features.

    PubMed

    Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug

    2011-11-01

    Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.

  12. Weighted gene co‑expression network analysis in identification of key genes and networks for ischemic‑reperfusion remodeling myocardium.

    PubMed

    Guo, Nan; Zhang, Nan; Yan, Liqiu; Lian, Zheng; Wang, Jiawang; Lv, Fengfeng; Wang, Yunfei; Cao, Xufen

    2018-06-14

    Acute myocardial infarction induces ventricular remodeling, which is implicated in dilated heart and heart failure. The pathogenic mechanism of myocardium remodeling remains to be elucidated. The aim of the present study was to identify key genes and networks for myocardium remodeling following ischemia‑reperfusion (IR). First, the mRNA expression data from the National Center for Biotechnology Information database were downloaded to identify differences in mRNA expression of the IR heart at days 2 and 7. Then, weighted gene co‑expression network analysis, hierarchical clustering, protein‑protein interaction (PPI) network analysis, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were used to identify key genes and networks for the heart remodeling process following IR. A total of 3,321 differentially expressed genes were identified during the heart remodeling process. A total of 6 modules were identified through gene co‑expression network analysis. GO and KEGG analysis results suggested that each module represented a different biological function and was associated with different pathways. Finally, hub genes of each module were identified by PPI network construction. The present study revealed that heart remodeling following IR is a complicated process, involving extracellular matrix organization, neural development, apoptosis and energy metabolism. The dysregulated genes, including SRC proto‑oncogene, non‑receptor tyrosine kinase; discs large MAGUK scaffold protein 1; ATP citrate lyase; RAN, member RAS oncogene family; tumor protein p53; and polo like kinase 2, may be essential for heart remodeling following IR and may be used as potential targets for the inhibition of heart remodeling following acute myocardial infarction.

  13. A 6-gene signature identifies four molecular subgroups of neuroblastoma

    PubMed Central

    2011-01-01

    Background: There are currently three postulated genomic subtypes of the childhood tumour neuroblastoma (NB): Type 1, Type 2A, and Type 2B. The most aggressive forms of NB are characterized by amplification of the oncogene MYCN (MNA) and low expression of the favourable marker NTRK1. Recently, mutations or high expression of the familial predisposition gene Anaplastic Lymphoma Kinase (ALK) were associated with unfavourable biology of sporadic NB. Also, various other genes have been linked to NB pathogenesis. Results: The present study explores subgroup discrimination by gene expression profiling using three published microarray studies on NB (47 samples). Four distinct clusters were identified by Principal Components Analysis (PCA) in two separate data sets, which could be verified by an unsupervised hierarchical clustering in a third independent data set (101 NB samples) using a set of 74 discriminative genes. The expression signature of six NB-associated genes, ALK, BIRC5, CCND1, MYCN, NTRK1, and PHOX2B, significantly discriminated the four clusters (p < 0.05, one-way ANOVA test). PCA clusters p1, p2, and p3 were found to correspond well to the postulated subtypes 1, 2A, and 2B, respectively. Remarkably, a fourth novel cluster was detected in all three independent data sets. This cluster comprised mainly 11q-deleted MNA-negative tumours with low expression of ALK, BIRC5, and PHOX2B, and was significantly associated with higher tumour stage, poor outcome and poor survival compared to the Type 1-corresponding favourable group (INSS stage 4 and/or dead of disease, p < 0.05, Fisher's exact test). Conclusions: Based on expression profiling we have identified four molecular subgroups of neuroblastoma, which can be distinguished by a 6-gene signature. The fourth subgroup has not been described elsewhere, and efforts are currently being made to further investigate this group's specific characteristics. PMID:21492432

  14. A data mining paradigm for identifying key factors in biological processes using gene expression data.

    PubMed

    Li, Jin; Zheng, Le; Uchiyama, Akihiko; Bin, Lianghua; Mauro, Theodora M; Elias, Peter M; Pawelczyk, Tadeusz; Sakowicz-Burkiewicz, Monika; Trzeciak, Magdalena; Leung, Donald Y M; Morasso, Maria I; Yu, Peng

    2018-06-13

    A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.

  15. Identification of pathogenic genes related to rheumatoid arthritis through integrated analysis of DNA methylation and gene expression profiling.

    PubMed

    Zhang, Lei; Ma, Shiyun; Wang, Huailiang; Su, Hang; Su, Ke; Li, Longjie

    2017-11-15

    The purpose of our study was to identify new pathogenic genes used for exploring the pathogenesis of rheumatoid arthritis (RA). To screen pathogenic genes of RA, an integrated analysis was performed by using the microarray datasets in RA derived from the Gene Expression Omnibus (GEO) database. The functional annotation and potential pathways of differentially expressed genes (DEGs) were further discovered by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Afterwards, the integrated analysis of DNA methylation and gene expression profiling was used to screen crucial genes. In addition, we used RT-PCR and MSP to verify the expression levels and methylation status of these crucial genes in 20 synovial biopsy samples obtained from 10 RA model mice and 10 normal mice. BCL11B, CCDC88C, FCRLA and APOL6 were both up-regulated and hypomethylated in RA according to integrated analysis, RT-PCR and MSP verification. Four crucial genes (BCL11B, CCDC88C, FCRLA and APOL6) identified and analyzed in this study might be closely connected with the pathogenesis of RA. Copyright © 2017. Published by Elsevier B.V.

  16. Transcriptomic Analysis Identifies Candidate Genes Related to Intramuscular Fat Deposition and Fatty Acid Composition in the Breast Muscle of Squabs (Columba)

    PubMed Central

    Ye, Manhong; Zhou, Bin; Wei, Shanshan; Ding, MengMeng; Lu, Xinghui; Shi, Xuehao; Ding, Jiatong; Yang, Shengmei; Wei, Wanhong

    2016-01-01

    Despite the fact that squab is consumed throughout the world because of its high nutritional value and appreciated sensory attributes, aspects related to its characterization, and in particular genetic issues, have rarely been studied. In this study, meat traits in terms of pH, water-holding capacity, intramuscular fat content, and fatty acid profile of the breast muscle of squabs from two meat pigeon breeds were determined. Breed-specific differences were detected in fat-related traits of intramuscular fat content and fatty acid composition. RNA-Sequencing was applied to compare the transcriptomes of muscle and liver tissues between squabs of two breeds to identify candidate genes associated with the differences in the capacity of fat deposition. A total of 27 differentially expressed genes assigned to pathways of lipid metabolism were identified, of which, six genes belonged to the peroxisome proliferator-activated receptor signaling pathway along with four other genes. Our results confirmed in part previous reports in livestock and provided also a number of genes which had not been related to fat deposition so far. These genes can serve as a basis for further investigations to screen markers closely associated with intramuscular fat content and fatty acid composition in squabs. The data from this study were deposited in the National Center for Biotechnology Information (NCBI)’s Sequence Read Archive under the accession numbers SRX1680021 and SRX1680022. This is the first transcriptome analysis of the muscle and liver tissue in Columba using next generation sequencing technology. Data provided here are of potential value to dissect functional genes influencing fat deposition in squabs. PMID:27175015

  17. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes

    PubMed Central

    Sliva, Anna; Kuang, Zheng; Meluh, Pamela B.; Boeke, Jef D.

    2016-01-01

    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. In the first ChIP-Seq of Sub1, our data show that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. We also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating. PMID:26837954

  18. DNA methylome profiling identifies novel methylated genes in African American patients with colorectal neoplasia.

    PubMed

    Ashktorab, Hassan; Daremipouran, M; Goel, Ajay; Varma, Sudhir; Leavitt, R; Sun, Xueguang; Brim, Hassan

    2014-04-01

    The identification of genes that are differentially methylated in colorectal cancer (CRC) has potential value for both diagnostic and therapeutic interventions, specifically in high-risk populations such as African Americans (AAs). However, DNA methylation patterns in CRC, especially in AAs, have not been systematically explored and remain poorly understood. Here, we performed DNA methylome profiling to identify the methylation status of CpG islands within candidate genes involved in critical pathways important in the initiation and development of CRC. We used reduced representation bisulfite sequencing (RRBS) in colorectal cancer and adenoma tissues that were compared with DNA methylome from a healthy AA subject's colon tissue and peripheral blood DNA. The identified methylation markers were validated in fresh frozen CRC tissues and corresponding normal tissues from AA patients diagnosed with CRC at Howard University Hospital. We identified and validated the methylation status of 355 CpG sites located within 16 gene promoter regions associated with CpG islands. Fifty CpG sites located within CpG islands, in genes ATXN7L1 (2), BMP3 (7), EID3 (15), GAS7 (1), GPR75 (24), and TNFAIP2 (1), were significantly hypermethylated in tumor vs. normal tissues (P<0.05). The methylation status of BMP3, EID3, GAS7, and GPR75 was confirmed in an independent validation cohort. Ingenuity pathway analysis mapped three of these markers (GAS7, BMP3 and GPR) in the insulin and TGF-β1 network, the two key pathways in CRC. In addition to hypermethylated genes, our analysis also revealed that LINE-1 repeat elements were progressively hypomethylated in the normal-adenoma-cancer sequence. We conclude that DNA methylome profiling based on RRBS is an effective method for screening aberrantly methylated genes in CRC. While previous studies focused on the limited identification of hypermethylated genes, ours is the first study to systematically and comprehensively identify novel hypermethylated

  19. Sarcoidosis Related Novel Candidate Genes Identified by Multi-Omics Integrative Analyses.

    PubMed

    Hočevar, Keli; Maver, Aleš; Kunej, Tanja; Peterlin, Borut

    2018-05-01

    Sarcoidosis is a multifactorial systemic disease characterized by granulomatous inflammation, with a substantial impact on global public health. The etiology and mechanisms of sarcoidosis are not fully understood. Recent high-throughput biological research has generated vast amounts of multi-omics big data on sarcoidosis, but their significance remains to be determined. We sought to identify novel candidate regions and genes consistently altered in heterogeneous omics studies so as to reveal the underlying molecular mechanisms. We conducted a comprehensive integrative literature analysis on global data on sarcoidosis, including genomic, transcriptomic, proteomic, and phenomic studies. We performed positional integration analysis of 38 eligible datasets originating from 17 different biological layers. Using the integration interval length of 50 kb, we identified 54 regions reaching significance value p ≤ 0.0001 and 15 regions with significance value p ≤ 0.00001, when applying more stringent criteria. Secondary literature analysis of the top 20 regions, with the most significant accumulation of signals, revealed several novel candidate genes for which associations with sarcoidosis have not yet been established, but have considerable support for their involvement based on omic data. These new plausible candidate genes include NELFE, CFB, EGFL7, AGPAT2, FKBPL, NRC3, and NEU1. Furthermore, annotated data were prepared to enable custom visualization and browsing of this sarcoidosis-related omics evidence in the University of California Santa Cruz (UCSC) Genome Browser. Further multi-omics approaches are called for to advance sarcoidosis biomarkers and diagnostic and therapeutic innovation. Our approach for harnessing multi-omics data and the findings presented herein reflect important steps toward understanding the etiology and underlying pathological mechanisms of sarcoidosis.
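
    The positional integration step described above can be illustrated with a minimal sketch. It assumes fixed, non-overlapping 50 kb windows and simply counts how many distinct datasets place a signal in each window; the significance calculation used in the study is not reproduced, and the dataset names, coordinates, and the `min_datasets` threshold are hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 50_000  # 50 kb integration interval, as stated in the abstract


def signals_per_bin(signals):
    """Map (dataset, chromosome, position) signals to the datasets hitting each bin."""
    bins = defaultdict(set)
    for dataset, chrom, pos in signals:
        bins[(chrom, pos // BIN_SIZE)].add(dataset)
    return bins


def top_bins(signals, min_datasets=2):
    """Keep bins supported by at least `min_datasets` distinct datasets (hypothetical cutoff)."""
    bins = signals_per_bin(signals)
    return {key: sorted(ds) for key, ds in bins.items() if len(ds) >= min_datasets}


# Hypothetical signals from three omics layers.
signals = [
    ("gwas", "chr6", 31_950_000),
    ("expression", "chr6", 31_960_000),
    ("proteome", "chr1", 1_000),
]
top = top_bins(signals)
```

    Here the two chr6 signals fall in the same 50 kb window and are reported together, while the isolated chr1 signal is dropped.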

  20. Metabolic pathways and genes identified by RNA-seq analysis of barley near-isogenic lines differing by allelic state of the Black lemma and pericarp (Blp) gene.

    PubMed

    Glagoleva, Anastasiya Y; Shmakov, Nikolay A; Shoeva, Olesya Y; Vasiliev, Gennady V; Shatskaya, Natalia V; Börner, Andreas; Afonnikov, Dmitry A; Khlestkina, Elena K

    2017-11-14

    Some plant species have 'melanin-like' black seed pigmentation. However, the chemical and genetic nature of this 'melanin-like' black pigment has not yet been fully explored due to its complex structure and ability to withstand almost all solvents. Nevertheless, identification of genetic networks participating in trait formation is key to understanding metabolic processes involved in the expression of 'melanin-like' black seed pigmentation. The aim of the current study was to identify differentially expressed genes (DEGs) in barley near-isogenic lines (NILs) differing by allelic state of the Blp (black lemma and pericarp) locus. RNA-seq analysis of six libraries (three replicates for each line) was performed. A total of 957 genome fragments had statistically significant changes in expression levels between lines BLP and BW, with 632 fragments having increased expression levels in line BLP and 325 genome fragments having decreased expression. Among identified DEGs, 191 genes were recognized as participating in known pathways. Among these were metabolic pathways including 'suberin monomer biosynthesis', 'diterpene phytoalexins precursors biosynthesis', 'cutin biosynthesis', 'cuticular wax biosynthesis', and 'phenylpropanoid biosynthesis, initial reactions'. Differential expression was confirmed by real-time PCR analysis of selected genes. Metabolic pathways and genes presumably associated with black lemma and pericarp colour, as well as Blp-associated resistance to oxidative stress and pathogens, were revealed. We suggest that the black pigmentation of lemmas and pericarps is related to increased level of phenolic compounds and their oxidation. The effect of functional Blp on the synthesis of ferulic acid and other phenolic compounds can explain the increased antioxidant capacity and biotic and abiotic stress tolerance of black-grained cereals. Their drought tolerance and resistance to diseases affecting the spike may also be related to cuticular wax biosynthesis. In

  1. Coalitional game theory as a promising approach to identify candidate autism genes.

    PubMed

    Gupta, Anika; Sun, Min Woo; Paskov, Kelley Marie; Stockham, Nate Tyler; Jung, Jae-Yoon; Wall, Dennis Paul

    2018-01-01

    Despite mounting evidence for the strong role of genetics in the phenotypic manifestation of Autism Spectrum Disorder (ASD), the specific genes responsible for the variable forms of ASD remain undefined. ASD may be best explained by a combinatorial genetic model with varying epistatic interactions across many small effect mutations. Coalitional or cooperative game theory is a technique that studies the combined effects of groups of players, known as coalitions, seeking to identify players who tend to improve the performance (here, the relationship to a specific disease phenotype) of any coalition they join. This method has been previously shown to boost biologically informative signal in gene expression data but to date has not been applied to the search for cooperative mutations among putative ASD genes. We describe our approach to highlight genes relevant to ASD using coalitional game theory on alteration data of 1,965 fully sequenced genomes from 756 multiplex families. Alterations were encoded into binary matrices for ASD (case) and unaffected (control) samples, indicating likely gene-disrupting, inherited mutations in altered genes. To determine individual gene contributions given an ASD phenotype, a "player" metric, referred to as the Shapley value, was calculated for each gene in the case and control cohorts. Sixty-seven genes were found to have significantly elevated player scores and likely represent significant contributors to the genetic coordination underlying ASD. Using network and cross-study analysis, we found that these genes are involved in biological pathways known to be affected in the autism cases and that a subset directly interact with several genes known to have strong associations to autism. These findings suggest that coalitional game theory can be applied to large-scale genomic data to identify hidden yet influential players in complex polygenic disorders such as autism.
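
    To make the Shapley value concrete, the sketch below computes it exactly for a toy game by averaging each player's marginal contribution over all orderings. The characteristic function (number of case samples carrying an alteration in at least one gene of the coalition) and the gene labels A, B, C are assumptions chosen for illustration; the study's own game definition and its handling of thousands of genomes are not reproduced, and exhaustive enumeration is only feasible for a handful of players.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: mean marginal contribution over all player orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical toy data: each case sample lists the genes carrying an alteration.
CASES = [{"A", "B"}, {"A"}, {"B", "C"}, {"A", "C"}]


def covered_cases(coalition):
    """Toy characteristic function: case samples hit by at least one coalition gene."""
    return sum(1 for sample in CASES if sample & coalition)


scores = shapley_values(["A", "B", "C"], covered_cases)
```

    In this toy game gene A receives a score of 2.0 and genes B and C receive 1.0 each; the scores sum to 4, the value of the full coalition, which is the efficiency property of the Shapley value.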

  2. A Protocol for Using Gene Set Enrichment Analysis to Identify the Appropriate Animal Model for Translational Research.

    PubMed

    Weidner, Christopher; Steinfath, Matthias; Wistorf, Elisa; Oelgeschläger, Michael; Schneider, Marlon R; Schönfelder, Gilbert

    2017-08-16

    Recent studies that compared transcriptomic datasets of human diseases with datasets from mouse models using traditional gene-to-gene comparison techniques resulted in contradictory conclusions regarding the relevance of animal models for translational research. A major reason for the discrepancies between different gene expression analyses is the arbitrary filtering of differentially expressed genes. Furthermore, the comparison of single genes between different species and platforms often is limited by technical variance, leading to misinterpretation of the concordance or discordance between data from human and animal models. Thus, standardized approaches for systematic data analysis are needed. To overcome subjective gene filtering and ineffective gene-to-gene comparisons, we recently demonstrated that gene set enrichment analysis (GSEA) has the potential to avoid these problems. Therefore, we developed a standardized protocol for the use of GSEA to distinguish between appropriate and inappropriate animal models for translational research. This protocol is not suitable to predict how to design new model systems a priori, as it requires existing experimental omics data. However, the protocol describes how to interpret existing data in a standardized manner in order to select the most suitable animal model, thus avoiding unnecessary animal experiments and misleading translational studies.
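
    As a rough illustration of why a gene-set-level statistic avoids arbitrary gene filtering, the sketch below computes an unweighted running-sum enrichment score over a full ranked gene list. This is a simplification: the published GSEA method weights hits by their ranking metric and assesses significance by permutation, neither of which is shown here, and the gene names are placeholders rather than data from the protocol.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like statistic)."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Placeholder ranking, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
```

    A set concentrated at the top of the ranking scores close to +1 and a set concentrated at the bottom close to -1, without any cutoff on which individual genes count as differentially expressed.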

  3. Gene-environment interaction involving recently identified colorectal cancer susceptibility loci

    PubMed Central

    Kantor, Elizabeth D.; Hutter, Carolyn M.; Minnier, Jessica; Berndt, Sonja I.; Brenner, Hermann; Caan, Bette J.; Campbell, Peter T.; Carlson, Christopher S.; Casey, Graham; Chan, Andrew T.; Chang-Claude, Jenny; Chanock, Stephen J.; Cotterchio, Michelle; Du, Mengmeng; Duggan, David; Fuchs, Charles S.; Giovannucci, Edward L.; Gong, Jian; Harrison, Tabitha A.; Hayes, Richard B.; Henderson, Brian E.; Hoffmeister, Michael; Hopper, John L.; Jenkins, Mark A.; Jiao, Shuo; Kolonel, Laurence N.; Le Marchand, Loic; Lemire, Mathieu; Ma, Jing; Newcomb, Polly A.; Ochs-Balcom, Heather M.; Pflugeisen, Bethann M.; Potter, John D.; Rudolph, Anja; Schoen, Robert E.; Seminara, Daniela; Slattery, Martha L.; Stelling, Deanna L.; Thomas, Fridtjof; Thornquist, Mark; Ulrich, Cornelia M.; Warnick, Greg S.; Zanke, Brent W.; Peters, Ulrike; Hsu, Li; White, Emily

    2014-01-01

    BACKGROUND: Genome-wide association studies have identified several single nucleotide polymorphisms (SNPs) that are associated with risk of colorectal cancer (CRC). Prior research has evaluated the presence of gene-environment interaction involving the first 10 identified susceptibility loci, but little work has been conducted on interaction involving SNPs at recently identified susceptibility loci, including: rs10911251, rs6691170, rs6687758, rs11903757, rs10936599, rs647161, rs1321311, rs719725, rs1665650, rs3824999, rs7136702, rs11169552, rs59336, rs3217810, rs4925386, and rs2423279. METHODS: Data on 9160 cases and 9280 controls from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) and Colon Cancer Family Registry (CCFR) were used to evaluate the presence of interaction involving the above-listed SNPs and sex, body mass index (BMI), alcohol consumption, smoking, aspirin use, post-menopausal hormone (PMH) use, as well as intake of dietary calcium, dietary fiber, dietary folate, red meat, processed meat, fruit, and vegetables. Interaction was evaluated using a fixed-effects meta-analysis of an efficient Empirical Bayes estimator, and permutation was used to account for multiple comparisons. RESULTS: None of the permutation-adjusted p-values reached statistical significance. CONCLUSIONS: The associations between recently identified genetic susceptibility loci and CRC are not strongly modified by sex, BMI, alcohol, smoking, aspirin, PMH use, and various dietary factors. IMPACT: Results suggest no evidence of strong gene-environment interactions involving the recently identified 16 susceptibility loci for CRC taken one at a time. PMID:24994789
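
    The pooling step of a fixed-effects meta-analysis can be sketched with standard inverse-variance weighting. The Empirical Bayes interaction estimator and the permutation adjustment used in the study are not reproduced; the per-study estimates and standard errors below are hypothetical.

```python
from math import sqrt


def fixed_effects_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, sqrt(1.0 / total)


# Hypothetical interaction estimates (log odds ratios) from two studies.
pooled, pooled_se = fixed_effects_meta([0.10, 0.30], [0.1, 0.2])
```

    The more precise study dominates the pooled estimate: here the result is 0.14 rather than the unweighted mean of 0.20.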

  4. [Analysis of horizontal transfer gene of Bombyx mori NPV].

    PubMed

    Duan, Hai-Rong; Qiu, De-Bin; Gong, Cheng-Liang; Huang, Mo-Li

    2011-06-01

    To study the genetic characteristics and evolutionary origin of baculovirus genomes, a comprehensive homology search and phylogenetic analysis of the complete genomes of Bombyx mori NPV and Bombyx mori were performed. Three horizontally transferred genes (inhibitor of apoptosis, chitinase, and UDP-glucosyltransferase) were identified, and there was evidence that all of these genes were derived from the insect host. The analysis showed many differences between the horizontally transferred genes and the genes of the whole genome in features such as nucleotide composition, codon usage bias and selection pressure. These results reconfirmed that the horizontally transferred genes are exogenous. The analysis of gene function suggested that horizontally transferred genes acquired from an ancestral host insect can increase the efficiency of baculovirus transmission.

  5. LGscore: A method to identify disease-related genes using biological literature and Google data.

    PubMed

    Kim, Jeongwoo; Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun

    2015-04-01

    Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods. Copyright © 2015 Elsevier Inc. All rights reserved.
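
    The literature-frequency part of the edge weighting can be sketched as follows: count how often two genes are mentioned in the same abstract, then standardize the counts as z-scores. This is only an illustration under stated assumptions; the Google search component, the network scoring, and the exact Z-scoring procedure of LGscore are not reproduced, and the gene sets per abstract are made-up examples.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_z_scores(abstract_gene_sets):
    """Count gene pairs sharing an abstract, then z-score the pair counts."""
    counts = Counter()
    for genes in abstract_gene_sets:
        for pair in combinations(sorted(genes), 2):
            counts[pair] += 1
    if not counts:
        return {}
    mu = mean(counts.values())
    sigma = pstdev(counts.values())
    if sigma == 0:
        return {pair: 0.0 for pair in counts}
    return {pair: (c - mu) / sigma for pair, c in counts.items()}


# Made-up example: genes mentioned in each of three abstracts.
abstracts = [{"APP", "PSEN1"}, {"APP", "PSEN1", "APOE"}, {"APP", "APOE"}]
z = cooccurrence_z_scores(abstracts)
```

    Pairs that co-occur more often than average receive positive weights, and rarely co-occurring pairs receive negative ones.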

  6. Integrating mean and variance heterogeneities to identify differentially expressed genes.

    PubMed

    Ouyang, Weiwei; An, Qiang; Zhao, Jinying; Qin, Huaizhen

    2016-12-06

    In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (i.e., the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration; and variance heterogeneity induced by condition change may reflect another aspect. Change in condition may alter both mean and some higher-order characteristics of the distributions of expression levels of susceptible genes. In this report, we put forth a conception of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existing mean heterogeneity tests and variance heterogeneity tests. Based on the independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normal and Laplace settings. For moderate samples, the IMVT controlled type I error rates well, and so did the existing mean heterogeneity tests (i.e., the Welch t test (WT) and the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In the presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B raised solid evidence of informative variance heterogeneity. After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment

  7. Comparative Analysis of Muscle Transcriptome between Pig Genotypes Identifies Genes and Regulatory Mechanisms Associated to Growth, Fatness and Metabolism

    PubMed Central

    Ayuso, Miriam; Fernández, Almudena; Núñez, Yolanda; Benítez, Rita; Isabel, Beatriz; Barragán, Carmen; Fernández, Ana Isabel; Rey, Ana Isabel; Medrano, Juan F.; Cánovas, Ángela; González-Bulnes, Antonio; López-Bote, Clemente; Ovilo, Cristina

    2015-01-01

    Iberian ham production includes both purebred (IB) and Duroc-crossbred (IBxDU) Iberian pigs, which show important differences in meat quality and production traits, such as muscle growth and fatness. This experiment was conducted to investigate gene expression differences, transcriptional regulation and genetic polymorphisms that could be associated with the observed phenotypic differences between IB and IBxDU pigs. Nine IB and 10 IBxDU pigs were slaughtered at birth. Morphometric measures and blood samples were obtained and samples from Biceps femoris muscle were employed for compositional and transcriptome analysis by RNA-Seq technology. Phenotypic differences were evident at this early age, including greater body size and weight in IBxDU and greater Biceps femoris intramuscular fat and plasma cholesterol content in IB newborns. We detected 149 differentially expressed genes between IB and IBxDU neonates (p < 0.01 and Fold-Change > 1.5). Several were related to adipose and muscle tissue development (DLK1, FGF21 or UBC). The functional interpretation of the transcriptomic differences revealed enrichment of functions and pathways related to lipid metabolism in IB and to cellular and muscle growth in IBxDU pigs. Protein catabolism, cholesterol biosynthesis and immune system were functions enriched in both genotypes. We identified transcription factors potentially affecting the observed gene expression differences. Some of them have known functions in adipogenesis (CEBPA, EGRs), lipid metabolism (PPARGC1B) and myogenesis (FOXOs, MEF2D, MYOD1), which suggests a key role in the meat quality differences existing between IB and IBxDU hams. We also identified several polymorphisms showing differential segregation between IB and IBxDU pigs. Among them, non-synonymous variants were detected in several transcription factors such as PPARGC1B and TRIM63, which could be associated with altered gene function. Taken together, these results provide information about candidate
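
    The selection rule quoted above (p < 0.01 and fold change > 1.5) can be sketched as a simple filter. The sketch assumes the fold change is applied symmetrically, so a gene passes whether it is higher in IB or in IBxDU; the abstract does not say so explicitly. The function name and the example values are hypothetical.

```python
def is_differentially_expressed(mean_ib, mean_ibxdu, p_value,
                                p_cutoff=0.01, fc_cutoff=1.5):
    """Apply a joint p-value and fold-change threshold to one gene."""
    if mean_ib <= 0 or mean_ibxdu <= 0:
        raise ValueError("mean expression values must be positive")
    ratio = mean_ib / mean_ibxdu
    # Symmetric fold change: 3-fold up and 3-fold down both give 3.0.
    fold_change = max(ratio, 1.0 / ratio)
    return p_value < p_cutoff and fold_change > fc_cutoff


# Hypothetical genes: (mean in IB, mean in IBxDU, p-value).
genes = {
    "GENE_A": (30.0, 10.0, 0.001),   # passes both thresholds
    "GENE_B": (12.0, 10.0, 0.001),   # fold change too small
    "GENE_C": (10.0, 30.0, 0.050),   # p-value too large
}
selected = [name for name, values in genes.items()
            if is_differentially_expressed(*values)]
print(selected)
```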

  8. Gene Expression Analysis of Forskolin Treated Basilar Papillae Identifies MicroRNA181a as a Mediator of Proliferation

    PubMed Central

    Frucht, Corey S.; Uduman, Mohamed; Duke, Jamie L.; Kleinstein, Steven H.; Santos-Sacchi, Joseph; Navaratnam, Dhasakumar S.

    2010-01-01

    Background: Auditory hair cells spontaneously regenerate following injury in birds but not mammals. A better understanding of the molecular events underlying hair cell regeneration in birds may allow for identification and eventually manipulation of relevant pathways in mammals to stimulate regeneration and restore hearing in deaf patients. Methodology/Principal Findings: Gene expression was profiled in forskolin treated (i.e., proliferating) and quiescent control auditory epithelia of post-hatch chicks using an Affymetrix whole-genome chicken array after 24 (n = 6), 48 (n = 6), and 72 (n = 12) hours in culture. In the forskolin-treated epithelia there was significant (p < 0.05; > two-fold change) upregulation of many genes thought to be relevant to cell cycle control and inner ear development. Gene set enrichment analysis was performed on the data and identified myriad microRNAs that are likely to be upregulated in the regenerating tissue, including microRNA181a (miR181a), which is known to mediate proliferation in other systems. Functional experiments showed that miR181a overexpression is sufficient to stimulate proliferation within the basilar papilla, as assayed by BrdU incorporation. Further, some of the newly produced cells express the early hair cell marker myosin VI, suggesting that miR181a transfection can result in the production of new hair cells. Conclusions/Significance: These studies have identified a single microRNA, miR181a, that can cause proliferation in the chicken auditory epithelium with production of new hair cells. PMID:20634979

  9. Genome-scale analysis of positionally relocated genes

    PubMed Central

    Bhutkar, Arjun; Russo, Susan M.; Smith, Temple F.; Gelbart, William M.

    2007-01-01

    During evolution, genome reorganization includes large-scale events such as inversions, translocations, and segmental or even whole-genome duplications, as well as fine-scale events such as the relocation of individual genes. This latter category, which we will refer to as positionally relocated genes (PRGs), is the subject of this report. Assessment of the magnitude of such PRGs and of possible contributing mechanisms is aided by a comparative analysis of related genomes, where conserved chromosomal organization can aid in identifying genes that have acquired a new location in a lineage of these genomes. Here we utilize two methods to comprehensively identify relocated protein-coding genes in the recently sequenced genomes of 12 species of genus Drosophila. We use exceptions to the general rule of maintenance of chromosome arm (Muller element) association for most Drosophila genes to identify one major class of PRGs. We also identify a partially overlapping set of PRGs among “embedded genes,” located within the extents of other surrounding genes. We provide evidence that PRG movements have at least two different origins: Some events occur via retrotransposition of processed RNAs and others via a DNA-based transposition mechanism. Overall, we identify several hundred PRGs that arose within a lineage of the genus Drosophila phylogeny and provide suggestive evidence that a few thousand such events have occurred within the radiation of the insect order Diptera, thereby illustrating the magnitude of the contribution of PRG movement to chromosomal reorganization during evolution. PMID:17989252

  10. Gene identification for risk of relapse in stage I lung adenocarcinoma patients: a combined methodology of gene expression profiling and computational gene network analysis.

    PubMed

    Ludovini, Vienna; Bianconi, Fortunato; Siggillino, Annamaria; Piobbico, Danilo; Vannucci, Jacopo; Metro, Giulio; Chiari, Rita; Bellezza, Guido; Puma, Francesco; Della Fazia, Maria Agnese; Servillo, Giuseppe; Crinò, Lucio

    2016-05-24

    Risk assessment and treatment choice remain a challenge in early non-small-cell lung cancer (NSCLC). The aim of this study was to identify novel genes involved in the risk of early relapse (ER) compared to no relapse (NR) in resected lung adenocarcinoma (AD) patients using a combination of high throughput technology and computational analysis. We identified 18 patients (n.13 NR and n.5 ER) with stage I AD. Frozen samples of patients in ER, NR and corresponding normal lung (NL) were subjected to Microarray technology and quantitative-PCR (Q-PCR). A gene network computational analysis was performed to select predictive genes. An independent set of 79 ADs stage I samples was used to validate selected genes by Q-PCR. From microarray analysis we selected 50 genes, using the fold change ratio of ER versus NR. They were validated both in pool and individually in patient samples (ER and NR) by Q-PCR. Fourteen increased and 25 decreased genes showed a concordance between the two methods. They were used to perform a computational gene network analysis that identified 4 increased (HOXA10, CLCA2, AKR1B10, FABP3) and 6 decreased (SCGB1A1, PGC, TFF1, PSCA, SPRR1B and PRSS1) genes. Moreover, in an independent dataset of ADs samples, we showed that both high FABP3 expression and low SCGB1A1 expression were associated with a worse disease-free survival (DFS). Our results indicate that it is possible to define, through gene expression and computational analysis, a characteristic gene profiling of patients with an increased risk of relapse that may become a tool for patient selection for adjuvant therapy.

  11. Secretome Characterization and Correlation Analysis Reveal Putative Pathogenicity Mechanisms and Identify Candidate Avirulence Genes in the Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici.

    PubMed

    Xia, Chongjing; Wang, Meinan; Cornejo, Omar E; Jiwan, Derick A; See, Deven R; Chen, Xianming

    2017-01-01

    Stripe (yellow) rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most destructive diseases of wheat worldwide. Planting resistant cultivars is an effective way to control this disease, but race-specific resistance can be overcome quickly due to the rapidly evolving Pst population. Studying the pathogenicity mechanisms is critical for understanding how Pst virulence changes and how to develop wheat cultivars with durable resistance to stripe rust. We re-sequenced 7 Pst isolates and included 7 additional previously sequenced isolates to represent balanced virulence/avirulence profiles for several avirulence loci in secretome analyses. We observed an uneven distribution of heterozygosity among the isolates. Secretome comparison of Pst with other rust fungi identified a large portion of species-specific secreted proteins, suggesting that they may have specific roles when interacting with the wheat host. Thirty-two effectors of Pst were identified from its secretome. We identified candidates for Avr genes corresponding to six Yr genes by correlating polymorphisms for effector genes to the virulence/avirulence profiles of the 14 Pst isolates. The putative AvYr76 was present in the avirulent isolates, but absent in the virulent isolates, suggesting that deleting the coding region of the candidate avirulence gene has produced races virulent to resistance gene Yr76. We conclude that incorporating avirulence/virulence phenotypes into correlation analysis with variations in genomic structure and secretome, particularly presence/absence polymorphisms of effectors, is an efficient way to identify candidate Avr genes in Pst. The candidate effector genes provide a rich resource for further studies to determine the evolutionary history of Pst populations and the co-evolutionary arms race between Pst and wheat. The Avr candidates identified in this study will lead to cloning avirulence genes in Pst, which will enable us to understand molecular mechanisms
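
    The presence/absence correlation described above can be sketched as a set comparison: an effector is a candidate for a given Avr gene if it is present in every avirulent isolate and absent from every virulent one. This is a simplified illustration under that strict all-or-nothing assumption; the abstract does not give the actual scoring rule, and all effector and isolate names below are hypothetical.

```python
def candidate_avr_effectors(presence, avirulent, virulent):
    """Return effectors present in all avirulent and no virulent isolates.

    presence  -- dict mapping effector name to the set of isolates carrying it
    avirulent -- set of isolates avirulent on the resistance gene of interest
    virulent  -- set of isolates virulent on that resistance gene
    """
    candidates = []
    for effector, isolates_with_gene in presence.items():
        in_all_avirulent = avirulent <= isolates_with_gene
        in_no_virulent = not (virulent & isolates_with_gene)
        if in_all_avirulent and in_no_virulent:
            candidates.append(effector)
    return sorted(candidates)


# Hypothetical data for four isolates and three effectors.
presence = {
    "effector_1": {"iso1", "iso2"},                   # matches the phenotype
    "effector_2": {"iso1", "iso2", "iso3", "iso4"},   # present everywhere
    "effector_3": {"iso1"},                           # missing from iso2
}
print(candidate_avr_effectors(presence, {"iso1", "iso2"}, {"iso3", "iso4"}))
```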

  12. Omics of Brucella: Species-Specific sRNA-Mediated Gene Ontology Regulatory Networks Identified by Computational Biology.

    PubMed

    Vishnu, Udayakumar S; Sankarasubramanian, Jagadesan; Gunasekaran, Paramasamy; Sridhar, Jayavel; Rajendhran, Jeyaprakash

    2016-06-01

    Brucella is an intracellular bacterium that causes the zoonotic infectious disease, brucellosis. Brucella species are currently intensively studied with a view to developing novel global health diagnostics and therapeutics. In this context, small RNAs (sRNAs) are one of the emerging topical areas; they play significant roles in regulating gene expression and cellular processes in bacteria. In the present study, we forecast sRNAs in three Brucella species that infect humans, namely Brucella melitensis, Brucella abortus, and Brucella suis, using a computational biology analysis. We combined two bioinformatic algorithms, SIPHT and sRNAscanner. In B. melitensis 16M, 21 sRNA candidates were identified, of which 14 were novel. Similarly, 14 sRNAs were identified in B. abortus, of which four were novel. In B. suis, 16 sRNAs were identified, and five of them were novel. TargetRNA2 software predicted the putative target genes that could be regulated by the identified sRNAs. The identified mRNA targets are involved in carbohydrate, amino acid, lipid, nucleotide, and coenzyme metabolism and transport, energy production and conversion, replication, recombination, repair, and transcription. Additionally, the Gene Ontology (GO) network analysis revealed the species-specific, sRNA-based regulatory networks in B. melitensis, B. abortus, and B. suis. Taken together, although sRNAs are veritable modulators of gene expression in prokaryotes, there are few reports on the significance of sRNAs in Brucella. This report begins to address this literature gap by offering a series of initial observations based on computational biology to pave the way for future experimental analysis of sRNAs and their targets to explain the complex pathogenesis of Brucella.

  13. A gene-trap strategy identifies quiescence-induced genes in synchronized myoblasts.

    PubMed

    Sambasivan, Ramkumar; Pavlath, Grace K; Dhawan, Jyotsna

    2008-03-01

    Cellular quiescence is characterized not only by reduced mitotic and metabolic activity but also by altered gene expression. Growing evidence suggests that quiescence is not merely a basal state but is regulated by active mechanisms. To understand the molecular programme that governs reversible cell cycle exit, we focused on quiescence-related gene expression in a culture model of myogenic cell arrest and activation. Here we report the identification of quiescence-induced genes using a gene-trap strategy. Using a retroviral vector, we generated a library of gene traps in C2C12 myoblasts that were screened for arrest-induced insertions by live cell sorting (FACS-gal). Several independent gene-trap lines revealed arrest-dependent induction of betagal activity, confirming the efficacy of the FACS screen. The locus of integration was identified in 15 lines. In three lines, insertion occurred in genes previously implicated in the control of quiescence, i.e. EMSY (a BRCA2-interacting protein), p8/com1 (a p300HAT-binding protein) and MLL5 (a SET domain protein). Our results demonstrate that expression of chromatin modulatory genes is induced in G0, providing support to the notion that this reversibly arrested state is actively regulated.

  14. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

    PubMed

    Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

    2015-10-01

    Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
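
    The abstract uses mutual information to link each TF gene to coexpressed genes but does not name the estimator. A minimal sketch of the quantity itself, assuming expression values have already been discretized into labels (for example "low"/"high" around the median), is given below. The discretization step and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(x_labels, y_labels):
    """Mutual information (in bits) between two discretized profiles."""
    n = len(x_labels)
    if n == 0 or n != len(y_labels):
        raise ValueError("profiles must be non-empty and of equal length")
    count_x = Counter(x_labels)
    count_y = Counter(y_labels)
    count_xy = Counter(zip(x_labels, y_labels))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical discretized profiles across four tumours.
tf_gene = ["low", "low", "high", "high"]
coexpressed = ["low", "low", "high", "high"]   # tracks the TF exactly
unrelated = ["low", "high", "low", "high"]     # independent of the TF
print(mutual_information(tf_gene, coexpressed))
print(mutual_information(tf_gene, unrelated))
```

    Unlike a correlation coefficient, mutual information is zero only when the two profiles are statistically independent, so it can also pick up non-linear dependence.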

  15. Identification of novel risk genes associated with type 1 diabetes mellitus using a genome-wide gene-based association analysis.

    PubMed

    Qiu, Ying-Hua; Deng, Fei-Yan; Li, Min-Jing; Lei, Shu-Feng

    2014-11-01

    Type 1 diabetes mellitus is a serious disorder characterized by destruction of pancreatic β-cells, culminating in absolute insulin deficiency. Genetic factors contribute to the susceptibility of type 1 diabetes mellitus. The aim of the present study was to identify more susceptibility genes of type 1 diabetes mellitus. We carried out an initial gene-based genome-wide association study in a total of 4,075 type 1 diabetes mellitus cases and 2,604 controls by using the Gene-based Association Test using Extended Simes procedure. Furthermore, we carried out replication studies, differential expression analysis and functional annotation clustering analysis to support the significance of the identified susceptibility genes. We identified 452 genes associated with type 1 diabetes mellitus, even after adapting the genome-wide threshold for significance (P < 9.05E-04). Among these genes, 171 were newly identified for type 1 diabetes mellitus, which were ignored in single-nucleotide polymorphism-based association analysis and were not previously reported. We found that 53 genes have supportive evidence from replication studies and/or differential expression studies. In particular, seven genes including four non-human leukocyte antigen (HLA) genes (RASIP1, STRN4, BCAR1 and MYL2) are replicated in at least one independent population and also differentially expressed in peripheral blood mononuclear cells or monocytes. Furthermore, the associated genes tend to enrich in immune-related pathways or Gene Ontology project terms. The present results suggest the high power of gene-based association analysis in detecting disease-susceptibility genes. Our findings provide more insights into the genetic basis of type 1 diabetes mellitus.
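
    The gene-based test named above builds on the Simes procedure, which turns the p-values of all SNPs in a gene into one gene-level p-value. The sketch below implements only the plain Simes combination; the extended version used by the Gene-based Association Test replaces the SNP counts with effective numbers of independent tests to account for linkage disequilibrium, which is not implemented here. The function name and example values are hypothetical.

```python
def simes_p_value(snp_p_values):
    """Plain Simes combination: min over ranks j of m * p_(j) / j."""
    ordered = sorted(snp_p_values)
    m = len(ordered)
    if m == 0:
        raise ValueError("at least one SNP p-value is required")
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical gene with three SNPs.
print(simes_p_value([0.01, 0.04, 0.03]))
```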

  16. RNA-Seq analysis and annotation of a draft blueberry genome assembly identifies candidate genes involved in fruit ripening, biosynthesis of bioactive compounds, and stage-specific alternative splicing.

    PubMed

    Gupta, Vikas; Estrada, April D; Blakley, Ivory; Reid, Rob; Patel, Ketan; Meyer, Mason D; Andersen, Stig Uggerhøj; Brown, Allan F; Lila, Mary Ann; Loraine, Ann E

    2015-01-01

    Blueberries are a rich source of antioxidants and other beneficial compounds that can protect against disease. Identifying genes involved in synthesis of bioactive compounds could enable the breeding of berry varieties with enhanced health benefits. Toward this end, we annotated a previously sequenced draft blueberry genome assembly using RNA-Seq data from five stages of berry fruit development and ripening. Genome-guided assembly of RNA-Seq read alignments combined with output from ab initio gene finders produced around 60,000 gene models, of which more than half were similar to proteins from other species, typically the grape Vitis vinifera. Comparison of gene models to the PlantCyc database of metabolic pathway enzymes identified candidate genes involved in synthesis of bioactive compounds, including bixin, an apocarotenoid with potential disease-fighting properties, and defense-related cyanogenic glycosides, which are toxic. Cyanogenic glycoside (CG) biosynthetic enzymes were highly expressed in green fruit, and a candidate CG detoxification enzyme was up-regulated during fruit ripening. Candidate genes for ethylene, anthocyanin, and 400 other biosynthetic pathways were also identified. Homology-based annotation using Blast2GO and InterPro assigned Gene Ontology terms to around 15,000 genes. RNA-Seq expression profiling showed that blueberry growth, maturation, and ripening involve dynamic gene expression changes, including coordinated up- and down-regulation of metabolic pathway enzymes and transcriptional regulators. Analysis of RNA-seq alignments identified developmentally regulated alternative splicing, promoter use, and 3' end formation. We report genome sequence, gene models, functional annotations, and RNA-Seq expression data that provide an important new resource enabling high throughput studies in blueberry.

  17. Surprisal analysis of genome-wide transcript profiling identifies differentially expressed genes and pathways associated with four growth conditions in the microalga Chlamydomonas.

    PubMed

    Bogaert, Kenny A; Manoharan-Basil, Sheeba S; Perez, Emilie; Levine, Raphael D; Remacle, Francoise; Remacle, Claire

    2018-01-01

    The usual cultivation mode of the green microalga Chlamydomonas is liquid medium and light. However, the microalga can also be grown on agar plates and in darkness. Our aim is to analyze and compare gene expression of cells cultivated in these different conditions. For that purpose, RNA-seq data are obtained from Chlamydomonas samples of two different labs grown in four environmental conditions (agar@light, agar@dark, liquid@light, liquid@dark). The RNA-seq data are analyzed by surprisal analysis, which allows the simultaneous meta-analysis of all the samples. First we identify a balance state, which defines a state where the expression levels are similar in all the samples irrespective of their growth conditions or lab origin. In addition, our analysis identifies the constraints needed to quantify the deviation with respect to the balance state. The first constraint differentiates the agar samples from the liquid ones; the second constraint differentiates the dark samples from the light ones. The two constraints are almost of equal importance. Pathways involved in stress responses are found in the agar phenotype while the liquid phenotype comprises ATP and NADH production pathways. Remodeling of membrane is suggested in the dark phenotype while photosynthetic pathways characterize the light phenotype. The same trends are also present when performing purely statistical analysis such as K-means clustering and differentially expressed genes.

  18. A P-Norm Robust Feature Extraction Method for Identifying Differentially Expressed Genes

    PubMed Central

    Liu, Jian; Liu, Jin-Xing; Gao, Ying-Lian; Kong, Xiang-Zhen; Wang, Xue-Song; Wang, Dong

    2015-01-01

    In current molecular biology, it becomes more and more important to identify differentially expressed genes closely correlated with a key biological process from gene expression data. In this paper, based on the Schatten p-norm and Lp-norm, a novel p-norm robust feature extraction method is proposed to identify the differentially expressed genes. In our method, the Schatten p-norm is used as the regularization function to obtain a low-rank matrix and the Lp-norm is taken as the error function to improve the robustness to outliers in the gene expression data. The results on simulation data show that our method can obtain higher identification accuracies than the competitive methods. Numerous experiments on real gene expression data sets demonstrate that our method can identify more differentially expressed genes than the others. Moreover, we confirmed that the identified genes are closely correlated with the corresponding gene expression data. PMID:26201006
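
    For reference, the two norms named in the abstract have the following standard definitions, where the sigma_i are the singular values of the matrix X and x_ij are its entries. The final line is only a generic illustration of how a regularization term and an error term of this kind are usually combined; the symbols D (data matrix) and lambda (trade-off weight) are assumptions, and the abstract does not give the authors' exact objective.

```latex
% Schatten p-norm (acts on singular values; promotes low rank for small p)
\|X\|_{S_p} = \Bigl( \sum_{i} \sigma_i^{\,p} \Bigr)^{1/p}

% Entry-wise L_p-norm (acts on individual entries; less sensitive to outliers for p < 2)
\|X\|_{p} = \Bigl( \sum_{i,j} |x_{ij}|^{p} \Bigr)^{1/p}

% Generic illustrative objective (assumed form, not taken from the paper)
\min_{X} \; \|D - X\|_{p}^{p} + \lambda \, \|X\|_{S_p}^{p}
```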

  19. A P-Norm Robust Feature Extraction Method for Identifying Differentially Expressed Genes.

    PubMed

    Liu, Jian; Liu, Jin-Xing; Gao, Ying-Lian; Kong, Xiang-Zhen; Wang, Xue-Song; Wang, Dong

    2015-01-01

    In current molecular biology, it becomes more and more important to identify differentially expressed genes closely correlated with a key biological process from gene expression data. In this paper, based on the Schatten p-norm and Lp-norm, a novel p-norm robust feature extraction method is proposed to identify the differentially expressed genes. In our method, the Schatten p-norm is used as the regularization function to obtain a low-rank matrix and the Lp-norm is taken as the error function to improve the robustness to outliers in the gene expression data. The results on simulation data show that our method can obtain higher identification accuracies than the competitive methods. Numerous experiments on real gene expression data sets demonstrate that our method can identify more differentially expressed genes than the others. Moreover, we confirmed that the identified genes are closely correlated with the corresponding gene expression data.

  20. Integrative strategies to identify candidate genes in rodent models of human alcoholism.

    PubMed

    Treadwell, Julie A

    2006-01-01

    The search for genes underlying alcohol-related behaviours in rodent models of human alcoholism has been ongoing for many years with only limited success. Recently, new strategies that integrate several of the traditional approaches have provided new insights into the molecular mechanisms underlying ethanol's actions in the brain. We have used alcohol-preferring C57BL/6J (B6) and alcohol-avoiding DBA/2J (D2) genetic strains of mice in an integrative strategy combining high-throughput gene expression screening, genetic segregation analysis, and mapping to previously published quantitative trait loci to uncover candidate genes for the ethanol-preference phenotype. In our study, 2 genes, retinaldehyde binding protein 1 (Rlbp1) and syntaxin 12 (Stx12), were found to be strong candidates for ethanol preference. Such experimental approaches have the power and the potential to greatly speed up the laborious process of identifying candidate genes for the animal models of human alcoholism.

  1. Identifying candidate genes for 2p15p16.1 microdeletion syndrome using clinical, genomic, and functional analysis

    PubMed Central

    Bagheri, Hani; Badduke, Chansonette; Qiao, Ying; Colnaghi, Rita; Abramowicz, Iga; Alcantara, Diana; Dunham, Christopher; Wen, Jiadi; Wildin, Robert S.; Nowaczyk, Malgorzata J.M.; Eichmeyer, Jennifer; Lehman, Anna; Maranda, Bruno; Martell, Sally; Shan, Xianghong; Lewis, Suzanne M.E.; O’Driscoll, Mark; Gregory-Evans, Cheryl Y.

    2016-01-01

    The 2p15p16.1 microdeletion syndrome has a core phenotype consisting of intellectual disability, microcephaly, hypotonia, delayed growth, common craniofacial features, and digital anomalies. So far, more than 20 cases of 2p15p16.1 microdeletion syndrome have been reported in the literature; however, the size of the deletions and their breakpoints vary, making it difficult to identify the candidate genes. Recent reports pointed to 4 genes (XPO1, USP34, BCL11A, and REL) that were included, alone or in combination, in the smallest deletions causing the syndrome. Here, we describe 8 new patients with the 2p15p16.1 deletion and review all published cases to date. We demonstrate functional deficits for the above 4 candidate genes using patients’ lymphoblast cell lines (LCLs) and knockdown of their orthologs in zebrafish. All genes were dosage sensitive on the basis of reduced protein expression in LCLs. In addition, deletion of XPO1, a nuclear exporter, cosegregated with nuclear accumulation of one of its cargo molecules (rpS5) in patients’ LCLs. Other pathways associated with these genes (e.g., NF-κB and Wnt signaling as well as the DNA damage response) were not impaired in patients’ LCLs. Knockdown of xpo1a, rel, bcl11aa, and bcl11ab resulted in abnormal zebrafish embryonic development including microcephaly, dysmorphic body, hindered growth, and small fins as well as structural brain abnormalities. Our multifaceted analysis strongly implicates XPO1, REL, and BCL11A as candidate genes for 2p15p16.1 microdeletion syndrome. PMID:27699255

  2. Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce.

    PubMed

    Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W

    1998-08-01

    The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
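
    The "degeneracy" of a primer is the number of distinct sequences it represents, obtained by multiplying the number of bases allowed at each position. A minimal sketch using the standard IUPAC nucleotide codes follows; the example primers are hypothetical and are not the primers used in the study.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_BASE_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer):
    """Return how many distinct sequences a degenerate primer encodes."""
    total = 1
    for base in primer.upper():
        if base not in IUPAC_BASE_COUNTS:
            raise ValueError(f"unknown IUPAC code: {base!r}")
        total *= IUPAC_BASE_COUNTS[base]
    return total


# Hypothetical primers: two N positions and one R position give 4 * 4 * 2.
print(primer_degeneracy("GGNGGNAAR"))
```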

  3. Separate enrichment analysis of pathways for up- and downregulated genes.

    PubMed

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

    2014-03-06

    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
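
    As a rough illustration of the two strategies compared in this record, the sketch below runs a one-sided over-representation test on upregulated genes, downregulated genes, and all DE genes pooled. The hypergeometric test is an assumed stand-in (the abstract does not name the statistic the authors used), and the gene names, pathway, and universe size are invented for the example.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, K of which belong to the pathway."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enrichment_p(de_genes, pathway, universe_size):
    """One-sided over-representation p-value of a pathway within a DE gene list."""
    de, pw = set(de_genes), set(pathway)
    return hypergeom_sf(len(de & pw), universe_size, len(pw), len(de))

# Hypothetical data: a 20-gene pathway whose DE members are all upregulated.
pathway = {f"P{i}" for i in range(20)}
up = [f"P{i}" for i in range(8)] + [f"G{i}" for i in range(42)]   # 50 up genes
down = [f"G{i}" for i in range(42, 192)]                          # 150 down genes
universe_size = 1000

p_up = enrichment_p(up, pathway, universe_size)
p_down = enrichment_p(down, pathway, universe_size)
p_all = enrichment_p(up + down, pathway, universe_size)
print(f"up: {p_up:.3g}  down: {p_down:.3g}  pooled: {p_all:.3g}")
```

    In this toy case the pathway signal sits entirely in the upregulated list, so pooling it with a large unrelated downregulated list dilutes the overlap and weakens the p-value, which is the imbalance effect the abstract describes.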

  4. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    PubMed

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  5. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    PubMed Central

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4−/− mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases. PMID:25644381

  6. ChIP-Seq Analysis for Identifying Genome-Wide Histone Modifications Associated with Stress-Responsive Genes in Plants.

    PubMed

    Li, Guosheng; Jagadeeswaran, Guru; Mort, Andrew; Sunkar, Ramanjulu

    2017-01-01

    Histone modifications represent the crux of epigenetic gene regulation essential for most biological processes including abiotic stress responses in plants. Thus, identification of histone modifications at the genome scale can provide clues as to how some genes are "turned on" while others are "turned off" in response to stress. This chapter details a step-by-step protocol for identifying genome-wide histone modifications associated with stress-responsive gene regulation using chromatin immunoprecipitation (ChIP) followed by sequencing of the DNA (ChIP-seq).

  7. MADGiC: a model-based approach for identifying driver genes in cancer

    PubMed Central

    Korthauer, Keegan D.; Kendziorski, Christina

    2015-01-01

    Motivation: Identifying and prioritizing somatic mutations is an important and challenging area of cancer research that can provide new insights into gene function as well as new targets for drug development. Most methods for prioritizing mutations rely primarily on frequency-based criteria, where a gene is identified as having a driver mutation if it is altered in significantly more samples than expected according to a background model. Although useful, frequency-based methods are limited in that all mutations are treated equally. It is well known, however, that some mutations have no functional consequence, while others may have a major deleterious impact. The spatial pattern of mutations within a gene provides further insight into their functional consequence. Properly accounting for these factors improves both the power and accuracy of inference. Also important is an accurate background model. Results: Here, we develop a Model-based Approach for identifying Driver Genes in Cancer (termed MADGiC) that incorporates both frequency and functional impact criteria and accommodates a number of factors to improve the background model. Simulation studies demonstrate advantages of the approach, including a substantial increase in power over competing methods. Further advantages are illustrated in an analysis of ovarian and lung cancer data from The Cancer Genome Atlas (TCGA) project. Availability and implementation: R code to implement this method is available at http://www.biostat.wisc.edu/~kendzior/MADGiC/. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25573922

  8. Genomics Analysis of Genes Expressed in Maize Endosperm Identifies Novel Seed Proteins and Clarifies Patterns of Zein Gene Expression

    PubMed Central

    Woo, Young-Min; Hu, David Wang-Nan; Larkins, Brian A.; Jung, Rudolf

    2001-01-01

    We analyzed cDNA libraries from developing endosperm of the B73 maize inbred line to evaluate the expression of storage protein genes. This study showed that zeins are by far the most highly expressed genes in the endosperm, but we found an inverse relationship between the number of zein genes and the relative amount of specific mRNAs. Although α-zeins are encoded by large multigene families, only a few of these genes are transcribed at high or detectable levels. In contrast, relatively small gene families encode the γ- and δ-zeins, and members of these gene families, especially the γ-zeins, are highly expressed. Knowledge of expressed storage protein genes allowed the development of DNA and antibody probes that distinguish between closely related gene family members. Using in situ hybridization, we found differences in the temporal and spatial expression of the α-, γ-, and δ-zein gene families, which provides evidence that γ-zeins are synthesized throughout the endosperm before α- and δ-zeins. This observation is consistent with earlier studies that suggested that γ-zeins play an important role in prolamin protein body assembly. Analysis of endosperm cDNAs also revealed several previously unidentified proteins, including a 50-kD γ-zein, an 18-kD α-globulin, and a legumin-related protein. Immunolocalization of the 50-kD γ-zein showed this protein to be located at the surface of prolamin-containing protein bodies, similar to other γ-zeins. The 18-kD α-globulin, however, is deposited in novel, vacuole-like organelles that were not described previously in maize endosperm. PMID:11595803

  9. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    PubMed

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  10. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    NASA Astrophysics Data System (ADS)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed a bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and is a bacterium whose gelatinase activity is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes specific to each of these two species of meat. The main focus of this research is to identify the gelatinase-encoding gene in L. sphaericus using bioinformatics analysis of a partially sequenced genome. Three candidate genes were identified: first, gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size and encodes a 520-amino-acid sequence; second, gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size and encodes a 591-amino-acid sequence; and third, gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. The bioinformatics analysis of L. sphaericus thus identified genes encoding gelatinase.

  11. The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    PubMed Central

    Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth

    2006-01-01

    Background Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. 

  12. Analysis of blood-based gene expression in idiopathic Parkinson disease.

    PubMed

    Shamir, Ron; Klein, Christine; Amar, David; Vollstedt, Eva-Juliane; Bonin, Michael; Usenovic, Marija; Wong, Yvette C; Maver, Ales; Poths, Sven; Safer, Hershel; Corvol, Jean-Christophe; Lesage, Suzanne; Lavi, Ofer; Deuschl, Günther; Kuhlenbaeumer, Gregor; Pawlack, Heike; Ulitsky, Igor; Kasten, Meike; Riess, Olaf; Brice, Alexis; Peterlin, Borut; Krainc, Dimitri

    2017-10-17

    To examine whether gene expression analysis of a large-scale Parkinson disease (PD) patient cohort produces a robust blood-based PD gene signature compared to previous studies that have used relatively small cohorts (≤220 samples). Whole-blood gene expression profiles were collected from a total of 523 individuals. After preprocessing, the data contained 486 gene profiles (n = 205 PD, n = 233 controls, n = 48 other neurodegenerative diseases) that were partitioned into training, validation, and independent test cohorts to identify and validate a gene signature. Batch-effect reduction and cross-validation were performed to ensure signature reliability. Finally, functional and pathway enrichment analyses were applied to the signature to identify PD-associated gene networks. A gene signature of 100 probes that mapped to 87 genes, corresponding to 64 upregulated and 23 downregulated genes differentiating between patients with idiopathic PD and controls, was identified with the training cohort and successfully replicated in both an independent validation cohort (area under the curve [AUC] = 0.79, p = 7.13E-6) and a subsequent independent test cohort (AUC = 0.74, p = 4.2E-4). Network analysis of the signature revealed gene enrichment in pathways, including metabolism, oxidation, and ubiquitination/proteasomal activity, and misregulation of mitochondria-localized genes, including downregulation of COX4I1, ATP5A1, and VDAC3. We present a large-scale study of PD gene expression profiling. This work identifies a reliable blood-based PD signature and highlights the importance of large-scale patient cohorts in developing potential PD biomarkers. © 2017 American Academy of Neurology.

  13. An elm EST database for identifying leaf beetle egg-induced defense genes

    PubMed Central

    2012-01-01

    Background Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Results Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and

  14. An elm EST database for identifying leaf beetle egg-induced defense genes.

    PubMed

    Büchel, Kerstin; McDowell, Eric; Nelson, Will; Descour, Anne; Gershenzon, Jonathan; Hilker, Monika; Soderlund, Carol; Gang, David R; Fenning, Trevor; Meiners, Torsten

    2012-06-15

    Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism.

  15. Network-based integration of GWAS and gene expression identifies a HOX-centric network associated with serous ovarian cancer risk

    PubMed Central

    Kar, Siddhartha P.; Tyrer, Jonathan P.; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie T.; Beckmann, Matthias W.; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F.; Edwards, Robert P.; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K.; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K.; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston-Campbell, Lara E.; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Monteiro, Alvaro N. A.; Freedman, Matthew L.; Gayther, Simon A.; Pharoah, Paul D. P.

    2015-01-01

    Background Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by co-expression may also be enriched for additional EOC risk associations. Methods We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly co-expressed with each selected TF gene in the unified microarray data set of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this data set were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Results Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P<0.05 and FDR<0.05). These results were replicated (P<0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. Conclusion We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Impact Network analysis integrating large, context-specific data sets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. PMID:26209509
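
    The co-expression step in this record relies on mutual information between expression profiles. The sketch below is a minimal plug-in estimate on equal-width bins, written only to show what the quantity measures; the binning scheme, bin count, and function names are assumptions, and the abstract does not say which estimator the authors used.

```python
from collections import Counter
from math import log2

def discretize(values, bins=3):
    """Map continuous expression values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum p(a,b) * log2( p(a,b) / (p(a) * p(b)) ) over observed pairs.
    return sum((c / n) * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())

# Hypothetical expression profiles of a TF gene and a candidate target gene.
tf_expr = [1.0, 1.2, 5.0, 5.3, 9.1, 9.4]
target_expr = [0.2, 0.3, 2.1, 2.0, 4.2, 4.4]
mi = mutual_information(discretize(tf_expr), discretize(target_expr))
print(f"MI = {mi:.3f} bits")
```

    Unlike a linear correlation coefficient, this quantity is zero only when the two discretized profiles are statistically independent, which is why it is often used to build co-expression networks.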

  16. Gene Expression Profile Analysis is Directly Affected by the Selected Reference Gene: The Case of Leaf-Cutting Atta Sexdens

    PubMed Central

    Máximo, Wesley P. F.; Zanetti, Ronald; Paiva, Luciano V.

    2018-01-01

    Although several ant species are important targets for the development of molecular control strategies, only a few studies focus on identifying and validating reference genes for quantitative reverse transcription polymerase chain reaction (RT-qPCR) data normalization. We provide here an extensive study to identify and validate suitable reference genes for gene expression analysis in the ant Atta sexdens, a threatening agricultural pest in South America. The optimal number of reference genes varies according to each sample, and the results generated by RefFinder differed on which reference gene is the most suitable. Results suggest that the RPS16, NADH and SDHB genes were the best reference genes in the sample pool according to stability values. The SNF7 gene expression pattern was stable in all evaluated sample sets. In contrast, when less stable reference genes were used for normalization, a large variability in SNF7 gene expression was recorded. There is no universal reference gene suitable for all conditions under analysis, since these genes can also participate in different cellular functions, thus requiring a systematic validation of possible reference genes for each specific condition. The effect of reference gene choice on SNF7 gene normalization confirmed that unstable reference genes might drastically change the expression profile analysis of target candidate genes. PMID:29419794

  17. Profiling analysis of FOX gene family members identified FOXE1 as potential regulator of NSCLC development.

    PubMed

    Ji, G H; Cui, Y; Yu, H; Cui, X B

    2016-09-30

    Lung cancer is one of the most malignant tumors worldwide, with a high mortality rate that has not improved for several decades. FOX gene family members have been reported to play extensive roles in regulating many biological processes and disorders. In order to clarify the contribution of FOX gene family members in lung cancer biology, we performed expression profiling analysis of FOX gene family members from FOXA to FOXR in lung cancer cell lines and tissue specimens by real-time PCR, western blot and immunohistochemistry analysis. We found that FOXE1 was the only gene which was over-expressed in six out of eight lung cancer cell lines and human cancer tissue specimens (28 out of 35 cases with higher expression and 7 out of 35 cases with moderate expression). Further investigation showed that the MMP2 gene was up-regulated, and autophagy markers such as LC3B, ATG5, ATG12 and BECLIN1 were down-regulated, concomitant with the increase of FOXE1. These results suggest that FOXE1 may be an important regulator of lung cancer development by targeting autophagy and MMP pathways.

  18. Down-weighting overlapping genes improves gene set analysis

    PubMed Central

    2012-01-01

    Background The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org. PMID:22713124
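
    The scoring idea in this record (a mean of absolute weighted t-scores, with genes that appear in many gene sets down-weighted) can be sketched as follows. This is an illustration only and not the PADOG R package: the weight formula is an assumed choice that maps the most frequent gene to weight 1 and the rarest to weight 2, the t-scores are invented rather than moderated statistics from real data, and any later standardization or significance testing is omitted.

```python
from collections import Counter
from math import sqrt

def gene_weights(gene_sets):
    """Assumed weighting: genes found in few gene sets get weights near 2,
    genes found in many gene sets get weights near 1."""
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}

def gene_set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over the genes that were measured."""
    vals = [abs(weights[g] * t_scores[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0

# Hypothetical gene sets and t-scores; g2 is shared by all three sets.
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": 1.0}

weights = gene_weights(gene_sets)
scores = {name: gene_set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
print(weights)
print(scores)
```

    Here the shared gene g2 keeps its raw t-score while the set-specific genes count double, so a set is ranked more by the genes that are unique to it.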

  19. Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening.

    PubMed

    Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A; Pacheco-Sanchez, Magda A; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N; Islas-Osuna, Maria A

    2015-01-01

    Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. "Kent" was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads and a transcriptome of 4.8 Gb were obtained. A total of 33,142 coding sequences were predicted and, after functional annotation, 25,154 protein sequences were assigned a product according to the Swiss-Prot database and 32,560 according to the non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis showed significantly represented terms associated with fruit ripening, like "cell wall," "carbohydrate catabolic process" and "starch and sucrose metabolic process," among others. Mango genes were assigned to 327 metabolic pathways according to the Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening.

  20. High-Throughput Effect-Directed Analysis Using Downscaled in Vitro Reporter Gene Assays To Identify Endocrine Disruptors in Surface Water

    PubMed Central

    2018-01-01

    Effect-directed analysis (EDA) is a commonly used approach for effect-based identification of endocrine disruptive chemicals in complex (environmental) mixtures. However, for routine toxicity assessment of, for example, water samples, current EDA approaches are considered time-consuming and laborious. We achieved faster EDA and identification by downscaling of sensitive cell-based hormone reporter gene assays and increasing fractionation resolution to allow testing of smaller fractions with reduced complexity. The high-resolution EDA approach is demonstrated by analysis of four environmental passive sampler extracts. Downscaling of the assays to a 384-well format allowed analysis of 64 fractions in triplicate (or 192 fractions without technical replicates) without affecting sensitivity compared to the standard 96-well format. Through a parallel exposure method, agonistic and antagonistic androgen and estrogen receptor activity could be measured in a single experiment following a single fractionation. From 16 selected candidate compounds, identified through nontargeted analysis, 13 could be confirmed chemically and 10 were found to be biologically active, of which the most potent nonsteroidal estrogens were identified as oxybenzone and piperine. The increased fractionation resolution and the higher throughput that downscaling provides allow for future application in routine high-resolution screening of large numbers of samples in order to accelerate identification of (emerging) endocrine disruptors. PMID:29547277

  1. A Genome-wide CRISPR Screen in Toxoplasma Identifies Essential Apicomplexan Genes.

    PubMed

    Sidik, Saima M; Huet, Diego; Ganesan, Suresh M; Huynh, My-Hang; Wang, Tim; Nasamu, Armiyaw S; Thiru, Prathapan; Saeij, Jeroen P J; Carruthers, Vern B; Niles, Jacquin C; Lourido, Sebastian

    2016-09-08

    Apicomplexan parasites are leading causes of human and livestock diseases such as malaria and toxoplasmosis, yet most of their genes remain uncharacterized. Here, we present the first genome-wide genetic screen of an apicomplexan. We adapted CRISPR/Cas9 to assess the contribution of each gene from the parasite Toxoplasma gondii during infection of human fibroblasts. Our analysis defines ∼200 previously uncharacterized, fitness-conferring genes unique to the phylum, from which 16 were investigated, revealing essential functions during infection of human cells. Secondary screens identify as an invasion factor the claudin-like apicomplexan microneme protein (CLAMP), which resembles mammalian tight-junction proteins and localizes to secretory organelles, making it critical to the initiation of infection. CLAMP is present throughout sequenced apicomplexan genomes and is essential during the asexual stages of the malaria parasite Plasmodium falciparum. These results provide broad-based functional information on T. gondii genes and will facilitate future approaches to expand the horizon of antiparasitic interventions. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  3. Multiscale Embedded Gene Co-expression Network Analysis.

    PubMed

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  4. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia

    2014-08-28

    The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional

  5. Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.

    PubMed

    Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun

    2017-12-21

    Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. Finally, we found biological evidence showing that the gene pairs which are not molecularly interacting but dynamically influential can be considered for novel gene-gene
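
    The central idea of the record above (measuring how much a knockout of a source gene changes the state trajectory of a target gene in a Boolean network) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the three-gene network, the synchronous update scheme, the fixed number of steps, and the "fraction of differing states" measure are all hypothetical choices and are not the authors' exact GDI definition.

```python
from itertools import product


def step(state, rules, knocked_out=None):
    """One synchronous update; a knocked-out gene is held at 0."""
    nxt = {gene: int(rule(state)) for gene, rule in rules.items()}
    if knocked_out is not None:
        nxt[knocked_out] = 0
    return nxt


def trajectory(init, rules, steps, knocked_out=None):
    """List of network states from the initial state over `steps` updates."""
    state = dict(init)
    if knocked_out is not None:
        state[knocked_out] = 0
    states = [state]
    for _ in range(steps):
        state = step(state, rules, knocked_out)
        states.append(state)
    return states


def influence(source, target, rules, steps=5):
    """Fraction of (initial state, time point) pairs at which the target's
    value differs between the wild-type and source-knockout trajectories."""
    genes = sorted(rules)
    differing = total = 0
    for bits in product([0, 1], repeat=len(genes)):
        init = dict(zip(genes, bits))
        wild_type = trajectory(init, rules, steps)
        knockout = trajectory(init, rules, steps, knocked_out=source)
        for wt_state, ko_state in zip(wild_type, knockout):
            differing += wt_state[target] != ko_state[target]
            total += 1
    return differing / total


# Hypothetical chain A -> B -> C: A and C never interact directly.
rules = {
    "A": lambda s: s["A"],  # A keeps its own value
    "B": lambda s: s["A"],  # B copies A
    "C": lambda s: s["B"],  # C copies B
}

print("A on C:", influence("A", "C", rules))
print("C on A:", influence("C", "A", rules))
```

    In this toy chain, knocking out A changes the trajectory of C even though the two genes share no direct interaction, while knocking out C leaves A untouched. That asymmetry is the kind of directed, indirect relation the abstract describes.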

  6. A Penalized Robust Method for Identifying Gene-Environment Interactions

    PubMed Central

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Xie, Yang; Ma, Shuangge

    2015-01-01

    In high-throughput studies, an important objective is to identify gene-environment interactions associated with disease outcomes and phenotypes. Many commonly adopted methods assume specific parametric or semiparametric models, which may be subject to model mis-specification. In addition, they usually use significance level as the criterion for selecting important interactions. In this study, we adopt the rank-based estimation, which is much less sensitive to model specification than some of the existing methods and includes several commonly encountered data and models as special cases. Penalization is adopted for the identification of gene-environment interactions. It achieves simultaneous estimation and identification and does not rely on significance level. For computation feasibility, a smoothed rank estimation is further proposed. Simulation shows that under certain scenarios, for example with contaminated or heavy-tailed data, the proposed method can significantly outperform the existing alternatives with more accurate identification. We analyze a lung cancer prognosis study with gene expression measurements under the AFT (accelerated failure time) model. The proposed method identifies interactions different from those using the alternatives. Some of the identified genes have important implications. PMID:24616063

  7. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-08

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. De novo leaf and root transcriptome analysis to identify putative genes involved in triterpenoid saponins biosynthesis in Hedera helix L.

    PubMed Central

    Li, Fang; Xu, Zijian; Sun, Mengli; Cong, Hanqing; Qiao, Fei; Zhong, Xiaohong

    2017-01-01

    Hedera helix L. is an important traditional medicinal plant in Europe. The main active components are triterpenoid saponins, but none of the potential enzymes involved in triterpenoid saponins biosynthesis have been discovered and annotated. Here is reported the first study of global transcriptome analyses using the Illumina HiSeq™ 2500 platform for H. helix. In total, over 24 million clean reads were produced and 96,333 unigenes were assembled, with an average length of 1385 nt; more than 79,085 unigenes had at least one significant match to an existing gene model. Differentially Expressed Gene analysis identified 6,222 and 7,012 unigenes which were expressed either higher or lower in leaf samples when compared with roots. After functional annotation and classification, two pathways and 410 unigenes related to triterpenoid saponins biosynthesis were discovered. The accuracy of these de novo sequences was validated by RT-qPCR analysis and a RACE clone. These data will enrich our knowledge of triterpenoid saponin biosynthesis and provide a theoretical foundation for molecular research on H. helix. PMID:28771546

  9. Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

    PubMed Central

    Li, Zhao; Hu, Guanghui; Liu, Xiangfeng; Zhou, Yao; Li, Yu; Zhang, Xu; Yuan, Xiaohui; Zhang, Qian; Yang, Deguang; Wang, Tianyu; Zhang, Zhiwu

    2016-01-01

    Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels—highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in maize whole genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or lines under freezing condition. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-Seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying maize early freezing response and enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential. PMID:27774095

  10. Integrated network analysis identifies fight-club nodes as a class of hubs encompassing key putative switch genes that induce major transcriptome reprogramming during grapevine development.

    PubMed

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-12-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.

  11. Transposon mutagenesis identifies genes and cellular processes driving epithelial-mesenchymal transition in hepatocellular carcinoma

    PubMed Central

    Kodama, Takahiro; Newberg, Justin Y.; Kodama, Michiko; Rangel, Roberto; Yoshihara, Kosuke; Tien, Jean C.; Parsons, Pamela H.; Wu, Hao; Finegold, Milton J.; Copeland, Neal G.; Jenkins, Nancy A.

    2016-01-01

    Epithelial-mesenchymal transition (EMT) is thought to contribute to metastasis and chemoresistance in patients with hepatocellular carcinoma (HCC), leading to their poor prognosis. The genes driving EMT in HCC are not yet fully understood, however. Here, we show that mobilization of Sleeping Beauty (SB) transposons in immortalized mouse hepatoblasts induces mesenchymal liver tumors on transplantation to nude mice. These tumors show significant down-regulation of epithelial markers, along with up-regulation of mesenchymal markers and EMT-related transcription factors (EMT-TFs). Sequencing of transposon insertion sites from tumors identified 233 candidate cancer genes (CCGs) that were enriched for genes and cellular processes driving EMT. Subsequent trunk driver analysis identified 23 CCGs that are predicted to function early in tumorigenesis and whose mutation or alteration in patients with HCC is correlated with poor patient survival. Validation of the top trunk drivers identified in the screen, including MET (MET proto-oncogene, receptor tyrosine kinase), GRB2-associated binding protein 1 (GAB1), HECT, UBA, and WWE domain containing 1 (HUWE1), lysine-specific demethylase 6A (KDM6A), and protein-tyrosine phosphatase, nonreceptor-type 12 (PTPN12), showed that deregulation of these genes activates an EMT program in human HCC cells that enhances tumor cell migration. Finally, deregulation of these genes in human HCC was found to confer sorafenib resistance through apoptotic tolerance and reduced proliferation, consistent with recent studies showing that EMT contributes to the chemoresistance of tumor cells. Our unique cell-based transposon mutagenesis screen appears to be an excellent resource for discovering genes involved in EMT in human HCC and potentially for identifying new drug targets. PMID:27247392

  12. MAVTgsa: An R Package for Gene Set (Enrichment) Analysis

    DOE PAGES

    Chien, Chih-Yi; Chang, Ching-Wei; Tsai, Chen-An; ...

    2014-01-01

    Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both up- and downregulation for studying two or more experimental conditions. (3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.

  13. Predicting hepatocellular carcinoma through cross-talk genes identified by risk pathways

    PubMed Central

    Shao, Zhuo; Huo, Diwei; Zhang, Denan; Xie, Hongbo; Yang, Jingbo; Liu, Qiuqi; Chen, Xiujie

    2018-01-01

    Hepatocellular carcinoma (HCC) is the most frequent type of liver cancer, with a poor survival rate and high mortality. Despite efforts to elucidate the mechanism of HCC, new molecular markers are needed for exact diagnosis, evaluation and treatment. Here, we combined the transcriptome of HCC with networks and pathways to identify reliable molecular markers. Through integrating 249 differentially expressed genes with syncretic protein interaction networks, we constructed an HCC-specific network, from which we further extracted 480 pivotal genes. Based on the cross-talk between the enriched pathways of the pivotal genes, we finally identified an HCC signature of 45 genes, which could accurately distinguish HCC patients from normal individuals and reveal the prognosis of HCC patients. Among these 45 genes, 15 showed dysregulated expression patterns and some have been reported to be associated with HCC and/or other cancers. These findings suggested that our identified 45-gene signature could provide potential and valuable molecular markers for diagnosis and evaluation of HCC. PMID:29765536

  14. Inferring Gene Family Histories in Yeast Identifies Lineage Specific Expansions

    PubMed Central

    Ames, Ryan M.; Money, Daniel; Lovell, Simon C.

    2014-01-01

    The complement of genes found in the genome is a balance between gene gain and gene loss. Knowledge of the specific genes that are gained and lost over evolutionary time allows an understanding of the evolution of biological functions. Here we use new evolutionary models to infer gene family histories across complete yeast genomes; these models allow us to estimate the relative genome-wide rates of gene birth, death, innovation and extinction (loss of an entire family) for the first time. We show that the rates of gene family evolution vary both between gene families and between species. We are also able to identify those families that have experienced rapid lineage specific expansion/contraction and show that these families are enriched for specific functions. Moreover, we find that families with specific functions are repeatedly expanded in multiple species, suggesting the presence of common adaptations and that these family expansions/contractions are not random. Additionally, we identify potential specialisations, unique to specific species, in the functions of lineage specific expanded families. These results suggest that an important mechanism in the evolution of genome content is the presence of lineage-specific gene family changes. PMID:24921666

  15. Gene Network for Identifying the Entropy Changes of Different Modules in Pediatric Sepsis.

    PubMed

    Yang, Jing; Zhang, Pingli; Wang, Lumin

    2016-01-01

    Pediatric sepsis is a disease that threatens the lives of children. The incidence of pediatric sepsis is higher in developing countries due to various reasons, such as insufficient immunization and nutrition, water and air pollution, etc. Exploring the potential genes via different methods is of significance for the prevention and treatment of pediatric sepsis. This study aimed to identify potential genes associated with pediatric sepsis utilizing analysis of gene network and entropy. The mRNA expression in the blood samples collected from 20 septic children and 30 healthy controls was quantified by using Affymetrix HG-U133A microarray. Two condition-specific protein-protein interaction networks (PINs), one for the healthy controls and the other for the children with sepsis, were deduced by combining the fundamental human PINs with gene expression profiles in the two phenotypes. Subsequently, distinct modules from the two conditional networks were extracted by adopting a maximal clique-merging approach. Delta entropy (ΔS) was calculated between sepsis and control modules. Then, key genes displaying changes in gene composition were identified by matching the control and sepsis modules. Two objective modules were obtained, in which ribosomal proteins RPL4 and RPL9 as well as TOP2A were probably considered as the key genes differentiating sepsis from healthy controls. According to previous reports and this work, TOP2A is a potential gene therapy target for pediatric sepsis. The relationship between pediatric sepsis and RPL4 and RPL9 needs further investigation. © 2016 The Author(s) Published by S. Karger AG, Basel.

  16. Mapping eQTLs in the Norfolk Island Genetic Isolate Identifies Candidate Genes for CVD Risk Traits

    PubMed Central

    Benton, Miles C.; Lea, Rod A.; Macartney-Coxson, Donia; Carless, Melanie A.; Göring, Harald H.; Bellis, Claire; Hanna, Michelle; Eccles, David; Chambers, Geoffrey K.; Curran, Joanne E.; Harper, Jacquie L.; Blangero, John; Griffiths, Lyn R.

    2013-01-01

    Cardiovascular disease (CVD) affects millions of people worldwide and is influenced by numerous factors, including lifestyle and genetics. Expression quantitative trait loci (eQTLs) influence gene expression and are good candidates for CVD risk. Founder-effect pedigrees can provide additional power to map genes associated with disease risk. Therefore, we identified eQTLs in the genetic isolate of Norfolk Island (NI) and tested for associations between these and CVD risk factors. We measured genome-wide transcript levels of blood lymphocytes in 330 individuals and used pedigree-based heritability analysis to identify heritable transcripts. eQTLs were identified by genome-wide association testing of these transcripts. Testing for association between CVD risk factors (i.e., blood lipids, blood pressure, and body fat indices) and eQTLs revealed 1,712 heritable transcripts (p < 0.05) with heritability values ranging from 0.18 to 0.84. From these, we identified 200 cis-acting and 70 trans-acting eQTLs (p < 1.84 × 10^−7). An eQTL-centric analysis of CVD risk traits revealed multiple associations, including 12 previously associated with CVD-related traits. Trait versus eQTL regression modeling identified four CVD risk candidates (NAAA, PAPSS1, NME1, and PRDX1), all of which have known biological roles in disease. In addition, we implicated several genes previously associated with CVD risk traits, including MTHFR and FN3KRP. We have successfully identified a panel of eQTLs in the NI pedigree and used this to implicate several genes in CVD risk. Future studies are required for further assessing the functional importance of these eQTLs and whether the findings here also relate to outbred populations. PMID:24314549

  17. QTL and gene expression analyses identify genes affecting carcass weight and marbling on BTA14 in Hanwoo (Korean Cattle).

    PubMed

    Lee, Seung Hwan; van der Werf, J H J; Kim, Nam Kuk; Lee, Sang Hong; Gondro, C; Park, Eung Woo; Oh, Sung Jong; Gibson, J P; Thompson, J M

    2011-10-01

    Causal mutations affecting quantitative trait variation can be good targets for marker-assisted selection for carcass traits in beef cattle. In this study, linkage and linkage disequilibrium analysis (LDLA) for four carcass traits was undertaken using 19 markers on bovine chromosome 14. The LDLA detected quantitative trait loci (QTL) for carcass weight (CWT) and eye muscle area (EMA) at the same position at around 50 cM, flanked by the markers FABP4SNP2774C>G and FABP4_μsat3237. The QTL for marbling (MAR) was identified at the midpoint of markers BMS4513 and RM137 in a 3.5-cM marker interval. The most likely position for a second QTL for CWT was found at the midpoint of the tenth marker bracket (FABP4SNP2774C>G and FABP4_μsat3237). For this marker bracket, the total number of haplotypes was 34, with the most common haplotype at a frequency of 0.118. Effects of haplotypes on CWT varied from a -5-kg deviation for haplotype 6 to +8 kg for haplotype 23. To determine which genes contribute to the QTL effect, gene expression analysis was performed in muscle for a wide range of phenotypes. The results demonstrate that two genes, LOC781182 (p = 0.002) and TRPS1 (p = 0.006), were upregulated with increasing CWT and EMA, whereas only LOC614744 (p = 0.04) had a significant effect on intramuscular fat (IMF) content. The two genetic markers detected in FABP4 marked the most likely QTL position in this study, but FABP4 did not show a significant effect on either trait (CWT or EMA) in gene expression analysis. We conclude that three genes could be potential causal genes affecting the carcass traits CWT, EMA, and IMF in Hanwoo.

  18. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool

    PubMed Central

    Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi

    2016-01-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not previously been assessed, nor has it been implemented as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405

  19. The genetics of alcoholism: identifying specific genes through family studies.

    PubMed

    Edenberg, Howard J; Foroud, Tatiana

    2006-09-01

    Alcoholism is a complex disorder with both genetic and environmental risk factors. Studies in humans have begun to elucidate the genetic underpinnings of the risk for alcoholism. Here we briefly review strategies for identifying individual genes in which variations affect the risk for alcoholism and related phenotypes, in the context of one large study that has successfully identified such genes. The Collaborative Study on the Genetics of Alcoholism (COGA) is a family-based study that has collected detailed phenotypic data on individuals in families with multiple alcoholic members. A genome-wide linkage approach led to the identification of chromosomal regions containing genes that influenced alcoholism risk and related phenotypes. Subsequently, single nucleotide polymorphisms (SNPs) were genotyped in positional candidate genes located within the linked chromosomal regions, and analyzed for association with these phenotypes. Using this sequential approach, COGA has detected association with GABRA2, CHRM2 and ADH4; these associations have all been replicated by other researchers. COGA has detected association to additional genes including GABRG3, TAS2R16, SNCA, OPRK1 and PDYN, results that are awaiting confirmation. These successes demonstrate that genes contributing to the risk for alcoholism can be reliably identified using human subjects.

  20. Deep sequencing analysis of the transcriptomes of peanut aerial and subterranean young pods identifies candidate genes related to early embryo abortion.

    PubMed

    Chen, Xiaoping; Zhu, Wei; Azam, Sarwar; Li, Heying; Zhu, Fanghe; Li, Haifen; Hong, Yanbin; Liu, Haiyan; Zhang, Erhua; Wu, Hong; Yu, Shanlin; Zhou, Guiyuan; Li, Shaoxiong; Zhong, Ni; Wen, Shijie; Li, Xingyu; Knapp, Steve J; Ozias-Akins, Peggy; Varshney, Rajeev K; Liang, Xuanqiang

    2013-01-01

    The failure of peg penetration into the soil leads to seed abortion in peanut. Knowledge of genes involved in these processes is comparatively deficient. Here, we used RNA-seq to gain insights into transcriptomes of aerial and subterranean pods. More than 2 million transcript reads with an average length of 396 bp were generated from one aerial (AP) and two subterranean (SP1 and SP2) pod libraries using pyrosequencing technology. After assembly, sets of 49 632, 49 952 and 50 494 from a total of 74 974 transcript assembly contigs (TACs) were identified in AP, SP1 and SP2, respectively. A clear linear relationship in the gene expression level was observed between these data sets. In brief, 2194 differentially expressed TACs with a 99.0% true-positive rate were identified, among which 859 and 1068 TACs were up-regulated in aerial and subterranean pods, respectively. Functional analysis showed that putative function based on similarity with proteins catalogued in UniProt and gene ontology term classification could be determined for 59 342 (79.2%) and 42 955 (57.3%) TACs, respectively. A total of 2968 TACs were mapped to 174 KEGG pathways, of which 168 were shared by aerial and subterranean transcriptomes. TACs involved in photosynthesis were significantly up-regulated and enriched in the aerial pod. In addition, two senescence-associated genes were identified as significantly up-regulated in the aerial pod, which potentially contribute to embryo abortion in aerial pods, and in turn, to cessation of swelling. The data set generated in this study provides evidence for some functional genes as robust candidates underlying aerial and subterranean pod development and contributes to an elucidation of the evolutionary implications resulting from fruit development under light and dark conditions. © 2012 The Authors Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.

  1. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

    PubMed

    Hamilton, Natasha A; Tammen, Imke; Raadsma, Herman W

    2013-01-01

    Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

  2. Multi-Species Comparative Analysis of the Equine ACE Gene Identifies a Highly Conserved Potential Transcription Factor Binding Site in Intron 16

    PubMed Central

    Hamilton, Natasha A.; Tammen, Imke; Raadsma, Herman W.

    2013-01-01

    Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism. PMID:23408978

  3. Identifying the Viral Genes Encoding Envelope Glycoproteins for Differentiation of Cyprinid herpesvirus 3 Isolates

    PubMed Central

    Han, Jee Eun; Kim, Ji Hyung; Renault, Tristan; Choresca, Casiano; Shin, Sang Phil; Jun, Jin Woo; Park, Se Chang

    2013-01-01

    Cyprinid herpesvirus 3 (CyHV-3) disease has been reported around the world and is associated with high mortality in koi (Cyprinus carpio). Although little work has been conducted on the molecular analysis of this virus, the glycoprotein genes identified in the present study appear to be valuable targets for genetic comparison. Three envelope glycoprotein genes (ORF25, ORF65 and ORF116) of CyHV-3 isolates from the USA, Israel, Japan and Korea were compared, and sequence insertions or deletions were observed in these target regions. In addition, polymorphisms were present in microsatellite zones of two glycoprotein genes (ORF65 and ORF116). In phylogenetic tree analysis, the Korean isolate was clearly distinguished from the USA, Israel and Japan isolates. These findings may be useful for applications including isolate differentiation and phylogenetic studies. PMID:23435236

  4. Identifying the viral genes encoding envelope glycoproteins for differentiation of Cyprinid herpesvirus 3 isolates.

    PubMed

    Han, Jee Eun; Kim, Ji Hyung; Renault, Tristan; Choresca, Casiano; Shin, Sang Phil; Jun, Jin Woo; Park, Se Chang

    2013-01-31

    Cyprinid herpesvirus 3 (CyHV-3) disease has been reported around the world and is associated with high mortality in koi (Cyprinus carpio). Although little work has been conducted on the molecular analysis of this virus, the glycoprotein genes identified in the present study appear to be valuable targets for genetic comparison. Three envelope glycoprotein genes (ORF25, ORF65 and ORF116) of CyHV-3 isolates from the USA, Israel, Japan and Korea were compared, and sequence insertions or deletions were observed in these target regions. In addition, polymorphisms were present in microsatellite zones of two glycoprotein genes (ORF65 and ORF116). In phylogenetic tree analysis, the Korean isolate was clearly distinguished from the USA, Israel and Japan isolates. These findings may be useful for applications including isolate differentiation and phylogenetic studies.

  5. Gene expression in bovine rumen epithelium during weaning identifies molecular regulators of rumen development and growth.

    PubMed

    Connor, Erin E; Baldwin, Ransom L; Li, Cong-jun; Li, Robert W; Chung, Hoyoung

    2013-03-01

    During weaning, epithelial cell function in the rumen transitions in response to conversion from a pre-ruminant to a true ruminant environment to ensure efficient nutrient absorption and metabolism. To identify gene networks affected by weaning in bovine rumen, Holstein bull calves were fed commercial milk replacer only (MRO) until 42 days of age, then were provided diets of either milk + orchardgrass hay (MH) or milk + grain-based calf starter (MG). Rumen epithelial RNA was extracted from calves sacrificed at four time points: day 14 (n = 3) and day 42 (n = 3) of age while fed the MRO diet and day 56 (n = 3/diet) and day 70 (n = 3/diet) while fed the MH and MG diets for transcript profiling by microarray hybridization. Five two-group comparisons were made using Permutation Analysis of Differential Expression® to identify differentially expressed genes over time and developmental stage between days 14 and 42 within the MRO diet, between day 42 on the MRO diet and day 56 on the MG or MH diets, and between the MG and MH diets at days 56 and 70. Ingenuity Pathway Analysis (IPA) of differentially expressed genes during weaning indicated the top 5 gene networks involving molecules participating in lipid metabolism, cell morphology and death, cellular growth and proliferation, molecular transport, and the cell cycle. Putative genes functioning in the establishment of the rumen microbial population and associated rumen epithelial inflammation during weaning were identified. Activation of transcription factor PPAR-α was identified by IPA software as an important regulator of molecular changes in rumen epithelium that function in papillary development and fatty acid oxidation during the transition from pre-rumination to rumination. Thus, molecular markers of rumen development and gene networks regulating differentiation and growth of rumen epithelium were identified for selecting targets and methods for improving and assessing rumen development and

  6. Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening

    PubMed Central

    Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A.; Pacheco-Sanchez, Magda A.; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N.; Islas-Osuna, Maria A.

    2015-01-01

    Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. “Kent” was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads and a transcriptome of 4.8 Gb were obtained. A total of 33,142 coding sequences were predicted and, after functional annotation, 25,154 protein sequences were assigned a product according to the Swiss-Prot database and 32,560 according to the non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis showed significantly represented terms associated with fruit ripening, such as “cell wall,” “carbohydrate catabolic process” and “starch and sucrose metabolic process,” among others. Mango genes were assigned to 327 metabolic pathways according to the Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful for identifying genes for expression studies in early and late flowering mangos during fruit ripening. PMID:25741352
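
The abstract above reports that expression of 10 genes measured by both qRT-PCR and RNA-seq was highly correlated (R = 0.97). The following is a minimal sketch of how such an agreement check could be computed, assuming R denotes the Pearson correlation coefficient (the abstract does not name the statistic). The function name `pearson_r` and the fold-change values are hypothetical illustrations and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same genes on two platforms
# (illustrative numbers only, not taken from the mango study).
qpcr_log2fc = [2.1, -1.3, 0.4, 3.0, -2.2]
rnaseq_log2fc = [1.8, -1.0, 0.6, 2.7, -2.5]
print(pearson_r(qpcr_log2fc, rnaseq_log2fc))
```

A value close to 1 indicates that the two platforms rank and scale the expression changes similarly; it does not by itself show that the absolute fold changes agree.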

  7. Transcriptomic profiling in muscle and adipose tissue identifies genes related to growth and lipid deposition

    PubMed Central

    Pang, Jianhui; Zhong, Zhijun; Chen, Xiaohui; Yang, Yuekui; Zeng, Kai; Kang, Runming; Lei, Yunfeng; Ying, Sancheng; Gong, Jianjun; Gu, Yiren

    2017-01-01

    Growth performance and meat quality are important traits for the pig industry and consumers. Adipose tissue is the main site at which fat storage and fatty acid synthesis occur. Therefore, we combined high-throughput transcriptomic sequencing in adipose and muscle tissues with the quantification of corresponding phenotypic features using seven Chinese indigenous pig breeds and one Western commercial breed (Yorkshire). We obtained data on 101 phenotypic traits, from which principal component analysis distinguished two groups: one associated with the Chinese breeds and one with Yorkshire. The numbers of differentially expressed genes between all Chinese breeds and Yorkshire were shown to be 673 and 1056 in adipose and muscle tissues, respectively. Functional enrichment analysis revealed that these genes are associated with biological functions and canonical pathways related to oxidoreductase activity, immune response, and metabolic process. Weighted gene coexpression network analysis found more coexpression modules significantly correlated with the measured phenotypic traits in adipose than in muscle, indicating that adipose regulates meat and carcass quality. Using the combination of differential expression, QTL information, gene significance, and module hub genes, we identified a large number of candidate genes potentially related to economically important traits in pig, which should help us improve meat production and quality. PMID:28877211

  8. Transcriptomic profiling in muscle and adipose tissue identifies genes related to growth and lipid deposition.

    PubMed

    Tao, Xuan; Liang, Yan; Yang, Xuemei; Pang, Jianhui; Zhong, Zhijun; Chen, Xiaohui; Yang, Yuekui; Zeng, Kai; Kang, Runming; Lei, Yunfeng; Ying, Sancheng; Gong, Jianjun; Gu, Yiren; Lv, Xuebin

    2017-01-01

    Growth performance and meat quality are important traits for the pig industry and consumers. Adipose tissue is the main site at which fat storage and fatty acid synthesis occur. Therefore, we combined high-throughput transcriptomic sequencing in adipose and muscle tissues with the quantification of corresponding phenotypic features using seven Chinese indigenous pig breeds and one Western commercial breed (Yorkshire). We obtained data on 101 phenotypic traits, from which principal component analysis distinguished two groups: one associated with the Chinese breeds and one with Yorkshire. The numbers of differentially expressed genes between all Chinese breeds and Yorkshire were shown to be 673 and 1056 in adipose and muscle tissues, respectively. Functional enrichment analysis revealed that these genes are associated with biological functions and canonical pathways related to oxidoreductase activity, immune response, and metabolic process. Weighted gene coexpression network analysis found more coexpression modules significantly correlated with the measured phenotypic traits in adipose than in muscle, indicating that adipose regulates meat and carcass quality. Using the combination of differential expression, QTL information, gene significance, and module hub genes, we identified a large number of candidate genes potentially related to economically important traits in pig, which should help us improve meat production and quality.

  9. A hierarchical approach employing metabolic and gene expression profiles to identify the pathways that confer cytotoxicity in HepG2 cells

    PubMed Central

    Li, Zheng; Srivastava, Shireesh; Yang, Xuerui; Mittal, Sheenu; Norton, Paul; Resau, James; Haab, Brian; Chan, Christina

    2007-01-01

    Background: Free fatty acids (FFA) and tumor necrosis factor alpha (TNF-α) have been implicated in the pathogenesis of many obesity-related metabolic disorders. When human hepatoblastoma cells (HepG2) were exposed to different types of FFA and TNF-α, saturated fatty acid was found to be cytotoxic and its toxicity was exacerbated by TNF-α. In order to identify the processes associated with the toxicity of saturated FFA and TNF-α, the metabolic and gene expression profiles were measured to characterize the cellular states. A computational model was developed to integrate these disparate data to reveal the underlying pathways and mechanisms involved in saturated fatty acid toxicity. Results: A hierarchical framework consisting of three stages was developed to identify the processes and genes that regulate the toxicity. First, discriminant analysis identified fatty acid oxidation and intracellular triglyceride accumulation as the most relevant in differentiating the cytotoxic phenotype. Second, gene set enrichment analysis (GSEA) was applied to the cDNA microarray data to identify the transcriptionally altered pathways and processes. Finally, the genes and gene sets that regulate the metabolic responses identified in step 1 were identified by integrating the expression of the enriched gene sets and the metabolic profiles with a multi-block partial least squares (MBPLS) regression model. Conclusion: The hierarchical approach suggested potential mechanisms involved in mediating the cytotoxic and cytoprotective pathways, and identified novel targets, such as NADH dehydrogenases, aldehyde dehydrogenase 1A1 (ALDH1A1) and endothelial membrane protein 3 (EMP3), as modulators of the toxic phenotypes. These predictions, as well as some specific targets that were suggested by the analysis, were experimentally validated. PMID:17498300

  10. A Canonical Correlation Analysis of AIDS Restriction Genes and Metabolic Pathways Identifies Purine Metabolism as a Key Cooperator.

    PubMed

    Ye, Hanhui; Yuan, Jinjin; Wang, Zhengwu; Huang, Aiqiong; Liu, Xiaolong; Han, Xiao; Chen, Yahong

    2016-01-01

    Human immunodeficiency virus causes a severe disease in humans, referred to as acquired immune deficiency syndrome (AIDS). Studies on the interaction between host genetic factors and the virus have revealed dozens of genes that impact diverse processes in AIDS. To resolve more genetic factors related to AIDS, a canonical correlation analysis was used to determine the correlation between AIDS restriction and metabolic pathway gene expression. The results show that HIV-1 postentry cellular viral cofactors from AIDS restriction genes are coexpressed in human transcriptome microarray datasets. Further, the purine metabolism pathway comprises novel host factors that are coexpressed with AIDS restriction genes. Using a canonical correlation analysis for expression is a reliable approach to exploring the mechanism underlying AIDS.

  11. Transcriptome Analysis in Prenatal IGF1-Deficient Mice Identifies Molecular Pathways and Target Genes Involved in Distal Lung Differentiation

    PubMed Central

    Hernández-Porras, Isabel; López, Icíar Paula; De Las Rivas, Javier; Pichel, José García

    2013-01-01

    Background: Insulin-like Growth Factor 1 (IGF1) is a multifunctional regulator of somatic growth and development throughout evolution. IGF1 signaling through IGF type 1 receptor (IGF1R) controls cell proliferation, survival and differentiation in multiple cell types. IGF1 deficiency in mice disrupts lung morphogenesis, causing altered prenatal pulmonary alveologenesis. Nevertheless, little is known about the cellular and molecular basis of IGF1 activity during lung development. Methods/Principal Findings: Prenatal Igf1−/− mutant mice with a C57Bl/6J genetic background displayed severe disproportional lung hypoplasia, leading to lethal neonatal respiratory distress. Immuno-histological analysis of their lungs showed a thickened mesenchyme, alterations in extracellular matrix deposition, thinner smooth muscles and dilated blood vessels, which indicated immature and delayed distal pulmonary organogenesis. Transcriptomic analysis of Igf1−/− E18.5 lungs using RNA microarrays identified deregulated genes related to vascularization, morphogenesis and cellular growth, and to MAP-kinase, Wnt and cell-adhesion pathways. Up-regulation of immunity-related genes was verified by an increase in inflammatory markers. Increased expression of Nfib and reduced expression of Klf2, Egr1 and Ctgf regulatory proteins as well as activation of ERK2 MAP-kinase were corroborated by Western blot. Among IGF-system genes only IGFBP2 revealed a reduction in mRNA expression in mutant lungs. Immuno-staining patterns for IGF1R and IGF2, similar in both genotypes, correlated to alterations found in specific cell compartments of Igf1−/− lungs. IGF1 addition to Igf1−/− embryonic lungs cultured ex vivo increased airway septa remodeling and distal epithelium maturation, processes accompanied by up-regulation of Nfib and Klf2 transcription factors and Cyr61 matricellular protein. Conclusions/Significance: We demonstrated the functional tissue specific implication of IGF1 on fetal lung

  12. Multi-dimensional genomic analysis of myoepithelial carcinoma identifies prevalent oncogenic gene fusions.

    PubMed

    Dalin, Martin G; Katabi, Nora; Persson, Marta; Lee, Ken-Wing; Makarov, Vladimir; Desrichard, Alexis; Walsh, Logan A; West, Lyndsay; Nadeem, Zaineb; Ramaswami, Deepa; Havel, Jonathan J; Kuo, Fengshen; Chadalavada, Kalyani; Nanjangud, Gouri J; Ganly, Ian; Riaz, Nadeem; Ho, Alan L; Antonescu, Cristina R; Ghossein, Ronald; Stenman, Göran; Chan, Timothy A; Morris, Luc G T

    2017-10-30

    Myoepithelial carcinoma (MECA) is an aggressive salivary gland cancer with largely unknown genetic features. Here we comprehensively analyze molecular alterations in 40 MECAs using integrated genomic analyses. We identify a low mutational load, and high prevalence (70%) of oncogenic gene fusions. Most fusions involve the PLAG1 oncogene, which is associated with PLAG1 overexpression. We find FGFR1-PLAG1 in seven (18%) cases, and the novel TGFBR3-PLAG1 fusion in six (15%) cases. TGFBR3-PLAG1 promotes a tumorigenic phenotype in vitro, and is absent in 723 other salivary gland tumors. Other novel PLAG1 fusions include ND4-PLAG1, a fusion between mitochondrial and nuclear DNA. We also identify a higher number of copy number alterations as a risk factor for recurrence, independent of tumor stage at diagnosis. Our findings indicate that MECA is a fusion-driven disease, nominate TGFBR3-PLAG1 as a hallmark of MECA, and provide a framework for future diagnostic and therapeutic research in this lethal cancer.

  13. [Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].

    PubMed

    Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou

    2002-01-01

    To select polymorphic short tandem repeat markers within X-linked nuclear protein (XNP) gene, genomic clones which contain XNP gene were recognized by homologous analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from non-exonic sequences by using BCM search Launcher. Polymorphisms of the short tandem repeats in Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified from XNP gene, two of which were polymorphic. Four and 11 alleles were observed in Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 localized within 3' end and intron 10, respectively. Two polymorphic short tandem repeats have been identified within XNP gene and will be useful for linkage analysis and gene diagnosis of XNP gene.
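
The abstract above reports heterozygosities of 47% and 70% for the two polymorphic repeats but does not say how they were obtained. As a minimal sketch, assuming the expected heterozygosity of a locus is computed from allele frequencies as H = 1 − Σ pᵢ², the calculation could look as follows. The function name `expected_heterozygosity` and the allele frequencies are hypothetical; the study's actual frequencies are not given in the abstract, and its values may instead be observed heterozygosities.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for a single locus."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical frequencies for a four-allele marker (not from the study).
example_freqs = [0.70, 0.15, 0.10, 0.05]
print(expected_heterozygosity(example_freqs))
```

Under this formula, a marker with more alleles at more even frequencies has a higher expected heterozygosity, which is in line with the 11-allele marker being reported as more informative than the 4-allele one.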

  14. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms

    PubMed Central

    Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John

    2015-01-01

    Objective: We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design: A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton following SPTB at ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results: 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion: We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700
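
The abstract above reports pairwise Chi-square tests between phenotypes (for example PD and DH). As a minimal sketch of that kind of test, the following computes a Pearson chi-square statistic for a 2×2 co-occurrence table, assuming no continuity correction and one degree of freedom; the abstract does not describe the exact procedure used. The function name `chi_square_2x2` and the counts are hypothetical and are not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts of women with/without two phenotypes:
# rows = phenotype A present/absent, columns = phenotype B present/absent.
stat, p_value = chi_square_2x2(20, 5, 5, 20)
print(stat, p_value)
```

A small p-value here only indicates that the two phenotypes co-occur more (or less) often than expected under independence; it says nothing about the direction of any causal relationship.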

  15. Cross-species microarray hybridization to identify developmentally regulated genes in the filamentous fungus Sordaria macrospora.

    PubMed

    Nowrousian, Minou; Ringelberg, Carol; Dunlap, Jay C; Loros, Jennifer J; Kück, Ulrich

    2005-04-01

    The filamentous fungus Sordaria macrospora forms complex three-dimensional fruiting bodies that protect the developing ascospores and ensure their proper discharge. Several regulatory genes essential for fruiting body development were previously isolated by complementation of the sterile mutants pro1, pro11 and pro22. To establish the genetic relationships between these genes and to identify downstream targets, we have conducted cross-species microarray hybridizations using cDNA arrays derived from the closely related fungus Neurospora crassa and RNA probes prepared from wild-type S. macrospora and the three developmental mutants. Of the 1,420 genes which gave a signal with the probes from all the strains used, 172 (12%) were regulated differently in at least one of the three mutants compared to the wild type, and 17 (1.2%) were regulated differently in all three mutant strains. Microarray data were verified by Northern analysis or quantitative real time PCR. Among the genes that are up- or down-regulated in the mutant strains are genes encoding the pheromone precursors, enzymes involved in melanin biosynthesis and a lectin-like protein. Analysis of gene expression in double mutants revealed a complex network of interaction between the pro gene products.

  16. Integrating text mining, data mining, and network analysis for identifying genetic breast cancer trends.

    PubMed

    Jurca, Gabriela; Addam, Omar; Aksac, Alper; Gao, Shang; Özyer, Tansel; Demetrick, Douglas; Alhajj, Reda

    2016-04-26

    Breast cancer is a serious disease which affects many women and may lead to death. It has received considerable attention from the research community. Thus, biomedical researchers aim to find genetic biomarkers indicative of the disease. Novel biomarkers can be elucidated from the existing literature. However, the vast amount of scientific publications on breast cancer makes this a daunting task. This paper presents a framework which investigates existing literature data for informative discoveries. It integrates text mining and social network analysis in order to identify new potential biomarkers for breast cancer. We utilized PubMed for the testing. We investigated gene-gene interactions, as well as novel interactions such as gene-year, gene-country, and abstract-country to find out how the discoveries varied over time and how overlapping/diverse are the discoveries and the interest of various research groups in different countries. Interesting trends have been identified and discussed, e.g., different genes are highlighted in relationship to different countries though the various genes were found to share functionality. Some text analysis based results have been validated against results from other tools that predict gene-gene relations and gene functions.

  17. Identification of key genes associated with the effect of estrogen on ovarian cancer using microarray analysis.

    PubMed

    Zhang, Shi-tao; Zuo, Chao; Li, Wan-nan; Fu, Xue-qi; Xing, Shu; Zhang, Xiao-ping

    2016-02-01

    To identify key genes related to the effect of estrogen on ovarian cancer. Microarray data (GSE22600) were downloaded from Gene Expression Omnibus. Eight estrogen and seven placebo treatment samples were obtained using a 2 × 2 factorial design, which contained 2 cell lines (PEO4 and 2008) and 2 treatments (estrogen and placebo). Differentially expressed genes (DEGs) were identified by Bayesian methods, with P < 0.05 and |log2FC (fold change)| ≥ 0.5 used as the cut-off criteria. Differentially co-expressed genes (DCGs) and differentially regulated genes (DRGs) were identified by the DCe and DRsort functions in the DCGL package, respectively. Topological structure analysis was performed on the important transcriptional factors (TFs) and genes in the transcriptional regulatory network using tYNA. Functional enrichment analysis was performed for the DEGs and for the important genes using the Gene Ontology and KEGG databases. In total, 465 DEGs were identified. Functional enrichment analysis of DEGs indicated that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. A total of 2285 DCG pairs and 357 DRGs were identified. Topological structure analysis identified 52 important TFs and 65 important genes. Functional enrichment analysis of the important genes showed that TP53 and MLH1 participated in DNA damage response and that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. TP53, MLH1, ACVR2B, LTBP1 and BMP7 might participate in the pathogenesis of ovarian cancer.
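
    The cut-off stated in this abstract (P < 0.05 and |log2FC| ≥ 0.5) amounts to a simple two-condition filter. The sketch below is a minimal illustration of that filter only. The function name, the input layout, and the gene labels and values are hypothetical placeholders, not data from GSE22600, and the Bayesian step that would produce the P values is not reproduced here.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=0.5):
    """Return genes passing both thresholds.

    results maps a gene label to a (p_value, log2_fold_change) pair.
    A gene is kept when p_value < p_cutoff and |log2FC| >= lfc_cutoff.
    """
    return [
        gene
        for gene, (p_value, log2_fc) in results.items()
        if p_value < p_cutoff and abs(log2_fc) >= lfc_cutoff
    ]


# Hypothetical placeholder values, for illustration only.
example = {
    "geneA": (0.010, 1.2),   # passes both thresholds
    "geneB": (0.200, 2.0),   # fails the P threshold
    "geneC": (0.030, -0.7),  # passes (down-regulated)
    "geneD": (0.040, 0.3),   # fails the fold-change threshold
}
print(select_degs(example))
```

    Using the absolute value of log2FC keeps both up- and down-regulated genes, which matches the |log2FC| notation in the abstract.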

  18. Use of an activated beta-catenin to identify Wnt pathway target genes in Caenorhabditis elegans, including a subset of collagen genes expressed in late larval development.

    PubMed

    Jackson, Belinda M; Abete-Luzi, Patricia; Krause, Michael W; Eisenmann, David M

    2014-04-16

    The Wnt signaling pathway plays a fundamental role during metazoan development, where it regulates diverse processes, including cell fate specification, cell migration, and stem cell renewal. Activation of the beta-catenin-dependent/canonical Wnt pathway up-regulates expression of Wnt target genes to mediate a cellular response. In the nematode Caenorhabditis elegans, a canonical Wnt signaling pathway regulates several processes during larval development; however, few target genes of this pathway have been identified. To address this deficit, we used a novel approach of conditionally activated Wnt signaling during a defined stage of larval life by overexpressing an activated beta-catenin protein, then used microarray analysis to identify genes showing altered expression compared with control animals. We identified 166 differentially expressed genes, of which 104 were up-regulated. A subset of the up-regulated genes was shown to have altered expression in mutants with decreased or increased Wnt signaling; we consider these genes to be bona fide C. elegans Wnt pathway targets. Among these was a group of six genes, including the cuticular collagen genes bli-1, col-38, col-49, and col-71. These genes show a peak of expression in the mid L4 stage during normal development, suggesting a role in adult cuticle formation. Consistent with this finding, reduction of function for several of the genes causes phenotypes suggestive of defects in cuticle function or integrity. Therefore, this work has identified a large number of putative Wnt pathway target genes during larval life, including a small subset of Wnt-regulated collagen genes that may function in synthesis of the adult cuticle.

  19. Genome-Wide association study identifies candidate genes for Parkinson's disease in an Ashkenazi Jewish population

    PubMed Central

    2011-01-01

    Background To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. Methods We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. Results We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p < 9.9 × 10⁻⁵). Six SNPs located within gene regions had positive signals in at least one other independent dbGaP dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1 (15q26.1), WNT3 (17q21.3) and NSF (17q21.3). We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10⁻⁴), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0

  20. Differential gene expression analysis in glioblastoma cells and normal human brain cells based on GEO database.

    PubMed

    Wang, Anping; Zhang, Guibin

    2017-11-01

    The differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated, and pathway analysis and protein interaction network analysis were performed on them. GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in 'R' software was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. Results: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and the genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING and an expression network was established. A key gene, called CALM3, was identified by Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in

  1. Digital transcriptome analysis of putative sex-determination genes in papaya (Carica papaya).

    PubMed

    Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

    2012-01-01

    Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Y(h)) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Y(h) chromosome, implying a loss of many genes on the Y(h) chromosome. Nevertheless, candidate Y(h) chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya.

  2. Digital Transcriptome Analysis of Putative Sex-Determination Genes in Papaya (Carica papaya)

    PubMed Central

    Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

    2012-01-01

    Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Yh) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Yh chromosome, implying a loss of many genes on the Yh chromosome. Nevertheless, candidate Yh chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya. PMID:22815863

  3. Causal network analysis of head and neck keloid tissue identifies potential master regulators.

    PubMed

    Garcia-Rodriguez, Laura; Jones, Lamont; Chen, Kang Mei; Datta, Indrani; Divine, George; Worsham, Maria J

    2016-10-01

    To generate novel insights and hypotheses in keloid development from potential master regulators. Prospective cohort. Six fresh keloid and six normal skin samples from 12 anonymous donors were used in a prospective cohort study. Genome-wide profiling was done previously on the cohort using the Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA). The 190 statistically significant CpG islands between keloid and normal tissue mapped to 152 genes (P < .05). The top 10 statistically significant genes (VAMP5, ACTR3C, GALNT3, KCNAB2, LRRC61, SCML4, SYNGR1, TNS1, PLEKHG5, PPP1R13-α, false discovery rate <.015) were uploaded into the Ingenuity Pathway Analysis software's Causal Network Analysis (QIAGEN, Redwood City, CA). To reflect expected gene expression direction in the context of methylation changes, the inverse of the methylation ratio from keloid versus normal tissue was used for the analysis. Causal Network Analysis identified disease-specific master regulator molecules based on downstream differentially expressed keloid-specific genes and expected directionality of expression (hypermethylated vs. hypomethylated). Causal Network Analysis software identified four hierarchical networks that included four master regulators (pyroxamide, tributyrin, PRKG2, and PENK) and 19 intermediate regulators. Causal Network Analysis of differentially methylated gene data of keloid versus normal skin demonstrated four causal networks with four master regulators. These hierarchical networks suggest potential driver roles for their downstream keloid gene targets in the pathogenesis of the keloid phenotype, likely triggered due to perturbation/injury to normal tissue. Level of Evidence: NA. Laryngoscope, 126:E319-E324, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  4. Genome-wide significant localization for working and spatial memory: Identifying genes for psychosis using models of cognition.

    PubMed

    Knowles, Emma E M; Carless, Melanie A; de Almeida, Marcio A A; Curran, Joanne E; McKay, D Reese; Sprooten, Emma; Dyer, Thomas D; Göring, Harald H; Olvera, Rene; Fox, Peter; Almasy, Laura; Duggirala, Ravi; Kent, Jack W; Blangero, John; Glahn, David C

    2014-01-01

    It is well established that risk for developing psychosis is largely mediated by the influence of genes, but identifying precisely which genes underlie that risk has been problematic. Focusing on endophenotypes, rather than illness risk, is one solution to this problem. Impaired cognition is a well-established endophenotype of psychosis. Here we aimed to characterize the genetic architecture of cognition using phenotypically detailed models as opposed to relying on general IQ or individual neuropsychological measures. In so doing we hoped to identify genes that mediate cognitive ability, which might also contribute to psychosis risk. Hierarchical factor models of genetically clustered cognitive traits were subjected to linkage analysis followed by QTL region-specific association analyses in a sample of 1,269 Mexican American individuals from extended pedigrees. We identified four genome wide significant QTLs, two for working and two for spatial memory, and a number of plausible and interesting candidate genes. The creation of detailed models of cognition seemingly enhanced the power to detect genetic effects on cognition and provided a number of possible candidate genes for psychosis. © 2013 Wiley Periodicals, Inc.

  5. Gene Signature in Sessile Serrated Polyps Identifies Colon Cancer Subtype

    PubMed Central

    Kanth, Priyanka; Bronner, Mary P.; Boucher, Kenneth M.; Burt, Randall W.; Neklason, Deborah W.; Hagedorn, Curt H.; Delker, Don A.

    2016-01-01

    Sessile serrated colon adenoma/polyps (SSA/Ps) are found during routine screening colonoscopy and may account for 20–30% of colon cancers. However, differentiating SSA/Ps from hyperplastic polyps (HP) with little risk of cancer is challenging and complementary molecular markers are needed. Additionally, the molecular mechanisms of colon cancer development from SSA/Ps are poorly understood. RNA sequencing was performed on 21 SSA/Ps, 10 HPs, 10 adenomas, 21 uninvolved colon and 20 control colon specimens. Differential expression and leave-one-out cross validation methods were used to define a unique gene signature of SSA/Ps. Our SSA/P gene signature was evaluated in colon cancer RNA-Seq data from The Cancer Genome Atlas (TCGA) to identify a subtype of colon cancers that may develop from SSA/Ps. A total of 1422 differentially expressed genes were found in SSA/Ps relative to controls. Serrated polyposis syndrome (n=12) and sporadic SSA/Ps (n=9) exhibited almost complete (96%) gene overlap. A 51-gene panel in SSA/P showed similar expression in a subset of TCGA colon cancers with high microsatellite instability (MSI-H). A smaller seven-gene panel showed high sensitivity and specificity in identifying BRAF mutant, CpG island methylator phenotype high (CIMP-H) and MLH1 silenced colon cancers. We describe a unique gene signature in SSA/Ps that identifies a subset of colon cancers likely to develop through the serrated pathway. These gene panels may be utilized for improved differentiation of SSA/Ps from HPs and provide insights into novel molecular pathways altered in colon cancer arising from the serrated pathway. PMID:27026680

  6. Genetic Susceptibility to Vitiligo: GWAS Approaches for Identifying Vitiligo Susceptibility Genes and Loci

    PubMed Central

    Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun

    2016-01-01

    Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association study (GWAS). More than 40 robust susceptible loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptible loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalize treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo. PMID:26870082

  7. Suppression subtractive hybridization identified differentially expressed genes in lung adenocarcinoma: ERGIC3 as a novel lung cancer-related gene

    PubMed Central

    2013-01-01

    Background To understand the carcinogenesis caused by accumulated genetic and epigenetic alterations and seek novel biomarkers for various cancers, studying differentially expressed genes between cancerous and normal tissues is crucial. In this study, two cDNA libraries of lung cancer were constructed and screened for identification of differentially expressed genes. Methods Two cDNA libraries of differentially expressed genes were constructed using lung adenocarcinoma tissue and adjacent nonmalignant lung tissue by suppression subtractive hybridization. The data of the cDNA libraries were then analyzed and compared using bioinformatics analysis. Levels of mRNA and protein were measured by quantitative real-time polymerase chain reaction (q-RT-PCR) and western blot respectively, and protein expression and localization were determined by immunostaining. Gene functions were investigated using proliferation and migration assays after gene silencing and gene over-expression. Results Two libraries of differentially expressed genes were obtained. The forward-subtracted library (FSL) and the reverse-subtracted library (RSL) contained 177 and 59 genes, respectively. Bioinformatic analysis demonstrated that these genes were involved in a wide range of cellular functions. The vast majority of these genes were newly identified to be abnormally expressed in lung cancer. In the first stage of the screening for 16 genes, we compared lung cancer tissues with their adjacent non-malignant tissues at the mRNA level, and found six genes (ERGIC3, DDR1, HSP90B1, SDC1, RPSA, and LPCAT1) from the FSL were significantly up-regulated while two genes (GPX3 and TIMP3) from the RSL were significantly down-regulated (P < 0.05). The ERGIC3 protein was also over-expressed in lung cancer tissues and cultured cells, and expression of ERGIC3 was correlated with the degree of differentiation and histological type of lung cancer. The up-regulation of ERGIC3 could promote cellular migration

  8. Live-cell monitoring of periodic gene expression in synchronous human cells identifies Forkhead genes involved in cell cycle control

    PubMed Central

    Grant, Gavin D.; Gamsby, Joshua; Martyanov, Viktor; Brooks, Lionel; George, Lacy K.; Mahoney, J. Matthew; Loros, Jennifer J.; Dunlap, Jay C.; Whitfield, Michael L.

    2012-01-01

    We developed a system to monitor periodic luciferase activity from cell cycle–regulated promoters in synchronous cells. Reporters were driven by a minimal human E2F1 promoter with peak expression in G1/S or a basal promoter with six Forkhead DNA-binding sites with peak expression at G2/M. After cell cycle synchronization, luciferase activity was measured in live cells at 10-min intervals across three to four synchronous cell cycles, allowing unprecedented resolution of cell cycle–regulated gene expression. We used this assay to screen Forkhead transcription factors for control of periodic gene expression. We confirmed a role for FOXM1 and identified two novel cell cycle regulators, FOXJ3 and FOXK1. Knockdown of FOXJ3 and FOXK1 eliminated cell cycle–dependent oscillations and resulted in decreased cell proliferation rates. Analysis of genes regulated by FOXJ3 and FOXK1 showed that FOXJ3 may regulate a network of zinc finger proteins and that FOXK1 binds to the promoter and regulates DHFR, TYMS, GSDMD, and the E2F binding partner TFDP1. Chromatin immunoprecipitation followed by high-throughput sequencing analysis identified 4329 genomic loci bound by FOXK1, 83% of which contained a FOXK1-binding motif. We verified that a subset of these loci are activated by wild-type FOXK1 but not by a FOXK1 (H355A) DNA-binding mutant. PMID:22740631

  9. Identification and function analysis of contrary genes in Dupuytren's contracture.

    PubMed

    Ji, Xianglu; Tian, Feng; Tian, Lijie

    2015-07-01

    The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples derived from fibroblasts and six healthy control samples derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.

  10. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

    PubMed

    Zheng, Qi; Wang, Xiu-Jie

    2008-07-01

    Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although many software tools offer GO-related analysis functions, new tools are still needed to meet the requirements of data generated by newly developed technologies or of advanced analysis purposes. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides a better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis of data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) one unique feature of GOEAST is that it allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
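
    Identifying a "statistically overrepresented" GO term is commonly framed as a one-sided hypergeometric test. The sketch below assumes that framing for illustration; the abstract does not specify which tests GOEAST applies, and this is not GOEAST's code. The function name, parameter names, and the numbers in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of genes in the submitted gene list
    hits:       genes in the list annotated with the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of drawing k annotated genes for k >= hits.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Hypothetical numbers: 40 of 200 list genes hit a term that annotates
# 500 of 20,000 background genes.
print(hypergeom_enrichment_p(20000, 500, 200, 40))
```

    In practice, the resulting p-values would be corrected for multiple testing across all GO terms examined.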

  11. RNA-Seq analysis identifies key genes associated with haustorial development in the root hemiparasite Santalum album

    PubMed Central

    Zhang, Xinhua; Berkowitz, Oliver; Teixeira da Silva, Jaime A.; Zhang, Muhan; Ma, Guohua; Whelan, James; Duan, Jun

    2015-01-01

    Santalum album (sandalwood) is one of the economically important plant species in the Santalaceae for its production of highly valued perfume oils. Sandalwood is also a hemiparasitic tree that obtains some of its water and simple nutrients by tapping into other plants through haustoria which are highly specialized organs in parasitic angiosperms. However, an understanding of the molecular mechanisms involved in haustorium development is limited. In this study, RNA sequencing (RNA-seq) analyses were performed to identify changes in gene expression and metabolic pathways associated with the development of the S. album haustorium. A total of 56,011 non-redundant contigs with a mean contig size of 618 bp were obtained by de novo assembly of the transcriptome of haustoria and non-haustorial seedling roots. A substantial number of the identified differentially expressed genes were involved in cell wall metabolism and protein metabolism, as well as mitochondrial electron transport functions. Phytohormone-mediated regulation might play an important role during haustorial development. Especially, auxin signaling is likely to be essential for haustorial initiation, and genes related to cytokinin and gibberellin biosynthesis and metabolism are involved in haustorial development. Our results suggest that genes encoding nodulin-like proteins may be important for haustorial morphogenesis in S. album. The obtained sequence data will become a rich resource for future research in this interesting species. This information improves our understanding of haustorium development in root hemiparasitic species and will allow further exploration of the detailed molecular mechanisms underlying plant parasitism. PMID:26388878

  12. A Heterogeneous Network Based Method for Identifying GBM-Related Genes by Integrating Multi-Dimensional Data.

    PubMed

    Chen Peng; Ao Li

    2017-01-01

    The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases and therefore improving diagnosis, treatment, and prevention. In this study, we proposed a heterogeneous network based method by integrating multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method lies in that the multi-dimensional data of GBM from TCGA dataset that provide comprehensive information of genes, are combined with protein-protein interactions to construct a weighted heterogeneous network, which reflects both the general and disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of comprehensive performance evaluation show that the proposed method significantly outperforms the network based methods with single-dimensional data and other existing approaches. Subsequent analysis of the top ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/ .
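
    The abstract describes scoring genes by propagation over a weighted network but gives no algorithmic detail. As a stand-in, the sketch below implements random walk with restart, a widely used network propagation scheme. It is not the authors' "propagation algorithm with resistance". The function name, the restart parameter, and the toy graph are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-12, max_iter=10000):
    """Score nodes of a weighted undirected graph by random walk with restart.

    adjacency: dict mapping node -> {neighbour: edge_weight}; every
               neighbour must itself be a key (symmetric representation).
    seeds:     non-empty set of known disease-related nodes.
    Returns a dict mapping node -> steady-state probability.
    """
    if not seeds:
        raise ValueError("at least one seed node is required")
    nodes = list(adjacency)
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}
    start = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    score = dict(start)
    for _ in range(max_iter):
        nxt = {u: restart * start[u] for u in nodes}
        for u in nodes:
            mass = (1.0 - restart) * score[u]
            if out_weight[u] == 0:
                nxt[u] += mass  # isolated node keeps its own mass
                continue
            for v, w in adjacency[u].items():
                nxt[v] += mass * w / out_weight[u]
        change = sum(abs(nxt[u] - score[u]) for u in nodes)
        score = nxt
        if change < tol:
            break
    return score


# Toy path graph a - b - c - d with unit weights and a single seed.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = random_walk_with_restart(graph, {"a"})
print(sorted(scores, key=scores.get, reverse=True))
```

    Nodes closer to the seed set receive higher scores, which is the property a propagation-based ranking of candidate genes relies on.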

  13. Serial analysis of gene expression identifies connective tissue growth factor expression as a prognostic biomarker in gallbladder cancer.

    PubMed

    Alvarez, Hector; Corvalan, Alejandro; Roa, Juan C; Argani, Pedram; Murillo, Francisco; Edwards, Jennifer; Beaty, Robert; Feldmann, Georg; Hong, Seung-Mo; Mullendore, Michael; Roa, Ivan; Ibañez, Luis; Pimentel, Fernando; Diaz, Alfonso; Riggins, Gregory J; Maitra, Anirban

    2008-05-01

    Gallbladder cancer (GBC) is an uncommon neoplasm in the United States, but one with high mortality rates. This malignancy remains largely understudied at the molecular level such that few targeted therapies or predictive biomarkers exist. We built the first series of serial analysis of gene expression (SAGE) libraries from GBC and nonneoplastic gallbladder mucosa, composed of 21-bp long-SAGE tags. SAGE libraries were generated from three stage-matched GBC patients (representing Hispanic/Latino, Native American, and Caucasian ethnicities, respectively) and one histologically alithiasic gallbladder. Real-time quantitative PCR was done on microdissected epithelium from five matched GBC and corresponding nonneoplastic gallbladder mucosa. Immunohistochemical analysis was done on a panel of 182 archival GBC in high-throughput tissue microarray format. SAGE tags corresponding to connective tissue growth factor (CTGF) transcripts were identified as differentially overexpressed in all pairwise comparisons of GBC (P < 0.001). Real-time quantitative PCR confirmed significant overexpression of CTGF transcripts in microdissected primary GBC (P < 0.05), but not in metastatic GBC, compared with nonneoplastic gallbladder epithelium. By immunohistochemistry, 66 of 182 (36%) GBC had high CTGF antigen labeling, which was significantly associated with better survival on univariate analysis (P = 0.0069, log-rank test). An unbiased analysis of the GBC transcriptome by SAGE has identified CTGF expression as a predictive biomarker of favorable prognosis in this malignancy. The SAGE libraries from GBC and nonneoplastic gallbladder mucosa are publicly available at the Cancer Genome Anatomy Project web site and should facilitate much needed research into this lethal neoplasm.

  14. Identifying marker genes in transcription profiling data using a mixture of feature relevance experts.

    PubMed

    Chow, M L; Moler, E J; Mian, I S

    2001-03-08

    Transcription profiling experiments permit the expression levels of many genes to be measured simultaneously. Given profiling data from two types of samples, genes that most distinguish the samples (marker genes) are good candidates for subsequent in-depth experimental studies and developing decision support systems for diagnosis, prognosis, and monitoring. This work proposes a mixture of feature relevance experts as a method for identifying marker genes and illustrates the idea using published data from samples labeled as acute lymphoblastic and myeloid leukemia (ALL, AML). A feature relevance expert implements an algorithm that calculates how well a gene distinguishes samples, reorders genes according to this relevance measure, and uses a supervised learning method [here, support vector machines (SVMs)] to determine the generalization performances of different nested gene subsets. The mixture of three feature relevance experts examined implement two existing and one novel feature relevance measures. For each expert, a gene subset consisting of the top 50 genes distinguished ALL from AML samples as completely as all 7,070 genes. The 125 genes at the union of the top 50s are plausible markers for a prototype decision support system. Chromosomal aberration and other data support the prediction that the three genes at the intersection of the top 50s, cystatin C, azurocidin, and adipsin, are good targets for investigating the basic biology of ALL/AML. The same data were employed to identify markers that distinguish samples based on their labels of T cell/B cell, peripheral blood/bone marrow, and male/female. Selenoprotein W may discriminate T cells from B cells. Results from analysis of transcription profiling data from tumor/nontumor colon adenocarcinoma samples support the general utility of the aforementioned approach. Theoretical issues such as choosing SVM kernels and their parameters, training and evaluating feature relevance experts, and the impact of
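
    The ranking-and-overlap step described in this abstract might be sketched as follows. The two relevance measures (absolute mean difference and a signal-to-noise ratio), the gene names and the expression values are hypothetical stand-ins, not the measures or data used in the study, and the SVM evaluation of nested gene subsets is omitted.

```python
from statistics import mean, pstdev


def mean_difference(first, second):
    """Absolute difference between the class means."""
    return abs(mean(first) - mean(second))


def signal_to_noise(first, second):
    """Mean difference scaled by the summed within-class spread."""
    spread = pstdev(first) + pstdev(second)
    return mean_difference(first, second) / spread if spread else float("inf")


def top_k(expression, labels, measure, k):
    """Return the k genes that best separate the two sample classes."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("exactly two sample classes are expected")
    scores = {}
    for gene, values in expression.items():
        first = [v for v, lab in zip(values, labels) if lab == classes[0]]
        second = [v for v, lab in zip(values, labels) if lab == classes[1]]
        scores[gene] = measure(first, second)
    # Highest score first; ties broken by gene name for reproducibility.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return set(ranked[:k])


# Hypothetical expression values for four genes across four samples.
labels = ["ALL", "ALL", "AML", "AML"]
expression = {
    "g1": [1.0, 1.2, 5.0, 5.2],
    "g2": [1.0, 3.0, 2.0, 4.0],
    "g3": [0.0, 10.0, 10.0, 20.0],
    "g4": [2.0, 2.2, 2.9, 3.1],
}
by_difference = top_k(expression, labels, mean_difference, k=2)
by_snr = top_k(expression, labels, signal_to_noise, k=2)
union_markers = by_difference | by_snr          # plausible markers
consensus_markers = by_difference & by_snr      # selected by every measure
```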

  15. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    PubMed

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

    2015-12-01

    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), Diversity (D) (in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
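
    The idea of bundling germline V genes that cannot be told apart at a given read length can be sketched as below. This is only an illustration under stated assumptions: sequences are taken to be pre-aligned and of equal length, reads are assumed to cover the last `read_length` bases, similarity is a plain Hamming distance, and the gene names and sequences are invented. The published tool's actual similarity measure and likelihood calculation are not described in the abstract.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def ambiguous_groups(germlines, read_length, max_mismatches=0):
    """Group genes whose sequenced portions differ by at most max_mismatches.

    germlines: dict mapping gene name -> aligned nucleotide sequence.
    Grouping is single-linkage, implemented with a small union-find.
    """
    names = sorted(germlines)
    tails = {n: germlines[n][-read_length:] for n in names}
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if hamming(tails[a], tails[b]) <= max_mismatches:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical germline sequences; V1 and V2 differ only at the first base.
toy_germlines = {
    "V1": "ACGTACGTAA",
    "V2": "TCGTACGTAA",
    "V3": "ACGTACGTCC",
}
full_length = ambiguous_groups(toy_germlines, read_length=10)
short_reads = ambiguous_groups(toy_germlines, read_length=5)
short_mutated = ambiguous_groups(toy_germlines, read_length=5, max_mismatches=2)
```

    Shorter reads and a higher tolerated mismatch count (standing in for sequencing error or somatic hypermutation) both merge more genes into the same ambiguous group.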

  16. Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

    PubMed Central

    Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

    1998-01-01

    Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902-D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR, SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA

  17. Systems Biology-Based Investigation of Cellular Antiviral Drug Targets Identified by Gene-Trap Insertional Mutagenesis.

    PubMed

    Cheng, Feixiong; Murray, James L; Zhao, Junfei; Sheng, Jinsong; Zhao, Zhongming; Rubin, Donald H

    2016-09-01

    Viruses require host cellular factors for successful replication. A comprehensive systems-level investigation of the virus-host interactome is critical for understanding the roles of host factors with the end goal of discovering new druggable antiviral targets. Gene-trap insertional mutagenesis is a high-throughput forward genetics approach to randomly disrupt (trap) host genes and discover host genes that are essential for viral replication, but not for host cell survival. In this study, we used libraries of randomly mutagenized cells to discover cellular genes that are essential for the replication of 10 distinct cytotoxic mammalian viruses, 1 gram-negative bacterium, and 5 toxins. We herein reported 712 candidate cellular genes, characterizing distinct topological network and evolutionary signatures, and occupying central hubs in the human interactome. Cell cycle phase-specific network analysis showed that host cell cycle programs played critical roles during viral replication (e.g. MYC and TAF4 regulating G0/1 phase). Moreover, the viral perturbation of host cellular networks reflected disease etiology in that host genes (e.g. CTCF, RHOA, and CDKN1B) identified were frequently essential and significantly associated with Mendelian and orphan diseases, or somatic mutations in cancer. Computational drug repositioning framework via incorporating drug-gene signatures from the Connectivity Map into the virus-host interactome identified 110 putative druggable antiviral targets and prioritized several existing drugs (e.g. ajmaline) that may be potential for antiviral indication (e.g. anti-Ebola). In summary, this work provides a powerful methodology with a tight integration of gene-trap insertional mutagenesis testing and systems biology to identify new antiviral targets and drugs for the development of broadly acting and targeted clinical antiviral therapeutics.

  18. Phenoscape: Identifying Candidate Genes for Evolutionary Phenotypes

    PubMed Central

    Edmunds, Richard C.; Su, Baofeng; Balhoff, James P.; Eames, B. Frank; Dahdul, Wasila M.; Lapp, Hilmar; Lundberg, John G.; Vision, Todd J.; Dunham, Rex A.; Mabee, Paula M.; Westerfield, Monte

    2016-01-01

    Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology. PMID:26500251

  19. Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

    PubMed Central

    He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

    2017-01-01

    The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, all of which consisted of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were predominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF
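
    The thresholding step in this abstract (log2 fold-change and false discovery rate cut-offs) might look like the sketch below. It assumes a Benjamini-Hochberg adjustment and applies the fold-change cut-off to the absolute value; the abstract states neither, and the gene names, fold-changes and p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(records, lfc_cutoff=2.0, fdr_cutoff=0.01):
    """Split genes passing both thresholds into up- and downregulated lists.

    records: list of (gene, log2_fold_change, raw_p_value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in records])
    up, down = [], []
    for (gene, lfc, _), q in zip(records, fdr):
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical test statistics for four genes.
toy_records = [
    ("G1", 3.1, 0.001),
    ("G2", -2.5, 0.008),
    ("G3", 2.4, 0.039),
    ("G4", 0.5, 0.041),
]
toy_fdr = benjamini_hochberg([p for _, _, p in toy_records])
strict_up, strict_down = select_degs(toy_records)
loose_up, loose_down = select_degs(toy_records, fdr_cutoff=0.05)
```

    With the strict defaults only one toy gene survives; relaxing the FDR cut-off shows how sensitive the DEG count is to the chosen thresholds.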

  20. Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated with High-Density Lipoprotein Cholesterol in Two Asian Cohorts

    PubMed Central

    Silver, Matt; Chen, Peng; Li, Ruoying; Cheng, Ching-Yu; Wong, Tien-Yin; Tai, E-Shyong; Teo, Yik-Ying; Montana, Giovanni

    2013-01-01

    Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast, gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune

  1. Pathways-driven sparse regression identifies pathways and genes associated with high-density lipoprotein cholesterol in two Asian cohorts.

    PubMed

    Silver, Matt; Chen, Peng; Li, Ruoying; Cheng, Ching-Yu; Wong, Tien-Yin; Tai, E-Shyong; Teo, Yik-Ying; Montana, Giovanni

    2013-11-01

    Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast, gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune

  2. Genetic analysis of the calcineurin pathway identifies members of the EGR gene family, specifically EGR3, as potential susceptibility candidates in schizophrenia

    PubMed Central

    Yamada, Kazuo; Gerber, David J.; Iwayama, Yoshimi; Ohnishi, Tetsuo; Ohba, Hisako; Toyota, Tomoko; Aruga, Jun; Minabe, Yoshio; Tonegawa, Susumu; Yoshikawa, Takeo

    2007-01-01

    The calcineurin cascade is central to neuronal signal transduction, and genes in this network are intriguing candidate schizophrenia susceptibility genes. To replicate and extend our previously reported association between the PPP3CC gene, encoding the calcineurin catalytic γ-subunit, and schizophrenia, we examined 84 SNPs from 14 calcineurin-related candidate genes for genetic association by using 124 Japanese schizophrenic pedigrees. Four of these genes (PPP3CC, EGR2, EGR3, and EGR4) showed nominally significant association with schizophrenia. In a postmortem brain study, EGR1, EGR2, and EGR3 transcripts were shown to be down-regulated in the prefrontal cortex of schizophrenic, but not bipolar, patients. These findings raise a potentially important role for EGR genes in schizophrenia pathogenesis. Because EGR3 is an attractive candidate gene based on its chromosomal location close to PPP3CC within 8p21.3 and its functional link to dopamine, glutamate, and neuregulin signaling, we extended our analysis by resequencing the entire EGR3 genomic interval and detected 15 SNPs. One of these, IVS1 + 607A→G SNP, displayed the strongest evidence for disease association, which was confirmed in 1,140 independent case-control samples. An in vitro promoter assay detected a possible expression-regulatory effect of this SNP. These findings support the previous genetic association of altered calcineurin signaling with schizophrenia pathogenesis and identify EGR3 as a compelling susceptibility gene. PMID:17360599

  3. A novel gammaretroviral shuttle vector insertional mutagenesis screen identifies SHARPIN as a breast cancer metastasis gene and prognostic biomarker.

    PubMed

    Bii, Victor M; Rae, Dustin T; Trobridge, Grant D

    2015-11-24

    Breast cancer (BC) is the second leading cause of malignancy among U.S. women. Metastasis results in a poor prognosis and increased mortality, but the molecular mechanisms by which metastatic tumors occur are not well understood. Identifying the genes that drive the metastatic process could provide targets for improved therapy and biomarkers to improve BC patient outcomes. Using a forward mutagenesis screen, BC cells mutagenized with a replication-incompetent gammaretroviral vector (γRV) were xenotransplanted into the mammary fat pad of immunodeficient mice. In this approach the vector provirus dysregulates nearby genes, providing a selective advantage to transduced cells to form metastases. Metastatic tumors were analyzed for proviral integration sites to identify nearby candidate metastasis genes. The γRV has a transgene cassette that allows for rescue in bacteria and rapid identification of vector integration sites. Using this approach, we identified the previously described metastasis gene WWTR1 (TAZ), and three other novel candidate metastasis genes including SHARPIN. SHARPIN was independently validated in vivo as a BC metastasis gene. Analysis of patient data showed that SHARPIN expression predicts metastasis-free survival after adjuvant therapy. Our approach has broad potential to identify genes involved in oncogenic processes for BC and other cancers. We show here it can identify both known (WWTR1) and novel (SHARPIN) BC metastasis genes.

  4. Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

    PubMed Central

    Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

    2014-01-01

    Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994

  5. Identification of candidate genes in osteoporosis by integrated microarray analysis.

    PubMed

    Li, J J; Wang, B Q; Fei, Q; Yang, Y; Li, D

    2016-12-01

    In order to screen the altered gene expression profile in peripheral blood mononuclear cells of patients with osteoporosis, we performed an integrated analysis of the online microarray studies of osteoporosis. We searched the Gene Expression Omnibus (GEO) database for microarray studies of peripheral blood mononuclear cells in patients with osteoporosis. Subsequently, we integrated gene expression data sets from multiple microarray studies to obtain differentially expressed genes (DEGs) between patients with osteoporosis and normal controls. Gene function analysis was performed to uncover the functions of identified DEGs. A total of three microarray studies were selected for integrated analysis. In all, 1125 genes were found to be significantly differentially expressed between osteoporosis patients and normal controls, with 373 upregulated and 752 downregulated genes. Positive regulation of the cellular amino metabolic process (gene ontology (GO): 0033240, false discovery rate (FDR) = 1.00E + 00) was significantly enriched under the GO category for biological processes, while for molecular functions, flavin adenine dinucleotide binding (GO: 0050660, FDR = 3.66E-01) and androgen receptor binding (GO: 0050681, FDR = 6.35E-01) were significantly enriched. DEGs were enriched in many osteoporosis-related signalling pathways, including those of mitogen-activated protein kinase (MAPK) and calcium. Protein-protein interaction (PPI) network analysis showed that the significant hub proteins contained ubiquitin specific peptidase 9, X-linked (Degree = 99), ubiquitin specific peptidase 19 (Degree = 57) and ubiquitin conjugating enzyme E2 B (Degree = 57). Analysis of gene function of identified differentially expressed genes may expand our understanding of fundamental mechanisms leading to osteoporosis. Moreover, significantly enriched pathways, such as MAPK and calcium, may be involved in osteoporosis through osteoblastic differentiation and bone formation.

  6. A Morpholino-based screen to identify novel genes involved in craniofacial morphogenesis

    PubMed Central

    Melvin, Vida Senkus; Feng, Weiguo; Hernandez-Lagunas, Laura; Artinger, Kristin Bruk; Williams, Trevor

    2014-01-01

    BACKGROUND The regulatory mechanisms underpinning facial development are conserved between diverse species. Therefore, results from model systems provide insight into the genetic causes of human craniofacial defects. Previously, we generated a comprehensive dataset examining gene expression during development and fusion of the mouse facial prominences. Here, we used this resource to identify genes that have dynamic expression patterns in the facial prominences, but for which only limited information exists concerning developmental function. RESULTS This set of ~80 genes was used for a high throughput functional analysis in the zebrafish system using Morpholino gene knockdown technology. This screen revealed three classes of cranial cartilage phenotypes depending upon whether knockdown of the gene affected the neurocranium, viscerocranium, or both. The targeted genes that produced consistent phenotypes encoded proteins linked to transcription (meis1, meis2a, tshz2, vgll4l), signaling (pkdcc, vlk, macc1, wu:fb16h09), and extracellular matrix function (smoc2). The majority of these phenotypes were not altered by reduction of p53 levels, demonstrating that both p53 dependent and independent mechanisms were involved in the craniofacial abnormalities. CONCLUSIONS This Morpholino-based screen highlights new genes involved in development of the zebrafish craniofacial skeleton with wider relevance to formation of the face in other species, particularly mouse and human. PMID:23559552

  7. Meta-Analysis of Genome-Wide Association Studies and Network Analysis-Based Integration with Gene Expression Data Identify New Suggestive Loci and Unravel a Wnt-Centric Network Associated with Dupuytren’s Disease

    PubMed Central

    Becker, Kerstin; Siegert, Sabine; Toliat, Mohammad Reza; Du, Juanjiangmeng; Casper, Ramona; Dolmans, Guido H.; Werker, Paul M.; Tinschert, Sigrid; Franke, Andre; Gieger, Christian; Strauch, Konstantin; Nothnagel, Michael; Nürnberg, Peter; Hennies, Hans Christian

    2016-01-01

    Dupuytren’s disease, a fibromatosis of the connective tissue in the palm, is a common complex disease with a strong genetic component. To date, nine genetic loci have been found to be associated with the disease. Six of these loci contain genes that code for Wnt signalling proteins. In spite of this striking first insight into the genetic factors in Dupuytren’s disease, much of the inherited risk in Dupuytren’s disease still needs to be discovered. The already identified loci jointly explain ~1% of the heritability in this disease. To further elucidate the genetic basis of Dupuytren’s disease, we performed a genome-wide meta-analysis combining three genome-wide association study (GWAS) data sets, comprising 1,580 cases and 4,480 controls. We corroborated all nine previously identified loci, six of these with genome-wide significance (p-value < 5 × 10⁻⁸). In addition, we identified 14 new suggestive loci (p-value < 10⁻⁵). Intriguingly, several of these new loci contain genes associated with Wnt signalling and therefore represent excellent candidates for replication. Next, we compared whole-transcriptome data between patient- and control-derived tissue samples and found the Wnt/β-catenin pathway to be the top deregulated pathway in patient samples. We then conducted network and pathway analyses in order to identify protein networks that are enriched for genes highlighted in the GWAS meta-analysis and expression data sets. We found further evidence that the Wnt signalling pathways in conjunction with other pathways may play a critical role in Dupuytren’s disease. PMID:27467239
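
    The two significance tiers quoted in this abstract (genome-wide, p < 5 × 10⁻⁸, and suggestive, p < 10⁻⁵) can be applied to a list of lead variants as in the short sketch below. The locus names and p-values are hypothetical, and the function names are invented for the example.

```python
GENOME_WIDE = 5e-8   # genome-wide significance threshold from the abstract
SUGGESTIVE = 1e-5    # suggestive threshold from the abstract


def classify_locus(p_value):
    """Assign a locus to a significance tier based on its lead p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def summarise(loci):
    """Count how many loci fall into each tier."""
    counts = {}
    for p_value in loci.values():
        tier = classify_locus(p_value)
        counts[tier] = counts.get(tier, 0) + 1
    return counts


# Hypothetical lead p-values for four loci.
toy_loci = {
    "locus_1": 3.2e-12,
    "locus_2": 4.9e-8,
    "locus_3": 7.5e-7,
    "locus_4": 2.0e-4,
}
toy_summary = summarise(toy_loci)
```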

  8. Prioritization of Epilepsy Associated Candidate Genes by Convergent Analysis

    PubMed Central

    Jia, Peilin; Ewers, Jeffrey M.; Zhao, Zhongming

    2011-01-01

    Background Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. Methodology/Principal Findings In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. Conclusions/Significance The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the

  9. Prioritization of epilepsy associated candidate genes by convergent analysis.

    PubMed

    Jia, Peilin; Ewers, Jeffrey M; Zhao, Zhongming

    2011-02-24

    Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be
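
    The convergence step described in this abstract (keeping the genes that appear in both the CNV-region subnetwork and the previously-studied-gene subnetwork) can be sketched as a set intersection. This is a minimal illustration only: the gene names, the adjacency-dict layout, and the function name convergent_genes are hypothetical and are not taken from the paper.

```python
def convergent_genes(cnv_network, known_network):
    """Return the genes present in both subnetworks, sorted by name.

    Each network is an adjacency dict: gene -> set of interacting genes.
    Only the node sets are compared here; edges are ignored.
    """
    return sorted(set(cnv_network) & set(known_network))


# Hypothetical toy subnetworks (placeholder gene names, not real data).
cnv_subnetwork = {
    "GENE_A": {"GENE_B"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_B"},
}
known_subnetwork = {
    "GENE_B": {"GENE_D"},
    "GENE_C": {"GENE_D"},
    "GENE_D": {"GENE_B", "GENE_C"},
}

shared = convergent_genes(cnv_subnetwork, known_subnetwork)
```

    In this toy input only GENE_B and GENE_C are supported by both sources, so they would be the candidates carried forward.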

  10. Genetic regulation of gene expression in the lung identifies CST3 and CD22 as potential causal genes for airflow obstruction.

    PubMed

    Lamontagne, Maxime; Timens, Wim; Hao, Ke; Bossé, Yohan; Laviolette, Michel; Steiling, Katrina; Campbell, Joshua D; Couture, Christian; Conti, Massimo; Sherwood, Karen; Hogg, James C; Brandsma, Corry-Anke; van den Berge, Maarten; Sandford, Andrew; Lam, Stephen; Lenburg, Marc E; Spira, Avrum; Paré, Peter D; Nickle, David; Sin, Don D; Postma, Dirkje S

    2014-11-01

    COPD is a complex chronic disease with poorly understood pathogenesis. Integrative genomic approaches have the potential to elucidate the biological networks underlying COPD and lung function. We recently combined genome-wide genotyping and gene expression in 1111 human lung specimens to map expression quantitative trait loci (eQTL). To determine causal associations between COPD and lung function-associated single nucleotide polymorphisms (SNPs) and lung tissue gene expression changes in our lung eQTL dataset. We evaluated causality between SNPs and gene expression for three COPD phenotypes: FEV(1)% predicted, FEV(1)/FVC and COPD as a categorical variable. Different models were assessed in the three cohorts independently and in a meta-analysis. SNPs associated with a COPD phenotype and gene expression were subjected to causal pathway modelling and manual curation. In silico analyses evaluated functional enrichment of biological pathways among newly identified causal genes. Biologically relevant causal genes were validated in two separate gene expression datasets of lung tissues and bronchial airway brushings. High reliability causal relations were found in SNP-mRNA-phenotype triplets for FEV(1)% predicted (n=169) and FEV(1)/FVC (n=80). Several genes of potential biological relevance for COPD were revealed. eQTL-SNPs upregulating cystatin C (CST3) and CD22 were associated with worse lung function. Signalling pathways enriched with causal genes included xenobiotic metabolism, apoptosis, protease-antiprotease and oxidant-antioxidant balance. By using integrative genomics and analysing the relationships of COPD phenotypes with SNPs and gene expression in lung tissue, we identified CST3 and CD22 as potential causal genes for airflow obstruction. This study also augmented the understanding of previously described COPD pathways. Published by the BMJ Publishing Group Limited. 

  11. Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies genes specifically overexpressed in tumor epithelia.

    PubMed

    Badea, Liviu; Herlea, Vlad; Dima, Simona Olimpia; Dumitrascu, Traian; Popescu, Irinel

    2008-01-01

    The precise details of pancreatic ductal adenocarcinoma (PDAC) pathogenesis are still insufficiently known, requiring the use of high-throughput methods. However, PDAC is especially difficult to study using microarrays due to its strong desmoplastic reaction, which involves a hyperproliferating stroma that effectively "masks" the contribution of the minority neoplastic epithelial cells. Thus it is not clear which of the genes that have been found differentially expressed between normal and whole tumor tissues are due to the tumor epithelia and which simply reflect the differences in cellular composition. To address this problem, laser microdissection studies have been performed, but these have to deal with much smaller tissue sample quantities and therefore have significantly higher experimental noise. In this paper we combine our own large sample whole-tissue study with a previously published smaller sample microdissection study by Grützmann et al. to identify the genes that are specifically overexpressed in PDAC tumor epithelia. The overlap of this list of genes with other microarray studies of pancreatic cancer as well as with the published literature is impressive. Moreover, we find a number of genes whose over-expression appears to be inversely correlated with patient survival: keratin 7, laminin gamma 2, stratifin, platelet phosphofructokinase, annexin A2, MAP4K4 and OACT2 (MBOAT2), which are all specifically upregulated in the neoplastic epithelia, rather than the tumor stroma. We improve on other microarray studies of PDAC by putting together the higher statistical power due to a larger number of samples with information about cell-type specific expression and patient survival.

  12. Identifying molecular features for prostate cancer with Gleason 7 based on microarray gene expression profiles.

    PubMed

    Bălăcescu, Loredana; Bălăcescu, O; Crişan, N; Fetica, B; Petruţ, B; Bungărdean, Cătălina; Rus, Meda; Tudoran, Oana; Meurice, G; Irimie, Al; Dragoş, N; Berindan-Neagoe, Ioana

    2011-01-01

    Prostate cancer represents the leading cause of cancer among the western male population, with different clinical behavior ranging from indolent to metastatic disease. Although many molecules and deregulated pathways are known, the molecular mechanisms involved in the development of prostate cancer are not fully understood. The aim of this study was to explore the molecular variation underlying prostate cancer, based on microarray analysis and bioinformatics approaches. Normal and prostate cancer tissues were collected by macrodissection from prostatectomy pieces. All prostate cancer specimens used in our study were Gleason score 7. Gene expression microarray (Agilent Technologies) was used for Whole Human Genome evaluation. The bioinformatics and functional analysis were based on Limma and Ingenuity software. The microarray analysis identified 1119 differentially expressed genes between prostate cancer and normal prostate, which were up- or down-regulated at least 2-fold. P-values were adjusted for multiple testing using the Benjamini-Hochberg method with a false discovery rate of 0.01. These genes were analyzed with Ingenuity Pathway Analysis software, and 23 genetic networks were established. Our microarray results provide new information regarding the molecular networks in prostate cancer stratified as Gleason 7. These data highlighted gene expression profiles for better understanding of prostate cancer progression.
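
    The selection rule stated in this abstract (Benjamini-Hochberg adjustment at a false discovery rate of 0.01, combined with an at least 2-fold change) can be sketched in plain Python. This is an illustrative sketch, not the Limma implementation the authors used; the function names, the record layout, and the toy values are assumptions made for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(records, fdr=0.01, min_fold=2.0):
    """Keep genes passing both the FDR and the fold-change cut-off.

    records: list of (gene, fold_change, p_value), where fold_change is a
    positive ratio (tumor / normal), so 0.5 means 2-fold down-regulation.
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    return [
        gene
        for (gene, fold, _), q in zip(records, adjusted)
        if q <= fdr and (fold >= min_fold or fold <= 1.0 / min_fold)
    ]


# Hypothetical toy records (placeholder names and values).
toy = [("G1", 4.0, 0.0001), ("G2", 0.25, 0.002),
       ("G3", 1.2, 0.0005), ("G4", 3.0, 0.2)]
selected = select_genes(toy)
```

    Here G3 is significant but changes by less than 2-fold, and G4 changes 3-fold but is not significant, so only G1 and G2 pass both filters.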

  13. Integrated Network Analysis Identifies Fight-Club Nodes as a Class of Hubs Encompassing Key Putative Switch Genes That Induce Major Transcriptome Reprogramming during Grapevine Development

    PubMed Central

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-01-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
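
    The defining property of the hubs described here, a marked negative correlation with the expression profiles of neighboring genes, can be illustrated by averaging the Pearson correlation between each gene and its network neighbors. The sketch below is a simplified assumption: the degree cut-off, the "average below zero" criterion, and all names and values are hypothetical and do not reproduce the thresholds or additional steps of the published method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_neighbor_correlation(expr, neighbors):
    """Average correlation between each gene and its network neighbors."""
    return {
        gene: sum(pearson(expr[gene], expr[n]) for n in nbrs) / len(nbrs)
        for gene, nbrs in neighbors.items()
        if nbrs
    }


def negative_hubs(expr, neighbors, min_degree=2):
    """Genes with at least min_degree neighbors and a negative average."""
    averages = mean_neighbor_correlation(expr, neighbors)
    return sorted(
        gene for gene, r in averages.items()
        if len(neighbors[gene]) >= min_degree and r < 0
    )


# Hypothetical expression profiles across four samples.
expr = {
    "hub": [1.0, 2.0, 3.0, 4.0],
    "a": [4.0, 3.0, 2.0, 1.0],
    "b": [8.0, 6.5, 4.0, 2.0],
}
neighbors = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"]}
flagged = negative_hubs(expr, neighbors)
```

    In this toy network only "hub" has enough neighbors and is anticorrelated with them on average.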

  14. Analysis of the Prefoldin Gene Family in 14 Plant Species

    PubMed Central

    Cao, Jun

    2016-01-01

    Prefoldin is a hexameric molecular chaperone complex present in all eukaryotes and archaea. The evolution of this gene family in plants is unknown. Here, I identified 140 prefoldin genes in 14 plant species. These prefoldin proteins were divided into nine groups through phylogenetic analysis. Highly conserved gene organization and motif distribution exist in each prefoldin group, implying their functional conservation. I also observed the segmental duplication of the maize prefoldin gene family. Moreover, a few functional divergence sites were identified within each group pair. Functional network analyses identified 78 co-expressed genes, and most of them were involved in carrying, binding and kinase activity. Divergent expression profiles of the maize prefoldin genes were further investigated in different tissues and development periods and under auxin and some abiotic stresses. I also found a few cis-elements responding to abiotic stress and phytohormone in the upstream sequences of the maize prefoldin genes. The results provided a foundation for exploring the characterization of the prefoldin genes in plants and will offer insights for additional functional studies. PMID:27014333

  15. Transcriptome profiling of equine vitamin E deficient neuroaxonal dystrophy identifies upregulation of liver X receptor target genes

    PubMed Central

    Finno, Carrie J.; Bordbari, Matthew H.; Valberg, Stephanie J.; Lee, David; Herron, Josi; Hines, Kelly; Monsour, Tamer; Scott, Erica; Bannasch, Danika L.; Mickelson, James; Xu, Libin

    2016-01-01

    Specific spontaneous heritable neurodegenerative diseases have been associated with lower serum and cerebrospinal fluid α-tocopherol (α-TOH) concentrations. Equine neuroaxonal dystrophy (eNAD) has similar histologic lesions to human ataxia with vitamin E deficiency caused by mutations in the α-TOH transfer protein gene (TTPA). Mutations in TTPA are not present with eNAD and the molecular basis remains unknown. Given the neuropathologic phenotypic similarity of the conditions, we assessed the molecular basis of eNAD by global transcriptome sequencing of the cervical spinal cord. Differential gene expression analysis identified 157 significantly (FDR<0.05) dysregulated transcripts within the spinal cord of eNAD-affected horses. Statistical enrichment analysis identified significant downregulation of the ionotropic and metabotropic group III glutamate receptor, synaptic vesicle trafficking and cholesterol biosynthesis pathways. Gene co-expression analysis identified one module of upregulated genes significantly associated with the eNAD phenotype that included the liver X receptor (LXR) targets CYP7A1, APOE, PLTP and ABCA1. Validation of CYP7A1 and APOE dysregulation was performed in an independent biologic group and CYP7A1 was found to be additionally upregulated in the medulla oblongata of eNAD horses. Evidence of LXR activation supports a role for modulation of oxysterol-dependent LXR transcription factor activity by tocopherols. We hypothesize that the protective role of α-TOH in eNAD may reside in its ability to prevent oxysterol accumulation and subsequent activation of the LXR in order to decrease lipid peroxidation associated neurodegeneration. PMID:27751910

  16. Comparative transcriptome analysis of stylar canal cells identifies novel candidate genes implicated in the self-incompatibility response of Citrus clementina

    PubMed Central

    2012-01-01

    Background: Reproductive biology in citrus is still poorly understood. Although in recent years several efforts have been made to study pollen-pistil interaction and self-incompatibility, little information is available about the molecular mechanisms regulating these processes. Here we report the identification of candidate genes involved in pollen-pistil interaction and self-incompatibility in clementine (Citrus clementina Hort. ex Tan.). These genes have been identified comparing the transcriptomes of laser-microdissected stylar canal cells (SCC) isolated from two genotypes differing for self-incompatibility response ('Comune', a self-incompatible cultivar and 'Monreal', a self-compatible mutation of 'Comune'). Results: The transcriptome profiling of SCC indicated that the differential regulation of few specific, mostly uncharacterized transcripts is associated with the breakdown of self-incompatibility in 'Monreal'. Among them, a novel F-box gene showed a drastic up-regulation both in laser microdissected stylar canal cells and in self-pollinated whole styles with stigmas of 'Comune' in concomitance with the arrest of pollen tube growth. Moreover, we identify a non-characterized gene family as closely associated to the self-incompatibility genetic program activated in 'Comune'. Three different aspartic-acid rich (Asp-rich) protein genes, located in tandem in the clementine genome, were over-represented in the transcriptome of 'Comune'. These genes are tightly linked to a DELLA gene, previously found to be up-regulated in the self-incompatible genotype during pollen-pistil interaction. Conclusion: The highly specific transcriptome survey of the stylar canal cells identified novel genes which have not been previously associated with self-pollen rejection in citrus and in other plant species. Bioinformatic and transcriptional analyses suggested that the mutation leading to self-compatibility in 'Monreal' affected the expression of non-homologous genes located in a

  17. Gene panel sequencing in familial breast/ovarian cancer patients identifies multiple novel mutations also in genes others than BRCA1/2.

    PubMed

    Kraus, Cornelia; Hoyer, Juliane; Vasileiou, Georgia; Wunderle, Marius; Lux, Michael P; Fasching, Peter A; Krumbiegel, Mandy; Uebe, Steffen; Reuter, Miriam; Beckmann, Matthias W; Reis, André

    2017-01-01

    Breast and ovarian cancer (BC/OC) predisposition has been attributed to a number of high- and moderate to low-penetrance susceptibility genes. With the advent of next generation sequencing (NGS) simultaneous testing of these genes has become feasible. In this monocentric study, we report results of panel-based screening of 14 BC/OC susceptibility genes (BRCA1, BRCA2, RAD51C, RAD51D, CHEK2, PALB2, ATM, NBN, CDH1, TP53, MLH1, MSH2, MSH6 and PMS2) in a group of 581 consecutive individuals from a German population with BC and/or OC fulfilling diagnostic criteria for BRCA1 and BRCA2 testing including 179 with a triple-negative tumor. Altogether we identified 106 deleterious mutations in 105 (18%) patients in 10 different genes, including seven different exon deletions. Of these 106 mutations, 16 (15%) were novel and only six were found in BRCA1/2. To further characterize mutations located in or nearby splicing consensus sites we performed RT-PCR analysis which allowed confirmation of pathogenicity in 7 of 9 mutations analyzed. In PALB2, we identified a deleterious variant in six cases. All but one were associated with early onset BC and a positive family history indicating that penetrance for PALB2 mutations is comparable to BRCA2. Overall, extended testing beyond BRCA1/2 identified a deleterious mutation in further 6% of patients. As a downside, 89 variants of uncertain significance were identified highlighting the need for comprehensive variant databases. In conclusion, panel testing yields more accurate information on genetic cancer risk than assessing BRCA1/2 alone and wide-spread testing will help improve penetrance assessment of variants in these risk genes. © 2016 UICC.

  18. Blood pressure loci identified with a gene-centric array.

    PubMed

    Johnson, Toby; Gaunt, Tom R; Newhouse, Stephen J; Padmanabhan, Sandosh; Tomaszewski, Maciej; Kumari, Meena; Morris, Richard W; Tzoulaki, Ioanna; O'Brien, Eoin T; Poulter, Neil R; Sever, Peter; Shields, Denis C; Thom, Simon; Wannamethee, Sasiwarang G; Whincup, Peter H; Brown, Morris J; Connell, John M; Dobson, Richard J; Howard, Philip J; Mein, Charles A; Onipinla, Abiodun; Shaw-Hawkins, Sue; Zhang, Yun; Davey Smith, George; Day, Ian N M; Lawlor, Debbie A; Goodall, Alison H; Fowkes, F Gerald; Abecasis, Gonçalo R; Elliott, Paul; Gateva, Vesela; Braund, Peter S; Burton, Paul R; Nelson, Christopher P; Tobin, Martin D; van der Harst, Pim; Glorioso, Nicola; Neuvrith, Hani; Salvi, Erika; Staessen, Jan A; Stucchi, Andrea; Devos, Nabila; Jeunemaitre, Xavier; Plouin, Pierre-François; Tichet, Jean; Juhanson, Peeter; Org, Elin; Putku, Margus; Sõber, Siim; Veldre, Gudrun; Viigimaa, Margus; Levinsson, Anna; Rosengren, Annika; Thelle, Dag S; Hastie, Claire E; Hedner, Thomas; Lee, Wai K; Melander, Olle; Wahlstrand, Björn; Hardy, Rebecca; Wong, Andrew; Cooper, Jackie A; Palmen, Jutta; Chen, Li; Stewart, Alexandre F R; Wells, George A; Westra, Harm-Jan; Wolfs, Marcel G M; Clarke, Robert; Franzosi, Maria Grazia; Goel, Anuj; Hamsten, Anders; Lathrop, Mark; Peden, John F; Seedorf, Udo; Watkins, Hugh; Ouwehand, Willem H; Sambrook, Jennifer; Stephens, Jonathan; Casas, Juan-Pablo; Drenos, Fotios; Holmes, Michael V; Kivimaki, Mika; Shah, Sonia; Shah, Tina; Talmud, Philippa J; Whittaker, John; Wallace, Chris; Delles, Christian; Laan, Maris; Kuh, Diana; Humphries, Steve E; Nyberg, Fredrik; Cusi, Daniele; Roberts, Robert; Newton-Cheh, Christopher; Franke, Lude; Stanton, Alice V; Dominiczak, Anna F; Farrall, Martin; Hingorani, Aroon D; Samani, Nilesh J; Caulfield, Mark J; Munroe, Patricia B

    2011-12-09

    Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10(-7) study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r(2) = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10(-7) at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  19. Type 2 diabetes mellitus disease risk genes identified by genome wide copy number variation scan in normal populations.

    PubMed

    Prabhanjan, Manasa; Suresh, Raviraj V; Murthy, Megha N; Ramachandra, Nallur B

    2016-03-01

    To identify the role of copy number variations (CNVs) on disease risk genes and its effect on disease phenotypes in type 2 diabetes mellitus (T2DM) in 12 random populations using high throughput arrays. CNV analysis was carried out on a total of 1715 individuals from 12 populations, from ArrayExpress Archive of the European Bioinformatics Institute along with our subjects using Affymetrix Genome Wide SNP 6.0 array. CNV effect on T2DM genes were analyzed using several bioinformatics tools and a molecular protein interaction network was constructed to identify the disease mechanism altered by the CNVs. Analysis showed 34.4% of the total population to be under CNV burden for T2DM, with 83 disease causal and associated genes being under CNV influence. Hotspots were identified on chromosomes 22, 12, 6, 19 and 11. Overlap studies with case cohorts revealed significant disease risk genes such as EGFR, E2F1, PPP1R3A, HLA and TSPAN8. CNVs play a significant role in predisposing T2DM in normal cohorts and contribute to the phenotypic effects. Thus, CNVs should be considered as one of the major contributors in predisposition of the disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. A fast and high performance multiple data integration algorithm for identifying human disease genes

    PubMed Central

    2015-01-01

    Background: Integrating multiple data sources is indispensable in improving disease gene identification. It is not only due to the fact that disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks, but also due to the fact that gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performances and the computational time still should be further improved. Results: In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions: The proposed algorithm not only generates predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment takes only about 12.54 seconds. It is better than many existing algorithms. PMID:26399620
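
    The AUC scores quoted in this abstract summarize how well candidate-gene scores rank true disease genes above non-disease genes. As a minimal sketch of that metric only (not of the authors' Bayesian and logistic-regression scoring, which the abstract does not specify in detail), the AUC can be computed as the fraction of positive-negative pairs ranked in the right order. The function name and the toy scores are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores: predicted values, higher meaning more likely a disease gene.
    labels: 1 for known disease genes, 0 for the others.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four candidate genes.
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    In the toy input three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75; a value of 0.5 would correspond to random ranking.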

  1. Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

    PubMed Central

    Foroushani, Amir B.K.; Brinkman, Fiona S.L.

    2013-01-01

    Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members as equally informative. As a result, at times spurious and misleading pathways are inappropriately identified as statistically significant, solely due to components that they share with the more relevant pathways. Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS) as pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation of several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results. Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories is available for download from http://sigora.googlecode.com/svn/. PMID:24432194
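
    The Pathway-GPS idea described above, pairs of genes that as a combination are specific to a single pathway, can be sketched by enumerating gene pairs per pathway and keeping those that co-occur in exactly one. This is a hedged illustration: the pathway contents and function names are hypothetical, and the code only counts signature hits in a query list; it does not reproduce the statistical test or any weighting used by the SIGORA R package.

```python
from collections import defaultdict
from itertools import combinations


def gene_pair_signatures(pathways):
    """Map each pathway to the gene pairs that co-occur in it alone."""
    owners = defaultdict(set)
    for name, genes in pathways.items():
        for pair in combinations(sorted(genes), 2):
            owners[pair].add(name)
    signatures = defaultdict(set)
    for pair, names in owners.items():
        if len(names) == 1:  # the pair is specific to one pathway
            signatures[next(iter(names))].add(pair)
    return dict(signatures)


def count_signature_hits(signatures, gene_list):
    """Count, per pathway, the signature pairs fully present in the list."""
    query = set(gene_list)
    return {
        name: sum(1 for a, b in pairs if a in query and b in query)
        for name, pairs in signatures.items()
    }


# Hypothetical pathways sharing the genes B and C.
pathways = {"P1": {"A", "B", "C"}, "P2": {"B", "C", "D"}}
signatures = gene_pair_signatures(pathways)
hits = count_signature_hits(signatures, ["A", "B", "C"])
```

    Because the pair (B, C) occurs in both toy pathways it is not a signature of either, so a query containing A, B and C supports P1 through two pairs and gives P2 no support, even though two of P2's three genes are in the list.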

  2. ICan: an integrated co-alteration network to identify ovarian cancer-related genes.

    PubMed

    Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

    2015-01-01

    Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.

  3. Identifying differentially expressed genes in cancer patients using a non-parameter Ising model.

    PubMed

    Li, Xumeng; Feltus, Frank A; Sun, Xiaoqian; Wang, James Z; Luo, Feng

    2011-10-01

    Identification of genes and pathways involved in diseases and physiological conditions is a major task in systems biology. In this study, we developed a novel non-parameter Ising model to integrate protein-protein interaction network and microarray data for identifying differentially expressed (DE) genes. We also proposed a simulated annealing algorithm to find the optimal configuration of the Ising model. The Ising model was applied to two breast cancer microarray data sets. The results showed that more cancer-related DE sub-networks and genes were identified by the Ising model than those by the Markov random field model. Furthermore, cross-validation experiments showed that DE genes identified by Ising model can improve classification performance compared with DE genes identified by Markov random field model. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene.

    PubMed

    Nicolas, Aude; Kenna, Kevin P; Renton, Alan E; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A; Kenna, Brendan J; Nalls, Mike A; Keagle, Pamela; Rivera, Alberto M; van Rheenen, Wouter; Murphy, Natalie A; van Vugt, Joke J F A; Geiger, Joshua T; Van der Spek, Rick A; Pliner, Hannah A; Shankaracharya; Smith, Bradley N; Marangi, Giuseppe; Topp, Simon D; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D; Kenna, Aoife; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B; Gitler, Aaron D; Harris, Tim; Myers, Richard M; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Svendsen, Clive N; Thompson, Leslie M; Van Eyk, Jennifer E; Berry, James D; Miller, Timothy M; Kolb, Stephen J; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P; Sorarù, Gianni; Cereda, Cristina; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W; Sidle, Katie C; Malaspina, Andrea; Hardy, John; Singleton, Andrew B; Johnson, Janel O; Arepalli, Sampath; Sapp, Peter C; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; Ten Asbroek, Anneloor L M A; Muñoz-Blanco, José Luis; 
Hernandez, Dena G; Ding, Jinhui; Gibbs, J Raphael; Scholz, Sonja W; Floeter, Mary Kay; Campbell, Roy H; Landi, Francesco; Bowser, Robert; Pulst, Stefan M; Ravits, John M; MacGowan, Daniel J L; Kirby, Janine; Pioro, Erik P; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L; Brady, Christopher B; Kowall, Neil W; Troncoso, Juan C; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D; Kamel, Freya; Van Den Bosch, Ludo; Baloh, Robert H; Strom, Tim M; Meitinger, Thomas; Shatunov, Aleksey; Van Eijk, Kristel R; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell L; Van Es, Michael A; Weber, Markus; Boylan, Kevin B; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen E; Basak, A Nazli; Mora, Jesús S; Drory, Vivian E; Shaw, Pamela J; Turner, Martin R; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L; Fifita, Jennifer A; Nicholson, Garth A; Blair, Ian P; Rouleau, Guy A; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W; Maragakis, Nicholas J; Rothstein, Jeffrey D; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A; Feldman, Eva L; Gibson, Summer B; Taroni, Franco; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C; Andersen, Peter M; Weishaupt, Jochen H; Camu, William; Trojanowski, John Q; Van Deerlin, Vivianna M; Brown, Robert H; van den Berg, Leonard H; Veldink, Jan H; Harms, Matthew B; Glass, Jonathan D; Stone, David J; Tienari, Pentti; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E; Traynor, Bryan J; Landers, John E

    2018-03-21

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494 controls. Through both approaches, we identified kinesin family member 5A (KIF5A) as a novel gene associated with ALS. Interestingly, mutations predominantly in the N-terminal motor domain of KIF5A are causative for two neurodegenerative diseases: hereditary spastic paraplegia (SPG10) and Charcot-Marie-Tooth type 2 (CMT2). In contrast, ALS-associated mutations are primarily located at the C-terminal cargo-binding tail domain and patients harboring loss-of-function mutations displayed an extended survival relative to typical ALS cases. Taken together, these results broaden the phenotype spectrum resulting from mutations in KIF5A and strengthen the role of cytoskeletal defects in the pathogenesis of ALS. Copyright © 2018 Elsevier Inc. All rights reserved.

  5. Linking the salt transcriptome with physiological responses of a salt-resistant Populus species as a strategy to identify genes important for stress acclimation.

    PubMed

    Brinker, Monika; Brosché, Mikael; Vinocur, Basia; Abo-Ogiala, Atef; Fayyaz, Payam; Janz, Dennis; Ottow, Eric A; Cullmann, Andreas D; Saborowski, Joachim; Kangasjärvi, Jaakko; Altman, Arie; Polle, Andrea

    2010-12-01

    To investigate early salt acclimation mechanisms in a salt-tolerant poplar species (Populus euphratica), the kinetics of molecular, metabolic, and physiological changes during a 24-h salt exposure were measured. Three distinct phases of salt stress were identified by analyses of the osmotic pressure and the shoot water potential: dehydration, salt accumulation, and osmotic restoration associated with ionic stress. The duration and intensity of these phases differed between leaves and roots. Transcriptome analysis using P. euphratica-specific microarrays revealed clusters of coexpressed genes in these phases, with only 3% overlapping salt-responsive genes in leaves and roots. Acclimation of cellular metabolism to high salt concentrations involved remodeling of amino acid and protein biosynthesis and increased expression of molecular chaperones (dehydrins, osmotin). Leaves suffered initially from dehydration, which resulted in changes in transcript levels of mitochondrial and photosynthetic genes, indicating adjustment of energy metabolism. Initially, decreases in stress-related genes were found, whereas increases occurred only when leaves had restored the osmotic balance by salt accumulation. Comparative in silico analysis of the poplar stress regulon with Arabidopsis (Arabidopsis thaliana) orthologs was used as a strategy to reduce the number of candidate genes for functional analysis. Analysis of Arabidopsis knockout lines identified a lipocalin-like gene (AtTIL) and a gene encoding a protein with previously unknown functions (AtSIS) to play roles in salt tolerance. In conclusion, by dissecting the stress transcriptome of tolerant species, novel genes important for salt endurance can be identified.

  6. A recellularized human colon model identifies cancer driver genes

    PubMed Central

    Chen, Huanhuan Joyce; Wei, Zhubo; Sun, Jian; Bhattacharya, Asmita; Savage, David J; Serda, Rita; Mackeyev, Yuri; Curley, Steven A.; Bu, Pengcheng; Wang, Lihua; Chen, Shuibing; Cohen-Gould, Leona; Huang, Emina; Shen, Xiling; Lipkin, Steven M.; Copeland, Neal G.; Jenkins, Nancy A.; Shuler, Michael L.

    2016-01-01

    Refined cancer models are needed to bridge the gap between cell-line, animal and clinical research. Here we describe the engineering of an organotypic colon cancer model by recellularization of a native human matrix that contains cell-populated mucosa and an intact muscularis mucosa layer. This ex vivo system recapitulates the pathophysiological progression from APC-mutant neoplasia to submucosal invasive tumor. We used it to perform a Sleeping Beauty transposon mutagenesis screen to identify genes that cooperate with mutant APC in driving invasive neoplasia. 38 candidate invasion driver genes were identified, 17 of which have been previously implicated in colorectal cancer progression, including TCF7L2, TWIST2, MSH2, DCC and EPHB1/2. Six invasion driver genes that to our knowledge have not been previously described were validated in vitro using cell proliferation, migration and invasion assays, and ex vivo using recellularized human colon. These results demonstrate the utility of our organoid model for studying cancer biology. PMID:27398792

  7. Co-expression analysis identifies CRC and AP1 the regulator of Arabidopsis fatty acid biosynthesis.

    PubMed

    Han, Xinxin; Yin, Linlin; Xue, Hongwei

    2012-07-01

    Fatty acids (FAs) play crucial roles in signal transduction and plant development; however, the regulation of FA metabolism is still poorly understood. To study the relevant regulatory network, fifty-eight FA biosynthesis genes including de novo synthases, desaturases and elongases were selected as "guide genes" to construct the co-expression network. Calculation of the correlation between all Arabidopsis thaliana (L.) genes and each guide gene by Arabidopsis co-expression data mining tools (ACT) identified 797 candidate FA-correlated genes. Gene ontology (GO) analysis of these co-expressed genes showed they are tightly correlated to photosynthesis and carbohydrate metabolism, and function in many processes. Interestingly, 63 transcription factors (TFs) were identified as candidate FA biosynthesis regulators and 8 TF families are enriched. Two TF genes, CRC and AP1, both correlating with 8 FA guide genes, were further characterized. Analyses of the ap1 and crc mutants showed altered total FA composition in mature seeds. The contents of palmitoleic acid, stearic acid, arachidic acid and eicosadienoic acid are decreased, whereas that of oleic acid is increased in ap1 and crc seeds, which is consistent with the qRT-PCR analysis revealing the suppressed expression of the corresponding guide genes. In addition, yeast one-hybrid analysis and electrophoretic mobility shift assay (EMSA) revealed that CRC can bind to the promoter regions of KCS7 and KCS15, indicating that CRC may directly regulate FA biosynthesis. © 2012 Institute of Botany, Chinese Academy of Sciences.
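
    The guide-gene idea in this abstract can be illustrated with a minimal sketch: correlate every gene's expression profile with each guide gene and keep the genes that pass a cutoff. The Pearson coefficient, the 0.8 threshold, the function names and the toy profiles are all assumptions for illustration; the abstract does not state which correlation measure or cutoff the ACT tool applies.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def coexpressed_candidates(expression, guide_genes, threshold=0.8):
    """Map each non-guide gene to the guide genes it correlates with (r >= threshold)."""
    hits = {}
    for gene, profile in expression.items():
        if gene in guide_genes:
            continue
        partners = [g for g in guide_genes
                    if pearson(profile, expression[g]) >= threshold]
        if partners:
            hits[gene] = partners
    return hits

# Hypothetical expression profiles across five conditions (illustrative values only).
expression = {
    "guide_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "guide_B": [5.0, 3.0, 4.0, 1.0, 2.0],
    "gene_X": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with guide_A
    "gene_Y": [3.0, 3.1, 2.9, 3.0, 3.2],  # roughly flat
}
candidates = coexpressed_candidates(expression, ["guide_A", "guide_B"])
```

    Counting the length of each partner list gives the number of guide genes a candidate correlates with, which is the kind of tally the abstract reports for CRC and AP1.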

  8. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    PubMed

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ² analysis (PD and DH, P < 2.2e-6; PD and INF, P = 6.2e-10; INF and DH, P = .0036). Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Identification of suitable genes contributes to lung adenocarcinoma clustering by multiple meta-analysis methods.

    PubMed

    Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang

    2016-09-01

    With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular multiple meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. Bioconductor was used for preliminary data preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes remained for further analysis. The classification efficiency analysis showed that DE genes identified by the sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluation of DE genes identified by different meta-analysis methods in classification efficiency provided a new perspective on the choice of a suitable method in a given application. Different meta-analysis methods perform differently, so the choice of method for a particular study deserves careful overall consideration. © 2015 John Wiley & Sons Ltd.
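
    As a rough illustration of two ideas named in this abstract, the sketch below ranks genes within each study by p-value and sums the ranks across studies, then computes accuracy, sensitivity and specificity from confusion-matrix counts. It assumes every gene is measured in every study and ignores tied p-values; the function names, toy p-values and counts are hypothetical and are not taken from the study.

```python
def sum_of_ranks(pvalues_by_study):
    """Rank genes within each study by p-value (1 = smallest) and sum the ranks per gene."""
    totals = {}
    for pvalues in pvalues_by_study:
        ordered = sorted(pvalues, key=pvalues.get)
        for rank, gene in enumerate(ordered, start=1):
            totals[gene] = totals.get(gene, 0) + rank
    return totals

def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical per-study p-values for three genes (illustrative values only).
studies = [
    {"g1": 0.001, "g2": 0.20, "g3": 0.04},
    {"g1": 0.010, "g2": 0.50, "g3": 0.03},
    {"g1": 0.200, "g2": 0.90, "g3": 0.001},
]
totals = sum_of_ranks(studies)
ranking = sorted(totals, key=totals.get)  # smallest rank sum first

# Hypothetical clustering outcome: 100 cases and 50 controls.
accuracy, sensitivity, specificity = classification_metrics(tp=90, fn=10, tn=45, fp=5)
```

    A small rank sum means a gene is consistently near the top of every study's list, which is why the method tolerates differences in scale between platforms.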

  10. Combining suppressive subtractive hybridization and cDNA microarrays to identify dietary phosphorus-responsive genes of the rainbow trout (Oncorhynchus mykiss) kidney.

    PubMed

    Lake, Jennifer; Gravel, Catherine; Koko, Gabriel Koffi D; Robert, Claude; Vandenberg, Grant W

    2010-03-01

    Phosphorus (P)-responsive genes and how they regulate renal adaptation to phosphorus-deficient diets in animals, including fish, are not well understood. RNA abundance profiling using cDNA microarrays is an efficient approach to study nutrient-gene interactions and identify these dietary P-responsive genes. To test the hypothesis that dietary P-responsive genes are differentially expressed in fish fed varying P levels, rainbow trout were fed a practical high-P diet (R20: 0.96% P) or a low-P diet (R0: 0.38% P) for 7 weeks. The differentially-expressed genes between dietary groups were identified and compared from the kidney by combining suppressive subtractive hybridization (SSH) with cDNA microarray analysis. A number of genes were confirmed by real-time PCR, and correlated with plasma and bone P concentrations. Approximately 54 genes were identified as potential dietary P-responsive after 7 weeks on a diet deficient in P according to cDNA microarray analysis. Of 18 selected genes, 13 genes were confirmed to be P-responsive at 7 weeks by real-time PCR analysis, including: iNOS, cytochrome b, cytochrome c oxidase subunit II, alpha-globin I, beta-globin, ATP synthase, hyperosmotic protein 21, COL1A3, Nkef, NDPK, glucose phosphate isomerase 1, Na+/H+ exchange protein and GDP dissociation inhibitor 2. Many of these dietary P-responsive genes responded in a moderate way (R0/R20 ratio: <2-3 or >0.5) and in a transient manner to dietary P limitation. In summary, renal adaptation to dietary P deficiency in trout involves changes in the expression of several genes, suggesting a profile of metabolic stress, since many of these differentially-expressed candidates are associated with the cellular adaptive responses. Crown Copyright 2009. Published by Elsevier Inc. All rights reserved.

  11. Microarray Meta-Analysis Identifies Acute Lung Injury Biomarkers in Donor Lungs That Predict Development of Primary Graft Failure in Recipients

    PubMed Central

    Haitsma, Jack J.; Furmli, Suleiman; Masoom, Hussain; Liu, Mingyao; Imai, Yumiko; Slutsky, Arthur S.; Beyene, Joseph; Greenwood, Celia M. T.; dos Santos, Claudia

    2012-01-01

    Objectives: To perform a meta-analysis of gene expression microarray data from animal studies of lung injury, and to identify an injury-specific gene expression signature capable of predicting the development of lung injury in humans. Methods: We performed a microarray meta-analysis using 77 microarray chips across six platforms, two species and different animal lung injury models exposed to lung injury with and/or without mechanical ventilation. Individual gene chips were classified and grouped based on the strategy used to induce lung injury. Effect size (change in gene expression) was calculated between non-injurious and injurious conditions comparing two main strategies to pool chips: (1) one-hit and (2) two-hit lung injury models. A random effects model was used to integrate individual effect sizes calculated from each experiment. Classification models were built using the gene expression signatures generated by the meta-analysis to predict the development of lung injury in human lung transplant recipients. Results: Two injury-specific lists of differentially expressed genes generated from our meta-analysis of lung injury models were validated using external data sets and prospective data from animal models of ventilator-induced lung injury (VILI). Pathway analysis of gene sets revealed that both new and previously implicated VILI-related pathways are enriched with differentially regulated genes. A classification model based on gene expression signatures identified in animal models of lung injury predicted development of primary graft failure (PGF) in lung transplant recipients with greater than 80% accuracy based upon injury profiles from transplant donors. We also found that better classifier performance can be achieved by using meta-analysis to identify differentially-expressed genes than using single study-based differential analysis. Conclusion: Taken together, our data suggests that microarray analysis of gene expression data allows for the detection of
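
    The Methods of the preceding record mention integrating per-experiment effect sizes with a random effects model. One common way to do this is the DerSimonian-Laird estimator, sketched below; the abstract does not say which estimator the authors used, so this choice, the function name and the toy inputs are assumptions.

```python
from math import sqrt

def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled effect, standard error, tau^2)."""
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical effect sizes for one gene in two experiments (illustrative values only).
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.1, 0.1])
```

    When the experiments agree closely, tau^2 shrinks to zero and the result reduces to the ordinary inverse-variance (fixed-effect) average.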

  12. Genomic analysis of primordial dwarfism reveals novel disease genes.

    PubMed

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  13. Genomic analysis of primordial dwarfism reveals novel disease genes

    PubMed Central

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N.; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S.

    2014-01-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis. PMID:24389050

  14. Leveraging network analytics to infer patient syndrome and identify causal genes in rare disease cases.

    PubMed

    Krämer, Andreas; Shah, Sohela; Rebres, Robert Anthony; Tang, Susan; Richards, Daniel Rene

    2017-08-11

    Next-generation sequencing is widely used to identify disease-causing variants in patients with rare genetic disorders. Identifying those variants from whole-genome or exome data can be both scientifically challenging and time consuming. A significant amount of time is spent on variant annotation and interpretation. Fully or partly automated solutions are therefore needed to streamline and scale this process. We describe Phenotype Driven Ranking (PDR), an algorithm integrated into Ingenuity Variant Analysis, that uses observed patient phenotypes to prioritize diseases and genes in order to expedite causal-variant discovery. Our method is based on a network of phenotype-disease-gene relationships derived from the QIAGEN Knowledge Base, which allows for efficient computational association of phenotypes to implicated diseases, and also enables scoring and ranking. We have demonstrated the utility and performance of PDR by applying it to a number of clinical rare-disease cases, where the true causal gene was known beforehand. It is also shown that PDR compares favorably to a representative alternative tool.

  15. GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

    PubMed Central

    Culhane, Aedín C.; Schröder, Markus S.; Sultana, Razvan; Picard, Shaita C.; Martinelli, Enzo N.; Kelly, Caroline; Haibe-Kains, Benjamin; Kapushesky, Misha; St Pierre, Anne-Alyssa; Flahive, William; Picard, Kermshlise C.; Gusenleitner, Daniel; Papenhausen, Gerald; O'Connor, Niall; Correll, Mick; Quackenbush, John

    2012-01-01

    GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease to the community so they can compare the predictive power of gene signatures or use these in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication quality image. All data in GeneSigDB can be downloaded in numerous formats including .gmt file format for gene set enrichment analysis or as a R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org. PMID:22110038

  16. 1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function

    PubMed Central

    Gorski, Mathias; van der Most, Peter J.; Teumer, Alexander; Chu, Audrey Y.; Li, Man; Mijatovic, Vladan; Nolte, Ilja M.; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F.; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P.; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C.; Curhan, Gary C.; d’Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H.; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J.; Harris, Tamara B.; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G.; Homuth, Georg; Hu, Frank B.; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K.; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J.; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J. 
F.; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A.; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J.; Olden, Matthias; WJH Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P.; Probst-Hensch, Nicole; Raitakari, Olli T.; Rettig, Rainer; Ridker, Paul M.; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E.; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J.; Sedaghat, Sanaz; Smith, Albert V.; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G.; Ulivi, Sheila; Viikari, Jorma S.; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I.; Tromp, Gerard; Snieder, Harold; Heid, Iris M.; Fox, Caroline S.; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A.; Fuchsberger, Christian

    2017-01-01

    HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10^-8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples. PMID:28452372
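
    The FDR < 0.05 cutoffs quoted in this abstract can be illustrated with the Benjamini-Hochberg step-up adjustment, a widely used way to control the false discovery rate. The abstract does not name the FDR procedure that was applied, so Benjamini-Hochberg, the function name and the toy p-values are assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical gene-level p-values (illustrative values only).
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
```

    Note that the genome-wide threshold of 5 × 10^-8 used for single variants is a fixed cutoff of a different kind and is not derived from this adjustment.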

  17. 1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.

    PubMed

    Gorski, Mathias; van der Most, Peter J; Teumer, Alexander; Chu, Audrey Y; Li, Man; Mijatovic, Vladan; Nolte, Ilja M; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C; Curhan, Gary C; d'Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J; Harris, Tamara B; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G; Homuth, Georg; Hu, Frank B; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J F; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J; Olden, Matthias; Wjh Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P; Probst-Hensch, Nicole; Raitakari, Olli T; Rettig, Rainer; Ridker, Paul M; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J; Sedaghat, Sanaz; Smith, Albert V; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G; Ulivi, Sheila; Viikari, Jorma S; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; 
Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I; Tromp, Gerard; Snieder, Harold; Heid, Iris M; Fox, Caroline S; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A; Fuchsberger, Christian

    2017-04-28

    HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10^-8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples.

  18. Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes

    PubMed Central

    Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise

    2009-01-01

    Background: Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results: Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high level of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics. Conversely, only a few proteins encoded by genes

  19. Bioinformatics Analysis Reveals Genes Involved in the Pathogenesis of Ameloblastoma and Keratocystic Odontogenic Tumor.

    PubMed

    Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição

    2016-01-01

    Pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. The gene list was expanded, and the gene interactions network was mapped using the STRING software. "Weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by K-means method and Kruskal-Wallis test was used (P<0.001). Total interactions score (TIS) was also calculated using all interaction data generated by the STRING database, in order to achieve global connectivity for each gene. The topological and ontological analyses were performed using Cytoscape software and BinGO plugin. Literature review data was used to corroborate the bioinformatics data. CDK1 was identified as leader gene for AM. In the KCOT group, results show PCNA and TP53. Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyses were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.
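
    A minimal sketch of the "weighted number of links" idea follows, assuming WNL is the sum of the interaction scores on all links touching a gene and that the gene with the highest WNL is taken as the leader gene. The abstract does not define the calculation in detail, and the edge list, scores and gene labels below are hypothetical placeholders rather than STRING output.

```python
from collections import defaultdict

def weighted_number_of_links(edges):
    """Sum the interaction scores of all links touching each gene."""
    wnl = defaultdict(float)
    for gene_a, gene_b, score in edges:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return dict(wnl)

# Hypothetical interaction network: (gene, gene, confidence score between 0 and 1).
edges = [
    ("G1", "G2", 0.9),
    ("G1", "G3", 0.7),
    ("G2", "G3", 0.4),
    ("G1", "G4", 0.5),
]
wnl = weighted_number_of_links(edges)
leader = max(wnl, key=wnl.get)  # gene with the highest WNL
```

    The study additionally groups genes by K-means on these scores; that clustering step is omitted here.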

  20. A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast

    DTIC Science & Technology

    2004-05-01

    Award Number: DAMD17-03-1-0232. Title: A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast. Author(s): Craig Bennett, Ph.D. Abstract: We are using the yeast Saccharomyces cerevisiae to identify new cancer gene targets that interact with the

  1. A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies.

    PubMed

    Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne

    2014-01-01

    Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. 
    Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules

  2. Bioinformatics analysis of differentially expressed gene profiles associated with systemic lupus erythematosus

    PubMed Central

    Wu, Chengjiang; Zhao, Yangjing; Lin, Yu; Yang, Xinxin; Yan, Meina; Min, Yujiao; Pan, Zihui; Xia, Sheng; Shao, Qixiang

    2018-01-01

    DNA microarray and high-throughput sequencing have been widely used to identify the differentially expressed genes (DEGs) in systemic lupus erythematosus (SLE). However, the big data from gene microarrays are also challenging to work with in terms of analysis and processing. The present study combined data from the microarray expression profile (GSE65391) and bioinformatics analysis to identify the key genes and cellular pathways in SLE. Gene ontology (GO) and cellular pathway enrichment analyses of DEGs were performed to investigate significantly enriched pathways. A protein-protein interaction network was constructed to determine the key genes in the occurrence and development of SLE. A total of 310 DEGs were identified in SLE, including 193 upregulated genes and 117 downregulated genes. GO analysis revealed that the most significant biological process of DEGs was immune system process. Kyoto Encyclopedia of Genes and Genomes pathway analysis showed that these DEGs were enriched in signaling pathways associated with the immune system, including the RIG-I-like receptor signaling pathway, intestinal immune network for IgA production, antigen processing and presentation and the toll-like receptor signaling pathway. The current study screened the top 10 genes with higher degrees as hub genes, which included 2′-5′-oligoadenylate synthetase 1, MX dynamin like GTPase 2, interferon induced protein with tetratricopeptide repeats 1, interferon regulatory factor 7, interferon induced with helicase C domain 1, signal transducer and activator of transcription 1, ISG15 ubiquitin-like modifier, DExD/H-box helicase 58, interferon induced protein with tetratricopeptide repeats 3 and 2′-5′-oligoadenylate synthetase 2. Module analysis revealed that these hub genes were also involved in the RIG-I-like receptor signaling, cytosolic DNA-sensing, toll-like receptor signaling and ribosome biogenesis pathways. In addition, these hub genes, from different probe sets, exhibited
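
    The hub-gene step in this record (taking the genes with the highest degree in a protein-protein interaction network) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `top_hub_genes`, the placeholder gene labels and the toy edge list are all hypothetical.

```python
def top_hub_genes(edges, k=10):
    """Return the k (gene, degree) pairs with the highest degree.

    `edges` is an iterable of (gene_a, gene_b) pairs describing an
    undirected interaction network. Duplicate edges and self-loops
    are ignored so that each partner is counted once.
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # a self-loop does not add an interaction partner
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degree = {gene: len(partners) for gene, partners in neighbors.items()}
    # Sort by degree (descending), then by name so that ties are deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical toy network; labels and edges are invented for illustration.
toy_edges = [
    ("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"),
    ("geneB", "geneC"), ("geneD", "geneE"), ("geneB", "geneA"),
]
hubs = top_hub_genes(toy_edges, k=3)
```

    Degree is only one of several centrality measures used to nominate hub genes; the abstract names degree, so the sketch uses it as well.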

  3. ICan: An Integrated Co-Alteration Network to Identify Ovarian Cancer-Related Genes

    PubMed Central

    Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

    2015-01-01

    Background: Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. Results: We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method, CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). Conclusion: In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data. PMID:25803614

  4. Transcriptome analysis of nitric oxide-responsive genes in upland cotton (Gossypium hirsutum).

    PubMed

    Huang, Juan; Wei, Hengling; Li, Libei; Yu, Shuxun

    2018-01-01

    Nitric oxide (NO) is an important signaling molecule with diverse physiological functions in plants. It is therefore important to characterize the downstream genes and signal transduction networks modulated by NO. Here, we identified 1,932 differentially expressed genes (DEGs) responding to NO in upland cotton using high throughput tag sequencing. The results of quantitative real-time polymerase chain reaction (qRT-PCR) analysis of 25 DEGs showed good consistency. Gene Ontology (GO) and KEGG pathway were analyzed to gain a better understanding of these DEGs. We identified 157 DEGs belonging to 36 transcription factor (TF) families and 72 DEGs related to eight plant hormones, among which several TF families and hormones were involved in stress responses. Hydrogen peroxide and malondialdehyde (MDA) contents were increased, as were related genes, after treatment with sodium nitroprusside (SNP) (an NO donor), suggesting a role for NO in the plant stress response. Finally, we compared the current and previous data, which indicated a massive number of NO-responsive genes at the large-scale transcriptome level. This study evaluated the landscape of NO-responsive genes in cotton and identified the involvement of NO in the stress response. Some of the identified DEGs represent good candidates for further functional analysis in cotton.

  5. Identification of key microRNAs and genes in preeclampsia by bioinformatics analysis

    PubMed Central

    Luo, Shouling; Cao, Nannan; Tang, Yao; Gu, Weirong

    2017-01-01

    Preeclampsia is a leading cause of perinatal maternal–foetal mortality and morbidity. The aim of this study is to identify the key microRNAs and genes in preeclampsia and uncover their potential functions. We downloaded the miRNA expression profile of GSE84260 and the gene expression profile of GSE73374 from the Gene Expression Omnibus database. Differentially expressed miRNAs and genes were identified and compared to miRNA-target information from MiRWalk 2.0, and a total of 65 differentially expressed miRNAs (DEMIs), including 32 up-regulated miRNAs and 33 down-regulated miRNAs, and 91 differentially expressed genes (DEGs), including 83 up-regulated genes and 8 down-regulated genes, were identified. The pathway enrichment analyses of the DEMIs showed that the up-regulated DEMIs were enriched in the Hippo signalling pathway and MAPK signalling pathway, and the down-regulated DEMIs were enriched in HTLV-I infection and miRNAs in cancers. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses of the DEGs were performed using Multifaceted Analysis Tool for Human Transcriptome. The up-regulated DEGs were enriched in biological processes (BPs), including the response to cAMP, response to hydrogen peroxide and cell-cell adhesion mediated by integrin; no enrichment of down-regulated DEGs was identified. KEGG analysis showed that the up-regulated DEGs were enriched in the Hippo signalling pathway and pathways in cancer. A PPI network of the DEGs was constructed by using Cytoscape software, and FOS, STAT1, MMP14, ITGB1, VCAN, DUSP1, LDHA, MCL1, MET, and ZFP36 were identified as the hub genes. The current study illustrates a characteristic microRNA profile and gene profile in preeclampsia, which may contribute to the interpretation of the progression of preeclampsia and provide novel biomarkers and therapeutic targets for preeclampsia. PMID:28594854

  6. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background: Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results: We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman Sheldon syndrome. Conclusions: Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies. VEST is
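
    The gene-level aggregation described in this record can be illustrated with a small Python sketch. The abstract does not say how the variant p-values are combined, so Fisher's method is used here purely as a stand-in assumption; the function names, gene labels and p-values are hypothetical and this is not the VEST implementation.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (an assumption here).

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    survival function has the closed form used below.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    k = len(p_values)
    # Clamp to a tiny positive value so that log(0) cannot occur.
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def rank_genes(variant_p_by_gene):
    """Return (gene, combined p) pairs sorted from most to least significant."""
    scores = {g: fisher_combined_p(ps) for g, ps in variant_p_by_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical per-case variant p-values for three placeholder genes.
toy_p_values = {
    "geneA": [0.001, 0.002, 0.01],
    "geneB": [0.4, 0.6, 0.5],
    "geneC": [0.04, 0.2, 0.03],
}
ranking = rank_genes(toy_p_values)
```

    Fisher's method assumes the combined p-values are independent, which may not hold for variants observed in related cases; other combination rules would slot into `rank_genes` in the same way.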

  7. A genome-wide analysis of the expansin genes in Malus × Domestica.

    PubMed

    Zhang, Shizhong; Xu, Ruirui; Gao, Zheng; Chen, Changtian; Jiang, Zesheng; Shu, Huairui

    2014-04-01

    Expansins were first identified as cell wall-loosening proteins; they are involved in regulating cell expansion, fruit softening and many other physiological processes. However, our knowledge about the expansin family members and their evolutionary relationships in fruit trees, such as apple, is limited. In this study, we identified 41 members of the expansin gene family in the genome of apple (Malus × Domestica L. Borkh). Phylogenetic analysis revealed that expansin genes in apple could be divided into four subfamilies according to their gene structures and protein motifs. By phylogenetic analysis of the expansins in five plants (Arabidopsis, rice, poplar, grape and apple), the expansins were divided into 17 subgroups. Our gene duplication analysis revealed that whole-genome and chromosomal-segment duplications contributed to the expansion of Mdexpansins. The microarray and expressed sequence tag (EST) data showed that 34 Mdexpansin genes could be divided into five groups by the EST analysis; they may also play different roles during fruit development. An expression model for MdEXPA16 and MdEXPA20 showed their potential role in developing fruit. Overall, our study provides useful data and novel insights into the functions and regulatory mechanisms of the expansin genes in apple, as well as their evolution and divergence. As the first step towards genome-wide analysis of the expansin genes in apple, our results have established a solid foundation for future studies on the function of the expansin genes in fruit development.

  8. In silico identification and analysis of phytoene synthase genes in plants.

    PubMed

    Han, Y; Zheng, Q S; Wei, Y P; Chen, J; Liu, R; Wan, H J

    2015-08-14

    In this study, we examined phytoene synthetase (PSY), the first key limiting enzyme in the synthesis of carotenoids and catalyzing the formation of geranylgeranyl pyrophosphate in terpenoid biosynthesis. We used known amino acid sequences of the PSY gene in tomato plants to conduct a genome-wide search and identify putative candidates in 34 sequenced plants. A total of 101 homologous genes were identified. Phylogenetic analysis revealed that PSY evolved independently in algae as well as monocotyledonous and dicotyledonous plants. Our results showed that the amino acid structures exhibited 5 motifs (motifs 1 to 5) in algae and those in higher plants were highly conserved. The PSY gene structures showed that the number of introns in algae varied widely, while the number of introns in higher plants was 4 to 5. Identification of PSY genes in plants and the analysis of the gene structure may provide a theoretical basis for studying evolutionary relationships in future analyses.

  9. High-Throughput Genetic Screens Identify a Large and Diverse Collection of New Sporulation Genes in Bacillus subtilis.

    PubMed

    Meeske, Alexander J; Rodrigues, Christopher D A; Brady, Jacqueline; Lim, Hoong Chuin; Bernhardt, Thomas G; Rudner, David Z

    2016-01-01

    The differentiation of the bacterium Bacillus subtilis into a dormant spore is among the most well-characterized developmental pathways in biology. Classical genetic screens performed over the past half century identified scores of factors involved in every step of this morphological process. More recently, transcriptional profiling uncovered additional sporulation-induced genes required for successful spore development. Here, we used transposon-sequencing (Tn-seq) to assess whether there were any sporulation genes left to be discovered. Our screen identified 133 out of the 148 genes with known sporulation defects. Surprisingly, we discovered 24 additional genes that had not been previously implicated in spore formation. To investigate their functions, we used fluorescence microscopy to survey early, middle, and late stages of differentiation of null mutants from the B. subtilis ordered knockout collection. This analysis identified mutants that are delayed in the initiation of sporulation, defective in membrane remodeling, and impaired in spore maturation. Several mutants had novel sporulation phenotypes. We performed in-depth characterization of two new factors that participate in cell-cell signaling pathways during sporulation. One (SpoIIT) functions in the activation of σE in the mother cell; the other (SpoIIIL) is required for σG activity in the forespore. Our analysis also revealed that as many as 36 sporulation-induced genes with no previously reported mutant phenotypes are required for timely spore maturation. Finally, we discovered a large set of transposon insertions that trigger premature initiation of sporulation. Our results highlight the power of Tn-seq for the discovery of new genes and novel pathways in sporulation and, combined with the recently completed null mutant collection, open the door for similar screens in other, less well-characterized processes.

  10. High-Throughput Genetic Screens Identify a Large and Diverse Collection of New Sporulation Genes in Bacillus subtilis

    PubMed Central

    Brady, Jacqueline; Lim, Hoong Chuin; Bernhardt, Thomas G.; Rudner, David Z.

    2016-01-01

    The differentiation of the bacterium Bacillus subtilis into a dormant spore is among the most well-characterized developmental pathways in biology. Classical genetic screens performed over the past half century identified scores of factors involved in every step of this morphological process. More recently, transcriptional profiling uncovered additional sporulation-induced genes required for successful spore development. Here, we used transposon-sequencing (Tn-seq) to assess whether there were any sporulation genes left to be discovered. Our screen identified 133 out of the 148 genes with known sporulation defects. Surprisingly, we discovered 24 additional genes that had not been previously implicated in spore formation. To investigate their functions, we used fluorescence microscopy to survey early, middle, and late stages of differentiation of null mutants from the B. subtilis ordered knockout collection. This analysis identified mutants that are delayed in the initiation of sporulation, defective in membrane remodeling, and impaired in spore maturation. Several mutants had novel sporulation phenotypes. We performed in-depth characterization of two new factors that participate in cell–cell signaling pathways during sporulation. One (SpoIIT) functions in the activation of σE in the mother cell; the other (SpoIIIL) is required for σG activity in the forespore. Our analysis also revealed that as many as 36 sporulation-induced genes with no previously reported mutant phenotypes are required for timely spore maturation. Finally, we discovered a large set of transposon insertions that trigger premature initiation of sporulation. Our results highlight the power of Tn-seq for the discovery of new genes and novel pathways in sporulation and, combined with the recently completed null mutant collection, open the door for similar screens in other, less well-characterized processes. PMID:26735940

  11. Machine Learning Analysis Identifies Drosophila Grunge/Atrophin as an Important Learning and Memory Gene Required for Memory Retention and Social Learning.

    PubMed

    Kacsoh, Balint Z; Greene, Casey S; Bosco, Giovanni

    2017-11-06

    High-throughput experiments are becoming increasingly common, and scientists must balance hypothesis-driven experiments with genome-wide data acquisition. We sought to predict novel genes involved in Drosophila learning and long-term memory from existing public high-throughput data. We performed an analysis using PILGRM, which analyzes public gene expression compendia using machine learning. We evaluated the top prediction alongside genes involved in learning and memory in IMP, an interface for functional relationship networks. We identified Grunge/Atrophin (Gug/Atro), a transcriptional repressor, histone deacetylase, as our top candidate. We find, through multiple, distinct assays, that Gug has an active role as a modulator of memory retention in the fly and its function is required in the adult mushroom body. Depletion of Gug specifically in neurons of the adult mushroom body, after cell division and neuronal development is complete, suggests that Gug function is important for memory retention through regulation of neuronal activity, and not by altering neurodevelopment. Our study provides a previously uncharacterized role for Gug as a possible regulator of neuronal plasticity at the interface of memory retention and memory extinction. Copyright © 2017 Kacsoh et al.

  12. Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.

    PubMed

    Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A

    2011-04-08

    To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
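
    The tag bookkeeping reported in this record (total tags, unique tags, and unique tags seen at least twice in a single library) can be sketched as follows. This is an illustrative Python example only: `summarize_tags` and the toy libraries are hypothetical, and real SAGE tags in this study are 17 bp rather than the short strings used here.

```python
from collections import Counter


def summarize_tags(libraries):
    """Summarize tag counts across SAGE libraries.

    `libraries` is a list of tag lists, one list per library.
    """
    per_library = [Counter(tags) for tags in libraries]
    total_tags = sum(sum(counts.values()) for counts in per_library)
    unique_tags = set().union(*per_library)
    # Unique tags observed at least twice within any single library.
    min2_tags = {
        tag
        for counts in per_library
        for tag, n in counts.items()
        if n >= 2
    }
    return {
        "total": total_tags,
        "unique": len(unique_tags),
        "min2_single_library": len(min2_tags),
    }


# The per-library totals quoted in the abstract add up to the stated total.
assert 96_842 + 88_126 + 113_866 == 298_834

# Hypothetical toy libraries with shortened tags.
toy_libraries = [
    ["AAAC", "AAAC", "GGTT"],
    ["AAAC", "CCTA"],
    ["GGTT", "GGTT", "TTTT"],
]
summary = summarize_tags(toy_libraries)
```

    Requiring two counts within one library, rather than two counts overall, is the stricter reading of the abstract's filter and is the one implemented here.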

  13. Weighted gene co-expression network analysis reveals potential genes involved in early metamorphosis process in sea cucumber Apostichopus japonicus.

    PubMed

    Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen

    2018-01-01

    Sea cucumbers, one main class of Echinoderms, have a very fast and drastic metamorphosis process during their development. However, the molecular basis underlying this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Sample entropy analysis of cervical neoplasia gene-expression signatures

    PubMed Central

    Botting, Shaleen K; Trzeciakowski, Jerome P; Benoit, Michelle F; Salama, Salama A; Diaz-Arrastia, Concepcion R

    2009-01-01

    Background: We introduce Approximate Entropy as a mathematical method of analysis for microarray data. Approximate entropy is applied here as a method to classify the complex gene expression patterns resulting from a clinical sample set. Since Entropy is a measure of disorder in a system, we believe that by choosing genes which display minimum entropy in normal controls and maximum entropy in the cancerous sample set we will be able to distinguish those genes which display the greatest variability in the cancerous set. Here we describe a method of utilizing Approximate Sample Entropy (ApSE) analysis to identify genes of interest with the highest probability of producing an accurate, predictive, classification model from our data set. Results: In the development of a diagnostic gene-expression profile for cervical intraepithelial neoplasia (CIN) and squamous cell carcinoma of the cervix, we identified 208 genes which are unchanging in all normal tissue samples, yet exhibit a random pattern indicative of the genetic instability and heterogeneity of malignant cells. This may be measured in terms of the ApSE when compared to normal tissue. We have validated 10 of these genes on 10 normal and 20 cancer and CIN3 samples. We report that the predictive value of the sample entropy calculation for these 10 genes of interest is promising (75% sensitivity, 80% specificity for prediction of cervical cancer over CIN3). Conclusion: The success of the Approximate Sample Entropy approach in discerning alterations in complexity in a biological system from such a relatively small sample set, and in extracting biologically relevant genes of interest, holds great promise. PMID:19232110
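
    To make the entropy measure in this record concrete, here is a minimal Python sketch of the standard sample entropy statistic. It is an assumption that this matches the authors' "ApSE" calculation; the function name, the default parameters `m` and `r`, and the toy series are all hypothetical.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Standard sample entropy: -ln(A / B).

    B counts pairs of length-m templates whose Chebyshev distance is <= r;
    A counts the same pairs extended to length m + 1. Self-matches are
    excluded. Returns math.inf when no matches are found.
    """
    n = len(series)

    def count_matches(length):
        # Use the first n - m start positions for both template lengths
        # so that A and B are counted over the same set of pairs.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)


# Hypothetical expression-like series: a perfectly regular pattern
# and a less predictable one.
regular = [0.0, 1.0] * 10
irregular = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
entropy_regular = sample_entropy(regular, m=2, r=0.2)
entropy_irregular = sample_entropy(irregular, m=1, r=0.5)
```

    A perfectly repeating series scores zero, while less predictable series score higher, which is the property the abstract relies on when contrasting normal and cancerous samples. In practice `r` is often scaled to the standard deviation of the series; that scaling is left out here for brevity.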

  15. Comparative transcriptome analysis of shoot and root tissue of Bacopa monnieri identifies potential genes related to triterpenoid saponin biosynthesis.

    PubMed

    Jeena, Gajendra Singh; Fatima, Shahnoor; Tripathi, Pragya; Upadhyay, Swati; Shukla, Rakesh Kumar

    2017-06-28

    Bacopa monnieri, commonly known as Brahmi, is used in Ayurveda to improve memory and for many other health benefits. A bacoside-enriched standardized extract of Bacopa monnieri is marketed as a memory-enhancing agent. Despite its well-known pharmacological properties, the plant has been little studied in terms of the transcripts involved in the biosynthetic pathway and the regulation that controls secondary metabolism. The aim of this study was to identify the potential transcripts and provide a framework of identified transcripts involved in bacoside production through transcriptome assembly. We performed comparative transcriptome analysis of shoot and root tissue of Bacopa monnieri in two independent biological replicates and obtained 22.48 million and 22.0 million high-quality processed reads in shoot and root, respectively. After de novo assembly and quantitative assessment, a total of 26,412 genes were annotated in the root sample and 18,500 genes in the shoot sample. Quality of raw reads was determined using SeqQC-V2.2. Assembled sequences were annotated using BLASTX against public databases such as NR or UniProt. Searching against the KEGG pathway database indicated that 37,918 unigenes from root and 35,130 unigenes from shoot were mapped to 133 KEGG pathways. Based on the DGE data, we found that most of the transcripts related to CYP450s and UDP-glucosyltransferases were specifically upregulated in shoot tissue compared with root tissue. Finally, we selected 43 transcripts related to secondary metabolism, including transcription factor families, that are differentially expressed in shoot and root tissues; these were validated by qRT-PCR, and their expression levels were monitored after MeJA treatment and wounding for 1, 3 and 5 h. This study not only represents the first de novo transcriptome analysis of Bacopa monnieri but also provides information about the identification, expression and differential tissue-specific distribution of transcripts related

  16. Exome Sequencing Identifies Three Novel Candidate Genes Implicated in Intellectual Disability

    PubMed Central

    Azam, Maleeha; Ayub, Humaira; Vissers, Lisenka E. L. M.; Gilissen, Christian; Ali, Syeda Hafiza Benish; Riaz, Moeen; Veltman, Joris A.; Pfundt, Rolph; van Bokhoven, Hans; Qamar, Raheel

    2014-01-01

    Intellectual disability (ID) is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K)-specific methyltransferase 2B (KMT2B), zinc finger protein 589 (ZNF589), as well as hedgehog acyltransferase (HHAT) with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID. PMID:25405613

  17. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits.

    PubMed

    Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé; Kichaev, Gleb; Gusev, Alexander; Pasaniuc, Bogdan

    2017-03-02

    Although genome-wide association studies (GWASs) have identified thousands of risk loci for many complex traits and diseases, the causal variants and genes at these loci remain largely unknown. Here, we introduce a method for estimating the local genetic correlation between gene expression and a complex trait and utilize it to estimate the genetic correlation due to predicted expression between pairs of traits. We integrated gene expression measurements from 45 expression panels with summary GWAS data to perform 30 multi-tissue transcriptome-wide association studies (TWASs). We identified 1,196 genes whose expression is associated with these traits; of these, 168 reside more than 0.5 Mb away from any previously reported GWAS significant variant. We then used our approach to find 43 pairs of traits with significant genetic correlation at the level of predicted expression; of these, eight were not found through genetic correlation at the SNP level. Finally, we used bi-directional regression to find evidence that BMI causally influences triglyceride levels and that triglyceride levels causally influence low-density lipoprotein. Together, our results provide insight into the role of gene expression in the susceptibility of complex traits and diseases. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  18. MVisAGe Identifies Concordant and Discordant Genomic Alterations of Driver Genes in Squamous Tumors.

    PubMed

    Walter, Vonn; Du, Ying; Danilova, Ludmila; Hayward, Michele C; Hayes, D Neil

    2018-06-15

    Integrated analyses of multiple genomic datatypes are now common in cancer profiling studies. Such data present opportunities for numerous computational experiments, yet analytic pipelines are limited. Tools such as the cBioPortal and Regulome Explorer, although useful, are not easy to access programmatically or to implement locally. Here, we introduce the MVisAGe R package, which allows users to quantify gene-level associations between two genomic datatypes to investigate the effect of genomic alterations (e.g., DNA copy number changes on gene expression). Visualizing Pearson/Spearman correlation coefficients according to the genomic positions of the underlying genes provides a powerful yet novel tool for conducting exploratory analyses. We demonstrate its utility by analyzing three publicly available cancer datasets. Our approach highlights canonical oncogenes in chr11q13 that displayed the strongest associations between expression and copy number, including CCND1 and CTTN, genes not identified by copy number analysis in the primary reports. We demonstrate highly concordant usage of shared oncogenes on chr3q, yet strikingly diverse oncogene usage on chr11q as a function of HPV infection status. Regions of chr19 that display remarkable associations between methylation and gene expression were identified, as were previously unreported miRNA-gene expression associations that may contribute to the epithelial-to-mesenchymal transition. Significance: This study presents an important bioinformatics tool that will enable integrated analyses of multiple genomic datatypes. Cancer Res; 78(12); 3375-85. ©2018 AACR. ©2018 American Association for Cancer Research.
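
    The gene-level association idea in this abstract can be illustrated with a minimal Python sketch. It is not the MVisAGe R package and does not reproduce its interface; the function names (`pearson`, `gene_level_correlations`), the gene labels, the positions, and all values are hypothetical. It computes one Pearson coefficient per gene between copy number and expression across samples, then orders the genes by genomic position so the coefficients could be plotted along a chromosome.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def gene_level_correlations(copy_number, expression, positions):
    """Return (gene, position, r) tuples sorted by genomic position.

    copy_number, expression: dict mapping gene -> per-sample values
    positions: dict mapping gene -> genomic coordinate (hypothetical units)
    Genes missing from any of the three inputs are skipped.
    """
    rows = [
        (gene, positions[gene], pearson(copy_number[gene], expression[gene]))
        for gene in copy_number
        if gene in expression and gene in positions
    ]
    return sorted(rows, key=lambda row: row[1])


# Toy data, invented for illustration only.
copy_number = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4]}
expression = {"GENE_A": [2, 4, 6, 8], "GENE_B": [8, 6, 4, 2]}
positions = {"GENE_A": 200, "GENE_B": 100}

table = gene_level_correlations(copy_number, expression, positions)
for gene, pos, r in table:
    print(gene, pos, round(r, 3))
```

    A rank-based (Spearman) variant would apply the same function to the ranks of the values rather than to the values themselves.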

  19. Integrative Analysis to Identify Common Genetic Markers of Metabolic Syndrome, Dementia, and Diabetes

    PubMed Central

    Zhang, Weihong; Xin, Linlin; Lu, Ying

    2017-01-01

    Background: Emerging data have established links between systemic metabolic dysfunction, such as diabetes and metabolic syndrome (MetS), and neurocognitive impairment, including dementia. The common gene signature and the associated signaling pathways of MetS, diabetes, and dementia have not been widely studied. Material/Methods: We used a translational bioinformatics approach to choose the common gene signatures for both dementia and MetS. For this, we employed the “DisGeNET discovery platform”. Results: Gene mining analysis revealed a total of 173 genes (86 genes common to all three diseases), which comprised 43% of the total genes associated with dementia. The gene enrichment analysis showed that these genes were involved in dysregulation in the neurological system (23.2%) and the central nervous system (20.8%) phenotype processes. The network analysis revealed APOE, APP, PARK2, CEPBP, PARP1, MT-CO2, CXCR4, IGFIR, CCR5, and PIK3CD as important nodes with significant interacting partners. The meta-regression analysis showed modest association of APOE with dementia and metabolic complications. The directionality of effects of the variants on Alzheimer disease is generally consistent with previous observations and did not differ by race/ethnicity (p>0.05), although our study had low power for this test. Conclusions: Our novel approach showed APOE as a common gene signature with a link to dementia, MetS, and diabetes. Future gene association studies should focus on the association of gene polymorphisms with multiple disease models to identify novel putative drug targets. PMID:29229897

  20. Genome-Wide Detection and Analysis of Multifunctional Genes

    PubMed Central

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
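
    The selection step described here, extracting genes annotated to at least two distinct biological processes, can be sketched as follows. The abstract does not say how "distinct" is decided, so this sketch assumes that two terms are distinct when neither is an ancestor of the other in the annotation hierarchy. That criterion, the function names, the toy hierarchy, and the gene labels are all hypothetical.

```python
def ancestors(term, parents):
    """Return all ancestors of `term` in a hierarchy given as term -> list of parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_multifunctional(terms, parents):
    """True if at least two annotated terms are unrelated in the hierarchy.

    Assumption: terms are 'distinct' when neither is an ancestor of the other.
    """
    terms = list(terms)
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if a not in ancestors(b, parents) and b not in ancestors(a, parents):
                return True
    return False


# Toy hierarchy and annotations, invented for illustration only.
parents = {
    "dna_repair": ["dna_metabolic_process"],
    "dna_metabolic_process": ["metabolic_process"],
    "ion_transport": ["transport"],
}
annotations = {
    "GENE_X": {"dna_repair", "dna_metabolic_process"},  # nested terms only
    "GENE_Y": {"dna_repair", "ion_transport"},          # two unrelated terms
    "GENE_Z": {"ion_transport"},                        # single term
}

multifunctional = sorted(
    gene for gene, terms in annotations.items() if is_multifunctional(terms, parents)
)
print(multifunctional)
```

    Filtering out ancestor and descendant pairs keeps a gene from counting as multifunctional merely because it carries both a specific term and its more general parent.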

  1. Integrating transcriptome and genome re-sequencing data to identify key genes and mutations affecting chicken eggshell qualities.

    PubMed

    Zhang, Quan; Zhu, Feng; Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damages lead to economic losses in the egg production industry and are a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways as revealed by gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate the transcriptome and genome re-sequencing to target the genetic variations which decreased the eggshell qualities. These findings further advance our understanding of eggshell calcification in the chicken uterus.

  2. Transcriptomic analysis of rice aleurone cells identified a novel abscisic acid response element.

    PubMed

    Watanabe, Kenneth A; Homayouni, Arielle; Gu, Lingkun; Huang, Kuan-Ying; Ho, Tuan-Hua David; Shen, Qingxi J

    2017-09-01

    Seeds serve as a great model to study plant responses to drought stress, which is largely mediated by abscisic acid (ABA). The ABA responsive element (ABRE) is a key cis-regulatory element in ABA signalling. However, its consensus sequence (ACGTG(G/T)C) is present in the promoters of only about 40% of ABA-induced genes in rice aleurone cells, suggesting other ABREs may exist. To identify novel ABREs, RNA sequencing was performed on aleurone cells of rice seeds treated with 20 μM ABA. Gibbs sampling was used to identify enriched elements, and particle bombardment-mediated transient expression studies were performed to verify the function. Gene ontology analysis was performed to predict the roles of genes containing the novel ABREs. This study revealed 2443 ABA-inducible genes and a novel ABRE, designated as ABREN, which was experimentally verified to mediate ABA signalling in rice aleurone cells. Many of the ABREN-containing genes are predicted to be involved in stress responses and transcription. Analysis of other species suggests that the ABREN may be monocot specific. This study also revealed interesting expression patterns of genes involved in ABA metabolism and signalling. Collectively, this study advanced our understanding of diverse cis-regulatory sequences and the transcriptomes underlying ABA responses in rice aleurone cells. © 2017 John Wiley & Sons Ltd.

  3. Using gene chips to identify organ-specific, smooth muscle responses to experimental diabetes: potential applications to urological diseases.

    PubMed

    Hipp, Jason D; Davies, Kelvin P; Tar, Moses; Valcic, Mira; Knoll, Abraham; Melman, Arnold; Christ, George J

    2007-02-01

    To identify early diabetes-related alterations in gene expression in bladder and erectile tissue that would provide novel diagnostic and therapeutic treatment targets to prevent, delay or ameliorate the ensuing bladder and erectile dysfunction. The RG-U34A rat GeneChip (Affymetrix Inc., Sunnyvale, CA, USA) oligonucleotide microarray (containing approximately 8799 genes) was used to evaluate gene expression in corporal and male bladder tissue excised from rats 1 week after confirmation of a diabetic state, but before demonstrable changes in organ function in vivo. A conservative analytical approach was used to detect alterations in gene expression, and gene ontology (GO) classifications were used to identify biological themes/pathways involved in the aetiology of the organ dysfunction. In all, 320 and 313 genes were differentially expressed in bladder and corporal tissue, respectively. GO analysis in bladder tissue showed prominent increases in biological pathways involved in cell proliferation, metabolism, actin cytoskeleton and myosin, as well as decreases in cell motility, and regulation of muscle contraction. GO analysis in corpora showed increases in pathways related to ion channel transport and ion channel activity, while there were decreases in collagen I and actin genes. The changes in gene expression in these initial experiments are consistent with the pathophysiological characteristics of the bladder and erectile dysfunction seen later in the diabetic disease process. Thus, the observed changes in gene expression might be harbingers or biomarkers of impending organ dysfunction, and could provide useful diagnostic and therapeutic targets for a variety of progressive urological diseases/conditions (i.e. lower urinary tract symptoms related to benign prostatic hyperplasia, erectile dysfunction, etc.).

  4. Differential Gene Expression between Leaf and Rhizome in Atractylodes lancea: A Comparative Transcriptome Analysis

    PubMed Central

    Huang, Qianqian; Huang, Xiao; Deng, Juan; Liu, Hegang; Liu, Yanwen; Yu, Kun; Huang, Bisheng

    2016-01-01

    The rhizome of Atractylodes lancea is extensively used in the practice of Traditional Chinese Medicine because of its broad pharmacological activities. This study was designed to characterize the transcriptome profiling of the rhizome and leaf of Atractylodes lancea in an attempt to uncover the molecular mechanisms regulating rhizome formation and growth. Over 270 million clean reads were assembled into 92,366 unigenes, 58% of which are homologous with sequences in public protein databases (NR, Swiss-Prot, GO, and KEGG). Analysis of expression levels showed that genes involved in photosynthesis, stress response, and translation were the most abundant transcripts in the leaf, while transcripts involved in stress response, transcription regulation, translation, and metabolism were dominant in the rhizome. Tissue-specific gene analysis identified distinct gene families active in the leaf and rhizome. Differential gene expression analysis revealed a clear difference in gene expression pattern, identifying 1518 up-regulated genes and 3464 down-regulated genes in the rhizome compared with the leaf, including a series of genes related to signal transduction, primary and secondary metabolism. Transcription factor (TF) analysis identified 42 TF families, with 67 and 60 TFs up-regulated in the rhizome and leaf, respectively. A total of 104 unigenes were identified as candidates for regulating rhizome formation and development. These data offer an overview of the gene expression pattern of the rhizome and leaf and provide essential information for future studies on the molecular mechanisms of controlling rhizome formation and growth. The extensive transcriptome data generated in this study will be a valuable resource for further functional genomics studies of A. lancea. PMID:27066021

  5. Using SCOPE to identify potential regulatory motifs in coregulated genes.

    PubMed

    Martyanov, Viktor; Gross, Robert H

    2011-05-31

    SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from
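
    The three motif types named in this abstract (non-degenerate, degenerate, and bipartite) are all written in standard IUPAC nucleotide codes, so they can be matched with one small helper. The Python sketch below is not SCOPE or any of its component algorithms; it only shows how such motif strings translate to regular expressions and how a naive over-representation check might compare a gene set with a background. The function names and the toy sequences are hypothetical, and the simple fraction comparison stands in for SCOPE's actual scoring, which the abstract does not specify.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'ASCGWT' into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def fraction_with_motif(motif, sequences):
    """Fraction of sequences containing at least one match to the motif."""
    if not sequences:
        raise ValueError("sequences must not be empty")
    pattern = re.compile(motif_to_regex(motif))
    hits = sum(1 for seq in sequences if pattern.search(seq.upper()))
    return hits / len(sequences)


# Toy sequences, invented for illustration only.
gene_set = ["TTACCGATGG", "GGAGCGTTCA", "TTTTTTTTTT"]
background = ["TTTTTTTTTT", "CCCCCCCCCC", "GGAGCGTTCA", "AAAAAAAAAA"]

in_set = fraction_with_motif("ASCGWT", gene_set)
in_background = fraction_with_motif("ASCGWT", background)
print(motif_to_regex("ASCGWT"), in_set, in_background)
```

    A motif that occurs in a clearly larger fraction of the gene set than of the background is a candidate for over-representation; a real analysis would attach a significance estimate to that difference.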

  6. Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.

    PubMed

    Himes, Blanca E; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S; Myers, Rachel A; Gignoux, Christopher R; Levin, Albert M; Gauderman, W James; Yang, James J; Mathias, Rasika A; Romieu, Isabelle; Torgerson, Dara G; Roth, Lindsey A; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Postma, Dirkje S; Nieuwenhuis, Maartje A E; Vonk, Judith M; Lima, John J; Irvin, Charles G; Peters, Stephen P; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A; Tantisira, Kelan G; Raby, Benjamin A; Bleecker, Eugene R; Meyers, Deborah A; London, Stephanie J; Barnes, Kathleen C; Gilliland, Frank D; Williams, L Keoki; Burchard, Esteban G; Nicolae, Dan L; Ober, Carole; DeMeo, Dawn L; Silverman, Edwin K; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D; Weiss, Scott T

    2013-01-01

    Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at a gene- and SNP-level. In EVE, the smallest KCNIP4 association was at rs6833065 (P-value 2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.

  7. Linking gene regulation and the exo-metabolome: A comparative transcriptomics approach to identify genes that impact on the production of volatile aroma compounds in yeast

    PubMed Central

    Rossouw, Debra; Næs, Tormod; Bauer, Florian F

    2008-01-01

    Background: 'Omics' tools provide novel opportunities for system-wide analysis of complex cellular functions. Secondary metabolism is an example of a complex network of biochemical pathways, which, although well mapped from a biochemical point of view, is not well understood with regards to its physiological roles and genetic and biochemical regulation. Many of the metabolites produced by this network such as higher alcohols and esters are significant aroma impact compounds in fermentation products, and different yeast strains are known to produce highly divergent aroma profiles. Here, we investigated whether we can predict the impact of specific genes of known or unknown function on this metabolic network by combining whole transcriptome and partial exo-metabolome analysis. Results: For this purpose, the gene expression levels of five different industrial wine yeast strains that produce divergent aroma profiles were established at three different time points of alcoholic fermentation in synthetic wine must. A matrix of gene expression data was generated and integrated with the concentrations of volatile aroma compounds measured at the same time points. This relatively unbiased approach to the study of volatile aroma compounds enabled us to identify candidate genes for aroma profile modification. Five of these genes, namely YMR210W, BAT1, AAD10, AAD14 and ACS1, were selected for overexpression in commercial wine yeast, VIN13. Analysis of the data shows a statistically significant correlation between the changes in the exo-metabolome of the overexpressing strains and the changes that were predicted based on the unbiased alignment of transcriptomic and exo-metabolomic data. Conclusion: The data suggest that a comparative transcriptomics and metabolomics approach can be used to identify the metabolic impacts of the expression of individual genes in complex systems, and the amenability of transcriptomic data to direct applications of biotechnological relevance. PMID:18990252

  8. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  9. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development

    PubMed Central

    Takeda, Haruna; Rust, Alistair G.; Ward, Jerrold M.; Yew, Christopher Chin Kuan; Jenkins, Nancy A.; Copeland, Neal G.

    2016-01-01

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4+/− mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC. PMID:27006499

  10. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development.

    PubMed

    Takeda, Haruna; Rust, Alistair G; Ward, Jerrold M; Yew, Christopher Chin Kuan; Jenkins, Nancy A; Copeland, Neal G

    2016-04-05

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4(+/-) mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC.

  11. Comparing cancer vs normal gene expression profiles identifies new disease entities and common transcriptional programs in AML patients.

    PubMed

    Rapin, Nicolas; Bagger, Frederik Otzen; Jendholm, Johan; Mora-Jensen, Helena; Krogh, Anders; Kohlmann, Alexander; Thiede, Christian; Borregaard, Niels; Bullinger, Lars; Winther, Ole; Theilgaard-Mönch, Kim; Porse, Bo T

    2014-02-06

    Gene expression profiling has been used extensively to characterize cancer, identify novel subtypes, and improve patient stratification. However, it has largely failed to identify transcriptional programs that differ between cancer and corresponding normal cells and has not been efficient in identifying expression changes fundamental to disease etiology. Here we present a method that facilitates the comparison of any cancer sample to its nearest normal cellular counterpart, using acute myeloid leukemia (AML) as a model. We first generated a gene expression-based landscape of the normal hematopoietic hierarchy, using expression profiles from normal stem/progenitor cells, and next mapped the AML patient samples to this landscape. This allowed us to identify the closest normal counterpart of individual AML samples and determine gene expression changes between cancer and normal. We find the cancer vs normal method (CvN method) to be superior to conventional methods in stratifying AML patients with aberrant karyotype and in identifying common aberrant transcriptional programs with potential importance for AML etiology. Moreover, the CvN method uncovered a novel poor-outcome subtype of normal-karyotype AML, which allowed for the generation of a highly prognostic survival signature. Collectively, our CvN method holds great potential as a tool for the analysis of gene expression profiles of cancer patients.
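
    The core comparison in the CvN approach, finding the closest normal counterpart for a cancer sample and then taking expression differences against it, can be sketched in a few lines of Python. The abstract does not state how samples are mapped to the normal hematopoietic landscape, so plain Euclidean distance to a per-population reference profile is assumed here as a stand-in. The function names, the reference profiles, and all expression values are hypothetical.

```python
import math


def nearest_normal(sample, normals):
    """Name of the normal reference profile closest to `sample`.

    Assumption: closeness is Euclidean distance over the same ordered genes.
    """
    return min(normals, key=lambda name: math.dist(sample, normals[name]))


def cancer_vs_normal(sample, normals):
    """Return (closest normal population, per-gene difference sample - normal)."""
    name = nearest_normal(sample, normals)
    reference = normals[name]
    return name, [s - r for s, r in zip(sample, reference)]


# Toy log-scale expression profiles over three genes, invented for illustration.
normals = {
    "HSC": [8.0, 2.0, 1.0],
    "GMP": [3.0, 7.0, 2.0],
}
aml_sample = [3.5, 6.0, 5.0]

counterpart, changes = cancer_vs_normal(aml_sample, normals)
print(counterpart, changes)
```

    Comparing each sample with its own nearest normal profile, rather than with a pooled control, is what separates cancer-specific changes from differences that merely reflect the cell type of origin.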

  12. Gene expression profiling to identify the toxicities and potentially relevant human disease outcomes associated with environmental heavy metal exposure.

    PubMed

    Korashy, Hesham M; Attafi, Ibraheem M; Famulski, Konrad S; Bakheet, Saleh A; Hafez, Mohammed M; Alsaad, Abdulaziz M S; Al-Ghadeer, Abdul Rahman M

    2017-02-01

    Heavy metals are the most commonly encountered toxic substances that increase susceptibility to various diseases after prolonged exposure. We have previously shown that healthy volunteers living near a mining area had significant contamination with heavy metals associated with significant changes in the expression of some detoxifying genes, xenobiotic metabolizing enzymes, and DNA repair genes. However, alterations of most of the molecular target genes associated with diseases are still unknown. Thus, the aims of this study were to (a) evaluate the gene expression profile and (b) identify the toxicities and potentially relevant human disease outcomes associated with long-term human exposure to environmental heavy metals in a mining area using microarray analysis. For this purpose, 40 healthy male volunteers who were residents of a heavy metal-polluted area (Mahd Al-Dhahab city, Saudi Arabia) and 20 healthy male volunteers who were residents of a non-heavy metal-polluted area were included in the study. Total RNA was isolated from whole blood using PAXgene Blood RNA tubes and then reverse transcribed and hybridized to the gene array using the Affymetrix U219 GeneChip. Microarray analysis identified about 2129 differentially altered genes, among which a shared set of 425 genes was differentially expressed in the heavy metal-exposed groups. Ingenuity pathway analysis revealed that the most altered gene-regulated diseases in heavy metal-exposed groups included hematological and developmental disorders and mostly renal and urological diseases. Quantitative real-time polymerase chain reaction closely matched the microarray data for some genes tested. Importantly, changes in gene-related diseases were attributed to alterations in the genes encoded for protein synthesis. Renal and urological diseases were the diseases that were most frequently associated with the heavy metal-exposed group. Therefore, there is a need for further studies to validate these

  13. Expression profiling identifies novel Hh/Gli regulated genes in developing zebrafish embryos.

    PubMed Central

    Bergeron, Sadie A.; Milla, Luis A.; Villegas, Rosario; Shen, Meng-Chieh; Burgess, Shawn M.; Allende, Miguel L.; Karlstrom, Rolf O.; Palma, Verónica

    2008-01-01

    The Hedgehog (Hh) signaling pathway plays critical instructional roles during embryonic development. Mis-regulation of Hh/Gli signaling is a major causative factor in human congenital disorders and in a variety of cancers. The zebrafish is a powerful genetic model for the study of Hh signaling during embryogenesis, as a large number of mutants have been identified affecting different components of the Hh/Gli signaling system. By performing global profiling of gene expression in different Hh/Gli gain- and loss-of-function scenarios we identified several known (e.g. ptc1 and nkx2.2a) as well as a large number of novel Hh regulated genes that are differentially expressed in embryos with altered Hh/Gli signaling function. By uncovering changes in tissue specific gene expression, we revealed new embryological processes that are influenced by Hh signaling. We thus provide a comprehensive survey of Hh/Gli regulated genes during embryogenesis and we identify new Hh-regulated genes that may be targets of mis-regulation during tumorigenesis. PMID:18055165

  14. An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci

    PubMed Central

    Ju, Jin Hyun; Crystal, Ronald G.

    2017-01-01

    Genome-wide expression Quantitative Trait Loci (eQTL) studies in humans have provided numerous insights into the genetics of both gene expression and complex diseases. While the majority of eQTL identified in genome-wide analyses impact a single gene, eQTL that impact many genes are particularly valuable for network modeling and disease analysis. To enable the identification of such broad impact eQTL, we introduce CONFETI: Confounding Factor Estimation Through Independent component analysis. CONFETI is designed to address two conflicting issues when searching for broad impact eQTL: the need to account for non-genetic confounding factors that can lower the power of the analysis or produce broad impact eQTL false positives, and the tendency of methods that account for confounding factors to model broad impact eQTL as non-genetic variation. The key advance of the CONFETI framework is the use of Independent Component Analysis (ICA) to identify variation likely caused by broad impact eQTL when constructing the sample covariance matrix used for the random effect in a mixed model. We show that CONFETI has better performance than other mixed model confounding factor methods when considering broad impact eQTL recovery from synthetic data. We also used the CONFETI framework and these same confounding factor methods to identify eQTL that replicate between matched twin pair datasets in the Multiple Tissue Human Expression Resource (MuTHER), the Depression Genes Networks study (DGN), the Netherlands Study of Depression and Anxiety (NESDA), and multiple tissue types in the Genotype-Tissue Expression (GTEx) consortium. These analyses identified both cis-eQTL and trans-eQTL impacting individual genes, and CONFETI had better or comparable performance to other mixed model confounding factor analysis methods when identifying such eQTL. In these analyses, we were able to identify and replicate a few broad impact eQTL although the overall number was small even when applying CONFETI. In

  15. An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci.

    PubMed

    Ju, Jin Hyun; Shenoy, Sushila A; Crystal, Ronald G; Mezey, Jason G

    2017-05-01

    Genome-wide expression Quantitative Trait Loci (eQTL) studies in humans have provided numerous insights into the genetics of both gene expression and complex diseases. While the majority of eQTL identified in genome-wide analyses impact a single gene, eQTL that impact many genes are particularly valuable for network modeling and disease analysis. To enable the identification of such broad impact eQTL, we introduce CONFETI: Confounding Factor Estimation Through Independent component analysis. CONFETI is designed to address two conflicting issues when searching for broad impact eQTL: the need to account for non-genetic confounding factors that can lower the power of the analysis or produce broad impact eQTL false positives, and the tendency of methods that account for confounding factors to model broad impact eQTL as non-genetic variation. The key advance of the CONFETI framework is the use of Independent Component Analysis (ICA) to identify variation likely caused by broad impact eQTL when constructing the sample covariance matrix used for the random effect in a mixed model. We show that CONFETI has better performance than other mixed model confounding factor methods when considering broad impact eQTL recovery from synthetic data. We also used the CONFETI framework and these same confounding factor methods to identify eQTL that replicate between matched twin pair datasets in the Multiple Tissue Human Expression Resource (MuTHER), the Depression Genes Networks study (DGN), the Netherlands Study of Depression and Anxiety (NESDA), and multiple tissue types in the Genotype-Tissue Expression (GTEx) consortium. These analyses identified both cis-eQTL and trans-eQTL impacting individual genes, and CONFETI had better or comparable performance to other mixed model confounding factor analysis methods when identifying such eQTL. In these analyses, we were able to identify and replicate a few broad impact eQTL although the overall number was small even when applying CONFETI. In

  16. Global expression analysis of gene regulatory pathways during endocrine pancreatic development.

    PubMed

    Gu, Guoqiang; Wells, James M; Dombkowski, David; Preffer, Fred; Aronow, Bruce; Melton, Douglas A

    2004-01-01

    To define genetic pathways that regulate development of the endocrine pancreas, we generated transcriptional profiles of enriched cells isolated from four biologically significant stages of endocrine pancreas development: endoderm before pancreas specification, early pancreatic progenitor cells, endocrine progenitor cells and adult islets of Langerhans. These analyses implicate new signaling pathways in endocrine pancreas development, and identified sets of known and novel genes that are temporally regulated, as well as genes that spatially define developing endocrine cells from their neighbors. The differential expression of several genes from each time point was verified by RT-PCR and in situ hybridization. Moreover, we present preliminary functional evidence suggesting that one transcription factor encoding gene (Myt1), which was identified in our screen, is expressed in endocrine progenitors and may regulate alpha, beta and delta cell development. In addition to identifying new genes that regulate endocrine cell fate, this global gene expression analysis has uncovered informative biological trends that occur during endocrine differentiation.

  17. Whole Wiskott‑Aldrich syndrome protein gene deletion identified by high throughput sequencing.

    PubMed

    He, Xiangling; Zou, Runying; Zhang, Bing; You, Yalan; Yang, Yang; Tian, Xin

    2017-11-01

    Wiskott‑Aldrich syndrome (WAS) is a rare X‑linked recessive immunodeficiency disorder, characterized by thrombocytopenia, small platelets, eczema and recurrent infections associated with increased risk of autoimmunity and malignancy disorders. Mutations in the WAS protein (WASP) gene are responsible for WAS. To date, WASP mutations, including missense/nonsense, splicing, small deletions, small insertions, gross deletions, and gross insertions have been identified in patients with WAS. In addition, WASP‑interacting proteins are suspected in patients with clinical features of WAS, in whom the WASP gene sequence and mRNA levels are normal. The present study aimed to investigate the application of next generation sequencing in definitive diagnosis and clinical therapy for WAS. A 5 month‑old child with WAS who displayed symptoms of thrombocytopenia was examined. Whole exome sequence analysis of genomic DNA showed that the coverage and depth of WASP were extremely low. Quantitative polymerase chain reaction indicated total WASP gene deletion in the proband. In conclusion, high throughput sequencing is useful for the verification of WAS on the genetic profile, and has implications for family planning guidance and establishment of clinical programs.

  18. Analysis of gene expression profile microarray data in complex regional pain syndrome.

    PubMed

    Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing

    2017-09-01

    The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.

  19. Identifying Stress Transcription Factors Using Gene Expression and TF-Gene Association Data

    PubMed Central

    Wu, Wei-Sheng; Chen, Bor-Sen

    2007-01-01

    Unicellular organisms such as yeasts have evolved to survive environmental stresses by rapidly reorganizing the genomic expression program to meet the challenges of harsh environments. The complex adaptation mechanisms to stress remain to be elucidated. In this study, we developed Stress Transcription Factor Identification Algorithm (STFIA), which integrates gene expression and TF-gene association data to identify the stress transcription factors (TFs) of six kinds of stresses. We identified some general stress TFs that are in response to various stresses, and some specific stress TFs that are in response to one specific stress. The biological significance of our findings is validated by the literature. We found that a small number of TFs may be sufficient to control a wide variety of expression patterns in yeast under different stresses. Two implications can be inferred from this observation. First, the adaptation mechanisms to different stresses may have a bow-tie structure. Second, there may exist extensive regulatory cross-talk among different stress responses. In conclusion, this study proposes a network of the regulators of stress responses and their mechanism of action. PMID:20066130

  20. Vasohibin-1 is identified as a master-regulator of endothelial cell apoptosis using gene network analysis

    PubMed Central

    2013-01-01

    Background Apoptosis is a critical process in endothelial cell (EC) biology and pathology, which has been extensively studied at the protein level. Numerous gene expression studies of EC apoptosis have also been performed; however, few attempts have been made to use gene expression data to identify the molecular relationships and master regulators that underlie EC apoptosis. Therefore, we sought to understand these relationships by generating a Bayesian gene regulatory network (GRN) model. Results ECs were induced to undergo apoptosis using serum withdrawal and followed over a time course in triplicate, using microarrays. When generating the GRN, this EC time course data was supplemented by a library of microarray data from EC treated with siRNAs targeting over 350 signalling molecules. The GRN model proposed Vasohibin-1 (VASH1) as one of the candidate master-regulators of EC apoptosis with numerous downstream mRNAs. To evaluate the role played by VASH1 in EC, we used siRNA to reduce the expression of VASH1. Of 10 mRNAs downstream of VASH1 in the GRN that were examined, 7 were significantly up- or down-regulated in the direction predicted by the GRN. Further supporting an important biological role of VASH1 in EC, targeted reduction of VASH1 mRNA abundance conferred resistance to serum withdrawal-induced EC death. Conclusion We have utilised Bayesian GRN modelling to identify a novel candidate master regulator of EC apoptosis. This study demonstrates how GRN technology can complement traditional methods to hypothesise the regulatory relationships that underlie important biological processes. PMID:23324451

  1. Literature mining, gene-set enrichment and pathway analysis for target identification in Behçet's disease.

    PubMed

    Wilson, Paul; Larminie, Christopher; Smith, Rona

    2016-01-01

    To use literature mining to catalogue Behçet's associated genes, and advanced computational methods to improve the understanding of the pathways and signalling mechanisms that lead to the typical clinical characteristics of Behçet's patients. To extend this technique to identify potential treatment targets for further experimental validation. Text mining methods combined with gene enrichment tools, pathway analysis and causal analysis algorithms. This approach identified 247 human genes associated with Behçet's disease and the resulting disease map, comprising 644 nodes and 19220 edges, captured important details of the relationships between these genes and their associated pathways, as described in diverse data repositories. Pathway analysis has identified how Behçet's associated genes are likely to participate in innate and adaptive immune responses. Causal analysis algorithms have identified a number of potential therapeutic strategies for further investigation. Computational methods have captured pertinent features of the prominent disease characteristics presented in Behçet's disease and have highlighted NOD2, ICOS and IL18 signalling as potential therapeutic strategies.

  2. Two splice variants of the bovine lactoferrin gene identified in Staphylococcus aureus isolated from mastitis in dairy cattle.

    PubMed

    Huang, J M; Wang, Z Y; Ju, Z H; Wang, C F; Li, Q L; Sun, T; Hou, Q L; Hang, S Q; Hou, M H; Zhong, J F

    2011-12-21

    Bovine lactoferrin (bLF) is a member of the transferrin family; it plays an important role in the innate immune response. We identified novel splice variants of the bLF gene in mastitis-infected and healthy cows. Reverse transcription-polymerase chain reaction (RT-PCR) and clone sequencing analysis were used to screen the splice variants of the bLF gene in the mammary gland, spleen and liver tissues. One main transcript corresponding to the bLF reference sequence was found in three tissues in both healthy and mastitis-infected cows. Quantitative real-time PCR analysis showed that the expression levels of the LF gene's main transcript were not significantly different in tissues from healthy versus mastitis-infected cows. However, the new splice variant, LF-AS2, which has the exon-skipping alternative splicing pattern, was only identified in mammary glands infected with Staphylococcus aureus. Sequencing analysis showed that the new splice variant was 251 bp in length, including exon 1, part of exon 2, part of exon 16, and exon 17. We conclude that bLF may play a role in resistance to mastitis through alternative splicing mechanisms.

  3. Genomic convergence to identify candidate genes for Alzheimer disease on chromosome 10

    PubMed Central

    Liang, Xueying; Slifer, Michael; Martin, Eden R.; Schnetz-Boutaud, Nathalie; Bartlett, Jackie; Anderson, Brent; Züchner, Stephan; Gwirtsman, Harry; Gilbert, John R.; Pericak-Vance, Margaret A.; Haines, Jonathan L.

    2009-01-01

    A broad region of chromosome 10 (chr10) has engendered continued interest in the etiology of late-onset Alzheimer Disease (LOAD) from both linkage and candidate gene studies. However, there is very extensive heterogeneity on chr10. We converged linkage analysis and gene expression data using the concept of genomic convergence, which suggests that genes showing positive results across multiple different data types are more likely to be involved in AD. We identified and examined 28 genes on chr10 for association with AD in a Caucasian case-control dataset of 506 cases and 558 controls with substantial clinical information. The cases were all LOAD (minimum age at onset ≥ 60 years). Both single marker and haplotypic associations were tested in the overall dataset and 8 subsets defined by age, gender, ApoE and clinical status. PTPLA showed allelic, genotypic and haplotypic association in the overall dataset. SORCS1 was significant in the overall dataset (p=0.0025) and most significant in the female subset (allelic association p=0.00002, a 3-locus haplotype had p=0.0005). Odds Ratio of SORCS1 in the female subset was 1.7 (p<0.0001). SORCS1 is an interesting candidate gene involved in the Aβ pathway. Therefore, genetic variations in PTPLA and SORCS1 may be associated with, and have a modest effect on, the risk of AD by affecting the Aβ pathway. Replication of the effect of these genes in different study populations, a search for susceptibility variants, and functional studies of these genes are necessary to better understand the roles of the genes in Alzheimer disease. PMID:19241460

  4. RNA-Seq analysis of yak ovary: improving yak gene structure information and mining reproduction-related genes.

    PubMed

    Lan, DaoLiang; Xiong, XianRong; Wei, YanLi; Xu, Tong; Zhong, JinCheng; Zhi, XiangDong; Wang, Yong; Li, Jian

    2014-09-01

    RNA-Seq, a high-throughput (HT) sequencing technique, has been used effectively in large-scale transcriptomic studies, and is particularly useful for improving gene structure information and mining of new genes. In this study, RNA-Seq HT technology was employed to analyze the transcriptome of yak ovary. After Illumina-Solexa deep sequencing, 26826516 clean reads with a total of 4828772880 bp were obtained from the ovary library. Alignment analysis showed that 16992 yak genes mapped to the yak genome and 3734 of these genes were involved in alternative splicing. Gene structure refinement analysis showed that 7340 genes that were annotated in the yak genome could be extended at the 5' or 3' ends based on the alignments between the transcripts and the genome sequence. Novel transcript prediction analysis identified 6321 new transcripts with lengths ranging from 180 to 14884 bp, and 2267 of them were predicted to code proteins. BLAST analysis of the new transcripts showed that 1200–4933 mapped to the non-redundant (nr), nucleotide (nt) and/or SwissProt sequence databases. Comparative statistical analysis of the new mapped transcripts showed that the majority of them were similar to genes in Bos taurus (41.4%), Bos grunniens mutus (33.0%), Ovis aries (6.3%), Homo sapiens (2.8%), Mus musculus (1.6%) and other species. Functional analysis showed that these expressed genes were involved in various Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes pathways. GO analysis of the new transcripts found that the largest proportion of them was associated with reproduction. The results of this study will provide a basis for describing the normal transcriptome map of yak ovary and for future studies on yak breeding performance. Moreover, the results confirmed that RNA-Seq HT technology is highly advantageous in improving gene structure information and mining of new genes, as well as in providing valuable data to expand the yak genome information.

  5. Genome-wide analysis of the WRKY gene family in cotton.

    PubMed

    Dou, Lingling; Zhang, Xiaohong; Pang, Chaoyou; Song, Meizhen; Wei, Hengling; Fan, Shuli; Yu, Shuxun

    2014-12-01

    WRKY proteins are major transcription factors involved in regulating plant growth and development. Although many studies have focused on the functional identification of WRKY genes, our knowledge concerning many areas of WRKY gene biology is limited. For example, in cotton, the phylogenetic characteristics, global expression patterns, molecular mechanisms regulating expression, and target genes/pathways of WRKY genes are poorly characterized. Therefore, in this study, we present a genome-wide analysis of the WRKY gene family in cotton (Gossypium raimondii and Gossypium hirsutum). We identified 116 WRKY genes in G. raimondii from the completed genome sequence, and we cloned 102 WRKY genes in G. hirsutum. Chromosomal location analysis indicated that WRKY genes in G. raimondii evolved mainly from segmental duplication followed by tandem amplifications. Phylogenetic analysis of alga, bryophyte, lycophyta, monocot and eudicot WRKY domains revealed family member expansion with increasing complexity of the plant body. Microarray, expression profiling and qRT-PCR data revealed that WRKY genes in G. hirsutum may regulate the development of fibers, anthers, tissues (roots, stems, leaves and embryos), and are involved in the response to stresses. Expression analysis showed that most group II and III GhWRKY genes are highly expressed under diverse stresses. Group I members, representing the ancestral form, seem to be insensitive to abiotic stress, with low expression divergence. Our results indicate that cotton WRKY genes might have evolved by adaptive duplication, leading to sensitivity to diverse stresses. This study provides fundamental information to inform further analysis and understanding of WRKY gene functions in cotton species.

  6. Transcriptional profiling identifies differentially expressed genes in developing turkey skeletal muscle

    PubMed Central

    2011-01-01

    Background Skeletal muscle growth and development from embryo to adult consists of a series of carefully regulated changes in gene expression. Understanding these developmental changes in agriculturally important species is essential to the production of high quality meat products. For example, consumer demand for lean, inexpensive meat products has driven the turkey industry to unprecedented production through intensive genetic selection. However, achievements of increased body weight and muscle mass have been countered by an increased incidence of myopathies and meat quality defects. In a previous study, we developed and validated a turkey skeletal muscle-specific microarray as a tool for functional genomics studies. The goals of the current study were to utilize this microarray to elucidate functional pathways of genes responsible for key events in turkey skeletal muscle development and to compare differences in gene expression between two genetic lines of turkeys. To achieve these goals, skeletal muscle samples were collected at three critical stages in muscle development: 18d embryo (hyperplasia), 1d post-hatch (shift from myoblast-mediated growth to satellite cell-modulated growth by hypertrophy), and 16wk (market age) from two genetic lines: a randombred control line (RBC2) maintained without selection pressure, and a line (F) selected from the RBC2 line for increased 16wk body weight. Array hybridizations were performed in two experiments: Experiment 1 directly compared the developmental stages within genetic line, while Experiment 2 directly compared the two lines within each developmental stage. Results A total of 3474 genes were differentially expressed (false discovery rate; FDR < 0.001) by overall effect of development, while 16 genes were differentially expressed (FDR < 0.10) by overall effect of genetic line. Ingenuity Pathways Analysis was used to group annotated genes into networks, functions, and canonical pathways. The expression of 28 genes

  7. Pla2g12b and Hpn Are Genes Identified by Mouse ENU Mutagenesis That Affect HDL Cholesterol

    PubMed Central

    Aljakna, Aleksandra; Choi, Seungbum; Savage, Holly; Hageman Blair, Rachael; Gu, Tongjun; Svenson, Karen L.; Churchill, Gary A.; Hibbs, Matt; Korstanje, Ron

    2012-01-01

    Despite considerable progress in understanding genes that affect the HDL particle, its function, and cholesterol content, genes identified to date explain only a small percentage of the genetic variation. We used N-ethyl-N-nitrosourea mutagenesis in mice to discover novel genes that affect HDL cholesterol levels. Two mutant lines (Hlb218 and Hlb320) with low HDL cholesterol levels were established. Causal mutations in these lines were mapped using linkage analysis: for line Hlb218 within a 12 Mbp region on Chr 10; and for line Hlb320 within a 21 Mbp region on Chr 7. High-throughput sequencing of Hlb218 liver RNA identified a mutation in Pla2g12b. The transition of G to A leads to a cysteine to tyrosine change and most likely causes a loss of a disulfide bridge. Microarray analysis of Hlb320 liver RNA showed a 7-fold downregulation of Hpn; sequencing identified a mutation in the 3′ splice site of exon 8. Northern blot confirmed lower mRNA expression level in Hlb320 and did not show a difference in splicing, suggesting that the mutation only affects the splicing rate. In addition to affecting HDL cholesterol, the mutated genes also lead to reduction in serum non-HDL cholesterol and triglyceride levels. Despite low HDL cholesterol levels, the mice from both mutant lines show similar atherosclerotic lesion sizes compared to control mice. These new mutant mouse models are valuable tools to further study the role of these genes, their effect on HDL cholesterol levels, and metabolism. PMID:22912808

  8. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights

    PubMed Central

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-01

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. PMID:26750448
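    The abstract above contrasts LEGO with the Fisher's exact test baseline used by common ORA tools. As a point of reference, that baseline can be sketched in a few lines of Python. This is a minimal illustration of the one-sided (upper-tail hypergeometric) test only, not of LEGO itself; the function name `ora_p_value`, its parameter names, and the example counts are hypothetical and do not come from the paper.

```python
from math import comb


def ora_p_value(n_universe: int, n_set: int, n_hits: int, n_overlap: int) -> float:
    """One-sided Fisher's exact p-value for gene set over-representation.

    n_universe: number of genes in the background (hypothetical input)
    n_set:      number of background genes annotated to the gene set
    n_hits:     size of the query gene list
    n_overlap:  number of query genes that fall in the gene set

    Returns P(X >= n_overlap) where X is hypergeometric, i.e. the chance of
    seeing at least this much overlap if the query list were drawn at random.
    """
    max_overlap = min(n_set, n_hits)
    total = comb(n_universe, n_hits)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_hits - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative counts only: 20,000 background genes, a 150-gene pathway,
    # a 300-gene query list, 12 of which fall in the pathway.
    print(ora_p_value(20000, 150, 300, 12))
```

    Every gene in the set contributes equally to this count, which is the limitation the abstract says LEGO addresses with network-based gene weights.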

  9. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    PubMed

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.

  10. Transcriptome analysis of the exocarp of apple fruit identifies light-induced genes involved in red color pigmentation.

    PubMed

    Vimolmangkang, Sornkanok; Zheng, Danman; Han, Yuepeng; Khan, M Awais; Soria-Guerra, Ruth Elena; Korban, Schuyler S

    2014-01-15

    Although the mechanism of light regulation of color pigmentation of apple fruit is not fully understood, it has been shown that light can regulate expression of genes in the anthocyanin biosynthesis pathway by inducing transcription factors (TFs). Moreover, expression of genes encoding enzymes involved in this pathway may be coordinately regulated by multiple TFs. In this study, fruits on trees of apple cv. Red Delicious were covered with paper bags during early stages of fruit development and then removed prior to maturation to analyze the transcriptome in the exocarp of apple fruit. Comparisons of gene expression profiles of fruit covered with paper bags (dark-grown treatment) and those subjected to 14 h light treatment, following removal of paper bags, were investigated using an apple microarray of 40,000 sequences. Expression profiles were investigated over three time points, at one week intervals, during fruit development. Overall, 736 genes with expression values greater than two-fold were found to be modulated by light treatment. Light-induced products were classified into 19 categories with highest scores in primary metabolism (17%) and transcription (12%). Based on the Arabidopsis gene ontology annotation, 18 genes were identified as TFs. To further confirm expression patterns of flavonoid-related genes, these were subjected to quantitative RT-PCR (qRT-PCR) using fruit of red-skinned apple cv. Red Delicious and yellow-skinned apple cv. Golden Delicious. Of these, two genes showed higher levels of expression in 'Red Delicious' than in 'Golden Delicious', and were likely involved in the regulation of fruit red color pigmentation. © 2013 Elsevier B.V. All rights reserved.
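    The abstract above selects genes whose expression changed by more than two-fold under light treatment. A filter of that kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: it assumes positive, already-normalized intensities, counts changes in either direction, and uses hypothetical names (`modulated_genes`, `geneA`, and so on) and made-up values.

```python
from math import log2


def modulated_genes(light: dict, dark: dict, fold: float = 2.0) -> list:
    """Return genes whose light/dark expression ratio exceeds `fold`
    in either direction (assumed symmetric up/down criterion).

    `light` and `dark` map gene identifiers to positive expression values.
    """
    threshold = log2(fold)
    selected = []
    for gene, light_value in light.items():
        if gene not in dark:
            continue  # skip genes measured in only one condition
        log_ratio = log2(light_value / dark[gene])
        if abs(log_ratio) > threshold:
            selected.append(gene)
    return sorted(selected)


if __name__ == "__main__":
    # Made-up intensities for three hypothetical genes.
    light = {"geneA": 8.0, "geneB": 3.0, "geneC": 1.0}
    dark = {"geneA": 2.0, "geneB": 2.0, "geneC": 4.0}
    print(modulated_genes(light, dark))
```

    Working on the log2 scale makes a four-fold increase and a four-fold decrease equally distant from zero, which is why the threshold is applied to the absolute log ratio.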

  11. Phylogenetic Analysis of Seven WRKY Genes across the Palm Subtribe Attaleinae (Arecaceae) Identifies Syagrus as Sister Group of the Coconut

    PubMed Central

    Meerow, Alan W.; Noblick, Larry; Borrone, James W.; Couvreur, Thomas L. P.; Mauro-Herrera, Margarita; Hahn, William J.; Kuhn, David N.; Nakamura, Kyoko; Oleas, Nora H.; Schnell, Raymond J.

    2009-01-01

    Background The Cocoseae is one of 13 tribes of Arecaceae subfam. Arecoideae, and contains a number of palms with significant economic importance, including the monotypic and pantropical Cocos nucifera L., the coconut, the origins of which have been one of the “abominable mysteries” of palm systematics for decades. Previous studies with predominantly plastid genes weakly supported American ancestry for the coconut but ambiguous sister relationships. In this paper, we use multiple single copy nuclear loci to address the phylogeny of the Cocoseae subtribe Attaleinae, and resolve the closest extant relative of the coconut. Methodology/Principal Findings We present the results of combined analysis of DNA sequences of seven WRKY transcription factor loci across 72 samples of Arecaceae tribe Cocoseae subtribe Attaleinae, representing all genera classified within the subtribe, and three outgroup taxa with maximum parsimony, maximum likelihood, and Bayesian approaches, producing highly congruent and well-resolved trees that robustly identify the genus Syagrus as sister to Cocos and resolve novel and well-supported relationships among the other genera of the Attaleinae. We also address incongruence among the gene trees with gene tree reconciliation analysis, and assign estimated ages to the nodes of our tree. Conclusions/Significance This study represents the as yet most extensive phylogenetic analyses of Cocoseae subtribe Attaleinae. We present a well-resolved and supported phylogeny of the subtribe that robustly indicates a sister relationship between Cocos and Syagrus. This is not only of biogeographic interest, but will also open fruitful avenues of inquiry regarding evolution of functional genes useful for crop improvement. Establishment of two major clades of American Attaleinae occurred in the Oligocene (ca. 37 MYBP) in Eastern Brazil. The divergence of Cocos from Syagrus is estimated at 35 MYBP. The biogeographic and morphological congruence that we see for

  12. RNA-seq Transcriptome Analysis of Panax japonicus, and Its Comparison with Other Panax Species to Identify Potential Genes Involved in the Saponins Biosynthesis

    PubMed Central

    Rai, Amit; Yamazaki, Mami; Takahashi, Hiroki; Nakamura, Michimi; Kojoma, Mareshige; Suzuki, Hideyuki; Saito, Kazuki

    2016-01-01

    The Panax genus has been a source of natural medicine, benefitting human health over the ages, among which Panax japonicus represents an important species. Our understanding of several key pathways and enzymes involved in the biosynthesis of ginsenosides, a pharmacologically active class of metabolites and major chemical constituents of the rhizome extracts from the Panax species, is limited. Limited genomic information, and lack of studies on comparative transcriptomics across the Panax species have restricted our understanding of the biosynthetic mechanisms of these and many other important classes of phytochemicals. Herein, we describe Illumina based RNA sequencing analysis to characterize the transcriptome and expression profiles of genes expressed in the five tissues of P. japonicus, and its comparison with other Panax species. RNA sequencing and de novo transcriptome assembly for P. japonicus resulted in a total of 135,235 unigenes with 78,794 (58.24%) unigenes being annotated using NCBI-nr database. Transcriptome profiling, and gene ontology enrichment analysis for five tissues of P. japonicus showed that, although overall processes were evenly conserved across all tissues, each tissue was characterized by several unique unigenes with the leaves showing the most unique unigenes among the tissues studied. A comparative analysis of the P. japonicus transcriptome assembly with publicly available transcripts from other Panax species, namely, P. ginseng, P. notoginseng, and P. quinquefolius also displayed high sequence similarity across all Panax species, with P. japonicus showing highest similarity with P. ginseng. Annotation of P. japonicus transcriptome resulted in the identification of putative genes encoding all enzymes from the triterpene backbone biosynthetic pathways, and identified 24 and 48 unigenes annotated as cytochrome P450 (CYP) and glycosyltransferases (GT), respectively. These CYPs and GTs annotated unigenes were conserved across

  13. Network Analysis of Human Genes Influencing Susceptibility to Mycobacterial Infections

    PubMed Central

    Lipner, Ettie M.; Garcia, Benjamin J.; Strong, Michael

    2016-01-01

    Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks, involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects. PMID:26751573

  14. Identification, characterization and expression analysis of lineage-specific genes within sweet orange (Citrus sinensis).

    PubMed

    Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang

    2015-11-23

    With the availability of a rapidly increasing number of genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.

  15. Relating genes to function: identifying enriched transcription factors using the ENCODE ChIP-Seq significance tool.

    PubMed

    Auerbach, Raymond K; Chen, Bin; Butte, Atul J

    2013-08-01

    Biological analysis has shifted from identifying genes and transcripts to mapping these genes and transcripts to biological functions. The ENCODE Project has generated hundreds of ChIP-Seq experiments spanning multiple transcription factors and cell lines for public use, but tools for a biomedical scientist to analyze these data are either non-existent or tailored to narrow biological questions. We present the ENCODE ChIP-Seq Significance Tool, a flexible web application leveraging public ENCODE data to identify enriched transcription factors in a gene or transcript list for comparative analyses. The ENCODE ChIP-Seq Significance Tool is written in JavaScript on the client side and has been tested on Google Chrome, Apple Safari and Mozilla Firefox browsers. Server-side scripts are written in PHP and leverage R and a MySQL database. The tool is available at http://encodeqt.stanford.edu. Contact: abutte@stanford.edu. Supplementary material is available at Bioinformatics online.
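
    The abstract above does not say which statistic the tool uses to call a transcription factor enriched. As a rough illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for gene-list enrichment. The function name and all of its parameters are hypothetical and are not taken from the tool.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    population: number of background genes (assumed background set)
    annotated:  background genes bound by the transcription factor
    sample:     size of the user's gene list
    overlap:    genes in the list that are bound by the factor
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    # Sum the probability of every outcome at least as extreme as the one observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 5 bound by the factor,
    # a list of 5 genes of which all 5 are bound.
    print(hypergeom_enrichment_p(10, 5, 5, 5))
```

    In practice the p-values for many factors would need a multiple-testing correction, which this sketch leaves out.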

  16. Identifying anti-cancer drug response related genes using an integrative analysis of transcriptomic and genomic variations with cell line-based drug perturbations.

    PubMed

    Sun, Yi; Zhang, Wei; Chen, Yunqin; Ma, Qin; Wei, Jia; Liu, Qi

    2016-02-23

    Clinical responses to anti-cancer therapies often only benefit a defined subset of patients. Predicting the best treatment strategy hinges on our ability to effectively translate genomic data into actionable information on drug responses. To achieve this goal, we compiled a comprehensive collection of baseline cancer genome data and drug response information derived from a large panel of cancer cell lines. This data set was applied to identify the signature genes relevant to drug sensitivity and their resistance by integrating CNVs and the gene expression of cell lines with in vitro drug responses. We presented an efficient in-silico pipeline for integrating heterogeneous cell line data sources with the simultaneous modeling of drug response values across all the drugs and cell lines. Potential signature genes correlated with drug response (sensitive or resistant) in different cancer types were identified. Using signature genes, our collaborative filtering-based drug response prediction model outperformed the 44 algorithms submitted to the DREAM competition on breast cancer cells. The functions of the identified drug response related signature genes were carefully analyzed at the pathway level and the synthetic lethality level. Furthermore, we validated these signature genes by applying them to the classification of the different subtypes of the TCGA tumor samples, and further uncovered their in vivo implications using clinical patient data. Our work may have promise in translating genomic data into customized marker genes relevant to the response of specific drugs for a specific cancer type of individual patients.

  17. Genes Important for Schizosaccharomyces pombe Meiosis Identified Through a Functional Genomics Screen

    PubMed Central

    Blyth, Julie; Makrantoni, Vasso; Barton, Rachael E.; Spanos, Christos; Rappsilber, Juri; Marston, Adele L.

    2018-01-01

    Meiosis is a specialized cell division that generates gametes, such as eggs and sperm. Errors in meiosis result in miscarriages and are the leading cause of birth defects; however, the molecular origins of these defects remain unknown. Studies in model organisms are beginning to identify the genes and pathways important for meiosis, but the parts list is still poorly defined. Here we present a comprehensive catalog of genes important for meiosis in the fission yeast, Schizosaccharomyces pombe. Our genome-wide functional screen surveyed all nonessential genes for roles in chromosome segregation and spore formation. Novel genes important at distinct stages of the meiotic chromosome segregation and differentiation program were identified. Preliminary characterization implicated three of these genes in centrosome/spindle pole body, centromere, and cohesion function. Our findings represent a near-complete parts list of genes important for meiosis in fission yeast, providing a valuable resource to advance our molecular understanding of meiosis. PMID:29259000

  18. Comparative phylogenomic analysis provides insights into TCP gene functions in Sorghum.

    PubMed

    Francis, Aleena; Dhaka, Namrata; Bakshi, Mohit; Jung, Ki-Hong; Sharma, Manoj K; Sharma, Rita

    2016-12-05

    Sorghum is a highly efficient C4 crop with potential to mitigate challenges associated with food, feed and fuel. TCP proteins are of particular interest for crop improvement programs due to their well-demonstrated roles in crop domestication and in shaping plant architecture, thereby affecting agronomic traits. We identified 20 TCP genes from Sorghum. Except SbTCP8, all are either intronless or contain introns in the untranslated regions. Comparative phylogenetic analysis of Arabidopsis, rice, Brachypodium and Sorghum TCP proteins revealed two distinct classes categorized into ten sub-clades. Sub-clade F is dicot-specific, whereas the A2, G1 and I1 groups only contained genes from grasses. Sub-clade B was missing in Sorghum, whereas group A1 was missing in rice, indicating species-specific divergence of TCP proteins. TCP proteins of Sorghum are enriched in disorder-promoting residues, with class I containing a higher percentage of disorder than class II proteins. Seven pairs of paralogous TCP genes were identified from Sorghum, five of which seem to predate Rice-Sorghum divergence. All of them have diverged in their expression. Based on the expression and orthology analysis, five Sorghum genes have been shortlisted for further investigation of their roles in regulating plant morphology, whereas three genes have been identified as candidates for engineering abiotic stress tolerance.

  19. The human RHOX gene cluster: target genes and functional analysis of gene variants in infertile men.

    PubMed

    Borgmann, Jennifer; Tüttelmann, Frank; Dworniczak, Bernd; Röpke, Albrecht; Song, Hye-Won; Kliesch, Sabine; Wilkinson, Miles F; Laurentino, Sandra; Gromoll, Jörg

    2016-11-15

    The X-linked reproductive homeobox (RHOX) gene cluster encodes transcription factors preferentially expressed in reproductive tissues. This gene cluster has important roles in male fertility based on phenotypic defects of Rhox-mutant mice and the finding that aberrant RHOX promoter methylation is strongly associated with abnormal human sperm parameters. However, little is known about the molecular mechanism of RHOX function in humans. Using gene expression profiling, we identified genes regulated by members of the human RHOX gene cluster. Some genes were uniquely regulated by RHOXF1 or RHOXF2/2B, while others were regulated by both of these transcription factors. Several of these regulated genes encode proteins involved in processes relevant to spermatogenesis; e.g. stress protection and cell survival. One of the target genes of RHOXF2/2B is RHOXF1, suggesting cross-regulation to enhance transcriptional responses. The potential role of RHOX in human infertility was addressed by sequencing all RHOX exons in a group of 250 patients with severe oligozoospermia. This revealed two mutations in RHOXF1 (c.515G > A and c.522C > T) and four in RHOXF2/2B (-73C > G, c.202G > A, c.411C > T and c.679G > A), of which only one (c.202G > A) was found in a control group of men with normal sperm concentration. Functional analysis demonstrated that c.202G > A and c.679G > A significantly impaired the ability of RHOXF2/2B to regulate downstream genes. Molecular modelling suggested that these mutations alter RHOXF2/F2B protein conformation. By combining clinical data with in vitro functional analysis, we demonstrate how the X-linked RHOX gene cluster may function in normal human spermatogenesis and we provide evidence that it is impaired in human male fertility.

  20. Major carcinogenic pathways identified by gene expression analysis of peritoneal mesotheliomas following chemical treatment in F344 rats

    EPA Science Inventory

    This study was performed to characterize the gene expression profile and to identify the major carcinogenic pathways involved in rat peritoneal mesothelioma (RPM) formation following treatment of Fischer 344 rats with o-nitrotoluene (o-NT) or bromochloracetic acid (BCA). Oligo a...

  1. Lentiviral vector-based insertional mutagenesis identifies genes associated with liver cancer

    PubMed Central

    Ranzani, Marco; Cesana, Daniela; Bartholomae, Cynthia C.; Sanvito, Francesca; Pala, Mauro; Benedicenti, Fabrizio; Gallina, Pierangela; Sergi, Lucia Sergi; Merella, Stefania; Bulfone, Alessandro; Doglioni, Claudio; von Kalle, Christof; Kim, Yoon Jun; Schmidt, Manfred; Tonon, Giovanni; Naldini, Luigi; Montini, Eugenio

    2013-01-01

    Transposons and γ-retroviruses have been efficiently used as insertional mutagens in different tissues to identify molecular culprits of cancer. However, these systems are characterized by recurring integrations that accumulate in tumor cells, hampering the identification of early cancer-driving events amongst bystander and progression-related events. We developed an insertional mutagenesis platform based on lentiviral vectors (LVV) by which we could efficiently induce hepatocellular carcinoma (HCC) in 3 different mouse models. By virtue of LVV’s replication-deficient nature and broad genome-wide integration pattern, LVV-based insertional mutagenesis allowed identification of 4 new liver cancer genes from a limited number of integrations. We validated the oncogenic potential of all the identified genes in vivo, with different levels of penetrance. Our newly identified cancer genes are likely to play a role in human disease, since they are upregulated and/or amplified/deleted in human HCCs and can predict clinical outcome of patients. PMID:23314173

  2. Integrating Transcriptome and Genome Re-Sequencing Data to Identify Key Genes and Mutations Affecting Chicken Eggshell Qualities

    PubMed Central

    Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate the transcriptome and genome re-sequencing to target the genetic variations which decreased the eggshell qualities. These findings further advance our understanding of eggshell calcification in the chicken uterus. PMID:25974068

  3. Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.

    PubMed

    Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui

    2016-06-01

    In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed in the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on the domain, these genes are divided into two groups and distributed in all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.

  4. The banana E2 gene family: Genomic identification, characterization, expression profiling analysis.

    PubMed

    Dong, Chen; Hu, Huigang; Jue, Dengwei; Zhao, Qiufang; Chen, Hongliang; Xie, Jianghui; Jia, Liqiang

    2016-04-01

    The E2 is at the center of a cascade of Ub1 transfers, and it links activation of the Ub1 by E1 to its eventual E3-catalyzed attachment to substrate. Although the genome-wide analysis of this family has been performed in some species, little is known about E2 genes in banana. In this study, 74 E2 genes of banana were identified and phylogenetically clustered into thirteen subgroups. The predicted banana E2 genes were distributed across all 11 chromosomes at different densities. Additionally, the E2 domain, gene structure and motif compositions were analyzed. The expression of all of the banana E2 genes was analyzed in the root, stem, leaf, flower organs, five stages of fruit development and under abiotic stresses. All of the banana E2 genes, with the exception of a few genes in each group, were expressed in at least one of the organs and fruit development stages, which indicated that the E2 genes might be involved in various aspects of the physiological and developmental processes of the banana. Quantitative RT-PCR (qRT-PCR) analysis identified that 45 E2s were induced under drought and 33 E2s under salt. To the best of our knowledge, this report describes the first genome-wide analysis of the banana E2 gene family, and the results should provide valuable information for understanding the classification, cloning and putative functions of this family. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. GESearch: An Interactive GUI Tool for Identifying Gene Expression Signature.

    PubMed

    Ye, Ning; Yin, Hengfu; Liu, Jingjing; Dai, Xiaogang; Yin, Tongming

    2015-01-01

    The huge amount of gene expression data generated by microarray and next-generation sequencing technologies presents challenges in exploiting their biological meanings. When searching for coexpressed genes, the data mining process is largely affected by the selection of algorithms. Thus, it is highly desirable to provide multiple options of algorithms in a user-friendly analytical toolkit to explore the gene expression signatures. For this purpose, we developed GESearch, an interactive graphical user interface (GUI) toolkit, which is written in MATLAB and supports a variety of gene expression data files. This analytical toolkit provides four models, including the mean, the regression, the delegate, and the ensemble models, to identify the coexpressed genes, and enables the users to filter data and to select gene expression patterns by browsing the display window or by importing knowledge-based genes. Subsequently, the utility of this analytical toolkit is demonstrated by analyzing two sets of real-life microarray datasets from cell-cycle experiments. Overall, we have developed an interactive GUI toolkit that allows for choosing multiple algorithms for analyzing the gene expression signatures.

  6. Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

    PubMed

    Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C

    2009-11-24

    Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
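
    To make the idea of co-occurrence profiles concrete, here is a minimal sketch that scores gene pairs by the Jaccard similarity of the genome sets in which they occur. It is not IMG's implementation: the similarity measure, the 0.8 threshold, and the gene and genome names are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of genome identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.8):
    """Return (gene1, gene2, score) for gene pairs with similar profiles.

    profiles maps a gene (family) name to the set of genomes containing it.
    The threshold is an arbitrary illustrative cutoff.
    """
    genes = sorted(profiles)
    pairs = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            score = jaccard(profiles[g1], profiles[g2])
            if score >= threshold:
                pairs.append((g1, g2, score))
    return pairs


if __name__ == "__main__":
    # Hypothetical presence/absence data for three gene families.
    profiles = {
        "geneA": {"genome1", "genome2", "genome3"},
        "geneB": {"genome1", "genome2", "genome3"},
        "geneC": {"genome4"},
    }
    print(cooccurring_pairs(profiles))
```

    Genes that appear and disappear together across many genomes are candidates for a shared function, which is the observation the abstract builds on.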

  7. TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

    PubMed Central

    Gunasekara, Chathura; Zhang, Kui; Deng, Wenping; Brown, Laura

    2018-01-01

    Despite their important roles, the regulators for most metabolic pathways and biological processes remain elusive. Presently, methods for identifying metabolic pathway and biological process regulators are intensively sought after. We developed a novel algorithm called triple-gene mutual interaction (TGMI) for identifying these regulators using high-throughput gene expression data. It first calculates the regulatory interactions among triple gene blocks (two pathway genes and one transcription factor (TF)) using conditional mutual information, and then identifies significantly interacting triple genes using a novel mutual interaction measure (MIM), which was substantiated to reflect strengths of regulatory interactions within each triple gene block. TGMI calculates the MIM for each triple gene block and then examines its statistical significance using the bootstrap. Finally, the frequencies of all TFs present in all significantly interacting triple gene blocks are calculated and ranked. We showed that the TFs with higher frequencies were usually genuine pathway regulators upon evaluating multiple pathways in plants, animals and yeast. Comparison of TGMI with several other algorithms demonstrated its higher accuracy. Therefore, TGMI will be a valuable tool that can help biologists to identify regulators of metabolic pathways and biological processes from the rapidly growing high-throughput gene expression data in public repositories. PMID:29579312
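
    The building block named in the abstract above, conditional mutual information, can be sketched for discretized expression values as follows. This is only an illustration of I(X;Y|Z) on a triple of variables. It is not the authors' MIM or their bootstrap procedure. The equal-width binning and the choice of which gene plays the conditioning role are assumptions.

```python
from collections import Counter
from math import log2


def discretize(values, n_bins=3):
    """Equal-width binning of expression values into integer labels (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) in bits for three equal-length sequences of discrete labels."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts.
        ratio = (n_xyz * c_z[zi]) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        cmi += (n_xyz / n) * log2(ratio)
    return cmi


if __name__ == "__main__":
    # Hypothetical expression values for two pathway genes and one TF.
    gene1 = discretize([0.1, 0.9, 0.2, 0.8])
    gene2 = discretize([1.0, 5.0, 1.2, 4.8])
    tf = discretize([2.0, 2.0, 2.0, 2.0])
    print(conditional_mutual_information(gene1, gene2, tf))
```

    A full pipeline in the spirit of the abstract would repeat such a calculation over many triple gene blocks, assess significance by resampling, and count how often each TF appears among the significant blocks.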

  8. Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava

    PubMed Central

    Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

    2016-01-01

    The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava. PMID:26904033

  9. Genome-Wide Identification and Expression Analysis of the WRKY Gene Family in Cassava.

    PubMed

    Wei, Yunxie; Shi, Haitao; Xia, Zhiqiang; Tie, Weiwei; Ding, Zehong; Yan, Yan; Wang, Wenquan; Hu, Wei; Li, Kaimian

    2016-01-01

    The WRKY family, a large family of transcription factors (TFs) found in higher plants, plays central roles in many aspects of physiological processes and adaption to environment. However, little information is available regarding the WRKY family in cassava (Manihot esculenta). In the present study, 85 WRKY genes were identified from the cassava genome and classified into three groups according to conserved WRKY domains and zinc-finger structure. Conserved motif analysis showed that all of the identified MeWRKYs had the conserved WRKY domain. Gene structure analysis suggested that the number of introns in MeWRKY genes varied from 1 to 5, with the majority of MeWRKY genes containing three exons. Expression profiles of MeWRKY genes in different tissues and in response to drought stress were analyzed using the RNA-seq technique. The results showed that 72 MeWRKY genes had differential expression in their transcript abundance and 78 MeWRKY genes were differentially expressed in response to drought stresses in different accessions, indicating their contribution to plant developmental processes and drought stress resistance in cassava. Finally, the expression of 9 WRKY genes was analyzed by qRT-PCR under osmotic, salt, ABA, H2O2, and cold treatments, indicating that MeWRKYs may be involved in different signaling pathways. Taken together, this systematic analysis identifies some tissue-specific and abiotic stress-responsive candidate MeWRKY genes for further functional assays in planta, and provides a solid foundation for understanding of abiotic stress responses and signal transduction mediated by WRKYs in cassava.

  10. Identifying novel glioma associated pathways based on systems biology level meta-analysis.

    PubMed

    Hu, Yangfan; Li, Jinquan; Yan, Wenying; Chen, Jiajia; Li, Yin; Hu, Guang; Shen, Bairong

    2013-01-01

    Recent advances in microarray technology, including genomics, proteomics, and metabolomics, bring a great challenge for integrating these "-omics" data to analyze complex disease. Glioma is an extremely aggressive and lethal form of brain tumor, and thus the study of the molecular mechanism underlying glioma remains very important. To date, most studies have focused on detecting the differentially expressed genes in glioma. However, meta-analysis for pathway analysis based on multiple microarray datasets has not been systematically pursued. In this study, we therefore developed a systems biology based approach by integrating three types of omics data to identify common pathways in glioma. Firstly, the meta-analysis was performed to study the overlapping of signatures at different levels based on the microarray gene expression data of glioma. Among these gene expression datasets, 12 pathways were found in the GeneGO database that were shared by four stages. Then, microRNA expression profiles and ChIP-seq data were integrated for further pathway enrichment analysis. As a result, we suggest that 5 of these pathways could serve as putative pathways in glioma. Among them, the pathway of TGF-beta-dependent induction of EMT via SMAD is of particular importance. Our results demonstrate that meta-analysis at the systems biology level provides a more useful approach to study the molecular mechanism of complex disease. The integration of different types of omics data, including gene expression microarrays, microRNA and ChIP-seq data, suggests some common pathways correlated with glioma. These findings will offer useful potential candidates for targeted therapeutic intervention of glioma.

  11. Comparative analysis of gene expression profiles of hip articular cartilage between non-traumatic necrosis and osteoarthritis.

    PubMed

    Wang, Wenyu; Liu, Yang; Hao, Jingcan; Zheng, Shuyu; Wen, Yan; Xiao, Xiao; He, Awen; Fan, Qianrui; Zhang, Feng; Liu, Ruiyu

    2016-10-10

    Hip cartilage destruction is consistently observed in non-traumatic osteonecrosis of the femoral head (NOFH) and accelerates its bone necrosis. The molecular mechanism underlying the cartilage damage of NOFH remains elusive. In this study, we conducted a systematic comparative study of gene expression profiles between NOFH and osteoarthritis (OA). Hip articular cartilage specimens were collected from 12 NOFH patients and 12 controls with traumatic femoral neck fracture for microarray (n=4) and quantitative real-time PCR validation experiments (n=8). Gene expression profiling of articular cartilage was performed using the Agilent Human 4×44K Microarray chip. The accuracy of the microarray experiment was further validated by qRT-PCR. Gene expression results of OA hip cartilage were derived from a previously published study. Significance Analysis of Microarrays (SAM) software was applied for identifying differentially expressed genes. Gene ontology (GO) and pathway enrichment analysis were conducted by Gene Set Enrichment Analysis software and the DAVID tool, respectively. In total, 27 differentially expressed genes were identified for NOFH. Comparing the gene expression profiles of NOFH cartilage and OA cartilage detected 8 common differentially expressed genes, including COL5A1, OGN, ANGPTL4, CRIP1, NFIL3, METRNL, ID2 and STEAP1. GO comparative analysis identified 10 common significant GO terms, mainly implicated in apoptosis and development processes. Pathway comparative analysis observed that the ECM-receptor interaction pathway and focal adhesion pathway were enriched in the differentially expressed genes of both NOFH and hip OA. In conclusion, we identified a set of differentially expressed genes, GO terms and pathways for NOFH articular destruction, some of which were also involved in hip OA. Our study results may help to reveal the pathogenetic similarities and differences of cartilage damage of NOFH and hip OA. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Metadata Analysis of Phanerochaete chrysosporium Gene Expression Data Identified Common CAZymes Encoding Gene Expression Profiles Involved in Cellulose and Hemicellulose Degradation.

    PubMed

    Kameshwar, Ayyappa Kumar Sista; Qin, Wensheng

    2017-01-01

    In the literature, the popular wood-degrading white rot fungus Phanerochaete chrysosporium has been studied more extensively for its lignin-degrading mechanisms than for its cellulose- and hemicellulose-degrading abilities. This study delineates cellulose and hemicellulose degrading mechanisms through large scale metadata analysis of P. chrysosporium gene expression data (retrieved from NCBI GEO) to understand the common expression patterns of differentially expressed genes when cultured on different growth substrates. Genes encoding glycoside hydrolase classes commonly expressed during breakdown of cellulose (GH-5,6,7,9,44,45,48) and hemicellulose (GH-2,8,10,11,26,30,43,47) were found to be highly expressed among varied growth conditions, including simple customized and complex natural plant biomass growth media. Genes encoding carbohydrate esterase class enzymes CE (1,4,8,9,15,16), polysaccharide lyase class enzymes PL-8 and PL-14, and glycosyl transferase classes GT (1,2,4,8,15,20,35,39,48) were differentially expressed in natural plant biomass growth media. Based on these results, P. chrysosporium on natural plant biomass substrates was found to express lignin and hemicellulose degrading enzymes more than cellulolytic enzymes, except GH-61 (LPMO) class enzymes, in early stages. It was observed that the fate of the P. chrysosporium transcriptome is significantly affected by the wood substrate provided. We believe the gene expression findings in this study play a crucial role in developing a genetically efficient microbe with effective cellulose and hemicellulose degradation abilities.

  13. Dynamic association rules for gene expression data analysis.

    PubMed

    Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung

    2015-10-14

    The purpose of gene expression analysis is to look for associations between the regulation of gene expression levels and phenotypic variations. Associations based on gene expression profiles have been used to determine whether the induction/repression of genes corresponds to phenotypic variations, including cell regulation, clinical diagnoses and drug development. Statistical analyses of microarray data have been developed to resolve the gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm), which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical method, based on constructing a one-sided confidence interval and hypothesis testing, to determine whether an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, the Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical method, based on the concept of the confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in a single step. The DAR algorithm was then developed for gene expression data analysis.
Four gene expression datasets showed that the proposed
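
    The support and confidence quantities mentioned in the abstract above can be illustrated with a small sketch. This is not the DAR algorithm itself, whose details are not given here; it is a minimal, assumed illustration of how the support and confidence of a rule A -> B are computed, with a one-sided lower confidence bound based on a normal approximation. The function name `rule_stats`, the default `z` value and the toy sample labels are all hypothetical.

```python
import math


def rule_stats(transactions, antecedent, consequent, z=1.645):
    """Return (support, confidence, lower_bound) for the rule antecedent -> consequent.

    lower_bound is a one-sided normal-approximation lower limit on the
    confidence (z=1.645 corresponds to roughly a 95% one-sided level).
    This is an illustrative assumption, not the published DAR procedure.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Number of samples containing the antecedent items
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    # Number of samples containing both antecedent and consequent items
    n_ab = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_ab / n if n else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    if n_a:
        se = math.sqrt(confidence * (1.0 - confidence) / n_a)
        lower = max(0.0, confidence - z * se)
    else:
        lower = 0.0
    return support, confidence, lower


# Toy data: each sample is the set of discretised events observed in it (made up).
samples = [
    {"geneA_up", "geneB_up", "disease"},
    {"geneA_up", "disease"},
    {"geneA_up", "geneB_up"},
    {"geneB_up"},
    {"geneA_up", "geneB_up", "disease"},
]
support, confidence, lower = rule_stats(samples, {"geneA_up"}, {"disease"})
print(f"support={support:.2f} confidence={confidence:.2f} lower bound={lower:.2f}")
```

    A rule would then be kept only if its support and the lower bound on its confidence exceed chosen minimum thresholds; how those thresholds are derived is specific to the cited paper and is not reproduced here.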

  14. Comparative analysis of protein interactome networks prioritizes candidate genes with cancer signatures.

    PubMed

    Li, Yongsheng; Sahni, Nidhi; Yi, Song

    2016-11-29

    Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.

  15. Comparative genomic analysis of the PKS genes in five species and expression analysis in upland cotton

    PubMed Central

    Cheng, Xi; Wang, Yanan; Abdullah, Muhammad; Li, Manli; Li, Dahui; Gao, Junshan

    2017-01-01

    Plant type III polyketide synthase (PKS) can catalyse the formation of a series of secondary metabolites with different structures and different biological functions; the enzyme plays an important role in plant growth, development and resistance to stress. At present, the PKS gene has been identified and studied in a variety of plants. Here, we identified 11 PKS genes from upland cotton (Gossypium hirsutum) and compared them with 41 PKS genes in Populus tremula, Vitis vinifera, Malus domestica and Arabidopsis thaliana. According to the phylogenetic tree, a total of 52 PKS genes can be divided into four subfamilies (I–IV). The analysis of gene structures and conserved motifs revealed that most of the PKS genes were composed of two exons and one intron and there are two characteristic conserved domains (Chal_sti_synt_N and Chal_sti_synt_C) of the PKS gene family. In our study of the five species, gene duplication was found in addition to Arabidopsis thaliana and we determined that purifying selection has been of great significance in maintaining the function of PKS gene family. From qRT-PCR analysis and a combination of the role of the accumulation of proanthocyanidins (PAs) in brown cotton fibers, we concluded that five PKS genes are candidate genes involved in brown cotton fiber pigment synthesis. These results are important for the further study of brown cotton PKS genes. It not only reveals the relationship between PKS gene family and pigment in brown cotton, but also creates conditions for improving the quality of brown cotton fiber. PMID:29104824

  16. Gene expression profile analysis of rat cerebellum under acute alcohol intoxication.

    PubMed

    Zhang, Yu; Wei, Guangkuan; Wang, Yuehong; Jing, Ling; Zhao, Qingjie

    2015-02-25

    Acute alcohol intoxication, a common disease causing damage to the central nervous system (CNS) has been primarily studied on the aspects of alcohol addiction and chronic alcohol exposure. The understanding of gene expression change in the CNS during acute alcohol intoxication is still lacking. We established a model for acute alcohol intoxication in SD rats by oral gavage. A rat cDNA microarray was used to profile mRNA expression in the cerebella of alcohol-intoxicated rats (experimental group) and saline-treated rats (control group). A total of 251 differentially expressed genes were identified in response to acute alcohol intoxication, in which 208 of them were up-regulated and 43 were down-regulated. Gene ontology (GO) term enrichment analysis and pathway analysis revealed that the genes involved in the biological processes of immune response and endothelial integrity are among the most severely affected in response to acute alcohol intoxication. We discovered five transcription factors whose consensus binding motifs are overrepresented in the promoter region of differentially expressed genes. Additionally, we identified 20 highly connected hub genes by co-expression analysis, and validated the differential expression of these genes by real-time quantitative PCR. By determining novel biological pathways and transcription factors that have functional implication to acute alcohol intoxication, our study substantially contributes to the understanding of the molecular mechanism underlying the pathology of acute alcoholism. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. Genome-wide analysis of starch metabolism genes in potato (Solanum tuberosum L.).

    PubMed

    Van Harsselaar, Jessica K; Lorenz, Julia; Senning, Melanie; Sonnewald, Uwe; Sonnewald, Sophia

    2017-01-05

    Starch is the principal constituent of potato tubers and is of considerable importance for food and non-food applications. Its metabolism has been the subject of extensive research over the past decades. Despite its importance, a description of the complete inventory of genes involved in starch metabolism and their genome organization in potato plants is still missing. Moreover, mechanisms regulating the expression of starch genes in leaves and tubers remain elusive with regard to differences between transitory and storage starch metabolism, respectively. This study aimed at identifying and mapping the complete set of potato starch genes, and at studying their expression pattern in leaves and tubers using different sets of transcriptome data. Moreover, we wanted to uncover transcription factors co-regulated with starch accumulation in tubers in order to gain insight into the regulation of starch metabolism. We identified 77 genomic loci encoding enzymes involved in starch metabolism. Novel isoforms of many enzymes were found. Their analysis will help to elucidate mechanisms of starch biosynthesis and degradation. Expression analysis of starch genes led to the identification of tissue-specific isoenzymes, suggesting differences in the transcriptional regulation of starch metabolism between potato leaf and tuber tissues. Selection of genes predominantly expressed in developing potato tubers and exhibiting an expression pattern indicative of a role in starch biosynthesis enabled the identification of possible transcriptional regulators of tuber starch biosynthesis by co-expression analysis. This study provides the annotation of the complete set of starch metabolic genes in potato plants and their genomic localizations. Novel, so far undescribed, enzyme isoforms were revealed. Comparative transcriptome analysis enabled the identification of tuber- and leaf-specific isoforms of starch genes.
This finding suggests distinct regulatory mechanisms in transitory and storage starch

  18. Identifying the genes of unconventional high temperature superconductors.

    PubMed

    Hu, Jiangping

    We elucidate a recently emergent framework in unifying the two families of high temperature (high Tc) superconductors, cuprates and iron-based superconductors. The unification suggests that the latter is simply the counterpart of the former to realize robust extended s-wave pairing symmetries in a square lattice. The unification identifies that the key ingredient (gene) of high Tc superconductors is a quasi two dimensional electronic environment in which the d-orbitals of cations that participate in strong in-plane couplings to the p-orbitals of anions are isolated near Fermi energy. With this gene, the superexchange magnetic interactions mediated by anions could maximize their contributions to superconductivity. Creating the gene requires special arrangements between local electronic structures and crystal lattice structures. The speciality explains why high Tc superconductors are so rare. An explicit prediction is made to realize high Tc superconductivity in Co/Ni-based materials with a quasi two dimensional hexagonal lattice structure formed by trigonal bipyramidal complexes.

  19. Gene expression profile analysis of Ligon lintless-1 (Li1) mutant reveals important genes and pathways in cotton leaf and fiber development.

    PubMed

    Ding, Mingquan; Jiang, Yurong; Cao, Yuefen; Lin, Lifeng; He, Shae; Zhou, Wei; Rong, Junkang

    2014-02-10

    Ligon lintless-1 (Li1) is a monogenic dominant mutant of Gossypium hirsutum (upland cotton) with a phenotype of impaired vegetative growth and short lint fibers. Despite years of research involving genetic mapping and gene expression profile analysis of Li1 mutant ovule tissues, the gene remains uncloned and the underlying pathway of cotton fiber elongation is still unclear. In this study, we report the whole genome-level deep-sequencing analysis of leaf tissues of the Li1 mutant. Differentially expressed genes in leaf tissues of mutant versus wild-type (WT) plants are identified, and the underlying pathways and potential genes that control leaf and fiber development are inferred. The results show that transcription factors AS2, YABBY5, and KANDI-like are significantly differentially expressed in mutant tissues compared with WT ones. Interestingly, several fiber development-related genes are found in the downregulated gene list of the mutant leaf transcriptome. These genes include heat shock protein family, cytoskeleton arrangement, cell wall synthesis, energy, H2O2 metabolism-related genes, and WRKY transcription factors. This finding suggests that the genes are involved in leaf morphology determination and fiber elongation. The expression data are also compared with the previously published microarray data of Li1 ovule tissues. Comparative analysis of the ovule transcriptomes of Li1 and WT reveals that a number of pathways important for fiber elongation are enriched in the downregulated gene list at different fiber development stages (0, 6, 9, 12, 15, 18dpa). Differentially expressed genes identified in both leaf and fiber samples are aligned with cotton whole genome sequences and combined with the genetic fine mapping results to identify a list of candidate genes for Li1. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Combining gene expression and genetic analyses to identify candidate genes involved in cold responses in pea.

    PubMed

    Legrand, Sylvain; Marque, Gilles; Blassiau, Christelle; Bluteau, Aurélie; Canoy, Anne-Sophie; Fontaine, Véronique; Jaminon, Odile; Bahrman, Nasser; Mautord, Julie; Morin, Julie; Petit, Aurélie; Baranger, Alain; Rivière, Nathalie; Wilmer, Jeroen; Delbreil, Bruno; Lejeune-Hénaut, Isabelle

    2013-09-01

    Cold stress affects plant growth and development. In order to better understand the responses to cold (chilling or freezing tolerance), we used two contrasted pea lines. Following a chilling period, the Champagne line becomes tolerant to frost whereas the Terese line remains sensitive. Four suppression subtractive hybridisation libraries were obtained using mRNAs isolated from pea genotypes Champagne and Terese. Using quantitative polymerase chain reaction (qPCR) performed on 159 genes, 43 and 54 genes were identified as differentially expressed at the initial time point and during the time course study, respectively. Molecular markers were developed from the differentially expressed genes and were genotyped on a population of 164 RILs derived from a cross between Champagne and Terese. We identified 5 candidate genes colocalizing with 3 different frost damage quantitative trait loci (QTL) intervals and a protein quantity locus (PQL) rich region previously reported. This investigation revealed the role of constitutive differences between both genotypes in the cold responses, in particular with genes related to glycine degradation pathway that could confer to Champagne a better frost tolerance. We showed that freezing tolerance involves a decrease of expression of genes related to photosynthesis and the expression of a gene involved in the production of cysteine and methionine that could act as cryoprotectant molecules. Although it remains to be confirmed, this study could also reveal the involvement of the jasmonate pathway in the cold responses, since we observed that two genes related to this pathway were mapped in a frost damage QTL interval and in a PQL rich region interval, respectively. Copyright © 2013 Elsevier GmbH. All rights reserved.

  1. Novel mutations and phenotypic associations identified through APC, MUTYH, NTHL1, POLD1, POLE gene analysis in Indian Familial Adenomatous Polyposis cohort.

    PubMed

    Khan, Nikhat; Lipsa, Anuja; Arunachal, Gautham; Ramadwar, Mukta; Sarin, Rajiv

    2017-05-22

    Colo-Rectal Cancer is a common cancer worldwide with 5-10% cases being hereditary. Familial Adenomatous Polyposis (FAP) syndrome is due to germline mutations in the APC or rarely MUTYH gene. NTHL1, POLD1, POLE have been recently reported in previously unexplained FAP cases. Unlike the Caucasian population, FAP phenotype and its genotypic associations have not been widely studied in several geoethnic groups. We report the first FAP cohort from South Asia and the only non-Caucasian cohort with comprehensive analysis of APC, MUTYH, NTHL1, POLD1, POLE genes. In this cohort of 112 individuals from 53 FAP families, we detected germline APC mutations in 60 individuals (45 families) and biallelic MUTYH mutations in 4 individuals (2 families). No NTHL1, POLD1, POLE mutations were identified. Fifteen novel APC mutations and a new Indian APC mutational hotspot at codon 935 were identified. Eight very rare FAP phenotype or phenotypes rarely associated with mutations outside specific APC regions were observed. APC genotype-phenotype association studies in different geo-ethnic groups can enrich the existing knowledge about phenotypic consequences of distinct APC mutations and guide counseling and risk management in different populations. A stepwise cost-effective mutation screening approach is proposed for genetic testing of south Asian FAP patients.

  2. In silico analysis of miRNA-mediated gene regulation in OCA and OA genes.

    PubMed

    Kamaraj, Balu; Gopalakrishnan, Chandrasekhar; Purohit, Rituraj

    2014-12-01

    Albinism is an autosomal recessive genetic disorder due to low secretion of melanin. The oculocutaneous albinism (OCA) and ocular albinism (OA) genes are responsible for melanin production and also act as potential targets for miRNAs. The role of miRNA is to inhibit protein synthesis partially or completely by binding to the 3'UTR of the mRNA, thus regulating gene expression. In this analysis, we predicted genetic variation in the 3'UTR of the transcript that could be a reason for low melanin production, thus causing albinism. Single nucleotide polymorphisms (SNPs) in the 3'UTR create new binding sites for miRNA; miRNA bound at these sites inhibits the translation process either partially or completely. The SNPs in the mRNA of OCA and OA genes can therefore create new miRNA binding sites, which may control gene expression and lead to hypopigmentation. We have developed a computational procedure to determine the SNPs in the 3'UTR region of the mRNA of OCA (TYR, OCA2, TYRP1 and SLC45A2) and OA (GPR143) genes that may be a potential cause of albinism. We identified 37 SNPs in five genes that are predicted to create 87 new binding sites on mRNA, which may lead to abrogation of the translation process. Expression analysis confirms that these genes are highly expressed in skin and eye regions. Enrichment analysis supports that these genes are mainly involved in eye pigmentation and the melanin biosynthesis process. The network analysis also shows how the genes interact and are expressed in a complex network. This insight provides clues for wet-lab researchers to understand the expression pattern of OCA and OA genes and the binding of mRNA and miRNA upon mutation, which is responsible for inhibition of the translation process.

  3. Multi-membership gene regulation in pathway based microarray analysis

    PubMed Central

    2011-01-01

    Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted rand indexes and hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. PMID:21939531

  4. Multi-membership gene regulation in pathway based microarray analysis.

    PubMed

    Pavlidis, Stelios P; Payne, Annette M; Swift, Stephen M

    2011-09-22

    Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted rand indexes and hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes.
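
    The abstract above compares gene-to-pathway allocations produced by different search algorithms using, among other measures, Hamming distance. As a rough illustration only (the paper's actual data structures and its fuzzy adjusted Rand index are not described here), the sketch below counts the genes on which two allocations disagree. The function name `hamming_distance`, the gene labels and the pathway names are hypothetical.

```python
def hamming_distance(alloc_a, alloc_b):
    """Count genes assigned to different pathways in two allocations.

    Each allocation is a dict mapping gene -> pathway. Both must cover the
    same set of genes; this restriction is an assumption of the sketch.
    """
    if alloc_a.keys() != alloc_b.keys():
        raise ValueError("allocations must cover the same genes")
    return sum(1 for gene in alloc_a if alloc_a[gene] != alloc_b[gene])


# Made-up allocations from two hypothetical search runs.
run_hill_climbing = {"g1": "glycolysis", "g2": "TCA", "g3": "glycolysis", "g4": "apoptosis"}
run_annealing = {"g1": "glycolysis", "g2": "TCA", "g3": "TCA", "g4": "apoptosis"}

distance = hamming_distance(run_hill_climbing, run_annealing)
agreement = 1 - distance / len(run_hill_climbing)
print(f"Hamming distance: {distance}, agreement: {agreement:.2f}")
```

    A small distance relative to the number of genes indicates that the two runs produced consistent allocations.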

  5. A recessive contiguous gene deletion causing infantile hyperinsulinism, enteropathy and deafness identifies the Usher type 1C gene.

    PubMed

    Bitner-Glindzicz, M; Lindley, K J; Rutland, P; Blaydon, D; Smith, V V; Milla, P J; Hussain, K; Furth-Lavi, J; Cosgrove, K E; Shepherd, R M; Barnes, P D; O'Brien, R E; Farndon, P A; Sowden, J; Liu, X Z; Scanlan, M J; Malcolm, S; Dunne, M J; Aynsley-Green, A; Glaser, B

    2000-09-01

    Usher syndrome type 1 describes the association of profound, congenital sensorineural deafness, vestibular hypofunction and childhood onset retinitis pigmentosa. It is an autosomal recessive condition and is subdivided on the basis of linkage analysis into types 1A through 1E. Usher type 1C maps to the region containing the genes ABCC8 and KCNJ11 (encoding components of ATP-sensitive K+ (KATP) channels), which may be mutated in patients with hyperinsulinism. We identified three individuals from two consanguineous families with severe hyperinsulinism, profound congenital sensorineural deafness, enteropathy and renal tubular dysfunction. The molecular basis of the disorder is a homozygous 122-kb deletion of 11p14-15, which includes part of ABCC8 and overlaps with the locus for Usher syndrome type 1C and DFNB18. The centromeric boundary of this deletion includes part of a gene shown to be mutated in families with type 1C Usher syndrome, and is hence assigned the name USH1C. The pattern of expression of the USH1C protein is consistent with the clinical features exhibited by individuals with the contiguous gene deletion and with isolated Usher type 1C.

  6. Integrated analysis of gene expression and methylation profiles of 48 candidate genes in breast cancer patients.

    PubMed

    Li, Zibo; Heng, Jianfu; Yan, Jinhua; Guo, Xinwu; Tang, Lili; Chen, Ming; Peng, Limin; Wu, Yepeng; Wang, Shouman; Xiao, Zhi; Deng, Zhongping; Dai, Lizhong; Wang, Jun

    2016-11-01

    Gene-specific methylation and expression have shown biological and clinical importance for breast cancer diagnosis and prognosis. Integrated analysis of gene methylation and gene expression may identify genes associated with the biological mechanisms and clinical outcome of breast cancer and aid in clinical management. Using high-throughput microfluidic quantitative PCR, we analyzed the expression profiles of 48 candidate genes in 96 Chinese breast cancer patients and investigated their correlation with gene methylation and associations with breast cancer clinical parameters. Breast cancer-specific gene expression alteration was found in 25 genes with significant expression differences between paired tumor and normal tissues. A total of 9 genes (CCND2, EGFR, GSTP1, PGR, PTGS2, RECK, SOX17, TNFRSF10D, and WIF1) showed significant negative correlation between methylation and gene expression, which was validated in the TCGA database. A total of 23 genes (ACADL, APC, BRCA2, CADM1, CAV1, CCND2, CST6, EGFR, ESR2, GSTP1, ICAM5, NPY, PGR, PTGS2, RECK, RUNX3, SFRP1, SOX17, SYK, TGFBR2, TNFRSF10D, WIF1, and WRN) annotated with potential TFBSs in the promoter regions showed negative correlation between methylation and expression. In logistic regression analysis, 31 of the 48 genes showed improved performance in disease prediction when methylation and expression coefficients were combined. Our results demonstrated the complex correlation and the possible regulatory mechanisms between DNA methylation and gene expression. Integrated analysis of methylation and expression of candidate genes could improve performance in breast cancer prediction. These findings would contribute to molecular characterization and identification of biomarkers for potential clinical applications.
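
    The negative correlation between methylation and expression reported above can be illustrated with a plain Pearson correlation across samples for one gene. The abstract does not state which correlation statistic was used, so Pearson's r is an assumption of this sketch, and the values below are invented for illustration rather than taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented per-sample values for one hypothetical gene:
# promoter methylation fraction and normalised expression level.
methylation = [0.10, 0.30, 0.50, 0.70, 0.90]
expression = [9.0, 7.5, 6.1, 4.0, 2.2]

r = pearson(methylation, expression)
print(f"r = {r:.3f}")  # a strongly negative r suggests methylation-associated silencing
```

    In practice the coefficient would be computed per gene across all patients and accompanied by a significance test with multiple-testing correction.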

  7. Screening strategies for a highly polymorphic gene: DHPLC analysis of the Fanconi anemia group A gene.

    PubMed

    Rischewski, J; Schneppenheim, R

    2001-01-30

    Patients with Fanconi anemia (Fanc) are at risk of developing leukemia. Mutations of the group A gene (FancA) are most common. A multitude of polymorphisms and mutations within the 43 exons of the gene have been described. To examine the role of heterozygosity as a risk factor for malignancies, a partially automated screening method to identify aberrations was needed. We report on our experience with DHPLC (WAVE (Transgenomic)). PCR amplification of all 43 exons from one individual was performed on one microtiter plate on a gradient thermocycler. DHPLC analysis conditions were established via melting curves, prediction software, and test runs with aberrant samples. PCR products were analyzed twice: native, and after adding a WT-PCR product. Retention patterns were compared with previously identified polymorphic PCR products or mutants. We have defined the mutation screening conditions for all 43 exons of FancA using DHPLC. So far, 40 different sequence variations have been detected in more than 100 individuals. The native analysis identifies heterozygous individuals, and the second run detects homozygous aberrations. Retention patterns are specific for the underlying sequence aberration, thus reducing sequencing demand and costs. DHPLC is a valuable tool for reproducible recognition of known sequence aberrations and screening for unknown mutations in the highly polymorphic FancA gene.

  8. Transcriptome analysis to identify genes for peptides and proteins involved in immunity and reproduction from male accessory glands and ejaculatory duct of Bactrocera dorsalis.

    PubMed

    Wei, Dong; Tian, Chuan-Bei; Liu, Shi-Huo; Wang, Tao; Smagghe, Guy; Jia, Fu-Xian; Dou, Wei; Wang, Jin-Jun

    2016-06-01

    In the male reproductive system of insects, the male accessory glands and ejaculatory duct (MAG/ED) are important organs and their primary function is to enhance the fertility of spermatozoa. Proteins secreted by the MAG/ED are also known to induce post-mating changes and immunity responses in the female insect. To understand the gene expression profile in the MAG/ED of the oriental fruit fly Bactrocera dorsalis (Hendel), an important pest of fruits, we performed Illumina-based deep sequencing of mRNA. This yielded 54,577,630 clean reads corresponding to 4.91Gb total nucleotides that were assembled and clustered into 30,669 unigenes (average 645bp). Among them, 20,419 unigenes were functionally annotated to known proteins/peptides in the Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes pathway databases. Notably, many genes were involved in immunity, and these included microbial recognition proteins and antimicrobial peptides. Subsequently, the inducible expression of these immunity-related genes was confirmed by qRT-PCR analysis when insects were challenged with immunity-inducible factors, suggesting their function in guaranteeing fertilization success. In addition, we identified some important reproductive genes, such as juvenile hormone- and ecdysteroid-related genes, in this de novo assembly. In conclusion, this transcriptomic sequencing of B. dorsalis MAG/ED provides insights to facilitate further functional research on reproduction, immunity and molecular evolution of reproductive proteins in this important agricultural pest. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Partial Least Squares Based Gene Expression Analysis in EBV- Positive and EBV-Negative Posttransplant Lymphoproliferative Disorders.

    PubMed

    Wu, Sa; Zhang, Xin; Li, Zhi-Ming; Shi, Yan-Xia; Huang, Jia-Jia; Xia, Yi; Yang, Hang; Jiang, Wen-Qi

    2013-01-01

    Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profiling facilitates the identification of biological differences between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array-specific factors. The aim of this study is to investigate the gene expression differences between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. With a microarray data set from the Gene Expression Omnibus database, we performed PLS based analysis. We acquired 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significant over-representation of dysregulated genes in immune response and cancer related biological processes. Network analysis identified three hub genes with degrees higher than 15, including CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have previously been reported to interact with EBV. Our findings shed light on the expression distinction between EBV positive and negative PTLDs, with the hope of offering theoretical support for future therapeutic study.
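
    The hub-gene criterion in the abstract above (degree higher than 15 in the interaction network) amounts to counting each node's distinct neighbours. The sketch below shows that step on an invented edge list; the function name `hub_genes`, the node labels and the way the network is built are assumptions, since the abstract does not describe the network construction.

```python
from collections import defaultdict


def hub_genes(edges, min_degree):
    """Return the sorted genes whose degree is strictly greater than min_degree.

    edges is an iterable of (gene_a, gene_b) pairs from an undirected network;
    self-loops are ignored and duplicate edges are counted once.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return sorted(g for g, nb in neighbours.items() if len(nb) > min_degree)


# Invented network: one node connected to 16 partners, plus two extra edges.
edges = [("hub", f"n{i}") for i in range(16)] + [("n0", "n1"), ("n1", "n2")]
hubs = hub_genes(edges, min_degree=15)
print(hubs)
```

    Only the node with 16 distinct neighbours passes the threshold of 15 in this toy network.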

  10. De novo transcriptome sequencing in Frankliniella occidentalis to identify genes involved in plant virus transmission and insecticide resistance.

    PubMed

    Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin

    2013-05-01

    The western flower thrips (WFT), Frankliniella occidentalis, a worldwide invasive insect, causes agricultural damage by feeding directly and by vectoring Tospoviruses, such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed the global gene expression response of WFT to TSWV infection using the Illumina sequencing platform. We compiled 59,932 unigenes and identified 36,339 unigenes by similarity analysis against public databases, most of which were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Within these annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of differentially expressed genes between TSWV-infected and non-infected WFT populations revealed that TSWV can regulate cellular processes and the immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data set not only enriches the genomic resources for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Analysis of Pax6 contiguous gene deletions in the mouse, Mus musculus, identifies regions distinct from Pax6 responsible for extreme small-eye and belly-spotting phenotypes.

    PubMed

    Favor, Jack; Bradley, Alan; Conte, Nathalie; Janik, Dirk; Pretsch, Walter; Reitmeir, Peter; Rosemann, Michael; Schmahl, Wolfgang; Wienberg, Johannes; Zaus, Irmgard

    2009-08-01

    In the mouse, Pax6 function is critical in a dose-dependent manner for proper eye development. Pax6 contiguous gene deletions were shown to be homozygous lethal at an early embryonic stage. Heterozygotes express belly spotting and extreme microphthalmia. The eye phenotype is more severe than in heterozygous Pax6 intragenic null mutants, raising the possibility that deletions are functionally different from intragenic null mutations or that a region distinct from Pax6 included in the deletions affects eye phenotype. We recovered and identified the exact regions deleted in three new Pax6 deletions. All are homozygous lethal at an early embryonic stage. None express belly spotting. One expresses extreme microphthalmia and two express the milder eye phenotype similar to Pax6 intragenic null mutants. Analysis of Pax6 expression levels and the major isoforms excluded the hypothesis that the deletions expressing extreme microphthalmia are directly due to the action of Pax6 and functionally different from intragenic null mutations. A region distinct from Pax6 containing eight genes was identified for belly spotting. A second region containing one gene (Rcn1) was identified for the extreme microphthalmia phenotype. Rcn1 encodes a Ca(2+)-binding protein that resides in the endoplasmic reticulum, participates in the secretory pathway and is expressed in the eye. Our results suggest that deletion of Rcn1 directly or indirectly contributes to the eye phenotype in Pax6 contiguous gene deletions.

  12. Meta-analysis of gene expression profiles associated with histological classification and survival in 829 ovarian cancer samples.

    PubMed

    Fekete, Tibor; Rásó, Erzsébet; Pete, Imre; Tegze, Bálint; Liko, István; Munkácsy, Gyöngyi; Sipos, Norbert; Rigó, János; Györffy, Balázs

    2012-07-01

    Transcriptomic analysis of global gene expression in ovarian carcinoma can identify dysregulated genes capable of serving as molecular markers for histology subtypes and survival. The aim of our study was to validate previous candidate signatures in an independent setting and to identify single genes capable of serving as biomarkers for ovarian cancer progression. As several datasets are available in the GEO today, we were able to perform a true meta-analysis. First, 829 samples (11 datasets) were downloaded, and the predictive power of 16 previously published gene sets was assessed. Of these, eight were capable of discriminating histology subtypes, and none was capable of predicting survival. To overcome the differences in previous studies, we used the 829 samples to identify new predictors. Then, we collected 64 ovarian cancer samples (median relapse-free survival 24.5 months) and performed TaqMan Real Time Polymerase Chain Reaction (RT-PCR) analysis for the best 40 genes associated with histology subtypes and survival. Over 90% of subtype-associated genes were confirmed. Overall survival was effectively predicted by hormone receptors (PGR and ESR2) and by TSPAN8. Relapse-free survival was predicted by MAPT and SNCG. In summary, we successfully validated several gene sets in a meta-analysis in large datasets of ovarian samples. Additionally, several individual genes identified were validated in a clinical cohort. Copyright © 2011 UICC.

  13. A multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors for functional gene analysis.

    PubMed

    Weber, Kristoffer; Bartsch, Udo; Stocking, Carol; Fehse, Boris

    2008-04-01

    Functional gene analysis requires the possibility of overexpression, as well as downregulation of one, or ideally several, potentially interacting genes. Lentiviral vectors are well suited for this purpose as they ensure stable expression of complementary DNAs (cDNAs), as well as short-hairpin RNAs (shRNAs), and can efficiently transduce a wide spectrum of cell targets when packaged within the coat proteins of other viruses. Here we introduce a multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors designed according to the "building blocks" principle. Using a wide spectrum of different fluorescent markers, including drug-selectable enhanced green fluorescent protein (eGFP)- and dTomato-blasticidin-S resistance fusion proteins, LeGO vectors allow simultaneous analysis of multiple genes and shRNAs of interest within single, easily identifiable cells. Furthermore, each functional module is flanked by unique cloning sites, ensuring flexibility and individual optimization. The efficacy of these vectors for analyzing multiple genes in a single cell was demonstrated in several different cell types, including hematopoietic, endothelial, and neural stem and progenitor cells, as well as hepatocytes. LeGO vectors thus represent a valuable tool for investigating gene networks using conditional ectopic expression and knock-down approaches simultaneously.

  14. Microarray expression profiling identifies genes with altered expression in HDL-deficient mice

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Callow, Matthew J.; Dudoit, Sandrine; Gong, Elaine L.

    2000-05-05

    Based on the assumption that severe alterations in the expression of genes known to be involved in HDL metabolism may affect the expression of other genes, we screened an array of over 5000 mouse expressed sequence tags (ESTs) for altered gene expression in the livers of two lines of mice with dramatic decreases in HDL plasma concentrations. Labeled cDNA from livers of apolipoprotein AI (apo AI) knockout mice, Scavenger Receptor BI (SR-BI) transgenic mice and control mice were co-hybridized to microarrays. Two-sample t-statistics were used to identify genes with altered expression levels in the knockout or transgenic mice compared with the control mice. In the SR-BI group we found 9 array elements representing at least 5 genes to be significantly altered on the basis of an adjusted p value of less than 0.05. In the apo AI knockout group 8 array elements representing 4 genes were altered compared with the control group (p < 0.05). Several of the genes identified in the SR-BI transgenic suggest altered sterol metabolism and oxidative processes. These studies illustrate the use of multiple-testing methods for the identification of genes with altered expression in replicated microarray experiments of apo AI knockout and SR-BI transgenic mice.
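    The general pattern described above, a two-sample t-statistic per array element followed by a multiple-testing adjustment, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the exact permutation p-value and the Holm step-down adjustment are stand-ins chosen because they need only the Python standard library, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic allowing unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def permutation_p(a, b):
    """Exact two-sided permutation p-value for |t|.

    Enumerates every way of relabelling the pooled samples into groups of
    the original sizes, so it is only practical for small groups. Assumes
    no relabelling yields a group with zero variance.
    """
    pooled = list(a) + list(b)
    observed = abs(welch_t(a, b))
    indices = range(len(pooled))
    hits = total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(welch_t(g1, g2)) >= observed - 1e-12:
            hits += 1
    return hits / total


def holm_adjust(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


# Toy intensities (invented for illustration) for one array element.
control = [1.0, 2.0, 3.0, 4.0]
treated = [10.0, 11.0, 12.0, 13.0]
print(permutation_p(control, treated))
print(holm_adjust([0.01, 0.04, 0.03]))
```

    An element would then be called significant when its adjusted p-value falls below the chosen cutoff, such as the 0.05 used in the abstract.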

  15. Transcriptional Network Analysis Identifies BACH1 as a Master Regulator of Breast Cancer Bone Metastasis

    PubMed Central

    Liang, Yajun; Wu, Heng; Lei, Rong; Chong, Robert A.; Wei, Yong; Lu, Xin; Tagkopoulos, Ilias; Kung, Sun-Yuan; Yang, Qifeng; Hu, Guohong; Kang, Yibin

    2012-01-01

    The application of functional genomic analysis of breast cancer metastasis has led to the identification of a growing number of organ-specific metastasis genes, which often function in concert to facilitate different steps of the metastatic cascade. However, the gene regulatory network that controls the expression of these metastasis genes remains largely unknown. Here, we demonstrate a computational approach for the deconvolution of transcriptional networks to discover master regulators of breast cancer bone metastasis. Several known regulators of breast cancer bone metastasis such as Smad4 and HIF1 were identified in our analysis. Experimental validation of the networks revealed BACH1, a basic leucine zipper transcription factor, as the common regulator of several functional metastasis genes, including MMP1 and CXCR4. Ectopic expression of BACH1 enhanced the malignancy of breast cancer cells, and conversely, BACH1 knockdown significantly reduced bone metastasis. The expression of BACH1 and its target genes was linked to a higher risk of breast cancer recurrence in patients. This study established BACH1 as the master regulator of breast cancer bone metastasis and provided a paradigm to identify molecular determinants in complex pathological processes. PMID:22875853

  16. Digital Gene Expression Analysis Provides Insight into the Transcript Profile of the Genes Involved in Aporphine Alkaloid Biosynthesis in Lotus (Nelumbo nucifera)

    PubMed Central

    Yang, Mei; Zhu, Lingping; Li, Ling; Li, Juanjuan; Xu, Liming; Feng, Ji; Liu, Yanling

    2017-01-01

    The predominant alkaloids in lotus leaves are aporphine alkaloids. These are the most important active components and have many pharmacological properties, but little is known about their biosynthesis. We used digital gene expression (DGE) technology to identify differentially-expressed genes (DEGs) between two lotus cultivars with different alkaloid contents at four leaf development stages. We also predicted potential genes involved in aporphine alkaloid biosynthesis by weighted gene co-expression network analysis (WGCNA). Approximately 335 billion nucleotides were generated, 94% of which were aligned against the reference genome. Of 22 thousand expressed genes, 19,000 were differentially expressed between the two cultivars at the four stages. Gene Ontology (GO) enrichment analysis revealed that catalytic activity and oxidoreductase activity were enriched significantly in most pairwise comparisons. In Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, dozens of DEGs were assigned to the categories of biosynthesis of secondary metabolites, isoquinoline alkaloid biosynthesis, and flavonoid biosynthesis. The genes encoding norcoclaurine synthase (NCS), norcoclaurine 6-O-methyltransferase (6OMT), coclaurine N-methyltransferase (CNMT), N-methylcoclaurine 3′-hydroxylase (NMCH), and 3′-hydroxy-N-methylcoclaurine 4′-O-methyltransferase (4′OMT) in the common pathways of benzylisoquinoline alkaloid biosynthesis and the ones encoding corytuberine synthase (CTS) in the aporphine alkaloid biosynthetic pathway, which have been characterized in other plants, were identified in lotus. These genes had positive effects on alkaloid content, albeit with phenotypic lag. The WGCNA of DEGs revealed that one network module was associated with the dynamic change of alkaloid content. Eleven genes encoding proteins with methyltransferase, oxidoreductase and CYP450 activities were identified. These were surmised to be genes involved in aporphine alkaloid biosynthesis. This

  17. Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent.

    PubMed

    Allman, Elizabeth S; Degnan, James H; Rhodes, John A

    2011-06-01

    Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals, each with many genes, splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.
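    The four-species case can be made concrete with a short sketch. It relies on the standard closed form for the multispecies coalescent with one gene per species, stated here as background rather than quoted from the abstract: if x is the length of the internal edge of the unrooted species tree in coalescent units, the unrooted gene tree topology matching the species tree has probability 1 - (2/3)exp(-x), and each of the two alternatives has probability (1/3)exp(-x). The function names are hypothetical.

```python
from math import exp, log


def unrooted_quartet_probs(internal_length):
    """Probabilities of the three unrooted gene tree topologies for four
    species, given the internal edge length x in coalescent units."""
    p_other = exp(-internal_length) / 3.0
    return {
        "matching": 1.0 - 2.0 * p_other,
        "alternative_1": p_other,
        "alternative_2": p_other,
    }


def internal_length_from_match(p_match):
    """Invert the closed form: recover x from the probability of the
    matching topology (requires 1/3 < p_match < 1)."""
    if not 1.0 / 3.0 < p_match < 1.0:
        raise ValueError("p_match must lie strictly between 1/3 and 1")
    return -log(1.5 * (1.0 - p_match))


probs = unrooted_quartet_probs(1.0)
print(probs)
print(internal_length_from_match(probs["matching"]))
```

    The distribution depends on the species tree only through the matching topology and the single value x, which is consistent with the statement that the root location and part of the branch-length information are not identifiable with four species.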

  18. In Silico Analysis Identifies a Novel Role for Androgens in the Regulation of Human Endometrial Apoptosis

    PubMed Central

    Marshall, Elaine; Lowrey, Jacqueline; MacPherson, Sheila; Maybin, Jacqueline A.; Collins, Frances; Critchley, Hilary O. D.

    2011-01-01

    Context: The endometrium is a multicellular, steroid-responsive tissue that undergoes dynamic remodeling every menstrual cycle in preparation for implantation and, in absence of pregnancy, menstruation. Androgen receptors are present in the endometrium. Objective: The objective of the study was to investigate the impact of androgens on human endometrial stromal cells (hESC). Design: Bioinformatics was used to identify an androgen-regulated gene set and processes associated with their function. Regulation of target genes and impact of androgens on cell function were validated using primary hESC. Setting: The study was conducted at the University Research Institute. Patients: Endometrium was collected from women with regular menses; tissues were used for recovery of cells, total mRNA, or protein and for immunohistochemistry. Results: A new endometrial androgen target gene set (n = 15) was identified. Bioinformatics revealed 12 of these genes interacted in one pathway and identified an association with control of cell survival. Dynamic androgen-dependent changes in expression of the gene set were detected in hESC with nine significantly down-regulated at 2 and/or 8 h. Treatment of hESC with dihydrotestosterone reduced staurosporine-induced apoptosis and cell migration/proliferation. Conclusions: Rigorous in silico analysis resulted in identification of a group of androgen-regulated genes expressed in human endometrium. Pathway analysis and functional assays suggest androgen-dependent changes in gene expression may have a significant impact on stromal cell proliferation, migration, and survival. These data provide the platform for further studies on the role of circulatory or local androgens in the regulation of endometrial function and identify androgens as candidates in the pathogenesis of common endometrial disorders including polycystic ovarian syndrome, cancer, and endometriosis. PMID:21865353

  19. When is hub gene selection better than standard meta-analysis?

    PubMed

    Langfelder, Peter; Mischel, Paul S; Horvath, Steve

    2013-01-01

    Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2).
The article also reports a comparison of meta-analysis techniques applied to
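    Intramodular hub status, as discussed in this record, can be illustrated for a single data set with a minimal sketch. It assumes an unsigned weighted network in which the adjacency between two genes is the absolute Pearson correlation raised to a soft-threshold power, and it defines a gene's intramodular connectivity as the sum of its adjacencies to the other genes in the same module. The power of 6, the function names and the toy expression values are assumptions made for illustration; the consensus-module construction across several data sets is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def intramodular_connectivity(expr, module, beta=6):
    """Sum of |correlation|**beta between each gene and the other genes
    of the same module. `expr` maps gene name to expression profile."""
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in module if h != g)
        for g in module
    }


# Toy expression profiles (invented values) for a three-gene module.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
k_im = intramodular_connectivity(expr, ["g1", "g2", "g3"])
print(sorted(k_im, key=k_im.get, reverse=True))
```

    Genes at the top of this ranking would be the candidate intramodular hubs; a meta-analysis would instead rank genes by a combined significance statistic across data sets.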

  20. Microarray analysis identifies keratin loci as sensitive biomarkers for thyroid hormone disruption in the salamander Ambystoma mexicanum.

    PubMed

    Page, Robert B; Monaghan, James R; Samuels, Amy K; Smith, Jeramiah J; Beachy, Christopher K; Voss, S Randal

    2007-02-01

    Ambystomatid salamanders offer several advantages for endocrine disruption research, including genomic and bioinformatics resources, an accessible laboratory model (Ambystoma mexicanum), and natural lineages that are broadly distributed among North American habitats. We used microarray analysis to measure the relative abundance of transcripts isolated from A. mexicanum epidermis (skin) after exogenous application of thyroid hormone (TH). Only one gene had a >2-fold change in transcript abundance after 2 days of TH treatment. However, hundreds of genes showed significantly different transcript levels at days 12 and 28 in comparison to day 0. A list of 123 TH-responsive genes was identified using statistical, BLAST, and fold level criteria. Cluster analysis identified two groups of genes with similar transcription patterns: up-regulated versus down-regulated. Most notably, several keratins exhibited dramatic (1000 fold) increases or decreases in transcript abundance. Keratin gene expression changes coincided with morphological remodeling of epithelial tissues. This suggests that keratin loci can be developed as sensitive biomarkers to assay temporal disruptions of larval-to-adult gene expression programs. Our study has identified the first collection of loci that are regulated during TH-induced metamorphosis in a salamander, thus setting the stage for future investigations of TH disruption in the Mexican axolotl and other salamanders of the genus Ambystoma.

  1. Differential analysis between somatic mutation and germline variation profiles reveals cancer-related genes.

    PubMed

    Przytycki, Pawel F; Singh, Mona

    2017-08-25

    A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. We introduce a new framework for uncovering cancer genes, differential mutation analysis, which compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. We present DiffMut, a fast and simple approach for differential mutational analysis, and demonstrate that it is more effective in discovering cancer genes than considerably more sophisticated approaches. We conclude that germline variation across healthy human genomes provides a powerful means for characterizing somatic mutation frequency and identifying cancer driver genes. DiffMut is available at https://github.com/Singh-Lab/Differential-Mutation-Analysis .

  2. Genome-wide analysis of the WRKY gene family in physic nut (Jatropha curcas L.).

    PubMed

    Xiong, Wangdan; Xu, Xueqin; Zhang, Lin; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang

    2013-07-25

    The WRKY proteins, which contain highly conserved WRKYGQK amino acid sequences and zinc-finger-like motifs, constitute a large family of transcription factors in plants. They participate in diverse physiological and developmental processes. WRKY genes have been identified and characterized in a number of plant species. We identified a total of 58 WRKY genes (JcWRKY) in the genome of the physic nut (Jatropha curcas L.). On the basis of their conserved WRKY domain sequences, all of the JcWRKY proteins could be assigned to one of the previously defined groups, I-III. Phylogenetic analysis of JcWRKY genes with Arabidopsis and rice WRKY genes, and separately with castor bean WRKY genes, revealed no evidence of recent gene duplication in the JcWRKY gene family. Transcript abundance of JcWRKY gene products was tested in different tissues under normal growth conditions. In addition, 47 WRKY genes responded to at least one abiotic stress (drought, salinity, phosphate starvation and nitrogen starvation) in individual tissues (leaf, root and/or shoot cortex). Our study provides a useful reference data set as the basis for cloning and functional analysis of physic nut WRKY genes. Copyright © 2013 Elsevier B.V. All rights reserved.

  3. [BIOINFORMATIC SEARCH AND PHYLOGENETIC ANALYSIS OF THE CELLULOSE SYNTHASE GENES OF FLAX (LINUM USITATISSIMUM)].

    PubMed

    Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B

    2015-01-01

    A bioinformatic search of sequences encoding cellulose synthase genes in the flax genome, and their comparison to dicot orthologs, was carried out. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, and the remaining 16 of which are likely to encode cellulose synthase-like proteins (Csl). Phylogenetic analysis of gene products of cellulose synthase genes allowed distinguishing 6 groups of cellulose synthase genes of different classes: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9, which are associated with primary cell wall formation, are characterized by a greater similarity within these classes than orthologous sequences, whereas the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in the flax genome, which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes are identified in the flax genome, but their correct classification requires additional research.

  4. A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

    PubMed

    Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

    2015-07-01

    In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in classes (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes numbered 13, whereas the orphans numbered 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immune system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
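    The weighted number of links (WNL) described above can be sketched as follows, assuming that each interaction carries a confidence score between 0 and 1 and that a gene's WNL is the sum of the scores of its links. That definition, the function names and the toy scores are assumptions made for illustration. The clustering of WNL values into classes (leader, B, C and so on) is not reproduced; only the WNL ranking and the orphan set (genes with no links) are shown.

```python
def weighted_number_of_links(genes, scored_links):
    """Sum the interaction scores of every link touching each gene.

    `scored_links` is an iterable of (gene_a, gene_b, score) tuples.
    Genes without any link keep a WNL of 0.0.
    """
    wnl = {gene: 0.0 for gene in genes}
    for gene_a, gene_b, score in scored_links:
        wnl[gene_a] = wnl.get(gene_a, 0.0) + score
        wnl[gene_b] = wnl.get(gene_b, 0.0) + score
    return wnl


def orphan_genes(wnl):
    """Genes with no scored interaction at all."""
    return sorted(gene for gene, value in wnl.items() if value == 0.0)


def ranked_by_wnl(wnl):
    """Gene names ordered by decreasing WNL, ties broken by name."""
    return sorted(wnl, key=lambda gene: (-wnl[gene], gene))


# Toy network with hypothetical gene labels and invented scores.
genes = ["G1", "G2", "G3", "G4"]
links = [("G1", "G2", 0.9), ("G1", "G3", 0.5)]
wnl = weighted_number_of_links(genes, links)
print(ranked_by_wnl(wnl))
print(orphan_genes(wnl))
```

    In this toy network G1 has the highest WNL and G4 is the only orphan; a clustering step on the WNL values would then separate the top-ranked genes from the rest.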

  5. Genome-wide characterization and expression analysis of citrus NUCLEAR FACTOR-Y (NF-Y) transcription factors identified a novel NF-YA gene involved in drought-stress response and tolerance.

    PubMed

    Pereira, Suzam L S; Martins, Cristina P S; Sousa, Aurizangela O; Camillo, Luciana R; Araújo, Caroline P; Alcantara, Grazielle M; Camargo, Danielle S; Cidade, Luciana C; de Almeida, Alex-Alan F; Costa, Marcio G C

    2018-01-01

    Nuclear factor Y (NF-Y) is a ubiquitous transcription factor found in eukaryotes. It is composed of three distinct subunits called NF-YA, NF-YB and NF-YC. NF-Ys have been identified as key regulators of multiple pathways in the control of development and tolerance to biotic and abiotic factors. The present study aimed to identify and characterize the complete repertoire of genes coding for NF-Y in citrus, as well as to perform the functional characterization of one of its members, namely CsNFYA5, in transgenic tobacco plants. A total of 22 genes coding for NF-Y were identified in the genomes of sweet orange (Citrus sinensis) and Clementine mandarin (C. clementina), including six CsNF-YAs, 11 CsNF-YBs and five CsNF-YCs. Phylogenetic analyses showed that there is an NF-Y ortholog in the Clementine genome for each sweet orange NF-Y gene; this was not observed when compared to Arabidopsis thaliana. CsNF-Y proteins shared the same conserved domains with their orthologous proteins in other organisms, including mouse. Analysis of gene expression by RNA-seq and EST data demonstrated that CsNF-Ys have a tissue-specific and stress-inducible expression profile. qRT-PCR analysis revealed that CsNF-YA5 exhibits differential expression in response to water deficit in leaves and roots of citrus plants. Overexpression of CsNF-YA5 in transgenic tobacco plants contributed to the reduction of H2O2 production under dehydration conditions and increased plant growth and photosynthetic rate under normal conditions and drought stress. These biochemical and physiological responses to drought stress promoted by CsNF-YA5 may confer a productivity advantage in environments with frequent short-term soil water deficit.

  6. Genome-wide survey and expression analysis of F-box genes in chickpea.

    PubMed

    Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata

    2015-02-13

    The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea, which were classified based on their C-terminal domain structures into 10 subfamilies. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes and duplication events were investigated, which revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. Also, maximum syntenic relationship was observed with soybean followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues as well as under abiotic stress conditions utilizing the available chickpea transcriptome data revealed differential expression patterns with several F-box genes specifically expressed in each tissue, a few of which were validated by using quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.

  7. Linking the Salt Transcriptome with Physiological Responses of a Salt-Resistant Populus Species as a Strategy to Identify Genes Important for Stress Acclimation

    PubMed Central

    Brinker, Monika; Brosché, Mikael; Vinocur, Basia; Abo-Ogiala, Atef; Fayyaz, Payam; Janz, Dennis; Ottow, Eric A.; Cullmann, Andreas D.; Saborowski, Joachim; Kangasjärvi, Jaakko; Altman, Arie; Polle, Andrea

    2010-01-01

    To investigate early salt acclimation mechanisms in a salt-tolerant poplar species (Populus euphratica), the kinetics of molecular, metabolic, and physiological changes during a 24-h salt exposure were measured. Three distinct phases of salt stress were identified by analyses of the osmotic pressure and the shoot water potential: dehydration, salt accumulation, and osmotic restoration associated with ionic stress. The duration and intensity of these phases differed between leaves and roots. Transcriptome analysis using P. euphratica-specific microarrays revealed clusters of coexpressed genes in these phases, with only 3% overlapping salt-responsive genes in leaves and roots. Acclimation of cellular metabolism to high salt concentrations involved remodeling of amino acid and protein biosynthesis and increased expression of molecular chaperones (dehydrins, osmotin). Leaves suffered initially from dehydration, which resulted in changes in transcript levels of mitochondrial and photosynthetic genes, indicating adjustment of energy metabolism. Initially, decreases in stress-related genes were found, whereas increases occurred only when leaves had restored the osmotic balance by salt accumulation. Comparative in silico analysis of the poplar stress regulon with Arabidopsis (Arabidopsis thaliana) orthologs was used as a strategy to reduce the number of candidate genes for functional analysis. Analysis of Arabidopsis knockout lines identified a lipocalin-like gene (AtTIL) and a gene encoding a protein with previously unknown functions (AtSIS) to play roles in salt tolerance. In conclusion, by dissecting the stress transcriptome of tolerant species, novel genes important for salt endurance can be identified. PMID:20959419

  8. Comparative phylogenomic analysis provides insights into TCP gene functions in Sorghum

    PubMed Central

    Francis, Aleena; Dhaka, Namrata; Bakshi, Mohit; Jung, Ki-Hong; Sharma, Manoj K.; Sharma, Rita

    2016-01-01

    Sorghum is a highly efficient C4 crop with potential to mitigate challenges associated with food, feed and fuel. TCP proteins are of particular interest for crop improvement programs due to their well-demonstrated roles in crop domestication and shaping plant architecture, thereby affecting agronomic traits. We identified 20 TCP genes from Sorghum. Except SbTCP8, all are either intronless or contain introns in the untranslated regions. Comparative phylogenetic analysis of Arabidopsis, rice, Brachypodium and Sorghum TCP proteins revealed two distinct classes categorized into ten sub-clades. Sub-clade F is dicot-specific, whereas A2, G1 and I1 groups only contained genes from grasses. Sub-clade B was missing in Sorghum, whereas group A1 was missing in rice, indicating species-specific divergence of TCP proteins. TCP proteins of Sorghum are enriched in disorder-promoting residues, with class I containing higher percent disorder than class II proteins. Seven pairs of paralogous TCP genes were identified from Sorghum, five of which seem to predate Rice-Sorghum divergence. All of them have diverged in their expression. Based on the expression and orthology analysis, five Sorghum genes have been shortlisted for further investigation for their roles in regulating plant morphology. In addition, three genes have been identified as candidates for engineering abiotic stress tolerance. PMID:27917941

  9. Analysis of resistance genes of clinical Pannonibacter phragmitetus strain 31801 by complete genome sequencing.

    PubMed

    Ming, De-Song; Chen, Qing-Qing; Chen, Xiao-Tin

    2018-05-14

    To clarify the resistance mechanisms of Pannonibacter phragmitetus 31801, isolated from the blood of a liver abscess patient, at the genomic level, we performed whole genomic sequencing using a PacBio RS II single-molecule real-time long-read sequencer. Bioinformatic analysis of the resulting sequence was then carried out to identify any possible resistance genes. Analyses included Basic Local Alignment Search Tool searches against the Antibiotic Resistance Genes Database, ResFinder analysis of the genome sequence, and Resistance Gene Identifier analysis within the Comprehensive Antibiotic Resistance Database. Prophages, clustered regularly interspaced short palindromic repeats (CRISPR), and other putative virulence factors were also identified using PHAST, CRISPRfinder, and the Virulence Factors Database, respectively. The circular chromosome and single plasmid of P. phragmitetus 31801 contained multiple antibiotic resistance genes, including those coding for three different types of β-lactamase [NPS β-lactamase (EC 3.5.2.6), β-lactamase class C, and a metal-dependent hydrolase of β-lactamase superfamily I]. In addition, genes coding for subunits of several multidrug-resistance efflux pumps were identified, including those targeting macrolides (adeJ, cmeB), tetracycline (acrB, adeAB), fluoroquinolones (acrF, ceoB), and aminoglycosides (acrD, amrB, ceoB, mexY, smeB). However, apart from the tripartite macrolide efflux pump macAB-tolC, the genome did not appear to contain the complete complement of subunit genes required for production of most of the major multidrug-resistance efflux pumps.

  10. Comparative and evolutionary analysis of the 14-3-3 family genes in eleven fishes.

    PubMed

    Cao, Jun; Tan, Xiaona

    2018-07-01

    14-3-3 proteins are a type of highly conserved acidic proteins, which are distributed over a wide variety of organisms and are involved in multiple cellular processes. However, a comparative and evolutionary analysis of this gene family has been unavailable for fish species. In this study, we identified 101 putative 14-3-3 genes in 11 fish species and divided them into 5 groups via phylogenetic analysis. Synteny analysis implied conserved and dynamic evolution characteristics near the 14-3-3 gene loci in some vertebrates. We also found that some recombination events have accelerated the evolution of this gene family. Moreover, a positive selection site was identified, and mutation of this site could reduce 14-3-3 stability. Divergent expression profiles of the zebrafish 14-3-3 genes were further investigated under organophosphorus stress, suggesting that they may be involved in different osmoregulation and immune responses. The results will serve as a foundation for further functional investigation of the 14-3-3 genes in fishes. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Genome-wide gene phylogeny of CIPK family in cassava and expression analysis of partial drought-induced genes

    PubMed Central

    Hu, Wei; Xia, Zhiqiang; Yan, Yan; Ding, Zehong; Tie, Weiwei; Wang, Lianzhe; Zou, Meiling; Wei, Yunxie; Lu, Cheng; Hou, Xiaowan; Wang, Wenquan; Peng, Ming

    2015-01-01

    Cassava is an important food and potential biofuel crop that is tolerant to multiple abiotic stressors. The mechanisms underlying these tolerances are currently poorly understood. CBL-interacting protein kinases (CIPKs) have been shown to play crucial roles in plant developmental processes, hormone signal transduction, and the response to abiotic stress. However, no data are currently available about the CIPK family in cassava. In this study, a total of 25 CIPK genes were identified from the cassava genome based on our previous genome sequencing data. Phylogenetic analysis suggested that the 25 MeCIPKs could be classified into four subfamilies, which was supported by exon-intron organizations and the architectures of conserved protein motifs. Transcriptomic analysis of a wild subspecies and two cultivated varieties showed that most MeCIPKs had different expression patterns between the wild subspecies and the cultivars in different tissues or in response to drought stress. Some orthologous genes involved in CIPK interaction networks were identified between Arabidopsis and cassava. The interaction networks and co-expression patterns of these orthologous genes revealed that the crucial pathways controlled by CIPK networks may be involved in the differential response to drought stress in different accessions of cassava. Nine MeCIPK genes were selected to investigate their transcriptional response to various stimuli, and the results showed a comprehensive response of the tested MeCIPK genes to osmotic, salt, cold, and oxidative stressors and to ABA signaling. The identification and expression analysis of the CIPK family suggested that CIPK genes are important components of development and multiple signal transduction pathways in cassava. The findings of this study will help lay a foundation for the functional characterization of the CIPK gene family and provide an improved understanding of abiotic stress responses and signal transduction in cassava. PMID:26579161

  12. Defining the gene expression signature of rhabdomyosarcoma by meta-analysis

    PubMed Central

    Romualdi, Chiara; De Pittà, Cristiano; Tombolan, Lucia; Bortoluzzi, Stefania; Sartori, Francesca; Rosolen, Angelo; Lanfranchi, Gerolamo

    2006-01-01

    Background Rhabdomyosarcoma (RMS) is a highly malignant soft tissue sarcoma of childhood that arises as a consequence of regulatory disruption of the growth and differentiation pathways of myogenic precursor cells. The pathogenic pathways involved in this tumor are mostly unknown, and therefore a better characterization of the RMS gene expression profile would represent a considerable advance. The availability of public gene expression datasets has opened up new challenges, especially for the integration of data generated by different research groups and different array platforms with the purpose of obtaining new insights into the biological process investigated. Results In this work we performed a meta-analysis on four microarray and two SAGE datasets of gene expression data on RMS in order to evaluate the degree of agreement of the biological results obtained by these different studies and to identify common regulatory pathways that could be responsible for tumor growth. Significantly enriched regulatory pathways and biological processes have been investigated, and a list of differential meta-profiles has been identified as possible candidates for the aggressiveness of RMS. Conclusion Our results point to a general down-regulation of the energy production pathways, suggesting a hypoxic physiology for RMS cells. This result agrees with the high malignancy of RMS and with its resistance to most therapeutic treatments. In this context, different isoforms of the ANT gene have been consistently identified for the first time as differentially expressed in RMS. This gene is involved in anti-apoptotic processes when cells grow in low oxygen conditions. These new insights into the biological processes responsible for RMS growth and development demonstrate the advantage of integrated analysis of gene expression studies. PMID:17090319

  13. Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.

    PubMed

    Jansen, Anne M L; Geilenkirchen, Marije A; van Wezel, Tom; Jagmohan-Changur, Shantie C; Ruano, Dina; van der Klift, Heleen M; van den Akker, Brendy E W M; Laros, Jeroen F J; van Galen, Michiel; Wagner, Anja; Letteboer, Tom G W; Gómez-García, Encarna B; Tops, Carli M J; Vasen, Hans F; Devilee, Peter; Hes, Frederik J; Morreau, Hans; Wijnen, Juul T

    2016-01-01

    Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history. Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants. Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1). This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.

  14. Comparison of gene expression in segregating families identifies genes and genomic regions involved in a novel adaptation, zinc hyperaccumulation.

    PubMed

    Filatov, Victor; Dowdle, John; Smirnoff, Nicholas; Ford-Lloyd, Brian; Newbury, H John; Macnair, Mark R

    2006-09-01

    One of the challenges of comparative genomics is to identify specific genetic changes associated with the evolution of a novel adaptation or trait. We need to be able to disassociate the genes involved with a particular character from all the other genetic changes that take place as lineages diverge. Here we show that by comparing the transcriptional profile of segregating families with that of parent species differing in a novel trait, it is possible to narrow down substantially the list of potential target genes. In addition, by assuming synteny with a related model organism for which the complete genome sequence is available, it is possible to use the cosegregation of markers differing in transcription level to identify regions of the genome which probably contain quantitative trait loci (QTLs) for the character. This novel combination of genomics and classical genetics provides a very powerful tool to identify candidate genes. We use this methodology to investigate zinc hyperaccumulation in Arabidopsis halleri, the sister species to the model plant, Arabidopsis thaliana. We compare the transcriptional profile of A. halleri with that of its sister nonaccumulator species, Arabidopsis petraea, and between accumulator and nonaccumulator F(3)s derived from the cross between the two species. We identify eight genes which consistently show greater expression in accumulator phenotypes in both roots and shoots, including two metal transporter genes (NRAMP3 and ZIP6), and cytoplasmic aconitase, a gene involved in iron homeostasis in mammals. We also show that there appear to be two QTLs for zinc accumulation, on chromosomes 3 and 7.
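
    The filtering idea in this abstract, keeping only genes that show higher expression in accumulators in every comparison, can be sketched as a set intersection. The following Python fragment is an illustrative sketch only, not the authors' analysis: the function name and the toy gene lists are assumptions, and apart from NRAMP3 and ZIP6 (named in the abstract) the gene labels are placeholders.

```python
def consistent_candidates(upregulated_by_comparison):
    """Return genes reported as up-regulated in every comparison.

    `upregulated_by_comparison` maps a comparison label (hypothetical,
    e.g. "parents_root") to an iterable of gene names called as more
    highly expressed in the accumulator phenotype for that comparison.
    """
    gene_sets = [set(genes) for genes in upregulated_by_comparison.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Toy input: placeholder names except NRAMP3 and ZIP6.
toy_calls = {
    "parents_root": {"NRAMP3", "ZIP6", "geneA", "geneB"},
    "parents_shoot": {"NRAMP3", "ZIP6", "geneB"},
    "F3_root": {"NRAMP3", "ZIP6", "geneA"},
    "F3_shoot": {"NRAMP3", "ZIP6", "geneC"},
}
candidates = consistent_candidates(toy_calls)
print(sorted(candidates))
```

    Requiring agreement across both the parental comparison and the segregating F3 families is what removes genes that differ between the species for reasons unrelated to the trait.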

  15. Heterogeneous activation of the TGFβ pathway in glioblastomas identified by gene expression-based classification using TGFβ-responsive genes

    PubMed Central

    Xu, Xie L; Kapoun, Ann M

    2009-01-01

    Background TGFβ has emerged as an attractive target for the therapeutic intervention of glioblastomas. Aberrant TGFβ overproduction in glioblastoma and other high-grade gliomas has been reported; however, to date, none of these reports has systematically examined the components of TGFβ signaling to gain a comprehensive view of TGFβ activation in large cohorts of human glioma patients. Methods TGFβ activation in mammalian cells leads to a transcriptional program that typically affects 5–10% of the genes in the genome. To systematically examine the status of TGFβ activation in high-grade glial tumors, we compiled a gene set of transcriptional response to TGFβ stimulation from tissue culture and in vivo animal studies. These genes were used to examine the status of TGFβ activation in high-grade gliomas including a large cohort of glioblastomas. Unsupervised and supervised classification analysis was performed in two independent, publicly available glioma microarray datasets. Results Unsupervised and supervised classification using the TGFβ-responsive gene list in two independent glial tumor gene expression data sets revealed various levels of TGFβ activation in these tumors. Among glioblastomas, one of the most devastating human cancers, two subgroups were identified that showed distinct TGFβ activation patterns as measured from transcriptional responses. Approximately 62% of glioblastoma samples analyzed showed strong TGFβ activation, while the rest showed a weak TGFβ transcriptional response. Conclusion Our findings suggest heterogeneous TGFβ activation in glioblastomas, which may cause potential differences in responses to anti-TGFβ therapies in these two distinct subgroups of glioblastoma patients. PMID:19192267

  16. Integrative molecular network analysis identifies emergent enzalutamide resistance mechanisms in prostate cancer

    PubMed Central

    King, Carly J.; Woodward, Josha; Schwartzman, Jacob; Coleman, Daniel J.; Lisac, Robert; Wang, Nicholas J.; Van Hook, Kathryn; Gao, Lina; Urrutia, Joshua; Dane, Mark A.; Heiser, Laura M.; Alumkal, Joshi J.

    2017-01-01

    Recent work demonstrates that castration-resistant prostate cancer (CRPC) tumors harbor countless genomic aberrations that control many hallmarks of cancer. While some specific mutations in CRPC may be actionable, many others are not. We hypothesized that genomic aberrations in cancer may operate in concert to promote drug resistance and tumor progression, and that organization of these genomic aberrations into therapeutically targetable pathways may improve our ability to treat CRPC. To identify the molecular underpinnings of enzalutamide-resistant CRPC, we performed transcriptional and copy number profiling studies using paired enzalutamide-sensitive and resistant LNCaP prostate cancer cell lines. Gene networks associated with enzalutamide resistance were revealed by performing an integrative genomic analysis with the PAthway Representation and Analysis by Direct Reference on Graphical Models (PARADIGM) tool. Amongst the pathways enriched in the enzalutamide-resistant cells were those associated with MEK, EGFR, RAS, and NFKB. Functional validation studies of 64 genes identified 10 candidate genes whose suppression led to greater effects on cell viability in enzalutamide-resistant cells as compared to sensitive parental cells. Examination of a patient cohort demonstrated that several of our functionally-validated gene hits are deregulated in metastatic CRPC tumor samples, suggesting that they may be clinically relevant therapeutic targets for patients with enzalutamide-resistant CRPC. Altogether, our approach demonstrates the potential of integrative genomic analyses to clarify determinants of drug resistance and rational co-targeting strategies to overcome resistance. PMID:29340039

  17. Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes

    NASA Astrophysics Data System (ADS)

    Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, bioinformatic analysis of the genome sequence of Enterobacter aerogenes was carried out to identify the gene encoding gelatinase. E. aerogenes was isolated from hot spring water and is a gelatinase-producing bacterium that is species-specific to porcine and fish gelatin. This bacterium therefore offers the possibility of producing enzymes specific to the gelatin of each of these two species. The E. aerogenes genome was partially sequenced, giving a total sequence size of 5.0 megabase pairs (Mbp). The pre-processing pipeline yielded 87.6 Mbp of total reads and 68.8 Mbp of high-quality reads, a high-quality percentage of 78.58%. Genome assembly produced 120 contigs, with 67.5% of contigs over 1 kilobase pair (kbp), an N50 contig length of 124856 bp and a GC content of 55.17%. About 4705 protein-coding genes were identified by protein prediction analysis. The two candidate genes selected had the highest percentage identity to gelatinase enzymes available in the Swiss-Prot and NCBI online databases. They were NODE_9_length_26866_cov_148.013245_12, containing a 1029 base pair (bp) sequence encoding 342 amino acids, and NODE_24_length_155103_cov_177.082458_62, containing a 717 bp sequence encoding 238 amino acids. Two pairs of primers (forward and reverse) were therefore designed, based on the open reading frames (ORFs) of the selected genes. Genome analysis of E. aerogenes thus identified genes encoding gelatinase.
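
    For readers unfamiliar with the assembly statistics quoted in this abstract (N50, GC content, ORF length versus protein length), the following Python sketch shows how such numbers are conventionally computed. It is not the authors' pipeline; the function names are illustrative, and the assumption that the reported ORF lengths include the stop codon is ours, made because it is consistent with the reported figures (1029 bp and 342 amino acids; 717 bp and 238 amino acids).

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def gc_percent(sequence):
    """GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def protein_length(orf_bp):
    """Amino acids encoded by an ORF, assuming its length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3 - 1


print(n50([10, 8, 6, 4, 2]))   # toy contig lengths, not data from the study
print(gc_percent("ATGC"))
print(protein_length(1029), protein_length(717))
```

    Under the stop-codon assumption, the two ORF lengths in the abstract give 342 and 238 residues, matching the reported protein sizes.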

  18. Genomics and relative expression analysis identifies key genes associated with high female to male flower ratio in Jatropha curcas L.

    PubMed

    Gangwar, Manali; Sood, Hemant; Chauhan, Rajinder Singh

    2016-04-01

    Jatropha curcas has been projected as a major source of biodiesel due to its high seed oil content (42%). A major roadblock for commercialization of Jatropha-based biodiesel is low seed yield per inflorescence, which is affected by a low female to male flower ratio (1:25-30). Molecular dissection of female flower development by analyzing genes involved in phase transitions and floral organ development is, therefore, crucial for increasing seed yield. Expression analysis of 42 genes implicated in floral organ development and sex determination was done at six floral developmental stages of a J. curcas genotype (IC561235) with an inherently higher female to male flower ratio (1:8-10). Relative expression analysis of these genes was also done on a low-ratio genotype. Genes TFL1, SUP, AP1, CRY2, CUC2, CKX1, TAA1 and PIN1 were associated with reproductive phase transition. Further, genes CUC2, TAA1, CKX1 and PIN1 were associated with female flowering, while SUP and CRY2 were associated with female flower transition. Relative expression of these genes with respect to the low female flower ratio genotype showed up to a ~7-fold increase in transcript abundance of SUP, TAA1, CRY2 and CKX1 genes in intermediate buds but no significant increase (~1.25-fold) in female flowers, thereby suggesting that these genes possibly play a significant role in increased transition towards female flowering by promoting abortion of male flower primordia. The outcome of the study has implications for feedstock improvement of J. curcas through functional validation and eventual utilization of key genes associated with female flowering.

  19. A large-scale RNA interference screen identifies genes that regulate autophagy at different stages.

    PubMed

    Guo, Sujuan; Pridham, Kevin J; Virbasius, Ching-Man; He, Bin; Zhang, Liqing; Varmark, Hanne; Green, Michael R; Sheng, Zhi

    2018-02-12

    Dysregulated autophagy is central to the pathogenesis and therapeutic development of cancer. However, how autophagy is regulated in cancer is not well understood and genes that modulate cancer autophagy are not fully defined. To gain more insights into autophagy regulation in cancer, we performed a large-scale RNA interference screen in K562 human chronic myeloid leukemia cells using monodansylcadaverine staining, an autophagy-detecting approach equivalent to immunoblotting of the autophagy marker LC3B or fluorescence microscopy of GFP-LC3B. By coupling monodansylcadaverine staining with fluorescence-activated cell sorting, we successfully isolated autophagic K562 cells where we identified 336 short hairpin RNAs. After candidate validation using Cyto-ID fluorescence spectrophotometry, LC3B immunoblotting, and quantitative RT-PCR, 82 genes were identified as autophagy-regulating genes. Twenty genes have been reported previously and the remaining 62 candidates are novel autophagy mediators. Bioinformatic analyses revealed that most candidate genes were involved in molecular pathways regulating autophagy, rather than directly participating in the autophagy process. Further autophagy flux assays revealed that 57 autophagy-regulating genes suppressed autophagy initiation, whereas 21 candidates promoted autophagy maturation. Our RNA interference screen identified genes that regulate autophagy at different stages, which helps decode autophagy regulation in cancer and offers novel avenues to develop autophagy-related therapies for cancer.

  20. Overexpression screens identify conserved dosage chromosome instability genes in yeast and human cancer

    PubMed Central

    Duffy, Supipi; Fam, Hok Khim; Wang, Yi Kan; Styles, Erin B.; Kim, Jung-Hyun; Ang, J. Sidney; Singh, Tejomayee; Larionov, Vladimir; Shah, Sohrab P.; Andrews, Brenda; Boerkoel, Cornelius F.; Hieter, Philip

    2016-01-01

    Somatic copy number amplification and gene overexpression are common features of many cancers. To determine the role of gene overexpression on chromosome instability (CIN), we performed genome-wide screens in the budding yeast for yeast genes that cause CIN when overexpressed, a phenotype we refer to as dosage CIN (dCIN), and identified 245 dCIN genes. This catalog of genes reveals human orthologs known to be recurrently overexpressed and/or amplified in tumors. We show that two genes, TDP1, a tyrosyl-DNA-phosphodiesterase, and TAF12, an RNA polymerase II TATA-box binding factor, cause CIN when overexpressed in human cells. Rhabdomyosarcoma lines with elevated human Tdp1 levels also exhibit CIN that can be partially rescued by siRNA-mediated knockdown of TDP1. Overexpression of dCIN genes represents a genetic vulnerability that could be leveraged for selective killing of cancer cells through targeting of an unlinked synthetic dosage lethal (SDL) partner. Using SDL screens in yeast, we identified a set of genes that when deleted specifically kill cells with high levels of Tdp1. One gene was the histone deacetylase RPD3, for which there are known inhibitors. Both HT1080 cells overexpressing hTDP1 and rhabdomyosarcoma cells with elevated levels of hTdp1 were more sensitive to histone deacetylase inhibitors valproic acid (VPA) and trichostatin A (TSA), recapitulating the SDL interaction in human cells and suggesting VPA and TSA as potential therapeutic agents for tumors with elevated levels of hTdp1. The catalog of dCIN genes presented here provides a candidate list to identify genes that cause CIN when overexpressed in cancer, which can then be leveraged through SDL to selectively target tumors. PMID:27551064

  1. Identifying protein domains by global analysis of soluble fragment data.

    PubMed

    Bulloch, Esther M M; Kingston, Richard L

    2014-11-15

    The production and analysis of individual structural domains is a common strategy for studying large or complex proteins, which may be experimentally intractable in their full-length form. However, identifying domain boundaries is challenging if there is little structural information concerning the protein target. One experimental procedure for mapping domains is to screen a library of random protein fragments for solubility, since truncation of a domain will typically expose hydrophobic groups, leading to poor fragment solubility. We have coupled fragment solubility screening with global data analysis to develop an effective method for identifying structural domains within a protein. A gene fragment library is generated using mechanical shearing, or by uracil doping of the gene and a uracil-specific enzymatic digest. A split green fluorescent protein (GFP) assay is used to screen the corresponding protein fragments for solubility when expressed in Escherichia coli. The soluble fragment data are then analyzed using two complementary approaches. Fragmentation "hotspots" indicate possible interdomain regions. Clustering algorithms are used to group related fragments, and concomitantly predict domain location. The effectiveness of this Domain Seeking procedure is demonstrated by application to the well-characterized human protein p85α. Copyright © 2014 Elsevier Inc. All rights reserved.
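
    The notion of a fragmentation "hotspot" can be illustrated with a small Python sketch: count how often soluble fragments begin or end near each residue, and flag positions where termini pile up. This is a simplified illustration under our own assumptions, not the published Domain Seeking procedure; the function names, window size, threshold and toy fragment coordinates are all hypothetical.

```python
def boundary_counts(fragments, protein_length):
    """Count fragment termini at each residue (1-based; index 0 unused).

    Termini that coincide with the native ends of the protein are ignored,
    since they carry no information about internal domain boundaries.
    """
    counts = [0] * (protein_length + 1)
    for start, end in fragments:
        if start != 1:
            counts[start] += 1
        if end != protein_length:
            counts[end] += 1
    return counts


def hotspots(fragments, protein_length, window=5, min_count=3):
    """Residues whose surrounding window contains at least `min_count` termini."""
    counts = boundary_counts(fragments, protein_length)
    half = window // 2
    hits = []
    for pos in range(1, protein_length + 1):
        lo = max(1, pos - half)
        hi = min(protein_length, pos + half)
        if sum(counts[lo:hi + 1]) >= min_count:
            hits.append(pos)
    return hits


# Toy soluble fragments (start, end) for a hypothetical 100-residue protein.
toy_fragments = [(1, 50), (1, 52), (3, 51), (51, 100), (53, 100)]
print(hotspots(toy_fragments, 100))
```

    In this toy case the termini cluster around residues 49 to 53, which would be read as a candidate interdomain region; a real analysis would also need to account for uneven fragment sampling.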

  2. Transcriptomic Analysis of Paeonia delavayi Wild Population Flowers to Identify Differentially Expressed Genes Involved in Purple-Red and Yellow Petal Pigmentation

    PubMed Central

    Wang, Yan; Li, Kui; Zheng, Baoqiang; Miao, Kun

    2015-01-01

    Tree peony (Paeonia suffruticosa Andrews) is a very famous traditional ornamental plant in China. P. delavayi is a species endemic to Southwest China that has aroused great interest from researchers as a precious genetic resource for flower color breeding. However, the current understanding of the molecular mechanisms of flower pigmentation in this plant is limited, hindering the genetic engineering of novel flower color in tree peonies. In this study, we conducted a large-scale transcriptome analysis based on Illumina HiSeq sequencing of cDNA libraries generated from yellow and purple-red P. delavayi petals. A total of 90,202 unigenes were obtained by de novo assembly, with an average length of 721 nt. Using Blastx, 44,811 unigenes (49.68%) were found to have significant similarity to accessions in the NR, NT, and Swiss-Prot databases. We also examined COG, GO and KEGG annotations to better understand the functions of these unigenes. Further analysis of the two digital transcriptomes revealed that 6,855 unigenes were differentially expressed between yellow and purple-red flower petals, with 3,430 up-regulated and 3,425 down-regulated. According to the RNA-Seq data and qRT-PCR analysis, we proposed that four up-regulated key structural genes, including F3H, DFR, ANS and 3GT, might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. We also found 50 differentially expressed transcription factors that might be involved in flavonoid biosynthesis. This study is the first to report genetic information for P. delavayi. The large number of gene sequences produced by transcriptome sequencing and the candidate genes identified using pathway mapping and expression profiles will provide a valuable resource for future association studies aimed at better understanding the molecular mechanisms underlying flower pigmentation in tree peonies. PMID

  3. Mapping Adipose and Muscle Tissue Expression Quantitative Trait Loci in African Americans to Identify Genes for Type 2 Diabetes and Obesity

    PubMed Central

    Sajuthi, Satria P.; Sharma, Neeraj K.; Chou, Jeff W.; Palmer, Nicholette D.; McWilliams, David R.; Beal, John; Comeau, Mary E.; Ma, Lijun; Calles-Escandon, Jorge; Demons, Jamehl; Rogers, Samantha; Cherry, Kristina; Menon, Lata; Kouba, Ethel; Davis, Donna; Burris, Marcie; Byerly, Sara J.; Ng, Maggie C.Y.; Maruthur, Nisa M.; Patel, Sanjay R.; Bielak, Lawrence F.; Lange, Leslie; Guo, Xiuqing; Sale, Michèle M.; Chan, Kei Hang; Monda, Keri L.; Chen, Gary K.; Taylor, Kira; Palmer, Cameron; Edwards, Todd L; North, Kari E.; Haiman, Christopher A.; Bowden, Donald W.; Freedman, Barry I.; Langefeld, Carl D.; Das, Swapan K.

    2016-01-01

    Relative to European Americans, type 2 diabetes (T2D) is more prevalent in African Americans (AAs). Genetic variation may modulate transcript abundance in insulin-responsive tissues and contribute to risk; yet published studies identifying expression quantitative trait loci (eQTLs) in African ancestry populations are restricted to blood cells. This study aims to develop a map of genetically regulated transcripts expressed in tissues important for glucose homeostasis in AAs, critical for identifying the genetic etiology of T2D and related traits. Quantitative measures of adipose and muscle gene expression and genotypic data were integrated in 260 non-diabetic AAs to identify expression regulatory variants. Their roles in genetic susceptibility to T2D and related metabolic phenotypes were evaluated by mining GWAS datasets. eQTL analysis identified 1,971 and 2,078 cis-eGenes in adipose and muscle, respectively. Cis-eQTLs for 885 transcripts, including top cis-eGenes CHURC1, USMG5, and ERAP2, were identified in both tissues. 62.1% of top cis-eSNPs were within ±50kb of transcription start sites, and cis-eGenes were enriched for mitochondrial transcripts. Mining GWAS databases revealed association of cis-eSNPs for more than 50 genes with T2D (e.g. PIK3C2A, RBMS1, UFSP1), gluco-metabolic phenotypes (e.g. INPP5E, SNX17, ERAP2, FN3KRP), and obesity (e.g. POMC, CPEB4). Integration of GWAS meta-analysis data from AA cohorts revealed the most significant association for cis-eSNPs of the ATP5SL and MCCC1 genes with T2D and BMI, respectively. This study developed the first comprehensive map of adipose and muscle tissue eQTLs in AAs (publicly accessible at https://mdsetaa.phs.wakehealth.edu) and identified genetically regulated transcripts for delineating genetic causes of T2D and related metabolic phenotypes. PMID:27193597
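
    The statistic "within ±50kb of transcription start sites" amounts to a simple distance test per SNP-gene pair. The Python sketch below shows one way to compute such a percentage; it is an illustration only, with hypothetical function names and made-up coordinates, and it assumes each SNP and its paired TSS lie on the same chromosome.

```python
def within_tss_window(snp_pos, tss_pos, window_bp=50_000):
    """True if the SNP lies within +/- window_bp of the transcription start site."""
    return abs(snp_pos - tss_pos) <= window_bp


def percent_within(pairs, window_bp=50_000):
    """Percentage of (snp_pos, tss_pos) pairs that fall inside the window."""
    if not pairs:
        raise ValueError("no SNP/TSS pairs given")
    hits = sum(within_tss_window(snp, tss, window_bp) for snp, tss in pairs)
    return 100.0 * hits / len(pairs)


# Made-up coordinates (bp) for four hypothetical top cis-eSNPs and their eGene TSSs.
toy_pairs = [
    (1_000_000, 1_020_000),   # 20 kb away: inside
    (2_000_000, 2_050_000),   # exactly 50 kb away: inside (inclusive bound)
    (3_000_000, 3_050_001),   # just over 50 kb: outside
    (4_000_000, 3_900_000),   # 100 kb away: outside
]
print(percent_within(toy_pairs))
```

    Whether the bound is inclusive is a design choice here; the abstract does not specify it.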

  4. Digital transcriptome profiling of normal and glioblastoma-derived neural stem cells identifies genes associated with patient survival

    PubMed Central

    2012-01-01

    Background Glioblastoma multiforme, the most common type of primary brain tumor in adults, is driven by cells with neural stem (NS) cell characteristics. Using derivation methods developed for NS cells, it is possible to expand tumorigenic stem cells continuously in vitro. Although these glioblastoma-derived neural stem (GNS) cells are highly similar to normal NS cells, they harbor mutations typical of gliomas and initiate authentic tumors following orthotopic xenotransplantation. Here, we analyzed GNS and NS cell transcriptomes to identify gene expression alterations underlying the disease phenotype. Methods Sensitive measurements of gene expression were obtained by high-throughput sequencing of transcript tags (Tag-seq) on adherent GNS cell lines from three glioblastoma cases and two normal NS cell lines. Validation by quantitative real-time PCR was performed on 82 differentially expressed genes across a panel of 16 GNS and 6 NS cell lines. The molecular basis and prognostic relevance of expression differences were investigated by genetic characterization of GNS cells and comparison with public data for 867 glioma biopsies. Results Transcriptome analysis revealed major differences correlated with glioma histological grade, and identified misregulated genes of known significance in glioblastoma as well as novel candidates, including genes associated with other malignancies or glioma-related pathways. This analysis further detected several long non-coding RNAs with expression profiles similar to neighboring genes implicated in cancer. Quantitative PCR validation showed excellent agreement with Tag-seq data (median Pearson r = 0.91) and discerned a gene set robustly distinguishing GNS from NS cells across the 22 lines. These expression alterations include oncogene and tumor suppressor changes not detected by microarray profiling of tumor tissue samples, and facilitated the identification of a GNS expression signature strongly associated with patient survival (P = 1e

  5. Biomphalaria glabrata transcriptome: cDNA microarray profiling identifies resistant- and susceptible-specific gene expression in haemocytes from snail strains exposed to Schistosoma mansoni

    PubMed Central

    Lockyer, Anne E; Spinks, Jenny; Kane, Richard A; Hoffmann, Karl F; Fitzpatrick, Jennifer M; Rollinson, David; Noble, Leslie R; Jones, Catherine S

    2008-01-01

    Background Biomphalaria glabrata is an intermediate snail host for Schistosoma mansoni, one of the important schistosomes infecting man. B. glabrata/S. mansoni provides a useful model system for investigating the intimate interactions between host and parasite. Examining differential gene expression between S. mansoni-exposed schistosome-resistant and susceptible snail lines will identify genes and pathways that may be involved in snail defences. Results We have developed a 2053 element cDNA microarray for B. glabrata containing clones from ORESTES (Open Reading frame ESTs) libraries, suppression subtractive hybridization (SSH) libraries and clones identified in previous expression studies. Snail haemocyte RNA, extracted from parasite-challenged resistant and susceptible snails, 2 to 24 h post-exposure to S. mansoni, was hybridized to the custom made cDNA microarray and 98 differentially expressed genes or gene clusters were identified, 94 resistant-associated and 4 susceptible-associated. Quantitative PCR analysis verified the cDNA microarray results for representative transcripts. Differentially expressed genes were annotated and clustered using gene ontology (GO) terminology and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. 61% of the identified differentially expressed genes have no known function including the 4 susceptible strain-specific transcripts. Resistant strain-specific expression of genes implicated in innate immunity of invertebrates was identified, including hydrolytic enzymes such as cathepsin L, a cysteine proteinase involved in lysis of phagocytosed particles; metabolic enzymes such as ornithine decarboxylase, the rate-limiting enzyme in the production of polyamines, important in inflammation and infection processes, as well as scavenging damaging free radicals produced during production of reactive oxygen species; stress response genes such as HSP70; proteins involved in signalling, such as importin 7 and copine 1

  6. Biomphalaria glabrata transcriptome: cDNA microarray profiling identifies resistant- and susceptible-specific gene expression in haemocytes from snail strains exposed to Schistosoma mansoni.

    PubMed

    Lockyer, Anne E; Spinks, Jenny; Kane, Richard A; Hoffmann, Karl F; Fitzpatrick, Jennifer M; Rollinson, David; Noble, Leslie R; Jones, Catherine S

    2008-12-29

    Biomphalaria glabrata is an intermediate snail host for Schistosoma mansoni, one of the important schistosomes infecting man. B. glabrata/S. mansoni provides a useful model system for investigating the intimate interactions between host and parasite. Examining differential gene expression between S. mansoni-exposed schistosome-resistant and susceptible snail lines will identify genes and pathways that may be involved in snail defences. We have developed a 2053 element cDNA microarray for B. glabrata containing clones from ORESTES (Open Reading frame ESTs) libraries, suppression subtractive hybridization (SSH) libraries and clones identified in previous expression studies. Snail haemocyte RNA, extracted from parasite-challenged resistant and susceptible snails, 2 to 24 h post-exposure to S. mansoni, was hybridized to the custom made cDNA microarray and 98 differentially expressed genes or gene clusters were identified, 94 resistant-associated and 4 susceptible-associated. Quantitative PCR analysis verified the cDNA microarray results for representative transcripts. Differentially expressed genes were annotated and clustered using gene ontology (GO) terminology and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. 61% of the identified differentially expressed genes have no known function including the 4 susceptible strain-specific transcripts. Resistant strain-specific expression of genes implicated in innate immunity of invertebrates was identified, including hydrolytic enzymes such as cathepsin L, a cysteine proteinase involved in lysis of phagocytosed particles; metabolic enzymes such as ornithine decarboxylase, the rate-limiting enzyme in the production of polyamines, important in inflammation and infection processes, as well as scavenging damaging free radicals produced during production of reactive oxygen species; stress response genes such as HSP70; proteins involved in signalling, such as importin 7 and copine 1, cytoplasmic intermediate

  7. Systemic analysis of genome-wide expression profiles identified potential therapeutic targets of demethylation drugs for glioblastoma.

    PubMed

    Ning, Tongbo; Cui, Hao; Sun, Feng; Zou, Jidian

    2017-09-05

    Glioblastoma is one of the most aggressive malignant brain tumors, with high morbidity and mortality. Demethylation drugs have been developed for its treatment, but little efficacy has been observed. The purpose of this study was to screen therapeutic targets of demethylation drugs or bioactive molecules for glioblastoma through systemic bioinformatics analysis. First, we downloaded genome-wide expression profiles from the Gene Expression Omnibus (GEO) and conducted the primary analysis in R, mainly including preprocessing of raw microarray data, conversion of probe IDs to gene symbols and identification of differentially expressed genes (DEGs). Second, functional enrichment analysis was conducted via the Database for Annotation, Visualization and Integrated Discovery (DAVID) to explore biological processes involved in the development of glioblastoma. Third, we constructed a protein-protein interaction (PPI) network of the genes of interest and conducted cross analysis of multiple datasets to obtain potential therapeutic targets for glioblastoma. Finally, we further confirmed the therapeutic targets through real-time RT-PCR. As a result, biological processes related to cancer development, amino metabolism, immune response, etc. were found to be significantly enriched in genes that were differentially expressed in glioblastoma and regulated by 5'aza-dC. In addition, network and cross analysis identified ACAT2, UFC1 and CYB5R1 as novel therapeutic targets of demethylation drugs, which was also confirmed by real-time RT-PCR. In conclusion, our study identified several biological processes and genes that are involved in the development of glioblastoma and regulated by 5'aza-dC, which would be helpful for the treatment of glioblastoma. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Mapping autosomal recessive intellectual disability: combined microarray and exome sequencing identifies 26 novel candidate genes in 192 consanguineous families.

    PubMed

    Harripaul, R; Vasli, N; Mikhailov, A; Rafiq, M A; Mittal, K; Windpassinger, C; Sheikh, T I; Noor, A; Mahmood, H; Downey, S; Johnson, M; Vleuten, K; Bell, L; Ilyas, M; Khan, F S; Khan, V; Moradi, M; Ayaz, M; Naeem, F; Heidari, A; Ahmed, I; Ghadami, S; Agha, Z; Zeinali, S; Qamar, R; Mozhdehipanah, H; John, P; Mir, A; Ansar, M; French, L; Ayub, M; Vincent, J B

    2018-04-01

    Approximately 1% of the global population is affected by intellectual disability (ID), and the majority receive no molecular diagnosis. Previous studies have indicated high levels of genetic heterogeneity, with estimates of more than 2500 autosomal ID genes, the majority of which are autosomal recessive (AR). Here, we combined microarray genotyping, homozygosity-by-descent (HBD) mapping, copy number variation (CNV) analysis, and whole exome sequencing (WES) to identify disease genes/mutations in 192 multiplex Pakistani and Iranian consanguineous families with non-syndromic ID. We identified definite or candidate mutations (or CNVs) in 51% of families in 72 different genes, including 26 not previously reported for ARID. The new ARID genes include nine with loss-of-function mutations (ABI2, MAPK8, MPDZ, PIDD1, SLAIN1, TBC1D23, TRAPPC6B, UBA7 and USP44), and missense mutations include the first reports of variants in BDNF or TET1 associated with ID. The genes identified also showed overlap with de novo gene sets for other neuropsychiatric disorders. Transcriptional studies showed prominent expression in the prenatal brain. The high yield of AR mutations for ID indicated that this approach has excellent clinical potential and should inform clinical diagnostics, including clinical whole exome and genome sequencing, for populations in which consanguinity is common. As with other AR disorders, the relevance will also apply to outbred populations.

  9. Phenotypes of Recessive Pediatric Cataract in a Cohort of Children with Identified Homozygous Gene Mutations (An American Ophthalmological Society Thesis)

    PubMed Central

    Khan, Arif O.; Aldahmesh, Mohammed A.; Alkuraya, Fowzan S.

    2015-01-01

    Purpose: To assess for phenotype-genotype correlations in families with recessive pediatric cataract and identified gene mutations. Methods: Retrospective review (2004 through 2013) of 26 Saudi Arabian apparently nonsyndromic pediatric cataract families referred to one of the authors (A.O.K.) and for which recessive gene mutations were identified. Results: Fifteen different homozygous recessive gene mutations were identified in the 26 consanguineous families; two genes and five families are novel to this study. Ten families had a founder CRYBB1 deletion (all with bilateral central pulverulent cataract), two had the same missense mutation in CRYAB (both with bilateral juvenile cataract with marked variable expressivity), and two had different mutations in FYCO1 (both with bilateral posterior capsular abnormality). The remaining 12 families each had mutations in 12 different genes (CRYAA, CRYBA1, AKR1E2, AGK, BFSP2, CYP27A1, CYP51A1, EPHA2, GCNT2, LONP1, RNLS, WDR87) with unique phenotypes noted for CYP27A1 (bilateral juvenile fleck with anterior and/or posterior capsular cataract and later cerebrotendinous xanthomatosis), EPHA2 (bilateral anterior persistent fetal vasculature), and BFSP2 (bilateral flecklike with cloudy cortex). Potential carrier signs were documented for several families. Conclusions: In this recessive pediatric cataract case series most identified genes are noncrystallin. Recessive pediatric cataract phenotypes are generally nonspecific, but some notable phenotypes are distinct and associated with specific gene mutations. Marked variable expressivity can occur from a recessive missense CRYAB mutation. Genetic analysis of apparently isolated pediatric cataract can sometimes uncover mutations in a syndromic gene. Some gene mutations seem to be associated with apparent heterozygous carrier signs. PMID:26622071

  10. G20210A prothrombin gene mutation identified in patients with venous leg ulcers.

    PubMed

    Jebeleanu, G; Procopciuc, L

    2001-01-01

    The G20210A variant of the prothrombin gene is the second most frequent mutation identified in patients with deep venous thrombosis, after factor V Leiden. The risk of developing deep venous thrombosis is high in patients heterozygous for the G20210A mutation. To identify this polymorphism in the gene coding for prothrombin, a 345 bp fragment in the 3'-untranslated region of the prothrombin gene was amplified by polymerase chain reaction and digested with HindIII (a restriction endonuclease). The amplification and digestion products were analyzed using agarose gel electrophoresis. We investigated 20 patients with venous leg ulcers and found 2 (10%) heterozygous for the G20210A mutation. None of the patients in the control group had the G20210A mutation. Our study confirms the presence of the G20210A mutation in the Romanian population. Our study also shows the link between venous leg ulcers and this polymorphism in the prothrombin gene.

  11. Novel genes identified in a high-density genome wide association study for nicotine dependence.

    PubMed

    Bierut, Laura Jean; Madden, Pamela A F; Breslau, Naomi; Johnson, Eric O; Hatsukami, Dorothy; Pomerleau, Ovide F; Swan, Gary E; Rutter, Joni; Bertelsen, Sarah; Fox, Louis; Fugman, Douglas; Goate, Alison M; Hinrichs, Anthony L; Konvicka, Karel; Martin, Nicholas G; Montgomery, Grant W; Saccone, Nancy L; Saccone, Scott F; Wang, Jen C; Chase, Gary A; Rice, John P; Ballinger, Dennis G

    2007-01-01

    Tobacco use is a leading contributor to disability and death worldwide, and genetic factors contribute in part to the development of nicotine dependence. To identify novel genes for which natural variation contributes to the development of nicotine dependence, we performed a comprehensive genome wide association study using nicotine dependent smokers as cases and non-dependent smokers as controls. To allow the efficient, rapid, and cost effective screen of the genome, the study was carried out using a two-stage design. In the first stage, genotyping of over 2.4 million single nucleotide polymorphisms (SNPs) was completed in case and control pools. In the second stage, we selected SNPs for individual genotyping based on the most significant allele frequency differences between cases and controls from the pooled results. Individual genotyping was performed in 1050 cases and 879 controls using 31 960 selected SNPs. The primary analysis, a logistic regression model with covariates of age, gender, genotype and gender by genotype interaction, identified 35 SNPs with P-values less than 10^-4 (minimum P-value 1.53 x 10^-6). Although none of the individual findings is statistically significant after correcting for multiple tests, additional statistical analyses support the existence of true findings in this group. Our study nominates several novel genes, such as Neurexin 1 (NRXN1), in the development of nicotine dependence while also identifying a known candidate gene, the beta3 nicotinic cholinergic receptor. This work anticipates the future directions of large-scale genome wide association studies with state-of-the-art methodological approaches and sharing of data with the scientific community.
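
    The second-stage selection step described in this abstract can be illustrated with a minimal sketch. It assumes pooled allele frequencies are already available per SNP and ranks SNPs by the raw absolute case/control difference, which is only a stand-in for the significance-based ranking the authors mention. The function name, SNP identifiers and frequencies are hypothetical.

```python
from typing import Dict, List, Tuple


def select_snps_for_stage_two(
    case_freq: Dict[str, float],
    control_freq: Dict[str, float],
    n_select: int,
) -> List[Tuple[str, float]]:
    """Rank SNPs by absolute case/control allele-frequency difference.

    Simplification: the raw difference replaces a proper test statistic.
    """
    # Only SNPs measured in both pools can be compared.
    shared = case_freq.keys() & control_freq.keys()
    diffs = [(snp, abs(case_freq[snp] - control_freq[snp])) for snp in shared]
    # Largest difference first; ties broken by SNP id for deterministic output.
    diffs.sort(key=lambda item: (-item[1], item[0]))
    return diffs[:n_select]


# Hypothetical pooled allele frequencies (not taken from the study).
case_pool = {"snp_a": 0.42, "snp_b": 0.10, "snp_c": 0.55, "snp_d": 0.31}
control_pool = {"snp_a": 0.30, "snp_b": 0.11, "snp_c": 0.50, "snp_d": 0.32}

selected = select_snps_for_stage_two(case_pool, control_pool, n_select=2)
selected_ids = [snp for snp, _ in selected]
print(selected_ids)
```

    In a real two-stage design the ranking would account for pool measurement error and sample size; the sketch only shows the filter-then-genotype idea.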

  12. A Novel Yeast Genomics Method for Identifying New Breast Cancer Susceptibility Genes

    DTIC Science & Technology

    2007-05-01

    find new candidate genes for breast cancer susceptibility in women and identifying these human genes can further improve monitoring and treatment...breast cancer susceptibility genes in humans that are currently unknown and not deducible from current methodologies. It is a fundamental...template to faithfully repair the broken strand. In human cancer it is loss of HR, rather than NHEJ, that is more important in increasing cancer

  13. Meta-type analysis of dopaminergic effects on gene expression in the neuroendocrine brain of female goldfish.

    PubMed

    Popesku, Jason T; Martyniuk, Christopher J; Trudeau, Vance L

    2012-01-01

    Dopamine (DA) is a major neurotransmitter important for neuroendocrine control and recent studies have described genomic signaling pathways activated and inhibited by DA agonists and antagonists in the goldfish brain. Here we perform a meta-type analysis using microarray datasets from experiments conducted with female goldfish to characterize the gene expression responses that underlie dopaminergic signaling. Sexually mature, pre-spawning [gonadosomatic index (GSI) = 4.5 ± 1.3%] or sexually regressing (GSI = 3 ± 0.4%) female goldfish (15-40 g) were injected intraperitoneally with either SKF 38393, LY 171555, SCH 23390, sulpiride, or a combination of 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine and α-methyl-p-tyrosine. Microarray meta-type analysis identified 268 genes in the telencephalon and hypothalamus as having reciprocal (i.e., opposite between agonism and antagonism/depletion) fold change responses, suggesting that these transcripts are likely targets for DA-mediated regulation. Noteworthy genes included ependymin, vimentin, and aromatase, genes that support the significance of DA in neuronal plasticity and tissue remodeling. Sub-network enrichment analysis (SNEA) was used to identify common gene regulators and binding proteins associated with the differentially expressed genes mediated by DA. SNEA identified gene expression targets that were related to three major categories that included cell signaling (STAT3, SP1, SMAD, Jun/Fos), immune response (IL-6, IL-1β, TNFs, cytokine, NF-κB), and cell proliferation and growth (IGF1, TGFβ1). These gene networks are also known to be associated with neurodegenerative disorders such as Parkinson's disease, which is well known to be associated with loss of dopaminergic neurons. This study identifies genes and networks that underlie DA signaling in the vertebrate CNS and provides targets that may be key neuroendocrine regulators. The results provide a foundation for future work on dopaminergic

  14. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    PubMed

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  15. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

    PubMed

    Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying them is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing their own pistil and in pollen/pistil combinations from self-sterile and self-fertile cultivars, have distinct gene expression profiles, and that many of the sequences differentially expressed between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor and fibrin gene families and a member of the Ca2+-binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the set of differentially expressed transcripts and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  16. iGC-an integrated analysis package of gene expression and copy number alteration.

    PubMed

    Lai, Yi-Pin; Wang, Liang-Bo; Wang, Wei-An; Lai, Liang-Chuan; Tsai, Mong-Hsun; Lu, Tzu-Pin; Chuang, Eric Y

    2017-01-14

    With the advancement in high-throughput technologies, researchers can simultaneously investigate gene expression and copy number alteration (CNA) data from individual patients at a lower cost. Traditional analysis methods analyze each type of data individually and integrate their results using Venn diagrams. Challenges arise, however, when the results are irreproducible and inconsistent across multiple platforms. To address these issues, one possible approach is to concurrently analyze both gene expression profiling and CNAs in the same individual. We have developed an open-source R/Bioconductor package (iGC). Multiple input formats are supported and users can define their own criteria for identifying differentially expressed genes driven by CNAs. The analysis of two real microarray datasets demonstrated that the CNA-driven genes identified by the iGC package showed significantly higher Pearson correlation coefficients with their gene expression levels and copy numbers than those genes located in a genomic region with CNA. Compared with the Venn diagram approach, the iGC package showed better performance. The iGC package is effective and useful for identifying CNA-driven genes. By simultaneously considering both comparative genomic and transcriptomic data, it can provide better understanding of biological and medical questions. The iGC package's source code and manual are freely available at https://www.bioconductor.org/packages/release/bioc/html/iGC.html .
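
    The correlation measure this abstract relies on can be sketched as follows. This is not the iGC package's API (iGC is an R/Bioconductor package); it is a minimal Python illustration of a per-gene Pearson correlation between copy number and expression across samples, with hypothetical values.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered cross-products and squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample values for one gene (not taken from the study).
copy_number = [2.0, 3.0, 4.0, 2.0, 5.0]
expression = [1.0, 1.5, 2.0, 1.0, 2.5]  # exactly 0.5 * copy number

r = pearson_r(copy_number, expression)
print(round(r, 6))
```

    A gene whose expression tracks its copy number across patients gives r close to 1, which is the property the abstract uses to compare CNA-driven genes with genes that merely sit in an altered region.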

  17. Ossification of the posterior longitudinal ligament related genes identification using microarray gene expression profiling and bioinformatics analysis.

    PubMed

    He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian

    2014-01-10

    Ossification of the posterior longitudinal ligament (OPLL) is a kind of disease with physical barriers and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in OPLL patient ligament cells and identify the target sites for the prevention and treatment of OPLL in the clinic. Gene expression data GSE5464 was downloaded from Gene Expression Omnibus; then DEGs were screened with the limma package in R, and functions and pathways altered in OPLL cells compared to normal cells were identified by DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of DEGs was constructed with STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. Response to wounding function and Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response to wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene in the network of DEGs, with the highest degree; INSR was one of the genes most closely related to it. OPLL related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL. © 2013.
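
    The notion of a hub gene "with the highest degree" can be made concrete with a minimal sketch. It assumes the interaction network is given as an undirected edge list; the gene labels and edges below are placeholders invented for illustration and do not represent the study's network.

```python
from collections import Counter
from typing import Iterable, Set, Tuple, FrozenSet


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Counter:
    """Count distinct neighbours per node in an undirected network."""
    degrees: Counter = Counter()
    seen: Set[FrozenSet[str]] = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        key = frozenset((a, b))
        if key in seen:
            continue  # ignore duplicate or reversed edges
        seen.add(key)
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Placeholder edge list (hypothetical; not real interaction data).
edges = [
    ("gene_1", "gene_2"),
    ("gene_1", "gene_3"),
    ("gene_1", "gene_4"),
    ("gene_4", "gene_5"),
    ("gene_2", "gene_1"),  # reversed duplicate of the first edge
]

degrees = node_degrees(edges)
hub, hub_degree = degrees.most_common(1)[0]
print(hub, hub_degree)
```

    Degree is only one centrality measure; it is used here because it is the one the abstract names.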

  18. Analysis of Gene Regulatory Networks of Maize in Response to Nitrogen.

    PubMed

    Jiang, Lu; Ball, Graham; Hodgman, Charlie; Coules, Anne; Zhao, Han; Lu, Chungui

    2018-03-08

    Nitrogen (N) fertilizer has a major influence on crop yield and quality. Understanding and optimising the response of crop plants to nitrogen fertilizer usage is of central importance in enhancing food security and agricultural sustainability. In this study, the analysis of gene regulatory networks reveals multiple genes and biological processes in response to N. Two microarray studies have been used to infer components of the nitrogen-response network. Since they used different array technologies, a map linking the two probe sets to the maize B73 reference genome has been generated to allow comparison. Putative Arabidopsis homologues of maize genes were used to query the Biological General Repository for Interaction Datasets (BioGRID) network, which yielded the potential involvement of three transcription factors (TFs) (GLK5, MADS64 and bZIP108) and a calcium-dependent protein kinase. An artificial neural network was used to identify influential genes and retrieved bZIP108 and WRKY36 as significant TFs in both microarray studies, along with genes for asparagine synthetase, a dual-specific protein kinase and a protein phosphatase. The output from one study also suggested roles for microRNA (miRNA) 399b and Nin-like Protein 15 (NLP15). Co-expression-network analysis of TFs with closely related profiles to known nitrate-responsive genes identified GLK5, GLK8 and NLP15 as candidate regulators of genes repressed under low nitrogen conditions, while bZIP108 might play a role in gene activation.

  19. Genome-wide analysis of WRKY gene family in Cucumis sativus

    PubMed Central

    2011-01-01

    Background WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985

  20. Genome-wide analysis of WRKY gene family in Cucumis sativus.

    PubMed

    Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

    2011-09-28

    WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.