Vimaleswaran, Karani S; Tachmazidou, Ioanna; Zhao, Jing Hua; Hirschhorn, Joel N; Dudbridge, Frank; Loos, Ruth J F
2012-10-15
Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined as each gene plus 10 kb of flanking sequence on either side, for candidate and non-candidate genes alike. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the Genetic Investigation of Anthropometric Traits (GIANT) consortium in 123 564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10^-7. Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits.
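The hypergeometric enrichment test named in this abstract can be illustrated with a short sketch. This is not the authors' pipeline: the function name, the genome size of 20 000 genes and the count of 35 candidates in the top quantile are hypothetical values chosen for illustration; only the figure of 547 candidate genes comes from the abstract.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, candidate_genes, top_genes, candidate_in_top):
    """Upper-tail probability P(X >= candidate_in_top).

    X is the number of candidate genes found among `top_genes` genes drawn
    without replacement from `total_genes`, of which `candidate_genes` are candidates.
    """
    upper = min(candidate_genes, top_genes)
    numerator = sum(
        comb(candidate_genes, k) * comb(total_genes - candidate_genes, top_genes - k)
        for k in range(candidate_in_top, upper + 1)
    )
    return numerator / comb(total_genes, top_genes)


# Hypothetical counts: 20 000 genes, 547 candidates, and the 5% of genes
# with the smallest P-values (1000 genes) containing 35 candidates.
p_value = hypergeom_enrichment_p(20000, 547, 1000, 35)
```

A small p_value would indicate more candidate genes in the top quantile than expected by chance; the quantile cut-off itself (5% or 25% in the abstract) is a parameter of the analysis rather than of the test.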
Johnson, Emma C; Border, Richard; Melroy-Greif, Whitney E; de Leeuw, Christiaan A; Ehringer, Marissa A; Keller, Matthew C
2017-11-15
A recent analysis of 25 historical candidate gene polymorphisms for schizophrenia in the largest genome-wide association study conducted to date suggested that these commonly studied variants were no more associated with the disorder than would be expected by chance. However, the same study identified other variants within those candidate genes that demonstrated genome-wide significant associations with schizophrenia. As such, it is possible that variants within historic schizophrenia candidate genes are associated with schizophrenia at levels above those expected by chance, even if the most-studied specific polymorphisms are not. The present study used association statistics from the largest schizophrenia genome-wide association study conducted to date as input to a gene set analysis to investigate whether variants within schizophrenia candidate genes are enriched for association with schizophrenia. As a group, variants in the most-studied candidate genes were no more associated with schizophrenia than were variants in control sets of noncandidate genes. While a small subset of candidate genes did appear to be significantly associated with schizophrenia, these genes were not particularly noteworthy given the large number of more strongly associated noncandidate genes. The history of schizophrenia research should serve as a cautionary tale to candidate gene investigators examining other phenotypes: our findings indicate that the most investigated candidate gene hypotheses of schizophrenia are not well supported by genome-wide association studies, and it is likely that this will be the case for other complex traits as well. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Cankorur-Cetinkaya, Ayca; Dereli, Elif; Eraslan, Serpil; Karabekmez, Erkan; Dikicioglu, Duygu; Kirdar, Betul
2012-01-01
Background Understanding the dynamic mechanism behind the transcriptional organization of genes in response to varying environmental conditions requires time-dependent data. The dynamic transcriptional response obtained by real-time RT-qPCR experiments could only be correctly interpreted if suitable reference genes are used in the analysis. The lack of available studies on the identification of candidate reference genes in dynamic gene expression studies necessitates the identification and the verification of a suitable gene set for the analysis of transient gene expression response. Principal Findings In this study, a candidate reference gene set for RT-qPCR analysis of dynamic transcriptional changes in Saccharomyces cerevisiae was determined using 31 different publicly available time series transcriptome datasets. Ten of the twelve candidates (TPI1, FBA1, CCW12, CDC19, ADH1, PGK1, GCN4, PDC1, RPS26A and ARF1) we identified were not previously reported as potential reference genes. Our method also identified the commonly used reference genes ACT1 and TDH3. The most stable reference genes from this pool were determined as TPI1, FBA1, CDC19 and ACT1 in response to a perturbation in the amount of available glucose and as FBA1, TDH3, CCW12 and ACT1 in response to a perturbation in the amount of available ammonium. The use of these newly proposed gene sets outperformed the use of common reference genes in the determination of dynamic transcriptional response of the target genes, HAP4 and MEP2, in response to relaxation from glucose and ammonium limitations, respectively. Conclusions A candidate reference gene set to be used in dynamic real-time RT-qPCR expression profiling in yeast was proposed for the first time in the present study. Suitable pools of stable reference genes to be used under different experimental conditions could be selected from this candidate set in order to successfully determine the expression profiles for the genes of interest. PMID:22675547
Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.
Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J
2017-01-01
The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user-entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository of curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.
Hindumathi, V; Kranthi, T; Rao, S B; Manimaran, P
2014-06-01
With rapidly changing technology, the prediction of candidate genes has become an indispensable task in biological research in recent years. Empirical methods for candidate gene prioritization, which help to explore the potential pathways between genetic determinants and complex diseases, are highly cumbersome and labor intensive. In this setting, predicting potential targets for a disease state through in silico approaches is of interest to researchers. The wide availability of protein interaction data, coupled with gene annotation, eases the accurate determination of disease-specific candidate genes. In our work we prioritized candidate genes for cervix-related cancer by employing the approach of Csaba Ortutay and co-workers, which identifies candidate genes through graph-theoretical centrality measures and gene ontology. Using human protein interaction data, cervical cancer gene sets and ontological terms, we were able to predict 15 novel candidates for cervical carcinogenesis. The disease relevance of the predicted candidate genes was corroborated through a literature survey. In addition, drugs for these candidates were found in the Therapeutic Target Database (TTD) and DrugMap Central (DMC), which suggests that they may be potential drug targets for cervical cancer.
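As a rough illustration of ranking genes by a graph-theoretical centrality measure, the sketch below computes degree centrality on a toy interaction network. It is an assumption-laden simplification: the abstract does not say which centrality measures were used, the gene ontology step is omitted, and the gene labels and edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return node -> degree / (n - 1) for an undirected interaction network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def rank_by_centrality(edges):
    """Order genes from most to least central, breaking ties alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


# Hypothetical toy network of protein-protein interactions.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
ranked_genes = rank_by_centrality(toy_edges)
```

Degree is only the simplest centrality; betweenness or closeness could be swapped in without changing the ranking step.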
Endeavour update: a web resource for gene prioritization in multiple species
Tranchevent, Léon-Charles; Barriot, Roland; Yu, Shi; Van Vooren, Steven; Van Loo, Peter; Coessens, Bert; De Moor, Bart; Aerts, Stein; Moreau, Yves
2008-01-01
Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, besides our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and is now including numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis. PMID:18508807
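Step (iii), merging several per-source rankings into one global ranking, can be sketched as follows. The abstract does not state the merging rule, so this example swaps in a plain average of rank ratios as a stand-in; the function name and the gene identifiers are hypothetical.

```python
def merge_rankings(rankings):
    """Merge ordered gene lists (best first) into one global ranking.

    Each gene receives the mean of its rank ratios (position / list length)
    over the rankings in which it appears; a lower mean is better.
    """
    totals = {}
    counts = {}
    for ranking in rankings:
        size = len(ranking)
        for position, gene in enumerate(ranking, start=1):
            totals[gene] = totals.get(gene, 0.0) + position / size
            counts[gene] = counts.get(gene, 0) + 1
    mean_ratio = {gene: totals[gene] / counts[gene] for gene in totals}
    return sorted(mean_ratio, key=lambda gene: (mean_ratio[gene], gene))


# Three hypothetical rankings, e.g. one per genomic data source.
per_source = [
    ["g1", "g2", "g3"],
    ["g2", "g1", "g3"],
    ["g1", "g3", "g2"],
]
global_ranking = merge_rankings(per_source)
```

Using rank ratios rather than raw ranks keeps sources with different numbers of ranked genes comparable, which is the main reason for this design choice.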
Singh, Anuradha; Mantri, Shrikant; Sharma, Monica; Chaudhury, Ashok; Tuli, Rakesh; Roy, Joy
2014-01-16
The cultivated bread wheat (Triticum aestivum L.) possesses unique flour quality, and its flour can be processed into many end-use food products such as bread, pasta, chapatti (unleavened flat bread) and biscuits. The present wheat varieties require improvement in processing quality to meet the increasing demand for better quality food products. However, processing quality is very complex and controlled by many genes, which have not been completely explored. To identify the candidate genes whose expression changed due to variation in processing quality and interaction (quality x development), genome-wide transcriptome studies were performed in two sets of diverse Indian wheat varieties differing for chapatti quality. It is also important to understand the temporal and spatial distributions of their expression for designing tissue- and growth-specific functional genomics experiments. Gene-specific two-way ANOVA of the expression of about 55 K transcripts in two diverse sets of Indian wheat varieties for chapatti quality at three seed developmental stages identified 236 differentially expressed probe sets (10-fold). Of these 236, 110 probe sets were identified for chapatti quality. Many key genes related to processing quality, such as glutenin and gliadins, puroindolines, grain softness protein, alpha and beta amylases, and proteases, were identified, along with many other candidate genes related to cellular and molecular functions. The ANOVA revealed that the expression of 56 of the 110 probe sets was involved in the interaction (quality x development). The majority of the probe sets showed differential expression at an early stage of seed development, i.e. temporal expression. Meta-analysis revealed that the majority of the genes were expressed in one or a few growth stages, indicating spatial distribution of their expression. The differential expression of a few candidate genes, such as pre-alpha/beta-gliadin and gamma gliadin, was validated by RT-PCR.
Therefore, this study identified several quality related key genes including many other genes, their interactions (quality x development) and temporal and spatial distributions. The candidate genes identified for processing quality and information on temporal and spatial distributions of their expressions would be useful for designing wheat improvement programs for processing quality either by changing their expression or development of single nucleotide polymorphisms (SNPs) markers. PMID:24433256
Qu, Conghui; Schuetz, Johanna M.; Min, Jeong Eun; Leach, Stephen; Daley, Denise; Spinelli, John J.; Brooks-Wilson, Angela; Graham, Jinko
2011-01-01
We describe a statistical approach to predict gender-labeling errors in candidate-gene association studies, when Y-chromosome markers have not been included in the genotyping set. The approach adds value to methods that consider only the heterozygosity of X-chromosome SNPs, by incorporating available information about the intensity of X-chromosome SNPs in candidate genes relative to autosomal SNPs from the same individual. To our knowledge, no published methods formalize a framework in which heterozygosity and relative intensity are simultaneously taken into account. Our method offers the advantage that, in the genotyping set, no additional space is required beyond that already assigned to X-chromosome SNPs in the candidate genes. We also show how the predictions can be used in a two-phase sampling design to estimate the gender-labeling error rates for an entire study, at a fraction of the cost of a conventional design. PMID:22303327
Novel Biomarker Candidates for Colorectal Cancer Metastasis: A Meta-analysis of In Vitro Studies
Long, Nguyen Phuoc; Lee, Wun Jun; Huy, Nguyen Truong; Lee, Seul Ji; Park, Jeong Hill; Kwon, Sung Won
2016-01-01
Colorectal cancer (CRC) is one of the most common and lethal cancers. Although numerous studies have evaluated potential biomarkers for early diagnosis, current biomarkers have failed to reach an acceptable level of accuracy for distant metastasis. In this paper, we performed a gene set meta-analysis of in vitro microarray studies and combined the results with previously published proteomic data to validate and suggest prognostic candidates for CRC metastasis. Analysis of the two included microarray data sets identified 21 significant genes. Of these, ALDOA, IL8 (CXCL8), and PARP4 had strong potential as prognostic candidates. LAMB2, MCM7, CXCL23A, SERPINA3, ABCA3, ALDH3A2, and POLR2I also have potential. Other candidates were more controversial, possibly because of the biologic heterogeneity of tumor cells, which is a major obstacle to predicting metastasis. In conclusion, we demonstrated a meta-analysis approach and suggested ten biomarker candidates for future investigation. PMID:27688707
Galatola, Martina; Cielo, Donatella; Panico, Camilla; Stellato, Pio; Malamisura, Basilio; Carbone, Lorenzo; Gianfrani, Carmen; Troncone, Riccardo; Greco, Luigi; Auricchio, Renata
2017-09-01
The prevalence of celiac disease (CD) has increased significantly in recent years, and risk prediction and early diagnosis have become imperative, especially in at-risk families. In a previous study, we identified individuals with CD based on the expression profile of a set of candidate genes in peripheral blood monocytes. Here we evaluated the expression of a panel of CD candidate genes in peripheral blood mononuclear cells from at-risk infants long before any symptoms or antibody production. We analyzed the expression of a set of 9 candidate genes associated with CD in 22 human leukocyte antigen-predisposed children from families at risk for CD, followed from birth to 6 years of age. Nine of them developed CD (patients) and 13 did not (controls). We analyzed gene expression at 3 different time points (age matched in the 2 groups): 4-19 months before diagnosis, at the time of CD diagnosis, and after at least 1 year of a gluten-free diet. Controls were evaluated at similar ages. Three genes (KIAA, TAGAP [T-cell Activation GTPase Activating Protein], and SH2B3 [SH2B Adaptor Protein 3]) were overexpressed in patients, compared with controls, at least 9 months before CD diagnosis. In a stepwise discriminant analysis, 4 genes (RGS1 [Regulator of G-protein signaling 1], TAGAP, TNFSF14 [Tumor Necrosis Factor (Ligand) Superfamily member 14], and SH2B3) differentiated patients from controls before serum antibody production and clinical symptoms. The multivariate equation correctly classified children as CD or non-CD in 95.5% of cases. The expression of a small set of candidate genes in peripheral blood mononuclear cells can predict CD at least 9 months before the appearance of any clinical and serological signs of the disease.
Evaluating Reported Candidate Gene Associations with Polycystic Ovary Syndrome
Pau, Cindy; Saxena, Richa; Welt, Corrine Kolka
2013-01-01
Objective To replicate variants in candidate genes associated with PCOS in a population of European PCOS and control subjects. Design Case-control association analysis and meta-analysis. Setting Major academic hospital. Patients Women of European ancestry with PCOS (n=525) and controls (n=472), aged 18 to 45 years. Intervention Variants previously associated with PCOS in candidate gene studies were genotyped (n=39). Metabolic, reproductive and anthropomorphic parameters were examined as a function of the candidate variants. All genetic association analyses were adjusted for age, BMI and ancestry and were reported after correction for multiple testing. Main Outcome Measure Association of candidate gene variants with PCOS. Results Three variants, rs3797179 (SRD5A1), rs12473543 (POMC), and rs1501299 (ADIPOQ), were nominally associated with PCOS. However, they did not remain significant after correction for multiple testing and none of the variants replicated in a sufficiently powered meta-analysis. Variants in the FBN3 gene (rs17202517 and rs73503752) were associated with smaller waist circumferences and variant rs727428 in the SHBG gene was associated with lower SHBG levels. Conclusion Previously identified variants in candidate genes do not appear to be associated with PCOS risk. PMID:23375202
Scuba: scalable kernel-based gene prioritization.
Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio
2018-01-25
The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it can deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of its scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .
McDonald, Jacqueline U.; Kaforou, Myrsini; Clare, Simon; Hale, Christine; Ivanova, Maria; Huntley, Derek; Dorner, Marcus; Wright, Victoria J.; Levin, Michael; Martinon-Torres, Federico; Herberg, Jethro A.
2016-01-01
ABSTRACT Greater understanding of the functions of host gene products in response to infection is required. While many of these genes enable pathogen clearance, some enhance pathogen growth or contribute to disease symptoms. Many studies have profiled transcriptomic and proteomic responses to infection, generating large data sets, but selecting targets for further study is challenging. Here we propose a novel data-mining approach combining multiple heterogeneous data sets to prioritize genes for further study by using respiratory syncytial virus (RSV) infection as a model pathogen with a significant health care impact. The assumption was that the more frequently a gene is detected across multiple studies, the more important its role is. A literature search was performed to find data sets of genes and proteins that change after RSV infection. The data sets were standardized, collated into a single database, and then panned to determine which genes occurred in multiple data sets, generating a candidate gene list. This candidate gene list was validated by using both a clinical cohort and in vitro screening. We identified several genes that were frequently expressed following RSV infection with no assigned function in RSV control, including IFI27, IFIT3, IFI44L, GBP1, OAS3, IFI44, and IRF7. Drilling down into the function of these genes, we demonstrate a role in disease for the gene for interferon regulatory factor 7, which was highly ranked on the list, but not for IRF1, which was not. Thus, we have developed and validated an approach for collating published data sets into a manageable list of candidates, identifying novel targets for future analysis. IMPORTANCE Making the most of “big data” is one of the core challenges of current biology. There is a large array of heterogeneous data sets of host gene responses to infection, but these data sets do not inform us about gene function and require specialized skill sets and training for their utilization. 
Here we describe an approach that combines and simplifies these data sets, distilling this information into a single list of genes commonly upregulated in response to infection with RSV as a model pathogen. Many of the genes on the list have unknown functions in RSV disease. We validated the gene list with new clinical, in vitro, and in vivo data. This approach allows the rapid selection of genes of interest for further, more-detailed studies, thus reducing time and costs. Furthermore, the approach is simple to use and widely applicable to a range of diseases. PMID:27822537
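The core of the approach described above, counting how many data sets report each gene and keeping those seen repeatedly, can be sketched in a few lines. The function name, the threshold of two data sets and the data set contents are assumptions for illustration; only the gene symbols are taken from the abstract, and the standardization step is reduced here to trimming and upper-casing symbols.

```python
from collections import Counter


def candidate_list(datasets, min_datasets=2):
    """Return (gene, count) pairs for genes present in at least `min_datasets`
    data sets, most frequently reported first."""
    counts = Counter()
    for genes in datasets:
        # Normalize symbols and count each gene at most once per data set.
        counts.update({gene.strip().upper() for gene in genes})
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, n) for gene, n in ranked if n >= min_datasets]


# Hypothetical memberships; real data sets would come from the literature search.
example_datasets = [
    ["IRF7", "IFI27", "GBP1"],
    ["irf7", "OAS3"],
    ["IRF7", "IFI27 "],
]
candidates = candidate_list(example_datasets)
```

With these toy inputs, IRF7 appears in all three data sets and IFI27 in two, so both pass the threshold while GBP1 and OAS3 do not.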
Selection of reference genes for miRNA qRT-PCR under abiotic stress in grapevine.
Luo, Meng; Gao, Zhen; Li, Hui; Li, Qin; Zhang, Caixi; Xu, Wenping; Song, Shiren; Ma, Chao; Wang, Shiping
2018-03-13
Grapevine is among the fruit crops with high economic value, and because of the economic losses caused by abiotic stresses, the stress resistance of Vitis vinifera has become an increasingly important research area. Among the mechanisms responding to environmental stresses, the role of miRNA has received much attention recently. qRT-PCR is a powerful method for miRNA quantitation, but the accuracy of the method strongly depends on the appropriate reference genes. To determine the most suitable reference genes for grapevine miRNA qRT-PCR, 15 genes were chosen as candidate reference genes. After eliminating 6 candidate reference genes with unsatisfactory amplification efficiency, the expression stability of the remaining candidate reference genes under salinity, cold and drought was analysed using four algorithms, geNorm, NormFinder, deltaCt and Bestkeeper. The results indicated that U6 snRNA was the most suitable reference gene under salinity and cold stresses; whereas miR168 was the best for drought stress. The best reference gene sets for salinity, cold and drought stresses were miR160e + miR164a, miR160e + miR168 and ACT + UBQ + GAPDH, respectively. The selected reference genes or gene sets were verified using miR319 or miR408 as the target gene.
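Of the four stability algorithms listed, the comparative deltaCt approach is the simplest to illustrate. The sketch below assumes the usual formulation, in which a gene's stability value is the mean standard deviation of its pairwise Ct differences with every other candidate across samples (lower is more stable). The Ct values are hypothetical, and geNorm, NormFinder and Bestkeeper are not reproduced here.

```python
from statistics import mean, stdev


def delta_ct_stability(ct_values):
    """ct_values: dict gene -> list of Ct values, one per sample, same order.

    Requires at least two genes and two samples. Returns gene -> mean SD of
    pairwise delta-Ct; smaller values indicate more stable expression.
    """
    genes = list(ct_values)
    stability = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            deltas = [a - b for a, b in zip(ct_values[gene], ct_values[other])]
            pairwise_sds.append(stdev(deltas))
        stability[gene] = mean(pairwise_sds)
    return stability


# Hypothetical Ct values for three candidates measured in three samples.
example_ct = {
    "U6": [20.0, 21.0, 22.0],
    "miR168": [25.0, 26.0, 27.0],
    "ACT": [18.0, 22.0, 19.0],
}
stability = delta_ct_stability(example_ct)
least_stable = max(stability, key=stability.get)
```

In this toy input the first two candidates move in parallel across samples, so their mutual delta-Ct is constant and they score as more stable than the third.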
Reranking candidate gene models with cross-species comparison for improved gene prediction
Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S
2008-01-01
Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
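The reranking idea can be sketched as a post-processing step over the K best models at one locus. The abstract does not give the scoring rule, so the linear combination below, the weight, and the example scores are all assumptions made for illustration rather than the authors' method.

```python
def rerank(candidates, weight=1.0):
    """Rerank candidate gene models for one locus.

    candidates: list of (model_id, finder_score, ortholog_similarity) tuples.
    The combined score is finder_score + weight * ortholog_similarity;
    models are returned best first.
    """
    return sorted(
        candidates,
        key=lambda model: model[1] + weight * model[2],
        reverse=True,
    )


# Hypothetical K = 3 candidate models with made-up scores.
k_best = [
    ("model_1", 0.90, 0.10),  # best by the gene finder alone
    ("model_2", 0.85, 0.60),  # slightly lower finder score, strong ortholog support
    ("model_3", 0.40, 0.70),
]
reranked = rerank(k_best)
best_model = reranked[0][0]
```

Here a model that the gene finder ranks second overtakes the first once similarity to putative orthologs is taken into account, which is the behaviour the abstract describes.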
A human functional protein interaction network and its application to cancer data analysis
2010-01-01
Background One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
PINTA: a web server for network-based gene prioritization from expression data
Nitsch, Daniela; Tranchevent, Léon-Charles; Gonçalves, Joana P.; Vogt, Josef Korbinian; Madeira, Sara C.; Moreau, Yves
2011-01-01
PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction network. Our strategy is meant for biological and medical researchers aiming at identifying novel disease genes using disease specific expression data. PINTA supports both candidate gene prioritization (starting from a user defined set of candidate genes) as well as genome-wide gene prioritization and is available for five species (human, mouse, rat, worm and yeast). As input data, PINTA only requires disease specific expression data, whereas various platforms (e.g. Affymetrix) are supported. As a result, PINTA computes a gene ranking and presents the results as a table that can easily be browsed and downloaded by the user. PMID:21602267
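The idea of ranking a candidate by the differential expression of its network neighborhood might be sketched as follows. This is an illustration only: the scoring rule (mean absolute log fold change of direct neighbors) and the names `neighborhood_score` and `prioritize` are assumptions, not PINTA's actual algorithm or interface.

```python
def neighborhood_score(gene, network, abs_log_fc):
    """Mean absolute log fold change over the direct interaction partners of a gene.

    network maps each gene to the set of its neighbors; abs_log_fc maps genes
    to |log fold change|. Both structures are hypothetical inputs.
    """
    neighbors = network.get(gene, set())
    if not neighbors:
        return 0.0
    return sum(abs_log_fc.get(n, 0.0) for n in neighbors) / len(neighbors)


def prioritize(candidates, network, abs_log_fc):
    """Return candidates sorted from highest to lowest neighborhood score."""
    return sorted(
        candidates,
        key=lambda g: neighborhood_score(g, network, abs_log_fc),
        reverse=True,
    )


# Toy network and expression values (illustrative only).
network = {"G1": {"A", "B"}, "G2": {"C"}, "G3": set()}
abs_log_fc = {"A": 2.0, "B": 1.0, "C": 0.5}
ranking = prioritize(["G1", "G2", "G3"], network, abs_log_fc)
```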
Vadigepalli, Rajanikanth; Chakravarthula, Praveen; Zak, Daniel E; Schwaber, James S; Gonye, Gregory E
2003-01-01
We have developed a bioinformatics tool named PAINT that automates the promoter analysis of a given set of genes for the presence of transcription factor binding sites. Based on coincidence of regulatory sites, this tool produces an interaction matrix that represents a candidate transcriptional regulatory network. This tool currently consists of (1) a database of promoter sequences of known or predicted genes in the Ensembl annotated mouse genome database, (2) various modules that can retrieve and process the promoter sequences for binding sites of known transcription factors, and (3) modules for visualization and analysis of the resulting set of candidate network connections. This information provides a substantially pruned list of genes and transcription factors that can be examined in detail in further experimental studies on gene regulation. Also, the candidate network can be incorporated into network identification methods in the form of constraints on feasible structures in order to render the algorithms tractable for large-scale systems. The tool can also produce output in various formats suitable for use in external visualization and analysis software. In this manuscript, PAINT is demonstrated in two case studies involving analysis of differentially regulated genes chosen from two microarray data sets. The first set is from a neuroblastoma N1E-115 cell differentiation experiment, and the second set is from neuroblastoma N1E-115 cells at different time intervals following exposure to neuropeptide angiotensin II. PAINT is available for use as an agent in BioSPICE simulation and analysis framework (www.biospice.org), and can also be accessed via a WWW interface at www.dbi.tju.edu/dbi/tools/paint/.
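A gene-by-transcription-factor interaction matrix of the kind described above could be built along these lines. This is a toy sketch, not PAINT's implementation: it assumes binding sites can be detected by exact substring matching against a consensus string, whereas real tools typically use position weight matrices, and the function name and example sequences are hypothetical.

```python
def interaction_matrix(promoters, motifs):
    """Build a binary gene x transcription-factor matrix.

    promoters maps gene name -> promoter sequence; motifs maps factor name ->
    consensus binding site. An entry is 1 when the consensus string occurs in
    the promoter (exact match, a deliberate simplification).
    """
    genes = sorted(promoters)
    factors = sorted(motifs)
    matrix = [
        [1 if motifs[tf] in promoters[gene] else 0 for tf in factors]
        for gene in genes
    ]
    return genes, factors, matrix


# Toy promoter sequences and consensus sites (illustrative only).
promoters = {"geneA": "TTGACGTCAAT", "geneB": "CCTATAAAGG"}
motifs = {"CREB": "TGACGTCA", "TBP": "TATAAA"}
genes, factors, matrix = interaction_matrix(promoters, motifs)
```

Rows of the matrix are genes and columns are factors; a matrix of this shape can then be handed to clustering or network-inference code as a set of candidate regulatory edges.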
Deep Sequencing of 71 Candidate Genes to Characterize Variation Associated with Alcohol Dependence.
Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J
2017-04-01
Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value < 0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10^-5; q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10^-5; q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on Alcoholism.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smedley, Damian; Kohler, Sebastian; Czeschik, Johanna Christina; ...
2014-07-30
Whole-exome sequencing (WES) has opened up previously unheard-of possibilities for identifying novel disease genes in Mendelian disorders, only about half of which have been elucidated to date. However, interpretation of WES data remains challenging. We therefore analyze protein–protein association (PPA) networks to identify candidate genes in the vicinity of genes previously implicated in a disease. The analysis, using a random-walk with restart (RWR) method, is adapted to the setting of WES by developing a composite variant-gene relevance score based on the rarity, location and predicted pathogenicity of variants and the RWR evaluation of genes harboring the variants. Benchmarking using known disease variants from 88 disease-gene families reveals that the correct gene is ranked among the top 10 candidates in ≥50% of cases, a figure which we confirmed using a prospective study of disease genes identified in 2012 and PPA data produced before that date. We implement our method in a freely available Web server, ExomeWalker, that displays a ranked list of candidates together with information on PPAs, frequency and predicted pathogenicity of the variants to allow quick and effective searches for candidates that are likely to reward closer investigation.
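The random-walk with restart (RWR) method named in this abstract can be sketched in a few lines. The version below is a generic textbook formulation on an unweighted, undirected network, not ExomeWalker's implementation; the restart probability of 0.7, the function name and the toy network are assumptions made for illustration. It also assumes every node appears as a key in the adjacency mapping and has at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency maps each node to a list of its neighbors; seeds is the set of
    known disease genes where the walk restarts. W spreads a node's probability
    evenly over its neighbors.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                continue
            share = (1 - restart) * p[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network S - A - B - C with a single seed gene S (illustrative only).
adjacency = {"S": ["A"], "A": ["S", "B"], "B": ["A", "C"], "C": ["B"]}
proximity = random_walk_with_restart(adjacency, seeds={"S"})
```

Genes closer to the seed in the network receive higher steady-state probability, which is the sense in which RWR identifies candidates "in the vicinity" of known disease genes.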
Meta-Analysis of Tumor Stem-Like Breast Cancer Cells Using Gene Set and Network Analysis
Lee, Won Jun; Kim, Sang Cheol; Yoon, Jung-Ho; Yoon, Sang Jun; Lim, Johan; Kim, You-Sun; Kwon, Sung Won; Park, Jeong Hill
2016-01-01
Generally, cancer stem cells have epithelial-to-mesenchymal-transition characteristics and other aggressive properties that cause metastasis. However, there are no reliable markers for identifying cancer stem cells, and comparative methods examining adherent and sphere cells are widely used to investigate the mechanisms underlying cancer stem cells, because sphere cells are known to maintain cancer stem cell characteristics. In this study, we conducted a meta-analysis that combined gene expression profiles from several studies that used tumorsphere technology to investigate tumor stem-like breast cancer cells. We used our own gene expression profiles along with three gene expression profiles from the Gene Expression Omnibus, which we combined using the ComBat method, and obtained significant gene sets using gene set analysis of our datasets and the combined dataset. This experiment focused on four gene sets, such as cytokine-cytokine receptor interaction, that demonstrated significance in both datasets. Among the genes of the four significant gene sets, six genes were consistently up-regulated with a p-value of < 0.05, and our network analysis showed high connectivity in five genes. From these results, we established CXCR4, CXCL1 and HMGCS1, the intersecting genes of the datasets with high connectivity and a p-value of < 0.05, as significant genes in the identification of cancer stem cells. An additional experiment using quantitative reverse transcription-polymerase chain reaction showed significant up-regulation in MCF-7 derived sphere cells and confirmed the importance of these three genes. Taken together, using a meta-analysis that combines gene set and network analysis, we suggest CXCR4, CXCL1 and HMGCS1 as candidates involved in tumor stem-like breast cancer cells. Distinct from other meta-analyses, by using gene set analysis, we selected possible markers that can explain the biological mechanisms and suggest network analysis as an additional criterion for selecting candidates. PMID:26870956
Sardos, Julie; Rouard, Mathieu; Hueber, Yann; Cenci, Alberto; Hyma, Katie E.; van den Houwe, Ines; Hribova, Eva; Courtois, Brigitte; Roux, Nicolas
2016-01-01
Banana (Musa sp.) is a vegetatively propagated, low fertility, potentially hybrid and polyploid crop. These qualities make the breeding and targeted genetic improvement of this crop a difficult and long process. The Genome-Wide Association Study (GWAS) approach is becoming widely used in crop plants and has proven efficient at detecting candidate genes for traits of interest, especially in cereals. GWAS has not yet been applied to a vegetatively propagated crop. However, successful GWAS in banana would considerably help unravel the genomic basis of traits of interest and therefore speed up improvement of this crop. We present here a dedicated panel of 105 accessions of banana, freely available upon request, and their corresponding GBS data. A set of 5,544 highly reliable markers revealed high levels of admixture in most accessions, except for a subset of 33 individuals from Papua. A GWAS on the seedless phenotype was then successfully applied to the panel. By applying the Mixed Linear Model corrected for both kinship and structure as implemented in TASSEL, we detected 13 candidate genomic regions in which we found a number of genes potentially linked with the seedless phenotype (i.e. parthenocarpy combined with female sterility). An additional GWAS performed on the unstructured Papuan subset composed of 33 accessions confirmed six of these regions as candidates. Out of both sets of analyses, one strong candidate gene for female sterility, a putative orthologous gene to Histidine Kinase CKI1, was identified. The results presented here confirmed the feasibility and potential of GWAS when applied to small sets of banana accessions, at least for traits underpinned by a few loci. As phenotyping in banana is extremely space- and time-consuming, this latest finding is of particular importance in the context of banana improvement. PMID:27144345
Ingham, Victoria A; Jones, Christopher M; Pignatelli, Patricia; Balabanidou, Vasileia; Vontas, John; Wagstaff, Simon C; Moore, Jonathan D; Ranson, Hilary
2014-11-25
The elevated expression of enzymes with insecticide metabolism activity can lead to high levels of insecticide resistance in the malaria vector, Anopheles gambiae. In this study, adult female mosquitoes from an insecticide-susceptible and a resistant strain were dissected into four different body parts. RNA from each of these samples was used in microarray analysis to determine the enrichment patterns of the key detoxification gene families within the mosquito and to identify additional candidate insecticide resistance genes that may have been overlooked in previous experiments on whole organisms. A general enrichment in the transcription of genes from the four major detoxification gene families (carboxylesterases, glutathione transferases, UDP glucuronyltransferases and cytochrome P450s) was observed in the midgut and malpighian tubules. Yet the subset of P450 genes that have previously been implicated in insecticide resistance in An. gambiae shows a surprisingly varied profile of tissue enrichment, confirmed by qPCR and, for three candidates, by immunostaining. A stringent selection process was used to define a list of 105 genes that are significantly (p ≤ 0.001) overexpressed in body parts from the resistant versus the susceptible strain. Over half of these, including all the cytochrome P450s on this list, were identified in previous whole-organism comparisons between the strains, but several new candidates were detected, notably from comparisons of the transcriptomes from dissected abdomen integuments. Using RNA extracted from the whole organism to identify candidate insecticide resistance genes risks missing candidates if key genes responsible for the phenotype have restricted expression within the body and/or are overexpressed only in certain tissues. However, as transcription of genes implicated in metabolic resistance to insecticides is not enriched in any one single organ, comparison of the transcriptome of individual dissected body parts cannot be recommended as a preferred means to identify new candidate insecticide resistance genes. Instead, the rich data set on in vivo sites of transcription should be consulted when designing follow-up qPCR validation steps, or for screening known candidates in field populations.
Schizophrenia, vitamin D, and brain development.
Mackay-Sim, Alan; Féron, François; Eyles, Darryl; Burne, Thomas; McGrath, John
2004-01-01
Schizophrenia research is invigorated at present by the recent discovery of several plausible candidate susceptibility genes identified from genetic linkage and gene expression studies of brains from persons with schizophrenia. It is a current challenge to reconcile this gathering evidence for specific candidate susceptibility genes with the "neurodevelopmental hypothesis," which posits that schizophrenia arises from gene-environment interactions that disrupt brain development. We make the case here that schizophrenia may result not from numerous genes of small effect, but from a few genes of transcriptional regulation acting during brain development. In particular, we propose that low vitamin D during brain development interacts with susceptibility genes to alter the trajectory of brain development, probably by epigenetic regulation that alters gene expression throughout adult life. Vitamin D is an attractive "environmental" candidate because it appears to explain several key epidemiological features of schizophrenia. Vitamin D is an attractive "genetic" candidate because its nuclear hormone receptor regulates gene expression and nervous system development. The polygenic quality of schizophrenia, with linkage to many genes of small effect, may be brought together via this "vitamin D hypothesis." We also discuss the possibility of a broader set of environmental and genetic factors interacting via the nuclear hormone receptors to affect the development of the brain, leading to schizophrenia.
Bosch, Linda J W; Trooskens, Geert; Snaebjornsson, Petur; Coupé, Veerle M H; Mongera, Sandra; Haan, Josien C; Richman, Susan D; Koopman, Miriam; Tol, Jolien; de Meyer, Tim; Louwagie, Joost; Dehaspe, Luc; van Grieken, Nicole C T; Ylstra, Bauke; Verheul, Henk M W; van Engeland, Manon; Nagtegaal, Iris D; Herman, James G; Quirke, Philip; Seymour, Matthew T; Punt, Cornelis J A; van Criekinge, Wim; Carvalho, Beatriz; Meijer, Gerrit A
2017-09-08
Diversity in colorectal cancer biology is associated with variable responses to standard chemotherapy. We aimed to identify and validate DNA hypermethylated genes as predictive biomarkers for irinotecan treatment of metastatic CRC patients. Candidate genes were selected from 389 genes involved in DNA Damage Repair by correlation analyses between gene methylation status and drug response in 32 cell lines. A large series of samples (n=818) from two phase III clinical trials was used to evaluate these candidate genes by correlating methylation status to progression-free survival after treatment with first-line single-agent fluorouracil (Capecitabine or 5-fluorouracil) or combination chemotherapy (Capecitabine or 5-fluorouracil plus irinotecan (CAPIRI/FOLFIRI)). In the discovery (n=185) and initial validation set (n=166), patients with methylated Decoy Receptor 1 (DCR1) did not benefit from CAPIRI over Capecitabine treatment (discovery set: HR=1.2 (95%CI 0.7-1.9, p=0.6), validation set: HR=0.9 (95%CI 0.6-1.4, p=0.5)), whereas patients with unmethylated DCR1 did (discovery set: HR=0.4 (95%CI 0.3-0.6, p=0.00001), validation set: HR=0.5 (95%CI 0.3-0.7, p=0.0008)). These results could not be replicated in the external data set (n=467), where a similar effect size was found in patients with methylated and unmethylated DCR1 for FOLFIRI over 5FU treatment (methylated DCR1: HR=0.7 (95%CI 0.5-0.9, p=0.01), unmethylated DCR1: HR=0.8 (95%CI 0.6-1.2, p=0.4)). In conclusion, DCR1 promoter hypermethylation status is a potential predictive biomarker for response to treatment with irinotecan, when combined with capecitabine. This finding could not be replicated in an external validation set, in which irinotecan was combined with 5FU. These results underline the challenge and importance of extensive clinical evaluation of candidate biomarkers in multiple trials. PMID:28968978
Chini, Vasiliki; Stambouli, Danai; Nedelea, Florina Mihaela; Filipescu, George Alexandru; Mina, Diana; Kambouris, Marios; El-Shantil, Hatem
2014-06-01
Prenatal diagnosis was requested for an undiagnosed eye disease showing X-linked inheritance in a family. No medical records existed for the affected family members. Mapping of the X chromosome and candidate gene mutation screening identified a c.C267A[p.F89L] mutation in NDP previously described as possibly causing Norrie disease. The detection of the c.C267A[p.F89L] variant in another unrelated family confirms the pathogenic nature of the mutation for the Norrie disease phenotype. Gene mapping, haplotype analysis, and candidate gene screening have previously been used in research applications but were applied here in a diagnostic setting due to the scarcity of available clinical information. The clinical diagnosis and mutation identification were critical for providing proper genetic counseling and prenatal diagnosis for this family.
Juul, Malene; Bertl, Johanna; Guo, Qianyun; Nielsen, Morten Muhlig; Świtnicki, Michał; Hornshøj, Henrik; Madsen, Tobias; Hobolth, Asger; Pedersen, Jakob Skou
2017-01-01
Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5’UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance. DOI: http://dx.doi.org/10.7554/eLife.21778.001 PMID:28362259
Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J
2016-08-01
In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. Gene-based and gene-set analyses offer promising new alternatives to analyses focusing on single candidate polymorphisms when examining the interplay between genetic and environmental factors.
Mariette, Stéphanie; Wong Jun Tai, Fabienne; Roch, Guillaume; Barre, Aurélien; Chague, Aurélie; Decroocq, Stéphane; Groppi, Alexis; Laizet, Yec'han; Lambert, Patrick; Tricon, David; Nikolski, Macha; Audergon, Jean-Marc; Abbott, Albert G; Decroocq, Véronique
2016-01-01
In fruit tree species, many important traits have been characterized genetically by using single-family descent mapping in progenies segregating for the traits. However, most mapped loci have not been sufficiently resolved to the individual genes due to insufficient progeny sizes for high resolution mapping and the previous lack of whole-genome sequence resources of the study species. To address this problem for Plum Pox Virus (PPV) candidate resistance gene identification in Prunus species, we implemented a genome-wide association (GWA) approach in apricot. This study exploited the broad genetic diversity of the apricot (Prunus armeniaca) germplasm containing resistance to PPV, next-generation sequence-based genotyping, and the high-quality peach (Prunus persica) genome reference sequence for single nucleotide polymorphism (SNP) identification. The results of this GWA study validated previously reported PPV resistance quantitative trait loci (QTL) intervals, highlighted other potential resistance loci, and resolved each to a limited set of candidate genes for further study. This work substantiates the association genetics approach for resolution of QTL to candidate genes in apricot and suggests that this approach could simplify identification of other candidate genes for other marked trait intervals in this germplasm. © 2015 INRA, UMR 1332 BFP New Phytologist © 2015 New Phytologist Trust.
Revealing Alzheimer's disease genes spectrum in the whole-genome by machine learning.
Huang, Xiaoyan; Liu, Hankui; Li, Xinming; Guan, Liping; Li, Jiankang; Tellier, Laurent Christian Asker M; Yang, Huanming; Wang, Jian; Zhang, Jianguo
2018-01-10
Alzheimer's disease (AD) is an important, progressive neurodegenerative disease with a complex genetic architecture. A key goal of biomedical research is to seek out disease risk genes and to elucidate the function of these risk genes in the development of disease. For this purpose, expanding the AD-associated gene set is necessary. In past research, prediction methods for AD-related genes have been limited in their exploration of the target genome regions. We here present a genome-wide method for AD candidate gene prediction: a machine learning approach (SVM), based upon integrating gene expression data with human brain-specific gene network data, to discover the full spectrum of AD genes across the whole genome. We classified AD candidate genes with an accuracy of 84.56% and an area under the receiver operating characteristic (ROC) curve of 94%. Our approach supplements the spectrum of AD-associated genes, extracted from more than 20,000 genes on a genome-wide scale. In this study, we have elucidated the whole-genome spectrum of AD using a machine learning approach. We expect the resulting candidate gene catalogue to provide researchers with a more comprehensive annotation of AD.
2010-01-01
Background Parkinson's disease is the second most common neurodegenerative disorder. The pathological hallmark of the disease is degeneration of midbrain dopaminergic neurons. Genetic association studies have linked 13 human chromosomal loci to Parkinson's disease. Identification of gene(s), as part of the etiology of Parkinson's disease, within the large number of genes residing in these loci can be achieved through several approaches, including screening methods, and considering appropriate criteria. Since several of the identified Parkinson's disease genes are expressed in the substantia nigra pars compacta of the midbrain, expression within the neurons of this area could be a suitable criterion to limit the number of candidates and identify PD genes. Methods In this work we have combined findings from six rodent transcriptome analysis studies on the gene expression profile of midbrain dopaminergic neurons with the PARK loci in the OMIM (Online Mendelian Inheritance in Man) database to identify new candidate genes for Parkinson's disease. Results Merging the two datasets, we identified 20 genes within PARK loci, 7 of which are located in an orphan Parkinson's disease locus and one of which had been identified as a disease gene. In addition to identifying a set of candidates for further genetic association studies, these results show that the criterion of expression in midbrain dopaminergic neurons may be used to narrow down the number of genes in PARK loci for such studies. PMID:20716345
Adaptation to climate through flowering phenology: a case study in Medicago truncatula.
Burgarella, Concetta; Chantret, Nathalie; Gay, Laurène; Prosperi, Jean-Marie; Bonhomme, Maxime; Tiffin, Peter; Young, Nevin D; Ronfort, Joelle
2016-07-01
Local climatic conditions likely constitute an important selective pressure on genes underlying important fitness-related traits such as flowering time, and in many species, flowering phenology and climatic gradients strongly covary. To test whether climate shapes the genetic variation on flowering time genes and to identify candidate flowering genes involved in the adaptation to environmental heterogeneity, we used a large Medicago truncatula core collection to examine the association between nucleotide polymorphisms at 224 candidate genes and both climate variables and flowering phenotypes. Unlike genome-wide studies, candidate gene approaches are expected to enrich for the number of meaningful trait associations because they specifically target genes that are known to affect the trait of interest. We found that flowering time mediates adaptation to climatic conditions mainly by variation at genes located upstream in the flowering pathways, close to the environmental stimuli. Variables related to the annual precipitation regime reflected selective constraints on flowering time genes better than the other variables tested (temperature, altitude, latitude or longitude). By comparing phenotype and climate associations, we identified 12 flowering genes as the most promising candidates responsible for phenological adaptation to climate. Four of these genes were located in the known flowering time QTL region on chromosome 7. However, climate and flowering associations also highlighted largely distinct gene sets, suggesting different genetic architectures for adaptation to climate and flowering onset. © 2016 John Wiley & Sons Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ranjan, Priya; Yin, Tongming; Zhang, Xinye
2009-11-01
Quantitative trait locus (QTL) studies are an integral part of plant research and are used to characterize the genetic basis of phenotypic variation observed in structured populations and inform marker-assisted breeding efforts. These QTL intervals can span large physical regions on a chromosome comprising hundreds of genes, thereby hampering candidate gene identification. Genome history, evolution, and expression evidence can be used to narrow the genes in the interval to a smaller list that is manageable for detailed downstream functional genomics characterization. Our primary motivation for the present study was to address the need for a research methodology that identifies candidate genes within a broad QTL interval. Here we present a bioinformatics-based approach for subdividing candidate genes within QTL intervals into alternate groups of high probability candidates. Application of this approach in the context of studying cell wall traits, specifically lignin content and S/G ratios of stem and root in Populus plants, resulted in manageable sets of genes of both known and putative cell wall biosynthetic function. These results provide a roadmap for future experimental work leading to identification of new genes controlling cell wall recalcitrance and, ultimately, in the utility of plant biomass as an energy feedstock.
Mining functionally relevant gene sets for analyzing physiologically novel clinical expression data.
Turcan, Sevin; Vetter, Douglas E; Maron, Jill L; Wei, Xintao; Slonim, Donna K
2011-01-01
Gene set analyses have become a standard approach for increasing the sensitivity of transcriptomic studies. However, analytical methods incorporating gene sets require the availability of pre-defined gene sets relevant to the underlying physiology being studied. For novel physiological problems, relevant gene sets may be unavailable or existing gene set databases may bias the results towards only the best-studied of the relevant biological processes. We describe a successful attempt to mine novel functional gene sets for translational projects where the underlying physiology is not necessarily well characterized in existing annotation databases. We choose targeted training data from public expression data repositories and define new criteria for selecting biclusters to serve as candidate gene sets. Many of the discovered gene sets show little or no enrichment for informative Gene Ontology terms or other functional annotation. However, we observe that such gene sets show coherent differential expression in new clinical test data sets, even if derived from different species, tissues, and disease states. We demonstrate the efficacy of this method on a human metabolic data set, where we discover novel, uncharacterized gene sets that are diagnostic of diabetes, and on additional data sets related to neuronal processes and human development. Our results suggest that our approach may be an efficient way to generate a collection of gene sets relevant to the analysis of data for novel clinical applications where existing functional annotation is relatively incomplete.
Liu, Mingying; Jiang, Jing; Han, Xiaojiao; Qiao, Guirong; Zhuo, Renying
2014-01-01
Dendrocalamus latiflorus Munro is widely distributed in subtropical areas and is a valuable natural resource. Transcriptome sequencing of D. latiflorus Munro has been performed, revealing numerous genes, especially those predicted to be unique to D. latiflorus Munro. qRT-PCR has become a feasible approach for gene expression profiling, and the accuracy and reliability of the results obtained depend upon the proper selection of stable reference genes for accurate normalization. Therefore, a set of suitable internal controls should be validated for D. latiflorus Munro. In this report, twelve candidate reference genes were selected and the assessment of gene expression stability was performed in ten tissue samples and four leaf samples from seedlings and anther-regenerated plants of different ploidy. The PCR amplification efficiency was estimated, and the candidate genes were ranked according to their expression stability using three software packages: geNorm, NormFinder and BestKeeper. GAPDH and EF1α were found to be the most stable genes among different tissues or in all the sample pools, while CYP showed low expression stability. RPL3 had the optimal performance among the four leaf samples. The application of verified reference genes was illustrated by analyzing ferritin and laccase expression profiles among different experimental sets. The analysis revealed the biological variation in ferritin and laccase transcript expression among the tissues studied and the individual plants. geNorm, NormFinder, and BestKeeper analyses recommended different suitable reference gene(s) for normalization according to the experimental sets. GAPDH and EF1α had the highest expression stability across different tissues and RPL3 for the other sample set. This study emphasizes the importance of validating superior reference genes for qRT-PCR analysis to accurately normalize gene expression of D. latiflorus Munro.
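The stability ranking mentioned in this abstract can be illustrated with a minimal sketch of a geNorm-style M value. It assumes the commonly cited definition (for each gene, the mean over all other genes of the standard deviation of the pairwise log2 expression ratios across samples); it is not the geNorm software itself, and the gene names and relative quantities are invented.

```python
import math
import statistics


def genorm_m(expression):
    """Return a geNorm-style stability value M per gene.

    expression maps gene name -> list of relative quantities, with all
    lists in the same sample order. Lower M means more stable expression.
    """
    genes = list(expression)
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene_j], expression[gene_k])
            ]
            # Sample standard deviation of the log ratios across samples.
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical relative quantities for three candidate reference genes.
toy_expression = {
    "ref_a": [1.0, 2.0, 4.0, 8.0],
    "ref_b": [2.0, 4.0, 8.0, 16.0],  # constant ratio to ref_a
    "ref_c": [1.0, 4.0, 2.0, 16.0],  # varies relative to both others
}
toy_m = genorm_m(toy_expression)
ranked = sorted(toy_m, key=toy_m.get)  # most stable first
```

In this toy input, `ref_c` receives the largest M and would be the first candidate excluded in a stepwise ranking.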
González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B
2006-03-01
Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (pi(sil) = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r2 of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
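The pairwise r2 statistic reported in this abstract can be illustrated with a short sketch. It assumes phased haplotypes coded 0/1 at two biallelic sites and uses the standard definition r2 = D^2 / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB; the haplotypes below are invented.

```python
def ld_r2(haplotypes, site_a, site_b):
    """Squared allelic correlation between two biallelic sites.

    haplotypes is a list of sequences of 0/1 alleles (phased).
    """
    n = len(haplotypes)
    p_a = sum(h[site_a] for h in haplotypes) / n
    p_b = sum(h[site_b] for h in haplotypes) / n
    p_ab = sum(h[site_a] * h[site_b] for h in haplotypes) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denominator


# Hypothetical haplotypes: complete LD, then no LD.
complete_ld = [(0, 0), (0, 0), (1, 1), (1, 1)]
no_ld = [(0, 0), (0, 1), (1, 0), (1, 1)]
r2_complete = ld_r2(complete_ld, 0, 1)
r2_none = ld_r2(no_ld, 0, 1)
```

Averaging this value over all site pairs within a gene, binned by physical distance, gives the kind of LD decay summary the abstract describes.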
Finding gene regulatory network candidates using the gene expression knowledge base.
Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin
2014-12-10
Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Farber, Charles R; van Nas, Atila; Ghazalpour, Anatole; Aten, Jason E; Doss, Sudheer; Sos, Brandon; Schadt, Eric E; Ingram-Drake, Leslie; Davis, Richard C; Horvath, Steve; Smith, Desmond J; Drake, Thomas A; Lusis, Aldons J
2009-01-01
Numerous quantitative trait loci (QTLs) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene, we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling, and genetic association in outbred mice. In C57BL/6J × C3H/HeJ (BXH) F2 mice, nine QTLs regulating femoral BMD were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTLs. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining whether their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTLs, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1, or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification. PMID:18767929
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenspan, D.S.; Northrup, H.; Au, K.S.
1995-02-10
COL5A1, the gene for the α1 chain of type V collagen, has been considered a candidate gene for certain diseases based on chromosomal location and/or disease phenotype. We have employed 3′-untranslated region RFLPs to exclude COL5A1 as a candidate gene in families with tuberous sclerosis 1, Ehlers-Danlos syndrome type II, and nail-patella syndrome. In addition, we describe a polymorphic simple sequence repeat (SSR) within a COL5A1 intron. This SSR is used to exclude COL5A1 as a candidate gene in hereditary hemorrhagic telangiectasia (Osler-Rendu-Weber disease) and to add COL5A1 to the existing map of “index” markers of chromosome 9 by evaluation of the COL5A1 locus on the CEPH 40-family reference pedigree set. This genetic mapping places COL5A1 between markers D9S66 and D9S67. 14 refs., 1 fig., 2 tabs.
Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K
2015-01-01
Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Giesbers, Anne K J; Pelgrom, Alexandra J E; Visser, Richard G F; Niks, Rients E; Van den Ackerveken, Guido; Jeuken, Marieke J W
2017-11-01
Candidate effectors from lettuce downy mildew (Bremia lactucae) enable high-throughput germplasm screening for the presence of resistance (R) genes. The nonhost species Lactuca saligna comprises a source of B. lactucae R genes that has hardly been exploited in lettuce breeding. Its cross-compatibility with the host species L. sativa enables the study of inheritance of nonhost resistance (NHR). We performed transient expression of candidate RXLR effector genes from B. lactucae in a diverse Lactuca germplasm set. Responses to two candidate effectors (BLR31 and BLN08) were genetically mapped and tested for co-segregation with disease resistance. BLN08 induced a hypersensitive response (HR) in 55% of the L. saligna accessions, but responsiveness did not co-segregate with resistance to Bl:24. BLR31 triggered an HR in 5% of the L. saligna accessions, and revealed a novel R gene providing complete B. lactucae race Bl:24 resistance. Resistant hybrid plants that were BLR31 nonresponsive indicated other unlinked R genes and/or nonhost QTLs. We have identified a candidate avirulence effector of B. lactucae (BLR31) and its cognate R gene in L. saligna. Concurrently, our results suggest that R genes are not required for NHR of L. saligna. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Ron, Micha; Israeli, Galit; Seroussi, Eyal; Weller, Joel I; Gregg, Jeffrey P; Shani, Moshe; Medrano, Juan F
2007-01-01
Background Many studies have found segregating quantitative trait loci (QTL) for milk production traits in different dairy cattle populations. However, even for relatively large effects with a saturated marker map the confidence interval for QTL location by linkage analysis spans tens of map units, or hundreds of genes. Combining mapping and arraying has been suggested as an approach to identify candidate genes. Thus, gene expression analysis in the mammary gland of genes positioned in the confidence interval of the QTL can bridge the gap between fine mapping and quantitative trait nucleotide (QTN) determination. Results We hybridized Affymetrix microarray (MG-U74v2), containing 12,488 murine probes, with RNA derived from mammary gland of virgin, pregnant, lactating and involuting C57BL/6J mice in a total of nine biological replicates. We combined microarray data from two additional studies that used the same design in mice with a total of 75 biological replicates. The same filtering and normalization was applied to each microarray data using GeneSpring software. Analysis of variance identified 249 differentially expressed probe sets common to the three experiments along the four developmental stages of puberty, pregnancy, lactation and involution. 212 genes were assigned to their bovine map positions through comparative mapping, and thus form a list of candidate genes for previously identified QTLs for milk production traits. A total of 82 of the genes showed mammary gland-specific expression with at least 3-fold expression over the median representing all tissues tested in GeneAtlas. Conclusion This work presents a web tool for candidate genes for QTL (cgQTL) that allows navigation between the map of bovine milk production QTL, potential candidate genes and their level of expression in mammary gland arrays and in GeneAtlas. Three out of four confirmed genes that affect QTL in livestock (ABCG2, DGAT1, GDF8, IGF2) were over expressed in the target organ. 
Thus, cgQTL can be used to determine priority of candidate genes for QTN analysis based on differential expression in the target organ. PMID:17584498
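The tissue-specificity criterion in this abstract (at least 3-fold expression over the median of all tissues tested) can be sketched as follows. The tissue names and expression values are invented, and the function names are hypothetical; this is not the cgQTL tool's code.

```python
import statistics


def fold_over_median(tissue_expression, target_tissue):
    """Expression in the target tissue divided by the median across tissues."""
    median = statistics.median(tissue_expression.values())
    if median <= 0:
        raise ValueError("median expression must be positive")
    return tissue_expression[target_tissue] / median


def is_tissue_specific(tissue_expression, target_tissue, min_fold=3.0):
    """True when the target tissue reaches the fold-over-median cutoff."""
    return fold_over_median(tissue_expression, target_tissue) >= min_fold


# Hypothetical expression values for two genes across five tissues.
specific_gene = {"mammary_gland": 900, "liver": 100, "brain": 120,
                 "kidney": 80, "heart": 110}
broad_gene = {"mammary_gland": 200, "liver": 100, "brain": 120,
              "kidney": 80, "heart": 110}
specific_flag = is_tissue_specific(specific_gene, "mammary_gland")
broad_flag = is_tissue_specific(broad_gene, "mammary_gland")
```

Using the median rather than the mean keeps a single highly expressing tissue from inflating the baseline against which it is compared.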
Beretta, Lorenzo; Santaniello, Alessandro; van Riel, Piet L C M; Coenen, Marieke J H; Scorza, Raffaella
2010-08-06
Epistasis is recognized as a fundamental part of the genetic architecture of individuals. Several computational approaches have been developed to model gene-gene interactions in case-control studies; however, none of them is suitable for time-dependent analysis. Herein we introduce the Survival Dimensionality Reduction (SDR) algorithm, a non-parametric method specifically designed to detect epistasis in lifetime datasets. The algorithm requires no specification of either the underlying survival distribution or the underlying interaction model, and it proved satisfactorily powerful in detecting a set of causative genes in synthetic epistatic lifetime datasets with a limited number of samples and a high degree of right-censorship (up to 70%). The SDR method was then applied to a series of 386 Dutch patients with active rheumatoid arthritis who were treated with anti-TNF biological agents. Among a set of 39 candidate genes, none of which showed a detectable marginal effect on anti-TNF responses, the SDR algorithm found that the rs1801274 SNP in the Fc gamma RIIa gene and the rs10954213 SNP in the IRF5 gene interact non-linearly to predict clinical remission after anti-TNF biologicals. Simulation studies and application in a real-world setting support the capability of the SDR algorithm to model epistatic interactions in candidate-gene studies in the presence of right-censored data. http://sourceforge.net/projects/sdrproject/.
Jin, Yulan; Sharma, Ashok; Bai, Shan; Davis, Colleen; Liu, Haitao; Hopkins, Diane; Barriga, Kathy; Rewers, Marian; She, Jin-Xiong
2014-07-01
There is tremendous scientific and clinical value to further improving the predictive power of autoantibodies because autoantibody-positive (AbP) children have heterogeneous rates of progression to clinical diabetes. This study explored the potential of gene expression profiles as biomarkers for risk stratification among 104 AbP subjects from the Diabetes Autoimmunity Study in the Young (DAISY) using a discovery data set based on microarray and a validation data set based on real-time RT-PCR. The microarray data identified 454 candidate genes with expression levels associated with various type 1 diabetes (T1D) progression rates. RT-PCR analyses of the top-27 candidate genes confirmed 5 genes (BACH2, IGLL3, EIF3A, CDC20, and TXNDC5) associated with differential progression and implicated in lymphocyte activation and function. Multivariate analyses of these five genes in the discovery and validation data sets identified and confirmed four multigene models (BI, ICE, BICE, and BITE, with each letter representing a gene) that consistently stratify high- and low-risk subsets of AbP subjects with hazard ratios >6 (P < 0.01). The results suggest that these genes may be involved in T1D pathogenesis and potentially serve as excellent gene expression biomarkers to predict the risk of progression to clinical diabetes for AbP subjects. © 2014 by the American Diabetes Association.
Whole Blood mRNA Expression-Based Prognosis of Metastatic Renal Cell Carcinoma.
Giridhar, Karthik V; Sosa, Carlos P; Hillman, David W; Sanhueza, Cristobal; Dalpiaz, Candace L; Costello, Brian A; Quevedo, Fernando J; Pitot, Henry C; Dronca, Roxana S; Ertz, Donna; Cheville, John C; Donkena, Krishna Vanaja; Kohli, Manish
2017-11-03
The Memorial Sloan Kettering Cancer Center (MSKCC) prognostic score is based on clinical parameters. We analyzed whole blood mRNA expression in metastatic clear cell renal cell carcinoma (mCCRCC) patients and compared it to the MSKCC score for predicting overall survival. In a discovery set of 19 patients with mRCC, we performed whole transcriptome RNA sequencing and selected eighteen candidate genes for further evaluation based on associations with overall survival and statistical significance. In an independent validation set of 47 patients with mCCRCC, transcript expression of the 18 candidate genes was quantified using a customized NanoString probeset. Cox regression multivariate analysis confirmed that two of the candidate genes were significantly associated with overall survival. Higher expression of BAG1 [hazard ratio (HR) of 0.14, p < 0.0001, 95% confidence interval (CI) 0.04-0.36] and NOP56 (HR 0.13, p < 0.0001, 95% CI 0.05-0.34) was associated with better prognosis. A prognostic model incorporating expression of BAG1 and NOP56 into the MSKCC score improved prognostication significantly over a model using the MSKCC prognostic score only (p < 0.0001). Prognostication using whole blood mRNA gene profiling in mCCRCC is feasible and should be prospectively confirmed in larger studies.
Whole Blood mRNA Expression-Based Prognosis of Metastatic Renal Cell Carcinoma
Sosa, Carlos P.; Hillman, David W.; Sanhueza, Cristobal; Dalpiaz, Candace L.; Costello, Brian A.; Quevedo, Fernando J.; Pitot, Henry C.; Dronca, Roxana S.; Ertz, Donna; Cheville, John C.; Donkena, Krishna Vanaja; Kohli, Manish
2017-01-01
The Memorial Sloan Kettering Cancer Center (MSKCC) prognostic score is based on clinical parameters. We analyzed whole blood mRNA expression in metastatic clear cell renal cell carcinoma (mCCRCC) patients and compared it to the MSKCC score for predicting overall survival. In a discovery set of 19 patients with mRCC, we performed whole transcriptome RNA sequencing and selected eighteen candidate genes for further evaluation based on associations with overall survival and statistical significance. In an independent validation set of 47 patients with mCCRCC, transcript expression of the 18 candidate genes was quantified using a customized NanoString probeset. Cox regression multivariate analysis confirmed that two of the candidate genes were significantly associated with overall survival. Higher expression of BAG1 [hazard ratio (HR) of 0.14, p < 0.0001, 95% confidence interval (CI) 0.04–0.36] and NOP56 (HR 0.13, p < 0.0001, 95% CI 0.05–0.34) was associated with better prognosis. A prognostic model incorporating expression of BAG1 and NOP56 into the MSKCC score improved prognostication significantly over a model using the MSKCC prognostic score only (p < 0.0001). Prognostication using whole blood mRNA gene profiling in mCCRCC is feasible and should be prospectively confirmed in larger studies. PMID:29099775
Gupta, Mayetri; Cheung, Ching-Lung; Hsu, Yi-Hsiang; Demissie, Serkalem; Cupples, L Adrienne; Kiel, Douglas P; Karasik, David
2011-06-01
Genome-wide association studies (GWAS) using high-density genotyping platforms offer an unbiased strategy to identify new candidate genes for osteoporosis. It is imperative to be able to clearly distinguish signal from noise by focusing on the best phenotype in a genetic study. We performed GWAS of multiple phenotypes associated with fractures [bone mineral density (BMD), bone quantitative ultrasound (QUS), bone geometry, and muscle mass] with approximately 433,000 single-nucleotide polymorphisms (SNPs) and created a database of resulting associations. We performed analysis of GWAS data from 23 phenotypes by a novel modification of a block clustering algorithm followed by gene-set enrichment analysis. A data matrix of standardized regression coefficients was partitioned along both axes--SNPs and phenotypes. Each partition represents a distinct cluster of SNPs that have similar effects over a particular set of phenotypes. Application of this method to our data shows several SNP-phenotype connections. We found a strong cluster of association coefficients of high magnitude for 10 traits (BMD at several skeletal sites, ultrasound measures, cross-sectional bone area, and section modulus of femoral neck and shaft). These clustered traits were highly genetically correlated. Gene-set enrichment analyses indicated the augmentation of genes that cluster with the 10 osteoporosis-related traits in pathways such as aldosterone signaling in epithelial cells, role of osteoblasts, osteoclasts, and chondrocytes in rheumatoid arthritis, and Parkinson signaling. In addition to several known candidate genes, we also identified PRKCH and SCNN1B as potential candidate genes for multiple bone traits. In conclusion, our mining of GWAS results revealed the similarity of association results between bone strength phenotypes that may be attributed to pleiotropic effects of genes. 
This knowledge may prove helpful in identifying novel genes and pathways that underlie several correlated phenotypes, as well as in deciphering genetic and phenotypic modularity underlying osteoporosis risk. Copyright © 2011 American Society for Bone and Mineral Research.
Genome-wide identification of lineage-specific genes in Arabidopsis, Oryza and Populus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiaohan; Jawdy, Sara; Tschaplinski, Timothy J
2009-01-01
Protein sequences were compared among Arabidopsis, Oryza and Populus to identify differential gene (DG) sets that are in one but not the other two genomes. The DG sets were screened against a plant transcript database, the NR protein database and six newly-sequenced genomes (Carica, Glycine, Medicago, Sorghum, Vitis and Zea) to identify a set of species-specific genes (SS). Gene expression, protein motif and intron number were examined. 192, 641 and 109 SS genes were identified in Arabidopsis, Oryza and Populus, respectively. Some SS genes were preferentially expressed in flowers, roots, xylem and cambium or up-regulated by stress. Six conserved motifs in Arabidopsis and Oryza SS proteins were found in other distant lineages. The SS gene sets were enriched with intronless genes. The results reflect functional and/or anatomical differences between monocots and eudicots or between herbaceous and woody plants. The Populus-specific genes are candidates for carbon sequestration and biofuel research.
Bozinovic, Goran; Oleksiak, Marjorie F.
2010-01-01
Transcriptomics and population genomics are two complementary genomic approaches that can be used to gain insight into pollutant effects in natural populations. Transcriptomics identifies altered gene expression pathways, while population genomics approaches more directly target the causative genomic polymorphisms. Neither approach is restricted to a pre-determined set of genes or loci. Instead, both approaches allow a broad overview of genomic processes. Transcriptomics and population genomic approaches have been used to explore genomic responses in populations of fish from polluted environments and have identified sets of candidate genes and loci that appear biologically important in response to pollution. Often, differences in gene expression or loci between polluted and reference populations are not conserved among polluted populations, suggesting a biological complexity that we do not yet fully understand. As genomic approaches become less expensive with the advent of new sequencing and genotyping technologies, they will be more widely used in complementary studies. However, while these genomic approaches are immensely powerful for identifying candidate genes and loci, the challenge of determining biological mechanisms that link genotypes and phenotypes remains. PMID:21072843
Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter
2014-09-24
Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. 
Our global gene prioritization provides a unique resource for the biological interpretation of data from genome-wide association studies, and will help in the understanding of how the associated genetic variants influence disease or quantitative phenotypes.
Pollard, Harvey B.; Shivakumar, Chittari; Starr, Joshua; Eidelman, Ofer; Jacobowitz, David M.; Dalgard, Clifton L.; Srivastava, Meera; Wilkerson, Matthew D.; Stein, Murray B.; Ursano, Robert J.
2016-01-01
“Soldier's Heart” is an American Civil War term linking post-traumatic stress disorder (PTSD) with increased propensity for cardiovascular disease (CVD). We have hypothesized that there might be a quantifiable genetic basis for this linkage. To test this hypothesis we identified a comprehensive set of candidate risk genes for PTSD, and tested whether any were also independent risk genes for CVD. A functional analysis algorithm was used to identify associated signaling networks. We identified 106 PTSD studies that report one or more polymorphic variants in 87 candidate genes in 83,463 subjects and controls. The top upstream drivers for these PTSD risk genes are predicted to be the glucocorticoid receptor (NR3C1) and Tumor Necrosis Factor alpha (TNFA). We find that 37 of the PTSD candidate risk genes are also candidate independent risk genes for CVD. The association between PTSD and CVD is significant by Fisher's Exact Test (P = 3 × 10^-54). We also find 15 PTSD risk genes that are independently associated with Type 2 Diabetes Mellitus (T2DM), an overlap that is also significant by Fisher's Exact Test (P = 1.8 × 10^-16). Our findings offer quantitative evidence for a genetic link between post-traumatic stress and cardiovascular disease. Computationally, the common mechanism for this linkage between PTSD and CVD is innate immunity and NFκB-mediated inflammation. PMID:27721742
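The overlap test described in this abstract can be illustrated with a short sketch. The one-sided Fisher's exact test for a gene-set overlap is the upper tail of a hypergeometric distribution. Only the 87 PTSD genes and the 37 shared genes come from the abstract. The gene universe size (20,000) and the size of the CVD gene set (500) are hypothetical placeholders, so the resulting p-value is not expected to match the reported one.

```python
from math import comb


def overlap_p_value(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """One-sided Fisher's exact p-value: probability of seeing at least
    `overlap` shared genes when set_b genes are drawn at random from a
    universe that contains set_a genes of interest."""
    max_k = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., max_k.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# 87 and 37 are from the abstract; 20000 and 500 are assumed for illustration.
p = overlap_p_value(universe=20000, set_a=87, set_b=500, overlap=37)
print(f"hypothetical overlap p-value: {p:.3g}")
```

The p-value depends strongly on the assumed universe and on the size of the second gene set, which is why both need to be stated whenever such a test is reported.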
Herman, Dorota; Slabbinck, Bram; Pè, Mario Enrico
2016-01-01
Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. PMID:26754667
Baute, Joke; Herman, Dorota; Coppens, Frederik; De Block, Jolien; Slabbinck, Bram; Dell'Acqua, Matteo; Pè, Mario Enrico; Maere, Steven; Nelissen, Hilde; Inzé, Dirk
2016-03-01
Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. © 2016 American Society of Plant Biologists. All Rights Reserved.
Gibbons, John G.; Beauvais, Anne; Beau, Remi; McGary, Kriston L.
2012-01-01
Aspergillus fumigatus is the most common and deadly pulmonary fungal infection worldwide. In the lung, the fungus usually forms a dense colony of filaments embedded in a polymeric extracellular matrix. To identify candidate genes involved in this biofilm (BF) growth, we used RNA-Seq to compare the transcriptomes of BF and liquid plankton (PL) growth. Sequencing and mapping of tens of millions of sequence reads against the A. fumigatus transcriptome identified 3,728 differentially regulated genes in the two conditions. Although many of these genes, including the ones coding for transcription factors, stress response, the ribosome, and the translation machinery, likely reflect the different growth demands in the two conditions, our experiment also identified hundreds of candidate genes for the observed differences in morphology and pathobiology between BF and PL. We found an overrepresentation of upregulated genes in transport, secondary metabolism, and cell wall and surface functions. Furthermore, upregulated genes showed significant spatial structure across the A. fumigatus genome; they were more likely to occur in subtelomeric regions and colocalized in 27 genomic neighborhoods, many of which overlapped with known or candidate secondary metabolism gene clusters. We also identified 1,164 genes that were downregulated. This gene set was not spatially structured across the genome and was overrepresented in genes participating in primary metabolic functions, including carbon and amino acid metabolism. These results add valuable insight into the genetics of biofilm formation in A. fumigatus and other filamentous fungi and identify many candidate genes, relevant in the context of biofilm biology, for downstream functional experiments. PMID:21724936
Dissecting Vancomycin-Intermediate Resistance in Staphylococcus aureus Using Genome-Wide Association
Alam, Md Tauqeer; Petit, Robert A.; Crispell, Emily K.; Thornton, Timothy A.; Conneely, Karen N.; Jiang, Yunxuan; Satola, Sarah W.; Read, Timothy D.
2014-01-01
Vancomycin-intermediate Staphylococcus aureus (VISA) is currently defined as having minimal inhibitory concentration (MIC) of 4–8 µg/ml. VISA evolves through changes in multiple genetic loci with at least 16 candidate genes identified in clinical and in vitro-selected VISA strains. We report a whole-genome comparative analysis of 49 vancomycin-sensitive S. aureus and 26 VISA strains. Resistance to vancomycin was determined by broth microdilution, Etest, and population analysis profile-area under the curve (PAP-AUC). Genome-wide association studies (GWAS) of 55,977 single-nucleotide polymorphisms identified in one or more strains found one highly significant association (P = 8.78E-08) between a nonsynonymous mutation at codon 481 (H481) of the rpoB gene and increased vancomycin MIC. Additionally, we used a database of public S. aureus genome sequences to identify rare mutations in candidate genes associated with VISA. On the basis of these data, we proposed a preliminary model called ECM+RMCG for the VISA phenotype as a benchmark for future efforts. The model predicted VISA based on the presence of a rare mutation in a set of candidate genes (walKR, vraSR, graSR, and agrA) and/or three previously experimentally verified mutations (including the rpoB H481 locus) with an accuracy of 81% and a sensitivity of 73%. Further, the level of resistance measured by both Etest and PAP-AUC regressed positively with the number of mutations present in a strain. This study demonstrated 1) the power of GWAS for identifying common genetic variants associated with antibiotic resistance in bacteria and 2) that rare mutations in candidate genes, identified using large genomic data sets, can also be associated with resistance phenotypes. PMID:24787619
Ali, Muhammad Y; Pavasovic, Ana; Dammannagoda, Lalith K; Mather, Peter B; Prentis, Peter J
2017-01-01
Systemic acid-base balance and osmotic/ionic regulation in decapod crustaceans are in part maintained by a set of transport-related enzymes such as carbonic anhydrase (CA), Na+/K+-ATPase (NKA), H+-ATPase (HAT), Na+/K+/2Cl− cotransporter (NKCC), Na+/Cl−/HCO3− cotransporter (NBC), Na+/H+ exchanger (NHE), Arginine kinase (AK), Sarcoplasmic Ca2+-ATPase (SERCA) and Calreticulin (CRT). We carried out a comparative molecular analysis of these genes in three commercially important yet eco-physiologically distinct freshwater crayfish, Cherax quadricarinatus, C. destructor and C. cainii, with the aim to identify mutations in these genes and determine if observed patterns of mutations were consistent with the action of natural selection. We also conducted a tissue-specific expression analysis of these genes across seven different organs, including gills, hepatopancreas, heart, kidney, liver, nerve and testes using NGS transcriptome data. The molecular analysis of the candidate genes revealed a high level of sequence conservation across the three Cherax sp. Hyphy analysis revealed that all candidate genes showed patterns of molecular variation consistent with neutral evolution. The tissue-specific expression analysis showed that 46% of candidate genes were expressed in all tissue types examined, while approximately 10% of candidate genes were only expressed in a single tissue type. The largest number of genes was observed in nerve (84%) and gills (78%) and the lowest in testes (66%). The tissue-specific expression analysis also revealed that most of the master genes regulating pH and osmoregulation (CA, NKA, HAT, NKCC, NBC, NHE) were expressed in all tissue types indicating an important physiological role for these genes outside of osmoregulation in other tissue types.
The high level of sequence conservation observed in the candidate genes may be explained by the important role of these genes as well as potentially having a number of other basic physiological functions in different tissue types.
Genomic approaches for the elucidation of genes and gene networks underlying cardiovascular traits.
Adriaens, M E; Bezzina, C R
2018-06-22
Genome-wide association studies have shed light on the association between natural genetic variation and cardiovascular traits. However, linking a locus associated with a cardiovascular trait to a candidate gene or set of candidate genes to prioritize for follow-up mechanistic studies is anything but straightforward. Genomic technologies based on next-generation sequencing now offer multiple opportunities to dissect gene regulatory networks underlying genetic cardiovascular trait associations, thereby aiding in the identification of candidate genes at unprecedented scale. RNA sequencing in particular becomes a powerful tool when combined with genotyping to identify loci that modulate transcript abundance, known as expression quantitative trait loci (eQTL), or loci modulating transcript splicing, known as splicing quantitative trait loci (sQTL). Additionally, the allele-specific resolution of RNA-sequencing technology enables estimation of allelic imbalance, a state where the two alleles of a gene are expressed at a ratio differing from the expected 1:1 ratio. When multiple high-throughput approaches are combined with deep phenotyping in a single study, a comprehensive elucidation of the relationship between genotype and phenotype comes into view, an approach known as systems genetics. In this review, we cover key applications of systems genetics in the broad cardiovascular field.
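The notion of allelic imbalance mentioned in this abstract (a departure from the expected 1:1 expression ratio) can be made concrete with a minimal sketch. It applies an exact two-sided binomial test to allele-specific read counts at one heterozygous site. The function name and the read counts are hypothetical, and real pipelines typically also correct for mapping bias and overdispersion, which this sketch ignores.

```python
from math import comb


def allelic_imbalance_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for the null hypothesis that both
    alleles are expressed at a 1:1 ratio (success probability 0.5)."""
    n = ref_reads + alt_reads
    k = max(ref_reads, alt_reads)
    # Under p = 0.5 the distribution is symmetric, so the two-sided
    # p-value is twice the upper tail P(X >= k), capped at 1.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical read counts at a heterozygous SNP.
print(allelic_imbalance_p(ref_reads=70, alt_reads=30))
print(allelic_imbalance_p(ref_reads=52, alt_reads=48))
```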
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
Chen, Junhui; Meng, Yuhuan; Zhou, Jinghui; Zhuo, Min; Ling, Fei; Zhang, Yu; Du, Hongli; Wang, Xiaoning
2013-01-01
Type 2 Diabetes Mellitus (T2DM) and obesity have become increasingly prevalent in recent years. Recent studies have focused on identifying causal variations or candidate genes for obesity and T2DM via analysis of expression quantitative trait loci (eQTL) within a single tissue. T2DM and obesity are affected by comprehensive sets of genes in multiple tissues. In the current study, gene expression levels in multiple human tissues from GEO datasets were analyzed, and 21 candidate genes displaying high percentages of differential expression were filtered out. Specifically, DENND1B, LYN, MRPL30, POC1B, PRKCB, RP4-655J12.3, HIBADH, and TMBIM4 were identified from the T2DM-control study, and BCAT1, BMP2K, CSRNP2, MYNN, NCKAP5L, SAP30BP, SLC35B4, SP1, BAP1, GRB14, HSP90AB1, ITGA5, and TOMM5 were identified from the obesity-control study. The majority of these genes are known to be involved in T2DM and obesity. Therefore, analysis of gene expression in various tissues using GEO datasets may be an effective and feasible method to determine novel or causal genes associated with T2DM and obesity.
2014-01-01
Background Kidney stone disease (KSD) is a complex disorder with unknown etiology in the majority of patients. Genetic and environmental factors may cause the disease. In the present study, we used DNA microarray to genotype single nucleotide polymorphisms (SNP) and performed candidate gene association analysis to determine genetic variations associated with the disease. Methods A whole genome SNP genotyping by DNA microarray was initially conducted in 101 patients and 105 control subjects. A set of 104 candidate genes reported to be involved in KSD, gathered from public databases and candidate gene association study databases, were evaluated for their variations associated with KSD. Results Altogether 82 SNPs distributed within 22 candidate gene regions showed significant differences in SNP allele frequencies between the patient and control groups (P < 0.05). Of these, 4 genes including BGLAP, AHSG, CD44, and HAO1, encoding osteocalcin, fetuin-A, CD44-molecule and glycolate oxidase 1, respectively, were further assessed for their associations with the disease because they carried a high proportion of SNPs with statistically different allele frequencies between the patient and control groups within the gene. A total of 26 SNPs showed significant differences in allele frequencies between the patient and control groups, and haplotypes associated with disease risk were identified. The SNP rs759330, located 144 bp downstream of BGLAP at a predicted microRNA binding site in the 3′UTR of PAQR6 (a gene encoding progestin and adipoQ receptor family member VI), was genotyped in 216 patients and 216 control subjects and found to have significant differences in its genotype and allele frequencies (P = 0.0007, OR 2.02 and P = 0.0001, OR 2.02, respectively).
Conclusions Our results suggest that these candidate genes are associated with KSD and PAQR6 comes into our view as the most potent candidate since associated SNP rs759330 is located in the miRNA binding site and may affect mRNA expression level. PMID:24886237
Álvarez, María F.; Angarita, Myrian; Delgado, María C.; García, Celsa; Jiménez-Gomez, José; Gebhardt, Christiane; Mosquera, Teresa
2017-01-01
The genetic basis of quantitative disease resistance has been studied in crops for several decades as an alternative to R gene mediated resistance. The most important disease in the potato crop is late blight, caused by the oomycete Phytophthora infestans. Quantitative disease resistance (QDR), as any other quantitative trait in plants, can be genetically mapped to understand the genetic architecture. Association mapping using DNA-based markers has been implemented in many crops to dissect quantitative traits. We used an association mapping approach with candidate genes to identify the first genes associated with quantitative resistance to late blight in Solanum tuberosum Group Phureja. Twenty-nine candidate genes were selected from a set of genes that were differentially expressed during the resistance response to late blight in tetraploid European potato cultivars. The 29 genes were amplified and sequenced in 104 accessions of S. tuberosum Group Phureja from Latin America. We identified 238 SNPs in the selected genes and tested them for association with resistance to late blight. The phenotypic data were obtained under field conditions by determining the area under disease progress curve (AUDPC) in two seasons and in two locations. Two genes were associated with QDR to late blight, a potato homolog of thylakoid lumen 15 kDa protein (StTL15A) and a stem 28 kDa glycoprotein (StGP28). Key message: A first association mapping experiment was conducted in Solanum tuberosum Group Phureja germplasm, which identified among 29 candidates two genes associated with quantitative resistance to late blight. PMID:28674545
2010-01-01
Background Epistasis is recognized as a fundamental part of the genetic architecture of individuals. Several computational approaches have been developed to model gene-gene interactions in case-control studies, however, none of them is suitable for time-dependent analysis. Herein we introduce the Survival Dimensionality Reduction (SDR) algorithm, a non-parametric method specifically designed to detect epistasis in lifetime datasets. Results The algorithm requires neither specification about the underlying survival distribution nor about the underlying interaction model and proved satisfactorily powerful to detect a set of causative genes in synthetic epistatic lifetime datasets with a limited number of samples and high degree of right-censorship (up to 70%). The SDR method was then applied to a series of 386 Dutch patients with active rheumatoid arthritis that were treated with anti-TNF biological agents. Among a set of 39 candidate genes, none of which showed a detectable marginal effect on anti-TNF responses, the SDR algorithm did find that the rs1801274 SNP in the FcγRIIa gene and the rs10954213 SNP in the IRF5 gene non-linearly interact to predict clinical remission after anti-TNF biologicals. Conclusions Simulation studies and application in a real-world setting support the capability of the SDR algorithm to model epistatic interactions in candidate-genes studies in presence of right-censored data. Availability: http://sourceforge.net/projects/sdrproject/ PMID:20691091
Optimal selection of markers for validation or replication from genome-wide association studies.
Greenwood, Celia M T; Rangrej, Jagadish; Sun, Lei
2007-07-01
With reductions in genotyping costs and the fast pace of improvements in genotyping technology, it is not uncommon for the individuals in a single study to undergo genotyping using several different platforms, where each platform may contain different numbers of markers selected via different criteria. For example, a set of cases and controls may be genotyped at markers in a small set of carefully selected candidate genes, and shortly thereafter, the same cases and controls may be used for a genome-wide single nucleotide polymorphism (SNP) association study. After such initial investigations, often, a subset of "interesting" markers is selected for validation or replication. Specifically, by validation, we refer to the investigation of associations between the selected subset of markers and the disease in independent data. However, it is not obvious how to choose the best set of markers for this validation. There may be a prior expectation that some sets of genotyping data are more likely to contain real associations. For example, it may be more likely for markers in plausible candidate genes to show disease associations than markers in a genome-wide scan. Hence, it would be desirable to select proportionally more markers from the candidate gene set. When a fixed number of markers are selected for validation, we propose an approach for identifying an optimal marker-selection configuration by basing the approach on minimizing the stratified false discovery rate. We illustrate this approach using a case-control study of colorectal cancer from Ontario, Canada, and we show that this approach leads to substantial reductions in the estimated false discovery rates in the Ontario dataset for the selected markers, as well as reductions in the expected false discovery rates for the proposed validation dataset. Copyright 2007 Wiley-Liss, Inc.
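The idea of treating candidate-gene markers and genome-wide markers as separate strata can be sketched as follows. This is a simplified illustration that applies the Benjamini-Hochberg procedure within each stratum. It is not the authors' optimization over marker-selection configurations, and the stratum names and p-values are hypothetical.

```python
def bh_rejections(p_values, alpha):
    """Indices of hypotheses rejected by Benjamini-Hochberg at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its BH threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def stratified_bh(strata, alpha=0.05):
    """Apply Benjamini-Hochberg separately inside each named stratum."""
    return {name: bh_rejections(ps, alpha) for name, ps in strata.items()}


# Hypothetical p-values: a small candidate-gene stratum and a larger scan.
strata = {
    "candidate": [0.001, 0.02, 0.8],
    "genome_wide": [0.001, 0.02, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99],
}
selected = stratified_bh(strata)
pooled = bh_rejections(strata["candidate"] + strata["genome_wide"], 0.05)
print(selected, pooled)
```

In this toy input the stratified analysis selects one more marker than the pooled analysis, because the small candidate stratum carries a lighter multiple-testing burden.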
Sork, Victoria L; Squire, Kevin; Gugger, Paul F; Steele, Stephanie E; Levy, Eric D; Eckert, Andrew J
2016-01-01
The ability of California tree populations to survive anthropogenic climate change will be shaped by the geographic structure of adaptive genetic variation. Our goal is to test whether climate-associated candidate genes show evidence of spatially divergent selection in natural populations of valley oak, Quercus lobata, as preliminary indication of local adaptation. Using DNA from 45 individuals from 13 localities across the species' range, we sequenced portions of 40 candidate genes related to budburst/flowering, growth, osmotic stress, and temperature stress. Using 195 single nucleotide polymorphisms (SNPs), we estimated genetic differentiation across populations and correlated allele frequencies with climate gradients using single-locus and multivariate models. The top 5% of FST estimates ranged from 0.25 to 0.68, yielding loci potentially under spatially divergent selection. Environmental analyses of SNP frequencies with climate gradients revealed three significantly correlated SNPs within budburst/flowering genes and two SNPs within temperature stress genes with mean annual precipitation, after controlling for multiple testing. A redundancy model showed a significant association between SNPs and climate variables and revealed a similar set of SNPs with high loadings on the first axis. In the RDA, climate accounted for 67% of the explained variation, when holding climate constant, in contrast to a putatively neutral SSR data set where climate accounted for only 33%. Population differentiation and geographic gradients of allele frequencies in climate-associated functional genes in Q. lobata provide initial evidence of adaptive genetic variation and background for predicting population response to climate change. © 2016 Botanical Society of America.
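As a rough illustration of the per-SNP genetic differentiation estimated in this abstract, the sketch below computes a simple heterozygosity-based FST for one biallelic SNP from population allele frequencies. The abstract does not state which estimator was used, so this equal-weight formulation and the example frequencies are assumptions. It omits the sample-size corrections that would matter with 45 individuals across 13 localities.

```python
def fst_biallelic(freqs):
    """Heterozygosity-based FST, (H_T - H_S) / H_T, for one biallelic SNP,
    given the frequency of one allele in each population (equal weights)."""
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # monomorphic everywhere: no differentiation to measure
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two populations.
print(fst_biallelic([0.2, 0.8]))
```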
Yang, Chunxiao; Pan, Huipeng; Noland, Jeffrey Edward; Zhang, Deyong; Zhang, Zhanhong; Liu, Yong; Zhou, Xuguo
2015-12-10
Reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR) is a reliable technique for quantifying gene expression across various biological processes, which requires a set of suitable reference genes to normalize the expression data. Coleomegilla maculata (Coleoptera: Coccinellidae) is one of the most extensively used biological control agents in the field to manage arthropod pest species. In this study, 16 housekeeping genes selected from C. maculata were cloned and their expression profiles investigated. The performance of these candidates as endogenous controls under specific experimental conditions was evaluated by dedicated algorithms, including geNorm, Normfinder, BestKeeper, and the ΔCt method. In addition, RefFinder, a comprehensive platform integrating all the above-mentioned algorithms, ranked the overall stability of these candidate genes. As a result, various sets of suitable reference genes were recommended specifically for experiments involving different tissues, developmental stages, sex, and C. maculata larvae treated with dietary double stranded RNA. This study represents the critical first step to establish a standardized RT-qPCR protocol for functional genomics research in the ladybeetle C. maculata. Furthermore, it lays the foundation for conducting ecological risk assessment of RNAi-based gene silencing biotechnologies on non-target organisms; in this case, a key predatory biological control agent.
Gaponova, Anna V.; Deneka, Alexander Y.; Beck, Tim N.; Liu, Hanqing; Andrianov, Gregory; Nikonova, Anna S.; Nicolas, Emmanuelle; Einarson, Margret B.; Golemis, Erica A.; Serebriiskii, Ilya G.
2017-01-01
Ovarian, head and neck, and other cancers are commonly treated with cisplatin and other DNA damaging cytotoxic agents. Altered DNA damage response (DDR) contributes to resistance of these tumors to chemotherapies, some targeted therapies, and radiation. DDR involves multiple protein complexes and signaling pathways, some of which are evolutionarily ancient and involve protein orthologs conserved from yeast to humans. To identify new regulators of cisplatin-resistance in human tumors, we integrated high throughput and curated datasets describing yeast genes that regulate sensitivity to cisplatin and/or ionizing radiation. Next, we clustered highly validated genes based on chemogenomic profiling, and then mapped orthologs of these genes in expanded genomic networks for multiple metazoans, including humans. This approach identified an enriched candidate set of genes involved in the regulation of resistance to radiation and/or cisplatin in humans. Direct functional assessment of selected candidate genes using RNA interference confirmed their activity in influencing cisplatin resistance, degree of γH2AX focus formation and ATR phosphorylation, in ovarian and head and neck cancer cell lines, suggesting impaired DDR signaling as the driving mechanism. This work enlarges the set of genes that may contribute to chemotherapy resistance and provides a new contextual resource for interpreting next generation sequencing (NGS) genomic profiling of tumors. PMID:27863405
Liu, Bin; Jin, Min; Zeng, Pan
2015-10-01
The identification of gene-phenotype relationships is very important for the treatment of human diseases. Studies have shown that genes causing the same or similar phenotypes tend to interact with each other in a protein-protein interaction (PPI) network. Thus, many identification methods based on the PPI network model have achieved good results. However, in the PPI network, some interactions between the proteins encoded by candidate genes and the proteins encoded by known disease genes are very weak. Therefore, some studies have combined the PPI network with other genomic information and reported good predictive performances. However, we believe that the results could be further improved. In this paper, we propose a new method that uses the semantic similarity between the candidate gene and known disease genes to set the initial probability vector of a random walk with restart algorithm in a human PPI network. The effectiveness of our method was demonstrated by leave-one-out cross-validation, and the experimental results indicated that our method outperformed other methods. Additionally, our method can predict new causative genes of multifactor diseases, including Parkinson's disease, breast cancer and obesity. The top predictions were good and consistent with the findings in the literature, which further illustrates the effectiveness of our method. Copyright © 2015 Elsevier Inc. All rights reserved.
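The random walk with restart named in this abstract can be sketched on a toy network. The sketch is a generic implementation, not the authors' code. The four-node graph, the restart probability of 0.7, and the seed vector are hypothetical. In the paper the initial vector is derived from semantic similarity, whereas here it is simply supplied as an input that sums to 1, and every node is assumed to have at least one neighbour.

```python
def random_walk_with_restart(adjacency, p0, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence,
    where W moves the walker uniformly from a node to its neighbours."""
    nodes = list(adjacency)
    p = {v: p0.get(v, 0.0) for v in nodes}
    for _ in range(max_iter):
        nxt = {v: restart * p0.get(v, 0.0) for v in nodes}
        for u in nodes:
            # Split the non-restarting mass at u evenly among its neighbours.
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path-shaped PPI network A - B - C - D, seeded at gene A.
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
seed = {"A": 1.0}
scores = random_walk_with_restart(ppi, seed)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Candidate genes are then ranked by their steady-state score, so genes closer to the seeds in the network rank higher.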
Longevity candidate genes and their association with personality traits in the elderly
Luciano, Michelle; Lopez, Lorna M.; de Moor, Marleen H.M.; Harris, Sarah E.; Davies, Gail; Nutile, Teresa; Krueger, Robert F.; Esko, Tõnu; Schlessinger, David; Toshiko, Tanaka; Derringer, Jaime L.; Realo, Anu; Hansell, Narelle K.; Pergadia, Michele L.; Pesonen, Anu-Katriina; Sanna, Serena; Terracciano, Antonio; Madden, Pamela A.F.; Penninx, Brenda; Spinhoven, Philip; Hartman, Catherine; Oostra, Ben A.; Janssens, A. Cecile J.W.; Eriksson, Johan G; Starr, John M.; Cannas, Alessandra; Ferrucci, Luigi; Metspalu, Andres; Wright, Margeret J.; Heath, Andrew C.; van Duijn, Cornelia M.; Bierut, Laura J.; Raikkonen, Katri; Martin, Nicholas G.; Ciullo, Marina; Rujescu, Dan; Boomsma, Dorret I.; Deary, Ian J.
2013-01-01
Human longevity and personality traits are both heritable and are consistently linked at the phenotypic level. We test the hypothesis that candidate genes influencing longevity in lower organisms are associated with variance in the five major dimensions of human personality (measured by the NEO-FFI and IPIP inventories) plus related mood states of anxiety and depression. Seventy single nucleotide polymorphisms (SNPs) in six brain-expressed longevity candidate genes (AFG3L2, FRAP1, MAT1A, MAT2A, SYNJ1 and SYNJ2) were typed in over one thousand 70-year-old participants from the Lothian Birth Cohort of 1936 (LBC1936). No SNPs were associated with the personality and psychological distress traits at a Bonferroni-corrected level of significance (p < 0.0002), but there was an over-representation of nominally significant (p < 0.05) SNPs in the synaptojanin-2 (SYNJ2) gene associated with agreeableness and symptoms of depression. Eight SNPs which showed nominally significant association across personality measurement instruments were tested in an extremely large replication sample of 17 106 participants. SNP rs350292, in SYNJ2, was significant: the minor allele was associated with an average decrease in NEO agreeableness scale scores of 0.25 points, and 0.67 points in the restricted analysis of elderly cohorts (most aged > 60 years). Because we selected a specific set of longevity genes based on functional genomics findings, further research on other longevity gene candidates is warranted to discover whether they are relevant candidates for personality and psychological distress traits. PMID:22213687
Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng
2014-01-01
Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154
Jiang, Yiwei
2013-01-01
Drought is a major environmental stress limiting growth of perennial grasses in temperate regions. Plant drought tolerance is a complex trait that is controlled by multiple genes. Candidate gene association mapping provides a powerful tool for dissection of complex traits. Candidate gene association mapping of drought tolerance traits was conducted in 192 diverse perennial ryegrass (Lolium perenne L.) accessions from 43 countries. The panel showed significant variations in leaf wilting, leaf water content, canopy and air temperature difference, and chlorophyll fluorescence under well-watered and drought conditions across six environments. Analysis of 109 simple sequence repeat markers revealed five population structures in the mapping panel. A total of 2520 expression-based sequence readings were obtained for a set of candidate genes involved in antioxidant metabolism, dehydration, water movement across membranes, and signal transduction, from which 346 single nucleotide polymorphisms were identified. Significant associations were identified between a putative LpLEA3 encoding late embryogenesis abundant group 3 protein and a putative LpFeSOD encoding iron superoxide dismutase and leaf water content, as well as between a putative LpCyt Cu-ZnSOD encoding cytosolic copper-zinc superoxide dismutase and chlorophyll fluorescence under drought conditions. Four of the significantly associated single nucleotide polymorphisms in these three genes also translated to amino acid substitutions in different genotypes. These results indicate that allelic variation in these genes may affect whole-plant response to drought stress in perennial ryegrass. PMID:23386684
Rai, Muhammad Farooq; Schmidt, Eric J; McAlinden, Audrey; Cheverud, James M; Sandell, Linda J
2013-11-06
Tissue regeneration is a complex trait with few genetic models available. Mouse strains LG/J and MRL are exceptional healers. Using recombinant inbred strains from a large (LG/J, healer) and small (SM/J, nonhealer) intercross, we have previously shown a positive genetic correlation between ear wound healing, knee cartilage regeneration, and protection from osteoarthritis. We hypothesize that a common set of genes operates in tissue healing and articular cartilage regeneration. Taking advantage of archived histological sections from recombinant inbred strains, we analyzed expression of candidate genes through branched-chain DNA technology directly from tissue lysates. We determined broad-sense heritability of candidates, Pearson correlation of candidates with healing phenotypes, and Ward minimum variance cluster analysis for strains. A bioinformatic assessment of allelic polymorphisms within and near candidate genes was also performed. The expression of several candidates was significantly heritable among strains. Although several genes correlated with both ear wound healing and cartilage healing at a marginal level, the expression of four genes representing DNA repair (Xrcc2, Pcna) and Wnt signaling (Axin2, Wnt16) pathways was significantly positively correlated with both phenotypes. Cluster analysis accurately classified healers and nonhealers for seven out of eight strains based on gene expression. Specific sequence differences between LG/J and SM/J were identified as potential causal polymorphisms. Our study suggests a common genetic basis between tissue healing and osteoarthritis susceptibility. Mapping genetic variations causing differences in diverse healing responses in multiple tissues may reveal generic healing processes in pursuit of new therapeutic targets designed to induce or enhance regeneration and, potentially, protection from osteoarthritis.
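The Pearson correlation between candidate-gene expression and a healing phenotype, mentioned above, can be sketched as follows. The per-strain numbers and the function name `pearson_r` are hypothetical; the abstract does not give the authors' data or code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-strain values: expression of one candidate gene and an
# ear-wound-healing score for the same strains (illustrative numbers only).
expression = [1.0, 2.0, 3.0, 4.0]
healing = [2.1, 3.9, 6.2, 7.8]
r = pearson_r(expression, healing)
```

A value of `r` close to +1 would correspond to the "significantly positively correlated" pattern reported for genes such as Xrcc2 or Axin2, although significance additionally depends on the number of strains.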
Comparative genomics reveals candidate carotenoid pathway regulators of ripening watermelon fruit.
Grassi, Stefania; Piro, Gabriella; Lee, Je Min; Zheng, Yi; Fei, Zhangjun; Dalessandro, Giuseppe; Giovannoni, James J; Lenucci, Marcello S
2013-11-12
Background Many fruits, including watermelon, are proficient in carotenoid accumulation during ripening. While most genes encoding steps in the carotenoid biosynthetic pathway have been cloned, few transcriptional regulators of these genes have been defined to date. Here we describe the identification of a set of putative carotenoid-related transcription factors resulting from fresh watermelon carotenoid and transcriptome analysis during fruit development and ripening. Our goal is to both clarify the expression profiles of carotenoid pathway genes and to identify candidate regulators and molecular targets for crop improvement. Results Total carotenoids progressively increased during fruit ripening up to ~55 μg g(-1) fw in red-ripe fruits. Trans-lycopene was the carotenoid that contributed most to this increase. Many of the genes related to carotenoid metabolism displayed changing expression levels during fruit ripening generating a metabolic flux toward carotenoid synthesis. Constitutive low expression of lycopene cyclase genes resulted in lycopene accumulation. RNA-seq expression profiling of watermelon fruit development yielded a set of transcription factors whose expression was correlated with ripening and carotenoid accumulation. Nineteen putative transcription factor genes from watermelon and homologous to tomato carotenoid-associated genes were identified. Among these, six were differentially expressed in the flesh of both species during fruit development and ripening. Conclusions Taken together the data suggest that, while the regulation of a common set of metabolic genes likely influences carotenoid synthesis and accumulation in watermelon and tomato fruits during development and ripening, specific and limiting regulators may differ between climacteric and non-climacteric fruits, possibly related to their differential susceptibility to and use of ethylene during ripening. PMID:24219562
Thomassen, Mads; Tan, Qihua; Kruse, Torben A
2009-01-01
Breast cancer cells exhibit complex karyotypic alterations causing deregulation of numerous genes. Some of these genes are probably causal for cancer formation and local growth whereas others are causal for the various steps of metastasis. In a fraction of tumors deregulation of the same genes might be caused by epigenetic modulations, point mutations or the influence of other genes. We have investigated the relation of gene expression and chromosomal position, using eight datasets including more than 1200 breast tumors, to identify chromosomal regions and candidate genes possibly causal for breast cancer metastasis. By use of "Gene Set Enrichment Analysis" we have ranked chromosomal regions according to their relation to metastasis. Overrepresentation analysis identified regions with increased expression for chromosome 1q41-42, 8q24, 12q14, 16q22, 16q24, 17q12-21.2, 17q21-23, 17q25, 20q11, and 20q13 among metastasizing tumors and reduced gene expression at 1p31-21, 8p22-21, and 14q24. By analysis of genes with extremely imbalanced expression in these regions we identified DIRAS3 at 1p31, PSD3, LPL, EPHX2 at 8p21-22, and FOS at 14q24 as candidate metastasis suppressor genes. Potential metastasis-promoting genes include RECQL4 at 8q24, PRMT7 at 16q22, GINS2 at 16q24, and AURKA at 20q13.
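Overrepresentation of a chromosomal region among deregulated genes is often scored with a hypergeometric upper-tail probability. The sketch below assumes that test; the abstract does not state which statistic the authors used, and the function name `hypergeom_upper_tail` and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric variable X.

    population : total number of genes considered
    successes  : genes located in the chromosomal region of interest
    draws      : genes flagged as up-regulated in metastasizing tumors
    observed   : flagged genes that fall inside the region
    """
    lower = max(observed, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass for every count from `observed` upwards.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lower, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 10 genes overall, 4 in the region, 3 flagged genes,
# of which 2 lie in the region.
p_value = hypergeom_upper_tail(10, 4, 3, 2)
```

A small `p_value` would indicate that the region holds more flagged genes than expected by chance; with many regions tested, a multiple-testing correction would also be needed.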
2013-01-01
Background Ever since the recent completion of the peach genome, the focus of genetic research in this area has turned to the identification of genes related to important traits, such as fruit aroma volatiles. Of the over 100 volatile compounds described in peach, lactones most likely have the strongest effect on fruit aroma, while esters, terpenoids, and aldehydes have minor, yet significant effects. The identification of key genes underlying the production of aroma compounds is of interest for any fruit-quality improvement strategy. Results Volatile (52 compounds) and gene expression (4348 genes) levels were profiled in peach fruit from a maturity time-course series belonging to two peach genotypes that showed considerable differences in maturation characteristics and postharvest ripening. This data set was analyzed by complementary correlation-based approaches to discover the genes related to the main aroma-contributing compounds: lactones, esters, and phenolic volatiles, among others. As a case study, one of the candidate genes was cloned and expressed in yeast to show specificity as an ω-6 Oleate desaturase, which may be involved in the production of a precursor of lactones/esters. Conclusions Our approach revealed a set of genes (an alcohol acyl transferase, fatty acid desaturases, transcription factors, protein kinases, cytochromes, etc.) that are highly associated with peach fruit volatiles, and which could prove useful in breeding or for biotechnological purposes. PMID:23701715
Sporulation genes associated with sporulation efficiency in natural isolates of yeast.
Tomar, Parul; Bhatia, Aatish; Ramdas, Shweta; Diao, Liyang; Bhanot, Gyan; Sinha, Himanshu
2013-01-01
Yeast sporulation efficiency is a quantitative trait and is known to vary among experimental populations and natural isolates. Some studies have uncovered the genetic basis of this variation and have identified the role of sporulation genes (IME1, RME1) and sporulation-associated genes (FKH2, PMS1, RAS2, RSF1, SWS2), as well as non-sporulation pathway genes (MKT1, TAO3) in maintaining this variation. However, these studies have been done mostly in experimental populations. Sporulation is a response to nutrient deprivation. Unlike laboratory strains, natural isolates have likely undergone multiple selections for quick adaptation to varying nutrient conditions. As a result, sporulation efficiency in natural isolates may have different genetic factors contributing to phenotypic variation. Using Saccharomyces cerevisiae strains in the genetically and environmentally diverse SGRP collection, we have identified genetic loci associated with sporulation efficiency variation in a set of sporulation and sporulation-associated genes. Using two independent methods for association mapping and correcting for population structure biases, our analysis identified two linked clusters containing 4 non-synonymous mutations in genes – HOS4, MCK1, SET3, and SPO74. Five regulatory polymorphisms in five genes such as MLS1 and CDC10 were also identified as putative candidates. Our results provide candidate genes contributing to phenotypic variation in the sporulation efficiency of natural isolates of yeast. PMID:23874994
Zwaenepoel, Arthur; Diels, Tim; Amar, David; Van Parys, Thomas; Shamir, Ron; Van de Peer, Yves; Tzfadia, Oren
2018-01-01
Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
Castiello, Luciano; Sabatino, Marianna; Zhao, Yingdong; Tumaini, Barbara; Ren, Jiaqiang; Ping, Jin; Wang, Ena; Wood, Lauren V; Marincola, Francesco M; Puri, Raj K; Stroncek, David F
2013-02-01
Cell-based immunotherapies are among the most promising approaches for developing effective and targeted immune responses. However, their clinical usefulness and the evaluation of their efficacy rely heavily on complex quality control assessment. Therefore, rapid systematic methods are urgently needed for the in-depth characterization of relevant factors affecting newly developed cell product consistency and the identification of reliable markers for quality control. Using dendritic cells (DCs) as a model, we present a strategy to comprehensively characterize manufactured cellular products in order to define factors affecting their variability, quality and function. After generating clinical grade human monocyte-derived mature DCs (mDCs), we tested by gene expression profiling the degrees of product consistency related to the manufacturing process and variability due to intra- and interdonor factors, and how each factor affects single gene variation. Then, by calculating for each gene an index of variation we selected candidate markers for identity testing, and defined a set of genes that may be useful as comparability and potency markers. Subsequently, we confirmed the observed gene index of variation in a larger clinical data set. In conclusion, using high-throughput technology we developed a method for the characterization of cellular therapies and the discovery of novel candidate quality assurance markers.
Polonikov, Alexey V.; Ivanov, Vladimir P.; Bogomazov, Alexey D.; Freidin, Maxim B.; Illig, Thomas; Solodilova, Maria A.
2014-01-01
Oxidative stress resulting from an increased amount of reactive oxygen species and an imbalance between oxidants and antioxidants plays an important role in the pathogenesis of asthma. The present study tested the hypothesis that genetic susceptibility to allergic and nonallergic variants of asthma is determined by complex interactions between genes encoding antioxidant defense enzymes (ADE). We carried out a comprehensive analysis of the associations between adult asthma and 46 single nucleotide polymorphisms of 34 ADE genes and 12 other candidate genes of asthma in a Russian population using set association analysis and multifactor dimensionality reduction approaches. We found for the first time epistatic interactions between ADE genes underlying asthma susceptibility and the genetic heterogeneity between allergic and nonallergic variants of the disease. We identified GSR (glutathione reductase) and PON2 (paraoxonase 2) as novel candidate genes for asthma susceptibility. We observed gender-specific effects of ADE genes on the risk of asthma. The results of the study demonstrate the complexity and diversity of interactions between genes involved in oxidative stress underlying susceptibility to allergic and nonallergic asthma. PMID:24895604
A literature search tool for intelligent extraction of disease-associated genes.
Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P
2014-01-01
We aimed to extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
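A rule-based extraction step with keyword matching, as outlined above, might look like the following minimal sketch. The lexicons, the cue patterns and the function name `extract_links` are hypothetical and far simpler than a production rule set; they are not taken from the authors' tool.

```python
import re

# Hypothetical lexicons; a real system would load curated gene and disorder lists.
GENE_LEXICON = {"BRCA1", "SHANK3", "MECP2"}
DISORDER_KEYWORDS = {"autism", "breast cancer"}

# Cues that a sentence reports a significant result, and cues that negate it.
SIGNIFICANCE_CUES = re.compile(
    r"\b(significant(ly)? associat\w+|p\s*<\s*0?\.\d+)", re.IGNORECASE
)
NEGATION_CUES = re.compile(r"\b(no|not|failed to)\b", re.IGNORECASE)


def extract_links(abstract):
    """Return (gene, disorder) pairs from sentences that report significance."""
    links = set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        if not SIGNIFICANCE_CUES.search(sentence):
            continue
        if NEGATION_CUES.search(sentence):
            continue
        # Upper-case alphanumeric tokens are treated as candidate gene symbols.
        tokens = set(re.findall(r"\b[A-Z][A-Z0-9]{1,}\b", sentence))
        genes = tokens & GENE_LEXICON
        lowered = sentence.lower()
        disorders = {d for d in DISORDER_KEYWORDS if d in lowered}
        links.update((g, d) for g in genes for d in disorders)
    return links


sample = (
    "SHANK3 variants were significantly associated with autism. "
    "MECP2 was not significantly associated with autism."
)
found = extract_links(sample)
```

On this sample only the SHANK3 sentence passes both rules, so `found` holds the single pair `("SHANK3", "autism")`; the negated MECP2 sentence is skipped.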
Pavasovic, Ana; Dammannagoda, Lalith K.; Mather, Peter B.; Prentis, Peter J.
2017-01-01
Systemic acid-base balance and osmotic/ionic regulation in decapod crustaceans are in part maintained by a set of transport-related enzymes such as carbonic anhydrase (CA), Na+/K+-ATPase (NKA), H+-ATPase (HAT), Na+/K+/2Cl− cotransporter (NKCC), Na+/Cl−/HCO3− cotransporter (NBC), Na+/H+ exchanger (NHE), Arginine kinase (AK), Sarcoplasmic Ca2+-ATPase (SERCA) and Calreticulin (CRT). We carried out a comparative molecular analysis of these genes in three commercially important yet eco-physiologically distinct freshwater crayfish, Cherax quadricarinatus, C. destructor and C. cainii, with the aim to identify mutations in these genes and determine if observed patterns of mutations were consistent with the action of natural selection. We also conducted a tissue-specific expression analysis of these genes across seven different organs, including gills, hepatopancreas, heart, kidney, liver, nerve and testes using NGS transcriptome data. The molecular analysis of the candidate genes revealed a high level of sequence conservation across the three Cherax sp. HyPhy analysis revealed that all candidate genes showed patterns of molecular variation consistent with neutral evolution. The tissue-specific expression analysis showed that 46% of candidate genes were expressed in all tissue types examined, while approximately 10% of candidate genes were only expressed in a single tissue type. The largest number of genes was observed in nerve (84%) and gills (78%) and the lowest in testes (66%).
The tissue-specific expression analysis also revealed that most of the master genes regulating pH and osmoregulation (CA, NKA, HAT, NKCC, NBC, NHE) were expressed in all tissue types, indicating an important physiological role for these genes outside of osmoregulation in other tissue types. The high level of sequence conservation observed in the candidate genes may be explained by the important role these genes play, and by the possibility that they serve a number of other basic physiological functions in different tissue types. PMID:28852583
Genes Regulated by Vitamin D in Bone Cells Are Positively Selected in East Asians
Chen, Yuan; Xue, Yali; Luiselli, Donata; Tyler-Smith, Chris; Pagani, Luca; Ayub, Qasim
2015-01-01
Vitamin D and folate are activated and degraded by sunlight, respectively, and the physiological processes they control are likely to have been targets of selection as humans expanded from Africa into Eurasia. We investigated signals of positive selection in gene sets involved in the metabolism, regulation and action of these two vitamins in worldwide populations sequenced by Phase I of the 1000 Genomes Project. Comparing allele frequency-spectrum-based summary statistics between these gene sets and matched control genes, we observed a selection signal specific to East Asians for a gene set associated with vitamin D action in bones. The selection signal was mainly driven by three genes CXXC finger protein 1 (CXXC1), low density lipoprotein receptor-related protein 5 (LRP5) and runt-related transcription factor 2 (RUNX2). Examination of population differentiation and haplotypes allowed us to identify several candidate causal regulatory variants in each gene. Four of these candidate variants (one each in CXXC1 and RUNX2 and two in LRP5) had a >70% derived allele frequency in East Asians, but were present at lower (20–60%) frequency in Europeans as well, suggesting that the adaptation might have been part of a common response to climatic and dietary changes as humans expanded out of Africa, with implications for their role in vitamin D-dependent bone mineralization and osteoporosis insurgence. We also observed haplotype sharing between East Asians, Finns and an extinct archaic human (Denisovan) sample at the CXXC1 locus, which is best explained by incomplete lineage sorting. PMID:26719974
Gaykalova, Daria A; Vatapalli, Rajita; Wei, Yingying; Tsai, Hua-Ling; Wang, Hao; Zhang, Chi; Hennessey, Patrick T; Guo, Theresa; Tan, Marietta; Li, Ryan; Ahn, Julie; Khan, Zubair; Westra, William H; Bishop, Justin A; Zaboli, David; Koch, Wayne M; Khan, Tanbir; Ochs, Michael F; Califano, Joseph A
2015-01-01
Head and Neck Squamous Cell Carcinoma (HNSCC) is the fifth most common cancer, annually affecting over half a million people worldwide. Presently, there are no accepted biomarkers for clinical detection and surveillance of HNSCC. In this work, a comprehensive genome-wide analysis of epigenetic alterations in primary HNSCC tumors was employed in conjunction with cancer-specific outlier statistics to define novel biomarker genes which are differentially methylated in HNSCC. The 37 identified biomarker candidates were top-scoring outlier genes with prominent differential methylation in tumors, but with no signal in normal tissues. These putative candidates were validated in independent HNSCC cohorts from our institution and TCGA (The Cancer Genome Atlas). Using the top candidates, ZNF14, ZNF160, and ZNF420, an assay was developed for detection of HNSCC in primary tissue and saliva samples with 100% specificity when compared to normal control samples. Given the high detection specificity, the analysis of ZNF DNA methylation in combination with other DNA methylation biomarkers may be useful in the clinical setting for HNSCC detection and surveillance, particularly in high-risk patients. Several additional candidates identified through this work can be further investigated toward future development of a multi-gene panel of biomarkers for the surveillance and detection of HNSCC.
Genetic neuropathology of obsessive psychiatric syndromes
Jaffe, A E; Deep-Soboslay, A; Tao, R; Hauptman, D T; Kaye, W H; Arango, V; Weinberger, D R; Hyde, T M; Kleinman, J E
2014-01-01
Anorexia nervosa (AN), bulimia nervosa (BN) and obsessive-compulsive disorder (OCD) are complex psychiatric disorders with shared obsessive features, thought to arise from the interaction of multiple genes of small effect with environmental factors. Potential candidate genes for AN, BN and OCD have been identified through clinical association and neuroimaging studies; however, recent genome-wide association studies of eating disorders (ED) so far have failed to report significant findings. In addition, few, if any, studies have interrogated postmortem brain tissue for evidence of expression quantitative trait loci (eQTLs) associated with candidate genes, which has particular promise as an approach to elucidating molecular mechanisms of association. We therefore selected single-nucleotide polymorphisms (SNPs) based on candidate gene studies for AN, BN and OCD from the literature, and examined the association of these SNPs with gene expression across the lifespan in prefrontal cortex of a nonpsychiatric control cohort (N=268). Several risk-predisposing SNPs were significantly associated with gene expression among control subjects. We then measured gene expression in the prefrontal cortex of cases previously diagnosed with obsessive psychiatric disorders, for example, ED (N=15) and OCD/obsessive-compulsive personality disorder or tics (OCD/OCPD/Tic; N=16), and nonpsychiatric controls (N=102) and identified 6 and 286 genes that were differentially expressed between ED compared with controls and OCD cases compared with controls, respectively (false discovery rate (FDR) <5%). However, none of the clinical risk SNPs were among the eQTLs and none were significantly associated with gene expression within the broad obsessive cohort, suggesting larger sample sizes or other brain regions may be required to identify candidate molecular mechanisms of clinical association in postmortem brain data sets. PMID:25180571
Genetic neuropathology of obsessive psychiatric syndromes.
Jaffe, A E; Deep-Soboslay, A; Tao, R; Hauptman, D T; Kaye, W H; Arango, V; Weinberger, D R; Hyde, T M; Kleinman, J E
2014-09-02
Annotating novel genes by integrating synthetic lethals and genomic information
Schöner, Daniel; Kalisch, Markus; Leisner, Christian; Meier, Lukas; Sohrmann, Marc; Faty, Mahamadou; Barral, Yves; Peter, Matthias; Gruissem, Wilhelm; Bühlmann, Peter
2008-01-01
Background: Large-scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large-scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is a need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results: We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Unlike existing methods that require large amounts of synthetic lethal data, our method relies only on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates and we present she1 (YBL031W) as a novel gene involved in spindle migration. We also applied the statistical methodology to TOR2 signaling as another example. Conclusion: We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process. PMID:18194531
Bazzi, Gaia; Podofillini, Stefano; Gatti, Emanuele; Gianfranceschi, Luca; Cecere, Jacopo G; Spina, Fernando; Saino, Nicola; Rubolini, Diego
2017-10-01
The timing of major life-history events, such as migration and moult, is set by endogenous circadian and circannual clocks, which have been well characterized at the molecular level. Conversely, the genetic sources of variation in phenology and in other behavioral traits have been sparsely addressed. It has been proposed that inter-individual variability in the timing of seasonal events may arise from allelic polymorphism at phenological candidate genes involved in the signaling cascade of the endogenous clocks. In this study of a long-distance migratory passerine bird, the willow warbler Phylloscopus trochilus, we investigated whether allelic variation at 5 polymorphic loci of 4 candidate genes (Adcyap1, Clock, Creb1, and Npas2) predicted 2 major components of the annual schedule, namely timing of spring migration across the central Mediterranean Sea and moult speed, the latter gauged from ptilochronological analyses of tail feathers moulted in the African winter quarters. We identified a novel Clock gene locus (Clock region 3) showing polyQ polymorphism, which was, however, not significantly associated with any phenotypic trait. Npas2 allele size predicted male (but not female) spring migration date, with males bearing longer alleles migrating significantly earlier than those bearing shorter alleles. Creb1 allele size significantly predicted male (but not female) moult speed, longer alleles being associated with faster moult. All other genotype-phenotype associations were statistically non-significant. These findings provide new evidence for a role of candidate genes in modulating the phenology of different circannual activities in long-distance migratory birds, and for the occurrence of sex-specific candidate gene effects.
NASA Astrophysics Data System (ADS)
Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.
2016-04-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, where significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best-ranked groups obtained by the algorithm, using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering the fact that it doesn’t find all possible subsets, we compared its results against a full search, to determine the number of good co-regulated sets not detected.
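Scoring individual gene groups by the Silhouette index, as described above, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the Euclidean distance, the function name `silhouette_per_cluster`, and the toy expression profiles are assumptions, and the cluster labels are taken as given rather than produced by hierarchical clustering.

```python
from math import dist


def silhouette_per_cluster(points, labels):
    """Return {cluster label: mean silhouette width of its members}.

    Requires at least two clusters; singleton clusters score 0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(dist(points[i], points[j]) for j in others) / len(others)

    scores = {}
    for lab, members in clusters.items():
        widths = []
        for i in members:
            if len(members) == 1:
                widths.append(0.0)
                continue
            a = mean_dist(i, members)  # cohesion: mean distance within own group
            b = min(mean_dist(i, other)  # separation: nearest other group
                    for other_lab, other in clusters.items() if other_lab != lab)
            denom = max(a, b)
            widths.append((b - a) / denom if denom > 0 else 0.0)
        scores[lab] = sum(widths) / len(widths)
    return scores


# Hypothetical two-condition expression profiles for four genes.
profiles = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
group_scores = silhouette_per_cluster(profiles, [0, 0, 1, 1])
```

Groups with a mean width close to 1 are compact and well separated, so ranking candidate groups by this score gives one way to pick the "best" ones.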
Norling, A; Hirschberg, A L; Rodriguez-Wallberg, K A; Iwarsson, E; Wedell, A; Barbaro, M
2014-08-01
Can high-resolution array comparative genomic hybridization (CGH) analysis of DNA samples from women with primary ovarian insufficiency (POI) improve the diagnosis of the condition and identify novel candidate genes for POI? A mutation affecting the regulatory region of growth differentiation factor 9 (GDF9) was identified for the first time together with several novel candidate genes for POI. Most patients with POI do not receive a molecular diagnosis despite a significant genetic component in the pathogenesis. We performed a case-control study. Twenty-six patients were analyzed by array CGH for identification of copy number variants. Novel changes were investigated in 95 controls and in a separate population of 28 additional patients with POI. The experimental procedures were performed during a 1-year period. DNA samples from 26 patients with POI were analyzed by a customized 1M array-CGH platform with whole genome coverage and probe enrichment targeting 78 genes in sex development. By PCR amplification and sequencing, the breakpoint of an identified partial GDF9 gene duplication was characterized. A multiplex ligation-dependent probe amplification (MLPA) probe set for specific identification of deletions/duplications affecting GDF9 was developed. An MLPA probe set for the identification of additional cases or controls carrying novel candidate regions identified by array-CGH was developed. Sequencing of three candidate genes was performed. Eleven unique copy number changes were identified in a total of 11 patients, including a tandem duplication of 475 bp, containing part of the GDF9 gene promoter region. The duplicated region contains three NOBOX-binding elements and an E-box, important for GDF9 gene regulation. This aberration is likely causative of POI. Fifty-four patients were investigated for copy number changes within GDF9, but no additional cases were found. 
Ten aberrations constituting novel candidate regions were detected, including a second DNAH6 deletion in a patient with POI. Other identified candidate genes were TSPYL6, SMARCC1, CSPG5 and ZFR2. This is a descriptive study and no functional experiments were performed. The study illustrates the importance of analyzing small copy number changes in addition to sequence alterations in the genetic investigation of patients with POI. Also, promoter regions should be included in the investigation. The study was supported by grants from the Swedish Research council (project no 12198 to A.W. and project no 20324 to A.L.H.), Stockholm County Council (E.I., A.W. and K.R.W.), Foundation Frimurare Barnhuset (A.N., A.W. and M.B.), Karolinska Institutet (A.N., A.L.H., E.I., A.W. and M.B.), Novo Nordic Foundation (A.W.) and Svenska Läkaresällskapet (M.B.). The funding sources had no involvement in the design or analysis of the study. The authors have no competing interests to declare. Not applicable. © The Author 2014. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology.
Discovery of cancer common and specific driver gene sets
2017-01-01
Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295
Uddin, Raihan; Singh, Shiva M.
2017-01-01
As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
Taken together, they provide a new insight and generate new hypotheses into the molecular mechanisms responsible for age associated learning impairment, including spatial learning. PMID:29066959
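The notion of a highly connected hub gene in a weighted co-expression network can be made concrete with a small sketch. It assumes the unsigned WGCNA-style adjacency, the absolute Pearson correlation raised to a soft-thresholding power; the power of 6, the function names, and the expression values are illustrative assumptions, not settings reported in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """expr: {gene: [expression per sample]} -> {gene: summed adjacency}."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += adjacency
            k[h] += adjacency
    return k


# Hypothetical expression profiles across four samples.
toy_expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
k_values = connectivity(toy_expr)
hub = max(k_values, key=k_values.get)
```

Computing connectivity separately in young and aged networks and comparing the values per gene is one simple way to look at differential connectivity.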
Mafra, Valéria; Kubo, Karen S.; Alves-Ferreira, Marcio; Ribeiro-Alves, Marcelo; Stuart, Rodrigo M.; Boava, Leonardo P.; Rodrigues, Carolina M.; Machado, Marcos A.
2012-01-01
Real-time reverse transcription PCR (RT-qPCR) has emerged as an accurate and widely used technique for expression profiling of selected genes. However, obtaining reliable measurements depends on the selection of appropriate reference genes for gene expression normalization. The aim of this work was to assess the expression stability of 15 candidate genes to determine which set of reference genes is best suited for transcript normalization in citrus in different tissues and organs and leaves challenged with five pathogens (Alternaria alternata, Phytophthora parasitica, Xylella fastidiosa and Candidatus Liberibacter asiaticus). We tested traditional genes used for transcript normalization in citrus and orthologs of Arabidopsis thaliana genes described as superior reference genes based on transcriptome data. geNorm and NormFinder algorithms were used to find the best reference genes to normalize all samples and conditions tested. Additionally, each biotic stress was individually analyzed by geNorm. In general, FBOX (encoding a member of the F-box family) and GAPC2 (GAPDH) was the most stable candidate gene set assessed under the different conditions and subsets tested, while CYP (cyclophilin), TUB (tubulin) and CtP (cathepsin) were the least stably expressed genes found. Validation of the best suitable reference genes for normalizing the expression level of the WRKY70 transcription factor in leaves infected with Candidatus Liberibacter asiaticus showed that arbitrary use of reference genes without previous testing could lead to misinterpretation of data. Our results revealed FBOX, SAND (a SAND family protein), GAPC2 and UPL7 (ubiquitin protein ligase 7) to be superior reference genes, and we recommend their use in studies of gene expression in citrus species and relatives. This work constitutes the first systematic analysis for the selection of superior reference genes for transcript normalization in different citrus organs and under biotic stress. PMID:22347455
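The expression-stability ranking that geNorm performs can be sketched with its M measure: for each candidate gene, the average standard deviation of its log2 expression ratios against every other candidate across samples, where a lower M means more stable expression. The function name and the relative quantities below are hypothetical, and the sketch omits geNorm's stepwise exclusion of the least stable gene.

```python
from math import log2
from statistics import stdev


def genorm_m(expr):
    """expr: {gene: [relative quantity per sample]} -> {gene: M value}."""
    genes = list(expr)
    m_values = {}
    for g in genes:
        pairwise_sd = []
        for h in genes:
            if h == g:
                continue
            # Variation of the log ratio g/h across samples.
            log_ratios = [log2(a / b) for a, b in zip(expr[g], expr[h])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[g] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical relative quantities in three samples; not data from the study.
toy_quantities = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],
    "refC": [1.0, 4.0, 1.0],
}
stability = genorm_m(toy_quantities)
ranking = sorted(stability, key=stability.get)  # most stable first
```

Here refA and refB keep a constant ratio and so rank as more stable than refC, which is the behaviour expected of a good reference gene pair.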
Inference of combinatorial Boolean rules of synergistic gene sets from cancer microarray datasets.
Park, Inho; Lee, Kwang H; Lee, Doheon
2010-06-15
Gene set analysis has become an important tool for the functional interpretation of high-throughput gene expression datasets. Moreover, pattern analyses based on inferred gene set activities of individual samples have shown the ability to identify more robust disease signatures than individual gene-based pattern analyses. Although a number of approaches have been proposed for gene set-based pattern analysis, the combinatorial influence of deregulated gene sets on disease phenotype classification has not been studied sufficiently. We propose a new approach for inferring combinatorial Boolean rules of gene sets for a better understanding of cancer transcriptome and cancer classification. To reduce the search space of the possible Boolean rules, we identify small groups of gene sets that synergistically contribute to the classification of samples into their corresponding phenotypic groups (such as normal and cancer). We then measure the significance of the candidate Boolean rules derived from each group of gene sets; the level of significance is based on the class entropy of the samples selected in accordance with the rules. By applying the present approach to publicly available prostate cancer datasets, we identified 72 significant Boolean rules. Finally, we discuss several identified Boolean rules, such as the rule of glutathione metabolism (down) and prostaglandin synthesis regulation (down), which are consistent with known prostate cancer biology. Scripts written in Python and R are available at http://biosoft.kaist.ac.kr/~ihpark/. The refined gene sets and the full list of the identified Boolean rules are provided in the Supplementary Material. Supplementary data are available at Bioinformatics online.
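The idea of scoring a candidate Boolean rule by the class entropy of the samples it selects can be sketched as follows. This is an illustration under stated assumptions, not the authors' released scripts: rules are restricted to AND-combinations of gene set states, and the sample records, state labels, and function names are hypothetical.

```python
from collections import Counter
from math import log2


def class_entropy(labels):
    """Shannon entropy (bits) of the class labels; 0 means a pure selection."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def select(samples, rule):
    """Keep samples whose gene set states satisfy every condition (logical AND)."""
    return [s for s in samples
            if all(s["states"][gene_set] == state for gene_set, state in rule.items())]


# Hypothetical samples with inferred gene set activity states.
toy_samples = [
    {"cls": "cancer", "states": {"glutathione": "down", "prostaglandin": "down"}},
    {"cls": "cancer", "states": {"glutathione": "down", "prostaglandin": "down"}},
    {"cls": "normal", "states": {"glutathione": "up", "prostaglandin": "up"}},
    {"cls": "normal", "states": {"glutathione": "down", "prostaglandin": "up"}},
]
toy_rule = {"glutathione": "down", "prostaglandin": "down"}
chosen = select(toy_samples, toy_rule)
rule_entropy = class_entropy([s["cls"] for s in chosen])
baseline_entropy = class_entropy([s["cls"] for s in toy_samples])
```

A rule whose selected samples have much lower entropy than the full sample set separates the phenotypes well; assessing whether that drop is significant would need a further step, such as a permutation test, which is not shown.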
Fine mapping of the genic male-sterile ms1 gene in Capsicum annuum L.
Jeong, Kyumi; Choi, Doil; Lee, Jundae
2018-01-01
The genomic region cosegregating with the genic male-sterile ms1 gene of Capsicum annuum L. was delimited to a region of 869.9 kb on chromosome 5 through fine mapping analysis. A strong candidate gene, CA05g06780, a homolog of the Arabidopsis MALE STERILITY 1 gene that controls pollen development, was identified in this region. Genic male sterility caused by the ms1 gene has been used for the economically efficient production of massive hybrid seeds in paprika (Capsicum annuum L.), a colored bell-type sweet pepper. Previously, a CAPS marker, PmsM1-CAPS, located about 2-3 cM from the ms1 locus, was reported. In this study, we constructed a fine map near the ms1 locus using high-resolution melting (HRM) markers in an F2 population consisting of 1118 individual plants, which segregated into 867 male-fertile and 251 male-sterile plants. A total of 12 HRM markers linked to the ms1 locus were developed from 53 primer sets targeting intraspecific SNPs derived by comparing genome-wide sequences obtained by next-generation resequencing analysis. Using this approach, we narrowed down the region cosegregating with the ms1 gene to 869.9 kb of sequence. Gene prediction analysis revealed 11 open reading frames in this region. A strong candidate gene, CA05g06780, was identified; this gene is a homolog of the Arabidopsis MALE STERILITY 1 (MS1) gene, which encodes a PHD-type transcription factor that regulates pollen and tapetum development. Sequence comparison analysis suggested that the CA05g06780 gene is the strongest candidate for the ms1 gene of paprika. To summarize, we developed a cosegregated marker, 32187928-HRM, for marker-assisted selection and identified a strong candidate for the ms1 gene.
Baye, Tesfaye M; Butsch Kovacic, Melinda; Biagini Myers, Jocelyn M; Martin, Lisa J; Lindsey, Mark; Patterson, Tia L; He, Hua; Ericksen, Mark B; Gupta, Jayanta; Tsoras, Anna M; Lindsley, Andrew; Rothenberg, Marc E; Wills-Karp, Marsha; Eissa, N Tony; Borish, Larry; Khurana Hershey, Gurjit K
2011-02-28
Candidate gene case-control studies have identified several single nucleotide polymorphisms (SNPs) that are associated with asthma susceptibility. Most of these studies have been restricted to evaluations of specific SNPs within a single gene and within populations of European ancestry. Recently, there has been increasing interest in understanding racial differences in genetic risk associated with childhood asthma. Our aim was to compare association patterns of asthma candidate genes between children of European and African ancestry. Using a custom-designed Illumina SNP array, we genotyped 1,485 children within the Greater Cincinnati Pediatric Clinic Repository and Cincinnati Genomic Control Cohort for 259 SNPs in 28 genes and evaluated their associations with asthma. We identified 14 SNPs located in 6 genes that were significantly associated (p-values <0.05) with childhood asthma in African Americans. Among Caucasians, 13 SNPs in 5 genes were associated with childhood asthma. Two SNPs in IL4 were associated with asthma in both races (p-values <0.05). Gene-gene interaction studies identified race-specific sets of genes that best discriminate between asthmatic children and non-allergic controls. We identified IL4 as having a role in asthma susceptibility in both African American and Caucasian children. However, while IL4 SNPs were associated with asthma in asthmatic children with European and African ancestry, the relative contributions of the most replicated asthma-associated SNPs varied by ancestry. These data provide valuable insights into the pathways that may predispose to asthma in individuals with European vs. African ancestry.
Ho, Daniel W. H.; Yap, Maurice K. H.; Ng, Po Wah; Fung, Wai Yan; Yip, Shea Ping
2012-01-01
Background: Myopia is the most common ocular disorder worldwide and imposes a tremendous burden on society. It is a complex disease. The MYP6 locus at 22q12 is of particular interest because many studies have detected linkage signals at this interval. The MYP6 locus is likely to contain susceptibility gene(s) for myopia, but none has yet been identified. Methodology/Principal Findings: Two independent subject groups of southern Chinese in Hong Kong participated in the study: an initial study using a discovery sample set of 342 cases and 342 controls, and a follow-up study using a replication sample set of 316 cases and 313 controls. Cases with high myopia were defined by spherical equivalent ≤ -8 dioptres and emmetropic controls by spherical equivalent within ±1.00 dioptre for both eyes. Manual candidate gene selection from the MYP6 locus was supported by objective in silico prioritization. DNA samples of the discovery sample set were genotyped for 178 tagging single nucleotide polymorphisms (SNPs) from 26 genes. For replication, 25 SNPs (tagging or located at predicted transcription factor or microRNA binding sites) from 4 genes were subsequently examined using the replication sample set. Fisher P value was calculated for all SNPs and overall association results were summarized by meta-analysis. Based on initial and replication studies, rs2009066 located in the crystallin beta A4 (CRYBA4) gene was identified to be the most significantly associated with high myopia (initial study: P = 0.02; replication study: P = 1.88e-4; meta-analysis: P = 1.54e-5) among all the SNPs tested. The association result survived correction for multiple comparisons. Under the allelic genetic model for the combined sample set, the odds ratio of the minor allele G was 1.41 (95% confidence intervals, 1.21-1.64). Conclusions/Significance: A novel susceptibility gene (CRYBA4) was discovered for high myopia. 
Our study also signified the potential importance of appropriate gene prioritization in candidate selection. PMID:22792142
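An allelic odds ratio with a 95% confidence interval, as reported above for the minor allele G, can be computed from a 2x2 table of allele counts. The abstract does not state how the interval was derived; the sketch below assumes the common log-odds (Woolf) approximation, and the allele counts and function name are hypothetical rather than the study's data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Return (odds ratio, lower bound, upper bound) from allele counts."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of the log odds ratio (Woolf's method); all counts must be > 0.
    se = sqrt(1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts chosen only to exercise the function.
estimate, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1, as the reported 1.21-1.64 does, indicates that the allele is associated with case status at the chosen confidence level.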
Cross-platform method for identifying candidate network biomarkers for prostate cancer.
Jin, G; Zhou, X; Cui, K; Zhang, X-S; Chen, L; Wong, S T C
2009-11-01
Discovering biomarkers using mass spectrometry (MS) and microarray expression profiles is a promising strategy in molecular diagnosis. Here, the authors proposed a new pipeline for biomarker discovery that integrates disease information for proteins and genes, expression profiles at both the genomic and proteomic levels, and protein-protein interactions (PPIs) to discover high-confidence network biomarkers. Using this pipeline, a total of 474 molecules (genes and proteins) related to prostate cancer were identified and a prostate-cancer-related network (PCRN) was derived from the integrative information. Thus, a set of candidate network biomarkers was identified from multiple expression profiles comprising eight microarray datasets and one proteomics dataset. The network biomarkers with PPIs can accurately distinguish prostate cancer patients from normal subjects, potentially providing more reliable biomarker candidates than conventional biomarker discovery methods.
Barrière, Yves; Courtial, Audrey; Chateigner-Boutin, Anne-Laure; Denoue, Dominique; Grima-Pettenati, Jacqueline
2016-01-01
The knowledge of the gene families mostly impacting cell wall digestibility variations would significantly increase the efficiency of marker-assisted selection when breeding maize and grass varieties with improved silage feeding value and/or with better straw fermentability into alcohol or methane. The maize genome sequence of the B73 inbred line was released at the end of 2009, opening up new avenues to identify the genetic determinants of quantitative traits. Colocalizations between a large set of candidate genes putatively involved in secondary cell wall assembly and QTLs for cell wall digestibility (IVNDFD) were then investigated, considering physical positions of both genes and QTLs. Based on available data from six RIL progenies, 59 QTLs corresponding to 38 non-overlapping positions were matched up with a list of 442 genes distributed all over the genome. Altogether, 176 genes colocalized with IVNDFD QTLs and most often, several candidate genes colocalized at each QTL position. Frequent QTL colocalizations were found firstly with genes encoding ZmMYB and ZmNAC transcription factors, and secondly with genes encoding zinc finger, bHLH, and xylogen regulation factors. In contrast, close colocalizations were less frequent with genes involved in monolignol biosynthesis, and found only with the C4H2, CCoAOMT5, and CCR1 genes. Close colocalizations were also infrequent with genes involved in cell wall feruloylation and cross-linkages. Altogether, investigated colocalizations between candidate genes and cell wall digestibility QTLs suggested a prevalent role of regulation factors over constitutive cell wall genes on digestibility variations. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
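Matching candidate genes to QTLs by physical position, as described above, amounts to an interval-overlap test on each chromosome. The sketch below is illustrative only: the gene names, coordinates, and the function name `colocalized` are hypothetical, and it treats a QTL as a simple start-end support interval rather than following the authors' exact colocalization criteria.

```python
def colocalized(genes, qtls):
    """Map each gene to the QTL intervals it overlaps.

    genes: {name: (chromosome, start, end)}
    qtls:  [(chromosome, start, end), ...]
    """
    hits = {}
    for name, (g_chrom, g_start, g_end) in genes.items():
        for q_chrom, q_start, q_end in qtls:
            # Two intervals overlap when each starts before the other ends.
            if g_chrom == q_chrom and g_start <= q_end and g_end >= q_start:
                hits.setdefault(name, []).append((q_chrom, q_start, q_end))
    return hits


# Hypothetical physical positions in base pairs.
toy_genes = {
    "geneA": ("1", 100, 200),
    "geneB": ("1", 500, 600),
    "geneC": ("2", 150, 250),
}
toy_qtls = [("1", 150, 400), ("2", 300, 400)]
matches = colocalized(toy_genes, toy_qtls)
```

Counting the keys of the result gives the number of colocalizing genes, and grouping the hits by QTL shows how many candidates fall under each position.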
Cellular dissection of psoriasis for transcriptome analyses and the post-GWAS era
2014-01-01
Background: Genome-scale studies of psoriasis have been used to identify genes of potential relevance to disease mechanisms. For many identified genes, however, the cell type mediating disease activity is uncertain, which has limited our ability to design gene functional studies based on genomic findings. Methods: We identified differentially expressed genes (DEGs) with altered expression in psoriasis lesions (n = 216 patients), as well as candidate genes near susceptibility loci from psoriasis GWAS studies. These gene sets were characterized based upon their expression across 10 cell types present in psoriasis lesions. Susceptibility-associated variation at intergenic (non-coding) loci was evaluated to identify sites of allele-specific transcription factor binding. Results: Half of DEGs showed highest expression in skin cells, although the dominant cell type differed between psoriasis-increased DEGs (keratinocytes, 35%) and psoriasis-decreased DEGs (fibroblasts, 33%). In contrast, psoriasis GWAS candidates tended to have highest expression in immune cells (71%), with a significant fraction showing maximal expression in neutrophils (24%, P < 0.001). By identifying candidate cell types for genes near susceptibility loci, we could identify and prioritize SNPs at which susceptibility variants are predicted to influence transcription factor binding. This led to the identification of potentially causal (non-coding) SNPs for which susceptibility variants influence binding of AP-1, NF-κB, IRF1, STAT3 and STAT4. Conclusions: These findings underscore the role of innate immunity in psoriasis and highlight neutrophils as a cell type linked with pathogenetic mechanisms. Assignment of candidate cell types to genes emerging from GWAS studies provides a first step towards functional analysis, and we have proposed an approach for generating hypotheses to explain GWAS hits at intergenic loci. PMID:24885462
Liu, Na; Xue, Yadong; Guo, Zhanyong; Li, Weihua; Tang, Jihua
2016-01-01
Kernel starch content is an important trait in maize (Zea mays L.) as it accounts for 65–75% of the dry kernel weight and positively correlates with seed yield. A number of starch synthesis-related genes have been identified in maize in recent years. However, many loci underlying variation in starch content among maize inbred lines still remain to be identified. The current study is a genome-wide association study that used a set of 263 maize inbred lines. In this panel, the average kernel starch content was 66.99%, ranging from 60.60 to 71.58% over the three study years. These inbred lines were genotyped with the SNP50 BeadChip maize array, which is comprised of 56,110 evenly spaced, random SNPs. Population structure was controlled by a mixed linear model (MLM) as implemented in the software package TASSEL. After the statistical analyses, four SNPs were identified as significantly associated with starch content (P ≤ 0.0001), among which one each are located on chromosomes 1 and 5 and two are on chromosome 2. Furthermore, 77 candidate genes associated with starch synthesis were found within the 100-kb intervals containing these four QTLs, and four highly associated genes were within 20-kb intervals of the associated SNPs. Among the four genes, Glucose-1-phosphate adenylyltransferase (APS1; Gene ID GRMZM2G163437) is known as an important regulator of kernel starch content. The identified SNPs, QTLs, and candidate genes may not only be readily used for germplasm improvement by marker-assisted selection in breeding, but can also elucidate the genetic basis of starch content. Further studies on these identified candidate genes may help determine the molecular mechanisms regulating kernel starch content in maize and other important cereal crops. PMID:27512395
Taylor, Candy M; Jost, Ricarda; Erskine, William; Nelson, Matthew N
2016-01-01
Quantitative Reverse Transcription PCR (qRT-PCR) is currently one of the most popular, high-throughput and sensitive technologies available for quantifying gene expression. Its accurate application depends heavily upon normalisation of gene-of-interest data with reference genes that are uniformly expressed under experimental conditions. The aim of this study was to provide the first validation of reference genes for Lupinus angustifolius (narrow-leafed lupin, a significant grain legume crop) using a selection of seven genes previously trialed as reference genes for the model legume, Medicago truncatula. In a preliminary evaluation, the seven candidate reference genes were assessed on the basis of primer specificity for their respective targeted region, PCR amplification efficiency, and ability to discriminate between cDNA and gDNA. Following this assessment, expression of the three most promising candidates [Ubiquitin C (UBC), Helicase (HEL), and Polypyrimidine tract-binding protein (PTB)] was evaluated using the NormFinder and RefFinder statistical algorithms in two narrow-leafed lupin lines, both with and without vernalisation treatment, and across seven organ types (cotyledons, stem, leaves, shoot apical meristem, flowers, pods and roots) encompassing three developmental stages. UBC was consistently identified as the most stable candidate and has sufficiently uniform expression that it may be used as a sole reference gene under the experimental conditions tested here. However, as organ type and developmental stage were associated with greater variability in relative expression, it is recommended using UBC and HEL as a pair to achieve optimal normalisation. These results highlight the importance of rigorously assessing candidate reference genes for each species across a diverse range of organs and developmental stages. 
With emerging technologies, such as RNAseq, and the completion of valuable transcriptome data sets, it is possible that other potentially more suitable reference genes will be identified for this species in future. PMID:26872362
The search for the genetic basis of hypertension.
Yagil, Yoram; Yagil, Chana
2005-03-01
This review surveys the literature on the search for the genetic basis of hypertension during the 10 months since November 2003. The goals set forth by this search are defined and the highlights of the work accomplished are provided. The search for the genetic basis of hypertension is ongoing, generating an abundance of new data. These data consist of a large number of candidate genes, association of previously known and novel candidate genes with various facets of hypertension, detection of new quantitative trait loci and identification of genes that mediate susceptibility to hypertension. The renin-angiotensin-aldosterone system continues to dominate the interest of investigators. Other gene systems are also emerging but a single-gene system cannot be singled out beyond the renin-angiotensin-aldosterone system and the data are mostly sporadic and do not reflect a guided or coordinated effort to resolve unanswered issues. The notion that hypertension is polygenic is reinforced, yet few data are provided as to the actual number of genes involved, gene-gene interaction or gene-environment interaction. Advanced biotechnological tools involving transcriptomics and proteomics are underused. Research on the genetic basis of hypertension has generated over the past year a large number of candidate genes and tied them to various aspects of hypertension. How these genes fit into the complex pathophysiological network that induces hypertension remains unclear. The task of putting together these genes into a cohesive framework still lies ahead, but promises to enlighten us as to the true nature of hypertension, the pathogenic mechanisms involved and improved therapeutic and preventive measures.
Izquierdo-Lahuerta, Adriana; de Luis, Oscar; Gómez-Esquer, Francisco; Cruces, Jesús; Coloma, Antonio
2016-09-23
Alpha-dystroglycanopathies are a heterogeneous group of rare human diseases that have in common defects in α-dystroglycan O-glycosylation. These congenital disorders share features such as muscular dystrophy, malformations of the central nervous system and, more rarely, altered ocular development, as well as mutations in a set of candidate genes involved in these syndromes. The severity of the syndromes is variable; Walker-Warburg syndrome is the most severe, and mutations in the protein O-mannosyltransferase genes POMT1 and POMT2 are frequently described in it. In studies of the lack of MmPomt1 in mouse embryonic development, as a murine model of Walker-Warburg syndrome, the MmPomt1 null phenotype was lethal because Reichert's membrane fails during embryonic development. Here, we report the expression of Gallus gallus genes orthologous to the human alpha-dystroglycanopathy candidates POMT1, POMT2, POMGnT1, FKTN, FKRP and LARGE, with special emphasis on the expression and localization of GgPomt1. Results obtained by quantitative RT-PCR, western blot and immunochemistry revealed closely matching gene expression patterns in human and chicken in key tissues affected during development in alpha-dystroglycanopathies, leading us to propose chicken as a useful animal model for the molecular characterization of glycosyltransferases involved in the O-glycosylation of α-dystroglycan and its role in embryonic development. Copyright © 2016 Elsevier Inc. All rights reserved.
An integrated analysis of genes and functional pathways for aggression in human and rodent models.
Zhang-James, Yanli; Fernàndez-Castillo, Noèlia; Hess, Jonathan L; Malki, Karim; Glatt, Stephen J; Cormand, Bru; Faraone, Stephen V
2018-06-01
Human genome-wide association studies (GWAS), transcriptome analyses of animal models, and candidate gene studies have advanced our understanding of the genetic architecture of aggressive behaviors. However, each of these methods presents unique limitations. To generate a more confident and comprehensive view of the complex genetics underlying aggression, we undertook an integrated, cross-species approach. We focused on human and rodent models to derive eight gene lists from three main categories of genetic evidence: two sets of genes identified in GWAS studies, four sets implicated by transcriptome-wide studies of rodent models, and two sets of genes with causal evidence from online Mendelian inheritance in man (OMIM) and knockout (KO) mice reports. These gene sets were evaluated for overlap and pathway enrichment to extract their similarities and differences. We identified enriched common pathways such as the G-protein coupled receptor (GPCR) signaling pathway, axon guidance, reelin signaling in neurons, and ERK/MAPK signaling. Also, individual genes were ranked based on their cumulative weights to quantify their importance as risk factors for aggressive behavior, which resulted in 40 top-ranked and highly interconnected genes. The results of our cross-species and integrated approach provide insights into the genetic etiology of aggression.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Metzenberg, A.B.; Pan, Y.; Das, S.
1994-05-01
Mapping studies have indicated that over two dozen genetic diseases lie on Xq28, the distal long arm of the X chromosome. In most cases the responsible gene has not yet been isolated. Most of these diseases occur at low frequency, and together with small family sizes and the lack of associated cytogenetic aberrations, this characteristic has made isolation of the genes difficult. Identification of the genes responsible for inherited disorders should eventually lead to a greater understanding of biochemical and developmental pathways. We and others are attempting to find these genes by examining genes that are candidates by virtue of their map location. One candidate is the Xq28-linked gene MPP-1, which encodes the p55 protein. In this study, we asked whether mutations in the p55 gene are present in patients affected with the Xq28-linked disorders dyskeratosis congenita and Emery-Dreifuss muscular dystrophy. The p55 cDNA is approximately 2 kb in length. The strategy for mutation detection in this sequence involved reverse transcription (RT)-PCR amplification of patient and control cDNA, yielding five sets of overlapping fragments, each set consisting of 400 bp, followed by SSCP analysis of each fragment. In no case was a true mutation in the p55 gene discovered. Therefore, it is highly unlikely that mutations in the p55 gene are responsible for any cases of dyskeratosis congenita or Emery-Dreifuss muscular dystrophy.
Text mining-based in silico drug discovery in oral mucositis caused by high-dose cancer therapy.
Kirk, Jon; Shah, Nirav; Noll, Braxton; Stevens, Craig B; Lawler, Marshall; Mougeot, Farah B; Mougeot, Jean-Luc C
2018-08-01
Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways. OM and wound healing-associated genes were determined by text mining, and the intersection of the two gene sets was selected for gene ontology analysis using the GeneCodis program. Protein interaction network analysis was performed using STRING-db. Enriched gene sets belonging to the identified pathways were queried against the Drug-Gene Interaction database to find drug candidates for topical use in OM. Our analysis identified 447 genes common to both the "OM" and "wound healing" text mining concepts. Gene enrichment analysis yielded 20 genes representing six pathways and targetable by a total of 32 drugs which could possibly be formulated for topical application. A manual search on ClinicalTrials.gov confirmed no relevant pathway/drug candidate had been overlooked. Twenty-five of the 32 drugs can directly affect the PTGS2 (COX-2) pathway, the pathway that has been targeted in previous clinical trials with limited success. Drug discovery using in silico text mining and pathway analysis tools can facilitate the identification of existing drugs that have the potential of topical administration to improve OM treatment.
Characterization of candidate genes in inflammatory bowel disease–associated risk loci
Peloquin, Joanna M.; Sartor, R. Balfour; Newberry, Rodney D.; McGovern, Dermot P.; Yajnik, Vijay; Lira, Sergio A.
2016-01-01
GWAS have linked SNPs to risk of inflammatory bowel disease (IBD), but a systematic characterization of disease-associated genes has been lacking. Prior studies utilized microarrays that did not capture many genes encoded within risk loci or defined expression quantitative trait loci (eQTLs) using peripheral blood, which is not the target tissue in IBD. To address these gaps, we sought to characterize the expression of IBD-associated risk genes in disease-relevant tissues and in the setting of active IBD. Terminal ileal (TI) and colonic mucosal tissues were obtained from patients with Crohn’s disease or ulcerative colitis and from healthy controls. We developed a NanoString code set to profile 678 genes within IBD risk loci. A subset of patients and controls were genotyped for IBD-associated risk SNPs. Analyses included differential expression and variance analysis, weighted gene coexpression network analysis, and eQTL analysis. We identified 116 genes that discriminate between healthy TI and colon samples and uncovered patterns in variance of gene expression that highlight heterogeneity of disease. We identified 107 coexpressed gene pairs for which transcriptional regulation is either conserved or reversed in an inflammation-independent or -dependent manner. We demonstrate that on average approximately 60% of disease-associated genes are differentially expressed in inflamed tissue. Last, we identified eQTLs with either genotype-only effects on expression or an interaction effect between genotype and inflammation. Our data reinforce tissue specificity of expression in disease-associated candidate genes, highlight genes and gene pairs that are regulated in disease-relevant tissue and inflammation, and provide a foundation to advance the understanding of IBD pathogenesis. PMID:27668286
Klein, Ronald; Li, Xiaohui; Kuo, Jane Z; Klein, Barbara E K; Cotch, Mary Frances; Wong, Tien Y; Taylor, Kent D; Rotter, Jerome I
2013-11-01
To describe the relationships of selected candidate genes to the prevalence of early age-related macular degeneration (AMD) in a cohort of whites, blacks, Hispanics, and Chinese Americans. Cross-sectional study. Setting: Multicenter study. Study population: A total of 2456 persons aged 45-84 years with genotype information and fundus photographs. Procedures: Twelve of 2862 single nucleotide polymorphisms (SNPs) from 11 of 233 candidate genes for cardiovascular disease were selected for analysis based on screening with marginal unadjusted P value <.001 within 1 or more racial/ethnic groups. Logistic regression models tested for association in case-control samples. Main outcome measure: Prevalence of early AMD. Early AMD was present in 4.0% of the cohort and varied from 2.4% in blacks to 6.0% in whites. The odds ratio increased from 2.3 for 1 to 10.0 for 4 risk alleles in a joint effect analysis of Age-Related Maculopathy Susceptibility 2 rs10490924 and Complement Factor H Y402H (P for trend = 4.2×10(-7)). Frequencies of each SNP varied among the racial/ethnic groups. Adjusting for age and other factors, few statistically significant associations of the 12 SNPs with AMD were consistent across all groups. In a multivariate model, most candidate genes did not attenuate the comparatively higher odds of AMD in whites. The higher frequency of risk alleles for several SNPs in Chinese Americans may partially explain their AMD frequency's approaching that of whites. The relationships of 11 candidate genes to early AMD varied among 4 racial/ethnic groups, and partially explained the observed variations in early AMD prevalence among them. Copyright © 2013 Elsevier Inc. All rights reserved.
Chen, Minhui; Wang, Jiying; Wang, Yanping; Wu, Ying; Fu, Jinluan; Liu, Jian-Feng
2018-05-18
Genome-wide scans for signatures of positive selection have so far been carried out mainly in commercial breeds, whereas few studies have focused on selection footprints in indigenous breeds. The Laiwu pig is an invaluable Chinese indigenous pig breed with an extremely high proportion of intramuscular fat (IMF), and an excellent model in which to detect footprints resulting from natural and artificial selection for fat deposition in muscle. In this study, based on GeneSeek Genomic Profiler Porcine HD data, three complementary methods, FST, iHS (integrated haplotype homozygosity score) and CLR (composite likelihood ratio), were implemented to detect selection signatures across the whole genome of Laiwu pigs. In total, 175 candidate selected regions were detected by at least two of the three methods; they covered 43.75 Mb and corresponded to 1.79% of the genome sequence. Gene annotation of the selected regions revealed a list of functionally important genes for feed intake and fat deposition, reproduction, and immune response. In particular, in accordance with the phenotypic features of Laiwu pigs, among the candidate genes we identified several genes, NPY1R, NPY5R, PIK3R1 and JAKMIP1, involved in the actions of two sets of neurons that are central regulators in maintaining the balance between food intake and energy expenditure. Our results identified a number of regions showing signatures of selection, as well as a list of functional candidate genes with potential effects on phenotypic traits, especially fat deposition in muscle. Our findings provide insights into the mechanisms of artificial selection for fat deposition and will facilitate follow-up functional studies.
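The abstract above keeps only regions supported by at least two of the three methods. A minimal sketch of that consensus step follows, assuming each method has already produced a set of flagged window identifiers. The function name, the window labels and the `min_methods` parameter are hypothetical and are not taken from the study.

```python
from collections import Counter


def consensus_regions(*method_hits, min_methods=2):
    """Return window IDs flagged by at least `min_methods` of the methods."""
    counts = Counter()
    for hits in method_hits:
        # set() ensures a method contributes at most one vote per window
        counts.update(set(hits))
    return sorted(w for w, n in counts.items() if n >= min_methods)


# Made-up window labels, for illustration only
fst_hits = {"chr1:10", "chr2:5", "chr7:3"}
ihs_hits = {"chr2:5", "chr7:3", "chr9:1"}
clr_hits = {"chr7:3", "chr12:8"}

print(consensus_regions(fst_hits, ihs_hits, clr_hits))
```

Requiring agreement between methods trades sensitivity for specificity; the threshold of two is the one stated in the abstract, while how windows were defined and merged is not described there.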
McArt, Darragh G.; Dunne, Philip D.; Blayney, Jaine K.; Salto-Tellez, Manuel; Van Schaeybroeck, Sandra; Hamilton, Peter W.; Zhang, Shu-Dong
2013-01-01
The advent of next generation sequencing technologies (NGS) has expanded the area of genomic research, offering high coverage and increased sensitivity over older microarray platforms. Although the current cost of next generation sequencing is still exceeding that of microarray approaches, the rapid advances in NGS will likely make it the platform of choice for future research in differential gene expression. Connectivity mapping is a procedure for examining the connections among diseases, genes and drugs by differential gene expression initially based on microarray technology, with which a large collection of compound-induced reference gene expression profiles have been accumulated. In this work, we aim to test the feasibility of incorporating NGS RNA-Seq data into the current connectivity mapping framework by utilizing the microarray based reference profiles and the construction of a differentially expressed gene signature from a NGS dataset. This would allow for the establishment of connections between the NGS gene signature and those microarray reference profiles, alleviating the associated incurring cost of re-creating drug profiles with NGS technology. We examined the connectivity mapping approach on a publicly available NGS dataset with androgen stimulation of LNCaP cells in order to extract candidate compounds that could inhibit the proliferative phenotype of LNCaP cells and to elucidate their potential in a laboratory setting. In addition, we also analyzed an independent microarray dataset of similar experimental settings. We found a high level of concordance between the top compounds identified using the gene signatures from the two datasets. The nicotine derivative cotinine was returned as the top candidate among the overlapping compounds with potential to suppress this proliferative phenotype. Subsequent lab experiments validated this connectivity mapping hit, showing that cotinine inhibits cell proliferation in an androgen dependent manner. 
Thus the results in this study suggest a promising prospect of integrating NGS data with connectivity mapping. PMID:23840550
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liang, Ying; Gao, Yajun; Jones, Alan M.
2017-06-13
The three-member family of Arabidopsis extra-large G proteins (XLG1-3) defines the prototype of an atypical Gα subunit in the heterotrimeric G protein complex. Some recent evidence indicates that XLG subunits operate along with the Gβγ dimer in root morphology, stress responsiveness, and cytokinin-induced development; however, the downstream targets of activated XLG proteins in the stress pathways are largely unknown. In order to assemble a set of candidate XLG-targeted proteins, a yeast two-hybrid complementation-based screen was performed using XLG protein baits to query interactions between XLG and partner proteins found in glucose-treated seedlings, roots, and Arabidopsis cells in culture. Seventy-two interactors were identified and >60% of a test set displayed in vivo interaction with XLG proteins. Gene co-expression analysis shows that >70% of the interactors are positively correlated with the corresponding XLG partners. Gene Ontology enrichment for all the candidates indicates stress responses and posits a molecular mechanism involving a specific set of transcription factor partners to XLG. Genes encoding two of these transcription factors, SZF1 and 2, require XLG proteins for full NaCl-induced expression. Furthermore, the subcellular localization of the XLG proteins in the nucleus, endosome, and plasma membrane is dependent on the specific interacting partner.
Quaggiotti, Silvia; Barcaccia, Gianni; Schiavon, Michela; Nicolé, Silvia; Galla, Giulio; Rossignolo, Virginia; Soattin, Marica; Malagoli, Mario
2007-11-01
In this research a differential display based on the detection of cDNA-AFLP markers was used to identify candidate genes potentially involved in the regulation of the response to chromium in four different willow species (Salix alba, Salix eleagnos, Salix fragilis and Salix matsudana) chosen on the basis of their suitability in phytoremediation techniques. Our approach enabled the assay of a large set of mRNA-related fragments and increased the reliability of amplification-based transcriptome analysis. The vast majority of transcript-derived fragments were shared among samples within species and thus attributable to constitutively expressed genes. However, a number of differentially expressed mRNAs were scored in each species and a total of 68 transcripts displaying an altered expression in response to Cr were isolated and sequenced. Public database querying revealed that 44.1% and 4.4% of the cloned ESTs score significant similarity with genes encoding proteins having known or putative function, or with genes coding for unknown proteins, respectively, whereas the remaining 51.5% did not retrieve any homology. Semi-quantitative RT-PCR analysis of seven candidate genes fully confirmed the expression patterns obtained by cDNA-AFLP. Our results indicate the existence of common mechanisms of gene regulation in response to Cr, pathogen attack and senescence-mediated programmed cell death, and suggest a role for the genes isolated in the cross-talk of the signaling pathways governing the adaptation to biotic and abiotic stresses.
Leung, Kim Hung; Yiu, Wai Chi; Yap, Maurice K H; Ng, Po Wah; Fung, Wai Yan; Sham, Pak Chung; Yip, Shea Ping
2011-06-01
This study examined the relationship between high myopia and three myopia candidate genes--matrix metalloproteinase 2 (MMP2) and tissue inhibitor of metalloproteinase-2 and -3 (TIMP2 and TIMP3)--involved in scleral remodeling. Recruited for the study were unrelated adult Han Chinese who were high myopes (spherical equivalent, ≤ -6.0 D in both eyes; cases) and emmetropes (within ±1.0 D in both eyes; controls). Sample set 1 had 300 cases and 300 controls, and sample set 2 had 356 cases and 354 controls. Forty-nine tag single-nucleotide polymorphisms (SNPs) were selected from these candidate genes. The first stage was an initial screen of six case pools and six control pools constructed from sample set 1, each pool consisting of 50 distinct subjects of the same affection status. In the second stage, positive SNPs from the first stage were confirmed by genotyping individual samples forming the DNA pools. In the third stage, positive SNPs from stage 2 were replicated, with sample set 2 genotyped individually. Of the 49 SNPs screened by DNA pooling, three passed the lenient threshold of P < 0.10 (nested ANOVA) and were followed up by individual genotyping. Of the three SNPs genotyped, two TIMP3 SNPs were found to be significantly associated with high myopia by single-marker or haplotype analysis. However, the initial positive results could not be replicated by sample set 2. MMP2, TIMP2, and TIMP3 genes were not associated with high myopia in this Chinese sample and hence are unlikely to play a major role in the genetic susceptibility to high myopia.
de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome
2016-08-01
Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment-wide significance (corrected p<0.05), highly ranked gene sets reaching suggestive significance included those for the dopamine receptor antagonists metoclopramide and trifluoperazine and the tyrosine kinase inhibitor neratinib. This is a proof-of-principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.
Schulz, Tizian; Stoye, Jens; Doerr, Daniel
2018-05-08
Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Xiong, Dong-Hai; Shen, Hui; Zhao, Lan-Juan; Xiao, Peng; Yang, Tie-Lin; Guo, Yan; Wang, Wei; Guo, Yan-Fang; Liu, Yong-Jun; Recker, Robert R; Deng, Hong-Wen
2007-01-01
Many “novel” osteoporosis candidate genes have been proposed in recent years. To advance our knowledge of their roles in osteoporosis, we screened 20 such genes using a set of high-density SNPs in a large family-based study. Our efforts led to the prioritization of those osteoporosis genes and the detection of gene–gene interactions. Introduction We performed large-scale family-based association analyses of 20 novel osteoporosis candidate genes using 277 single nucleotide polymorphisms (SNPs) for the quantitative trait BMD variation and the qualitative trait osteoporosis (OP) at three clinically important skeletal sites: spine, hip, and ultradistal radius (UD). Materials and Methods One thousand eight hundred seventy-three subjects from 405 white nuclear families were genotyped and analyzed with an average density of one SNP per 4 kb across the 20 genes. We conducted association analyses by SNP- and haplotype-based family-based association test (FBAT) and performed gene–gene interaction analyses using multianalytic approaches such as multifactor-dimensionality reduction (MDR) and conditional logistic regression. Results and Conclusions We detected four genes (DBP, LRP5, CYP17, and RANK) that showed highly suggestive associations (10,000-permutation derived empirical global p ≤ 0.01) with spine BMD/OP; four genes (CYP19, RANK, RANKL, and CYP17) highly suggestive for hip BMD/OP; and four genes (CYP19, BMP2, RANK, and TNFR2) highly suggestive for UD BMD/OP. The associations between BMP2 with UD BMD and those between RANK with OP at the spine, hip, and UD also met the experiment-wide stringent criterion (empirical global p ≤ 0.0007). Sex-stratified analyses further showed that some of the significant associations in the total sample were driven by either male or female subjects. In addition, we identified and validated a two-locus gene–gene interaction model involving GCR and ESR2, for which prior biological evidence exists. 
Our results suggested the prioritization of osteoporosis candidate genes from among the many proposed in recent years and revealed the significant gene–gene interaction effects influencing osteoporosis risk. PMID:17002564
Unsupervised text mining for assessing and augmenting GWAS results.
Ailem, Melissa; Role, François; Nadif, Mohamed; Demenais, Florence
2016-04-01
Text mining can assist in the analysis and interpretation of large-scale biomedical data, helping biologists to quickly and cheaply gain confirmation of hypothesized relationships between biological entities. We set this question in the context of genome-wide association studies (GWAS), an actively emerging field that has contributed to the identification of many genes associated with multifactorial diseases. These studies make it possible to identify groups of genes associated with the same phenotype, but provide no information about the relationships between these genes. Therefore, our objective is to leverage unsupervised text mining techniques using text-based cosine similarity comparisons and clustering applied to candidate and random gene vectors, in order to augment the GWAS results. We propose a generic framework which we used to characterize the relationships between 10 genes reported to be associated with asthma by a previous GWAS. The results of this experiment showed that the similarities between these 10 genes were significantly stronger than would be expected by chance (one-sided p-value<0.01). The clustering of observed and randomly selected genes also made it possible to generate hypotheses about potential functional relationships between these genes and thus contributed to the discovery of new candidate genes for asthma. Copyright © 2016 Elsevier Inc. All rights reserved.
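The comparison of candidate genes against random gene sets described in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: the sparse term-weight dictionaries, the function names (`cosine`, `mean_pairwise_similarity`, `empirical_p_value`) and the permutation scheme are all assumptions chosen for illustration. The sketch computes the mean pairwise cosine similarity of a candidate set and estimates a one-sided empirical p-value from randomly drawn gene sets of the same size.

```python
import math
import random
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts term -> weight)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(genes, vectors):
    """Average cosine similarity over all unordered pairs of the given genes."""
    pairs = list(combinations(genes, 2))
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


def empirical_p_value(candidates, vectors, n_draws=1000, seed=0):
    """One-sided empirical p-value: how often a random gene set of the same
    size is at least as internally similar as the candidate set.
    Random sets are drawn from all genes in `vectors` (an assumption)."""
    rng = random.Random(seed)
    observed = mean_pairwise_similarity(candidates, vectors)
    pool = sorted(vectors)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(pool, len(candidates))
        if mean_pairwise_similarity(sample, vectors) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_draws + 1)
```

In practice the vectors would come from a text representation of the literature on each gene (for example TF-IDF weights); how those vectors are built is outside the scope of this sketch.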
Identification of Reference Genes for Normalizing Quantitative Real-Time PCR in Urechis unicinctus
NASA Astrophysics Data System (ADS)
Bai, Yajiao; Zhou, Di; Wei, Maokai; Xie, Yueyang; Gao, Beibei; Qin, Zhenkui; Zhang, Zhifeng
2018-06-01
Reverse transcription quantitative real-time PCR (RT-qPCR) has become one of the most important techniques for studying gene expression. A set of valid reference genes is essential for the accurate normalization of data. In this study, five candidate genes were analyzed with the geNorm, NormFinder, BestKeeper and ΔCt methods to identify the genes stably expressed in the echiuran Urechis unicinctus, an important commercial marine benthic worm, under abiotic (sulfide stress) and normal (adult tissues, embryos and larvae at different developmental stages) conditions. The comprehensive results indicated that the expression of TBP was the most stable under sulfide stress and during development, while the expression of EF-1-α was the most stable under sulfide stress and in various tissues. TBP and EF-1-α were recommended as a suitable reference gene combination to accurately normalize the expression of target genes under sulfide stress, and EF-1-α, TBP and TUB were considered a potential reference gene combination for normalizing the expression of target genes in different tissues. No suitable gene combination was obtained among these five candidate genes for normalizing the expression of target genes during the development of U. unicinctus. Our results provide valuable support for quantifying gene expression using RT-qPCR in U. unicinctus.
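As a rough illustration of how a pairwise stability measure in the spirit of geNorm ranks candidate reference genes, the sketch below computes, for each gene, the average standard deviation of its log2 expression ratios against every other candidate; lower values indicate more stable expression. This is a simplified reading of the idea, not the geNorm software itself. The function name `stability_m`, the use of the sample standard deviation, and the gene labels and quantities in the usage note are hypothetical.

```python
import math
import statistics


def stability_m(expression):
    """Pairwise stability measure for candidate reference genes.

    expression: dict gene -> list of relative quantities, with the same
    sample order for every gene. Returns dict gene -> M (lower = more stable).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        variations = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in each sample.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            # Spread of the ratio across samples (sample SD is an assumption).
            variations.append(statistics.stdev(log_ratios))
        m_values[j] = sum(variations) / len(variations)
    return m_values
```

For example, with made-up quantities `{"REF_A": [1, 2, 4, 8], "REF_B": [2, 4, 8, 16], "REF_C": [1, 8, 2, 16]}`, the first two genes keep a constant ratio to each other and therefore receive a lower M than the third.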
Artico, Sinara; Nardeli, Sarah M; Brilhante, Osmundo; Grossi-de-Sa, Maria Fátima; Alves-Ferreira, Marcio
2010-03-21
Normalizing through reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out in plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes. Using two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1α5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhβTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs were examined. GhACT4 and GhUBQ14 showed more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. 
As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene expression measures in different cotton plant organs; GhACT4 and GhUBQ14 for flower development, GhACT4 and GhFBX6 for the floral organs and GhMZA and GhPTB for fruit development. We also provide the primer sequences whose performance in qPCR experiments is demonstrated. These genes will enable more accurate and reliable normalization of qPCR results for gene expression studies in this important crop, the major source of natural fiber and also an important source of edible oil. The use of bona fide reference genes allowed a detailed and accurate characterization of the temporal and spatial expression pattern of two MADS-box genes in cotton. PMID:20302670
Srivastava, Mousami; Khurana, Pankaj; Sugadev, Ragumani
2012-11-02
The tissue-specific Unigene Sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in-silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used the 'Gene Ontology semantic similarity score' to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. Hierarchical clustering of this semantic similarity score matrix is represented in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping. Multiple bootstrapping also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panel 1-4). Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia regulated biomarkers (panel 3) and lung extracellular matrix biomarkers (panel 4). Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1/SAG, AIB1 and AZIN1) are significantly down regulated. 
All down regulated genes in this panel were highly up regulated in most other types of cancers. These panels of proteins may represent signature biomarkers for lung cancer and will aid in lung cancer diagnosis and disease monitoring as well as in the prediction of responses to therapeutics.
Naaijen, J; Bralten, J; Poelmans, G; Glennon, J C; Franke, B; Buitelaar, J K
2017-01-10
Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are the main excitatory and inhibitory neurotransmitters in the brain; their balance is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analyses for ADHD symptom severity (divided into inattention and hyperactivity/impulsivity symptoms), autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome-wide association levels. The GABA gene set showed nominally significant association with inhibition (P=0.04), but this did not survive correction for multiple comparisons. None of the single-gene or single-variant associations was significant on its own. By analyzing multiple genetic variants within candidate gene sets together, we were able to find genetic associations supporting the involvement of excitatory and inhibitory neurotransmitter systems in ADHD and ASD symptom severity in ADHD.
Chen, Chengjie; Zhang, Yafeng; Xu, Zhiqiang; Luan, Aiping; Mao, Qi; Feng, Junting; Xie, Tao; Gong, Xue; Wang, Xiaoshuang; Chen, Hao; He, Yehua
2016-01-01
The pineapple (Ananas comosus) is cold sensitive. Most cultivars are injured during winter periods, especially in sub-tropical regions. There is a lack of molecular information on the pineapple’s response to cold stress. In this study, high-throughput transcriptome sequencing and gene expression analysis were performed on plantlets of a cold-tolerant genotype of the pineapple cultivar ‘Shenwan’ before and after cold treatment. A total of 1,186 candidate cold responsive genes were identified, and their credibility was confirmed by RT-qPCR. Gene set functional enrichment analysis indicated that genes related to cell wall properties, stomatal closure and ABA and ROS signal transduction play important roles in pineapple cold tolerance. In addition, a protein association network of CORs (cold responsive genes) was predicted, which could serve as an entry point to dissect the complex cold response network. Our study found a series of candidate genes and their association network, which will be helpful to cold stress response studies and pineapple breeding for cold tolerance. PMID:27656892
Ding, Fangrui; Tan, Aidi; Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie
2016-01-01
Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were matched with existing diabetic nephropathy (DN) transcriptomic data. In the PAN-induced nephropathy model, 15 of the 21 candidates were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. 
By providing a novel interaction network and prediction, GCNet contributes to improving the understanding of normal glomerular function and will be useful for detecting target cytoskeleton molecules of interest that may be involved in glomerular diseases in future studies. PMID:27227331
Watanabe, Yoshiyuki; Kim, Hyun Soo; Castoro, Ryan J; Chung, Woonbok; Estecio, Marcos R H; Kondo, Kimie; Guo, Yi; Ahmed, Saira S; Toyota, Minoru; Itoh, Fumio; Suk, Ki Tae; Cho, Mee-Yon; Shen, Lanlan; Jelinek, Jaroslav; Issa, Jean-Pierre J
2009-06-01
Aberrant DNA methylation is an early and frequent process in gastric carcinogenesis and could be useful for detection of gastric neoplasia. We hypothesized that methylation analysis of DNA recovered from gastric washes could be used to detect gastric cancer. We studied 51 candidate genes in 7 gastric cancer cell lines and 24 samples (training set) and identified 6 for further studies. We examined the methylation status of these genes in a test set consisting of 131 gastric neoplasias at various stages. Finally, we validated the 6 candidate genes in a different population of 40 primary gastric cancer samples and 113 nonneoplastic gastric mucosa samples. Six genes (MINT25, RORA, GDNF, ADAM23, PRDM5, MLF1) showed frequent differential methylation between gastric cancer and normal mucosa in the training, test, and validation sets. GDNF and MINT25 were most sensitive molecular markers of early stage gastric cancer, whereas PRDM5 and MLF1 were markers of a field defect. There was a close correlation (r = 0.5-0.9, P = .03-.001) between methylation levels in tumor biopsy and gastric washes. MINT25 methylation had the best sensitivity (90%), specificity (96%), and area under the receiver operating characteristic curve (0.961) in terms of tumor detection in gastric washes. These findings suggest MINT25 is a sensitive and specific marker for screening in gastric cancer. Additionally, we have developed a new method for gastric cancer detection by DNA methylation in gastric washes.
Detecting Horizontal Gene Transfer between Closely Related Taxa
Adato, Orit; Ninyo, Noga; Gophna, Uri; Snir, Sagi
2015-01-01
Horizontal gene transfer (HGT), the transfer of genetic material between organisms, is crucial for genetic innovation and the evolution of genome architecture. Existing HGT detection algorithms rely on a strong phylogenetic signal distinguishing the transferred sequence from ancestral (vertically derived) genes in its recipient genome. Detecting HGT between closely related species or strains is challenging, as the phylogenetic signal is usually weak and the nucleotide composition is normally nearly identical. Nevertheless, detecting HGT between congeneric species or strains is of great importance, especially in clinical microbiology, where understanding the emergence of new virulent and drug-resistant strains is crucial, and often time-sensitive. We developed a novel, self-contained technique named Near HGT, based on the synteny index, to measure the divergence of a gene from its native genomic environment and used it to identify candidate HGT events between closely related strains. The method confirms candidate transferred genes based on the constant relative mutability (CRM). Using CRM, the algorithm assigns a confidence score based on “unusual” sequence divergence. A gene exhibiting exceptional deviations according to both synteny and mutability criteria is considered a validated HGT product. We first applied the technique to a set of three E. coli strains and detected several highly probable horizontally acquired genes. We then compared the method to existing HGT detection tools using a larger strain data set. When combined with additional approaches, our new algorithm provides a richer picture and brings us closer to the goal of detecting all newly acquired genes in a particular strain. PMID:26439115
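The idea of measuring how far a gene has diverged from its native genomic environment can be sketched with a simple neighborhood-overlap score. The code below is an illustrative reading of a synteny index, not the Near HGT implementation: it assumes each genome is given as a circular, ordered list of ortholog identifiers, and it counts how many of the k genes on either side of a gene are shared between two genomes. The function names, the circularity assumption and the toy genomes in the usage note are hypothetical.

```python
def neighborhood(genome, gene, k):
    """Set of genes within k positions of `gene` on a circular genome.

    genome: ordered list of unique gene (ortholog) identifiers.
    Assumes len(genome) > 2 * k so the window does not wrap onto the gene itself.
    """
    n = len(genome)
    i = genome.index(gene)
    return {genome[(i + d) % n] for d in range(-k, k + 1) if d != 0}


def synteny_index(genome_a, genome_b, gene, k):
    """Number of neighbors of `gene` shared between the two genomes (0..2k).

    A low value suggests the gene sits in a different genomic context in
    the two strains, which makes it a candidate for further inspection.
    """
    return len(neighborhood(genome_a, gene, k) & neighborhood(genome_b, gene, k))
```

With `genome_a = ["g0", ..., "g9"]` in order and `genome_b` identical except that `g5` has been moved next to `g0`, a gene far from the rearrangement keeps the maximal score of 2k, whereas the relocated gene shares no neighbors. A low score alone does not establish transfer; the abstract describes a second, mutability-based criterion for that.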
Abruzzo, Lynne V; Barron, Lynn L; Anderson, Keith; Newman, Rachel J; Wierda, William G; O'brien, Susan; Ferrajoli, Alessandra; Luthra, Madan; Talwalkar, Sameer; Luthra, Rajyalakshmi; Jones, Dan; Keating, Michael J; Coombes, Kevin R
2007-09-01
To develop a model incorporating relevant prognostic biomarkers for untreated chronic lymphocytic leukemia patients, we re-analyzed the raw data from four published gene expression profiling studies. We selected 88 candidate biomarkers linked to immunoglobulin heavy-chain variable region gene (IgV(H)) mutation status and produced a reliable and reproducible microfluidics quantitative real-time polymerase chain reaction array. We applied this array to a training set of 29 purified samples from previously untreated patients. In an unsupervised analysis, the samples clustered into two groups. Using a cutoff point of 2% homology to the germline IgV(H) sequence, one group contained all 14 IgV(H)-unmutated samples; the other contained all 15 mutated samples. We confirmed the differential expression of 37 of the candidate biomarkers using two-sample t-tests. Next, we constructed 16 different models to predict IgV(H) mutation status and evaluated their performance on an independent test set of 20 new samples. Nine models correctly classified 11 of 11 IgV(H)-mutated cases and eight of nine IgV(H)-unmutated cases, with some models using three to seven genes. Thus, we can classify cases with 95% accuracy based on the expression of as few as three genes.
Identification of Differentially Expressed Genes in Blood Cells of Narcolepsy Patients
Tanaka, Susumu; Honda, Yutaka; Honda, Makoto
2007-01-01
Study Objective: A close association between the human leukocyte antigen (HLA)-DRB1*1501/DQB1*0602 and abnormalities in some inflammatory cytokines have been demonstrated in narcolepsy. Specific alterations in the immune system have been suggested to occur in this disorder. We attempted to identify alterations in gene expression underlying the abnormalities in the blood cells of narcoleptic patients. Designs: Total RNA from 12 narcolepsy-cataplexy patients and from 12 age- and sex-matched healthy controls were pooled. The pooled samples were initially screened for candidate genes for narcolepsy by differential display analysis using annealing control primers (ACP). The second screening of the samples was carried out by semiquantitative PCR using gene-specific primers. Finally, the expression levels of the candidate genes were further confirmed by quantitative real-time PCR using a new set of samples (20 narcolepsy-cataplexy patients and 20 healthy controls). Results: The second screening revealed differential expression of 4 candidate genes. Among them, MX2 was confirmed as a significantly down-regulated gene in the white blood cells of narcoleptic patients by quantitative real-time PCR. Conclusion: We found the MX2 gene to be significantly less expressed in comparison with normal subjects in the white blood cells of narcoleptic patients. This gene is relevant to the immune system. Although differential display analysis using ACP technology has a limitation in that it does not help in determining the functional mechanism underlying sleep/wakefulness dysregulation, it is useful for identifying novel genetic factors related to narcolepsy, such as HLA molecules. Further studies are required to explore the functional relationship between the MX2 gene and narcolepsy pathophysiology. Citation: Tanaka S; Honda Y; Honda M. Identification of differentially expressed genes in blood cells of narcolepsy patients. SLEEP 2007;30(8):974-979. PMID:17702266
Roy, Janine; Aust, Daniela; Knösel, Thomas; Rümmele, Petra; Jahnke, Beatrix; Hentrich, Vera; Rückert, Felix; Niedergethmann, Marco; Weichert, Wilko; Bahra, Marcus; Schlitt, Hans J.; Settmacher, Utz; Friess, Helmut; Büchler, Markus; Saeger, Hans-Detlev; Schroeder, Michael; Pilarsky, Christian; Grützmann, Robert
2012-01-01
Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice. PMID:22615549
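A PageRank-style ranking that combines a per-gene relevance score with network connectivity can be sketched as follows. This is a generic personalized-PageRank iteration written for illustration, not the authors' algorithm: the update rule, the `damping` value, the function name `network_rank` and the toy network in the usage note are assumptions. Each gene's rank is a mix of its own normalized score (for example, the absolute correlation of its expression with outcome) and rank flowing in from its network neighbors.

```python
def network_rank(adjacency, scores, damping=0.5, n_iter=100):
    """Rank genes by combining their own scores with those of their neighbors.

    adjacency: dict gene -> set of neighboring genes (undirected network in
               which every gene has at least one neighbor).
    scores:    dict gene -> non-negative relevance score, not all zero.
    Returns dict gene -> rank; ranks sum to 1 under the assumptions above.
    """
    genes = sorted(adjacency)
    total = sum(scores[g] for g in genes)
    prior = {g: scores[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(n_iter):
        new_rank = {}
        for g in genes:
            # Each neighbor h passes on an equal share of its current rank.
            incoming = sum(rank[h] / len(adjacency[h]) for h in adjacency[g])
            new_rank[g] = (1 - damping) * prior[g] + damping * incoming
        rank = new_rank
    return rank
```

For a toy star network with hub `H` linked to leaves `A`, `B` and `C`, all with equal scores, the hub ends up ranked above each leaf because it collects rank from three neighbors. With `damping=0` the ranking reduces to the normalized scores alone, which corresponds to ranking by expression evidence without network information.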
Evaluation of RNA from human trabecular bone and identification of stable reference genes.
Cepollaro, Simona; Della Bella, Elena; de Biase, Dario; Visani, Michela; Fini, Milena
2018-06-01
The isolation of good quality RNA from tissues is an essential prerequisite for gene expression analysis to study pathophysiological processes. This study evaluated the RNA isolated from human trabecular bone and defined a set of stable reference genes. After pulverization, RNA was extracted with a phenol/chloroform method and then purified using silica columns. The A260/280 ratio, A260/230 ratio, RIN, and ribosomal ratio were measured to evaluate RNA quality and integrity. Moreover, the expression of six candidates was analyzed by qPCR and different algorithms were applied to assess reference gene stability. A good purity and quality of RNA was achieved according to A260/280 and A260/230 ratios, and RIN values. TBP, YWHAZ, and PGK1 were the most stable reference genes that should be used for gene expression analysis. In summary, the method proposed is suitable for gene expression evaluation in human bone and a set of reliable reference genes has been identified. © 2017 Wiley Periodicals, Inc.
2009-01-01
Background The majority of the genes even in well-studied multi-cellular model organisms have not been functionally characterized yet. Mining the numerous genome wide data sets related to protein function to retrieve potential candidate genes for a particular biological process remains a challenge. Description GExplore has been developed to provide a user-friendly database interface for data mining at the gene expression/protein function level to help in hypothesis development and experiment design. It supports combinatorial searches for proteins with certain domains, tissue- or developmental stage-specific expression patterns, and mutant phenotypes. GExplore operates on a stand-alone database and has fast response times, which is essential for exploratory searches. The interface is not only user-friendly, but also modular so that it accommodates additional data sets in the future. Conclusion GExplore is an online database for quick mining of data related to gene and protein function, providing a multi-gene display of data sets related to the domain composition of proteins as well as expression and phenotype data. GExplore is publicly available at: http://genome.sfu.ca/gexplore/ PMID:19917126
A complete collection of single-gene deletion mutants of Acinetobacter baylyi ADP1
de Berardinis, Véronique; Vallenet, David; Castelli, Vanina; Besnard, Marielle; Pinet, Agnès; Cruaud, Corinne; Samair, Sumitta; Lechaplais, Christophe; Gyapay, Gabor; Richez, Céline; Durot, Maxime; Kreimeyer, Annett; Le Fèvre, François; Schächter, Vincent; Pezo, Valérie; Döring, Volker; Scarpelli, Claude; Médigue, Claudine; Cohen, Georges N; Marlière, Philippe; Salanoubat, Marcel; Weissenbach, Jean
2008-01-01
We have constructed a collection of single-gene deletion mutants for all dispensable genes of the soil bacterium Acinetobacter baylyi ADP1. A total of 2594 deletion mutants were obtained, whereas 499 (16%) were not, and are therefore candidate essential genes for life on minimal medium. This essentiality data set is 88% consistent with the Escherichia coli data set inferred from the Keio mutant collection profiled for growth on minimal medium, while 80% of the orthologous genes described as essential in Pseudomonas aeruginosa are also essential in ADP1. Several strategies were undertaken to investigate ADP1 metabolism by (1) searching for discrepancies between our essentiality data and current metabolic knowledge, (2) comparing this essentiality data set to those from other organisms, (3) systematic phenotyping of the mutant collection on a variety of carbon sources (quinate, 2-3 butanediol, glucose, etc.). This collection provides a new resource for the study of gene function by forward and reverse genetic approaches and constitutes a robust experimental data source for systems biology approaches. PMID:18319726
Findeisen, Peter; Röckel, Matthias; Nees, Matthias; Röder, Christian; Kienle, Peter; Von Knebel Doeberitz, Magnus; Kalthoff, Holger; Neumaier, Michael
2008-11-01
The presence of tumor cells in peripheral blood is increasingly regarded as a clinically relevant prognostic factor for colorectal cancer patients. Current molecular methods are very sensitive, but their diagnostic value is limited by low specificity. This study was undertaken to systematically identify and validate new colorectal cancer (CRC) marker genes for improved detection of minimal residual disease in peripheral blood mononuclear cells of colorectal cancer patients. Marker genes with upregulated expression in colorectal cancer tissue and cell lines were identified using microarray experiments and publicly available gene expression data. A systematic iterative approach was used to reduce a set of 346 candidate genes, reportedly associated with CRC, to a selection of candidate genes that were then further validated by relative quantitative real-time RT-PCR. Analytical sensitivity of the RT-PCR assays was determined by spiking experiments with CRC cells. Diagnostic sensitivity and specificity were tested in a group of 18 CRC patients compared with 12 control individuals without malignant disease. Of the 346 screened genes, only serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 5 (SERPINB5) showed significantly elevated transcript levels in peripheral venous blood specimens of tumor patients when compared to the nonmalignant control group. These results were confirmed by analysis of an enlarged collective consisting of 63 CRC patients and 36 control individuals without malignant disease. In conclusion, SERPINB5 seems to be a promising marker for detection of circulating tumor cells in peripheral blood of colorectal cancer patients.
Liang, Junjun; Chen, Xin; Deng, Guangbing; Pan, Zhifen; Zhang, Haili; Li, Qiao; Yang, Kaijun; Long, Hai; Yu, Maoqun
2017-10-11
The harsh environment on the Qinghai-Tibetan Plateau gives Tibetan hulless barley (Hordeum vulgare var. nudum) great ability to resist adversities such as drought, salinity, and low temperature, and makes it a good subject for the analysis of drought tolerance mechanisms. To elucidate the specific gene networks and pathways that contribute to its drought tolerance, and to identify new candidate genes for breeding purposes, we performed a transcriptomic analysis using two accessions of Tibetan hulless barley, namely Z772 (drought-tolerant) and Z013 (drought-sensitive). There were more up-regulated genes in Z772 than in Z013 under both mild (5439-VS-2604) and severe (7203-VS-3359) dehydration treatments. Under mild dehydration stress, the pathways exclusively enriched in the drought-tolerant genotype Z772 included Protein processing in endoplasmic reticulum, tricarboxylic acid (TCA) cycle, Wax biosynthesis, and Spliceosome. Under severe dehydration stress, the pathways that were mainly enriched in Z772 included Carbon fixation in photosynthetic organisms, Pyruvate metabolism, and Porphyrin and chlorophyll metabolism. The main differentially expressed genes (DEGs) in response to dehydration stress, and the genes whose expression differed between tolerant and sensitive genotypes, are presented in this study. The candidate genes for drought tolerance were selected based on their expression patterns. The RNA-Seq data obtained in this study provide an initial overview of global gene expression patterns and networks related to dehydration shock in Tibetan hulless barley. Furthermore, these data provide pathways and a targeted set of candidate genes that might be essential for in-depth analysis of the molecular mechanisms of plant tolerance to drought stress.
Verslues, Paul E.; Lasky, Jesse R.; Juenger, Thomas E.; Liu, Tzu-Wen; Kumar, M. Nagaraj
2014-01-01
Arabidopsis (Arabidopsis thaliana) exhibits natural genetic variation in drought response, including varying levels of proline (Pro) accumulation under low water potential. As Pro accumulation is potentially important for stress tolerance and cellular redox control, we conducted a genome-wide association (GWAS) study of low water potential-induced Pro accumulation using a panel of natural accessions and publicly available single-nucleotide polymorphism (SNP) data sets. Candidate genomic regions were prioritized for subsequent study using metrics considering both the strength and spatial clustering of the association signal. These analyses found many candidate regions likely containing gene(s) influencing Pro accumulation. Reverse genetic analysis of several candidates identified new Pro effector genes, including thioredoxins and several genes encoding Universal Stress Protein A domain proteins. These new Pro effector genes further link Pro accumulation to cellular redox and energy status. Additional new Pro effector genes found include the mitochondrial protease LON1, ribosomal protein RPL24A, protein phosphatase 2A subunit A3, a MADS box protein, and a nucleoside triphosphate hydrolase. Several of these new Pro effector genes were from regions with multiple SNPs, each having moderate association with Pro accumulation. This pattern supports the use of summary approaches that incorporate clusters of SNP associations in addition to consideration of individual SNP probability values. Further GWAS-guided reverse genetics promises to find additional effectors of Pro accumulation. The combination of GWAS and reverse genetics to efficiently identify new effector genes may be especially applicable for traits difficult to analyze by other genetic screening methods. PMID:24218491
Di, Shengmeng; Tian, Zongcheng; Qian, Airong; Gao, Xiang; Yu, Dan; Brandi, Maria Luisa; Shang, Peng
2011-12-01
Studies of animals and humans subjected to spaceflight demonstrate that weightlessness negatively affects the mass and mechanical properties of bone tissue. Bone cells can sense and respond to gravitational unloading, and genes sensitive to gravity change are considered to play a critical role in the mechanotransduction of bone cells. To evaluate the fold-change of gene expression, appropriate reference genes should be identified because no housekeeping gene has stable expression in all experimental conditions. Consequently, the expression stability of ten candidate housekeeping genes was examined in osteoblast-like MC3T3-E1, osteocyte-like MLO-Y4, and preosteoclast-like FLG29.1 cells under different apparent gravities (μg, 1 g, and 2 g) in the high-intensity gradient magnetic field produced by a superconducting magnet. The results showed that the relative expression of these ten candidate housekeeping genes differed between bone cell types; moreover, the most suitable reference genes for the same cells in altered gravity conditions also differed from those in a strong magnetic field. This demonstrates the importance of selecting suitable reference genes for each experimental set-up. Furthermore, it provides an alternative to the traditionally accepted housekeeping genes used so far in studies of gravitational biology and magnetobiology.
Namroud, Marie-Claire; Beaulieu, Jean; Juge, Nicolas; Laroche, Jérôme; Bousquet, Jean
2008-01-01
Conifers are characterized by a large genome size and a rapid decay of linkage disequilibrium, most often within gene limits. Genome scans based on noncoding markers are less likely to detect molecular adaptation linked to genes in these species. In this study, we assessed the effectiveness of a genome-wide single nucleotide polymorphism (SNP) scan focused on expressed genes in detecting local adaptation in a conifer species. Samples were collected from six natural populations of white spruce (Picea glauca) moderately differentiated for several quantitative characters. A total of 534 SNPs representing 345 expressed genes were analysed. Genes potentially under natural selection were identified by estimating the differentiation in SNP frequencies among populations (FST) and identifying outliers, and by estimating local differentiation using a Bayesian approach. Both average expected heterozygosity and population differentiation estimates (HE = 0.270 and FST = 0.006) were comparable to those obtained with other genetic markers. Of all genes, 5.5% were identified as outliers with FST at the 95% confidence level, while 14% were identified as candidates for local adaptation with the Bayesian method. There was some overlap between the two gene sets. More than half of the candidate genes for local adaptation were specific to the warmest population, about 20% to the most arid population, and 15% to the coldest and most humid higher altitude population. These adaptive trends were consistent with the genes’ putative functions and the divergence in quantitative traits noted among the populations. The results suggest that an approach separating the locus and population effects is useful to identify genes potentially under selection. These candidates are worth exploring in more detail at the physiological and ecological levels. PMID:18662225
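As an illustration of the per-locus differentiation statistic mentioned in the abstract above, here is a minimal sketch of FST for one biallelic SNP computed from population allele frequencies. It assumes equal population weights and the simple heterozygosity-based form FST = (HT - HS) / HT; the abstract does not state which estimator the authors used, and the function name and example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """FST for one biallelic SNP from per-population allele frequencies.

    Assumes equal weights for all populations (an assumption, not taken
    from the abstract). Returns 0.0 for a monomorphic locus.
    """
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0
    # Mean expected heterozygosity within populations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in six populations
weakly_differentiated = [0.30, 0.31, 0.29, 0.30, 0.32, 0.28]
fixed_differences = [0.0, 1.0]

fst_weak = fst_biallelic(weakly_differentiated)
fst_fixed = fst_biallelic(fixed_differences)
```

An outlier scan would compute this value for every SNP and compare each against the distribution expected under neutrality; that step depends on the simulation or model chosen and is not sketched here.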
Shchetynsky, Klementy; Diaz-Gallo, Lina-Marcella; Folkersen, Lasse; Hensvold, Aase Haj; Catrina, Anca Irinel; Berg, Louise; Klareskog, Lars; Padyukov, Leonid
2017-02-02
Here we integrate verified signals from previous genetic association studies with gene expression and pathway analysis for discovery of new candidate genes and signaling networks relevant for rheumatoid arthritis (RA). RNA-sequencing-(RNA-seq)-based expression analysis of 377 genes from previously verified RA-associated loci was performed in blood cells from 5 newly diagnosed, non-treated patients with RA, 7 patients with treated RA and 12 healthy controls. Differentially expressed genes sharing a similar expression pattern in treated and untreated RA sub-groups were selected for pathway analysis. A set of "connector" genes derived from pathway analysis was tested for differential expression in the initial discovery cohort and validated in blood cells from 73 patients with RA and in 35 healthy controls. There were 11 qualifying genes selected for pathway analysis and these were grouped into two evidence-based functional networks, containing 29 and 27 additional connector molecules. The expression of genes corresponding to connector molecules was then tested in the initial RNA-seq data. Differences in the expression of ERBB2, TP53 and THOP1 were similar in both treated and non-treated patients with RA and an additional nine genes were differentially expressed in at least one group of patients compared to healthy controls. The ERBB2, TP53 and THOP1 expression profile was successfully replicated in RNA-seq data from peripheral blood mononuclear cells from healthy controls and non-treated patients with RA, in an independent collection of samples. Integration of RNA-seq data with findings from association studies, and consequent pathway analysis, implicates new candidate genes, ERBB2, TP53 and THOP1, in the pathogenesis of RA.
Feltus, F Alex
2014-06-01
Understanding the control of any trait optimally requires the detection of causal genes, gene interaction, and mechanism of action to discover and model the biochemical pathways underlying the expressed phenotype. Functional genomics techniques, including RNA expression profiling via microarray and high-throughput DNA sequencing, allow for the precise genome localization of biological information. Powerful genetic approaches, including quantitative trait locus (QTL) and genome-wide association study mapping, link phenotype with genome positions, yet genetics is less precise in localizing the relevant mechanistic information encoded in DNA. The coupling of salient functional genomic signals with genetically mapped positions is an appealing approach to discover meaningful gene-phenotype relationships. Techniques used to define this genetic-genomic convergence comprise the field of systems genetics. This short review will address an application of systems genetics where RNA profiles are associated with genetically mapped genome positions of individual genes (eQTL mapping) or as gene sets (co-expression network modules). Both approaches can be applied for knowledge independent selection of candidate genes (and possible control mechanisms) underlying complex traits where multiple, likely unlinked, genomic regions might control specific complex traits. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Yue, S J; Zhao, Y Q; Gu, X R; Yin, B; Jiang, Y L; Wang, Z H; Shi, K R
2017-12-01
A genome-wide association study (GWAS) was conducted on 15 milk production traits in Chinese Holstein. The experimental population consisted of 445 cattle, each genotyped by the GGP (GeneSeek genomic profiling)-BovineLD V3 SNP chip, which had 26 151 public SNPs in its manifest file. After data cleaning, 20 326 SNPs were retained for the GWAS. The phenotypes were estimated breeding values of traits, provided by a public dairy herd improvement program center, that had been collected once a month for 3 years. Two statistical models, a fixed-effect linear regression model and a mixed-effect linear model, were used to estimate the association effects of SNPs on each of the phenotypes. Genome-wide significant and suggestive thresholds were set at 2.46E-06 and 4.95E-05, respectively. The two statistical models concurrently identified two genome-wide significant (P < 0.05) SNPs on milk production traits in this Chinese Holstein population. The positional candidate genes, which were the ones closest to these two identified SNPs, were EEF2K (eukaryotic elongation factor 2 kinase) and KLHL1 (kelch like family member 1). These two genes could serve as new candidate genes for milk yield and lactation persistence, yet their roles need to be verified in further functional studies. © 2017 Stichting International Foundation for Animal Genetics.
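The genome-wide threshold quoted in the abstract above is numerically consistent with a Bonferroni correction of a 0.05 significance level over the 20 326 retained SNPs. The abstract does not say how either threshold was derived, so the sketch below is an assumption that happens to reproduce the first value; the suggestive threshold is not reproduced because its derivation is not stated. The function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_SNPS = 20326  # SNPs retained after data cleaning (from the abstract)

# Assumed derivation: family-wise alpha of 0.05 divided by the number of SNPs
genome_wide = bonferroni_threshold(0.05, N_SNPS)
print(f"{genome_wide:.2e}")  # prints 2.46e-06
```

Read this way, "genome-wide significant (P < 0.05)" refers to the corrected family-wise level, not to the raw per-SNP P-value.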
Eckert, Andrew J; Wegrzyn, Jill L; Pande, Barnaly; Jermstad, Kathleen D; Lee, Jennifer M; Liechty, John D; Tearse, Brandon R; Krutovsky, Konstantin V; Neale, David B
2009-09-01
Forest trees exhibit remarkable adaptations to their environments. The genetic basis for phenotypic adaptation to climatic gradients has been established through a long history of common garden, provenance, and genecological studies. The identities of genes underlying these traits, however, have remained elusive and thus so have the patterns of adaptive molecular diversity in forest tree genomes. Here, we report an analysis of diversity and divergence for a set of 121 cold-hardiness candidate genes in coastal Douglas fir (Pseudotsuga menziesii var. menziesii). Application of several different tests for neutrality, including those that incorporated demographic models, revealed signatures of selection consistent with selective sweeps at three to eight loci, depending upon the severity of a bottleneck event and the method used to detect selection. Given the high levels of recombination, these candidate genes are likely to be closely linked to the target of selection if not the genes themselves. Putative homologs in Arabidopsis act primarily to stabilize the plasma membrane and protect against denaturation of proteins at freezing temperatures. These results indicate that surveys of nucleotide diversity and divergence, when framed within the context of further association mapping experiments, will come full circle with respect to their utility in the dissection of complex phenotypic traits into their genetic components.
Using expression genetics to study the neurobiology of ethanol and alcoholism.
Farris, Sean P; Wolen, Aaron R; Miles, Michael F
2010-01-01
Recent simultaneous progress in human and animal model genetics and the advent of microarray whole genome expression profiling have produced prodigious data sets on genetic loci, potential candidate genes, and differential gene expression related to alcoholism and ethanol behaviors. Validated target genes or gene networks functioning in alcoholism are still of meager proportions. Genetical genomics, which combines genetic analysis of both traditional phenotypes and whole genome expression data, offers a potential methodology for characterizing brain gene networks functioning in alcoholism. This chapter will describe concepts, approaches, and recent findings in the field of genetical genomics as it applies to alcohol research. Copyright 2010 Elsevier Inc. All rights reserved.
Genomic Locus Modulating IOP in the BXD RI Mouse Strains
King, Rebecca; Li, Ying; Wang, Jiaxing; Struebing, Felix L.; Geisert, Eldon E.
2018-01-01
Intraocular pressure (IOP) is the primary risk factor for developing glaucoma, yet little is known about the contribution of genomic background to IOP regulation. The present study leverages an array of systems genetics tools to study genomic factors modulating normal IOP in the mouse. The BXD recombinant inbred (RI) strain set was used to identify genomic loci modulating IOP. We measured the IOP in a total of 506 eyes from 38 different strains. Strain averages were subjected to conventional quantitative trait analysis by means of composite interval mapping. Candidate genes were defined, and immunohistochemistry and quantitative PCR (qPCR) were used for validation. Of the 38 BXD strains examined, the mean IOP ranged from a low of 13.2 mmHg to a high of 17.1 mmHg. The means for each strain were used to calculate a genome-wide interval map. One significant quantitative trait locus (QTL) was found on Chr. 8 (96 to 103 Mb). Within this 7 Mb region only 4 annotated genes were found: Gm15679, Cdh8, Cdh11 and Gm8730. Only two genes (Cdh8 and Cdh11) were candidates for modulating IOP based on the presence of non-synonymous SNPs. Further examination using SIFT (Sorting Intolerant From Tolerant) analysis revealed that the SNPs in Cdh8 (Cadherin 8) were predicted not to change protein function, while the SNPs in Cdh11 (Cadherin 11) would not be tolerated, affecting protein function. Furthermore, immunohistochemistry demonstrated that CDH11 is expressed in the trabecular meshwork of the mouse. We have examined the genomic regulation of IOP in the BXD RI strain set and found one significant QTL on Chr. 8. Within this QTL, there is one good candidate gene, Cdh11. PMID:29496776
Bubier, Jason A.; Jay, Jeremy J.; Baker, Christopher L.; Bergeson, Susan E.; Ohno, Hiroshi; Metten, Pamela; Crabbe, John C.; Chesler, Elissa J.
2014-01-01
Extensive genetic and genomic studies of the relationship between alcohol drinking preference and withdrawal severity have been performed using animal models. Data from multiple such publications and public data resources have been incorporated in the GeneWeaver database with >60,000 gene sets including 285 alcohol withdrawal and preference-related gene sets. Among these is evidence for positional candidates regulating these behaviors in overlapping quantitative trait loci (QTL) mapped in distinct mouse populations. Combinatorial integration of functional genomics experimental results revealed a single QTL positional candidate gene in one of the loci common to both preference and withdrawal. Functional validation studies in Ap3m2 knockout mice confirmed these relationships. Genetic validation involves confirming the existence of segregating polymorphisms that could account for the phenotypic effect. By exploiting recent advances in mouse genotyping, sequence, epigenetics, and phylogeny resources, we confirmed that Ap3m2 resides in an appropriately segregating genomic region. We have demonstrated genetic and alcohol-induced regulation of Ap3m2 expression. Although sequence analysis revealed no polymorphisms in the Ap3m2-coding region that could account for all phenotypic differences, there are several upstream SNPs that could. We have identified one of these to be an H3K4me3 site that exhibits strain differences in methylation. Thus, by making cross-species functional genomics readily computable, we identified a common QTL candidate for two related bio-behavioral processes via functional evidence and demonstrated sufficiency of the genetic locus as a source of variation underlying two traits. PMID:24923803
Naaijen, J; Bralten, J; Poelmans, G; Faraone, Stephen; Asherson, Philip; Banaschewski, Tobias; Buitelaar, Jan; Franke, Barbara; P Ebstein, Richard; Gill, Michael; Miranda, Ana; D Oades, Robert; Roeyers, Herbert; Rothenberger, Aribert; Sergeant, Joseph; Sonuga-Barke, Edmund; Anney, Richard; Mulas, Fernando; Steinhausen, Hans-Christoph; Glennon, J C; Franke, B; Buitelaar, J K
2017-01-01
Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are the main excitatory and inhibitory neurotransmitters in the brain; their balance is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analyses for ADHD symptom severity (divided into inattention and hyperactivity/impulsivity symptoms), autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome-wide association levels. The GABA gene set showed nominally significant association with inhibition (P=0.04), but this did not survive correction for multiple comparisons. None of the single-gene or single-variant associations was significant on its own. By analyzing multiple genetic variants within candidate gene sets together, we were able to find genetic associations supporting the involvement of excitatory and inhibitory neurotransmitter systems in ADHD and ASD symptom severity in ADHD. PMID:28072412
Li, Tao; Wang, Jing; Lu, Miao; Zhang, Tianyi; Qu, Xinyun; Wang, Zhezhi
2017-01-01
Due to its sensitivity and specificity, real-time quantitative PCR (qRT-PCR) is a popular technique for investigating gene expression levels in plants. Based on the Minimum Information for Publication of Real-Time Quantitative PCR Experiments (MIQE) guidelines, it is necessary to select and validate putative appropriate reference genes for qRT-PCR normalization. In the current study, three algorithms, geNorm, NormFinder, and BestKeeper, were applied to assess the expression stability of 10 candidate reference genes across five different tissues and three different abiotic stresses in Isatis indigotica Fort. Additionally, the IiYUC6 gene associated with IAA biosynthesis was applied to validate the candidate reference genes. The analysis results of the geNorm, NormFinder, and BestKeeper algorithms indicated certain differences for the different sample sets and different experiment conditions. Considering all of the algorithms, PP2A-4 and TUB4 were recommended as the most stable reference genes for total and different tissue samples, respectively. Moreover, RPL15 and PP2A-4 were considered to be the most suitable reference genes for abiotic stress treatments. The obtained experimental results might contribute to improved accuracy and credibility for the expression levels of target genes by qRT-PCR normalization in I. indigotica. PMID:28702046
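Of the three algorithms named in the abstract above, geNorm is the simplest to illustrate: for each candidate gene it reports a stability measure M, the mean of the standard deviations of the log2 expression ratios between that gene and every other candidate across samples, with lower M meaning more stable. The sketch below follows that published definition but is not the geNorm software itself; the gene labels and expression values are hypothetical and are not taken from the study.

```python
import math
import statistics


def genorm_m(expr):
    """geNorm-style stability measure M for each candidate reference gene.

    expr maps gene name -> list of relative expression quantities
    (linear scale, one value per sample, same sample order for all genes).
    """
    m_values = {}
    for gene, values in expr.items():
        pairwise_variation = []
        for other, other_values in expr.items():
            if other == gene:
                continue
            # Spread of the log2 ratio between the two genes across samples
            log_ratios = [math.log2(a / b) for a, b in zip(values, other_values)]
            pairwise_variation.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_variation)
    return m_values


# Hypothetical relative quantities for three candidates in four samples
toy = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],  # constant ratio to REF_A
    "REF_C": [1.0, 4.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m = genorm_m(toy)
ranking = sorted(m, key=m.get)  # most stable first
```

In this toy data REF_A and REF_B keep a constant ratio, so they share the lowest M, while REF_C ranks last. NormFinder and BestKeeper use different models, which is one reason the three rankings can disagree, as the abstract reports.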
Babben, Steve; Perovic, Dragan; Koch, Michael; Ordon, Frank
2015-01-01
Recent declines in costs accelerated sequencing of many species with large genomes, including hexaploid wheat (Triticum aestivum L.). Although the draft sequence of bread wheat is known, it is still one of the major challenges to develop locus-specific primers suitable for use in marker-assisted selection procedures, due to the high homology of the three genomes. In this study we describe an efficient approach for the development of locus-specific primers comprising four steps, i.e. (i) identification of genomic and coding sequences (CDS) of candidate genes, (ii) intron- and exon-structure reconstruction, (iii) identification of wheat A, B and D sub-genome sequences and primer development based on sequence differences between the three sub-genomes, and (iv) testing of primers for functionality, correct size and localisation. This approach was applied to single, low and high copy genes involved in frost tolerance in wheat. In summary, for 27 of these genes for which sequences were derived from Triticum aestivum, Triticum monococcum and Hordeum vulgare, a set of 119 primer pairs was developed and, after testing on Nulli-tetrasomic (NT) lines, a set of 65 primer pairs (54.6%), corresponding to 19 candidate genes, turned out to be specific. Out of these, a set of 35 fragments was selected for validation via Sanger's amplicon re-sequencing. All fragments, with the exception of one, could be assigned to the original reference sequence. The approach presented here showed a much higher specificity in primer development in comparison to techniques used so far in bread wheat and can be applied to other polyploid species with a known draft sequence. PMID:26565976
Genomic convergence to identify candidate genes for Alzheimer disease on chromosome 10
Liang, Xueying; Slifer, Michael; Martin, Eden R.; Schnetz-Boutaud, Nathalie; Bartlett, Jackie; Anderson, Brent; Züchner, Stephan; Gwirtsman, Harry; Gilbert, John R.; Pericak-Vance, Margaret A.; Haines, Jonathan L.
2009-01-01
A broad region of chromosome 10 (chr10) has engendered continued interest in the etiology of late-onset Alzheimer Disease (LOAD) from both linkage and candidate gene studies. However, there is very extensive heterogeneity on chr10. We combined linkage analysis and gene expression data using the concept of genomic convergence, which suggests that genes showing positive results across multiple different data types are more likely to be involved in AD. We identified and examined 28 genes on chr10 for association with AD in a Caucasian case-control dataset of 506 cases and 558 controls with substantial clinical information. The cases were all LOAD (minimum age at onset ≥ 60 years). Both single marker and haplotypic associations were tested in the overall dataset and 8 subsets defined by age, gender, ApoE and clinical status. PTPLA showed allelic, genotypic and haplotypic association in the overall dataset. SORCS1 was significant in the overall dataset (p=0.0025) and most significant in the female subset (allelic association p=0.00002, a 3-locus haplotype had p=0.0005). The odds ratio for SORCS1 in the female subset was 1.7 (p<0.0001). SORCS1 is an interesting candidate gene involved in the Aβ pathway. Therefore, genetic variations in PTPLA and SORCS1 may be associated with, and have a modest effect on, the risk of AD by affecting the Aβ pathway. Replication of the effect of these genes in different study populations, a search for susceptibility variants, and functional studies of these genes are necessary to better understand their roles in Alzheimer disease. PMID:19241460
2012-01-01
Background Single nucleotide polymorphism (SNP) validation and large-scale genotyping are required to maximize the use of DNA sequence variation and determine the functional relevance of candidate genes for complex stress tolerance traits through genetic association in rice. We used the bead array platform-based Illumina GoldenGate assay to validate and genotype SNPs in a select set of stress-responsive genes to understand their functional relevance and study the population structure in rice. Results Of the 384 putative SNPs assayed, we successfully validated and genotyped 362 (94.3%). Of these 325 (84.6%) showed polymorphism among the 91 rice genotypes examined. Physical distribution, degree of allele sharing, admixtures and introgression, and amino acid replacement of SNPs in 263 abiotic and 62 biotic stress-responsive genes provided clues for identification and targeted mapping of trait-associated genomic regions. We assessed the functional and adaptive significance of validated SNPs in a set of contrasting drought tolerant upland and sensitive lowland rice genotypes by correlating their allelic variation with amino acid sequence alterations in catalytic domains and three-dimensional secondary protein structure encoded by stress-responsive genes. We found a strong genetic association among SNPs in the nine stress-responsive genes with upland and lowland ecological adaptation. Higher nucleotide diversity was observed in indica accessions compared with other rice sub-populations based on different population genetic parameters. The inferred ancestry of 16% among rice genotypes was derived from admixed populations with the maximum between upland aus and wild Oryza species. Conclusions SNPs validated in biotic and abiotic stress-responsive rice genes can be used in association analyses to identify candidate genes and develop functional markers for stress tolerance in rice. PMID:22921105
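The two percentages in the Results of the abstract above share the same denominator, which is easy to misread because of the phrase "Of these". A quick arithmetic check, using only the counts given in the abstract (variable names are illustrative), suggests that 84.6% is the polymorphic fraction of all 384 assayed SNPs, not of the 362 validated ones:

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract
assayed = 384      # putative SNPs assayed
validated = 362    # SNPs successfully validated and genotyped
polymorphic = 325  # SNPs polymorphic among the 91 rice genotypes

validated_of_assayed = percent(validated, assayed)        # 94.3, as reported
polymorphic_of_assayed = percent(polymorphic, assayed)    # 84.6, as reported
polymorphic_of_validated = percent(polymorphic, validated)  # 89.8, not reported
```

The same 325 polymorphic SNPs match the 263 abiotic plus 62 biotic stress-responsive genes mentioned in the next sentence of the abstract.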
Identification of prostate cancer modifier pathways using parental strain expression mapping
Xu, Qing; Majumder, Pradip K.; Ross, Kenneth; Shim, Yeonju; Golub, Todd R.; Loda, Massimo; Sellers, William R.
2007-01-01
Inherited genetic risk factors play an important role in cancer. However, other than the Mendelian cancer susceptibility genes found in familial cancer syndromes, little is known about risk modifiers that control individual susceptibility. Here we developed a strategy, parental strain expression mapping, that utilizes the homogeneity of inbred mice and genome-wide mRNA expression analyses to directly identify candidate germ-line modifier genes and pathways underlying phenotypic differences among murine strains exposed to transgenic activation of AKT1. We identified multiple candidate modifier pathways and, specifically, the glycolysis pathway as a candidate negative modulator of AKT1-induced proliferation. In keeping with the findings in the murine models, in multiple human prostate expression data sets, we found that enrichment of glycolysis pathways in normal tissues was associated with decreased rates of cancer recurrence after prostatectomy. Together, these data suggest that parental strain expression mapping can directly identify germ-line modifier pathways of relevance to human disease. PMID:17978178
2013-01-01
Background 3,4-methylenedioxymethamphetamine (MDMA, "ecstasy") is a widely used recreational drug known to impair cognitive functions in the long run. Both hippocampal and frontal cortical regions have well established roles in behavior, memory formation and other cognitive tasks, and damage to these regions is associated with altered behavior and cognitive functions, impairments frequently described in heavy MDMA users. The aim of this study was to examine the hippocampus, frontal cortex and dorsal raphe of Dark Agouti rats with gene expression arrays (Illumina RatRef bead arrays), looking for possible mechanisms and new candidates contributing to the effects of a single dose of MDMA (15 mg/kg) 3 weeks earlier. Results The number of differentially expressed genes in the hippocampus, frontal cortex and the dorsal raphe were 481, 155, and 15, respectively. Gene set enrichment analysis of the microarray data revealed reduced expression of 'memory' and 'cognition', 'dendrite development' and 'regulation of synaptic plasticity' gene sets in the hippocampus, parallel to the upregulation of the CB1 cannabinoid- and Epha4, Epha5, Epha6 ephrin receptors. Downregulated gene sets in the frontal cortex were related to protein synthesis, chromatin organization, transmembrane transport processes, while 'dendrite development', 'regulation of synaptic plasticity' and 'positive regulation of synapse assembly' gene sets were upregulated. Changes in the dorsal raphe region were mild and in most cases not significant. Conclusion The present data raise the possibility of new synapse formation/synaptic reorganization in the frontal cortex three weeks after a single neurotoxic dose of MDMA. In contrast, a prolonged depression of new neurite formation in the hippocampus is suggested by the data, which underlines the particular vulnerability of this brain region after the drug treatment. 
Finally, our results also suggest the substantial contribution of CB1 receptor and endocannabinoid mediated pathways in the hippocampal impairments. Taken together the present study provides evidence for the participation of new molecular candidates in the long-term effects of MDMA. PMID:24378229
Bagheri, Masoumeh; Moradi-Sharhrbabak, M; Miraie-Ashtiani, R; Safdari-Shahroudi, M; Abdollahi-Arpanahi, R
2016-02-01
Mastitis is a major source of economic loss in dairy herds. The objective of this research was to evaluate the association between genotypes within SLC11A1 and CXCR1 candidate genes and clinical mastitis in Holstein dairy cattle using the selective genotyping method. The data set contained clinical mastitis records of 3,823 Holstein cows from two Holstein dairy herds located in two different regions in Iran. Data included the number of cases of clinical mastitis per lactation. Selective genotyping was based on extreme values for clinical mastitis residuals (CMR) from mixed model analyses. Two extreme groups consisting of 135 cows were formed (as cases and controls), and genotyped for the two candidate genes, namely, SLC11A1 and CXCR1, using polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) and polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP), respectively. Associations of single nucleotide polymorphism (SNP) genotypes with CMR and with breeding values for milk and protein yield were tested by logistic regression, i.e. by estimating the probability of the heterogeneous genotype as a function of the values for CMR and breeding values (BVs). The sequencing results revealed a novel mutation at 1139 bp of exon 11 of the SLC11A1 gene and this SNP had a significant association with CMR (P < 0.05). PCR-RFLP analysis yielded three banding patterns for CXCR1 c.735C>G, and these genotypes had significant relationships with CMR. Overall, the results showed that SLC11A1 and CXCR1 are valuable candidate genes for the improvement of mastitis resistance as well as production traits in dairy cattle populations.
The ecological genomic basis of salinity adaptation in Tunisian Medicago truncatula.
Friesen, Maren L; von Wettberg, Eric J B; Badri, Mounawer; Moriuchi, Ken S; Barhoumi, Fathi; Chang, Peter L; Cuellar-Ortiz, Sonia; Cordeiro, Matilde A; Vu, Wendy T; Arraouadi, Soumaya; Djébali, Naceur; Zribi, Kais; Badri, Yazid; Porter, Stephanie S; Aouani, Mohammed Elarbi; Cook, Douglas R; Strauss, Sharon Y; Nuzhdin, Sergey V
2014-12-22
As our world becomes warmer, agriculture is increasingly impacted by rising soil salinity and understanding plant adaptation to salt stress can help enable effective crop breeding. Salt tolerance is a complex plant phenotype and we know little about the pathways utilized by naturally tolerant plants. Legumes are important species in agricultural and natural ecosystems, since they engage in symbiotic nitrogen-fixation, but are especially vulnerable to salinity stress. Our studies of the model legume Medicago truncatula in field and greenhouse settings demonstrate that Tunisian populations are locally adapted to saline soils at the metapopulation level and that saline origin genotypes are less impacted by salt than non-saline origin genotypes; these populations thus likely contain adaptively diverged alleles. Whole genome resequencing of 39 wild accessions reveals ongoing migration and candidate genomic regions that assort non-randomly with soil salinity. Consistent with natural selection acting at these sites, saline alleles are typically rare in the range-wide species' gene pool and are also typically derived relative to the sister species M. littoralis. Candidate regions for adaptation contain genes that regulate physiological acclimation to salt stress, such as abscisic acid and jasmonic acid signaling, including a novel salt-tolerance candidate orthologous to the uncharacterized gene AtCIPK21. Unexpectedly, these regions also contain biotic stress genes and flowering time pathway genes. We show that flowering time is differentiated between saline and non-saline populations and may allow salt stress escape. This work nominates multiple potential pathways of adaptation to naturally stressful environments in a model legume. These candidates point to the importance of both tolerance and avoidance in natural legume populations. 
We have uncovered several promising targets that could be used to breed for enhanced salt tolerance in crop legumes to enhance food security in an era of increasing soil salinization.
Sedeek, Khalid E. M.; Qi, Weihong; Schauer, Monica A.; Gupta, Alok K.; Poveda, Lucy; Xu, Shuqing; Liu, Zhong-Jian; Grossniklaus, Ueli; Schiestl, Florian P.; Schlüter, Philipp M.
2013-01-01
Background Sexually deceptive orchids of the genus Ophrys mimic the mating signals of their pollinator females to attract males as pollinators. This mode of pollination is highly specific and leads to strong reproductive isolation between species. This study aims to identify candidate genes responsible for pollinator attraction and reproductive isolation between three closely related species, O. exaltata, O. sphegodes and O. garganica. Floral traits such as odour, colour and morphology are necessary for successful pollinator attraction. In particular, different odour hydrocarbon profiles have been linked to differences in specific pollinator attraction among these species. Therefore, the identification of genes involved in these traits is important for understanding the molecular basis of pollinator attraction by sexually deceptive orchids. Results We have created floral reference transcriptomes and proteomes for these three Ophrys species using a combination of next-generation sequencing (454 and Solexa), Sanger sequencing, and shotgun proteomics (tandem mass spectrometry). In total, 121 917 unique transcripts and 3531 proteins were identified. This represents the first orchid proteome and transcriptome from the orchid subfamily Orchidoideae. Proteome data revealed proteins corresponding to 2644 transcripts and 887 proteins not observed in the transcriptome. Candidate genes for hydrocarbon and anthocyanin biosynthesis were represented by 156 and 61 unique transcripts in 20 and 7 gene classes, respectively. Moreover, transcription factors putatively involved in the regulation of flower odour, colour and morphology were annotated, including Myb, MADS and TCP factors. Conclusion Our comprehensive data set generated by combining transcriptome and proteome technologies allowed identification of candidate genes for pollinator attraction and reproductive isolation among sexually deceptive orchids. 
This includes genes for hydrocarbon and anthocyanin biosynthesis and regulation, and the development of floral morphology. These data will serve as an invaluable resource for research in orchid floral biology, enabling studies into the molecular mechanisms of pollinator attraction and speciation. PMID:23734209
Watanabe, Yoshiyuki; Kim, Hyun Soo; Castoro, Ryan J.; Chung, Woonbok; Estecio, Marcos R. H.; Kondo, Kimie; Guo, Yi; Ahmed, Saira S.; Toyota, Minoru; Itoh, Fumio; Suk, Ki Tae; Cho, Mee-Yon; Shen, Lanlan; Jelinek, Jaroslav; Issa, Jean-Pierre J.
2009-01-01
Background & Aims Aberrant DNA methylation is an early and frequent process in gastric carcinogenesis and could be useful for detection of gastric neoplasia. We hypothesized that methylation analysis of DNA recovered from gastric washes could be used to detect gastric cancer. Methods We studied 51 candidate genes in 7 gastric cancer cell lines and 24 samples (training set) and identified 6 for further studies. We examined the methylation status of these genes in a test set consisting of 131 gastric neoplasias at various stages. Finally, we validated the 6 candidate genes in a different population of 40 primary gastric cancer samples and 113 non-neoplastic gastric mucosa samples. Results Six genes (MINT25, RORA, GDNF, ADAM23, PRDM5, MLF1) showed frequent differential methylation between gastric cancer and normal mucosa in the training, test and validation sets. GDNF and MINT25 were the most sensitive molecular markers of early stage gastric cancer, while PRDM5 and MLF1 were markers of a field defect. There was a close correlation (r=0.5 to 0.9, p=0.03 to 0.001) between methylation levels in tumor biopsy and gastric washes. MINT25 methylation had the best sensitivity (90%), specificity (96%), and area under the ROC curve (0.961) in terms of tumor detection in gastric washes. Conclusions These findings suggest MINT25 is a sensitive and specific marker for screening in gastric cancer. Additionally, we have developed a new methodology for gastric cancer detection by DNA methylation in gastric washes. PMID:19375421
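The sensitivity, specificity and area under the ROC curve quoted for MINT25 are standard summaries of a continuous marker against a known tumor status. The sketch below shows how such figures can be computed in principle. It is a minimal illustration only: the function names, the cutoff and the methylation values are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(labels, scores, cutoff):
    """Call a sample positive when its score is >= cutoff.

    labels: 1 for tumor, 0 for non-neoplastic (hypothetical toy data).
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y and s >= cutoff)
    fn = sum(1 for y, s in pairs if y and s < cutoff)
    tn = sum(1 for y, s in pairs if not y and s < cutoff)
    fp = sum(1 for y, s in pairs if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the probability that a random tumor sample scores higher
    than a random control sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical methylation percentages, NOT data from the abstract.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [60, 45, 30, 8, 12, 5, 3, 1]

print(sensitivity_specificity(toy_labels, toy_scores, cutoff=10))  # (0.75, 0.75)
print(roc_auc(toy_labels, toy_scores))  # 0.9375
```

The pairwise AUC definition is used here because it needs no plotting or external library; for real cohorts a dedicated statistics package would be the more usual choice.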
Knapp, Dunja; Schulz, Herbert; Rascon, Cynthia Alexander; Volkmer, Michael; Scholz, Juliane; Nacu, Eugen; Le, Mu; Novozhilov, Sergey; Tazaki, Akira; Protze, Stephanie; Jacob, Tina; Hubner, Norbert; Habermann, Bianca; Tanaka, Elly M.
2013-01-01
Understanding how the limb blastema is established after the initial wound healing response is an important aspect of regeneration research. Here we performed parallel expression profile time courses of healing lateral wounds versus amputated limbs in axolotl. This comparison between wound healing and regeneration allowed us to identify amputation-specific genes. By clustering the expression profiles of these samples, we could detect three distinguishable phases of gene expression – early wound healing followed by a transition-phase leading to establishment of the limb development program, which correspond to the three phases of limb regeneration that had been defined by morphological criteria. By focusing on the transition-phase, we identified 93 strictly amputation-associated genes, many of which are implicated in oxidative-stress response, chromatin modification, epithelial development or limb development. We further classified the genes based on whether they were or were not significantly expressed in the developing limb bud. The specific localization of 53 selected candidates within the blastema was investigated by in situ hybridization. In summary, we identified a set of genes that are expressed specifically during regeneration and are therefore likely candidates for the regulation of blastema formation. PMID:23658691
Tiffin, Nicki; Meintjes, Ayton; Ramesar, Rajkumar; Bajic, Vladimir B.; Rayner, Brian
2010-01-01
Multiple factors underlie susceptibility to essential hypertension, including a significant genetic and ethnic component, and environmental effects. Blood pressure response of hypertensive individuals to salt is heterogeneous, but salt sensitivity appears more prevalent in people of indigenous African origin. The underlying genetics of salt-sensitive hypertension, however, are poorly understood. In this study, computational methods including text- and data-mining have been used to select and prioritize candidate aetiological genes for salt-sensitive hypertension. Additionally, we have compared allele frequencies and copy number variation for single nucleotide polymorphisms in candidate genes between indigenous Southern African and Caucasian populations, with the aim of identifying candidate genes with significant variability between the population groups: identifying genetic variability between population groups can exploit ethnic differences in disease prevalence to aid with prioritisation of good candidate genes. Our top-ranking candidate genes include parathyroid hormone precursor (PTH) and type-1 angiotensin II receptor (AGTR1). We propose that the candidate genes identified in this study warrant further investigation as potential aetiological genes for salt-sensitive hypertension. PMID:20886000
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahmad, Rumana; Nicora, Carrie D.; Shukla, Anil K.
Prostate cancer (CP) cells differ from their normal counterpart in gene expression. Genes encoding secreted or extracellular proteins with increased expression in CP may serve as potential biomarkers. For their detection and quantification, assays based on monoclonal antibodies are best suited for development in a clinical setting. One approach to obtain antibodies is to use recombinant proteins as immunogen. However, the synthesis of recombinant protein for each identified candidate is time-consuming and expensive. It is also not practical to generate high quality antibodies to all identified candidates individually. Furthermore, non-native forms (e.g., recombinant) of proteins may not always lead to useful antibodies. Our approach was to purify a subset of proteins from CP tissue specimens for use as immunogen.
Youssef, Noha H; Blainey, Paul C; Quake, Stephen R; Elshahed, Mostafa S
2011-11-01
Members of candidate division OP11 are widely distributed in terrestrial and marine ecosystems, yet little information regarding their metabolic capabilities and ecological role within such habitats is currently available. Here, we report on the microfluidic isolation, multiple-displacement-amplification, pyrosequencing, and genomic analysis of a single cell (ZG1) belonging to candidate division OP11. Genome analysis of the ∼270-kb partial genome assembly obtained showed that it had no particular similarity to a specific phylum. Four hundred twenty-three open reading frames were identified, 46% of which had no function prediction. In-depth analysis revealed a heterotrophic lifestyle, with genes encoding endoglucanase, amylopullulanase, and laccase enzymes, suggesting a capacity for utilization of cellulose, starch, and, potentially, lignin, respectively. Genes encoding several glycolysis enzymes as well as formate utilization were identified, but no evidence for an electron transport chain was found. The presence of genes encoding various components of lipopolysaccharide biosynthesis indicates a Gram-negative bacterial cell wall. The partial genome also provides evidence for antibiotic resistance (β-lactamase, aminoglycoside phosphotransferase), as well as antibiotic production (bacteriocin) and extracellular bactericidal peptidases. Multiple mechanisms for stress response were identified, as were elements of type I and type IV secretion systems. Finally, housekeeping genes identified within the partial genome were used to demonstrate the OP11 affiliation of multiple hitherto unclassified genomic fragments from multiple database-deposited metagenomic data sets. These results provide the first glimpse into the lifestyle of a member of a ubiquitous, yet poorly understood bacterial candidate division.
Ultsch, Alfred; Kringel, Dario; Kalso, Eija; Mogil, Jeffrey S; Lötsch, Jörn
2016-12-01
The increasing availability of "big data" enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 535 genes identified empirically as relevant to pain with the knowledge about the functions of thousands of genes. Starting from an accepted description of chronic pain as displaying systemic features described by the terms "learning" and "neuronal plasticity," a functional genomics analysis proposed that among the functions of the 535 "pain genes," the biological processes "learning or memory" (P = 8.6 × 10) and "nervous system development" (P = 2.4 × 10) are statistically significantly overrepresented as compared with the annotations to these processes expected by chance. After establishing that the hypothesized biological processes were among important functional genomics features of pain, a subset of n = 34 pain genes were found to be annotated with both Gene Ontology terms. Published empirical evidence supporting their involvement in chronic pain was identified for almost all these genes, including 1 gene identified in March 2016 as being involved in pain. By contrast, such evidence was virtually absent in a randomly selected set of 34 other human genes. Hence, the present computational functional genomics-based method can be used for candidate gene selection, providing an alternative to established methods.
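The overrepresentation claim in the abstract above rests on comparing how many genes in a set carry an annotation with how many would be expected by chance. One common way to express that comparison is a one-sided hypergeometric test, sketched below. This is an illustrative assumption, not necessarily the exact procedure the authors used; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric P-value, P(X >= hits).

    population: number of genes in the background
    annotated:  background genes carrying the term of interest
    sample:     size of the gene set being tested
    hits:       genes in the set that carry the term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 background genes, 4 annotated, a set of 3 genes
# of which 2 carry the annotation.
p = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
print(round(p, 4))  # 0.3333
```

Exact integer binomials keep the sketch dependency-free; with thousands of terms tested, a multiple-testing correction would also be needed, which is omitted here.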
Kulaeva, Olga A; Zhernakov, Aleksandr I; Afonin, Alexey M; Boikov, Sergei S; Sulima, Anton S; Tikhonovich, Igor A; Zhukov, Vladimir A
2017-01-01
Pea (Pisum sativum L.) is the oldest model object of plant genetics and one of the most agriculturally important legumes in the world. Since the pea genome has not been sequenced yet, identification of genes responsible for mutant phenotypes or desirable agricultural traits is usually performed via genetic mapping followed by candidate gene search. Such mapping is best carried out using gene-based molecular markers, as it opens the possibility of exploiting genome synteny between pea and its close relative Medicago truncatula Gaertn., which has a sequenced and annotated genome. In the last 5 years, a large number of pea gene-based molecular markers have been designed and mapped owing to the rapid evolution of "next-generation sequencing" technologies. However, access to the complete set of markers designed worldwide is limited because the data are not standardized and are therefore hard to use. The Pea Marker Database was designed to combine the information about pea markers in the form of a user-friendly and practical online tool. Version 1 (PMD1) comprises information about 2484 genic markers, including their locations in linkage groups, the sequences of corresponding pea transcripts and the names of related genes in M. truncatula. Version 2 (PMD2) is an updated version comprising 15944 pea markers in the same format with several advanced features. To test the performance of the PMD, fine mapping of pea symbiotic genes Sym13 and Sym27 in linkage groups VII and V, respectively, was carried out. The results of mapping allowed us to propose the Sen1 gene (a homologue of SEN1 gene of Lotus japonicus (Regel) K. Larsen) as the best candidate gene for Sym13, and to narrow the list of possible candidate genes for Sym27 to ten, thus proving PMD to be useful for pea gene mapping and cloning. All information contained in PMD1 and PMD2 is available at www.peamarker.arriam.ru.
Suzuki, Hitoshi; Osaki, Ken; Sano, Kaori; Alam, A H M Khurshid; Nakamura, Yuichiro; Ishigaki, Yasuhito; Kawahara, Kozo; Tsukahara, Toshifumi
2011-02-18
Alternative splicing, which produces multiple mRNAs from a single gene, occurs in most human genes and contributes to protein diversity. Many alternative isoforms are expressed in a spatio-temporal manner, and function in diverse processes, including in the neural system. The purpose of the present study was to comprehensively investigate neural-splicing using P19 cells. GeneChip Exon Array analysis was performed using total RNAs purified from cells during neuronal cell differentiation. To efficiently and readily extract the alternative exon candidates, 9 filtering conditions were prepared, yielding 262 candidate exons (236 genes). Semiquantitative RT-PCR results in 30 randomly selected candidates suggested that 87% of the candidates were differentially alternatively spliced in neuronal cells compared to undifferentiated cells. Gene ontology and pathway analyses suggested that many of the candidate genes were associated with neural events. Together with 66 genes whose functions in neural cells or organs were reported previously, 47 candidate genes were found to be linked to 189 events in the gene-level profile of neural differentiation. By text-mining for the alternative isoform, distinct functions of the isoforms of 9 candidate genes indicated by the result of Exon Array were confirmed. Alternative exons were successfully extracted. Results from the informatics analyses suggested that neural events were primarily governed by genes whose expression was increased and whose transcripts were differentially alternatively spliced in the neuronal cells. In addition to known functions in neural cells or organs, the uninvestigated alternative splicing events of 11 genes among 47 candidate genes suggested that cell cycle events are also potentially important. These genes may help researchers to differentiate the roles of alternative splicing in cell differentiation and cell proliferation.
Boulain, Hélène; Legeai, Fabrice; Guy, Endrick; Morlière, Stéphanie; Douglas, Nadine E; Oh, Jonghee; Murugan, Marimuthu; Smith, Michael; Jaquiéry, Julie; Peccoud, Jean; White, Frank F; Carolan, James C; Simon, Jean-Christophe; Sugio, Akiko
2018-05-18
Effector proteins play crucial roles in plant-parasite interactions by suppressing plant defenses and hijacking plant physiological responses to facilitate parasite invasion and propagation. Although effector proteins have been characterized in many microbial plant pathogens, their nature and role in adaptation to host plants are largely unknown in insect herbivores. Aphids rely on salivary effector proteins injected into the host plants to promote phloem sap uptake. Therefore, gaining insight into the repertoire and evolution of aphid effectors is key to unveiling the mechanisms responsible for aphid virulence and host plant specialization. With this aim in mind, we assembled catalogues of putative effectors in the legume specialist aphid, Acyrthosiphon pisum, using transcriptomics and proteomics approaches. We identified 3603 candidate effector genes predicted to be expressed in A. pisum salivary glands (SGs), 740 of which displayed up-regulated expression in SGs in comparison to the alimentary tract. A search for orthologs in 17 arthropod genomes revealed that SG-up-regulated effector candidates of A. pisum are enriched in aphid-specific genes and tend to evolve faster compared to the whole gene set. We also found that a large fraction of proteins detected in the A. pisum saliva belonged to three gene families, of which certain members show evidence consistent with positive selection. Overall, this comprehensive analysis suggests that the large repertoire of effector candidates in A. pisum constitutes a source of novelties promoting plant adaptation to legumes.
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
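The idea of turning pairwise expression correlations into network modules, as described in the barley abstract above, can be sketched in a few lines. The version below is a deliberately simplified stand-in: it uses a plain Pearson correlation rather than the weighted coefficient mentioned in the abstract, a fixed threshold, and connected components as "modules". All names, the threshold and the expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; assumes both profiles have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coexpression_modules(expr, threshold=0.9):
    """Link genes whose correlation is >= threshold, then return the
    connected components of that graph as sorted lists of gene names."""
    genes = sorted(expr)
    neighbours = {g: set() for g in genes}
    for g1, g2 in combinations(genes, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        stack, module = [g], set()
        while stack:  # depth-first walk over linked genes
            cur = stack.pop()
            if cur in module:
                continue
            module.add(cur)
            stack.extend(neighbours[cur] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression profiles across four samples.
toy_expr = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8.5],
    "geneC": [4, 3, 2, 1],
    "geneD": [8, 6, 4, 2.5],
}
print(coexpression_modules(toy_expr))
# [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Connected components are the crudest possible module definition; real analyses of this kind typically rely on dedicated clustering methods and far larger sample sets.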
Vivante, Asaf; Ityel, Hadas; Pode-Shakked, Ben; Chen, Jing; Shril, Shirlee; van der Ven, Amelie T; Mann, Nina; Schmidt, Johanna Magdalena; Segel, Reeval; Aran, Adi; Zeharia, Avraham; Staretz-Chacham, Orna; Bar-Yosef, Omer; Raas-Rothschild, Annick; Landau, Yuval E; Lifton, Richard P; Anikster, Yair; Hildebrandt, Friedhelm
2017-12-01
Rhabdomyolysis is a clinical emergency that may cause acute kidney injury (AKI). It can be acquired or due to monogenic mutations. Around 60 different rare monogenic forms of rhabdomyolysis have been reported to date. In the clinical setting, identifying the underlying molecular diagnosis is challenging due to nonspecific presentation, the high number of causative genes, and current lack of data on the prevalence of monogenic forms. We employed whole exome sequencing (WES) to reveal the percentage of rhabdomyolysis cases explained by single-gene (monogenic) mutations in one of 58 candidate genes. We investigated a cohort of 21 unrelated families with rhabdomyolysis, in whom no underlying etiology had been previously established. Using WES, we identified causative mutations in candidate genes in nine of the 21 families (43%). We detected disease-causing mutations in eight of 58 candidate genes, grouped into the following categories: (1) disorders of fatty acid metabolism (CPT2), (2) disorders of glycogen metabolism (PFKM and PGAM2), (3) disorders of abnormal skeletal muscle relaxation and contraction (CACNA1S, MYH3, RYR1 and SCN4A), and (4) disorders of purine metabolism (AHCY). Our findings demonstrate a very high detection rate for monogenic etiologies using WES and reveal broad genetic heterogeneity for rhabdomyolysis. These results highlight the importance of molecular genetic diagnostics for establishing an etiologic diagnosis. Because these patients are at risk for recurrent episodes of rhabdomyolysis and subsequent risk for AKI, WES allows adequate prophylaxis and treatment for these patients and their family members and enables a personalized medicine approach.
Rangel-Salazar, Rubén; Wickström-Lindholm, Marie; Aguilar-Salinas, Carlos A; Alvarado-Caudillo, Yolanda; Døssing, Kristina B V; Esteller, Manel; Labourier, Emmanuel; Lund, Gertrud; Nielsen, Finn C; Rodríguez-Ríos, Dalia; Solís-Martínez, Martha O; Wrobel, Katarzyna; Wrobel, Kazimierz; Zaina, Silvio
2011-11-25
We previously showed that a VLDL- and LDL-rich mix of human native lipoproteins induces a set of repressive epigenetic marks, i.e. de novo DNA methylation, histone 4 hypoacetylation and histone 4 lysine 20 (H4K20) hypermethylation in THP-1 macrophages. Here, we: 1) ask what gene expression changes accompany these epigenetic responses; 2) test the involvement of candidate factors mediating the latter. We exploited genome expression arrays to identify target genes for lipoprotein-induced silencing, in addition to RNAi and expression studies to test the involvement of candidate mediating factors. The study was conducted in human THP-1 macrophages. Native lipoprotein-induced de novo DNA methylation was associated with a general repression of various critical genes for macrophage function, including pro-inflammatory genes. Lipoproteins showed differential effects on epigenetic marks, as de novo DNA methylation was induced by VLDL and to a lesser extent by LDL, but not by HDL, and VLDL induced H4K20 hypermethylation, while HDL caused H4 deacetylation. The analysis of candidate factors mediating VLDL-induced DNA hypermethylation revealed that this response was: 1) surprisingly, mediated exclusively by the canonical maintenance DNA methyltransferase DNMT1, and 2) independent of the Dicer/micro-RNA pathway. Our work provides novel insights into epigenetic gene regulation by native lipoproteins. Furthermore, we provide an example of DNMT1 acting as a de novo DNA methyltransferase independently of canonical de novo enzymes, and show proof of principle that de novo DNA methylation can occur independently of a functional Dicer/micro-RNA pathway in mammals.
Chen, Xiaoping; Zhu, Wei; Azam, Sarwar; Li, Heying; Zhu, Fanghe; Li, Haifen; Hong, Yanbin; Liu, Haiyan; Zhang, Erhua; Wu, Hong; Yu, Shanlin; Zhou, Guiyuan; Li, Shaoxiong; Zhong, Ni; Wen, Shijie; Li, Xingyu; Knapp, Steve J; Ozias-Akins, Peggy; Varshney, Rajeev K; Liang, Xuanqiang
2013-01-01
The failure of peg penetration into the soil leads to seed abortion in peanut. Knowledge of genes involved in these processes is comparatively deficient. Here, we used RNA-seq to gain insights into transcriptomes of aerial and subterranean pods. More than 2 million transcript reads with an average length of 396 bp were generated from one aerial (AP) and two subterranean (SP1 and SP2) pod libraries using pyrosequencing technology. After assembly, sets of 49 632, 49 952 and 50 494 from a total of 74 974 transcript assembly contigs (TACs) were identified in AP, SP1 and SP2, respectively. A clear linear relationship in the gene expression level was observed between these data sets. In brief, 2194 differentially expressed TACs with a 99.0% true-positive rate were identified, among which 859 and 1068 TACs were up-regulated in aerial and subterranean pods, respectively. Functional analysis showed that putative function based on similarity with proteins catalogued in UniProt and gene ontology term classification could be determined for 59 342 (79.2%) and 42 955 (57.3%) TACs, respectively. A total of 2968 TACs were mapped to 174 KEGG pathways, of which 168 were shared by aerial and subterranean transcriptomes. TACs involved in photosynthesis were significantly up-regulated and enriched in the aerial pod. In addition, two senescence-associated genes were identified as significantly up-regulated in the aerial pod, which potentially contribute to embryo abortion in aerial pods, and in turn, to cessation of swelling. The data set generated in this study provides evidence for some functional genes as robust candidates underlying aerial and subterranean pod development and contributes to an elucidation of the evolutionary implications resulting from fruit development under light and dark conditions. © 2012 The Authors Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Convergence of GWA and candidate gene studies for alcoholism
Olfson, Emily; Bierut, Laura Jean
2012-01-01
Background: Genome-wide association (GWA) studies have led to a paradigm shift in how researchers study the genetics underlying disease. Many GWA studies are now publicly available and can be used to examine whether or not previously proposed candidate genes are supported by GWA data. This approach is particularly important for the field of alcoholism because the contribution of many candidate genes remains controversial. Methods: Using the Human Genome Epidemiology (HuGE) Navigator, we selected candidate genes for alcoholism that have been frequently examined in scientific articles in the past decade. Specific candidate loci as well as all the reported SNPs in candidate genes were examined in the Study of Alcohol Addiction: Genetics and Addiction (SAGE), a GWA study comparing alcohol dependent and non-dependent subjects. Results: Several commonly reported candidate loci, including rs1800497 in DRD2, rs698 in ADH1C, rs1799971 in OPRM1 and rs4680 in COMT, are not replicated in SAGE (p > .05). Among candidate loci available for analysis, only rs279858 in GABRA2 (p = 0.0052, OR = 1.16) demonstrated a modest association. Examination of all SNPs reported in SAGE in over 50 candidate genes revealed no SNPs with large frequency differences between cases and controls, and the lowest p value of any SNP was .0006. Discussion: We provide evidence that several extensively studied candidate loci do not have a strong contribution to risk of developing alcohol dependence in European and African Ancestry populations. Due to lack of coverage, we were unable to rule out the contribution of other variants, and these genes and particular loci warrant further investigation. Our analysis demonstrates that publicly available GWA results can be used to better understand which, if any, of the previously proposed candidate genes contribute to disease. Furthermore, we illustrate how examining the convergence of candidate gene and GWA studies can help elucidate the genetic architecture of alcoholism and more generally complex diseases. PMID:22978509
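The effect size quoted in this abstract is an odds ratio (OR). As a rough illustration of how such a figure is obtained from a 2x2 table of allele counts, a minimal Python sketch follows. The counts are hypothetical, chosen only so that the result lands near 1.16, and are not taken from SAGE; the function name `odds_ratio` and the 95% Wald interval are assumptions of this sketch, not the authors' stated method.

```python
import math

def odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio with an approximate 95% Wald confidence interval.

    Arguments are allele (or carrier) counts from a 2x2 table.
    All four counts must be positive.
    """
    estimate = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper

# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio(580, 420, 543, 457)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that spans 1 would indicate that the hypothetical counts alone do not separate cases from controls, which is why sample size matters as much as the point estimate.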
Jouffe, Vincent; Rowe, Suzanne; Liaubet, Laurence; Buitenhuis, Bart; Hornshøj, Henrik; SanCristobal, Magali; Mormède, Pierre; de Koning, D J
2009-07-16
Microarray studies can supplement QTL studies by suggesting potential candidate genes in the QTL regions, which by themselves are too large to provide a limited selection of candidate genes. Here we provide a case study where we explore ways to integrate QTL data and microarray data for the pig, which has only a partial genome sequence. We outline various procedures to localize differentially expressed genes on the pig genome and link this with information on published QTL. The starting point is a set of 237 differentially expressed cDNA clones in adrenal tissue from two pig breeds, before and after treatment with adrenocorticotropic hormone (ACTH). Different approaches to localize the differentially expressed (DE) genes to the pig genome showed different levels of success and a clear lack of concordance for some genes between the various approaches. For a focused analysis on 12 genes, overlapping QTL from the public domain were presented. Also, differentially expressed genes underlying QTL for ACTH response were described. Using the latest version of the draft sequence, the differentially expressed genes were mapped to the pig genome. This enabled co-location of DE genes and previously studied QTL regions, but the draft genome sequence is still incomplete and will contain many errors. A further step to explore links between DE genes and QTL at the pathway level was largely unsuccessful due to the lack of annotation of the pig genome. This could be improved by further comparative mapping analyses but this would be time consuming. This paper provides a case study for the integration of QTL data and microarray data for a species with limited genome sequence information and annotation. The results illustrate the challenges that must be addressed but also provide a roadmap for future work that is applicable to other non-model species.
Cross-talk of the biotrophic pathogen Claviceps purpurea and its host Secale cereale.
Oeser, Birgitt; Kind, Sabine; Schurack, Selma; Schmutzer, Thomas; Tudzynski, Paul; Hinsch, Janine
2017-04-04
The economically important Ergot fungus Claviceps purpurea is an interesting biotrophic model system because of its strict organ specificity (grass ovaries) and the lack of any detectable plant defense reactions. Though several virulence factors were identified, the exact infection mechanisms are unknown, e.g. how the fungus masks its attack and if the host detects the infection at all. We present a first dual transcriptome analysis using an RNA-Seq approach. We studied both fungal and plant gene expression in young ovaries infected by the wild-type and two virulence-attenuated mutants. We can show that the plant recognizes the fungus, since defense related genes are upregulated, especially several phytohormone genes. We present a survey of in planta expressed fungal genes, among them several confirmed virulence genes. Interestingly, the set of most highly expressed genes includes a high proportion of genes encoding putative effectors, small secreted proteins which might be involved in masking the fungal attack or interfering with host defense reactions. As known from several other phytopathogens, the C. purpurea genome contains more than 400 of such genes, many of them clustered and probably highly redundant. Since the lack of effective defense reactions in spite of recognition of the fungus could very well be achieved by effectors, we started a functional analysis of some of the most highly expressed candidates. However, the redundancy of the system made the identification of a drastic effect of a single gene most unlikely. We can show that at least one candidate accumulates in the plant apoplast. Deletion of some candidates led to a reduced virulence of C. purpurea on rye, indicating a role of the respective proteins during the infection process. We show for the first time that, despite the absence of effective plant defense reactions, the biotrophic pathogen C. purpurea is detected by its host. This points to a role of effectors in modulation of the effective plant response. Indeed, several putative effector genes are among the highest expressed genes in planta.
Identification of a set of genes showing regionally enriched expression in the mouse brain
D'Souza, Cletus A; Chopra, Vikramjit; Varhol, Richard; Xie, Yuan-Yun; Bohacec, Slavita; Zhao, Yongjun; Lee, Lisa LC; Bilenky, Mikhail; Portales-Casamar, Elodie; He, An; Wasserman, Wyeth W; Goldowitz, Daniel; Marra, Marco A; Holt, Robert A; Simpson, Elizabeth M; Jones, Steven JM
2008-01-01
Background: The Pleiades Promoter Project aims to improve gene therapy by designing human mini-promoters (< 4 kb) that drive gene expression in specific brain regions or cell-types of therapeutic interest. Our goal was to first identify genes displaying regionally enriched expression in the mouse brain so that promoters designed from orthologous human genes can then be tested to drive reporter expression in a similar pattern in the mouse brain. Results: We have utilized LongSAGE to identify regionally enriched transcripts in the adult mouse brain. As supplemental strategies, we also performed a meta-analysis of published literature and inspected the Allen Brain Atlas in situ hybridization data. From a set of approximately 30,000 mouse genes, 237 were identified as showing specific or enriched expression in 30 target regions of the mouse brain. GO term over-representation among these genes revealed co-involvement in various aspects of central nervous system development and physiology. Conclusion: Using a multi-faceted expression validation approach, we have identified mouse genes whose human orthologs are good candidates for design of mini-promoters. These mouse genes represent molecular markers in several discrete brain regions/cell-types, which could potentially provide a mechanistic explanation of unique functions performed by each region. This set of markers may also serve as a resource for further studies of gene regulatory elements influencing brain expression. PMID:18625066
Identification of a set of genes showing regionally enriched expression in the mouse brain.
D'Souza, Cletus A; Chopra, Vikramjit; Varhol, Richard; Xie, Yuan-Yun; Bohacec, Slavita; Zhao, Yongjun; Lee, Lisa L C; Bilenky, Mikhail; Portales-Casamar, Elodie; He, An; Wasserman, Wyeth W; Goldowitz, Daniel; Marra, Marco A; Holt, Robert A; Simpson, Elizabeth M; Jones, Steven J M
2008-07-14
The Pleiades Promoter Project aims to improve gene therapy by designing human mini-promoters (< 4 kb) that drive gene expression in specific brain regions or cell-types of therapeutic interest. Our goal was to first identify genes displaying regionally enriched expression in the mouse brain so that promoters designed from orthologous human genes can then be tested to drive reporter expression in a similar pattern in the mouse brain. We have utilized LongSAGE to identify regionally enriched transcripts in the adult mouse brain. As supplemental strategies, we also performed a meta-analysis of published literature and inspected the Allen Brain Atlas in situ hybridization data. From a set of approximately 30,000 mouse genes, 237 were identified as showing specific or enriched expression in 30 target regions of the mouse brain. GO term over-representation among these genes revealed co-involvement in various aspects of central nervous system development and physiology. Using a multi-faceted expression validation approach, we have identified mouse genes whose human orthologs are good candidates for design of mini-promoters. These mouse genes represent molecular markers in several discrete brain regions/cell-types, which could potentially provide a mechanistic explanation of unique functions performed by each region. This set of markers may also serve as a resource for further studies of gene regulatory elements influencing brain expression.
Marcolino-Gomes, Juliana; Rodrigues, Fabiana Aparecida; Fuganti-Pagliarini, Renata; Nakayama, Thiago Jonas; Ribeiro Reis, Rafaela; Bouças Farias, Jose Renato; Harmon, Frank G; Correa Molinari, Hugo Bruno; Correa Molinari, Mayla Daiane; Nepomuceno, Alexandre
2015-01-01
The soybean transcriptome displays strong variation along the day in optimal growth conditions and also in response to adverse circumstances, like drought stress. However, no study conducted to date has presented suitable reference genes, with stable expression along the day, for relative gene expression quantification in combined studies on drought stress and diurnal oscillations. Recently, water deficit responses have been associated with circadian clock oscillations at the transcription level, revealing the existence of hitherto unknown processes and increasing the demand for studies on plant responses to drought stress and its oscillation during the day. We performed data mining from a transcriptome-wide background using microarrays and RNA-seq databases to select an unpublished set of candidate reference genes, specifically chosen for the normalization of gene expression in studies on soybean under both drought stress and diurnal oscillations. Experimental validation and stability analysis in soybean plants submitted to drought stress and sampled during a 24 h timecourse showed that four of these newer reference genes (FYVE, NUDIX, Golgin-84 and CYST) indeed exhibited greater expression stability than the conventionally used housekeeping genes (ELF1-β and β-actin) under these conditions. We also demonstrated the effect of using reference candidate genes with different stability values to normalize the relative expression data from a drought-inducible soybean gene (DREB5) evaluated in different periods of the day.
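Reference genes of the kind validated in this abstract are typically used to normalize qPCR readings before expression is compared between conditions. The abstract does not name the quantification formula, so the sketch below assumes the common 2^(-ΔΔCt) calculation with 100% amplification efficiency; the function name `relative_expression` and all Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(target_ct_treated, ref_cts_treated,
                        target_ct_control, ref_cts_control):
    """Fold change of a target gene by the 2^(-ddCt) calculation.

    Each ref_cts_* argument is a list of Ct values, one per reference gene.
    Averaging Ct values corresponds to taking the geometric mean of the
    underlying quantities. Assumes 100% amplification efficiency.
    """
    delta_treated = target_ct_treated - mean(ref_cts_treated)
    delta_control = target_ct_control - mean(ref_cts_control)
    return 2 ** -(delta_treated - delta_control)

# Hypothetical Ct values for a drought-inducible target and two reference genes.
fold = relative_expression(
    target_ct_treated=22.0, ref_cts_treated=[20.1, 19.9],
    target_ct_control=25.0, ref_cts_control=[20.0, 20.0],
)
print(f"fold change under stress: {fold:.1f}")
```

If a reference gene itself drifts over the day or under drought, the shift enters both delta terms unequally and distorts the fold change, which is the motivation for selecting references with stable expression under the exact conditions studied.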
D'Addabbo, Annarita; Palmieri, Orazio; Maglietta, Rosalia; Latiano, Anna; Mukherjee, Sayan; Annese, Vito; Ancona, Nicola
2011-08-01
A meta-analysis re-analysed previous genome-wide association scans, definitively confirming eleven genes and identifying 21 new loci. However, the identified genes/loci still explain only a minority of the genetic predisposition to Crohn's disease. We aimed to identify genes weakly involved in disease predisposition by analysing chromosomal regions enriched in single nucleotide polymorphisms with modest statistical association. We used the WTCCC data set, evaluating 1748 CD cases and 2938 controls. The identification of candidate genes/loci was performed by a two-step procedure: first, chromosomal regions enriched in weak association signals were localized; subsequently, weak signals clustered in gene regions were identified. Statistical significance was assessed by nonparametric permutation tests. The cytoband enrichment analysis highlighted 44 regions (P≤0.05) enriched with single nucleotide polymorphisms significantly associated with the trait, including 23 out of 31 previously confirmed and replicated genes. Importantly, we highlight a further 20 novel chromosomal regions carrying approximately one hundred genes/loci with modest association. Amongst these we find compelling functional candidate genes such as MAPT, GRB2 and CREM, LCT, and IL12RB2. Our study suggests a different statistical perspective to discover genes weakly associated with a given trait, although further confirmatory functional studies are needed. Copyright © 2011 Editrice Gastroenterologica Italiana S.r.l. All rights reserved.
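The region-level enrichment idea in this abstract can be sketched with a simple label-permutation test. The version below is a simplified illustration, not the authors' procedure: it counts SNPs at or below a P-value threshold inside a region and compares that count with counts obtained after shuffling the region labels. It ignores linkage disequilibrium between neighbouring SNPs, and the function name, threshold and toy data are all assumptions.

```python
import random

def region_enrichment_pvalue(snp_pvalues, in_region, threshold=0.05,
                             n_perm=2000, seed=0):
    """Permutation P-value for an excess of weak signals inside a region.

    snp_pvalues: association P-values, one per SNP.
    in_region:   booleans of the same length, True if the SNP lies in the region.
    Returns (observed_count, permutation_p_value).
    """
    rng = random.Random(seed)
    signals = [p <= threshold for p in snp_pvalues]
    observed = sum(1 for s, r in zip(signals, in_region) if s and r)
    labels = list(in_region)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between signal and location
        permuted = sum(1 for s, r in zip(signals, labels) if s and r)
        if permuted >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)

# Toy data: 20 SNPs, the first 5 lie in the region and 4 of those show a signal.
pvals = [0.01, 0.02, 0.03, 0.04, 0.50] + [0.02] + [0.60] * 14
region = [True] * 5 + [False] * 15
observed, p_perm = region_enrichment_pvalue(pvals, region)
print(observed, p_perm)
```

A production analysis would permute phenotype labels rather than SNP positions so that the correlation structure among SNPs is preserved.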
Xie, Dongwei; Dai, Zhigang; Yang, Zemao; Sun, Jian; Zhao, Debao; Yang, Xue; Zhang, Liguo; Tang, Qing; Su, Jianguang
2018-01-01
Flax (Linum usitatissimum L.) is an important cash crop, and its agronomic traits directly affect yield and quality. Molecular studies on flax remain inadequate because relatively few flax genes have been associated with agronomic traits or have been identified as having potential applications. To identify markers and candidate genes that can potentially be used for genetic improvement of crucial agronomic traits, we examined 224 specimens of core flax germplasm; specifically, phenotypic data for key traits, including plant height, technical length, number of branches, number of fruits, and 1000-grain weight were investigated under three environmental conditions before specific-locus amplified fragment sequencing (SLAF-seq) was employed to perform a genome-wide association study (GWAS) for these five agronomic traits. Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the important agronomic traits. Our analyses identified a total of 42 SNP loci that showed significant correlations with the five important agronomic flax traits. Next, candidate genes were screened in the 10 kb zone of each of the 42 SNP loci. These SNP loci were then analyzed by a more stringent screening via co-identification using both a general linear model (GLM) and a mixed linear model (MLM) as well as co-occurrences in at least two of the three environments, whereby 15 final candidate genes were obtained. Based on these results, we determined that UGT and PL are candidate genes for plant height, GRAS and XTH are candidate genes for the number of branches, Contig1437 and LU0019C12 are candidate genes for the number of fruits, and PHO1 is a candidate gene for the 1000-seed weight. We propose that the identified SNP loci and corresponding candidate genes might serve as a biological basis for improving crucial agronomic flax traits. PMID:29375606
Xie, Dongwei; Dai, Zhigang; Yang, Zemao; Sun, Jian; Zhao, Debao; Yang, Xue; Zhang, Liguo; Tang, Qing; Su, Jianguang
2017-01-01
Flax ( Linum usitatissimum L.) is an important cash crop, and its agronomic traits directly affect yield and quality. Molecular studies on flax remain inadequate because relatively few flax genes have been associated with agronomic traits or have been identified as having potential applications. To identify markers and candidate genes that can potentially be used for genetic improvement of crucial agronomic traits, we examined 224 specimens of core flax germplasm; specifically, phenotypic data for key traits, including plant height, technical length, number of branches, number of fruits, and 1000-grain weight were investigated under three environmental conditions before specific-locus amplified fragment sequencing (SLAF-seq) was employed to perform a genome-wide association study (GWAS) for these five agronomic traits. Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the important agronomic traits. Our analyses identified a total of 42 SNP loci that showed significant correlations with the five important agronomic flax traits. Next, candidate genes were screened in the 10 kb zone of each of the 42 SNP loci. These SNP loci were then analyzed by a more stringent screening via co-identification using both a general linear model (GLM) and a mixed linear model (MLM) as well as co-occurrences in at least two of the three environments, whereby 15 final candidate genes were obtained. Based on these results, we determined that UGT and PL are candidate genes for plant height, GRAS and XTH are candidate genes for the number of branches, Contig1437 and LU0019C12 are candidate genes for the number of fruits, and PHO1 is a candidate gene for the 1000-seed weight. We propose that the identified SNP loci and corresponding candidate genes might serve as a biological basis for improving crucial agronomic flax traits.
Talke, Ina N; Hanikenne, Marc; Krämer, Ute
2006-09-01
The metal hyperaccumulator Arabidopsis halleri exhibits naturally selected zinc (Zn) and cadmium (Cd) hypertolerance and accumulates extraordinarily high Zn concentrations in its leaves. With these extreme physiological traits, A. halleri phylogenetically belongs to the sister clade of Arabidopsis thaliana. Using a combination of genome-wide cross species microarray analysis and real-time reverse transcription-PCR, a set of candidate genes is identified for Zn hyperaccumulation, Zn and Cd hypertolerance, and the adjustment of micronutrient homeostasis in A. halleri. Eighteen putative metal homeostasis genes are newly identified to be more highly expressed in A. halleri than in A. thaliana, and 11 previously identified candidate genes are confirmed. The encoded proteins include HMA4, known to contribute to root-shoot transport of Zn in A. thaliana. Expression of either AtHMA4 or AhHMA4 confers cellular Zn and Cd tolerance to yeast (Saccharomyces cerevisiae). Among further newly implicated proteins are IRT3 and ZIP10, which have been proposed to contribute to cytoplasmic Zn influx, and FRD3 required for iron partitioning in A. thaliana. In A. halleri, the presence of more than a single genomic copy is a hallmark of several highly expressed candidate genes with possible roles in metal hyperaccumulation and metal hypertolerance. Both A. halleri and A. thaliana exert tight regulatory control over Zn homeostasis at the transcript level. Zn hyperaccumulation in A. halleri involves enhanced partitioning of Zn from roots into shoots. The transcriptional regulation of marker genes suggests that in the steady state, A. halleri roots, but not the shoots, act as physiologically Zn deficient under conditions of moderate Zn supply.
Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N
2013-06-03
Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
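The assembly statistics quoted here rely on N50: the length of the contig or scaffold at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size. A minimal Python sketch of that definition follows; the input lengths are hypothetical and unrelated to the Matina 1-6 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest; the N50 is the length of
    the sequence at which the cumulative total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one sequence")

# Hypothetical contig lengths in kbp.
print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends only on the size distribution, a larger value signals better contiguity but says nothing about correctness, which is why gap percentage is reported alongside it.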
Bime, Christian; Pouladi, Nima; Sammani, Saad; Batai, Ken; Casanova, Nancy; Zhou, Tong; Kempf, Carrie L; Sun, Xiaoguang; Camp, Sara M; Wang, Ting; Kittles, Rick A; Lussier, Yves A; Jones, Tiffanie K; Reilly, John P; Meyer, Nuala J; Christie, Jason D; Karnes, Jason H; Gonzalez-Garay, Manuel; Christiani, David C; Yates, Charles R; Wurfel, Mark M; Meduri, Gianfranco U; Garcia, Joe G N
2018-06-01
Genetic factors are involved in acute respiratory distress syndrome (ARDS) susceptibility. Identification of novel candidate genes associated with increased risk and severity will improve our understanding of ARDS pathophysiology and enhance efforts to develop novel preventive and therapeutic approaches. We aimed to identify genetic susceptibility targets for ARDS. A genome-wide association study was performed on 232 African American patients with ARDS and 162 at-risk control subjects. The Identify Candidate Causal SNPs and Pathways platform was used to infer the association of known gene sets with the top prioritized intragenic SNPs. Preclinical validation of SELPLG (selectin P ligand gene) was performed using mouse models of LPS- and ventilator-induced lung injury. Exonic variation within SELPLG distinguishing patients with ARDS from sepsis control subjects was confirmed in an independent cohort. Pathway prioritization analysis identified a nonsynonymous coding SNP (rs2228315) within SELPLG, encoding P-selectin glycoprotein ligand 1, to be associated with increased susceptibility. In an independent cohort, two exonic SELPLG SNPs were significantly associated with ARDS susceptibility. Additional support for SELPLG as an ARDS candidate gene was derived from preclinical ARDS models where SELPLG gene expression in lung tissues was significantly increased in both ventilator-induced (twofold increase) and LPS-induced (5.7-fold increase) murine lung injury models compared with controls. Furthermore, Selplg -/- mice exhibited significantly reduced LPS-induced inflammatory lung injury compared with wild-type C57/B6 mice. Finally, an antibody that neutralizes P-selectin glycoprotein ligand 1 significantly attenuated LPS-induced lung inflammation. These findings identify SELPLG as a novel ARDS susceptibility gene among individuals of European and African descent.
2013-01-01
Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions: We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509
NASA Astrophysics Data System (ADS)
Devanna, Paolo; Vernes, Sonja C.
2014-02-01
Retinoic acid-related orphan receptor alpha gene (RORa) and the microRNA MIR137 have both recently been identified as novel candidate genes for neuropsychiatric disorders. RORa encodes a ligand-dependent orphan nuclear receptor that acts as a transcriptional regulator and miR-137 is a brain enriched small non-coding RNA that interacts with gene transcripts to control protein levels. Given the mounting evidence for RORa in autism spectrum disorders (ASD) and MIR137 in schizophrenia and ASD, we investigated if there was a functional biological relationship between these two genes. Herein, we demonstrate that miR-137 targets the 3'UTR of RORa in a site specific manner. We also provide further support for MIR137 as an autism candidate by showing that a large number of previously implicated autism genes are also putatively targeted by miR-137. This work supports the role of MIR137 as an ASD candidate and demonstrates a direct biological link between these previously unrelated autism candidate genes.
Candidate genes and molecular markers associated with heat tolerance in colonial Bentgrass.
Jespersen, David; Belanger, Faith C; Huang, Bingru
2017-01-01
Elevated temperature is a major abiotic stress limiting the growth of cool-season grasses during the summer months. The objectives of this study were to determine the genetic variation in the expression patterns of selected genes involved in several major metabolic pathways regulating heat tolerance for two genotypes contrasting in heat tolerance to confirm their status as potential candidate genes, and to identify PCR-based markers associated with candidate genes related to heat tolerance in a colonial (Agrostis capillaris L.) x creeping bentgrass (Agrostis stolonifera L.) hybrid backcross population. Plants were subjected to heat stress in controlled-environmental growth chambers for phenotypic evaluation and determination of genetic variation in candidate gene expression. Molecular markers were developed for genes involved in protein degradation (cysteine protease), antioxidant defense (catalase and glutathione-S-transferase), energy metabolism (glyceraldehyde-3-phosphate dehydrogenase), cell expansion (expansin), and stress protection (heat shock proteins HSP26, HSP70, and HSP101). Kruskal-Wallis analysis, a commonly used non-parametric test used to compare population individuals with or without the gene marker, found the physiological traits of chlorophyll content, electrolyte leakage, normalized difference vegetative index, and turf quality were associated with all candidate gene markers with the exception of HSP101. Differential gene expression was frequently found for the tested candidate genes. The development of candidate gene markers for important heat tolerance genes may allow for the development of new cultivars with increased abiotic stress tolerance using marker-assisted selection.
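The Kruskal-Wallis comparison described in this abstract, individuals with versus without a marker, can be sketched in plain Python. The version below handles exactly two groups, uses average ranks with the standard tie correction, and obtains the P-value from the chi-square distribution with one degree of freedom. The function name and the trait values are hypothetical, and for small groups the chi-square approximation is only rough.

```python
import math
from collections import Counter

def kruskal_wallis_two_groups(with_marker, without_marker):
    """Kruskal-Wallis H statistic and approximate P-value for two groups.

    Tied values receive the average of the ranks they span. The inputs must
    contain at least two distinct values overall.
    """
    groups = [list(with_marker), list(without_marker)]
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    counts = Counter(pooled)
    first_position = {}
    for position, value in enumerate(pooled, start=1):
        first_position.setdefault(value, position)
    # Average rank of a value occupying `count` consecutive positions.
    rank = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}
    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    h = max(h / tie_correction, 0.0)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value

# Hypothetical chlorophyll readings for plants with and without a marker.
h, p = kruskal_wallis_two_groups([14.2, 15.1, 16.0, 15.6], [11.8, 12.5, 13.9, 12.1])
print(f"H = {h:.3f}, p = {p:.4f}")
```

With more than two marker classes the same H statistic applies, but the reference distribution has one fewer degree of freedom than the number of groups, so the one-line P-value above would no longer hold.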
Candidate genes and molecular markers associated with heat tolerance in colonial Bentgrass
Jespersen, David; Belanger, Faith C.; Huang, Bingru
2017-01-01
Elevated temperature is a major abiotic stress limiting the growth of cool-season grasses during the summer months. The objectives of this study were to determine the genetic variation in the expression patterns of selected genes involved in several major metabolic pathways regulating heat tolerance for two genotypes contrasting in heat tolerance to confirm their status as potential candidate genes, and to identify PCR-based markers associated with candidate genes related to heat tolerance in a colonial (Agrostis capillaris L.) x creeping bentgrass (Agrostis stolonifera L.) hybrid backcross population. Plants were subjected to heat stress in controlled-environmental growth chambers for phenotypic evaluation and determination of genetic variation in candidate gene expression. Molecular markers were developed for genes involved in protein degradation (cysteine protease), antioxidant defense (catalase and glutathione-S-transferase), energy metabolism (glyceraldehyde-3-phosphate dehydrogenase), cell expansion (expansin), and stress protection (heat shock proteins HSP26, HSP70, and HSP101). Kruskal-Wallis analysis, a commonly used non-parametric test used to compare population individuals with or without the gene marker, found the physiological traits of chlorophyll content, electrolyte leakage, normalized difference vegetative index, and turf quality were associated with all candidate gene markers with the exception of HSP101. Differential gene expression was frequently found for the tested candidate genes. The development of candidate gene markers for important heat tolerance genes may allow for the development of new cultivars with increased abiotic stress tolerance using marker-assisted selection. PMID:28187136
Association Studies of 22 Candidate SNPs with Late-Onset Alzheimer's Disease
Figgins, Jessica A.; Minster, Ryan L.; Demirci, F. Yesim; DeKosky, Steven T.; Kamboh, M. Ilyas
2009-01-01
Alzheimer's disease (AD) is a complex and multifactorial disease with the possible involvement of several genes. With the exception of the APOE gene as a susceptibility marker, no other genes have been shown consistently to be associated with late-onset AD (LOAD). A recent genome-wide association study of 17,343 gene-based putative functional single nucleotide polymorphisms (SNPs) found 19 significant variants, including 3 linked to APOE, showing association with LOAD (Hum Mol Genet 2007; 16:865–873). We have set out to replicate the 16 new significant associations in a large case-control cohort of American Whites. Additionally, we examined six variants present in positional and/or biological candidate genes for AD. We genotyped the 22 SNPs in up to 1,009 Caucasian Americans with LOAD and up to 1,010 age-matched healthy Caucasian Americans, using 5′ nuclease assays. We did not observe a statistically significant association between the SNPs and the risk of AD, either individually or stratified by APOE. Our data suggest that the association of the studied variants with LOAD risk, if it exists, is not statistically significant in our sample. PMID:18780302
Benitez, Cecil M.; Qu, Kun; Sugiyama, Takuya; Pauerstein, Philip T.; Liu, Yinghua; Tsai, Jennifer; Gu, Xueying; Ghodasara, Amar; Arda, H. Efsun; Zhang, Jiajing; Dekker, Joseph D.; Tucker, Haley O.; Chang, Howard Y.; Kim, Seung K.
2014-01-01
The regulatory logic underlying global transcriptional programs controlling development of visceral organs like the pancreas remains undiscovered. Here, we profiled gene expression in 12 purified populations of fetal and adult pancreatic epithelial cells representing crucial progenitor cell subsets, and their endocrine or exocrine progeny. Using probabilistic models to decode the general programs organizing gene expression, we identified co-expressed gene sets in cell subsets that revealed patterns and processes governing progenitor cell development, lineage specification, and endocrine cell maturation. Purification of Neurog3 mutant cells and module network analysis linked established regulators such as Neurog3 to unrecognized gene targets and roles in pancreas development. Iterative module network analysis nominated and prioritized transcriptional regulators, including diabetes risk genes. Functional validation of a subset of candidate regulators with corresponding mutant mice revealed that the transcription factors Etv1, Prdm16, Runx1t1 and Bcl11a are essential for pancreas development. Our integrated approach provides a unique framework for identifying regulatory genes and functional gene sets underlying pancreas development and associated diseases such as diabetes mellitus. PMID:25330008
Ramayo-Caldas, Yuliaxis; Renand, Gilles; Ballester, Maria; Saintilan, Romain; Rocha, Dominique
2016-04-23
Studies to identify markers associated with beef tenderness have focused on Warner-Bratzler shear force (WBSF) but the interplay between the genes associated with WBSF has not been explored. We used the association weight matrix (AWM), a systems biology approach, to identify a set of interacting genes that are co-associated with tenderness and other meat quality traits, and shared across the Charolaise, Limousine and Blonde d'Aquitaine beef cattle breeds. Genome-wide association studies were performed using ~500K single nucleotide polymorphisms (SNPs) and 17 phenotypes measured on more than 1000 animals for each breed. First, this multi-trait approach was applied separately for each breed across 17 phenotypes and second, between- and across-breed comparisons at the AWM and functional levels were performed. Genetic heterogeneity was observed, and most of the variants that were associated with WBSF segregated within rather than across breeds. We identified 206 common candidate genes associated with WBSF across the three breeds. SNPs in these common genes explained between 28 and 30 % of the phenotypic variance for WBSF. A reduced number of common SNPs mapping to the 206 common genes were identified, suggesting that different mutations may target the same genes in a breed-specific manner. Therefore, it is likely that, depending on allele frequencies and linkage disequilibrium patterns, a SNP that is identified for one breed may not be informative for another unrelated breed. Well-known candidate genes affecting beef tenderness were identified. In addition, some of the 206 common genes are located within previously reported quantitative trait loci for WBSF in several cattle breeds. Moreover, the multi-breed co-association analysis detected new candidate genes, regulators and metabolic pathways that are likely involved in the determination of meat tenderness and other meat quality traits in beef cattle. 
Our results suggest that systems biology approaches that explore associations of correlated traits increase statistical power to identify candidate genes beyond the one-dimensional approach. Further studies on the 206 common genes, their pathways, regulators and interactions will expand our knowledge on the molecular basis of meat tenderness and could lead to the discovery of functional mutations useful for genomic selection in a multi-breed beef cattle context.
Getting the most out of RNA-seq data analysis.
Khang, Tsung Fei; Lau, Ching Yee
2015-01-01
Background. A common research goal in transcriptome projects is to find genes that are differentially expressed in different phenotype classes. Biologists might wish to validate such gene candidates experimentally, or use them for downstream systems biology analysis. Producing a coherent differential gene expression analysis from RNA-seq count data requires an understanding of how numerous sources of variation such as the replicate size, the hypothesized biological effect size, and the specific method for making differential expression calls interact. We believe an explicit demonstration of such interactions in real RNA-seq data sets is of practical interest to biologists. Results. Using two large public RNA-seq data sets, one representing a strong and the other a mild biological effect size, we simulated different replicate size scenarios, and tested the performance of several commonly used methods for calling differentially expressed genes in each of them. We found that, when biological effect size was mild, RNA-seq experiments should focus on experimental validation of differentially expressed gene candidates. Importantly, at least triplicates must be used, and the differentially expressed genes should be called using methods with high positive predictive value (PPV), such as NOISeq or GFOLD. In contrast, when biological effect size was strong, differentially expressed genes mined from unreplicated experiments using NOISeq, ASC and GFOLD had between 30 and 50% mean PPV, an increase of more than 30-fold compared to the cases of mild biological effect size. Among methods with good PPV performance, having triplicates or more substantially improved mean PPV to over 90% for GFOLD, 60% for DESeq2, 50% for NOISeq, and 30% for edgeR. At a replicate size of six, we found DESeq2 and edgeR to be reasonable methods for calling differentially expressed genes at systems level analysis, as their PPV and sensitivity trade-off were superior to the other methods'. Conclusion.
When biological effect size is weak, systems level investigation is not possible using RNAseq data, and no meaningful result can be obtained in unreplicated experiments. Nonetheless, NOISeq or GFOLD may yield limited numbers of gene candidates with good validation potential, when triplicates or more are available. When biological effect size is strong, NOISeq and GFOLD are effective tools for detecting differentially expressed genes in unreplicated RNA-seq experiments for qPCR validation. When triplicates or more are available, GFOLD is a sharp tool for identifying high confidence differentially expressed genes for targeted qPCR validation; for downstream systems level analysis, combined results from DESeq2 and edgeR are useful.
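The abstract above compares methods by positive predictive value (PPV) and sensitivity. A minimal sketch of how these two quantities are computed from a set of called genes and a reference set follows. The gene identifiers and the function name are hypothetical, and the sketch assumes a trusted reference list of truly differentially expressed genes is available, which in practice must itself be estimated.

```python
def ppv_and_sensitivity(called, truth):
    """Return (PPV, sensitivity) for called genes against a reference set.

    PPV = true positives / all called genes.
    Sensitivity = true positives / all truly differentially expressed genes.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    ppv = true_positives / len(called) if called else float("nan")
    sensitivity = true_positives / len(truth) if truth else float("nan")
    return ppv, sensitivity


# Hypothetical reference set and calls from one method.
truth = {"geneA", "geneB", "geneC", "geneD"}
called = {"geneA", "geneB", "geneX"}
ppv, sensitivity = ppv_and_sensitivity(called, truth)
print(ppv, sensitivity)
```

A method that calls few genes can reach a high PPV while missing most true positives, which is why the abstract distinguishes targeted validation (PPV matters most) from systems-level analysis (the PPV and sensitivity trade-off matters).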
A small number of candidate gene SNPs reveal continental ancestry in African Americans
KODAMAN, NURI; ALDRICH, MELINDA C.; SMITH, JEFFREY R.; SIGNORELLO, LISA B.; BRADLEY, KEVIN; BREYER, JOAN; COHEN, SARAH S.; LONG, JIRONG; CAI, QIUYIN; GILES, JUSTIN; BUSH, WILLIAM S.; BLOT, WILLIAM J.; MATTHEWS, CHARLES E.; WILLIAMS, SCOTT M.
2013-01-01
SUMMARY Using genetic data from an obesity candidate gene study of self-reported African Americans and European Americans, we investigated the number of Ancestry Informative Markers (AIMs) and candidate gene SNPs necessary to infer continental ancestry. Proportions of African and European ancestry were assessed with STRUCTURE (K=2), using 276 AIMs. These reference values were compared to estimates derived using 120, 60, 30, and 15 SNP subsets randomly chosen from the 276 AIMs and from 1144 SNPs in 44 candidate genes. All subsets generated estimates of ancestry consistent with the reference estimates, with mean correlations greater than 0.99 for all subsets of AIMs, and mean correlations of 0.99±0.003; 0.98± 0.01; 0.93±0.03; and 0.81± 0.11 for subsets of 120, 60, 30, and 15 candidate gene SNPs, respectively. Among African Americans, the median absolute difference from reference African ancestry values ranged from 0.01 to 0.03 for the four AIMs subsets and from 0.03 to 0.09 for the four candidate gene SNP subsets. Furthermore, YRI/CEU Fst values provided a metric to predict the performance of candidate gene SNPs. Our results demonstrate that a small number of SNPs randomly selected from candidate genes can be used to estimate admixture proportions in African Americans reliably. PMID:23278390
Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd
2013-01-01
The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome.
Our work can serve as a basis for future functional studies of transcriptional regulator genes and genomic regulatory elements in Chlamydomonas. PMID:24224019
EnRICH: Extraction and Ranking using Integration and Criteria Heuristics.
Zhang, Xia; Greenlee, M Heather West; Serb, Jeanne M
2013-01-15
High throughput screening technologies enable biologists to generate candidate genes at a rate that, due to time and cost constraints, cannot be studied by experimental approaches in the laboratory. Thus, it has become increasingly important to prioritize candidate genes for experiments. To accomplish this, researchers need to apply selection requirements based on their knowledge, which necessitates qualitative integration of heterogeneous data sources and filtration using multiple criteria. A similar approach can also be applied to putative candidate gene relationships. While automation can assist in this routine and imperative procedure, flexibility of data sources and criteria must not be sacrificed. A tool that can optimize the trade-off between automation and flexibility to simultaneously filter and qualitatively integrate data is needed to prioritize candidate genes and generate composite networks from heterogeneous data sources. We developed the java application, EnRICH (Extraction and Ranking using Integration and Criteria Heuristics), in order to alleviate this need. Here we present a case study in which we used EnRICH to integrate and filter multiple candidate gene lists in order to identify potential retinal disease genes. As a result of this procedure, a candidate pool of several hundred genes was narrowed down to five candidate genes, of which four are confirmed retinal disease genes and one is associated with a retinal disease state. We developed a platform-independent tool that is able to qualitatively integrate multiple heterogeneous datasets and use different selection criteria to filter each of them, provided the datasets are tables that have distinct identifiers (required) and attributes (optional). With the flexibility to specify data sources and filtering criteria, EnRICH automatically prioritizes candidate genes or gene relationships for biologists based on their specific requirements. 
Here, we also demonstrate that this tool can be effectively and easily used to apply highly specific user-defined criteria and can efficiently identify high quality candidate genes from relatively sparse datasets.
Degrees of separation as a statistical tool for evaluating candidate genes.
Nelson, Ronald M; Pettersson, Mats E
2014-12-01
Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, it can cause misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes to currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
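The degree-of-separation idea in this abstract can be illustrated with a breadth-first search over a gene network. The sketch below is not the CandidateBacon package or its API; the toy network, gene labels and function names are assumptions made purely for illustration. It computes the shortest path length between a candidate pair and the average over all reachable pairs, which serves as the empirical baseline against which a candidate pair's connectedness can be judged.

```python
from collections import deque
from itertools import combinations


def degrees_of_separation(network, source, target):
    """Shortest path length in edges between two genes, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in network.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def average_dos(network, genes):
    """Mean degree of separation over all reachable gene pairs."""
    distances = [degrees_of_separation(network, a, b)
                 for a, b in combinations(genes, 2)]
    reachable = [d for d in distances if d is not None]
    return sum(reachable) / len(reachable) if reachable else None


# Hypothetical undirected network stored as adjacency sets: G1-G2-G3-G4.
toy_network = {
    "G1": {"G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4"},
    "G4": {"G3"},
}
network_mean = average_dos(toy_network, sorted(toy_network))
candidate_pair = degrees_of_separation(toy_network, "G1", "G2")
print(candidate_pair, round(network_mean, 3))
```

A candidate pair that is no closer than the network average gains little support from the network, which is the kind of architecture-driven artefact the abstract warns about.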
Maver, Ales; Medica, Igor; Peterlin, Borut
2009-12-01
The search for gene candidates in multifactorial diseases such as sarcoidosis can be based on the integration of linkage association data, gene expression data, and protein profile data from genomic, transcriptomic and proteomic studies, respectively. In this study we performed a literature-based search for studies reporting such data, followed by integration of the collected information. Four databases were examined: Medline, HuGE Navigator, ArrayExpress and Gene Expression Omnibus (GEO). Candidate genes were defined as genes which were reported in at least 2 different types of omics studies. Genes previously investigated in sarcoidosis were excluded from further analyses. We identified 177 genes associated with sarcoidosis as potential new candidate genes. Subsequently, 9 gene candidates identified to overlap in 2 different types of studies (genomic, transcriptomic and/or proteomic) were consistently reported in at least 3 studies: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214. These genes are involved in regulation of immune response, cellular proliferation, apoptosis, inhibition of protease activity, and lipid metabolism. The exact biological functions of HBEGF, LRIG1, PTPN23, DPM2 and NUP214 remain to be completely elucidated. We propose 9 candidate genes: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214, as genes with high potential for association with sarcoidosis.
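The selection rule in this abstract (genes reported in at least two different types of omics studies, excluding genes already investigated) can be sketched as a simple grouping step. The gene labels, report list and function name below are hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict


def genes_in_multiple_study_types(reports, min_types=2, exclude=()):
    """Return genes reported in at least `min_types` distinct study types.

    `reports` is an iterable of (gene, study_type) pairs; genes listed in
    `exclude` (for example, those already investigated) are dropped.
    """
    types_by_gene = defaultdict(set)
    for gene, study_type in reports:
        types_by_gene[gene].add(study_type)
    excluded = set(exclude)
    return sorted(
        gene for gene, types in types_by_gene.items()
        if len(types) >= min_types and gene not in excluded
    )


# Hypothetical literature reports.
reports = [
    ("GENE1", "genomic"),
    ("GENE1", "transcriptomic"),
    ("GENE2", "transcriptomic"),
    ("GENE2", "transcriptomic"),  # same study type twice counts once
    ("GENE3", "proteomic"),
    ("GENE3", "genomic"),
    ("GENE3", "transcriptomic"),
]
candidates = genes_in_multiple_study_types(reports, exclude={"GENE3"})
print(candidates)
```

Counting distinct study types rather than raw reports keeps a gene that is repeatedly found by one technology from qualifying on volume alone.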
Torres, Katherine J.; Castrillon, Carlos E.; Moss, Eli L.; Saito, Mayuko; Tenorio, Roy; Molina, Douglas M.; Davies, Huw; Neafsey, Daniel E.; Felgner, Philip; Vinetz, Joseph M.; Gamboa, Dionicia
2015-01-01
Background. Persons with blood-stage Plasmodium falciparum parasitemia in the absence of symptoms are considered to be clinically immune. We hypothesized that asymptomatic subjects with P. falciparum parasitemia would differentially recognize a subset of P. falciparum proteins on a genomic scale. Methods and Findings. Compared with symptomatic subjects, sera from clinically immune, asymptomatically infected individuals differentially recognized 51 P. falciparum proteins, including the established vaccine candidate PfMSP1. Novel, hitherto unstudied hypothetical proteins and other proteins not previously recognized as potential vaccine candidates were also differentially recognized. Genes encoding the proteins differentially recognized by the Peruvian clinically immune individuals exhibited a significant enrichment of nonsynonymous nucleotide variation, an observation consistent with these genes undergoing immune selection. Conclusions. A limited set of P. falciparum protein antigens was associated with the development of naturally acquired clinical immunity in the low-transmission setting of the Peruvian Amazon. These results imply that, even in a low-transmission setting, an asexual blood-stage vaccine designed to reduce clinical malaria symptoms will likely need to contain large numbers of often-polymorphic proteins, a finding at odds with many current efforts in the design of vaccines against asexual blood-stage P. falciparum. PMID:25381370
Fekete, Tibor; Rásó, Erzsébet; Pete, Imre; Tegze, Bálint; Liko, István; Munkácsy, Gyöngyi; Sipos, Norbert; Rigó, János; Györffy, Balázs
2012-07-01
Transcriptomic analysis of global gene expression in ovarian carcinoma can identify dysregulated genes capable of serving as molecular markers for histology subtypes and survival. The aim of our study was to validate previous candidate signatures in an independent setting and to identify single genes capable of serving as biomarkers for ovarian cancer progression. As several datasets are available in the GEO today, we were able to perform a true meta-analysis. First, 829 samples (11 datasets) were downloaded, and the predictive power of 16 previously published gene sets was assessed. Of these, eight were capable of discriminating histology subtypes, and none was capable of predicting survival. To overcome the differences in previous studies, we used the 829 samples to identify new predictors. Then, we collected 64 ovarian cancer samples (median relapse-free survival 24.5 months) and performed TaqMan Real Time Polymerase Chain Reaction (RT-PCR) analysis for the best 40 genes associated with histology subtypes and survival. Over 90% of subtype-associated genes were confirmed. Overall survival was effectively predicted by hormone receptors (PGR and ESR2) and by TSPAN8. Relapse-free survival was predicted by MAPT and SNCG. In summary, we successfully validated several gene sets in a meta-analysis in large datasets of ovarian samples. Additionally, several individual genes identified were validated in a clinical cohort. Copyright © 2011 UICC.
Li, Mengmeng; Rao, Man; Chen, Kai; Zhou, Jianye; Song, Jiangping
2017-07-15
Real-time quantitative reverse transcriptase-PCR (qRT-PCR) is a feasible tool for determining gene expression profiles, but the accuracy and reliability of the results depend on the stable expression of selected housekeeping genes in different samples. So far, research on stable housekeeping genes in human heart failure samples is rare. Moreover, the effect of heart failure on the expression of housekeeping genes in the right and left ventricles is yet to be studied. Therefore, we aimed to provide stable housekeeping genes for both ventricles in heart failure and normal heart samples. In this study, we selected seven commonly used housekeeping genes as candidates. Using qRT-PCR, the expression levels of ACTB, RAB7A, GAPDH, REEP5, RPL5, PSMB4 and VCP in eight heart failure and four normal heart samples were assessed. The stability of candidate housekeeping genes was evaluated with the geNorm and NormFinder software. GAPDH showed the least variation in all heart samples. Results also indicated that gene expression differed between the left and right ventricles in heart failure. GAPDH had the highest expression stability in both heart failure and normal heart samples. We also propose using different sets of housekeeping genes for the left and right ventricles. The combination of RPL5, GAPDH and PSMB4 is suitable for the right ventricle and the combination of GAPDH, REEP5 and RAB7A is suitable for the left ventricle. Copyright © 2017 Elsevier B.V. All rights reserved.
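The stability ranking mentioned in this abstract can be illustrated with the pairwise-variation measure commonly associated with geNorm: for each gene, the mean across all other genes of the standard deviation of log2 expression ratios over samples (lower means more stable). The sketch below is an independent illustration, not the geNorm software, and the gene labels and expression values are hypothetical rather than taken from the study.

```python
import math
from statistics import stdev


def stability_m(expression):
    """Return a geNorm-style stability value M for each gene.

    `expression` maps gene name -> list of relative expression quantities,
    with all lists in the same sample order. For gene j, M_j is the mean over
    the other genes k of the sample standard deviation of log2(j / k).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[j], expression[k])]
            pairwise_sds.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical relative quantities in three samples.
expression = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],  # always twice geneA, so their ratio is constant
    "geneC": [1.0, 1.0, 8.0],  # varies independently of the other two
}
m_values = stability_m(expression)
most_stable = min(m_values, key=m_values.get)
print(m_values, most_stable)
```

Because the measure is built from ratios, two co-regulated genes can look stable together even if both change with the condition, which is one reason studies like this one cross-check with a second tool.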
The ecological and genetic basis of convergent thick-lipped phenotypes in cichlid fishes.
Colombo, Marco; Diepeveen, Eveline T; Muschick, Moritz; Santos, M Emilia; Indermaur, Adrian; Boileau, Nicolas; Barluenga, Marta; Salzburger, Walter
2013-02-01
The evolution of convergent phenotypes is one of the most interesting outcomes of replicate adaptive radiations. Remarkable cases of convergence involve the thick-lipped phenotype found across cichlid species flocks in the East African Great Lakes. Unlike most other convergent forms in cichlids, which are restricted to East Africa, the thick-lipped phenotype also occurs elsewhere, for example in the Central American Midas Cichlid assemblage. Here, we use an ecological genomic approach to study the function, the evolution and the genetic basis of this phenotype in two independent cichlid adaptive radiations on two continents. We applied phylogenetic, demographic, geometric morphometric and stomach content analyses to an African (Lobochilotes labiatus) and a Central American (Amphilophus labiatus) thick-lipped species. We found that similar morphological adaptations occur in both thick-lipped species and that the 'fleshy' lips are associated with hard-shelled prey in the form of molluscs and invertebrates. We then used comparative Illumina RNA sequencing of thick vs. normal lip tissue in East African cichlids and identified a set of 141 candidate genes that appear to be involved in the morphogenesis of this trait. A more detailed analysis of six of these genes led to three strong candidates: Actb, Cldn7 and Copb. The function of these genes can be linked to the loose connective tissue constituting the fleshy lips. Similar trends in gene expression between African and Central American thick-lipped species appear to indicate that an overlapping set of genes was independently recruited to build this particular phenotype in both lineages. © 2012 Blackwell Publishing Ltd.
LOD score exclusion analyses for candidate QTLs using random population samples.
Deng, Hong-Wen
2003-11-01
While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes as putative QTLs using random population samples. Previously, we developed an LOD score exclusion mapping approach for candidate genes for complex diseases. Here, we extend this LOD score approach for exclusion analyses of candidate genes for quantitative traits. Under this approach, specific genetic effects (as reflected by heritability) and inheritance models at candidate QTLs can be analyzed and if an LOD score is ≤ -2.0, the locus can be excluded from having a heritability larger than that specified. Simulations show that this approach has high power to exclude a candidate gene from having moderate genetic effects if it is not a QTL and is robust to population admixture. Our exclusion analysis complements association analysis for candidate genes as putative QTLs in random population samples. The approach is applied to test the importance of Vitamin D receptor (VDR) gene as a potential QTL underlying the variation of bone mass, an important determinant of osteoporosis.
Longhi, Sara; Moretto, Marco; Viola, Roberto; Velasco, Riccardo; Costa, Fabrizio
2012-02-01
Fruit ripening is a complex physiological process in plants whereby cell wall programmed changes occur mainly to promote seed dispersal. Cell wall modification also directly regulates the textural properties, a fundamental aspect of fruit quality. In this study, two full-sib populations of apple, with 'Fuji' as the common maternal parent, crossed with 'Delearly' and 'Pink Lady', were used to understand the control of fruit texture by QTL mapping and in silico gene mining. Texture was dissected with a novel high resolution phenomics strategy, simultaneously profiling both mechanical and acoustic fruit texture components. In 'Fuji × Delearly' nine linkage groups were associated with QTLs accounting from 15.6% to 49% of the total variance, and a highly significant QTL cluster for both textural components was mapped on chromosome 10 and co-located with Md-PG1, a polygalacturonase gene that, in apple, is known to be involved in cell wall metabolism processes. In addition, other candidate genes related to Md-NOR and Md-RIN transcription factors, Md-Pel (pectate lyase), and Md-ACS1 were mapped within statistical intervals. In 'Fuji × Pink Lady', a smaller set of linkage groups associated with the QTLs identified for fruit texture (15.9-34.6% variance) was observed. The analysis of the phenotypic variance over a two-dimensional PCA plot highlighted a transgressive segregation for this progeny, revealing two QTL sets distinctively related to both mechanical and acoustic texture components. The mining of the apple genome allowed the discovery of the gene inventory underlying each QTL, and functional profile assessment unravelled specific gene expression patterns of these candidate genes.
A Discovery Resource of Rare Copy Number Variations in Individuals with Autism Spectrum Disorder
Prasad, Aparna; Merico, Daniele; Thiruvahindrapuram, Bhooma; Wei, John; Lionel, Anath C.; Sato, Daisuke; Rickaby, Jessica; Lu, Chao; Szatmari, Peter; Roberts, Wendy; Fernandez, Bridget A.; Marshall, Christian R.; Hatchwell, Eli; Eis, Peggy S.; Scherer, Stephen W.
2012-01-01
The identification of rare inherited and de novo copy number variations (CNVs) in human subjects has proven a productive approach to highlight risk genes for autism spectrum disorder (ASD). A variety of microarrays are available to detect CNVs, including single-nucleotide polymorphism (SNP) arrays and comparative genomic hybridization (CGH) arrays. Here, we examine a cohort of 696 unrelated ASD cases using a high-resolution one-million feature CGH microarray, the majority of which were previously genotyped with SNP arrays. Our objective was to discover new CNVs in ASD cases that were not detected by SNP microarray analysis and to delineate novel ASD risk loci via combined analysis of CGH and SNP array data sets on the ASD cohort and CGH data on an additional 1000 control samples. Of the 615 ASD cases analyzed on both SNP and CGH arrays, we found that 13,572 of 21,346 (64%) of the CNVs were exclusively detected by the CGH array. Several of the CGH-specific CNVs are rare in population frequency and impact previously reported ASD genes (e.g., NRXN1, GRM8, DPYD), as well as novel ASD candidate genes (e.g., CIB2, DAPP1, SAE1), and all were inherited except for a de novo CNV in the GPHN gene. A functional enrichment test of gene-sets in ASD cases over controls revealed nucleotide metabolism as a potential novel pathway involved in ASD, which includes several candidate genes for follow-up (e.g., DPYD, UPB1, UPP1, TYMP). Finally, this extensively phenotyped and genotyped ASD clinical cohort serves as an invaluable resource for the next step of genome sequencing for complete genetic variation detection. PMID:23275889
Yıldırım, Kubilay; Uylaş, Senem
2016-12-01
Boron (B) is an essential nutrient for normal plant growth. Despite its low abundance in soils, it can be highly toxic to plants, especially in arid and semi-arid environments. Poplars are known to tolerate B toxicity and accumulation. However, the physiological and gene regulation responses of these trees to B toxicity have not yet been investigated. In the current study, the B accumulation and tolerance levels of black poplar clones were tested first. Rooted cuttings of these clones were treated with elevated B to select the genotype that accumulated the most B and tolerated it best. We then carried out a microarray-based transcriptome experiment on the leaves and roots of this genotype to identify the transcriptional networks, genes and molecular mechanisms behind B toxicity tolerance. The results indicated that black poplar is well suited to phytoremediation of B pollution. It could resist 15 ppm soil B content and >1500 ppm B accumulation in leaves, concentrations that are highly toxic for almost all agricultural plants. The transcriptomic results revealed a total of 1625 and 1419 altered probe sets under 15 ppm B toxicity in leaf and root tissues, respectively. The highest induction was recorded for the probe sets annotated to tyrosine aminotransferase, ATP-binding cassette transporters, glutathione S-transferases and metallochaperone proteins. The strong up-regulation of these genes was attributed to internal excretion of B into the cell vacuole and the existence of B detoxification processes in black poplar. Many other candidate genes functioning in signalling, gene regulation, antioxidation, and B uptake and transport processes were also identified in this B hyperaccumulator plant for the first time in the current study. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Schmidt, S; Pericak-Vance, M A; Sawcer, S; Barcellos, L F; Hart, J; Sims, J; Prokop, A M; van der Walt, J; DeLoa, C; Lincoln, R R; Oksenberg, J R; Compston, A; Hauser, S L; Haines, J L; Gregory, S G
2006-07-01
Discrepant findings have been reported regarding an association of the apolipoprotein E (APOE) gene with the clinical course of multiple sclerosis (MS). To resolve these discrepancies, we examined common sequence variation in six candidate genes residing in a 380-kb genomic region surrounding and including the APOE locus for an association with MS severity. We genotyped at least three polymorphisms in each of six candidate genes in 1,540 Caucasian MS families (729 single-case and multiple-case families from the United States, 811 single-case families from the UK). By applying the quantitative transmission/disequilibrium test to a recently proposed MS severity score, the only statistically significant (P=0.003) association with MS severity was found for an intronic variant in the Herpes Virus Entry Mediator-B Gene PVRL2. Additional genotyping extended the association to a 16.6 kb block spanning intron 1 to intron 2 of the gene. Sequencing of PVRL2 failed to identify variants with an obvious functional role. In conclusion, the analysis of a very large data set suggests that genetic polymorphisms in PVRL2 may influence MS severity and supports the possibility that viral factors may contribute to the clinical course of MS, consistent with previous reports.
Detection of gene communities in multi-networks reveals cancer drivers
NASA Astrophysics Data System (ADS)
Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele
2015-12-01
We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driver cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require tight coregulation of the partners, and that such fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network, we studied its partition into communities. To this end, we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Although our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer, and we identified a set of candidate driver cancer genes from the enrichment analysis of the multi-network communities. Some of them were already known oncogenes, while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications of the regulatory pattern and functional role of both the already known and the new candidate driver genes.
Xiaoqing Yu; Guihua Bai; Shuwei Liu; Na Luo; Ying Wang; Douglas S. Richmond; Paula M. Pijut; Scott A. Jackson; Jianming Yu; Yiwei Jiang
2013-01-01
Drought is a major environmental stress limiting growth of perennial grasses in temperate regions. Plant drought tolerance is a complex trait that is controlled by multiple genes. Candidate gene association mapping provides a powerful tool for dissection of complex traits. Candidate gene association mapping of drought tolerance traits was conducted in 192 diverse...
Identification of a core set of rhizobial infection genes using data from single cell-types.
Chen, Da-Song; Liu, Cheng-Wu; Roy, Sonali; Cousins, Donna; Stacey, Nicola; Murray, Jeremy D
2015-01-01
Genome-wide expression studies on nodulation have varied in their scale from entire root systems to dissected nodules or root sections containing nodule primordia (NP). More recently efforts have focused on developing methods for isolation of root hairs from infected plants and the application of laser-capture microdissection technology to nodules. Here we analyze two published data sets to identify a core set of infection genes that are expressed in the nodule and in root hairs during infection. Among the genes identified were those encoding phenylpropanoid biosynthesis enzymes including Chalcone-O-Methyltransferase which is required for the production of the potent Nod gene inducer 4',4-dihydroxy-2-methoxychalcone. A promoter-GUS analysis in transgenic hairy roots for two genes encoding Chalcone-O-Methyltransferase isoforms revealed their expression in rhizobially infected root hairs and the nodule infection zone but not in the nitrogen fixation zone. We also describe a group of Rhizobially Induced Peroxidases whose expression overlaps with the production of superoxide in rhizobially infected root hairs and in nodules and roots. Finally, we identify a cohort of co-regulated transcription factors as candidate regulators of these processes.
Carlson, Kimberly A.; Gardner, Kylee; Pashaj, Anjeza; Carlson, Darby J.; Yu, Fang; Eudy, James D.; Zhang, Chi; Harshman, Lawrence G.
2015-01-01
Aging is a complex process characterized by a steady decline in an organism's ability to perform life-sustaining tasks. In the present study, two cages of approximately 12,000 mated Drosophila melanogaster females were used as a source of RNA from individuals sampled frequently as a function of age. A linear model for microarray data method was used for the microarray analysis to adjust for the box effect; it identified 1,581 candidate aging genes. Cluster analyses using a self-organizing map algorithm on the 1,581 significant genes identified gene expression patterns across different ages. Genes involved in immune system function and regulation, chorion assembly and function, and metabolism were all significantly differentially expressed as a function of age. The temporal pattern of data indicated that gene expression related to aging is affected relatively early in life span. In addition, the temporal variance in gene expression in immune function genes was compared to a random set of genes. There was an increase in the variance of gene expression within each cohort, which was not observed in the set of random genes. This observation is compatible with the hypothesis that D. melanogaster immune function genes lose control of gene expression as flies age. PMID:26090231
Functional genome-wide siRNA screen identifies KIAA0586 as mutated in Joubert syndrome
Roosing, Susanne; Hofree, Matan; Kim, Sehyun; Scott, Eric; Copeland, Brett; Romani, Marta; Silhavy, Jennifer L; Rosti, Rasim O; Schroth, Jana; Mazza, Tommaso; Miccinilli, Elide; Zaki, Maha S; Swoboda, Kathryn J; Milisa-Drautz, Joanne; Dobyns, William B; Mikati, Mohamed A; İncecik, Faruk; Azam, Matloob; Borgatti, Renato; Romaniello, Romina; Boustany, Rose-Mary; Clericuzio, Carol L; D'Arrigo, Stefano; Strømme, Petter; Boltshauser, Eugen; Stanzial, Franco; Mirabelli-Badenier, Marisol; Moroni, Isabella; Bertini, Enrico; Emma, Francesco; Steinlin, Maja; Hildebrandt, Friedhelm; Johnson, Colin A; Freilinger, Michael; Vaux, Keith K; Gabriel, Stacey B; Aza-Blanc, Pedro; Heynen-Genel, Susanne; Ideker, Trey; Dynlacht, Brian D; Lee, Ji Eun; Valente, Enza Maria; Kim, Joon; Gleeson, Joseph G
2015-01-01
Defective primary ciliogenesis or cilium stability forms the basis of human ciliopathies, including Joubert syndrome (JS), with defective cerebellar vermis development. We performed a high-content genome-wide small interfering RNA (siRNA) screen to identify genes regulating ciliogenesis as candidates for JS. We analyzed results with a supervised-learning approach, using SYSCILIA gold standard, Cildb3.0, a centriole siRNA screen and the GTex project, identifying 591 likely candidates. Intersection of this data with whole exome results from 145 individuals with unexplained JS identified six families with predominantly compound heterozygous mutations in KIAA0586. A c.428del base deletion in 0.1% of the general population was found in trans with a second mutation in an additional set of 9 of 163 unexplained JS patients. KIAA0586 is an orthologue of chick Talpid3, required for ciliogenesis and Sonic hedgehog signaling. Our results uncover a relatively high frequency cause for JS and contribute a list of candidates for future gene discoveries in ciliopathies. DOI: http://dx.doi.org/10.7554/eLife.06602.001 PMID:26026149
Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D
2013-10-01
The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.
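The intersection step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' pipeline: the function name and the gene identifiers are hypothetical, and real inputs would be gene lists produced by the transcriptomic and mutagenesis analyses.

```python
def candidate_virulence_genes(differentially_expressed, required_for_infection, unknown_function):
    """Return genes of unknown function (FUN) supported by both approaches.

    All three arguments are iterables of gene identifiers. The result is the
    sorted intersection of the three sets.
    """
    shared = (
        set(unknown_function)
        & set(differentially_expressed)
        & set(required_for_infection)
    )
    return sorted(shared)


# Hypothetical gene identifiers, for illustration only.
fun_genes = {"geneA", "geneB", "geneC", "geneD"}
expressed_during_infection = {"geneA", "geneC", "geneX"}
needed_for_disease = {"geneA", "geneC", "geneD"}

print(candidate_virulence_genes(expressed_during_infection, needed_for_disease, fun_genes))
```

Using set intersection keeps only the genes that appear in every list, which mirrors the idea that agreement between independent datasets narrows the candidates to a small set.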
Convergence of genome-wide association and candidate gene studies for alcoholism.
Olfson, Emily; Bierut, Laura Jean
2012-12-01
Genome-wide association (GWA) studies have led to a paradigm shift in how researchers study the genetics underlying disease. Many GWA studies are now publicly available and can be used to examine whether or not previously proposed candidate genes are supported by GWA data. This approach is particularly important for the field of alcoholism because the contribution of many candidate genes remains controversial. Using the Human Genome Epidemiology (HuGE) Navigator, we selected candidate genes for alcoholism that have been frequently examined in scientific articles in the past decade. Specific candidate loci as well as all the reported single nucleotide polymorphisms (SNPs) in candidate genes were examined in the Study of Addiction: Genetics and Environment (SAGE), a GWA study comparing alcohol-dependent and nondependent subjects. Several commonly reported candidate loci, including rs1800497 in DRD2, rs698 in ADH1C, rs1799971 in OPRM1, and rs4680 in COMT, are not replicated in SAGE (p > 0.05). Among candidate loci available for analysis, only rs279858 in GABRA2 (p = 0.0052, OR = 1.16) demonstrated a modest association. Examination of all SNPs reported in SAGE in over 50 candidate genes revealed no SNPs with large frequency differences between cases and controls, and the lowest p-value of any SNP was 0.0006. We provide evidence that several extensively studied candidate loci do not have a strong contribution to risk of developing alcohol dependence in European and African ancestry populations. Owing to the lack of coverage, we were unable to rule out the contribution of other variants, and these genes and particular loci warrant further investigation. Our analysis demonstrates that publicly available GWA results can be used to better understand which if any of previously proposed candidate genes contribute to disease. 
Furthermore, we illustrate how examining the convergence of candidate gene and GWA studies can help elucidate the genetic architecture of alcoholism and more generally complex diseases. Copyright © 2012 by the Research Society on Alcoholism.
Lamba, Jatinder K; Crews, Kristine R; Pounds, Stanley B; Cao, Xueyuan; Gandhi, Varsha; Plunkett, William; Razzouk, Bassem I; Lamba, Vishal; Baker, Sharyn D; Raimondi, Susana C; Campana, Dario; Pui, Ching-Hon; Downing, James R; Rubnitz, Jeffrey E; Ribeiro, Raul C
2011-01-01
Aim To identify gene-expression signatures predicting cytarabine response by an integrative analysis of multiple clinical and pharmacological end points in acute myeloid leukemia (AML) patients. Materials & methods We performed an integrated analysis to associate the gene expression of diagnostic bone marrow blasts from AML patients treated in the discovery set (AML97; n = 42) and in the independent validation set (AML02; n = 46) with multiple clinical and pharmacological end points. Based on prior biological knowledge, we defined a gene to show a therapeutically beneficial (detrimental) pattern of association if its expression was positively (negatively) correlated with favorable phenotypes such as intracellular cytarabine 5'-triphosphate levels, morphological response and event-free survival, and negatively (positively) correlated with unfavorable end points such as post-cytarabine DNA synthesis levels, minimal residual disease and cytarabine LC50. Results We identified 240 probe sets predicting a therapeutically beneficial pattern and 97 predicting a detrimental pattern (p ≤ 0.005) in the discovery set. Of these, 60 were confirmed in the independent validation set. The validated probe sets correspond to genes involved in PIK3/PTEN/AKT/mTOR signaling, G-protein-coupled receptor signaling and leukemogenesis. This suggests that targeting these pathways as potential pharmacogenomic and therapeutic candidates could be useful for improving treatment outcomes in AML. Conclusion This study illustrates the power of integrated analysis of genomic data together with multiple clinical and pharmacologic end points in the identification of genes and pathways of biological relevance. PMID:21449673
Griffin, Philippa C.; Hangartner, Sandra B.; Fournier-Level, Alexandre; Hoffmann, Ary A.
2017-01-01
Adaptation to environmental stress is critical for long-term species persistence. With climate change and other anthropogenic stressors compounding natural selective pressures, understanding the nature of adaptation is as important as ever in evolutionary biology. In particular, the number of alternative molecular trajectories available for an organism to reach the same adaptive phenotype remains poorly understood. Here, we investigate this issue in a set of replicated Drosophila melanogaster lines selected for increased desiccation resistance—a classical physiological trait that has been closely linked to Drosophila species distributions. We used pooled whole-genome sequencing (Pool-Seq) to compare the genetic basis of their selection responses, using a matching set of replicated control lines for characterizing laboratory (lab-)adaptation, as well as the original base population. The ratio of effective population size to census size was high over the 21 generations of the experiment at 0.52–0.88 for all selected and control lines. While selected SNPs in replicates of the same treatment (desiccation-selection or lab-adaptation) tended to change frequency in the same direction, suggesting some commonality in the selection response, candidate SNP and gene lists often differed among replicates. Three of the five desiccation-selection replicates showed significant overlap at the gene and network level. All five replicates showed enrichment for ovary-expressed genes, suggesting maternal effects on the selected trait. Divergence between pairs of replicate lines for desiccation-candidate SNPs was greater than between pairs of control lines. This difference also far exceeded the divergence between pairs of replicate lines for neutral SNPs. 
Overall, while there was overlap in the direction of allele frequency changes and the network and functional categories affected by desiccation selection, replicates showed unique responses at all levels, likely reflecting hitchhiking effects, and highlighting the challenges in identifying candidate genes from these types of experiments when traits are likely to be polygenic. PMID:28007884
Defining the Human Macula Transcriptome and Candidate Retinal Disease Genes Using EyeSAGE
Rickman, Catherine Bowes; Ebright, Jessica N.; Zavodni, Zachary J.; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P.; Wistow, Graeme; Boon, Kathy; Hauser, Michael A.
2009-01-01
Purpose To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Methods Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Results Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. Conclusions The EyeSAGE database, combining three different gene-profiling platforms including the authors’ multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions. PMID:16723438
Defining the human macula transcriptome and candidate retinal disease genes using EyeSAGE.
Bowes Rickman, Catherine; Ebright, Jessica N; Zavodni, Zachary J; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P; Wistow, Graeme; Boon, Kathy; Hauser, Michael A
2006-06-01
To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. The EyeSAGE database, combining three different gene-profiling platforms including the authors' multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions.
Balasubbu, Suganthalakshmi; Sundaresan, Periasamy; Rajendran, Anand; Ramasamy, Kim; Govindarajan, Gowthaman; Perumalsamy, Namperumalsamy; Hejtmancik, J Fielding
2010-11-10
Diabetic retinopathy (DR) is classically defined as a microvasculopathy that primarily affects the small blood vessels of the inner retina as a complication of diabetes mellitus (DM). It is a multifactorial disease with a strong genetic component. The aim of this study is to investigate the association of a set of nine candidate genes with the development of diabetic retinopathy in a South Indian cohort who have type 2 diabetes mellitus (T2DM). Seven candidate genes (RAGE, PEDF, AKR1B1, EPO, HTRA1, ICAM and HFE) were chosen based on reported association with DR in the literature. Two more, CFH and ARMS2, were chosen based on their roles in biological pathways previously implicated in DR. Fourteen single nucleotide polymorphisms (SNPs) and one dinucleotide repeat polymorphism, previously reported to show association with DR or other related diseases, were genotyped in 345 DR and 356 diabetic patients without retinopathy (DNR). The genes which showed positive association in this screening set were tested further in an additional set of 100 DR and 90 DNR patients from the Aravind Eye Hospital. Those which showed association in the secondary screen were subjected to a combined analysis with the 100 DR and 100 DNR subjects previously recruited and genotyped through the Sankara Nethralaya Hospital, India. Genotypes were evaluated using a combination of direct sequencing, TaqMan SNP genotyping, RFLP analysis, and SNaPshot PCR assays. Chi-square and Fisher exact tests were used to analyze the genotype and allele frequencies. Among the nine loci (15 polymorphisms) screened, SNP rs2070600 (G82S) in the RAGE gene showed significant association with DR (allelic P = 0.016, dominant model P = 0.012), compared to DNR. SNP rs2070600 further showed significant association with DR in the confirmation cohort (P = 0.035, dominant model P = 0.032). Combining the two cohorts gave an allelic P < 0.003 and dominant P = 0.0013.
Combined analysis with the Sankara Nethralaya cohort gave an allelic P = 0.0003 and dominant P = 0.00011 with an OR = 0.49 (0.34 - 0.70) for the minor allele. In HTRA1, rs11200638 (G>A), showed marginal significance with DR (P = 0.055) while rs10490924 in LOC387715 gave a P = 0.07. No statistical significance was observed for SNPs in the other 7 genes studied. This study confirms significant association of one polymorphism only (rs2070600 in RAGE) with DR in an Indian population which had T2DM.
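An odds ratio with a 95% confidence interval, such as the one reported above for the minor allele, can be computed from a 2x2 table of allele counts. The sketch below uses the common log-based (Woolf) interval, which is an assumption here since the abstract does not state which interval method was used. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(case_minor, case_major, control_minor, control_major, z=1.96):
    """Odds ratio for the minor allele with a log-based (Woolf) confidence interval.

    Arguments are allele counts in cases and controls; all must be positive.
    z = 1.96 corresponds to an approximate 95% interval.
    """
    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / control_minor + 1 / control_major
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(10, 90, 20, 80)
print(f"OR = {or_value:.2f} ({ci_low:.2f} - {ci_high:.2f})")
```

An odds ratio below 1 with an interval that excludes 1 indicates that the allele is less frequent in cases than in controls.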
2013-01-01
Background The genomic architecture of adaptive traits remains poorly understood in non-model plants. Various approaches can be used to bridge this gap, including the mapping of quantitative trait loci (QTL) in pedigrees, and genetic association studies in non-structured populations. Here we present results on the genomic architecture of adaptive traits in black spruce, which is a widely distributed conifer of the North American boreal forest. As an alternative to the usual candidate gene approach, a candidate SNP approach was developed for association testing. Results A genetic map containing 231 gene loci was used to identify QTL that were related to budset timing and to tree height assessed over multiple years and sites. Twenty-two unique genomic regions were identified, including 20 that were related to budset timing and 6 that were related to tree height. From results of outlier detection and bulk segregant analysis for adaptive traits using DNA pool sequencing of 434 genes, 52 candidate SNPs were identified and subsequently tested in genetic association studies for budset timing and tree height assessed over multiple years and sites. A total of 34 (65%) SNPs were significantly associated with budset timing, or tree height, or both. Although the percentages of explained variance (PVE) by individual SNPs were small, several significant SNPs were shared between sites and among years. Conclusions The sharing of genomic regions and significant SNPs between budset timing and tree height indicates pleiotropic effects. Significant QTLs and SNPs differed quite greatly among years, suggesting that different sets of genes for the same characters are involved at different stages in the tree’s life history. The functional diversity of genes carrying significant SNPs and low observed PVE further indicated that a large number of polymorphisms are involved in adaptive genetic variation. 
Accordingly, for undomesticated species such as black spruce with natural populations of large effective size and low linkage disequilibrium, efficient marker systems that are predictive of adaptation should require the survey of large numbers of SNPs. Candidate SNP approaches like the one developed in the present study could contribute to reducing these numbers. PMID:23724860
Klangnurak, Wanlada; Fukuyo, Taketo; Rezanujjaman, M D; Seki, Masahide; Sugano, Sumio; Suzuki, Yutaka; Tokumoto, Toshinobu
2018-01-01
We previously reported the microarray-based selection of three ovulation-related genes in zebrafish. In this study, we used a different selection method, RNA sequencing analysis. An additional eight candidates were identified as genes specifically up-regulated in ovulation-induced samples. Changes in gene expression were confirmed by qPCR analysis. Furthermore, up-regulation prior to ovulation during natural spawning was verified in samples from natural pairing. Knock-out zebrafish strains for one of the candidates, the starmaker gene (stm), were established by CRISPR genome editing techniques. Unexpectedly, homozygous mutants were fertile and could spawn eggs. However, these homozygous females produced a high percentage of unfertilized eggs and abnormal embryos. The results suggest that the stm gene is necessary for fertilization. In this study, we selected additional ovulation-inducing candidate genes and investigated a novel function of the stm gene.
Sperschneider, Jana; Garnica, Diana P.; Miller, Marisa E.; Taylor, Jennifer M.; Dodds, Peter N.; Park, Robert F.
2018-01-01
A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. PMID:29463659
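The N50 statistic quoted for this assembly is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name and the contig lengths are hypothetical and are not values from the assembly described above.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the assembly.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths, for illustration only.
print(n50([2, 3, 4, 5, 6]))
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness on its own.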
Al-Hebshi, Nezar Noor; Li, Shiyong; Nasher, Akram Thabet; El-Setouhy, Maged; Alsanosi, Rashad; Blancato, Jan; Loffredo, Christopher
2016-07-15
The study sought to identify genetic aberrations driving oral squamous cell carcinoma (OSCC) development among users of shammah, an Arabian preparation of smokeless tobacco. Twenty archival OSCC samples, 15 of them with a history of shammah exposure, were whole-exome sequenced at an average depth of 127×. Somatic mutations were identified using a novel, matched controls-independent filtration algorithm. CODEX and Exomedepth coupled with a novel, Database of Genomic Variants-based filter were employed to call somatic gene-copy number variations. Significantly mutated genes were identified with Oncodrive FM and the Youn and Simon's method. Candidate driver genes were nominated based on Gene Set Enrichment Analysis. The observed mutational spectrum was similar to that reported by the TCGA project. In addition to confirming known genes of OSCC (TP53, CDKN2A, CASP8, PIK3CA, HRAS, FAT1, TP63, CCND1 and FADD), the analysis identified several candidate novel driver events including mutations of NOTCH3, CSMD3, CRB1, CLTCL1, OSMR and TRPM2, amplification of the proto-oncogenes FOSL1, RELA, TRAF6, MDM2, FRS2 and BAG1, and deletion of the recently described tumor suppressor SMARCC1. The analysis also revealed significantly altered pathways not previously implicated in OSCC, including the Oncostatin-M signalling pathway, AP-1 and C-MYB transcription networks and endocytosis. There was a trend for a higher number of mutations, amplifications and driver events in samples with a history of shammah exposure, particularly those that tested EBV positive, suggesting an interaction between tobacco exposure and EBV. The work provides further evidence for the genetic heterogeneity of oral cancer and suggests shammah-associated OSCC is characterized by extensive amplification of oncogenes. © 2016 UICC.
Hsieh, PingHsun; Veeramah, Krishna R.; Lachance, Joseph; Tishkoff, Sarah A.; Wall, Jeffrey D.; Hammer, Michael F.; Gutenkunst, Ryan N.
2016-01-01
African Pygmies practicing a mobile hunter-gatherer lifestyle are phenotypically and genetically diverged from other anatomically modern humans, and they likely experienced strong selective pressures due to their unique lifestyle in the Central African rainforest. To identify genomic targets of adaptation, we sequenced the genomes of four Biaka Pygmies from the Central African Republic and jointly analyzed these data with the genome sequences of three Baka Pygmies from Cameroon and nine Yoruba farmers. To account for the complex demographic history of these populations that includes both isolation and gene flow, we fit models using the joint allele frequency spectrum and validated them using independent approaches. Our two best-fit models both suggest ancient divergence between the ancestors of the farmers and Pygmies, 90,000 or 150,000 yr ago. We also find that bidirectional asymmetric gene flow is statistically better supported than a single pulse of unidirectional gene flow from farmers to Pygmies, as previously suggested. We then applied complementary statistics to scan the genome for evidence of selective sweeps and polygenic selection. We found that conventional statistical outlier approaches were biased toward identifying candidates in regions of high mutation or low recombination rate. To avoid this bias, we assigned P-values for candidates using whole-genome simulations incorporating demography and variation in both recombination and mutation rates. We found that genes and gene sets involved in muscle development, bone synthesis, immunity, reproduction, cell signaling and development, and energy metabolism are likely to be targets of positive natural selection in Western African Pygmies or their recent ancestors. PMID:26888263
Mashiach, R.; Cohen, S.; Kedem, A.; Baron, A.; Zajicek, M.; Feldman, I.; Seidman, D.; Soriano, D.
2018-01-01
Endometriosis is a disease characterized by the development of endometrial tissue outside the uterus, but its cause remains largely unknown. Numerous genes have been studied and proposed to help explain its pathogenesis. However, the large number of these candidate genes has made functional validation through experimental methodologies nearly impossible. Computational methods could provide a useful alternative for prioritizing those most likely to be susceptibility genes. Using artificial intelligence applied to text mining, this study analyzed the genes involved in the pathogenesis, development, and progression of endometriosis. The data extraction by text mining of the endometriosis-related genes in the PubMed database was based on natural language processing, and the data were filtered to remove false positives. Using data from the text mining and gene network information as input for the web-based tool, 15,207 endometriosis-related genes were ranked according to their score in the database. Characterization of the filtered gene set through gene ontology, pathway, and network analysis provided information about the numerous mechanisms hypothesized to be responsible for the establishment of ectopic endometrial tissue, as well as the migration, implantation, survival, and proliferation of ectopic endometrial cells. Finally, the human genome was scanned through various databases using filtered genes as a seed to determine novel genes that might also be involved in the pathogenesis of endometriosis but which have not yet been characterized. These genes could be promising candidates to serve as useful diagnostic biomarkers and therapeutic targets in the management of endometriosis. PMID:29750165
Bouaziz, J; Mashiach, R; Cohen, S; Kedem, A; Baron, A; Zajicek, M; Feldman, I; Seidman, D; Soriano, D
2018-01-01
Endometriosis is a disease characterized by the development of endometrial tissue outside the uterus, but its cause remains largely unknown. Numerous genes have been studied and proposed to help explain its pathogenesis. However, the large number of these candidate genes has made functional validation through experimental methodologies nearly impossible. Computational methods could provide a useful alternative for prioritizing those most likely to be susceptibility genes. Using artificial intelligence applied to text mining, this study analyzed the genes involved in the pathogenesis, development, and progression of endometriosis. The data extraction by text mining of the endometriosis-related genes in the PubMed database was based on natural language processing, and the data were filtered to remove false positives. Using data from the text mining and gene network information as input for the web-based tool, 15,207 endometriosis-related genes were ranked according to their score in the database. Characterization of the filtered gene set through gene ontology, pathway, and network analysis provided information about the numerous mechanisms hypothesized to be responsible for the establishment of ectopic endometrial tissue, as well as the migration, implantation, survival, and proliferation of ectopic endometrial cells. Finally, the human genome was scanned through various databases using filtered genes as a seed to determine novel genes that might also be involved in the pathogenesis of endometriosis but which have not yet been characterized. These genes could be promising candidates to serve as useful diagnostic biomarkers and therapeutic targets in the management of endometriosis.
PGMapper: a web-based tool linking phenotype to genes.
Xiong, Qing; Qiu, Yuhui; Gu, Weikuan
2008-04-01
With the availability of whole genome sequences in many species, linkage analysis, positional cloning and microarrays are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time-consuming and requires retrieving and integrating information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. Available online at http://www.genediscovery.org/pgmapper/index.jsp.
Cai, Jing; Li, Pengfei; Luo, Xiao; Chang, Tianliang; Li, Jiaxing; Zhao, Yuwei; Xu, Yao
2018-01-01
Hulless barley (Hordeum vulgare L. var. nudum. hook. f.) has been cultivated as a major crop in the Qinghai-Tibet plateau of China for thousands of years. Compared to other cereal crops, the Tibetan hulless barley has developed stronger endogenous resistances to survive in the severe environment of its habitat. To understand the unique resistance mechanisms of this plant, detailed genetic studies need to be performed. The quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR) is the most commonly used method for detecting gene expression. However, the selection of stable reference genes under limited experimental conditions is considered an essential step for obtaining accurate results in qRT-PCR. In this study, 10 candidate reference genes were selected from the NCBI gene database of barley: ACT (Actin), E2 (Ubiquitin conjugating enzyme 2), TUBα (Alpha-tubulin), TUBβ6 (Beta-tubulin 6), GAPDH (Glyceraldehyde 3-phosphate dehydrogenase), EF-1α (Elongation factor 1-alpha), SAMDC (S-adenosylmethionine decarboxylase), PKABA1 (Gene for protein kinase HvPKABA1), PGK (Phosphoglycerate kinase), and HSP90 (Heat shock protein 90). Following qRT-PCR amplification of all candidate reference genes in Tibetan hulless barley seedlings under various stress conditions, the stabilities of these candidates were analyzed by three individual software packages: geNorm, NormFinder, and BestKeeper. The results demonstrated that TUBβ6, E2, TUBα, and HSP90 were generally the most suitable sets under all tested conditions; similarly, TUBα and HSP90 showed peak stability under salt stress, TUBα and EF-1α were the most suitable reference genes under cold stress, and ACT and E2 were the most stable under drought stress. Finally, a known circadian gene, CCA1, was used to verify the serviceability of the chosen reference genes. The results confirmed that all reference genes recommended by the three software packages were suitable for gene expression analysis under the tested stress conditions by the qRT-PCR method.
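As an illustration of how reference-gene stability can be scored, the following is a minimal sketch of the geNorm stability measure M as it is commonly defined: for each gene, the mean over all other candidate genes of the standard deviation of the pairwise log2 expression ratio across samples. It is not the code of the geNorm package itself. The function name `genorm_m_values` and the toy data are hypothetical, and the input is assumed to be positive relative quantities rather than raw Cq values.

```python
import numpy as np

def genorm_m_values(quantities):
    """Return one stability value M per gene (lower means more stable).

    quantities: array of shape (n_samples, n_genes) holding positive
    relative expression quantities (assumed input; not raw Cq values).
    """
    log_q = np.log2(np.asarray(quantities, dtype=float))
    n_genes = log_q.shape[1]
    m_values = np.zeros(n_genes)
    for j in range(n_genes):
        # Standard deviation, across samples, of log2(gene j / gene k)
        pairwise_sd = [
            np.std(log_q[:, j] - log_q[:, k], ddof=1)
            for k in range(n_genes) if k != j
        ]
        # M for gene j is the average pairwise variation with all other genes
        m_values[j] = np.mean(pairwise_sd)
    return m_values

# Hypothetical toy data: rows are samples, columns are genes A, B, C.
# A and B change in proportion to each other; C fluctuates independently.
toy = np.array([
    [1.0,  2.0, 1.0],
    [2.0,  4.0, 8.0],
    [4.0,  8.0, 1.0],
    [8.0, 16.0, 8.0],
])
m = genorm_m_values(toy)
```

In this toy case genes A and B receive the same, lower M, and gene C receives the highest M, so C would be the first candidate to exclude in a stepwise ranking.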
Malki, K; Pain, O; Tosto, M G; Du Rietz, E; Carboni, L; Schalkwyk, L C
2015-01-01
Despite moderate heritability estimates, progress in uncovering the molecular substrate underpinning major depressive disorder (MDD) has been slow. In this study, we used prefrontal cortex (PFC) gene expression from a genetic rat model of MDD to inform probe set prioritization in PFC in a human post-mortem study to uncover genes and gene pathways associated with MDD. Gene expression differences between Flinders sensitive (FSL) and Flinders resistant (FRL) rat lines were statistically evaluated using the RankProd, non-parametric algorithm. Top ranking probe sets in the rat study were subsequently used to prioritize orthologous selection in a human PFC in a case–control post-mortem study on MDD from the Stanley Brain Consortium. Candidate genes in the human post-mortem study were then tested against a matched control sample using the RankProd method. A total of 1767 probe sets were differentially expressed in the PFC between FSL and FRL rat lines at (q⩽0.001). A total of 898 orthologous probe sets was found on Affymetrix's HG-U95A chip used in the human study. Correcting for the number of multiple, non-independent tests, 20 probe sets were found to be significantly dysregulated between human cases and controls at q⩽0.05. These probe sets tagged the expression profile of 18 human genes (11 upregulated and seven downregulated). Using an integrative rat–human study, a number of convergent genes that may have a role in pathogenesis of MDD were uncovered. Eighty percent of these genes were functionally associated with a key stress response signalling cascade, involving NF-κB (nuclear factor kappa-light-chain-enhancer of activated B cells), AP-1 (activator protein 1) and ERK/MAPK, which has been systematically associated with MDD, neuroplasticity and neurogenesis. PMID:25734512
Harripaul, R; Vasli, N; Mikhailov, A; Rafiq, M A; Mittal, K; Windpassinger, C; Sheikh, T I; Noor, A; Mahmood, H; Downey, S; Johnson, M; Vleuten, K; Bell, L; Ilyas, M; Khan, F S; Khan, V; Moradi, M; Ayaz, M; Naeem, F; Heidari, A; Ahmed, I; Ghadami, S; Agha, Z; Zeinali, S; Qamar, R; Mozhdehipanah, H; John, P; Mir, A; Ansar, M; French, L; Ayub, M; Vincent, J B
2018-04-01
Approximately 1% of the global population is affected by intellectual disability (ID), and the majority receive no molecular diagnosis. Previous studies have indicated high levels of genetic heterogeneity, with estimates of more than 2500 autosomal ID genes, the majority of which are autosomal recessive (AR). Here, we combined microarray genotyping, homozygosity-by-descent (HBD) mapping, copy number variation (CNV) analysis, and whole exome sequencing (WES) to identify disease genes/mutations in 192 multiplex Pakistani and Iranian consanguineous families with non-syndromic ID. We identified definite or candidate mutations (or CNVs) in 51% of families in 72 different genes, including 26 not previously reported for ARID. The new ARID genes include nine with loss-of-function mutations (ABI2, MAPK8, MPDZ, PIDD1, SLAIN1, TBC1D23, TRAPPC6B, UBA7 and USP44), and missense mutations include the first reports of variants in BDNF or TET1 associated with ID. The genes identified also showed overlap with de novo gene sets for other neuropsychiatric disorders. Transcriptional studies showed prominent expression in the prenatal brain. The high yield of AR mutations for ID indicated that this approach has excellent clinical potential and should inform clinical diagnostics, including clinical whole exome and genome sequencing, for populations in which consanguinity is common. As with other AR disorders, the relevance will also apply to outbred populations.
Zhang, P; Wang, J G; Wan, J G; Liu, W Q
2010-01-01
The frequent disease outbreaks caused by avian influenza virus not only affect the poultry industry but also pose a threat to human safety. To address the problem, RNA interference (RNAi) has recently been widely used as a potential antiviral approach. Transgenesis in combination with RNAi to specifically inhibit avian influenza virus gene expression has been proposed to make chickens resistant to the infection. For transgenic breeding, in vitro screening of efficient siRNAs as candidate genes is one of the most important tasks. Here, we combined an online search tool and a series of bioinformatics programs with a set of rules for designing siRNAs targeted towards different mRNA regions of H5N1 avian influenza virus. Five rational siRNAs were chosen by this method, and five U6 promoter-driven shRNA expression plasmids containing the siRNA genes were constructed and used for producing stably transfected MDCK cells. The data obtained by virus titration, IFA, PI-stained flow cytometry, real-time quantitative RT-PCR, and DAS-ELISA analyses showed that all five stably transfected cell lines were resistant to virus replication when exposed to 100 CCID50 of avian influenza virus H5N1. Finally, the most effective plasmids (pSi-604i and pSi-1597i) were chosen as candidates for making transgenic chickens. These findings provide baseline information on the use of the RNAi technique for breeding transgenic chickens resistant to avian influenza virus.
Zhong, Chao; Sun, Suli; Li, Yinping; Duan, Canxing; Zhu, Zhendong
2018-03-01
A novel Phytophthora sojae resistance gene RpsHC18 was identified and finely mapped on soybean chromosome 3. Two NBS-LRR candidate genes were identified and two diagnostic markers of RpsHC18 were developed. Phytophthora root rot caused by Phytophthora sojae is a destructive disease of soybean. The most effective disease-control strategy is to deploy resistant cultivars carrying Phytophthora-resistant Rps genes. The soybean cultivar Huachun 18 has a broad and distinct resistance spectrum to 12 P. sojae isolates. Quantitative trait loci sequencing (QTL-seq), based on the whole-genome resequencing (WGRS) of two extreme resistant and susceptible phenotype bulks from an F2:3 population, was performed, and one 767-kb genomic region with ΔSNP-index ≥ 0.9 on chromosome 3 was identified as the RpsHC18 candidate region in Huachun 18. The candidate region was reduced to a 146-kb region by fine mapping. Nonsynonymous SNP and haplotype analyses were carried out in the 146-kb region among ten soybean genotypes using WGRS. Four specific nonsynonymous SNPs were identified in two nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes, RpsHC18-NBL1 and RpsHC18-NBL2, which were considered to be the candidate genes. Finally, one specific SNP marker in each candidate gene was successfully developed using a tetra-primer ARMS-PCR assay, and the two markers were verified to be specific for RpsHC18 and to effectively distinguish other known Rps genes. In this study, we applied an integrated genomic-based strategy combining WGRS with traditional genetic mapping to identify RpsHC18 candidate genes and develop diagnostic markers. These results suggest that next-generation sequencing is a precise, rapid and cost-effective way to identify candidate genes and develop diagnostic markers, and it can accelerate Rps gene cloning and marker-assisted selection for breeding of P. sojae-resistant soybean cultivars.
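The ΔSNP-index criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual QTL-seq convention that a SNP-index is the fraction of reads carrying the non-reference allele in a bulk, and that the ΔSNP-index is the difference between the two bulks; the direction of subtraction (resistant minus susceptible), the function names, and the read counts are all hypothetical. A real analysis would also average over sliding windows and use simulated confidence intervals, which are omitted here.

```python
def snp_index(alt_reads, depth):
    """Fraction of reads at a SNP that carry the non-reference allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth

def delta_snp_index(resistant, susceptible):
    """Difference of SNP-index between bulks (assumed: resistant - susceptible)."""
    return snp_index(*resistant) - snp_index(*susceptible)

# Hypothetical read counts per SNP position:
# position -> ((alt, depth) in resistant bulk, (alt, depth) in susceptible bulk)
counts = {
    1_000: ((20, 20), (1, 20)),    # nearly fixed difference between bulks
    2_000: ((10, 20), (10, 20)),   # no difference between bulks
    3_000: ((18, 20), (0, 20)),    # strong difference between bulks
}

deltas = {pos: delta_snp_index(r, s) for pos, (r, s) in counts.items()}

# Keep SNPs meeting the threshold quoted in the abstract (delta >= 0.9)
candidates = sorted(pos for pos, d in deltas.items() if d >= 0.9)
```

With these toy counts, positions 1000 and 3000 pass the threshold while position 2000 does not.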
Database of cattle candidate genes and genetic markers for milk production and mastitis
Ogorevc, J; Kunej, T; Razpet, A; Dovc, P
2009-01-01
A cattle database of candidate genes and genetic markers for milk production and mastitis has been developed to provide an integrated research tool incorporating different types of information supporting a genomic approach to study lactation, udder development and health. The database contains 943 genes and genetic markers involved in mammary gland development and function, representing candidates for further functional studies. The candidate loci were drawn on a genetic map to reveal positional overlaps. For identification of candidate loci, data from seven different research approaches were exploited: (i) gene knockouts or transgenes in mice that result in specific phenotypes associated with mammary gland (143 loci); (ii) cattle QTL for milk production (344) and mastitis related traits (71); (iii) loci with sequence variations that show specific allele-phenotype interactions associated with milk production (24) or mastitis (10) in cattle; (iv) genes with expression profiles associated with milk production (207) or mastitis (107) in cattle or mouse; (v) cattle milk protein genes that exist in different genetic variants (9); (vi) miRNAs expressed in bovine mammary gland (32) and (vii) epigenetically regulated cattle genes associated with mammary gland function (1). Forty-four genes found by multiple independent analyses were suggested as the most promising candidates and were further in silico analysed for expression levels in lactating mammary gland, genetic variability and top biological functions in functional networks. A miRNA target search for mammary gland expressed miRNAs identified 359 putative binding sites in 3′UTRs of candidate genes. PMID:19508288
Wang, Erlong; Wang, Kaiyu; Chen, Defang; Wang, Jun; He, Yang; Long, Bo; Yang, Lei; Yang, Qian; Geng, Yi; Huang, Xiaoli; Ouyang, Ping; Lai, Weimin
2015-01-01
qPCR is a powerful and attractive methodology that has been widely applied in aquaculture research for gene expression analysis. However, selecting a suitable reference is critical for normalizing target gene expression in qPCR. In the present study, six commonly used endogenous controls were selected as candidate reference genes, and their expression levels, stabilities, and suitability for normalizing expression of the immune-related gene IgM during vaccination and infection were evaluated in the spleen of tilapia with the RefFinder and GeNorm programs. The results showed that all of these candidate reference genes exhibited transcriptional variation to some extent at different periods. Among them, EF1A was the most stable reference according to RefFinder, followed by 18S rRNA, ACTB, UBCE, TUBA and GAPDH, and the optimal number of reference genes for IgM normalization under the different experimental sets was two according to GeNorm. Combining the Cq (quantification cycle) values with the recommended comprehensive ranking of reference genes, the two optimal reference genes, EF1A and ACTB, were used together for accurate analysis of immune-related gene expression during vaccination and infection in Nile tilapia with qPCR. Moreover, the highest IgM expression level was at two weeks post-vaccination when normalized to EF1A, 18S rRNA, ACTB, or EF1A together with ACTB, compared to one week post-vaccination before normalizing, which was also consistent with the IgM antibody titers detected by ELISA. PMID:25941937
MetaRanker 2.0: a web server for prioritization of genetic variation data
Pers, Tune H.; Dworzyński, Piotr; Thomas, Cecilia Engel; Lage, Kasper; Brunak, Søren
2013-01-01
MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein–protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0. PMID:23703204
Kim, Eunjung; Kim, Eun Jung; Seo, Seung-Won; Hur, Cheol-Goo; McGregor, Robin A; Choi, Myung-Sook
2014-01-01
Worldwide obesity and related comorbidities are increasing, but identifying new therapeutic targets remains a challenge. A plethora of microarray studies in diet-induced obesity models has provided large datasets of obesity associated genes. In this review, we describe an approach to examine the underlying molecular network regulating obesity, and we discuss interactions between obesity candidate genes. We conducted network analysis on functional protein-protein interactions associated with 25 obesity candidate genes identified in a literature-driven approach based on published microarray studies of diet-induced obesity. The obesity candidate genes were closely associated with lipid metabolism and inflammation. Peroxisome proliferator activated receptor gamma (Pparg) appeared to be a core obesity gene, and obesity candidate genes were highly interconnected, suggesting a coordinately regulated molecular network in adipose tissue. In conclusion, the current network analysis approach may help elucidate the underlying molecular network regulating obesity and identify anti-obesity targets for therapeutic intervention.
Bruse, Shannon; Moreau, Michael; Bromberg, Yana; Jang, Jun-Ho; Wang, Nan; Ha, Hongseok; Picchi, Maria; Lin, Yong; Langley, Raymond J; Qualls, Clifford; Klensney-Tait, Julia; Zabner, Joseph; Leng, Shuguang; Mao, Jenny; Belinsky, Steven A; Xing, Jinchuan; Nyunoya, Toru
2016-01-07
Chronic obstructive pulmonary disease (COPD) is characterized by an irreversible airflow limitation in response to inhalation of noxious stimuli, such as cigarette smoke. However, only 15-20% of smokers manifest COPD, suggesting a role for genetic predisposition. Although genome-wide association studies have identified common genetic variants that are associated with susceptibility to COPD, effect sizes of the identified variants are modest, as is the total heritability accounted for by these variants. In this study, an extreme phenotype exome sequencing study was combined with in vitro modeling to identify COPD candidate genes. We performed whole exome sequencing of 62 highly susceptible smokers and 30 exceptionally resistant smokers to identify rare variants that may contribute to disease risk or resistance to COPD. This was a cross-sectional case-control study without therapeutic intervention or longitudinal follow-up information. We identified candidate genes based on rare variant analyses and evaluated exonic variants to pinpoint individual genes whose function was computationally established to be significantly different between susceptible and resistant smokers. Top scoring candidate genes from these analyses were further filtered by requiring that each gene be expressed in human bronchial epithelial cells (HBECs). A total of 81 candidate genes were thus selected for in vitro functional testing in cigarette smoke extract (CSE)-exposed HBECs. Using small interfering RNA (siRNA)-mediated gene silencing experiments, we showed that silencing of several candidate genes augmented CSE-induced cytotoxicity in vitro. Our integrative analysis through both genetic and functional approaches identified two candidate genes (TACC2 and MYO1E) that augment cigarette smoke (CS)-induced cytotoxicity and, potentially, COPD susceptibility.
Domingos, Sara; Fino, Joana; Paulo, Octávio S; Oliveira, Cristina M; Goulao, Luis F
2016-03-01
Flower-to-fruit transition depends on nutrient availability and regulation at the molecular level by sugar and hormone signalling crosstalk. However, in most species, the identities of fruit initiation regulators and their targets are largely unknown. To ascertain the main pathways involved in stenospermocarpic table grape fruit set, comprehensive transcriptional and metabolomic analyses were conducted specifically targeting the early phase of this developmental stage in 'Thompson Seedless'. The high-throughput analyses performed disclosed the involvement of 496 differentially expressed genes and 28 differently accumulated metabolites in the sampled inflorescences. Our data show broad transcriptome reprogramming of molecule transporters, globally down-regulating gene expression, and suggest that regulation of sugar- and hormone-mediated pathways determines the downstream activation of berry development. The most affected gene was the SWEET14 sugar transporter. Hormone-related transcription changes were observed associated with increased indole-3-acetic acid, stimulation of ethylene and gibberellin metabolisms and cytokinin degradation, and regulation of MADS-box and AP2-like ethylene-responsive transcription factor expression. Secondary metabolism, the most representative biological process at transcriptome level, was predominantly repressed. The results add to the knowledge of molecular events occurring in grapevine inflorescence fruit set and provide a list of candidates, paving the way for genetic manipulation aimed at model research and plant breeding. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Groten, Karin; Pahari, Nabin T; Xu, Shuqing; Miloradovic van Doorn, Maja; Baldwin, Ian T
2015-01-01
Most land plants live in a symbiotic association with arbuscular mycorrhizal fungi (AMF) that belong to the phylum Glomeromycota. Although a number of plant genes involved in the plant-AMF interactions have been identified by analyzing mutants, the ability to rapidly manipulate gene expression to study the potential functions of new candidate genes remains unrealized. We analyzed changes in gene expression of wild tobacco roots (Nicotiana attenuata) after infection with mycorrhizal fungi (Rhizophagus irregularis) by serial analysis of gene expression (SuperSAGE) combined with next generation sequencing, and established a virus-induced gene-silencing protocol to study the function of candidate genes in the interaction. From 92,434 SuperSAGE Tag sequences, 32,808 (35%) matched with our in-house Nicotiana attenuata transcriptome database and 3,698 (4%) matched to Rhizophagus genes. In total, 11,194 Tags showed a significant change in expression (p<0.05, >2-fold change) after infection. When comparing the functions of highly up-regulated annotated Tags in this study with those of two previous large-scale gene expression studies, 18 gene functions were found to be up-regulated in all three studies mainly playing roles related to phytohormone metabolism, catabolism and defense. To validate the function of identified candidate genes, we used the technique of virus-induced gene silencing (VIGS) to silence the expression of three putative N. attenuata genes: germin-like protein, indole-3-acetic acid-amido synthetase GH3.9 and, as a proof-of-principle, calcium and calmodulin-dependent protein kinase (CCaMK). The silencing of the three plant genes in roots was successful, but only CCaMK silencing had a significant effect on the interaction with R. irregularis. Interestingly, when a highly activated inoculum was used for plant inoculation, the effect of CCaMK silencing on fungal colonization was masked, probably due to trans-complementation. 
This study demonstrates that large-scale gene expression studies across different species induce a core set of genes of similar functions. However, additional factors seem to influence the overall pattern of gene expression, resulting in high variability among independent studies with different hosts. We conclude that VIGS is a powerful tool with which to investigate the function of genes involved in plant-AMF interactions but that inoculum strength can strongly influence the outcome of the interaction.
de Manuel, Marc; Shiina, Takashi; Suzuki, Shingo; Dereuddre-Bosquet, Nathalie; Garchon, Henri-Jean; Tanaka, Masayuki; Congy-Jolivet, Nicolas; Aarnink, Alice; Le Grand, Roger; Marques-Bonet, Tomas; Blancher, Antoine
2018-05-08
In the Mauritian macaque experimentally inoculated with SIV, gene polymorphisms potentially associated with the plasma virus load at a set point, approximately 100 days post inoculation, were investigated. Among the 42 animals inoculated with 50 AID50 of the same strain of SIV, none of which received any preventive or curative treatment, nine individuals were selected: three with a plasma virus load (PVL) among the lowest, three with intermediate PVL values and three among the highest PVL values. The complete genomes of these nine animals were then analyzed. Initially, attention was focused on variants with a potential functional impact on protein-encoding genes (non-synonymous SNPs (NS-SNPs) and splicing variants). Thus, 424 NS-SNPs possibly associated with PVL were detected. The 424 candidate SNPs were genotyped in these 42 SIV experimentally infected animals (including the nine animals subjected to whole genome sequencing). The genes containing variants most probably associated with PVL at a set time point are analyzed herein.
Malki, Karim; Tosto, Maria Grazia; Jumabhoy, Irfan; Lourdusamy, Anbarasu; Sluyter, Frans; Craig, Ian; Uher, Rudolf; McGuffin, Peter; Schalkwyk, Leonard C
2013-12-01
This study aims to identify novel genes associated with major depressive disorder and pharmacological treatment response using animal and human mRNA studies. Weighted gene coexpression network analysis was used to uncover genes associated with stress factors in mice and to inform mRNA probe set selection in a post-mortem study of depression. A total of 171 genes were found to be differentially regulated in response to both early and late stress protocols in a mouse study. Ten human genes, orthologous to mouse genes differentially expressed by stress, were also found to be dysregulated in depressed cases in a human post-mortem brain study from the Stanley Foundation Brain Collection. Several novel genes associated with depression were uncovered, including NOVA1 and USP9X. Moreover, we found further evidence in support of hippocampal neurogenesis and peripheral inflammation in major depressive disorder.
Kinome-wide siRNA screens targeting 713 human kinases (MISSION® siRNA Human Gene Family Set, Sigma) were performed with viability as the phenotypic endpoint on five HNSCC lines: JHU-019; PCI15A and 15B; UM-SCC14A and 14C.
Tiwari, Jagesh Kumar; Devi, Sapna; Sundaresha, S; Chandel, Poonam; Ali, Nilofer; Singh, Brajesh; Bhardwaj, Vinay; Singh, Bir Pal
2015-06-01
Genes involved in photoassimilate partitioning and changes in hormonal balance are important for potato tuberization. In the present study, we investigated gene expression patterns in the tuber-bearing potato somatic hybrid (E1-3) and control non-tuberous wild species Solanum etuberosum (Etb) by microarray. Plants were grown under controlled conditions and leaves were collected at eight tuber developmental stages for microarray analysis. A t-test analysis identified a total of 468 genes (94 up-regulated and 374 down-regulated) that were statistically significant (p ≤ 0.05) and differentially expressed in E1-3 and Etb. Gene Ontology (GO) characterization of the 468 genes revealed that 145 were annotated and 323 were of unknown function. Further, these 145 genes were grouped based on GO biological processes followed by molecular function and (or) PGSC description into 15 gene sets, namely (1) transport, (2) metabolic process, (3) biological process, (4) photosynthesis, (5) oxidation-reduction, (6) transcription, (7) translation, (8) binding, (9) protein phosphorylation, (10) protein folding, (11) ubiquitin-dependent protein catabolic process, (12) RNA processing, (13) negative regulation of protein, (14) methylation, and (15) mitosis. RT-PCR analysis of 10 selected highly significant genes (p ≤ 0.01) confirmed the microarray results. Overall, we show that candidate genes induced in leaves of E1-3 were implicated in tuberization processes such as transport, carbohydrate metabolism, phytohormones, and transcription/translation/binding functions. Hence, our results provide an insight into the candidate genes induced in leaf tissues during tuberization in E1-3.
Involvement of Ethylene in the Latex Metabolism and Tapping Panel Dryness of Hevea brasiliensis
Putranto, Riza-Arief; Herlinawati, Eva; Rio, Maryannick; Leclercq, Julie; Piyatrakul, Piyanuch; Gohet, Eric; Sanier, Christine; Oktavia, Fetrina; Pirrello, Julien; Kuswanhadi; Montoro, Pascal
2015-01-01
Ethephon, an ethylene releaser, is used to stimulate latex production in Hevea brasiliensis. Ethylene induces many functions in latex cells including the production of reactive oxygen species (ROS). The accumulation of ROS is responsible for the coagulation of rubber particles in latex cells, resulting in the partial or complete stoppage of latex flow. This study set out to assess biochemical and histological changes as well as changes in gene expression in latex and phloem tissues from trees grown under various harvesting systems. The Tapping Panel Dryness (TPD) susceptibility of Hevea clones was found to be related to some biochemical parameters, such as low sucrose and high inorganic phosphorus contents. A high tapping frequency and ethephon stimulation induced early TPD occurrence in a high latex metabolism clone and late occurrence in a low latex metabolism clone. TPD-affected trees had a smaller number of laticifer vessels than healthy trees, suggesting a modification of cambial activity. Differential transcript abundance was observed for twenty-seven candidate genes related to TPD occurrence in latex and phloem tissues for ROS-scavenging, ethylene biosynthesis and signalling genes. The predicted function for some Ethylene Response Factor genes suggested that these candidate genes should play an important role in regulating susceptibility to TPD. PMID:26247941
2010-01-01
Background Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.
Conclusion In this study we could identify promising candidate genes using network based machine learning approaches even if no knowledge is available about the disease or phenotype. PMID:20840752
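The error-reduction figure in the abstract above can be reproduced from the two reported AUC values. A minimal sketch, assuming that "error" means 1 - AUC (the abstract does not define it) and using a hypothetical function name:

```python
def relative_error_reduction(auc_baseline: float, auc_new: float) -> float:
    """Fraction by which the error (assumed to be 1 - AUC) shrinks
    when moving from the baseline ranking to the new ranking."""
    error_baseline = 1.0 - auc_baseline
    error_new = 1.0 - auc_new
    return (error_baseline - error_new) / error_baseline


# AUC values as reported in the abstract:
# Simple Expression Ranking = 83.7%, Heat Kernel Diffusion Ranking = 92.3%
reduction = relative_error_reduction(0.837, 0.923)
print(f"{reduction * 100:.1f}%")  # prints 52.8%
```

Under this assumption the arithmetic is (16.3 - 7.7) / 16.3, which rounds to the 52.8% quoted in the abstract.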
Zhou, Liang-Yun; Mo, Ge; Wang, Sheng; Tang, Jin-Fu; Yue, Hong; Huang, Lu-Qi; Shao, Ai-Juan; Guo, Lan-Ping
2014-03-01
In this study, Actin, 18S rRNA, PAL, GAPDH and CPR of Artemisia annua were selected as candidate reference genes, and gene-specific primers for real-time PCR were designed. geNorm, NormFinder, BestKeeper, Delta CT and RefFinder were then used to evaluate their expression stability in the leaves of A. annua treated with different concentrations of Cd, with the aim of finding a reliable reference gene for gene-expression analysis. The results showed that there were some significant differences among the candidate reference genes under different treatments and that the order of expression stability of the candidate reference genes was Actin > 18S rRNA > PAL > GAPDH > CPR. These results suggested that Actin, 18S rRNA and PAL could be used as ideal reference genes for gene expression analysis in A. annua, and multiple internal control genes were adopted for result calibration. In addition, differences in expression stability of candidate reference genes in the leaves of A. annua under the same concentrations of Cd were observed, which suggested that screening of candidate reference genes is needed even under the same treatment. To the best of our knowledge, this study is the first to provide ideal reference genes under Cd treatment in the leaves of A. annua, and it offers a reference for gene expression analysis of A. annua under other conditions.
An Evolutionary Approach for Identifying Driver Mutations in Colorectal Cancer
Leder, Kevin; Riester, Markus; Iwasa, Yoh; Lengauer, Christoph; Michor, Franziska
2015-01-01
The traditional view of cancer as a genetic disease that can successfully be treated with drugs targeting mutant onco-proteins has motivated whole-genome sequencing efforts in many human cancer types. However, only a subset of mutations found within the genomic landscape of cancer is likely to provide a fitness advantage to the cell. Distinguishing such “driver” mutations from innocuous “passenger” events is critical for prioritizing the validation of candidate mutations in disease-relevant models. We design a novel statistical index, called the Hitchhiking Index, which reflects the probability that any observed candidate gene is a passenger alteration, given the frequency of alterations in a cross-sectional cancer sample set, and apply it to a mutational data set in colorectal cancer. Our methodology is based upon a population dynamics model of mutation accumulation and selection in colorectal tissue prior to cancer initiation as well as during tumorigenesis. This methodology can be used to aid in the prioritization of candidate mutations for functional validation and contributes to the process of drug discovery. PMID:26379039
Identification and analysis of pig chimeric mRNAs using RNA sequencing data
2012-01-01
Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain poorly characterized. Here we identified some chimeric mRNAs in pigs and analyzed their expression across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. Of the candidates, 618 had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junctions overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events showed intermediate variance in expression across individuals and breeds, and non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. Eighty-one of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We propose a model in which trans-acting factors, such as CTCF, induce the spatial organisation of parental genes into the same transcription factory so that the parental genes are coordinately transcribed, giving rise to chimeric mRNAs. PMID:22925561
Nguyen, Dinh-Duc; Lee, Dong Gyu; Kim, Sinae; Kang, Keunsoo; Rhee, Je-Keun; Chang, Suhwan
2018-05-14
BRCA1 is a multifunctional tumor suppressor involved in several essential cellular processes. Although many of these functions are driven by or related to its transcriptional/epigenetic regulator activity, there has been no genome-wide study to reveal the transcriptional/epigenetic targets of BRCA1. Therefore, we conducted a comprehensive analysis of genomics/transcriptomics data to identify novel BRCA1 target genes. We first analyzed ENCODE data with BRCA1 chromatin immunoprecipitation (ChIP)-sequencing results and identified a set of genes with a promoter occupied by BRCA1. We collected 3085 loci with a BRCA1 ChIP signal from four cell lines and calculated the distance between the loci and the nearest gene transcription start site (TSS). Overall, 66.5% of the BRCA1-bound loci fell into a 2-kb region around the TSS, suggesting a role in transcriptional regulation. We selected 45 candidate genes based on gene expression correlation data, obtained from two GEO (Gene Expression Omnibus) datasets and TCGA data of human breast cancer, compared to BRCA1 expression levels. Among them, we further tested three genes (MEIS2, CKS1B and FADD) and verified FADD as a novel direct target of BRCA1 by ChIP, RT-PCR, and a luciferase reporter assay. Collectively, our data demonstrate genome-wide transcriptional regulation by BRCA1 and suggest target genes as biomarker candidates for BRCA1-associated breast cancer.
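The TSS-distance step described in the abstract above can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes a single chromosome, integer coordinates, and that "a 2-kb region around the TSS" means within 1 kb on either side. The function names and toy coordinates are hypothetical.

```python
from bisect import bisect_left


def nearest_tss_distance(position: int, sorted_tss: list[int]) -> int:
    """Absolute distance from a locus to the closest TSS (same chromosome)."""
    i = bisect_left(sorted_tss, position)
    distances = []
    if i < len(sorted_tss):          # closest TSS at or to the right
        distances.append(sorted_tss[i] - position)
    if i > 0:                        # closest TSS to the left
        distances.append(position - sorted_tss[i - 1])
    return min(distances)


def fraction_near_tss(loci: list[int], tss: list[int], half_window: int = 1000) -> float:
    """Share of loci lying within +/- half_window bp of their nearest TSS."""
    sorted_tss = sorted(tss)
    near = sum(1 for p in loci if nearest_tss_distance(p, sorted_tss) <= half_window)
    return near / len(loci)


# Toy coordinates, invented purely for illustration
toy_tss = [1000, 5000, 20000]
toy_loci = [1200, 4500, 10000, 20900]
print(fraction_near_tss(toy_loci, toy_tss))  # prints 0.75
```

A real analysis would repeat this per chromosome and would need to settle how the window is defined before comparing against the 66.5% reported.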
Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis
Grassi, Elena; Damasco, Christian; Silengo, Lorenzo; Oti, Martin; Provero, Paolo; Di Cunto, Ferdinando
2008-01-01
Background Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. Methodology/Principal Findings We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases. Conclusion Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes. PMID:18369433
Moncrieffe, Halima; Hinks, Anne; Ursu, Simona; Kassoumeri, Laura; Etheridge, Angela; Hubank, Mike; Martin, Paul; Weiler, Tracey; Glass, David N; Thompson, Susan D.; Thomson, Wendy; Wedderburn, Lucy R
2010-01-01
Objectives Little is known about mechanisms of efficacy of methotrexate (MTX) in childhood arthritis, or genetic influences upon response to MTX. The aims of this study were to use gene expression profiling to identify novel pathways/genes altered by MTX and then investigate these genes for genotype associations with response to MTX treatment. Methods Gene expression profiling before and after MTX treatment was performed on 11 children with juvenile idiopathic arthritis (JIA) treated with MTX, in whom response at 6 months of treatment was defined. Genes showing the most differential gene expression after treatment were selected for SNP genotyping. Genotype frequencies were compared between non-responders and responders (ACR-Ped70). An independent cohort was available for validation. Results Gene expression profiling before and after MTX treatment revealed 1222 differentially expressed probe sets (fold change > 1.7, p < 0.05) and 1065 when restricted to full responder cases only. Six highly differentially expressed genes were analysed for genetic association with response to MTX. Three SNPs in the SLC16A7 gene showed significant association with MTX response. One SNP showed validated association in an independent cohort. Conclusions This study is the first, to our knowledge, to evaluate gene expression profiles in children with JIA before and after MTX, and to analyse genetic variation in differentially expressed genes. We have identified a gene which may contribute to genetic variability in MTX response in JIA, and established as proof of principle that genes which are differentially expressed at mRNA level after drug administration may also be good candidates for genetic analysis. PMID:20827233
[Strategies of elucidation of biosynthetic pathways of natural products].
Zou, Li-Qiu; Kuang, Xue-Jun; Sun, Chao; Chen, Shi-Lin
2016-11-01
Elucidation of the biosynthetic pathways of natural products is not only the major goal of herb genomics, but also the solid foundation of synthetic biology of natural products. This paper reviews recent advances in this field and puts forward strategies to elucidate the biosynthetic pathways of natural products. Firstly, a proposed biosynthetic pathway should be set up based on well-established knowledge of chemical reactions and information on the identified compounds, as well as isotope tracer studies. Secondly, candidate genes possibly involved in the biosynthetic pathway are screened out by co-expression analysis and/or gene cluster mining. Lastly, all the candidate genes are heterologously expressed in a host and the enzymes involved in the biosynthetic pathway are characterized by activity assay. Sometimes, the function of the enzyme in the original plant can be further studied by RNAi or VIGS technology. Understanding the biosynthetic pathways of natural products will contribute to the supply of new lead compounds by synthetic biology and provide "functional markers" for herbal molecular breeding, thus boosting the development of traditional Chinese medicine agriculture. Copyright© by the Chinese Pharmaceutical Association.
Population genomics reveals a candidate gene involved in bumble bee pigmentation.
Pimsler, Meaghan L; Jackson, Jason M; Lozier, Jeffrey D
2017-05-01
Variation in bumble bee color patterns is well-documented within and between species. Identifying the genetic mechanisms underlying such variation may be useful in revealing evolutionary forces shaping rapid phenotypic diversification. The widespread North American species Bombus bifarius exhibits regional variation in abdominal color forms, ranging from red-banded to black-banded phenotypes and including geographically and phenotypically intermediate forms. Identifying genomic regions linked to this variation has been complicated by strong, near species level, genome-wide differentiation between red- and black-banded forms. Here, we instead focus on the closely related black-banded and intermediate forms that both belong to the subspecies B. bifarius nearcticus. We analyze an RNA sequencing (RNAseq) data set and identify a cluster of single nucleotide polymorphisms (SNPs) within one gene, Xanthine dehydrogenase/oxidase-like, that exhibit highly unusual differentiation compared to the rest of the sequenced genome. Homologs of this gene contribute to pigmentation in other insects, and results thus represent a strong candidate for investigating the genetic basis of pigment variation in B. bifarius and other bumble bee mimicry complexes.
Quantitative trait loci controlling leaf venation in Arabidopsis.
Rishmawi, Louai; Bühler, Jonas; Jaegle, Benjamin; Hülskamp, Martin; Koornneef, Maarten
2017-08-01
Leaf veins provide the mechanical support and are responsible for the transport of nutrients and water to the plant. High vein density is a prerequisite for plants to have C4 photosynthesis. We investigated the genetic variation and genetic architecture of leaf venation traits within the species Arabidopsis thaliana using natural variation. Leaf venation traits, including leaf vein density (LVD) were analysed in 66 worldwide accessions and 399 lines of the multi-parent advanced generation intercross population. It was shown that there is no correlation between LVD and photosynthesis parameters within A. thaliana. Association mapping was performed for LVD and identified 16 and 17 putative quantitative trait loci (QTLs) in the multi-parent advanced generation intercross and worldwide sets, respectively. There was no overlap between the identified QTLs suggesting that many genes can affect the traits. In addition, linkage mapping was performed using two biparental recombinant inbred line populations. Combining linkage and association mapping revealed seven candidate genes. For one of the candidate genes, RCI2c, we demonstrated its function in leaf venation patterning. © 2017 John Wiley & Sons Ltd.
Juraeva, Dilafruz; Haenisch, Britta; Zapatka, Marc; Frank, Josef; Witt, Stephanie H; Mühleisen, Thomas W; Treutlein, Jens; Strohmaier, Jana; Meier, Sandra; Degenhardt, Franziska; Giegling, Ina; Ripke, Stephan; Leber, Markus; Lange, Christoph; Schulze, Thomas G; Mössner, Rainald; Nenadic, Igor; Sauer, Heinrich; Rujescu, Dan; Maier, Wolfgang; Børglum, Anders; Ophoff, Roel; Cichon, Sven; Nöthen, Markus M; Rietschel, Marcella; Mattheisen, Manuel; Brors, Benedikt
2014-06-01
In the present study, an integrated hierarchical approach was applied to: (1) identify pathways associated with susceptibility to schizophrenia; (2) detect genes that may be potentially affected in these pathways since they contain an associated polymorphism; and (3) annotate the functional consequences of such single-nucleotide polymorphisms (SNPs) in the affected genes or their regulatory regions. The Global Test was applied to detect schizophrenia-associated pathways using discovery and replication datasets comprising 5,040 and 5,082 individuals of European ancestry, respectively. Information concerning functional gene-sets was retrieved from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and the Molecular Signatures Database. Fourteen of the gene-sets or pathways identified in the discovery dataset were confirmed in the replication dataset. These include functional processes involved in transcriptional regulation and gene expression, synapse organization, cell adhesion, and apoptosis. For two genes, i.e. CTCF and CACNB2, evidence for association with schizophrenia was available (at the gene-level) in both the discovery study and published data from the Psychiatric Genomics Consortium schizophrenia study. Furthermore, these genes mapped to four of the 14 presently identified pathways. Several of the SNPs assigned to CTCF and CACNB2 have potential functional consequences, and a gene in close proximity to CACNB2, i.e. ARL5B, was identified as a potential gene of interest. Application of the present hierarchical approach thus allowed: (1) identification of novel biological gene-sets or pathways with potential involvement in the etiology of schizophrenia, as well as replication of these findings in an independent cohort; (2) detection of genes of interest for future follow-up studies; and (3) the highlighting of novel genes in previously reported candidate regions for schizophrenia.
Identification of Inherited Retinal Disease-Associated Genetic Variants in 11 Candidate Genes.
Astuti, Galuh D N; van den Born, L Ingeborgh; Khan, M Imran; Hamel, Christian P; Bocquet, Béatrice; Manes, Gaël; Quinodoz, Mathieu; Ali, Manir; Toomes, Carmel; McKibbin, Martin; El-Asrag, Mohammed E; Haer-Wigman, Lonneke; Inglehearn, Chris F; Black, Graeme C M; Hoyng, Carel B; Cremers, Frans P M; Roosing, Susanne
2018-01-10
Inherited retinal diseases (IRDs) display an enormous genetic heterogeneity. Whole exome sequencing (WES) recently identified genes that were mutated in a small proportion of IRD cases. Consequently, finding a second case or family carrying pathogenic variants in the same candidate gene often is challenging. In this study, we searched for novel candidate IRD gene-associated variants in isolated IRD families, assessed their causality, and searched for novel genotype-phenotype correlations. Whole exome sequencing was performed in 11 probands affected with IRDs. Homozygosity mapping data was available for five cases. Variants with minor allele frequencies ≤ 0.5% in public databases were selected as candidate disease-causing variants. These variants were ranked based on their: (a) presence in a gene that was previously implicated in IRD; (b) minor allele frequency in the Exome Aggregation Consortium database (ExAC); (c) in silico pathogenicity assessment using the combined annotation dependent depletion (CADD) score; and (d) interaction of the corresponding protein with known IRD-associated proteins. Twelve unique variants were found in 11 different genes in 11 IRD probands. Novel autosomal recessive and dominant inheritance patterns were found for variants in Small Nuclear Ribonucleoprotein U5 Subunit 200 (SNRNP200) and Zinc Finger Protein 513 (ZNF513), respectively. Using our pathogenicity assessment, a variant in DEAH-Box Helicase 32 (DHX32) was the top ranked novel candidate gene to be associated with IRDs, followed by eight medium and lower ranked candidate genes. The identification of candidate disease-associated sequence variants in 11 single families underscores the notion that the previously identified IRD-associated genes collectively carry > 90% of the defects implicated in IRDs. To identify multiple patients or families with variants in the same gene and thereby provide extra proof for pathogenicity, worldwide data sharing is needed.
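The filter-then-rank procedure described in the abstract above can be sketched as below. This is a hedged illustration, not the authors' scoring scheme: the abstract lists the four criteria but not how they were weighted, so the lexicographic ordering used here is an assumption, and the `Variant` class, field names and toy records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    maf: float               # minor allele frequency in a public database
    cadd: float              # CADD score (higher = more likely deleterious)
    in_known_ird_gene: bool  # criterion (a)
    interacts_with_ird: bool # criterion (d)


def rank_candidates(variants: list[Variant], maf_cutoff: float = 0.005) -> list[Variant]:
    """Keep variants with MAF <= 0.5%, then order them by criteria (a)-(d).

    Assumed ordering: known IRD gene first, then rarer allele,
    then higher CADD score, then protein interaction evidence.
    """
    rare = [v for v in variants if v.maf <= maf_cutoff]
    return sorted(
        rare,
        key=lambda v: (not v.in_known_ird_gene, v.maf, -v.cadd, not v.interacts_with_ird),
    )


# Toy records, invented purely for illustration
toy_variants = [
    Variant("var_A", maf=0.0001, cadd=25.0, in_known_ird_gene=False, interacts_with_ird=True),
    Variant("var_B", maf=0.02, cadd=33.0, in_known_ird_gene=True, interacts_with_ird=True),
    Variant("var_C", maf=0.001, cadd=30.0, in_known_ird_gene=True, interacts_with_ird=False),
    Variant("var_D", maf=0.0001, cadd=31.0, in_known_ird_gene=False, interacts_with_ird=False),
]
ranked_names = [v.name for v in rank_candidates(toy_variants)]
print(ranked_names)  # prints ['var_C', 'var_D', 'var_A']
```

Here var_B is dropped by the frequency filter despite its high CADD score, which shows why the filter is applied before any ranking.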
Torres, Katherine J; Castrillon, Carlos E; Moss, Eli L; Saito, Mayuko; Tenorio, Roy; Molina, Douglas M; Davies, Huw; Neafsey, Daniel E; Felgner, Philip; Vinetz, Joseph M; Gamboa, Dionicia
2015-04-15
Persons with blood-stage Plasmodium falciparum parasitemia in the absence of symptoms are considered to be clinically immune. We hypothesized that asymptomatic subjects with P. falciparum parasitemia would differentially recognize a subset of P. falciparum proteins on a genomic scale. Compared with symptomatic subjects, sera from clinically immune, asymptomatically infected individuals differentially recognized 51 P. falciparum proteins, including the established vaccine candidate PfMSP1. Novel, hitherto unstudied hypothetical proteins and other proteins not previously recognized as potential vaccine candidates were also differentially recognized. Genes encoding the proteins differentially recognized by the Peruvian clinically immune individuals exhibited a significant enrichment of nonsynonymous nucleotide variation, an observation consistent with these genes undergoing immune selection. A limited set of P. falciparum protein antigens was associated with the development of naturally acquired clinical immunity in the low-transmission setting of the Peruvian Amazon. These results imply that, even in a low-transmission setting, an asexual blood-stage vaccine designed to reduce clinical malaria symptoms will likely need to contain large numbers of often-polymorphic proteins, a finding at odds with many current efforts in the design of vaccines against asexual blood-stage P. falciparum. © The Author 2014. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize
2010-01-01
Background Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu. PMID:20946609
Teaniniuraitemoana, Vaihiti; Huvet, Arnaud; Levy, Peva; Gaertner-Mazouni, Nabila; Gueguen, Yannick; Le Moullac, Gilles
2015-01-01
The genomics of economically important marine bivalves is studied to provide better understanding of the molecular mechanisms underlying their different reproductive strategies. The recently available gonad transcriptome of the black-lip pearl oyster Pinctada margaritifera is a novel and powerful resource to study these mechanisms in marine mollusks displaying hermaphroditic features. In this study, RNAseq quantification data of the P. margaritifera gonad transcriptome were analyzed to identify candidate genes in histologically-characterized gonad samples to provide molecular signatures of the female and male sexual pathway in this pearl oyster. Based on the RNAseq data set, stringent expression analysis identified 1,937 contigs that were differentially expressed between the gonad histological categories. From the hierarchical clustering analysis, a new reproduction model is proposed, based on a dual histo-molecular analytical approach. Nine candidate genes were identified as markers of the sexual pathway: 7 for the female pathway and 2 for the male one. Their mRNA levels were assayed by real-time PCR on a new set of gonadic samples. A clustering method revealed four principal expression patterns based on the relative gene expression ratio. A multivariate regression tree built on these new samples and validated on the previously analyzed RNAseq samples showed that the sexual pathway of P. margaritifera can be predicted by a 3-gene-pair expression ratio model of 4 different genes: pmarg-43476, pmarg-foxl2, pmarg-54338 and pmarg-fem1-like. This 3-gene-pair expression ratio model strongly suggests that only pmarg-foxl2 and pmarg-fem1-like are implicated in the sex inversion of P. margaritifera. This work provides the first histo-molecular model of P. margaritifera reproduction and a gene expression signature of its sexual pathway discriminating the male and female pathways.
These represent useful tools for understanding and studying sex inversion, sex differentiation and sex determinism in this species and other related species for aquaculture purposes such as genetic selection programs. PMID:25815473
2013-01-01
Background Autism spectrum disorder (ASD) is a heterogeneous behavioral disorder or condition characterized by severe impairment of social engagement and the presence of repetitive activities. The molecular etiology of ASD is still largely unknown despite a strong genetic component. Part of the difficulty in turning genetics into disease mechanisms and potentially new therapeutics is the sheer number and diversity of the genes that have been associated with ASD and ASD symptoms. The goal of this work is to use shRNA-generated models of genetic defects proposed as causative for ASD to identify the common pathways that might explain how they produce a core clinical disability. Methods Transcript levels of Mecp2, Mef2a, Mef2d, Fmr1, Nlgn1, Nlgn3, Pten, and Shank3 were knocked down in mouse primary neuron cultures using shRNA constructs. Whole genome expression analysis was conducted for each of the knockdown cultures as well as a mock-transduced culture and a culture exposed to a lentivirus expressing an anti-luciferase shRNA. Gene set enrichment and a causal reasoning engine were employed to identify pathway level perturbations generated by the transcript knockdown. Results Quantification of the shRNA targets confirmed the successful knockdown at the transcript and protein levels of at least 75% for each of the genes. After subtracting out potential artifacts caused by viral infection, gene set enrichment and causal reasoning engine analysis showed that a significant number of gene expression changes mapped to pathways associated with neurogenesis, long-term potentiation, and synaptic activity. Conclusions This work demonstrates that despite the complex genetic nature of ASD, there are common molecular mechanisms that connect many of the best established autism candidate genes.
By identifying the key regulatory checkpoints in the interlinking transcriptional networks underlying autism, we are better able to discover the ideal points of intervention that provide the broadest efficacy across the diverse population of autism patients. PMID:24238429
Lanz, Thomas A; Guilmette, Edward; Gosink, Mark M; Fischer, James E; Fitzgerald, Lawrence W; Stephenson, Diane T; Pletcher, Mathew T
2013-11-15
Autism spectrum disorder (ASD) is a heterogeneous behavioral disorder or condition characterized by severe impairment of social engagement and the presence of repetitive activities. The molecular etiology of ASD is still largely unknown despite a strong genetic component. Part of the difficulty in turning genetics into disease mechanisms and potentially new therapeutics is the sheer number and diversity of the genes that have been associated with ASD and ASD symptoms. The goal of this work is to use shRNA-generated models of genetic defects proposed as causative for ASD to identify the common pathways that might explain how they produce a core clinical disability. Transcript levels of Mecp2, Mef2a, Mef2d, Fmr1, Nlgn1, Nlgn3, Pten, and Shank3 were knocked down in mouse primary neuron cultures using shRNA constructs. Whole genome expression analysis was conducted for each of the knockdown cultures as well as a mock-transduced culture and a culture exposed to a lentivirus expressing an anti-luciferase shRNA. Gene set enrichment and a causal reasoning engine were employed to identify pathway level perturbations generated by the transcript knockdown. Quantification of the shRNA targets confirmed the successful knockdown at the transcript and protein levels of at least 75% for each of the genes. After subtracting out potential artifacts caused by viral infection, gene set enrichment and causal reasoning engine analysis showed that a significant number of gene expression changes mapped to pathways associated with neurogenesis, long-term potentiation, and synaptic activity. This work demonstrates that despite the complex genetic nature of ASD, there are common molecular mechanisms that connect many of the best established autism candidate genes.
By identifying the key regulatory checkpoints in the interlinking transcriptional networks underlying autism, we are better able to discover the ideal points of intervention that provide the broadest efficacy across the diverse population of autism patients.
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.
Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P
2010-10-07
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu.
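As a rough illustration of the kind of cross-dataset query a relational design like this supports, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name, column names, gene identifiers and study labels are all hypothetical and are not taken from CFRAS-DB.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'evidence' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE evidence (gene_id TEXT, evidence_type TEXT, study TEXT)"
    )
    # Toy rows, invented purely for illustration.
    conn.executemany(
        "INSERT INTO evidence VALUES (?, ?, ?)",
        [
            ("geneA", "microarray", "study1"),
            ("geneA", "QTL", "study2"),
            ("geneA", "proteomics", "study3"),
            ("geneB", "microarray", "study1"),
            ("geneC", "QTL", "study2"),
            ("geneC", "SNP", "study4"),
        ],
    )
    return conn


def genes_with_multiple_evidence(conn, min_types=2):
    """Return (gene_id, n_types) for genes backed by >= min_types evidence types."""
    cursor = conn.execute(
        """
        SELECT gene_id, COUNT(DISTINCT evidence_type) AS n_types
        FROM evidence
        GROUP BY gene_id
        HAVING COUNT(DISTINCT evidence_type) >= ?
        ORDER BY n_types DESC, gene_id
        """,
        (min_types,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(genes_with_multiple_evidence(build_demo_db()))
```

Ranking genes by the number of independent evidence types is only one possible way to prioritise candidates; the abstract does not say how CFRAS-DB ranks results.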
Crosley, E J; Elliot, M G; Christians, J K; Crespi, B J
2013-02-01
Recent evidence from chimpanzees and gorillas has raised doubts that preeclampsia is a uniquely human disease. The deep extravillous trophoblast (EVT) invasion and spiral artery remodeling that characterize our placenta (and are abnormal in preeclampsia) are shared within great apes, setting Homininae apart from Hylobatidae and Old World Monkeys, which show much shallower trophoblast invasion and limited spiral artery remodeling. We hypothesize that the evolution of a more invasive placenta in the lineage ancestral to the great apes involved positive selection on genes crucial to EVT invasion and spiral artery remodeling. Furthermore, identification of placentally-expressed genes under selection in this lineage may identify novel genes involved in placental development. We tested for positive selection in approximately 18,000 genes using the ratio of non-synonymous to synonymous amino acid substitution for protein-coding DNA. DAVID Bioinformatics Resources identified biological processes enriched in positively selected genes, including processes related to EVT invasion and spiral artery remodeling. Analyses revealed 295 and 264 genes under significant positive selection on the branches ancestral to Hominidae (Human, Chimp, Gorilla, Orangutan) and Homininae (Human, Chimp, Gorilla), respectively. Gene ontology analysis of these gene sets demonstrated significant enrichments for several functional gene clusters relevant to preeclampsia risk, and sets of placentally-expressed genes that have been linked with preeclampsia and/or trophoblast invasion in other studies. Our study represents a novel approach to the identification of candidate genes and amino acid residues involved in placental pathologies by implicating them in the evolution of highly-invasive placenta. Copyright © 2012 Elsevier Ltd. All rights reserved.
Silva, C; Garcia-Mas, J; Sánchez, A M; Arús, P; Oliveira, M M
2005-03-01
Blooming time is one of the most important agronomic traits in almond. Biochemical and molecular events underlying flowering regulation must be understood before methods to stimulate late flowering can be developed. Attempts to elucidate the genetic control of this process have led to the identification of a major gene (Lb) and quantitative trait loci (QTLs) linked to observed phenotypic differences, but although this gene and these QTLs have been placed on the Prunus reference genetic map, their sequences and specific functions remain unknown. The aim of our investigation was to associate these loci with known genes using a candidate gene approach. Two almond cDNAs and eight Prunus expressed sequence tags were selected as candidate genes (CGs) since their sequences were highly identical to those of flowering regulatory genes characterized in other species. The CGs were amplified from both parental lines of the mapping population using specific primers. Sequence comparison revealed DNA polymorphisms between the parental lines, mainly of the single nucleotide type. Polymorphisms were used to develop co-dominant cleaved amplified polymorphic sequence markers or length polymorphisms based on insertion/deletion events for mapping the candidate genes on the Prunus reference map. Ten candidate genes were assigned to six linkage groups in the Prunus genome. The positions of two of these were compatible with the regions where two QTLs for blooming time were detected. One additional candidate was localized close to the position of the Evergrowing gene, which determines a non-deciduous behaviour in peach.
Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus
NASA Astrophysics Data System (ADS)
Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat
2016-11-01
In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil, and its gelatinase is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes specific to each of these two meat species. The main focus of this research was to identify the gelatinase-encoding gene of L. sphaericus by bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) long and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp long and encodes a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp long and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1/R1, F2/R2 and F3/R3, were designed to target short cDNA sequences by PCR. PCR reliably yielded amplicons of 1563 bp for candidate gene P1 and 1701 bp for candidate gene P3. Thus, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
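The reported gene and protein sizes can be cross-checked with simple codon arithmetic. The sketch below assumes that each reported bp length is a complete open reading frame that includes one terminal stop codon; that assumption is not stated in the abstract, although it is consistent with the three reported pairs of numbers.

```python
def protein_length(orf_bp, has_stop_codon=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumes (hypothetically) that the ORF length is a whole number of codons
    and, by default, that the final codon is a stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if has_stop_codon else codons


# Sizes quoted in the abstract for the three candidate genes.
candidates = {"P1": 1563, "P2": 1776, "P3": 1701}
lengths = {name: protein_length(bp) for name, bp in candidates.items()}
print(lengths)  # {'P1': 520, 'P2': 591, 'P3': 566}
```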
Comparative mRNA analysis of behavioral and genetic mouse models of aggression.
Malki, Karim; Tosto, Maria G; Pain, Oliver; Sluyter, Frans; Mineur, Yann S; Crusio, Wim E; de Boer, Sietse; Sandnabba, Kenneth N; Kesserwani, Jad; Robinson, Edward; Schalkwyk, Leonard C; Asherson, Philip
2016-04-01
Mouse models of aggression have traditionally compared strains, most notably BALB/cJ and C57BL/6. However, these strains were not designed to study aggression despite differences in aggression-related traits and distinct reactivity to stress. This study compared the expression of genes differentially regulated in a stress (behavioral) mouse model of aggression with those from a recent genetic mouse model of aggression. The study used a discovery-replication design based on two independent mRNA studies of mouse brain tissue. The discovery study identified strain (BALB/cJ and C57BL/6J) × stress (chronic mild stress or control) interactions. Probe sets differentially regulated in the discovery set were intersected with those uncovered in the replication study, which evaluated differences between high and low aggressive animals from three strains specifically bred to study aggression. Network analysis was conducted on overlapping genes uncovered across both studies. A significant overlap was found, with the genetic mouse study sharing 1,916 probe sets with the stress model. Fifty-one probe sets, mapping to 50 known genes, were found to be strongly dysregulated across both studies. Network analysis revealed two plausible pathways, one centered on the UBC gene hub, which encodes ubiquitin, a protein well known for its role in protein degradation, and another on P38 MAPK. Findings from this study support the stress model of aggression, which showed remarkable molecular overlap with a genetic model. The study uncovered a set of candidate genes including the Erg2 gene, which has previously been implicated in different psychopathologies. The gene networks uncovered point to a redox pathway as potentially being implicated in aggression-related behaviors. © 2016 Wiley Periodicals, Inc.
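The intersection step of a discovery-replication design, and one common way to judge whether an overlap is larger than chance, can be sketched as follows. The abstract does not say which overlap test was used, so the hypergeometric tail probability here is an assumption, and the probe identifiers are invented toy values.

```python
from math import comb


def overlap_pvalue(n_universe, n_a, n_b, n_overlap):
    """P(X >= n_overlap) when n_b items are drawn from n_universe,
    of which n_a are 'hits' (upper tail of the hypergeometric distribution)."""
    total = comb(n_universe, n_b)
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_universe - n_a, n_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical probe-set identifiers, for illustration only.
discovery = {"probe1", "probe2", "probe3", "probe4"}
replication = {"probe3", "probe4", "probe5"}
shared = sorted(discovery & replication)
print(shared)  # ['probe3', 'probe4']

# Toy universe of 10 probe sets on the array.
print(overlap_pvalue(10, len(discovery), len(replication), len(shared)))
```

Exact integer binomial coefficients keep the calculation stable for small examples; for array-scale numbers a statistics library would be the more practical choice.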
Reliable pre-eclampsia pathways based on multiple independent microarray data sets.
Kawasaki, Kaoru; Kondoh, Eiji; Chigusa, Yoshitsugu; Ujita, Mari; Murakami, Ryusuke; Mogami, Haruta; Brown, J B; Okuno, Yasushi; Konishi, Ikuo
2015-02-01
Pre-eclampsia is a multifactorial disorder characterized by heterogeneous clinical manifestations. Gene expression profiling of preeclamptic placenta has provided different and even opposite results, partly due to data compromised by various experimental artefacts. Here we aimed to identify reliable pre-eclampsia-specific pathways using multiple independent microarray data sets. Gene expression data of control and preeclamptic placentas were obtained from Gene Expression Omnibus. Single-sample gene-set enrichment analysis was performed to generate gene-set activation scores of 9707 pathways obtained from the Molecular Signatures Database. Candidate pathways were identified by t-test-based screening using data sets GSE10588, GSE14722 and GSE25906. Additionally, recursive feature elimination was applied to arrive at a further reduced set of pathways. To assess the validity of the pre-eclampsia pathways, a statistically-validated protocol was executed using five data sets, including two other independent validation data sets, GSE30186 and GSE44711. Quantitative real-time PCR was performed for genes in a panel of potential pre-eclampsia pathways using placentas of 20 women with normal or severe preeclamptic singleton pregnancies (n = 10 each). A panel of ten pathways was found to discriminate women with pre-eclampsia from controls with high accuracy. Among these were pathways not previously associated with pre-eclampsia, such as the GABA receptor pathway, as well as pathways that have already been linked to pre-eclampsia, such as the glutathione and CDKN1C pathways. mRNA expression of GABRA3 (GABA receptor pathway), GCLC and GCLM (glutathione metabolic pathway), and CDKN1C was significantly reduced in the preeclamptic placentas. In conclusion, ten accurate and reliable pre-eclampsia pathways were identified based on multiple independent microarray data sets.
A pathway-based classification may be a worthwhile approach to elucidate the pathogenesis of pre-eclampsia. © The Author 2014. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved.
Duffy, A; Turecki, G; Grof, P; Cavazzoni, P; Grof, E; Joober, R; Ahrens, B; Berghöfer, A; Müller-Oerlinghausen, B; Dvoráková, M; Libigerová, E; Vojtĕchovský, M; Zvolský, P; Nilsson, A; Licht, R W; Rasmussen, N A; Schou, M; Vestergaard, P; Holzinger, A; Schumann, C; Thau, K; Robertson, C; Rouleau, G A; Alda, M
2000-01-01
OBJECTIVE: To test for genetic linkage and association with GABAergic candidate genes in lithium-responsive bipolar disorder. DESIGN: Polymorphisms located in genes that code for GABRA3, GABRA5 and GABRB3 subunits of the GABAA receptor were investigated using association and linkage strategies. PARTICIPANTS: A total of 138 patients with bipolar 1 disorder with a clear response to lithium prophylaxis, selected from specialized lithium clinics in Canada and Europe that are part of the International Group for the Study of Lithium-Treated Patients, and 108 psychiatrically healthy controls. Families of 24 probands were suitable for linkage analysis. OUTCOME MEASURES: The association between the candidate genes and patients with bipolar disorder versus that of controls and genetic linkage within families. RESULTS: There was no significant association or linkage found between lithium-responsive bipolar disorder and the GABAergic candidate genes investigated. CONCLUSIONS: This study does not support a major role for the GABAergic candidate genes tested in lithium-responsive bipolar disorder. PMID:11022400
van Manen, Daniëlle; Bunnik, Evelien M.; van Sighem, Ard I.; Sieberer, Margit; Boeser-Nunnink, Brigitte; de Wolf, Frank; Schuitemaker, Hanneke; Portegies, Peter; Kootstra, Neeltje A.; van 't Wout, Angélique B.
2012-01-01
Background Infection with HIV-1 may result in severe cognitive and motor impairment, referred to as HIV-1-associated dementia (HAD). While its prevalence has dropped significantly in the era of combination antiretroviral therapy, milder neurocognitive disorders persist with a high prevalence. To identify additional therapeutic targets for treating HIV-associated neurocognitive disorders, several candidate gene polymorphisms have been evaluated, but few have been replicated across multiple studies. Methods We here tested 7 candidate gene polymorphisms for association with HAD in a case-control study consisting of 86 HAD cases and 246 non-HAD AIDS patients as controls. Since infected monocytes and macrophages are thought to play an important role in the infection of the brain, 5 recently identified single nucleotide polymorphisms (SNPs) affecting HIV-1 replication in macrophages in vitro were also tested. Results The CCR5 wt/Δ32 genotype was only associated with HAD in individuals who developed AIDS prior to 1991, in agreement with the observed fading effect of this genotype on viral load set point. A significant difference in genotype distribution among all cases and controls irrespective of year of AIDS diagnosis was found only for a SNP in candidate gene PREP1 (p = 1.2 × 10⁻⁵). Prep1 has recently been identified as a transcription factor preferentially binding the −2,518 G allele in the promoter of the gene encoding MCP-1, a protein with a well established role in the etiology of HAD. Conclusion These results support previous findings suggesting an important role for MCP-1 in the onset of HIV-1-associated neurocognitive disorders. PMID:22347417
Sierra, Beatriz; Triska, Petr; Soares, Pedro; Garcia, Gissel; Perez, Ana B; Aguirre, Eglys; Oliveira, Marisa; Cavadas, Bruno; Regnault, Béatrice; Alvarez, Mayling; Ruiz, Didye; Samuels, David C; Sakuntabhai, Anavaj; Pereira, Luisa; Guzman, Maria G
2017-02-01
Ethnic groups can display differential genetic susceptibility to infectious diseases. The arthropod-borne viral dengue disease is one such disease, with empirical and limited genetic evidence showing that African ancestry may be protective against the haemorrhagic phenotype. Global ancestry analysis based on high-throughput genotyping in admixed populations can be used to test this hypothesis, while admixture mapping can map candidate protective genes. A Cuban dengue fever cohort was genotyped using a 2.5 million SNP chip. Global ancestry was ascertained through ADMIXTURE and used in a fine-matched corrected association study, while local ancestry was inferred by the RFMix algorithm. The expression of candidate genes was evaluated by RT-PCR in a Cuban dengue patient cohort and gene set enrichment analysis was performed in a Thai dengue transcriptome. OSBPL10 and RXRA candidate genes were identified, with the most significant SNPs placed in inferred weak enhancers, promoters and lncRNAs. OSBPL10 had significantly lower expression in Africans than Europeans, while for RXRA several SNPs may differentially regulate its transcription between Africans and Europeans. Their expression was confirmed to change through dengue disease progression in Cuban patients and to vary with disease severity in a Thai transcriptome dataset. These genes interact in the LXR/RXR activation pathway that integrates lipid metabolism and immune functions, being a key player in dengue virus entry into cells, its replication therein and in cytokine production. Knockdown of OSBPL10 expression in THP-1 cells by two shRNAs followed by DENV2 infection tests led to a significant reduction in DENV replication, providing direct functional proof that the lower OSBPL10 expression profile in Africans protects this ancestry against dengue disease.
Genome-Wide Association Study for Carcass Traits in an Experimental Nelore Cattle Population
Medeiros de Oliveira Silva, Rafael; Bonvino Stafuzza, Nedenia; de Oliveira Fragomeni, Breno; Miguel Ferreira de Camargo, Gregório; Matos Ceacero, Thaís; Noely dos Santos Gonçalves Cyrillo, Joslaine; Baldi, Fernando; Augusti Boligon, Arione; Zerlotti Mercadante, Maria Eugênia; Lino Lourenco, Daniela; Misztal, Ignacy; Galvão de Albuquerque, Lucia
2017-01-01
The purpose of this study was to identify genomic regions associated with carcass traits in an experimental Nelore cattle population. The studied data set contained 2,306 ultrasound records for longissimus muscle area (LMA), 1,832 for backfat thickness (BF), and 1,830 for rump fat thickness (RF). A high-density SNP panel (BovineHD BeadChip assay 700k, Illumina Inc., San Diego, CA) was used for genotyping. After genomic data quality control, 437,197 SNPs from 761 animals were available, of which 721 had phenotypes for LMA, 669 for BF, and 718 for RF. The SNP solutions were estimated using a single-step genomic BLUP approach (ssGWAS), which calculated the variance explained by windows of 50 consecutive SNPs; regions that accounted for more than 0.5% of the additive genetic variance were used to search for candidate genes. The results indicated that 12, 18, and 15 different windows were associated with LMA, BF, and RF, respectively. Confirming the polygenic nature of the studied traits, 43, 65, and 53 genes were found in those associated windows for LMA, BF, and RF, respectively. Some candidate genes whose functions had already been linked to energy metabolism were found to be associated with fat deposition in this study. In addition, the ALKBH3 and HSD17B12 genes, which are related to fibroblast death and metabolism of steroids, were found to be associated with LMA. The results presented here should help to better understand the genetic and physiologic mechanisms regulating muscle tissue deposition and subcutaneous fat cover in Zebu animals. The identification of candidate genes should help Zebu breeding programs to consider carcass traits as selection criteria in their genetic evaluation. PMID:28118362
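The window-based screening described above can be sketched in a few lines. This is a simplified illustration, not the ssGWAS software used in the study: it assumes non-overlapping windows, takes SNP effects as already estimated, and approximates the additive genetic variance by the variance of the summed SNP contributions across animals. The function names and the toy data are hypothetical.

```python
from statistics import pvariance


def window_variance_shares(genotypes, effects, window=50):
    """Share of total genetic variance attributed to each SNP window.

    genotypes: list of per-animal lists of allele counts (animals x SNPs)
    effects:   list of estimated SNP effects, one per SNP
    Returns a list of (start, stop, share) tuples, stop exclusive.
    """
    n_snps = len(effects)
    total_bv = [sum(z * u for z, u in zip(row, effects)) for row in genotypes]
    total_var = pvariance(total_bv)
    if total_var == 0:
        raise ValueError("total genetic variance is zero")
    shares = []
    for start in range(0, n_snps, window):
        stop = min(start + window, n_snps)
        # Contribution of this window to each animal's genetic value.
        local_bv = [
            sum(row[j] * effects[j] for j in range(start, stop))
            for row in genotypes
        ]
        shares.append((start, stop, pvariance(local_bv) / total_var))
    return shares


def flag_windows(shares, threshold=0.005):
    """Keep windows explaining more than `threshold` (0.5%) of the variance."""
    return [(start, stop, share) for start, stop, share in shares if share > threshold]


# Toy data: 4 animals, 4 SNPs, windows of 2 SNPs.
toy_genotypes = [[0, 1, 1, 2], [1, 1, 0, 2], [2, 0, 1, 0], [1, 2, 2, 1]]
toy_effects = [1.0, 0.0, 0.0, 0.0]
print(flag_windows(window_variance_shares(toy_genotypes, toy_effects, window=2)))
```

Because windows can be correlated through linkage disequilibrium, the shares computed this way need not sum to one.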
Schwarz, Jodi A; Brokstein, Peter B; Voolstra, Christian; Terry, Astrid Y; Miller, David J; Szmant, Alina M; Coffroth, Mary Alice; Medina, Mónica
2008-01-01
Background Scleractinian corals are the foundation of reef ecosystems in tropical marine environments. Their great success is due to interactions with endosymbiotic dinoflagellates (Symbiodinium spp.), with which they are obligately symbiotic. To develop a foundation for studying coral biology and coral symbiosis, we have constructed a set of cDNA libraries and generated and annotated ESTs from two species of corals, Acropora palmata and Montastraea faveolata. Results We generated 14,588 (Ap) and 3,854 (Mf) high quality ESTs from five life history/symbiosis stages (spawned eggs, early-stage planula larvae, late-stage planula larvae either infected with symbionts or uninfected, and adult coral). The ESTs assembled into a set of primarily stage-specific clusters, producing 4,980 (Ap), and 1,732 (Mf) unigenes. The egg stage library, relative to the other developmental stages, was enriched in genes functioning in cell division and proliferation, transcription, signal transduction, and regulation of protein function. Fifteen unigenes were identified as candidate symbiosis-related genes as they were expressed in all libraries constructed from the symbiotic stages and were absent from all of the non symbiotic stages. These include several DNA interacting proteins, and one highly expressed unigene (containing 17 cDNAs) with no significant protein-coding region. A significant number of unigenes (25) encode potential pattern recognition receptors (lectins, scavenger receptors, and others), as well as genes that may function in signaling pathways involved in innate immune responses (toll-like signaling, NFkB p105, and MAP kinases). Comparison between the A. palmata and an A. millepora EST dataset identified ferritin as a highly expressed gene in both datasets that appears to be undergoing adaptive evolution. 
Five unigenes appear to be restricted to the Scleractinia, as they had no homology to any sequences in the nr databases nor to the non-scleractinian cnidarians Nematostella vectensis and Hydra magnipapillata. Conclusion Partial sequencing of 5 cDNA libraries each for A. palmata and M. faveolata has produced a rich set of candidate genes (4,980 genes from A. palmata, and 1,732 genes from M. faveolata) that we can use as a starting point for examining the life history and symbiosis of these two species, as well as to further expand the dataset of cnidarian genes for comparative genomics and evolutionary studies. PMID:18298846
Defining a new candidate gene for amelogenesis imperfecta: from molecular genetics to biochemistry.
Urzúa, Blanca; Ortega-Pinto, Ana; Morales-Bozo, Irene; Rojas-Alcayaga, Gonzalo; Cifuentes, Víctor
2011-02-01
Amelogenesis imperfecta is a group of genetic conditions that affect the structure and clinical appearance of tooth enamel. The types (hypoplastic, hypocalcified, and hypomature) are correlated with defects in different stages of the process of enamel synthesis. Autosomal dominant, recessive, and X-linked types have been previously described. These disorders are considered clinically and genetically heterogeneous in etiology, involving a variety of genes, such as AMELX, ENAM, DLX3, FAM83H, MMP-20, KLK4, and WDR72. The mutations identified within these causal genes explain less than half of all cases of amelogenesis imperfecta. Most of the candidate and causal genes currently identified encode proteins involved in enamel synthesis. We think it is necessary to refocus the search for candidate genes using biochemical processes. This review provides theoretical evidence that the human SLC4A4 gene (sodium bicarbonate cotransporter) may be a new candidate gene.
Warburton, Marilyn L; Williams, William Paul; Hawkins, Leigh; Bridges, Susan; Gresham, Cathy; Harper, Jonathan; Ozkan, Seval; Mylroie, J Erik; Shan, Xueyan
2011-07-01
A public candidate gene testing pipeline for resistance to aflatoxin accumulation or Aspergillus flavus infection in maize is presented here. The pipeline consists of steps for identifying, testing, and verifying the association of selected maize gene sequences with resistance under field conditions. Resources include a database of genetic and protein sequences associated with the reduction in aflatoxin contamination from previous studies; eight diverse inbred maize lines for polymorphism identification within any maize gene sequence; four Quantitative Trait Loci (QTL) mapping populations and one association mapping panel, all phenotyped for aflatoxin accumulation resistance and associated phenotypes; and capacity for Insertion/Deletion (InDel) and SNP genotyping in the population(s) for mapping. To date, ten genes have been identified as possible candidate genes and put through the candidate gene testing pipeline, and results are presented here to demonstrate the utility of the pipeline.
ENU Mutagenesis in Mice Identifies Candidate Genes For Hypogonadism
Weiss, Jeffrey; Hurley, Lisa A.; Harris, Rebecca M.; Finlayson, Courtney; Tong, Minghan; Fisher, Lisa A.; Moran, Jennifer L.; Beier, David R.; Mason, Christopher; Jameson, J. Larry
2012-01-01
Genome-wide mutagenesis was performed in mice to identify candidate genes for male infertility, for which the predominant causes remain idiopathic. Mice were mutagenized using N-ethyl-N-nitrosourea (ENU), bred, and screened for phenotypes associated with the male urogenital system. Fifteen heritable lines were isolated and chromosomal loci were assigned using low density genome-wide SNP arrays. Ten of the fifteen lines were pursued further using higher resolution SNP analysis to narrow the candidate gene regions. Exon sequencing of candidate genes identified mutations in mice with cystic kidneys (Bicc1), cryptorchidism (Rxfp2), restricted germ cell deficiency (Plk4), and severe germ cell deficiency (Prdm9). In two other lines with severe hypogonadism candidate sequencing failed to identify mutations, suggesting defects in genes with previously undocumented roles in gonadal function. These genomic intervals were sequenced in their entirety and a candidate mutation was identified in SnrpE in one of the two lines. The line harboring the SnrpE variant retains substantial spermatogenesis despite small testis size, an unusual phenotype. In addition to the reproductive defects, heritable phenotypes were observed in mice with ataxia (Myo5a), tremors (Pmp22), growth retardation (unknown gene), and hydrocephalus (unknown gene). These results demonstrate that the ENU screen is an effective tool for identifying potential causes of male infertility. PMID:22258617
Filling gaps in PPAR-alpha signaling through comparative nutrigenomics analysis.
Cavalieri, Duccio; Calura, Enrica; Romualdi, Chiara; Marchi, Emmanuela; Radonjic, Marijana; Van Ommen, Ben; Müller, Michael
2009-12-11
The application of high-throughput genomic tools in nutrition research is a widespread practice. However, it is becoming increasingly clear that the outcome of individual expression studies is insufficient for the comprehensive understanding of such a complex field. Currently, the availability of the large amounts of expression data in public repositories has opened up new challenges on microarray data analyses. We have focused on PPARalpha, a ligand-activated transcription factor functioning as fatty acid sensor controlling the gene expression regulation of a large set of genes in various metabolic organs such as liver, small intestine or heart. The function of PPARalpha is strictly connected to the function of its target genes and, although many of these have already been identified, major elements of its physiological function remain to be uncovered. To further investigate the function of PPARalpha, we have applied a cross-species meta-analysis approach to integrate sixteen microarray datasets studying high fat diet and PPARalpha signal perturbations in different organisms. We identified 164 genes (MDEGs) that were differentially expressed in a constant way in response to a high fat diet or to perturbations in PPARs signalling. In particular, we found five genes in yeast which were highly conserved and homologous of PPARalpha targets in mammals, potential candidates to be used as models for the equivalent mammalian genes. Moreover, a screening of the MDEGs for all known transcription factor binding sites and the comparison with a human genome-wide screening of Peroxisome Proliferating Response Elements (PPRE), enabled us to identify, 20 new potential candidate genes that show, both binding site, both change in expression in the condition studied. Lastly, we found a non random localization of the differentially expressed genes in the genome. 
The results presented are potentially of great interest for summarizing the currently available expression data, exploiting the power of in silico analysis filtered by evolutionary conservation. The analysis enabled us to indicate potential gene candidates that could fill in the gaps with regard to the signalling of PPARalpha; moreover, the non-random localization of the differentially expressed genes in the genome suggests that epigenetic mechanisms are of importance in the regulation of the transcription operated by PPARalpha.
Identification of Causal Genes, Networks, and Transcriptional Regulators of REM Sleep and Wake
Millstein, Joshua; Winrow, Christopher J.; Kasarskis, Andrew; Owens, Joseph R.; Zhou, Lili; Summa, Keith C.; Fitzpatrick, Karrie; Zhang, Bin; Vitaterna, Martha H.; Schadt, Eric E.; Renger, John J.; Turek, Fred W.
2011-01-01
Study Objective: Sleep-wake traits are well-known to be under substantial genetic control, but the specific genes and gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Thus, the aim of this study was to use systems genetics and statistical approaches to uncover the genetic networks underlying 2 primary sleep traits in the mouse: 24-h duration of REM sleep and wake. Design: Genome-wide RNA expression data from 3 tissues (anterior cortex, hypothalamus, thalamus/midbrain) were used in conjunction with high-density genotyping to identify candidate causal genes and networks mediating the effects of 2 QTL regulating the 24-h duration of REM sleep and one regulating the 24-h duration of wake. Setting: Basic sleep research laboratory. Patients or Participants: Male [C57BL/6J × (BALB/cByJ × C57BL/6J*) F1] N2 mice (n = 283). Interventions: None. Measurements and Results: The genetic variation of a mouse N2 mapping cross was leveraged against sleep-state phenotypic variation as well as quantitative gene expression measurement in key brain regions using integrative genomics approaches to uncover multiple causal sleep-state regulatory genes, including several surprising novel candidates, which interact as components of networks that modulate REM sleep and wake. In particular, it was discovered that a core network module, consisting of 20 genes, involved in the regulation of REM sleep duration is conserved across the cortex, hypothalamus, and thalamus. A novel application of a formal causal inference test was also used to identify those genes directly regulating sleep via control of expression. Conclusion: Systems genetics approaches reveal novel candidate genes, complex networks and specific transcriptional regulators of REM sleep and wake duration in mammals. 
Citation: Millstein J; Winrow CJ; Kasarskis A; Owens JR; Zhou L; Summa KC; Fitzpatrick K; Zhang B; Vitaterna MH; Schadt EE; Renger JJ; Turek FW. Identification of causal genes, networks, and transcriptional regulators of REM sleep and wake. SLEEP 2011;34(11):1469-1477. PMID:22043117
Li, Xiaoshuang; Zhang, Daoyuan; Li, Haiyan; Gao, Bei; Yang, Honglan; Zhang, Yuanming; Wood, Andrew J.
2015-01-01
Syntrichia caninervis is the dominant bryophyte of the biological soil crusts found in the Gurbantunggut desert. The extreme desert environment is characterized by prolonged drought, temperature extremes, high radiation and frequent cycles of hydration and dehydration. S. caninervis is an ideal organism for the identification and characterization of genes related to abiotic stress tolerance. Reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) expression analysis is a powerful analytical technique that requires the use of stable reference genes. Using available S. caninervis transcriptome data, we selected 15 candidate reference genes and analyzed their relative expression stabilities in S. caninervis gametophores exposed to a range of abiotic stresses or a hydration-desiccation-rehydration cycle. The programs geNorm, NormFinder, and RefFinder were used to assess and rank the expression stability of the 15 candidate genes. The stability rankings of the reference genes under each specific experimental condition were highly consistent across the different algorithms. For abiotic stress treatments, the combination of two genes (α-TUB2 and CDPK) was sufficient for accurate normalization. For the hydration-desiccation-rehydration process, the combination of two genes (α-TUB1 and CDPK) was sufficient for accurate normalization. 18S was among the least stable genes in all of the experimental sets and was unsuitable as a reference gene in S. caninervis. This is the first systematic investigation and comparison of reference gene selection for RT-qPCR work in S. caninervis. This research will facilitate gene expression studies in S. caninervis, related moss species from the Syntrichia complex and other mosses. PMID:25699066
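The stability ranking described in this abstract can be illustrated with a minimal sketch of the geNorm stability measure M, usually defined as a gene's average pairwise variation (the standard deviation of log2 expression ratios across samples) against all other candidates. This is not the authors' pipeline: the function name genorm_m and the relative quantities below are hypothetical, and the actual geNorm program additionally removes the least stable gene step by step.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return a geNorm-style stability value M for each gene.

    `expression` maps a gene name to its relative quantities, one per sample
    (all genes measured in the same samples, in the same order).
    A lower M means more stable expression.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise = []
        for other in genes:
            if other == gene:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Pairwise variation: spread of that ratio across samples.
            pairwise.append(stdev(log_ratios))
        m_values[gene] = sum(pairwise) / len(pairwise)
    return m_values


# Hypothetical relative quantities for three candidate genes in three samples.
example = {
    "CDPK": [1.0, 2.0, 4.0],
    "TUB": [2.0, 4.0, 8.0],
    "18S": [1.0, 8.0, 2.0],
}
stability = genorm_m(example)
ranking = sorted(stability, key=stability.get)
```

In this made-up data CDPK and TUB keep a constant ratio to each other, so they share the lowest M, while the erratic 18S values rank last.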
The cld mutation: narrowing the critical chromosomal region and selecting candidate genes.
Péterfy, Miklós; Mao, Hui Z; Doolittle, Mark H
2006-10-01
Combined lipase deficiency (cld) is a recessive, lethal mutation specific to the tw73 haplotype on mouse Chromosome 17. While the cld mutation results in lipase proteins that are inactive, aggregated, and retained in the endoplasmic reticulum (ER), it maps separately from the lipase structural genes. We have narrowed the gene critical region by about 50% using the tw18 haplotype for deletion mapping and a recombinant chromosome used originally to map cld with respect to the phenotypic marker tf. The region now extends from 22 to 25.6 Mbp on the wild-type chromosome, currently containing 149 genes and 50 expressed sequence tags (ESTs). To identify the affected gene, we have selected candidates based on their known role in associated biological processes, cellular components, and molecular functions that best fit with the predicted function of the cld gene. A secondary approach was based on differences in mRNA levels between mutant (cld/cld) and unaffected (+/cld) cells. Using both approaches, we have identified seven functional candidates with an ER localization and/or an involvement in protein maturation and folding that could explain the lipase deficiency, and six expression candidates that exhibit large differences in mRNA levels between mutant and unaffected cells. Significantly, two genes were found to be candidates with regard to both function and expression, thus emerging as the strongest candidates for cld. We discuss the implications of our mapping results and our selection of candidates with respect to other genes, deletions, and mutations occurring in the cld critical region.
Brondani, Rosana PV; Williams, Emlyn R; Brondani, Claudio; Grattapaglia, Dario
2006-01-01
Background: Eucalypts are the most widely planted hardwood trees in the world, occupying more than 18 million hectares globally as an important source of carbon-neutral renewable energy and raw material for pulp, paper and solid wood. Quantitative Trait Loci (QTLs) in Eucalyptus have been localized on pedigree-specific RAPD or AFLP maps, seriously limiting the value of such QTL mapping efforts for molecular breeding. The availability of a genus-wide genetic map with transferable microsatellite markers has become a must for the effective advancement of genomic undertakings. This report describes the development of a novel set of 230 EMBRA microsatellites, the construction of the first comprehensive microsatellite-based consensus linkage map for Eucalyptus and the consolidation of existing linkage information for other microsatellites and candidate genes mapped in other species of the genus. Results: The consensus map covers ~90% of the recombining genome of Eucalyptus, involves 234 mapped EMBRA loci on 11 linkage groups, an observed length of 1,568 cM and a mean distance between markers of 8.4 cM. A compilation of all microsatellite linkage information published in Eucalyptus allowed us to establish the homology among linkage groups between this consensus map and other maps published for E. globulus. Comparative mapping analyses also resulted in the linkage group assignment of 41 other microsatellites derived from other Eucalyptus species as well as candidate genes and QTLs for wood and flowering traits published in the literature. This report significantly increases the availability of microsatellite markers and mapping information for species of Eucalyptus and corroborates the high conservation of microsatellite flanking sequences and locus ordering between species of the genus. Conclusion: This work represents an important step forward for Eucalyptus comparative genomics, opening stimulating perspectives for evolutionary studies and molecular breeding applications.
The generalized use of an increasingly large set of interspecific transferable markers and consensus mapping information will allow faster and more detailed investigations of QTL synteny among species, validation of expression-QTL across variable genetic backgrounds and positioning of a growing number of candidate genes co-localized with QTLs, to be tested in association mapping experiments. PMID:16995939
Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon
2014-11-01
The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
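The user-adjustable parameters listed in this abstract (minimum and maximum exon length, a per-gene bp cap, a total bp cap, partial sampling of very large exons, BED output) can be sketched as follows. This is an illustration only, not EXONSAMPLER's code: the Exon type, the function names, the default values, and the choice to keep the first max_len bases of an oversized exon are all assumptions.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    gene: str
    chrom: str
    start: int  # 0-based start, BED convention
    end: int    # exclusive end, BED convention


def sample_exons(exons, min_len=100, max_len=500,
                 max_bp_per_gene=1000, max_total_bp=5000):
    """Select exons in input order under length and base-pair budgets."""
    selected = []
    bp_per_gene = {}
    total_bp = 0
    for exon in exons:
        length = exon.end - exon.start
        if length < min_len:
            continue  # too short to be useful
        if length > max_len:
            # Partial sampling (assumed behaviour): keep the first max_len bases.
            exon = exon._replace(end=exon.start + max_len)
            length = max_len
        used = bp_per_gene.get(exon.gene, 0)
        if used + length > max_bp_per_gene or total_bp + length > max_total_bp:
            continue  # would exceed the per-gene or overall budget
        selected.append(exon)
        bp_per_gene[exon.gene] = used + length
        total_bp += length
    return selected


def to_bed(exons):
    """Format the selected exons as tab-separated BED lines."""
    return "\n".join(f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}" for e in exons)


# Hypothetical coordinates for two of the gene names mentioned in the abstract.
candidates = [
    Exon("IFNG", "chr5", 100, 150),
    Exon("IFNG", "chr5", 1000, 1300),
    Exon("TLR1", "chr6", 0, 2000),
]
picked = sample_exons(candidates)
bed_text = to_bed(picked)
```

Preferential sampling of 5 prime or 3 prime exons, as described in the abstract, could be layered on top by sorting each gene's exons before calling sample_exons.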
Torrezan, Giovana T; de Almeida, Fernanda G Dos Santos R; Figueiredo, Márcia C P; Barros, Bruna D de Figueiredo; de Paula, Cláudia A A; Valieris, Renan; de Souza, Jorge E S; Ramalho, Rodrigo F; da Silva, Felipe C C; Ferreira, Elisa N; de Nóbrega, Amanda F; Felicio, Paula S; Achatz, Maria I; de Souza, Sandro J; Palmero, Edenir I; Carraro, Dirce M
2018-01-01
Pathogenic variants in known breast cancer (BC) predisposing genes explain only about 30% of Hereditary Breast Cancer (HBC) cases, whereas the underlying genetic factors for most families remain unknown. Here, we used whole-exome sequencing (WES) to identify genetic variants associated with HBC in 17 patients from Brazil with familial BC who were negative for causal variants in major BC risk genes (BRCA1/2, TP53, and CHEK2 c.1100delC). First, we searched for rare variants in 27 known HBC genes and identified two patients harboring truncating pathogenic variants in ATM and BARD1. For the remaining 15 negative patients, we found a vast number of rare genetic variants. Thus, to select the most promising variants we used functional-based variant prioritization, followed by NGS validation, analysis in a control group, cosegregation analysis in one family and comparison with previous WES studies, shrinking our list to 23 novel BC candidate genes, which were evaluated in an independent cohort of 42 high-risk BC patients. Rare and possibly damaging variants were identified in 12 candidate genes in this cohort, including variants in DNA repair genes (ERCC1 and SXL4) and other cancer-related genes (NOTCH2, ERBB2, MST1R, and RAF1). Overall, this is the first WES study applied to identifying novel genes associated with HBC in Brazilian patients, in which we provide a set of putative BC predisposing genes. We also underpin the value of using WES for assessing the complex landscape of HBC susceptibility, especially in less characterized populations.
Richards, Thomas A; Soanes, Darren M; Foster, Peter G; Leonard, Guy; Thornton, Christopher R; Talbot, Nicholas J
2009-07-01
Horizontal gene transfer (HGT) describes the transmission of genetic material across species boundaries and is an important evolutionary phenomenon in the ancestry of many microbes. The role of HGT in plant evolutionary history is, however, largely unexplored. Here, we compare the genomes of six plant species with those of 159 prokaryotic and eukaryotic species and identify 1689 genes that show the highest similarity to corresponding genes from fungi. We constructed a phylogeny for all 1689 genes identified and all homolog groups available from the rice (Oryza sativa) genome (3177 gene families) and used these to define 14 candidate plant-fungi HGT events. Comprehensive phylogenetic analyses of these 14 data sets, using methods that account for site rate heterogeneity, demonstrated support for nine HGT events, demonstrating an infrequent pattern of HGT between plants and fungi. Five HGTs were fungi-to-plant transfers and four were plant-to-fungi HGTs. None of the fungal-to-plant HGTs involved angiosperm recipients. These results alter the current view of organismal barriers to HGT, suggesting that phagotrophy, the consumption of a whole cell by another, is not necessarily a prerequisite for HGT between eukaryotes. Putative functional annotation of the HGT candidate genes suggests that two fungi-to-plant transfers have added phenotypes important for life in a soil environment. Our study suggests that genetic exchange between plants and fungi is exceedingly rare, particularly among the angiosperms, but has occurred during their evolutionary history and added important metabolic traits to plant lineages.
TP53 mutations, expression and interaction networks in human cancers
Wang, Xiaosheng; Sun, Qingrong
2017-01-01
Although the associations of p53 dysfunction, p53 interaction networks and oncogenesis have been widely explored, a systematic analysis of TP53 mutations and its related interaction networks in various types of human cancers is lacking. Our study explored the associations of TP53 mutations, gene expression, clinical outcomes, and TP53 interaction networks across 33 cancer types using data from The Cancer Genome Atlas (TCGA). We show that TP53 is the most frequently mutated gene in a number of cancers, and its mutations appear to be early events in cancer initiation. We identified genes potentially repressed by p53, and genes whose expression correlates significantly with TP53 expression. These gene products may be especially important nodes in p53 interaction networks in human cancers. This study shows that while TP53-truncating mutations often result in decreased TP53 expression, other non-truncating TP53 mutations result in increased TP53 expression in some cancers. Survival analyses in a number of cancers show that patients with TP53 mutations are more likely to have worse prognoses than TP53-wildtype patients, and that elevated TP53 expression often leads to poor clinical outcomes. We identified a set of candidate synthetic lethal (SL) genes for TP53, and validated some of these SL interactions using data from the Cancer Cell Line Project. These predicted SL genes are promising candidates for experimental validation and the development of personalized therapeutics for patients with TP53-mutated cancers. PMID:27880943
Eckert, Andrew J; Bower, Andrew D; Wegrzyn, Jill L; Pande, Barnaly; Jermstad, Kathleen D; Krutovsky, Konstantin V; St Clair, J Bradley; Neale, David B
2009-08-01
Adaptation to cold is one of the greatest challenges to forest trees. This process is highly synchronized with environmental cues relating to photoperiod and temperature. Here, we use a candidate gene-based approach to search for genetic associations between 384 single-nucleotide polymorphism (SNP) markers from 117 candidate genes and 21 cold-hardiness related traits. A general linear model approach, including population structure estimates as covariates, was implemented for each marker-trait pair. We discovered 30 highly significant genetic associations [false discovery rate (FDR) Q < 0.10] across 12 candidate genes and 10 of the 21 traits. We also detected a set of 7 markers that had elevated levels of differentiation between sampling sites situated across the Cascade crest in northeastern Washington. Marker effects were small (r² < 0.05) and within the range of those published previously for forest trees. The derived SNP allele, as measured by a comparison to a recently diverged sister species, typically affected the phenotype in a way consistent with cold hardiness. The majority of markers were characterized as having largely nonadditive modes of gene action, especially underdominance in the case of cold-tolerance related phenotypes. We place these results in the context of trade-offs between the abilities to grow longer and to avoid fall cold damage, as well as putative epigenetic effects. These associations provide insight into the genetic components of complex traits in coastal Douglas fir, as well as highlight the need for landscape genetic approaches to the detection of adaptive genetic diversity.
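The FDR threshold quoted in this abstract (Q < 0.10) can be made concrete with a small sketch. The abstract does not say which FDR procedure the authors used; the Benjamini-Hochberg adjustment is swapped in here as a common choice, and the function name and P-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the tests, from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical P-values for four marker-trait tests.
pvals = [0.01, 0.04, 0.03, 0.5]
qvals = bh_qvalues(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.10]
```

With one test per marker-trait pair, the input would hold one P-value for each of the 384 × 21 combinations, and the associations reported would be those whose adjusted value falls below 0.10.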
Machine Learning Helps Identify CHRONO as a Circadian Clock Component
Venkataraman, Anand; Ramanathan, Chidambaram; Kavakli, Ibrahim H.; Hughes, Michael E.; Baggs, Julie E.; Growe, Jacqueline; Liu, Andrew C.; Kim, Junhyong; Hogenesch, John B.
2014-01-01
Over the last decades, researchers have characterized a set of “clock genes” that drive daily rhythms in physiology and behavior. This arduous work has yielded results with far-reaching consequences in metabolic, psychiatric, and neoplastic disorders. Recent attempts to expand our understanding of circadian regulation have moved beyond the mutagenesis screens that identified the first clock components, employing higher throughput genomic and proteomic techniques. In order to further accelerate clock gene discovery, we utilized a computer-assisted approach to identify and prioritize candidate clock components. We used a simple form of probabilistic machine learning to integrate biologically relevant, genome-scale data and ranked genes on their similarity to known clock components. We then used a secondary experimental screen to characterize the top candidates. We found that several physically interact with known clock components in a mammalian two-hybrid screen and modulate in vitro cellular rhythms in an immortalized mouse fibroblast line (NIH 3T3). One candidate, Gene Model 129, interacts with BMAL1 and functionally represses the key driver of molecular rhythms, the BMAL1/CLOCK transcriptional complex. Given these results, we have renamed the gene CHRONO (computationally highlighted repressor of the network oscillator). Bi-molecular fluorescence complementation and co-immunoprecipitation demonstrate that CHRONO represses by abrogating the binding of BMAL1 to its transcriptional co-activator CBP. Most importantly, CHRONO knockout mice display a prolonged free-running circadian period similar to, or more drastic than, six other clock components. We conclude that CHRONO is a functional clock component providing a new layer of control on circadian molecular dynamics. PMID:24737000
Towards an informative mutant phenotype for every bacterial gene
Deutschbauer, Adam; Price, Morgan N.; Wetmore, Kelly M.; ...
2014-08-11
Mutant phenotypes provide strong clues to the functions of the underlying genes and could allow annotation of the millions of sequenced yet uncharacterized bacterial genes. However, it is not known how many genes have a phenotype under laboratory conditions, how many phenotypes are biologically interpretable for predicting gene function, and what experimental conditions are optimal to maximize the number of genes with a phenotype. To address these issues, we measured the mutant fitness of 1,586 genes of the ethanol-producing bacterium Zymomonas mobilis ZM4 across 492 diverse experiments and found statistically significant phenotypes for 89% of all assayed genes. Thus, in Z. mobilis, most genes have a functional consequence under laboratory conditions. We demonstrate that 41% of Z. mobilis genes have both a strong phenotype and a similar fitness pattern (cofitness) to another gene, and are therefore good candidates for functional annotation using mutant fitness. Among 502 poorly characterized Z. mobilis genes, we identified a significant cofitness relationship for 174. For 57 of these genes without a specific functional annotation, we found additional evidence to support the biological significance of these gene-gene associations, and in 33 instances, we were able to predict specific physiological or biochemical roles for the poorly characterized genes. Last, we identified a set of 79 diverse mutant fitness experiments in Z. mobilis that are nearly as biologically informative as the entire set of 492 experiments. Therefore, our work provides a blueprint for the functional annotation of diverse bacteria using mutant fitness.
Validating internal controls for quantitative plant gene expression studies.
Brunner, Amy M; Yakovlev, Igor A; Strauss, Steven H
2004-08-18
Real-time reverse transcription PCR (RT-PCR) has greatly improved the ease and sensitivity of quantitative gene expression studies. However, accurate measurement of gene expression with this method relies on the choice of a valid reference for data normalization. Studies rarely verify that gene expression levels for reference genes are adequately consistent among the samples used, nor compare alternative genes to assess which are most reliable for the experimental conditions analyzed. Using real-time RT-PCR to study the expression of 10 poplar (genus Populus) housekeeping genes, we demonstrate a simple method for determining the degree of stability of gene expression over a set of experimental conditions. Based on a traditional method for analyzing the stability of varieties in plant breeding, it defines measures of gene expression stability from analysis of variance (ANOVA) and linear regression. We found that the potential internal control genes differed widely in their expression stability over the different tissues, developmental stages and environmental conditions studied. Our results support that quantitative comparisons of candidate reference genes are an important part of real-time RT-PCR studies that seek to precisely evaluate variation in gene expression. The method we demonstrated facilitates statistical and graphical evaluation of gene expression stability. Selection of the best reference gene for a given set of experimental conditions should enable detection of biologically significant changes in gene expression that are too small to be revealed by less precise methods, or when highly variable reference genes are unknowingly used in real-time RT-PCR experiments.
Eronen, Lauri; Toivonen, Hannu
2012-06-06
Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. 
In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable conditions, Biomine can also perform well when no such information is available. The Biomine system is a proof of concept. Its current version contains 1.1 million entities and 8.1 million relations between them, with focus on human genetics. Some of its functionalities are available in a public query interface at http://biomine.cs.helsinki.fi, allowing searching for and visualizing connections between given biological entities.
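A proximity measure on a weighted graph, as used for link prediction in this abstract, can be sketched as follows. The abstract does not specify the measures evaluated; this example assumes edge weights in (0, 1] that are read as independent probabilities and scores a node pair by the probability of its best connecting path. The function name, node labels and weights are hypothetical.

```python
import heapq
import math


def best_path_probability(edges, source, target):
    """Highest product of edge weights over any path from source to target.

    `edges` is an iterable of (node_a, node_b, weight) with 0 < weight <= 1;
    the graph is treated as undirected. Returns 0.0 if no path exists.
    """
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, []).append((b, weight))
        graph.setdefault(b, []).append((a, weight))
    # Dijkstra on costs -log(weight): the cheapest path has the largest product.
    best_cost = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return math.exp(-cost)
        if cost > best_cost.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            new_cost = cost - math.log(weight)
            if new_cost < best_cost.get(neighbour, math.inf):
                best_cost[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return 0.0


# Hypothetical integrated graph mixing edge types.
toy_edges = [
    ("geneA", "protein1", 0.9),     # e.g. a "codes for" edge
    ("protein1", "diseaseX", 0.8),  # e.g. a protein-disease association
    ("geneA", "diseaseX", 0.5),     # e.g. a weak direct association
    ("geneB", "protein2", 0.9),
]
score = best_path_probability(toy_edges, "geneA", "diseaseX")
```

Candidate links can then be ranked by this score; re-weighting edges by type before scoring corresponds to the parameter optimization step the abstract mentions.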
Identification of Small RNAs in Desulfovibrio vulgaris Hildenborough
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burns, Andrew; Joachimiak, Marcin; Deutschbauer, Adam
2010-05-17
Desulfovibrio vulgaris is an anaerobic sulfate-reducing bacterium capable of facilitating the removal of toxic metals such as uranium from contaminated sites via reduction. As such, it is essential to understand the intricate regulatory cascades involved in how D. vulgaris and its relatives respond to stressors in such sites. One approach is the identification and analysis of small non-coding RNAs (sRNAs): molecules ranging in size from 20-200 nucleotides that predominantly affect gene regulation by binding to complementary mRNA in an anti-sense fashion and therefore provide an immediate regulatory response. To identify sRNAs in D. vulgaris, a bacterium that does not possess an annotated hfq gene, RNA was pooled from stationary and exponential phases, nitrate exposure, and biofilm conditions. The resulting RNA was size fractionated, modified, and converted to cDNA for high throughput transcriptomic deep sequencing. A computational approach to identify sRNAs via the alignment of seven separate Desulfovibrio genomes was also performed. From the deep sequencing analysis, 2,296 reads between 20 and 250 nt were identified with expression above genome background. Analysis of those reads limited the number of candidates to ~87 intergenic, while ~140 appeared to be antisense to annotated open reading frames (ORFs). Further BLAST analysis of the intergenic candidates and other Desulfovibrio genomes indicated that eight candidates were likely portions of ORFs not previously annotated in the D. vulgaris genome. Comparison of the intergenic and antisense data sets to the bioinformatically predicted candidates resulted in ~54 common candidates. Northern analysis and qRT-PCR are currently being used to verify expression of the candidates and to further develop the role these sRNAs play in D. vulgaris regulation.
LGscore: A method to identify disease-related genes using biological literature and Google data.
Kim, Jeongwoo; Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun
2015-04-01
Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods. Copyright © 2015 Elsevier Inc. All rights reserved.
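The co-occurrence and Z-scoring steps described in this abstract can be sketched roughly as below. This covers only the literature frequency component and is not the LGscore implementation: the function name, the use of the population standard deviation, and the toy gene sets are assumptions, and the Google search component is left out.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_zscores(abstract_gene_sets):
    """Z-score the number of abstracts in which each gene pair co-occurs.

    `abstract_gene_sets` holds one collection of gene symbols per abstract.
    Returns a dict mapping an alphabetically sorted gene pair to its Z-score.
    """
    counts = Counter()
    for genes in abstract_gene_sets:
        # Each unordered pair is counted once per abstract.
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] += 1
    if not counts:
        return {}
    mu = mean(counts.values())
    sigma = pstdev(counts.values())
    if sigma == 0:
        return {pair: 0.0 for pair in counts}  # all pairs equally frequent
    return {pair: (count - mu) / sigma for pair, count in counts.items()}


# Hypothetical gene mentions extracted from four abstracts.
abstracts = [
    {"APP", "PSEN1"},
    {"APP", "PSEN1"},
    {"APP", "PSEN1", "APOE"},
    {"APOE", "APP"},
]
edge_weights = cooccurrence_zscores(abstracts)
```

The resulting values could serve as edge weights in a gene network, with per-gene scores then derived from the weights of each gene's links.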
2011-01-01
Background Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches - the examination of similarities to known disease genes and/or the evaluation of functional annotation of genes. Each of these approaches has its own caveats. Here we employ a previously described method of candidate gene prioritization based mainly on gene annotation, together with a technique based on the evaluation of pertinent sequence motifs or signatures, in an attempt to refine the gene prioritization approach. We apply this approach to X-linked mental retardation (XLMR), a group of heterogeneous disorders for which some of the underlying genetics is known. Results The gene annotation-based binary filtering method yielded a ranked list of putative XLMR candidate genes with good plausibility of being associated with the development of mental retardation. In parallel, a motif finding approach based on linear discriminatory analysis (LDA) was employed to identify short sequence patterns that may discriminate XLMR from non-XLMR genes. High rates (>80%) of correct classification were achieved, suggesting that the identification of these motifs effectively captures genomic signals associated with XLMR vs. non-XLMR genes. The computational tools developed for the motif-based LDA are integrated into the freely available genomic analysis portal Galaxy (http://main.g2.bx.psu.edu/). Nine genes (APLN, ZC4H2, MAGED4, MAGED4B, RAP2C, FAM156A, FAM156B, TBL1X, and UXT) were highlighted as highly ranked XLMR candidates. Conclusions The combination of gene annotation information and sequence motif-orientated computational candidate gene prediction methods highlights an added benefit in generating a list of plausible candidate genes, as has been demonstrated for XLMR.
Reviewers: This article was reviewed by Dr Barbara Bardoni (nominated by Prof Juergen Brosius); Prof Neil Smalheiser and Dr Dustin Holloway (nominated by Prof Charles DeLisi). PMID:21668950
Identifying Candidate Reprogramming Genes in Mouse Induced Pluripotent Stem Cells.
Gao, Fang; Li, Jingyu; Zhang, Heng; Yang, Xu; An, Tiezhu
2017-08-01
Factor-based induced reprogramming approaches have tremendous potential for human regenerative medicine, but the efficiencies of these approaches are still low. In this study, we analyzed the global transcriptional profiles of mouse induced pluripotent stem cells (miPSCs) and mouse embryonic stem cells (mESCs) from seven different labs and present here the first successful clustering according to cell type, not by lab of origin. We identified 2131 differentially expressed genes (DEs) as candidate pluripotency-associated genes by comparing mESCs/miPSCs with somatic cells and 720 DEs between miPSCs and mESCs. Interestingly, there was a significant overlap between the two DE sets. Therefore, we defined the overlapping DEs as "consensus DEs", including 313 miPSC-specific genes expressed at a higher level in miPSCs versus mESCs and 184 mESC-specific genes in total, and reasoned that these may contribute to the differences in pluripotency between mESCs and miPSCs. A classification of "consensus DEs" according to their different expression levels between somatic cells and mESCs/miPSCs shows that 86% of the miPSC-specific genes are more highly expressed in somatic cells, while 73% of mESC-specific genes are highly expressed in mESCs/miPSCs, indicating that the miPSCs have not efficiently silenced the expression pattern of the somatic cells from which they are derived and have failed to completely induce the genes with high expression levels in mESCs. We further revealed a strong correlation between oocyte-enriched factors and insufficiently induced mESC-specific genes and identified 11 hub genes via network analysis. In light of these findings, we postulated that these key hub genes might not only drive somatic cell nuclear transfer (SCNT) reprogramming but also augment the efficiency and quality of miPSC reprogramming.
Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David
2012-05-01
Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
Xu, Hai-Ming; Kong, Xiang-Dong; Chen, Fei; Huang, Ji-Xiang; Lou, Xiang-Yang; Zhao, Jian-Yi
2015-10-24
Brassica napus is an important oilseed crop. Dissection of the genetic architecture underlying oil-related biological processes will greatly facilitate the genetic improvement of rapeseed. The differential gene expression during pod development offers a snapshot of the genes responsible for oil accumulation. To identify candidate genes in the linkage peaks reported previously, we used RNA sequencing (RNA-Seq) technology to analyze the pod transcriptomes of German cultivar Sollux and Chinese inbred line Gaoyou. The RNA samples were collected for RNA-Seq at 5-7, 15-17 and 25-27 days after flowering (DAF). Bioinformatics analysis was performed to investigate differentially expressed genes (DEGs). Gene annotation analysis was integrated with QTL mapping and Brassica napus pod transcriptome profiling to detect potential candidate genes in oilseed. In total, 465 and 2,114 candidate DEGs were identified, respectively, between the two varieties at the same stages and across different periods within each variety. Then, 33 DEGs between Sollux and Gaoyou were identified as the candidate genes affecting seed oil content by combining those DEGs with the quantitative trait locus (QTL) mapping results, of which one was found to be homologous to Arabidopsis thaliana lipid-related genes. Intervarietal DEGs of lipid pathways in QTL regions represent important candidate genes for oil-related traits. Integrated analysis of transcriptome profiling, QTL mapping and comparative genomics with other related species leads to efficient identification of the most plausible functional genes underlying oil-content-related traits, offering valuable resources for improving Brassica napus breeding programs. This study provided a comprehensive overview of the pod transcriptomes of two varieties with different oil contents at the three developmental stages.
Prediction of gene-phenotype associations in humans, mice, and plants using phenologs.
Woods, John O; Singh-Blom, Ulf Martin; Laurent, Jon M; McGary, Kriston L; Marcotte, Edward M
2013-06-21
Phenotypes and diseases may be related to seemingly dissimilar phenotypes in other species by means of the orthology of underlying genes. Such "orthologous phenotypes," or "phenologs," are examples of deep homology, and may be used to predict additional candidate disease genes. In this work, we develop an unsupervised algorithm for ranking phenolog-based candidate disease genes through the integration of predictions from the k nearest neighbor phenologs, comparing classifiers and weighting functions by cross-validation. We also improve upon the original method by extending the theory to paralogous phenotypes. Our algorithm makes use of additional phenotype data--from chicken, zebrafish, and E. coli, as well as new datasets for C. elegans--establishing that several types of annotations may be treated as phenotypes. We demonstrate the use of our algorithm to predict novel candidate genes for human atrial fibrillation (such as HRH2, ATP4A, ATP4B, and HOPX) and epilepsy (e.g., PAX6 and NKX2-1). We suggest gene candidates for pharmacologically-induced seizures in mouse, solely based on orthologous phenotypes from E. coli. We also explore the prediction of plant gene-phenotype associations, as for the Arabidopsis response to vernalization phenotype. We are able to rank gene predictions for a significant portion of the diseases in the Online Mendelian Inheritance in Man database. Additionally, our method suggests candidate genes for mammalian seizures based only on bacterial phenotypes and gene orthology. We demonstrate that phenotype information may come from diverse sources, including drug sensitivities, gene ontology biological processes, and in situ hybridization annotations. Finally, we offer testable candidates for a variety of human diseases, plant traits, and other classes of phenotypes across a wide array of species.
Metagenomic discovery of biomass-degrading genes and genomes from cow rumen.
Hess, Matthias; Sczyrba, Alexander; Egan, Rob; Kim, Tae-Wan; Chokhawala, Harshal; Schroth, Gary; Luo, Shujun; Clark, Douglas S; Chen, Feng; Zhang, Tao; Mackie, Roderick I; Pennacchio, Len A; Tringe, Susannah G; Visel, Axel; Woyke, Tanja; Wang, Zhong; Rubin, Edward M
2011-01-28
The paucity of enzymes that efficiently deconstruct plant polysaccharides represents a major bottleneck for industrial-scale conversion of cellulosic biomass into biofuels. Cow rumen microbes specialize in degradation of cellulosic plant material, but most members of this complex community resist cultivation. To characterize biomass-degrading genes and genomes, we sequenced and analyzed 268 gigabases of metagenomic DNA from microbes adherent to plant fiber incubated in cow rumen. From these data, we identified 27,755 putative carbohydrate-active genes and expressed 90 candidate proteins, of which 57% were enzymatically active against cellulosic substrates. We also assembled 15 uncultured microbial genomes, which were validated by complementary methods including single-cell genome sequencing. These data sets provide a substantially expanded catalog of genes and genomes participating in the deconstruction of cellulosic biomass.
Graeber, Kai; Linkies, Ada; Wood, Andrew T.A.; Leubner-Metzger, Gerhard
2011-01-01
Comparative biology includes the comparison of transcriptome and quantitative real-time RT-PCR (qRT-PCR) data sets in a range of species to detect evolutionarily conserved and divergent processes. Transcript abundance analysis of target genes by qRT-PCR requires a highly accurate and robust workflow. This includes reference genes with high expression stability (i.e., low intersample transcript abundance variation) for correct target gene normalization. Cross-species qRT-PCR for proper comparative transcript quantification requires reference genes suitable for different species. We addressed this issue using tissue-specific transcriptome data sets of germinating Lepidium sativum seeds to identify new candidate reference genes. We investigated their expression stability in germinating seeds of L. sativum and Arabidopsis thaliana by qRT-PCR, combined with in silico analysis of Arabidopsis and Brassica napus microarray data sets. This revealed that reference gene expression stability is higher for a given developmental process between distinct species than for distinct developmental processes within a given single species. The identified superior cross-species reference genes may be used for family-wide comparative qRT-PCR analysis of Brassicaceae seed germination. Furthermore, using germinating seeds, we exemplify optimization of the qRT-PCR workflow for challenging tissues regarding RNA quality, transcript stability, and tissue abundance. Our work therefore can serve as a guideline for moving beyond Arabidopsis by establishing high-quality cross-species qRT-PCR. PMID:21666000
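The notion of expression stability as "low intersample transcript abundance variation" can be illustrated by ranking candidate reference genes by their coefficient of variation. This is only a sketch under stated assumptions: the abstract does not say which stability measure was used, the function names are hypothetical, and the input is assumed to be relative transcript abundance on a linear scale (not raw quantification cycle values).

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (lower = more stable)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean abundance must be non-zero")
    return stdev(values) / m


def rank_reference_candidates(abundance):
    """Sort gene names from most to least stable across samples."""
    return sorted(abundance, key=lambda gene: coefficient_of_variation(abundance[gene]))


if __name__ == "__main__":
    # Toy relative abundances across four samples; gene names are placeholders.
    abundance = {
        "geneA": [100.0, 102.0, 98.0, 101.0],
        "geneB": [50.0, 150.0, 80.0, 20.0],
    }
    print(rank_reference_candidates(abundance))
```

In practice the same ranking would be repeated per species and per developmental process, which is how a comparison such as the one reported in the abstract could be set up.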
Kim, Jae Yoon; Moon, Jun-Cheol; Kim, Hyo Chul; Shin, Seungho; Song, Kitae; Kim, Kyung-Hee; Lee, Byung-Moo
2017-01-01
Premise of the study: Positional cloning in combination with phenotyping is a general approach to identify disease-resistance gene candidates in plants; however, it requires several time-consuming steps including population or fine mapping. Therefore, in the present study, we suggest a new combined strategy to improve the identification of disease-resistance gene candidates. Methods and Results: Downy mildew (DM)–resistant maize was selected from five cultivars using a spreader row technique. Positional cloning and bioinformatics tools were used to identify the DM-resistance quantitative trait locus marker (bnlg1702) and 47 protein-coding gene annotations. Eventually, five DM-resistance gene candidates, including bZIP34, Bak1, and Ppr, were identified by quantitative reverse-transcription PCR (RT-PCR) without fine mapping of the bnlg1702 locus. Conclusions: The combined protocol with the spreader row technique, quantitative trait locus positional cloning, and quantitative RT-PCR was effective for identifying DM-resistance candidate genes. This cloning approach may be applied to other whole-genome-sequenced crops or resistance to other diseases. PMID:28224059
Integrative strategies to identify candidate genes in rodent models of human alcoholism.
Treadwell, Julie A
2006-01-01
The search for genes underlying alcohol-related behaviours in rodent models of human alcoholism has been ongoing for many years with only limited success. Recently, new strategies that integrate several of the traditional approaches have provided new insights into the molecular mechanisms underlying ethanol's actions in the brain. We have used alcohol-preferring C57BL/6J (B6) and alcohol-avoiding DBA/2J (D2) genetic strains of mice in an integrative strategy combining high-throughput gene expression screening, genetic segregation analysis, and mapping to previously published quantitative trait loci to uncover candidate genes for the ethanol-preference phenotype. In our study, 2 genes, retinaldehyde binding protein 1 (Rlbp1) and syntaxin 12 (Stx12), were found to be strong candidates for ethanol preference. Such experimental approaches have the power and the potential to greatly speed up the laborious process of identifying candidate genes for the animal models of human alcoholism.
Demographically-Based Evaluation of Genomic Regions under Selection in Domestic Dogs
Freedman, Adam H.; Schweizer, Rena M.; Ortega-Del Vecchyo, Diego; Han, Eunjung; Davis, Brian W.; Gronau, Ilan; Silva, Pedro M.; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R.; Parker, Heidi G.; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D.; Harkins, Timothy T.; Nelson, Stanley F.; Marques-Bonet, Tomas; Ostrander, Elaine A.; Wayne, Robert K.; Novembre, John
2016-01-01
Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in the domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers. PMID:26943675
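Computing an FDR from a demographic model can be pictured as comparing how often neutral simulations exceed a score threshold with how often the observed genome does. The sketch below is a generic simulation-based FDR estimate, not the authors' pipeline: `empirical_fdr` is a hypothetical name, the per-region scores are assumed to be "larger means more extreme", and the simulated scores are assumed to come from the fitted neutral demographic model.

```python
def empirical_fdr(observed, simulated_null, threshold):
    """Estimate FDR at `threshold` from scores simulated under neutrality.

    The expected number of false positives is the neutral exceedance rate
    scaled to the number of observed regions; dividing by the number of
    observed outliers gives the estimated false discovery rate.
    """
    if not simulated_null:
        raise ValueError("need at least one simulated score")
    observed_hits = sum(score >= threshold for score in observed)
    if observed_hits == 0:
        return 0.0  # nothing called, so nothing falsely discovered
    null_rate = sum(score >= threshold for score in simulated_null) / len(simulated_null)
    expected_false = null_rate * len(observed)
    return min(1.0, expected_false / observed_hits)


if __name__ == "__main__":
    # Toy scores only; real scans would use one statistic per genomic window.
    observed = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0, 0.5, 1.5, 2.5, 7.0]
    simulated_null = [0.0] * 95 + [7.5] * 5
    print(empirical_fdr(observed, simulated_null, threshold=7.0))
```

Because the null distribution here reflects demography, a bottleneck that inflates extreme scores genome-wide raises the neutral exceedance rate and therefore the estimated FDR, which is the motivation stated in the abstract.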
Changes in Gene Expression with Sleep
Thimgan, Matthew S.; Duntley, Stephen P.; Shaw, Paul J.
2011-01-01
There is general agreement within the sleep community and among public health officials of the need for an accessible biomarker of sleepiness. As the foregoing discussions emphasize, however, it may be more difficult to reach consensus on how to define such a biomarker than to identify candidate molecules that can be then evaluated to determine if they might be useful to solve a variety of real-world problems related to insufficient sleep. With that in mind, a goal of our laboratories has been to develop a rational strategy to expedite the identification of candidate biomarkers. We began with the assumption that since both the genetic and environmental context of a gene can influence its behavior, an effective test of sleep loss will likely be composed of a panel of multiple biomarkers. That is, we believe that it is premature to exclude a candidate analyte simply because it might also be modulated in response to other conditions (e.g., illness, metabolism, sympathetic tone, etc.). Our next assumption was that an easily accessible biomarker would be more useful in real-world settings. Thus, we have focused on saliva, as opposed to urine or blood, as a rich source of biological analytes that can be mined to optimize the chances of bringing a biomarker out into the field. Finally, we recognize that conducting validation studies in humans can be expensive and time consuming. Thus, we have exploited genetic and pharmacological tools in the model organism Drosophila melanogaster to more fully characterize the behavior of the most exciting candidate biomarkers. Citation: Thimgan MS; Duntley SP; Shaw PJ. Changes in gene expression with sleep. J Clin Sleep Med 2011;7(5):Supplement S26-S27. PMID:22003326
Rutter, William B; Salcedo, Andres; Akhunova, Alina; He, Fei; Wang, Shichen; Liang, Hanquan; Bowden, Robert L; Akhunov, Eduard
2017-04-12
Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comparative genomic analyses revealed a substantial level of inter-isolate divergence in effector gene complement and sequence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways, with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways. This functional redundancy may play an important role in the dynamics of the "arms race" between host and pathogen, serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen's ability to infect the host.
Rieseberg, Loren H.; Blackman, Benjamin K.
2010-01-01
Background Analyses of speciation genes – genes that contribute to the cessation of gene flow between populations – can offer clues regarding the ecological settings, evolutionary forces and molecular mechanisms that drive the divergence of populations and species. This review discusses the identities and attributes of genes that contribute to reproductive isolation (RI) in plants, compares them with animal speciation genes and investigates what these genes can tell us about speciation. Scope Forty-one candidate speciation genes were identified in the plant literature. Of these, seven contributed to pre-pollination RI, one to post-pollination, prezygotic RI, eight to hybrid inviability, and 25 to hybrid sterility. Genes, gene families and genetic pathways that were frequently found to underlie the evolution of RI in different plant groups include the anthocyanin pathway and its regulators (pollinator isolation), S RNase-SI genes (unilateral incompatibility), disease resistance genes (hybrid necrosis), chimeric mitochondrial genes (cytoplasmic male sterility), and pentatricopeptide repeat family genes (cytoplasmic male sterility). Conclusions The most surprising conclusion from this review is that identities of genes underlying both prezygotic and postzygotic RI are often predictable in a broad sense from the phenotype of the reproductive barrier. Regulatory changes (both cis and trans) dominate the evolution of pre-pollination RI in plants, whereas a mix of regulatory mutations and changes in protein-coding genes underlie intrinsic postzygotic barriers. Also, loss-of-function mutations and copy number variation frequently contribute to RI. Although direct evidence of positive selection on speciation genes is surprisingly scarce in plants, analyses of gene family evolution, along with theoretical considerations, imply an important role for diversifying selection and genetic conflict in the evolution of RI. Unlike in animals, however, most candidate speciation genes in plants exhibit intraspecific polymorphism, consistent with an important role for stochastic forces and/or balancing selection in development of RI in plants. PMID:20576737
LOD score exclusion analyses for candidate genes using random population samples.
Deng, H W; Li, J; Recker, R R
2001-05-01
While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes with random population samples. We develop a LOD score approach for exclusion analyses of candidate genes with random population samples. Under this approach, specific genetic effects and inheritance models at candidate genes can be analysed and if a LOD score is ≤ −2.0, the locus can be excluded from having an effect larger than that specified. Computer simulations show that, with sample sizes often employed in association studies, this approach has high power to exclude a gene from having moderate genetic effects. In contrast to regular association analyses, population admixture will not affect the robustness of our analyses; in fact, it renders our analyses more conservative and thus any significant exclusion result is robust. Our exclusion analysis complements association analysis for candidate genes in random population samples and is parallel to the exclusion mapping analyses that may be conducted in linkage analyses with pedigrees or relative pairs. The usefulness of the approach is demonstrated by an application to test the importance of vitamin D receptor and estrogen receptor genes underlying the differential risk to osteoporotic fractures.
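The exclusion rule (exclude when the LOD score is at or below −2.0) can be illustrated with a log10 likelihood ratio under a deliberately simple model. Everything about the model here is an assumption made for illustration and is not taken from the paper: a quantitative trait, an additive per-allele effect, normally distributed residuals with the variance estimated by maximum likelihood under each hypothesis, and a comparison of the specified effect against no effect. The function names are hypothetical.

```python
from math import log10


def exclusion_lod(trait, genotype, effect):
    """log10 likelihood ratio: specified additive effect versus no effect.

    With normal residuals and the variance estimated by maximum likelihood
    under each hypothesis, the ratio reduces to
    -(n / 2) * log10(RSS_effect / RSS_null).
    """
    n = len(trait)
    if n != len(genotype) or n < 2:
        raise ValueError("trait and genotype must have the same length (>= 2)")
    ybar = sum(trait) / n
    gbar = sum(genotype) / n
    rss_null = sum((y - ybar) ** 2 for y in trait)
    rss_effect = sum(
        ((y - ybar) - effect * (g - gbar)) ** 2 for y, g in zip(trait, genotype)
    )
    if rss_null == 0 or rss_effect == 0:
        raise ValueError("degenerate data: zero residual variance")
    return -(n / 2) * log10(rss_effect / rss_null)


def is_excluded(trait, genotype, effect, cutoff=-2.0):
    """Apply the LOD <= -2.0 exclusion criterion quoted in the abstract."""
    return exclusion_lod(trait, genotype, effect) <= cutoff


if __name__ == "__main__":
    # Toy data: allele counts 0/1/2 and a trait unrelated to genotype.
    genotype = [0, 1, 2] * 20
    trait = [1.0 if (i // 3) % 2 == 0 else -1.0 for i in range(60)]
    print(exclusion_lod(trait, genotype, effect=1.0))
    print(is_excluded(trait, genotype, effect=1.0))
```

In this toy case the data are much better explained by no effect than by an effect of one trait unit per allele, so an effect of that size would be excluded under the stated assumptions.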
Clinical Neuropathology practice news 2-2014: ATRX, a new candidate biomarker in gliomas.
Haberler, Christine; Wöhrer, Adelheid
2014-01-01
Genome-wide molecular approaches have substantially elucidated molecular alterations and pathways involved in the oncogenesis of brain tumors. In gliomas, several molecular biomarkers including IDH mutation, 1p/19q co-deletion, and MGMT promoter methylation status have been introduced into neuropathological practice. Recently, mutations of the ATRX gene have been found in various subtypes and grades of gliomas and were shown to refine the prognosis of malignant gliomas in combination with IDH and 1p/19q status. Mutations of ATRX are associated with loss of nuclear ATRX protein expression, detectable by a commercially available antibody, thus turning ATRX into a promising prognostic candidate biomarker in the routine neuropathological setting.
Almeida, Nuno F.; Krezdorn, Nicolas; Rotter, Björn; Winter, Peter; Rubiales, Diego; Vaz Patto, Maria C.
2015-01-01
Lathyrus sativus (grass pea) is a temperate grain legume crop with a great potential for expansion in dry areas or zones that are becoming more drought-prone. It is also recognized as a potential source of resistance to several important diseases in legumes, such as ascochyta blight. Nevertheless, the lack of detailed genomic and/or transcriptomic information hampers further exploitation of grass pea resistance-related genes in precision breeding. To elucidate the pathways differentially regulated during ascochyta-grass pea interaction and to identify resistance candidate genes, we compared the early response of the leaf gene expression profile of a resistant L. sativus genotype to Ascochyta lathyri infection with a non-inoculated control sample from the same genotype employing deepSuperSAGE. This analysis generated 14,387 UniTags, of which 95.7% mapped to a reference grass pea/rust interaction transcriptome. From the total mapped UniTags, 738 were significantly differentially expressed between control and inoculated leaves. The results indicate that several gene classes acting in different phases of the plant/pathogen interaction are involved in the L. sativus response to A. lathyri infection. Most notably, a clear up-regulation of defense-related genes involved in and/or regulated by the ethylene pathway was observed. There was also evidence of alterations in cell wall metabolism indicated by overexpression of cellulose synthase and lignin biosynthesis genes. This first genome-wide overview of the gene expression profile of the L. sativus response to ascochyta infection delivered a valuable set of candidate resistance genes for future use in precision breeding. PMID:25852725
Regulation of neural macroRNAs by the transcriptional repressor REST
Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J.; Stanton, Lawrence W.; Lipovich, Leonard
2009-01-01
The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs (“macroRNAs”), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer. PMID:19050060
Tamplin, Owen J; Cox, Brian J; Rossant, Janet
2011-12-15
The node and notochord are key tissues required for patterning of the vertebrate body plan. Understanding the gene regulatory network that drives their formation and function is therefore important. Foxa2 is a key transcription factor at the top of this genetic hierarchy and finding its targets will help us to better understand node and notochord development. We performed an extensive microarray-based gene expression screen using sorted embryonic notochord cells to identify early notochord-enriched genes. We validated their specificity to the node and notochord by whole mount in situ hybridization. This provides the largest available resource of notochord-expressed genes, and therefore candidate Foxa2 target genes in the notochord. Using existing Foxa2 ChIP-seq data from adult liver, we were able to identify a set of genes expressed in the notochord that had associated regions of Foxa2-bound chromatin. Given that Foxa2 is a pioneer transcription factor, we reasoned that these sites might represent notochord-specific enhancers. Candidate Foxa2-bound regions were tested for notochord specific enhancer function in a zebrafish reporter assay and 7 novel notochord enhancers were identified. Importantly, sequence conservation or predictive models could not have readily identified these regions. Mutation of putative Foxa2 binding elements in two of these novel enhancers abrogated reporter expression and confirmed their Foxa2 dependence. The combination of highly specific gene expression profiling and genome-wide ChIP analysis is a powerful means of understanding developmental pathways, even for small cell populations such as the notochord.
Informed walks: whispering hints to gene hunters inside networks' jungle.
Bourdakou, Marilena M; Spyrou, George M
2017-10-11
Systemic approaches offer a different point of view on the analysis of several types of molecular associations as well as on the identification of specific gene communities in several cancer types. However, due to lack of sufficient data needed to construct networks based on experimental evidence, statistical gene co-expression networks are widely used instead. Many efforts have been made to exploit the information hidden in these networks. However, these approaches still need to capitalize comprehensively on the prior knowledge encoded in molecular pathway associations and to improve their efficiency in discovering both exclusive subnetworks as candidate biomarkers and conserved subnetworks that may uncover common origins of several cancer types. In this study we present the development of the Informed Walks model, based on random walks that incorporate information from molecular pathways to mine candidate genes and gene-gene links. The proposed model has been applied to TCGA (The Cancer Genome Atlas) datasets from seven different cancer types, exploring the reconstructed co-expression networks of the whole set of genes and leading to highlighted subnetworks for each cancer type. Subsequently, we elucidated the impact of each subnetwork on the indication of underlying exclusive and common molecular mechanisms as well as on the short-listing of drugs that have the potential to suppress the corresponding cancer type through a drug-repurposing pipeline. We have developed a method of gene subnetwork highlighting based on prior knowledge, capable of giving fruitful insights regarding the underlying molecular mechanisms and valuable input to drug-repurposing pipelines for a variety of cancer types.
Phenoscape: Identifying Candidate Genes for Evolutionary Phenotypes
Edmunds, Richard C.; Su, Baofeng; Balhoff, James P.; Eames, B. Frank; Dahdul, Wasila M.; Lapp, Hilmar; Lundberg, John G.; Vision, Todd J.; Dunham, Rex A.; Mabee, Paula M.; Westerfield, Monte
2016-01-01
Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology. PMID:26500251
Ensemble positive unlabeled learning for disease gene identification.
Yang, Peng; Li, Xiaoli; Chua, Hon-Nian; Kwoh, Chee-Keong; Ng, See-Kiong
2014-01-01
An increasing number of genes have been experimentally confirmed in recent years as causative genes to various human diseases. The newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in the current scenario. Using only a single source of data for prediction can be susceptible to bias due to incompleteness and noise in the genomic data and a single machine learning predictor prone to bias caused by inherent limitations of individual methods. In this paper, we propose an effective PU learning framework that integrates multiple biological data sources and an ensemble of powerful machine learning classifiers for disease gene identification. Our proposed method integrates data from multiple biological sources for training PU learning classifiers. A novel ensemble-based PU learning method EPU is then used to integrate multiple PU learning classifiers to achieve accurate and robust disease gene predictions. Our evaluation experiments across six disease groups showed that EPU achieved significantly better results compared with various state-of-the-art prediction methods as well as ensemble learning classifiers. Through integrating multiple biological data sources for training and the outputs of an ensemble of PU learning classifiers for prediction, we are able to minimize the potential bias and errors in individual data sources and machine learning algorithms to achieve more accurate and robust disease gene predictions. In the future, our EPU method provides an effective framework to integrate the additional biological and computational resources for better disease gene predictions.
Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice
Leduc, Magalie S.; Hageman, Rachael S.; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly
2011-01-01
To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits. PMID:21622629
Wang, Gongwei; Schmalenbach, Inga; von Korff, Maria; Léon, Jens; Kilian, Benjamin; Rode, Jeannette
2010-01-01
The control of flowering time has important impacts on crop yield. The variation in response to day length (photoperiod) and low temperature (vernalization) has been selected in barley to provide adaptation to different environments and farming practices. As a further step towards unraveling the genetic mechanisms underlying flowering time control in barley, we investigated the allelic variation of ten known or putative photoperiod and vernalization pathway genes between two genotypes, the spring barley elite cultivar ‘Scarlett’ (Hordeum vulgare ssp. vulgare) and the wild barley accession ‘ISR42-8’ (Hordeum vulgare ssp. spontaneum). The genes studied are Ppd-H1, VRN-H1, VRN-H2, VRN-H3, HvCO1, HvCO2, HvGI, HvFT2, HvFT3 and HvFT4. ‘Scarlett’ and ‘ISR42-8’ are the parents of the BC2DH advanced backcross population S42 and a set of wild barley introgression lines (S42ILs). The latter are derived from S42 after backcrossing and marker-assisted selection. The genotypes and phenotypes in S42 and S42ILs were utilized to determine the genetic map location of the candidate genes and to test if these genes may exert quantitative trait locus (QTL) effects on flowering time, yield and yield-related traits in the two populations studied. By sequencing the characteristic regions of the genes and genotyping with diagnostic markers, the contrasting allelic constitutions of four known flowering regulation genes were identified as ppd-H1, Vrn-H1, vrn-H2 and vrn-H3 in ‘Scarlett’ and as Ppd-H1, vrn-H1, Vrn-H2 and a novel allele of VRN-H3 in ‘ISR42-8’. All candidate genes could be placed on a barley simple sequence repeat (SSR) map. Seven candidate genes (Ppd-H1, VRN-H2, VRN-H3, HvGI, HvFT2, HvFT3 and HvFT4) were associated with flowering time QTLs in population S42. Four exotic alleles (Ppd-H1, Vrn-H2, vrn-H3 and HvCO1) possibly exhibited significant effects on flowering time in S42ILs. In both populations, the QTL showing the strongest effect corresponded to Ppd-H1. Here, the exotic allele was associated with a reduction of number of days until flowering by 8.0 and 12.7%, respectively. Our data suggest that Ppd-H1, Vrn-H2 and Vrn-H3 may also exert pleiotropic effects on yield and yield-related traits. PMID:20155245
Pers, Tune H; Hansen, Niclas Tue; Lage, Kasper; Koefoed, Pernille; Dworzynski, Piotr; Miller, Martin Lee; Flint, Tracey J; Mellerup, Erling; Dam, Henrik; Andreassen, Ole A; Djurovic, Srdjan; Melle, Ingrid; Børglum, Anders D; Werge, Thomas; Purcell, Shaun; Ferreira, Manuel A; Kouskoumvekaki, Irene; Workman, Christopher T; Hansen, Torben; Mors, Ole; Brunak, Søren
2011-07-01
Meta-analyses of large-scale association studies typically proceed solely within one data type and do not exploit the potential complementarities in other sources of molecular evidence. Here, we present an approach to combine heterogeneous data from genome-wide association (GWA) studies, protein-protein interaction screens, disease similarity, linkage studies, and gene expression experiments into a multi-layered evidence network which is used to prioritize the entire protein-coding part of the genome identifying a shortlist of candidate genes. We report specifically results on bipolar disorder, a genetically complex disease where GWA studies have only been moderately successful. We validate one such candidate experimentally, YWHAH, by genotyping five variations in 640 patients and 1,377 controls. We found a significant allelic association for the rs1049583 polymorphism in YWHAH (adjusted P = 5.6e-3) with an odds ratio of 1.28 [1.12-1.48], which replicates a previous case-control study. In addition, we demonstrate our approach's general applicability by use of type 2 diabetes data sets. The method presented augments moderately powered GWA data, and represents a validated, flexible, and publicly available framework for identifying risk genes in highly polygenic diseases. The method is made available as a web service at www.cbs.dtu.dk/services/metaranker.
Reiner, Gerald; Dreher, Felix; Drungowski, Mario; Hoeltig, Doris; Bertsch, Natalie; Selke, Martin; Willems, Hermann; Gerlach, Gerald Friedrich; Probst, Inga; Tuemmler, Burkhardt; Waldmann, Karl-Heinz; Herwig, Ralf
2014-12-01
Actinobacillus (A.) pleuropneumoniae is among the most important pathogens in pig. The agent causes severe economic losses due to decreased performance, the occurrence of acute or chronic pleuropneumonia, and an increase in death incidence. Since therapeutics cannot be used in a sustainable manner, and vaccination is not always available, new prophylactic measures are urgently needed. Recent research has provided evidence for a genetic predisposition in susceptibility to A. pleuropneumoniae in a Hampshire × German Landrace F2 family with 170 animals. The aim of the present study is to characterize the expression response in this family in order to unravel resistance and susceptibility mechanisms and to prioritize candidate genes for future fine mapping approaches. F2 pigs differed distinctly in clinical, pathological, and microbiological parameters after challenge with A. pleuropneumoniae. We monitored genome-wide gene expression from the 50 most and 50 least susceptible F2 pigs and identified 171 genes differentially expressed between these extreme phenotypes. We combined expression QTL analyses with network analyses and functional characterization using gene set enrichment analysis and identified a functional hotspot on SSC13, including 55 eQTL. The integration of the different results provides a resource for candidate prioritization for fine mapping strategies, such as TF, TFRC, RUNX1, TCN1, HP, CD14, among others.
Wang, Yimin; Du, Xiaonan; Bin, Rao; Yu, Shanshan; Xia, Zhezhi; Zheng, Guo; Zhong, Jianmin; Zhang, Yunjian; Jiang, Yong-hui; Wang, Yi
2017-01-01
Genetic factors play a major role in the etiology of epilepsy disorders. Recent genomics studies using next generation sequencing (NGS) techniques have identified a large number of genetic variants, including copy number variants (CNV) and single nucleotide variants (SNV), in a small set of genes from individuals with epilepsy. These discoveries have contributed significantly to evaluating the etiology of epilepsy in the clinic and have laid the foundation for developing molecularly specific treatments. However, the molecular basis for a majority of epilepsy patients remains elusive, and furthermore, most of these studies have been conducted in Caucasian children. Here we conducted targeted exome sequencing of 63 trios of Chinese epilepsy families using a custom-designed NGS panel that covers 412 known and candidate genes for epilepsy. We identified pathogenic and likely pathogenic variants in 15 of 63 (23.8%) families in known epilepsy genes including SCN1A, CDKL5, STXBP1, CHD2, SCN3A, SCN9A, TSC2, MBD5, POLG and EFHC1. More importantly, we identified likely pathologic variants in several novel candidate genes such as GABRE, MYH1, and CLCN6. Our results provide evidence supporting the application of a custom-designed NGS panel in the clinic and indicate a conserved genetic susceptibility for epilepsy between Chinese and Caucasian children. PMID:28074849
Perdiguero, Pedro; Collada, Carmen; Barbero, María Del Carmen; García Casado, Gloria; Cervera, María Teresa; Soto, Alvaro
2012-01-01
Climate change is a major challenge particularly for forest tree species, which will have to face the severe alterations of environmental conditions with their current genetic pool. Thus, an understanding of their adaptive responses is of the utmost interest. In this work we have selected Pinus pinaster as a model species. This pine is one of the most important conifers (for which molecular tools and knowledge are far more scarce than for angiosperms) in the Mediterranean Basin, which is characterised in all foreseen scenarios as one of the regions most drastically affected by climate change, mainly because of increasing temperature and, particularly, by increasing drought. We have induced a controlled, increasing water stress by adding PEG to a hydroponic culture. We have generated a subtractive library, with the aim of identifying the genes induced by this stress and have searched for the most reliable expressional candidate genes, based on their overexpression during water stress, as revealed by microarray analysis and confirmed by RT-PCR. We have selected a set of 67 candidate genes belonging to different functional groups that will be useful molecular tools for further studies on drought stress responses, adaptation, and population genomics in conifers, as well as in breeding programs.
Spanagel, Rainer
2013-01-01
Convergent functional genomics (CFG) is a translational methodology that integrates, in a Bayesian fashion, multiple lines of evidence from studies in human and animal models to get a better understanding of the genetics of a disease or pathological behavior. Here the integration of data sets that derive from forward genetics in animals and genetic association studies including genome wide association studies (GWAS) in humans is described for addictive behavior. The aim of forward genetics in animals and association studies in humans is to identify mutations (e.g. SNPs) that produce a certain phenotype; i.e. "from phenotype to genotype". Most powerful in terms of forward genetics is combined quantitative trait loci (QTL) analysis and gene expression profiling in recombinant inbred rodent lines or animals genetically selected for a specific phenotype, e.g. high vs. low drug consumption. By Bayesian scoring, genomic information from forward genetics in animals is then combined with human GWAS data on a similar addiction-relevant phenotype. This integrative approach generates a robust candidate gene list that has to be functionally validated by means of reverse genetics in animals; i.e. "from genotype to phenotype". It is proposed that studying addiction relevant phenotypes and endophenotypes by this CFG approach will allow a better determination of the genetics of addictive behavior.
Wittig-Blaich, Stephanie; Wittig, Rainer; Schmidt, Steffen; Lyer, Stefan; Bewerunge-Hudler, Melanie; Gronert-Sum, Sabine; Strobel-Freidekind, Olga; Müller, Carolin; List, Markus; Jaskot, Aleksandra; Christiansen, Helle; Hafner, Mathias; Schadendorf, Dirk; Block, Ines; Mollenhauer, Jan
2017-01-01
Next-generation sequencing has dramatically increased genome-wide profiling options and conceptually initiates the possibility for personalized cancer therapy. State-of-the-art sequencing studies yield large candidate gene sets comprising dozens or hundreds of mutated genes. However, few technologies are available for the systematic downstream evaluation of these results to identify novel starting points of future cancer therapies. We improved and extended a site-specific recombination-based system for systematic analysis of the individual functions of a large number of candidate genes. This was facilitated by a novel system for the construction of isogenic constitutive and inducible gain- and loss-of-function cell lines. Additionally, we demonstrate the construction of isogenic cell lines with combinations of the traits for advanced functional in vitro analyses. In a proof-of-concept experiment, a library of 108 isogenic melanoma cell lines was constructed and 8 genes were identified that significantly reduced viability in a discovery screen and in an independent validation screen. Here, we demonstrate the broad applicability of this recombination-based method and we proved its potential to identify new drug targets via the identification of the tumor suppressor DUSP6 as potential synthetic lethal target in melanoma cell lines with BRAF V600E mutations and high DUSP6 expression. PMID:28423600
Leonenko, Ganna; Richards, Alexander L; Walters, James T; Pocklington, Andrew; Chambert, Kimberly; Al Eissa, Mariam M; Sharp, Sally I; O'Brien, Niamh L; Curtis, David; Bass, Nicholas J; McQuillin, Andrew; Hultman, Christina; Moran, Jennifer L; McCarroll, Steven A; Sklar, Pamela; Neale, Benjamin M; Holmans, Peter A; Owen, Michael J; Sullivan, Patrick F; O'Donovan, Michael C
2017-10-01
Risk of schizophrenia is conferred by alleles occurring across the full spectrum of frequencies from common SNPs of weak effect through to ultra rare alleles, some of which may be moderately to highly penetrant. Previous studies have suggested that some of the risk of schizophrenia is attributable to uncommon alleles represented on Illumina exome arrays. Here, we present the largest study of exomic variation in schizophrenia to date, using samples from the United Kingdom and Sweden (10,011 schizophrenia cases and 13,791 controls). Single variants, genes, and gene sets were analyzed for association with schizophrenia. No single variant or gene reached genome-wide significance. Among candidate gene sets, we found significant enrichment for rare alleles (minor allele frequency [MAF] < 0.001) in genes intolerant of loss-of-function (LoF) variation and in genes whose messenger RNAs bind to fragile X mental retardation protein (FMRP). We further delineate the genetic architecture of schizophrenia by excluding a role for uncommon exomic variants (0.01 ≥ MAF ≥ 0.001) that confer a relatively large effect (odds ratio [OR] > 4). We also show risk alleles within this frequency range exist, but confer smaller effects and should be identified by larger studies.
Zhu, Jie; Qin, Yufang; Liu, Taigang; Wang, Jun; Zheng, Xiaoqi
2013-01-01
Identification of gene-phenotype relationships is a fundamental challenge for human health and the clinic. Based on the observation that genes causing the same or similar phenotypes tend to correlate with each other in the protein-protein interaction network, many network-based approaches have been proposed based on different underlying models. A recent comparative study showed that diffusion-based methods achieve state-of-the-art predictive performance. In this paper, a new diffusion-based method is proposed to prioritize candidate disease genes. The diffusion profile of a disease is defined as the stationary distribution of candidate genes given a random walk with restart in which similarities between phenotypes are incorporated. Candidate disease genes are then prioritized by comparing their diffusion profiles with that of the disease. Finally, the effectiveness of our method is demonstrated through leave-one-out cross-validation against control genes from artificial linkage intervals and randomly chosen genes. A comparative study showed that our method achieves improved performance compared to some classical diffusion-based methods. To further illustrate our method, we used our algorithm to predict new causal genes of 16 multifactorial diseases including prostate cancer and Alzheimer's disease, and the top predictions were in good agreement with literature reports. Our study indicates that integration of multiple information sources, especially the phenotype similarity profile data, and introduction of a global similarity measure between disease and gene diffusion profiles are helpful for prioritizing candidate disease genes. Programs and data are available upon request.
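The random walk with restart mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it omits the phenotype-similarity weighting and the profile comparison step, and the toy network, the node names, the function name and the restart probability of 0.3 are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adj   : dict mapping each node to a list of its neighbours (hypothetical toy network;
            every node is assumed to have at least one neighbour)
    seeds : set of nodes where the walker restarts (e.g. known disease genes)
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node passes its probability mass equally to its neighbours.
            share = (1.0 - restart) * p[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy chain network A - B - C - D with a single seed gene A (illustrative only).
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
profile = random_walk_with_restart(toy_network, seeds={"A"})
ranking = sorted(profile, key=profile.get, reverse=True)
print(ranking)
```

In this toy case the stationary probabilities fall off with distance from the seed, which is the intuition behind ranking candidate genes by network proximity to known disease genes.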
In Silico Gene Prioritization by Integrating Multiple Data Sources
Zhou, Yingyao; Shields, Robert; Chanda, Sumit K.; Elston, Robert C.; Li, Jing
2011-01-01
Identifying disease genes is crucial to the understanding of disease pathogenesis, and to the improvement of disease diagnosis and treatment. In recent years, many researchers have proposed approaches to prioritize candidate genes by considering the relationship of candidate genes and existing known disease genes, reflected in other data sources. In this paper, we propose an expandable framework for gene prioritization that can integrate multiple heterogeneous data sources by taking advantage of a unified graphic representation. Gene-gene relationships and gene-disease relationships are then defined based on the overall topology of each network using a diffusion kernel measure. These relationship measures are in turn normalized to derive an overall measure across all networks, which is utilized to rank all candidate genes. Based on the informativeness of available data sources with respect to each specific disease, we also propose an adaptive threshold score to select a small subset of candidate genes for further validation studies. We performed large scale cross-validation analysis on 110 disease families using three data sources. Results have shown that our approach consistently outperforms two other state-of-the-art programs. A case study using Parkinson disease (PD) has identified four candidate genes (UBB, SEPT5, GPR37 and TH) that ranked higher than our adaptive threshold, all of which are involved in the PD pathway. In particular, a very recent study has observed a deletion of TH in a patient with PD, which supports the importance of the TH gene in PD pathogenesis. A web tool has been implemented to assist scientists in their genetic studies. PMID:21731658
A genome-wide association study of corneal astigmatism: The CREAM Consortium.
Shah, Rupal L; Li, Qing; Zhao, Wanting; Tedja, Milly S; Tideman, J Willem L; Khawaja, Anthony P; Fan, Qiao; Yazar, Seyhan; Williams, Katie M; Verhoeven, Virginie J M; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W V; Hysi, Pirro G; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R; Jonas, Jost B; Mitchell, Paul; Hammond, Christopher J; Höhn, René; Baird, Paul N; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C W; Guggenheim, Jeremy A; Bailey-Wilson, Joan E
2018-01-01
To identify genes and genetic markers associated with corneal astigmatism. A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI: 1.08-1.16), p=5.55×10^-9. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans: claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism.
Tamilzhalagan, Sembulingam; Muthuswami, Muthulakshmi; Ganesan, Kumaresan
2017-04-01
Genomic Copy Number Variations (CNV) and the associated gene signatures are useful for cancer prognosis, diagnosis, and targeted therapeutics. Earlier, 7q21-22 region was reported for frequent amplification in gastric cancer and potential candidate genes were identified. An analysis of the expression pattern of the 159 genes located in this amplicon revealed the consistent elevated expression of 21 genes in gastric tumors. These genes are closely arranged within the 20 Mb region, and they showed a bimodal expression pattern. SHFM1 and 14 other genes are expressed in intestinal type gastric tumors. COL1A2 and PCOLCE genes of this region are expressed in diffuse type gastric tumors. Similarly, genome-wide expression neighbors of SHFM1 and COL1A2 also showed mutually exclusive expression pattern, and stratify intestinal and diffuse type gastric tumors. The expression of COL1A2 gene-set is associated with poor prognosis, whereas the SHFM1 gene-set is associated with better prognosis among the gastric cancer patients. Despite being physical neighbors, the SHFM1 and COL1A2 genes express differentially in the two major clinical sub-types of gastric cancer in a mutually exclusive manner. The tight gene regulations operating between these juxtaposed genes deserve investigation to understand the molecular regulatory switch defining the determinants of the gastric cancer sub-types.
The Genetic Basis for Variation in Sensitivity to Lead Toxicity in Drosophila melanogaster.
Zhou, Shanshan; Morozova, Tatiana V; Hussain, Yasmeen N; Luoma, Sarah E; McCoy, Lenovia; Yamamoto, Akihiko; Mackay, Trudy F C; Anholt, Robert R H
2016-07-01
Lead toxicity presents a worldwide health problem, especially due to its adverse effects on cognitive development in children. However, identifying genes that give rise to individual variation in susceptibility to lead toxicity is challenging in human populations. Our goal was to use Drosophila melanogaster to identify evolutionarily conserved candidate genes associated with individual variation in susceptibility to lead exposure. To identify candidate genes associated with variation in susceptibility to lead toxicity, we measured effects of lead exposure on development time, viability and adult activity in the Drosophila melanogaster Genetic Reference Panel (DGRP) and performed genome-wide association analyses to identify candidate genes. We used mutants to assess functional causality of candidate genes and constructed a genetic network associated with variation in sensitivity to lead exposure, on which we could superimpose human orthologs. We found substantial heritabilities for all three traits and identified candidate genes associated with variation in susceptibility to lead exposure for each phenotype. The genetic architectures that determine variation in sensitivity to lead exposure are highly polygenic. Gene ontology and network analyses showed enrichment of genes associated with early development and function of the nervous system. Drosophila melanogaster presents an advantageous model to study the genetic underpinnings of variation in susceptibility to lead toxicity. Evolutionary conservation of cellular pathways that respond to toxic exposure allows predictions regarding orthologous genes and pathways across phyla. Thus, studies in the D. melanogaster model system can identify candidate susceptibility genes to guide subsequent studies in human populations. Zhou S, Morozova TV, Hussain YN, Luoma SE, McCoy L, Yamamoto A, Mackay TF, Anholt RR. 2016. The genetic basis for variation in sensitivity to lead toxicity in Drosophila melanogaster. Environ Health Perspect 124:1062-1070; http://dx.doi.org/10.1289/ehp.1510513.
Ali, Shafat; Chopra, Rupali; Manvati, Siddharth; Singh, Yoginder Pal; Kaul, Nabodita; Behura, Anita; Mahajan, Ankit; Sehajpal, Prabodh; Gupta, Subash; Dhar, Manoj K; Chainy, Gagan B N; Bhanwer, Amarjit S; Sharma, Swarkar; Bamezai, Rameshwar N K
2013-01-01
Type 2 diabetes (T2D) is a syndrome of multiple metabolic disorders and is genetically heterogeneous. India comprises one of the largest global populations with highest number of reported type 2 diabetes cases. However, limited information about T2D associated loci is available for Indian populations. It is, therefore, pertinent to evaluate the previously associated candidates as well as identify novel genetic variations in Indian populations to understand the extent of genetic heterogeneity. We chose to do a cost effective high-throughput mass-array genotyping and studied the candidate gene variations associated with T2D in literature. In this case-control candidate genes association study, 91 SNPs from 55 candidate genes have been analyzed in three geographically independent population groups from India. We report the genetic variants in five candidate genes: TCF7L2, HHEX, ENPP1, IDE and FTO, are significantly associated (after Bonferroni correction, p<5.5E-04) with T2D susceptibility in combined population. Interestingly, SNP rs7903146 of the TCF7L2 gene passed the genome wide significance threshold (combined P value = 2.05E-08) in the studied populations. We also observed the association of rs7903146 with blood glucose (fasting and postprandial) levels, supporting the role of TCF7L2 gene in blood glucose homeostasis. Further, we noted that the moderate risk provided by the independently associated loci in combined population with Odds Ratio (OR)<1.38 increased to OR = 2.44, (95%CI = 1.67-3.59) when the risk providing genotypes of TCF7L2, HHEX, ENPP1 and FTO genes were combined, suggesting the importance of gene-gene interactions evaluation in complex disorders like T2D.
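The Bonferroni threshold quoted above (p<5.5E-04) matches what a family-wise alpha of 0.05 divided across the 91 tested SNPs would give. The abstract does not say how the threshold was derived, so the alpha of 0.05 and the divisor of 91 are assumptions, and the function name is illustrative only. A minimal sketch of the arithmetic:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed inputs: family-wise alpha of 0.05 and 91 SNPs (one test per SNP).
threshold = bonferroni_threshold(0.05, 91)
print(f"{threshold:.2e}")  # 5.49e-04

# The combined P value reported for rs7903146 is far below this threshold.
rs7903146_p = 2.05e-08
print(rs7903146_p < threshold)  # True
```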
Ali, Shafat; Chopra, Rupali; Manvati, Siddharth; Mahajan, Ankit; Sehajpal, Prabodh; Gupta, Subash; Dhar, Manoj K.; Chainy, Gagan B. N.; Bhanwer, Amarjit S.; Sharma, Swarkar; Bamezai, Rameshwar N. K.
2013-01-01
Type 2 diabetes (T2D) is a syndrome of multiple metabolic disorders and is genetically heterogeneous. India comprises one of the largest global populations with highest number of reported type 2 diabetes cases. However, limited information about T2D associated loci is available for Indian populations. It is, therefore, pertinent to evaluate the previously associated candidates as well as identify novel genetic variations in Indian populations to understand the extent of genetic heterogeneity. We chose to do a cost effective high-throughput mass-array genotyping and studied the candidate gene variations associated with T2D in literature. In this case-control candidate genes association study, 91 SNPs from 55 candidate genes have been analyzed in three geographically independent population groups from India. We report the genetic variants in five candidate genes: TCF7L2, HHEX, ENPP1, IDE and FTO, are significantly associated (after Bonferroni correction, p<5.5E−04) with T2D susceptibility in combined population. Interestingly, SNP rs7903146 of the TCF7L2 gene passed the genome wide significance threshold (combined P value = 2.05E−08) in the studied populations. We also observed the association of rs7903146 with blood glucose (fasting and postprandial) levels, supporting the role of TCF7L2 gene in blood glucose homeostasis. Further, we noted that the moderate risk provided by the independently associated loci in combined population with Odds Ratio (OR)<1.38 increased to OR = 2.44, (95%CI = 1.67–3.59) when the risk providing genotypes of TCF7L2, HHEX, ENPP1 and FTO genes were combined, suggesting the importance of gene-gene interactions evaluation in complex disorders like T2D. PMID:23527042
DeWoody, J Andrew; Fernandez, Nadia B; Brüniche-Olsen, Anna; Antonides, Jennifer D; Doyle, Jacqueline M; San Miguel, Phillip; Westerman, Rick; Vertyankin, Vladimir V; Godard-Codding, Céline A J; Bickham, John W
2017-06-01
Genetic and genomic approaches have much to offer in terms of ecology, evolution, and conservation. To better understand the biology of the gray whale Eschrichtius robustus (Lilljeborg, 1861), we sequenced the genome and produced an assembly that contains ∼95% of the genes known to be highly conserved among eukaryotes. From this assembly, we annotated 22,711 genes and identified 2,057,254 single-nucleotide polymorphisms (SNPs). Using this assembly, we generated a curated list of candidate genes potentially subject to strong natural selection, including genes associated with osmoregulation, oxygen binding and delivery, and other aspects of marine life. From these candidate genes, we queried 92 autosomal protein-coding markers with a panel of 96 SNPs that also included 2 sexing and 2 mitochondrial markers. Genotyping error rates, calculated across loci and across 69 intentional replicate samples, were low (0.021%), and observed heterozygosity was 0.33 averaged over all autosomal markers. This level of variability provides substantial discriminatory power across loci (mean probability of identity of 1.6 × 10^-25 and mean probability of exclusion >0.999 with neither parent known), indicating that these markers provide a powerful means to assess parentage and relatedness in gray whales. We found 29 unique multilocus genotypes represented among our 36 biopsies (indicating that we inadvertently sampled 7 whales twice). In total, we compiled an individual data set of 28 western gray whales (WGWs) and 1 presumptive eastern gray whale (EGW). The lone EGW we sampled was no more or less related to the WGWs than expected by chance alone. The gray whale genomes reported here will enable comparative studies of natural selection in cetaceans, and the SNP markers should be highly informative for future studies of gray whale evolution, population structure, demography, and relatedness.
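The probability of identity mentioned above can be illustrated with the standard formula for unrelated individuals at a biallelic locus under Hardy-Weinberg proportions, PI = p^4 + q^4 + (2pq)^2, multiplied across loci that are assumed to be independent. This is a sketch under those assumptions, not the authors' calculation. The allele frequencies below are invented for illustration and are not taken from the study.

```python
import math
from typing import Iterable


def locus_pi(p: float) -> float:
    """Probability that two unrelated individuals share a genotype at one
    biallelic locus with allele frequencies p and 1 - p (Hardy-Weinberg)."""
    q = 1.0 - p
    return p**4 + q**4 + (2.0 * p * q) ** 2


def multilocus_pi(freqs: Iterable[float]) -> float:
    """Product of per-locus values, assuming the loci are independent."""
    return math.prod(locus_pi(p) for p in freqs)


# Hypothetical allele frequencies, for illustration only.
print(locus_pi(0.5))              # 0.375
print(multilocus_pi([0.5, 0.5]))  # 0.140625
```

Because each locus contributes a factor below 1, the multilocus value shrinks quickly as informative markers are added, which is why a panel of roughly 90 SNPs can reach very small values.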
Agarwal, Parul; Garg, Varsha; Gautam, Taru; Pillai, Beena; Kanoria, Shaveta; Burma, Pradeep Kumar
2014-04-01
Several reports of promoters from plants, viral and artificial origin that confer high constitutive expression are known. Among these the CaMV 35S promoter is used extensively for transgene expression in plants. We identified candidate promoters from Arabidopsis based on their transcript levels (meta-analysis of available microarray control datasets) to test their activity in comparison to the CaMV 35S promoter. A set of 11 candidate genes was identified which showed high transcript levels in the aerial tissue (i.e. leaf, shoot, flower and stem). In the initial part of the study binary vectors were developed wherein the promoter and 5'UTR region of these candidate genes (Upstream Regulatory Module, URM) were cloned upstream of the reporter gene β-glucuronidase (gus). The promoter strengths were tested in transformed callus of Nicotiana tabacum and Gossypium hirsutum. On the basis of the results obtained from the callus, the influence of the URM cassettes on transgene expression was tested in transgenic tobacco. The URM regions of the genes encoding a subunit of photosystem I (PHOTO) and geranyl geranyl reductase (GGR) in A. thaliana genome showed significantly high levels of GUS activity in comparison to the CaMV 35S promoter. Further, when the 5'UTRs of both the genes were placed downstream of the CaMV 35S promoter it led to a substantial increase in GUS activity in transgenic tobacco lines and cotton callus. The enhancement observed was even higher than that observed with the viral leader sequences like Ω and AMV, known translational enhancers. Our results indicate that the two URM cassettes or the 5'UTR regions of PHOTO and GGR when placed downstream of the CaMV 35S promoter can be used to drive high levels of transgene expression in dicotyledons.
Johns, N; Tan, B H; MacMillan, M; Solheim, T S; Ross, J A; Baracos, V E; Damaraju, S; Fearon, K C H
2014-12-01
Cancer cachexia is a complex and multifactorial disease. Evolving definitions highlight the fact that a diverse range of biological processes contribute to cancer cachexia. Part of the variation in who will and who will not develop cancer cachexia may be genetically determined. As new definitions, classifications and biological targets continue to evolve, there is a need for reappraisal of the literature for future candidate association studies. This review summarizes genes identified or implicated as well as putative candidate genes contributing to cachexia, identified through diverse technology platforms and model systems to further guide association studies. A systematic search covering 1986-2012 was performed for potential candidate genes / genetic polymorphisms relating to cancer cachexia. All candidate genes were reviewed for functional polymorphisms or clinically significant polymorphisms associated with cachexia using the OMIM and GeneRIF databases. Pathway analysis software was used to reveal possible network associations between genes. Functionality of SNPs/genes was explored based on published literature, algorithms for detecting putative deleterious SNPs and interrogating the database for expression of quantitative trait loci (eQTLs). A total of 154 genes associated with cancer cachexia were identified and explored for functional polymorphisms. Of these 154 genes, 119 had a combined total of 281 polymorphisms with functional and/or clinical significance in terms of cachexia associated with them. Of these, 80 polymorphisms (in 51 genes) were replicated in more than one study with 24 polymorphisms found to influence two or more hallmarks of cachexia (i.e., inflammation, loss of fat mass and/or lean mass and reduced survival). Selection of candidate genes and polymorphisms is a key element of multigene study design. 
The present study provides a contemporary basis to select genes and/or polymorphisms for further association studies in cancer cachexia, and to develop their potential as susceptibility biomarkers of cachexia.
A fruit quality gene map of Prunus
2009-01-01
Background Prunus fruit development, growth, ripening, and senescence includes major biochemical and sensory changes in texture, color, and flavor. The genetic dissection of these complex processes has important applications in crop improvement, to facilitate maximizing and maintaining stone fruit quality from production and processing through to marketing and consumption. Here we present an integrated fruit quality gene map of Prunus containing 133 genes putatively involved in the determination of fruit texture, pigmentation, flavor, and chilling injury resistance. Results A genetic linkage map of 211 markers was constructed for an intraspecific peach (Prunus persica) progeny population, Pop-DG, derived from a canning peach cultivar 'Dr. Davis' and a fresh market cultivar 'Georgia Belle'. The Pop-DG map covered 818 cM of the peach genome and included three morphological markers, 11 ripening candidate genes, 13 cold-responsive genes, 21 novel EST-SSRs from the ChillPeach database, 58 previously reported SSRs, 40 RAFs, 23 SRAPs, 14 IMAs, and 28 accessory markers from candidate gene amplification. The Pop-DG map was co-linear with the Prunus reference T × E map, with 39 SSR markers in common to align the maps. A further 158 markers were bin-mapped to the reference map: 59 ripening candidate genes, 50 cold-responsive genes, and 50 novel EST-SSRs from ChillPeach, with deduced locations in Pop-DG via comparative mapping. Several candidate genes and EST-SSRs co-located with previously reported major trait loci and quantitative trait loci for chilling injury symptoms in Pop-DG. Conclusion The candidate gene approach combined with bin-mapping and availability of a community-recognized reference genetic map provides an efficient means of locating genes of interest in a target genome. We highlight the co-localization of fruit quality candidate genes with previously reported fruit quality QTLs. 
The fruit quality gene map developed here is a valuable tool for dissecting the genetic architecture of fruit quality traits in Prunus crops. PMID:19995417
Miao, Yuanxin; Soudy, Fathia; Xu, Zhong; Liao, Mingxing; Zhao, Shuhong; Li, Xinyun
2017-01-01
Feed efficiency (FE) is a very important trait in livestock industry. Identification of the candidate genes could be of benefit for the improvement of FE trait. Mouse is used as the model for many studies in mammals. In this study, the candidate genes related to FE and coat color were identified using C57BL/6J (C57) × Kunming (KM) F2 mouse population. GWAS results showed that 61 and 2 SNPs were genome-wise suggestive significantly associated with feed conversion ratio (FCR) and feed intake (FI) traits, respectively. Moreover, the Erbin, Msrb2, Ptf1a, and Fgf10 were considered as the candidate genes of FE. The Lpl was considered as the candidate gene of FI. Further, the coat color trait was studied. KM mice are white and C57 ones are black. The GWAS results showed that the most significant SNP was located at chromosome 7, and the closely linked gene was Tyr. Therefore, our study offered useful target genes related to FE in mice; these genes may play similar roles in FE of livestock. Also, we identified the major gene of coat color in mice, which would be useful for better understanding of natural mutation of the coat color in mice.
Jąkalski, Marcin; Takeshita, Kazutaka; Deblieck, Mathieu; Koyanagi, Kanako O; Makałowska, Izabela; Watanabe, Hidemi; Makałowski, Wojciech
2016-08-04
Retroposition, one of the processes of copying the genetic material, is an important RNA-mediated mechanism leading to the emergence of new genes. Because the transcription controlling segments are usually not copied to the new location in this mechanism, the duplicated gene copies (retrocopies) become pseudogenized. However, few can still survive, e.g. by recruiting novel regulatory elements from the region of insertion. Subsequently, these duplicated genes can contribute to the formation of lineage-specific traits and phenotypic diversity. Despite the numerous studies of the functional retrocopies (retrogenes) in animals and plants, very little is known about their presence in green algae, including morphologically diverse species. The current availability of the genomes of both uni- and multicellular algae provides a good opportunity to conduct a genome-wide investigation in order to fill the knowledge gap in retroposition phenomenon in this lineage. Here we present a comparative genomic analysis of uni- and multicellular algae, Chlamydomonas reinhardtii and Volvox carteri, respectively, to explore their retrogene complements. By adopting a computational approach, we identified 141 retrogene candidates in total in both genomes, with their fraction being significantly higher in the multicellular Volvox. Majority of the retrogene candidates showed signatures of functional constraints, thus indicating their functionality. Detailed analyses of the identified retrogene candidates, their parental genes, and homologs of both, revealed that most of the retrogene candidates were derived from ancient retroposition events in the common ancestor of the two algae and that the parental genes were subsequently lost from the respective lineages, making many retrogenes 'orphan'. 
We revealed that the genomes of the green algae have maintained many possibly functional retrogenes in spite of experiencing various molecular evolutionary events during a long evolutionary time after the retroposition events. Our first report about the retrogene set in the green algae provides a good foundation for any future investigation of the repertoire of retrogenes and facilitates the assessment of the evolutionary impact of retroposition on diverse morphological traits in this lineage. This article was reviewed by William Martin and Piotr Zielenkiewicz.
Moschen, Sebastian; Bengoa Luoni, Sofia; Paniego, Norma B.; Hopp, H. Esteban; Dosio, Guillermo A. A.
2014-01-01
Cultivated sunflower (Helianthus annuus L.), an important source of edible vegetable oil, shows rapid onset of senescence, which limits production by reducing photosynthetic capacity under specific growing conditions. Carbon for grain filling depends strongly on light interception by green leaf area, which diminishes during grain filling due to leaf senescence. Transcription factors (TFs) regulate the progression of leaf senescence in plants and have been well explored in model systems, but information for many agronomic crops remains limited. Here, we characterize the expression profiles of a set of putative senescence associated genes (SAGs) identified by a candidate gene approach and sunflower microarray expression studies. We examined a time course of sunflower leaves undergoing natural senescence and used quantitative PCR (qPCR) to measure the expression of 11 candidate genes representing the NAC, WRKY, MYB and NF-Y TF families. In addition, we measured physiological parameters such as chlorophyll, total soluble sugars and nitrogen content. The expression of Ha-NAC01, Ha-NAC03, Ha-NAC04, Ha-NAC05 and Ha-MYB01 TFs increased before the remobilization rate increased and therefore, before the appearance of the first physiological symptoms of senescence, whereas Ha-NAC02 expression decreased. In addition, we also examined the trifurcate feed-forward pathway (involving ORE1, miR164, and ETHYLENE INSENSITIVE 2) previously reported for Arabidopsis. We measured transcription of Ha-NAC01 (the sunflower homolog of ORE1) and Ha-EIN2, along with the levels of miR164, in two leaves from different stem positions, and identified differences in transcription between basal and upper leaves. Interestingly, Ha-NAC01 and Ha-EIN2 transcription profiles showed an earlier up-regulation in upper leaves of plants close to maturity, compared with basal leaves of plants at pre-anthesis stages. These results suggest that the H. annuus TFs characterized in this work could play important roles as potential triggers of leaf senescence and thus can be considered putative candidate genes for senescence in sunflower. PMID:25110882
Moschen, Sebastian; Bengoa Luoni, Sofia; Paniego, Norma B; Hopp, H Esteban; Dosio, Guillermo A A; Fernandez, Paula; Heinz, Ruth A
2014-01-01
Cultivated sunflower (Helianthus annuus L.), an important source of edible vegetable oil, shows rapid onset of senescence, which limits production by reducing photosynthetic capacity under specific growing conditions. Carbon for grain filling depends strongly on light interception by green leaf area, which diminishes during grain filling due to leaf senescence. Transcription factors (TFs) regulate the progression of leaf senescence in plants and have been well explored in model systems, but information for many agronomic crops remains limited. Here, we characterize the expression profiles of a set of putative senescence associated genes (SAGs) identified by a candidate gene approach and sunflower microarray expression studies. We examined a time course of sunflower leaves undergoing natural senescence and used quantitative PCR (qPCR) to measure the expression of 11 candidate genes representing the NAC, WRKY, MYB and NF-Y TF families. In addition, we measured physiological parameters such as chlorophyll, total soluble sugars and nitrogen content. The expression of Ha-NAC01, Ha-NAC03, Ha-NAC04, Ha-NAC05 and Ha-MYB01 TFs increased before the remobilization rate increased and therefore, before the appearance of the first physiological symptoms of senescence, whereas Ha-NAC02 expression decreased. In addition, we also examined the trifurcate feed-forward pathway (involving ORE1, miR164, and ethylene insensitive 2) previously reported for Arabidopsis. We measured transcription of Ha-NAC01 (the sunflower homolog of ORE1) and Ha-EIN2, along with the levels of miR164, in two leaves from different stem positions, and identified differences in transcription between basal and upper leaves. Interestingly, Ha-NAC01 and Ha-EIN2 transcription profiles showed an earlier up-regulation in upper leaves of plants close to maturity, compared with basal leaves of plants at pre-anthesis stages. These results suggest that the H. annuus TFs characterized in this work could play important roles as potential triggers of leaf senescence and thus can be considered putative candidate genes for senescence in sunflower.
The shape of the human language-ready brain
Boeckx, Cedric; Benítez-Burraco, Antonio
2014-01-01
Our core hypothesis is that the emergence of our species-specific language-ready brain ought to be understood in light of the developmental changes expressed at the levels of brain morphology and neural connectivity that occurred in our species after the split from Neanderthals–Denisovans and that gave us a more globular braincase configuration. In addition to changes at the cortical level, we hypothesize that the anatomical shift that led to globularity also entailed significant changes at the subcortical level. We claim that the functional consequences of such changes must also be taken into account to gain a fuller understanding of our linguistic capacity. Here we focus on the thalamus, which we argue is central to language and human cognition, as it modulates fronto-parietal activity. With this new neurobiological perspective in place, we examine its possible molecular basis. We construct a candidate gene set whose members are involved in the development and connectivity of the thalamus, in the evolution of the human head, and are known to give rise to language-associated cognitive disorders. We submit that the new gene candidate set opens up new windows into our understanding of the genetic basis of our linguistic capacity. Thus, our hypothesis aims at generating new testing grounds concerning core aspects of language ontogeny and phylogeny. PMID:24772099
Chen, Lei; Zhong, Hai-ying; Kuang, Jian-fei; Li, Jian-guo; Lu, Wang-jin; Chen, Jian-ye
2011-08-01
Reverse transcription quantitative real-time PCR (RT-qPCR) is a sensitive technique for quantifying gene expression, but its success depends on the stability of the reference gene(s) used for data normalization. Only a few studies on validation of reference genes have been conducted in fruit trees and none in banana yet. In the present work, 20 candidate reference genes were selected, and their expression stability in 144 banana samples were evaluated and analyzed using two algorithms, geNorm and NormFinder. The samples consisted of eight sample sets collected under different experimental conditions, including various tissues, developmental stages, postharvest ripening, stresses (chilling, high temperature, and pathogen), and hormone treatments. Our results showed that different suitable reference gene(s) or combination of reference genes for normalization should be selected depending on the experimental conditions. The RPS2 and UBQ2 genes were validated as the most suitable reference genes across all tested samples. More importantly, our data further showed that the widely used reference genes, ACT and GAPDH, were not the most suitable reference genes in many banana sample sets. In addition, the expression of MaEBF1, a gene of interest that plays an important role in regulating fruit ripening, under different experimental conditions was used to further confirm the validated reference genes. Taken together, our results provide guidelines for reference gene(s) selection under different experimental conditions and a foundation for more accurate and widespread use of RT-qPCR in banana.
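The stability ranking that geNorm produces rests on a simple measure, M: for each candidate gene, the standard deviation of its log2 expression ratio against every other candidate is computed across samples, and those standard deviations are averaged. A lower M means more stable expression. The sketch below illustrates that measure only. It is not the geNorm software and omits its stepwise exclusion of the least stable gene. The gene labels and expression values are invented toy data.

```python
import math
import statistics
from typing import Dict, List


def genorm_m(expr: Dict[str, List[float]]) -> Dict[str, float]:
    """Return the geNorm-style stability measure M for each gene.

    `expr` maps a gene label to its relative quantities (linear scale,
    strictly positive), one value per sample, in the same sample order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values


# Toy data: ref_a and ref_b keep a constant ratio across the three samples,
# while ref_c fluctuates independently of both.
toy = {
    "ref_a": [1.0, 2.0, 4.0],
    "ref_b": [2.0, 4.0, 8.0],
    "ref_c": [1.0, 4.0, 1.0],
}
m = genorm_m(toy)
most_stable_first = sorted(m, key=m.get)
print(most_stable_first[-1])  # ref_c
```

Ranking by M on the full sample set and again within each sample subset is one way to see why a gene that is stable overall may still be a poor choice under a particular experimental condition.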
Marshall, Elaine; Lowrey, Jacqueline; MacPherson, Sheila; Maybin, Jacqueline A.; Collins, Frances; Critchley, Hilary O. D.
2011-01-01
Context: The endometrium is a multicellular, steroid-responsive tissue that undergoes dynamic remodeling every menstrual cycle in preparation for implantation and, in absence of pregnancy, menstruation. Androgen receptors are present in the endometrium. Objective: The objective of the study was to investigate the impact of androgens on human endometrial stromal cells (hESC). Design: Bioinformatics was used to identify an androgen-regulated gene set and processes associated with their function. Regulation of target genes and impact of androgens on cell function were validated using primary hESC. Setting: The study was conducted at the University Research Institute. Patients: Endometrium was collected from women with regular menses; tissues were used for recovery of cells, total mRNA, or protein and for immunohistochemistry. Results: A new endometrial androgen target gene set (n = 15) was identified. Bioinformatics revealed 12 of these genes interacted in one pathway and identified an association with control of cell survival. Dynamic androgen-dependent changes in expression of the gene set were detected in hESC with nine significantly down-regulated at 2 and/or 8 h. Treatment of hESC with dihydrotestosterone reduced staurosporine-induced apoptosis and cell migration/proliferation. Conclusions: Rigorous in silico analysis resulted in identification of a group of androgen-regulated genes expressed in human endometrium. Pathway analysis and functional assays suggest androgen-dependent changes in gene expression may have a significant impact on stromal cell proliferation, migration, and survival. These data provide the platform for further studies on the role of circulatory or local androgens in the regulation of endometrial function and identify androgens as candidates in the pathogenesis of common endometrial disorders including polycystic ovarian syndrome, cancer, and endometriosis. PMID:21865353
Yang, Yuting; Zhang, Xu; Chen, Yun; Guo, Jinlong; Ling, Hui; Gao, Shiwu; Su, Yachun; Que, Youxiong; Xu, Liping
2016-01-01
Sugarcane, accounting for 80% of the world's sugar, originates in the tropics but is cultivated mainly in the subtropics. Therefore, chilling injury frequently occurs and results in serious losses. Recent studies in various plant species have established microRNAs as key elements in the post-transcriptional regulation of response to biotic and abiotic stresses including cold stress. Although its accuracy is strongly influenced by the choice of reference gene for normalization, quantitative PCR is undoubtedly a popular method for the identification of microRNAs. To identify the most suitable reference genes for normalizing miRNA expression in sugarcane under cold stress, 13 candidates among 17 were investigated using four algorithms: geNorm, NormFinder, deltaCt, and BestKeeper; four candidates were excluded because of unsatisfactory efficiency and specificity. Verification was carried out using cold-related genes miR319 and miR393 in cold-tolerant and sensitive cultivars. The results suggested that miR171/18S rRNA and miR171/miR5059 were the best reference gene sets for normalization for miRNA RT-qPCR, followed by the single miR171 and 18S rRNA. These results can aid research on miRNA responses during sugarcane stress, and the development of sugarcane tolerant to cold stress. This study is the first report concerning the reference gene selection of miRNA RT-qPCR in sugarcane. PMID:26904058
Yang, Chunxiao; Preisser, Evan L; Zhang, Hongjun; Liu, Yong; Dai, Liangying; Pan, Huipeng; Zhou, Xuguo
2016-01-01
The development of genetically engineered plants that employ RNA interference (RNAi) to suppress invertebrate pests opens up new avenues for insect control. While this biotechnology shows tremendous promise, the potential for both non-target and off-target impacts, which likely manifest via altered mRNA expression in the exposed organisms, remains a major concern. One powerful tool for the analysis of these unintended effects is reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR), a technique for quantifying gene expression using a suite of reference genes for normalization. The seven-spotted ladybeetle Coccinella septempunctata, a commonly used predator in both classical and augmentative biological controls, is a model surrogate species used in the environmental risk assessment (ERA) of plant incorporated protectants (PIPs). Here, we assessed the suitability of eight reference gene candidates for the normalization and analysis of C. septempunctata v-ATPase A gene expression under both biotic and abiotic conditions. Five computational tools with distinct algorithms, geNorm, NormFinder, BestKeeper, the ΔCt method, and RefFinder, were used to evaluate the stability of these candidates. As a result, unique sets of reference genes were recommended, respectively, for experiments involving different developmental stages, tissues, and ingested dsRNAs. By providing a foundation for standardized RT-qPCR analysis in C. septempunctata, our work improves the accuracy and replicability of the ERA of PIPs involving RNAi transgenic plants.
Defining the role of the MADS-box gene, Zea agamous like1, in maize domestication
USDA-ARS's Scientific Manuscript database
Genomic scans for genes that show the signature of past selection have been widely applied to a number of species and have identified a large number of selection candidate genes. In cultivated maize (Zea mays ssp. mays) selection scans have identified several hundred candidate domestication genes...
A genome-wide scan for signatures of selection in Azeri and Khuzestani buffalo breeds.
Mokhber, Mahdi; Moradi-Shahrbabak, Mohammad; Sadeghi, Mostafa; Moradi-Shahrbabak, Hossein; Stella, Alessandra; Nicolzzi, Ezequiel; Rahmaninia, Javad; Williams, John L
2018-06-11
Identification of genomic regions that have been targets of selection may shed light on the genetic history of livestock populations and help to identify variation controlling commercially important phenotypes. The Azeri and Khuzestani buffalos are the most common indigenous Iranian breeds, which have been subjected to divergent selection and are well adapted to completely different regions. Examining the genetic structure of these populations may identify genomic regions associated with adaptation to the different environments and production goals. A set of 385 water buffalo samples from Azeri (N = 262) and Khuzestani (N = 123) breeds were genotyped using the Axiom® Buffalo Genotyping 90 K Array. The unbiased fixation index method (F_ST) was used to detect signatures of selection. In total, 13 regions with outlier F_ST values (0.1%) were identified. Annotation of these regions using the UMD3.1 Bos taurus Genome Assembly was performed to find putative candidate genes and QTLs within the selected regions. Putative candidate genes identified include FBXO9, NDFIP1, ACTR3, ARHGAP26, SERPINF2, BOLA-DRB3, BOLA-DQB, CLN8, and MYOM2. Candidate genes identified in regions potentially under selection were associated with physiological pathways including milk production, cytoskeleton organization, growth, metabolic function, apoptosis and domestication-related changes, including immune and nervous system development. The QTL identified are involved in economically important traits in buffalo related to milk composition, udder structure, somatic cell count, meat quality, and carcass and body weight.
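The outlier scan described above can be pictured as two steps: compute a differentiation statistic per marker between the two breeds, then keep the most extreme 0.1%. The sketch below swaps in a simple heterozygosity-based (Nei-style) F_ST for two populations as a stand-in. It is not the unbiased estimator the study reports, and it works per SNP rather than per region. The function names and the toy values in the usage lines are hypothetical.

```python
import math
from typing import List


def fst_two_pops(p1: float, p2: float, n1: int, n2: int) -> float:
    """Simplified F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    p1, p2 are allele frequencies in the two populations; n1, n2 are the
    sample sizes used as weights. This is a stand-in, not an unbiased
    estimator.
    """
    total = n1 + n2
    p_bar = (n1 * p1 + n2 * p2) / total
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    h_s = (n1 * 2.0 * p1 * (1.0 - p1) + n2 * 2.0 * p2 * (1.0 - p2)) / total
    return (h_t - h_s) / h_t


def top_outliers(values: List[float], fraction: float = 0.001) -> List[int]:
    """Indices of the highest `fraction` of values (at least one index)."""
    k = max(1, math.ceil(len(values) * fraction))
    ranked = sorted(range(len(values)), key=values.__getitem__, reverse=True)
    return ranked[:k]


# Sample sizes from the abstract; allele frequencies are made up.
print(fst_two_pops(0.5, 0.5, 262, 123))   # 0.0
print(fst_two_pops(1.0, 0.0, 100, 100))   # 1.0
print(top_outliers([0.01, 0.9, 0.02]))    # [1]
```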
Poon, Kar Lai; Wang, Xingang; Lee, Serene G P; Ng, Ashley S; Goh, Wei Huang; Zhao, Zhonghua; Al-Haddawi, Muthafar; Wang, Haishan; Mathavan, Sinnakaruppan; Ingham, Philip W; McGinnis, Claudia; Carney, Tom J
2017-03-01
Organ toxicity, particularly liver toxicity, remains one of the major reasons for the termination of drug candidates in the development pipeline as well as withdrawal or restrictions of marketed drugs. A screening-amenable alternative in vivo model such as zebrafish would, therefore, find immediate application in the early prediction of unacceptable organ toxicity. To identify highly upregulated genes as biomarkers of toxic responses in the zebrafish model, a set of well-characterized reference drugs that cause drug-induced liver injury (DILI) in the clinic were applied to zebrafish larvae and adults. Transcriptome microarray analysis was performed on whole larvae or dissected adult livers. Integration of data sets from different drug treatments at different stages identified common upregulated detoxification pathways. Within these were candidate biomarkers which recurred in multiple treatments. We prioritized 4 highly upregulated genes encoding enzymes acting in distinct phases of the drug metabolism pathway. Through promoter isolation and fosmid recombineering, eGFP reporter transgenic zebrafish lines were generated and evaluated for their response to DILI drugs. Three of the 4 generated reporter lines showed a dose and time-dependent induction in endodermal organs to reference drugs and an expanded drug set. In conclusion, through integrated transcriptomics and transgenic approaches, we have developed parallel independent zebrafish in vivo screening platforms able to predict organ toxicities of preclinical drugs. © The Author 2017. Published by Oxford University Press on behalf of the Society of Toxicology.
Genome-scale expression studies and comprehensive loss-of-function genetic screens have focused almost exclusively on the highest confidence candidate genes. Here, we describe a strategy for characterizing the lower confidence candidates identified by such approaches.
Raju, Nikku L; Gnanesh, Belaghihalli N; Lekha, Pazhamala; Jayashree, Balaji; Pande, Suresh; Hiremath, Pavana J; Byregowda, Munishamappa; Singh, Nagendra K; Varshney, Rajeev K
2010-03-11
Pigeonpea (Cajanus cajan (L.) Millsp) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. Availability of limited genomic resources, however, is a serious bottleneck to undertake molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to above mentioned biotic stresses. With an objective of enhancing genomic resources in pigeonpea, this study reports generation and analysis of comprehensive resource of FW- and SMD- responsive expressed sequence tags (ESTs). A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7') and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted into 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of 4,557 unigenes showed a significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (
2010-01-01
Background Pigeonpea (Cajanus cajan (L.) Millsp) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. Availability of limited genomic resources, however, is a serious bottleneck to undertake molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to above mentioned biotic stresses. With an objective of enhancing genomic resources in pigeonpea, this study reports generation and analysis of comprehensive resource of FW- and SMD- responsive expressed sequence tags (ESTs). Results A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7') and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted into 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of 4,557 unigenes showed a significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (≤ 1E-08). 
Functional categorization of the annotated unigene sequences showed that 153 (3.3%) genes were assigned to the cellular component category, 132 (2.8%) to biological process, and 132 (2.8%) to molecular function. Further, 19 genes were identified as differentially expressed between FW-responsive genotypes and 20 between SMD-responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in the public domain at the time of analysis, and a set of 5,085 unigenes was defined and used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained for 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across the legume and model plant species analysed as well as some putative pigeonpea-specific genes. 
Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding. PMID:20222972
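Note: the polymorphic information content (PIC) reported for the SSR markers above can be sketched in Python. This assumes the commonly used definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are allele frequencies at one marker; the abstract does not state which formula was applied, and the function name is hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for one marker from allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross
```

Under this definition a marker with two equally frequent alleles has PIC 0.375, and a monomorphic marker has PIC 0.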
Lilja, Heidi E; Soro, Aino; Ylitalo, Kati; Nuotio, Ilpo; Viikari, Jorma S A; Salomaa, Veikko; Vartiainen, Erkki; Taskinen, Marja-Riitta; Peltonen, Leena; Pajukanta, Päivi
2002-09-01
In patients with premature coronary heart disease, the most common lipoprotein abnormality is high-density lipoprotein (HDL) deficiency. To assess the genetic background of the low HDL-cholesterol trait, we performed a candidate gene study in 25 families with low HDL, collected from the genetically isolated population of Finland. We studied 21 genes encoding essential proteins involved in the HDL metabolism by genotyping intragenic and flanking markers for these genes. We found suggestive evidence for linkage in two candidate regions: Marker D1S2844, in the apolipoprotein A-II (APOA2) region, yielded a LOD score of 2.14 and marker D11S939 flanking the apolipoprotein A-I/C-III/A-IV gene cluster (APOA1C3A4) produced a LOD score of 1.69. Interestingly, we identified potential shared haplotypes in these two regions in a subset of low HDL families. These families also contributed to the obtained positive LOD scores, whereas the rest of the families produced negative LOD scores. None of the remaining candidate regions provided any evidence for linkage. Since only a limited number of loci were tested in this candidate gene study, these LOD scores suggest significant involvement of the APOA2 gene and the APOA1C3A4 gene cluster, or loci in their immediate vicinity, in the pathogenesis of low HDL.
Improving information retrieval in functional analysis.
Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A
2016-12-01
Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.
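Note: a minimal Python sketch of a singular enrichment test for one Gene Ontology term follows. It assumes a one-sided hypergeometric over-representation test, which is one common choice for SEA; the abstract does not say which statistics the compared methods use, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p(population, annotated, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the term of interest."""
    max_hits = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

A real analysis would repeat this for every term and correct for multiple testing, a step not shown here.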
Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.
Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, 
Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H
2010-12-24
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
FUN-L: gene prioritization for RNAi screens.
Lees, Jonathan G; Hériché, Jean-Karim; Morilla, Ian; Fernández, José M; Adler, Priit; Krallinger, Martin; Vilo, Jaak; Valencia, Alfonso; Ellenberg, Jan; Ranea, Juan A; Orengo, Christine
2015-06-15
Most biological processes remain only partially characterized with many components still to be identified. Given that a whole genome can usually not be tested in a functional assay, identifying the genes most likely to be of interest is of critical importance to avoid wasting resources. Given a set of known functionally related genes and using a state-of-the-art approach to data integration and mining, our Functional Lists (FUN-L) method provides a ranked list of candidate genes for testing. Validation of predictions from FUN-L with independent RNAi screens confirms that FUN-L-produced lists are enriched in genes with the expected phenotypes. In this article, we describe a website front end to FUN-L. The website is freely available to use at http://funl.org © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Xu, Jin; Spitale, Robert C.; Guan, Linna; Flynn, Ryan A.; Torre, Eduardo A.; Li, Rui; Raber, Inbar; Qu, Kun; Kern, Dale; Knaggs, Helen E.; Chang, Howard Y.; Chang, Anne Lynn S.
2016-01-01
While much is known about genes that promote aging, little is known about genes that protect against or prevent aging, particularly in human skin. The main objective of this study was to perform an unbiased, whole transcriptome search for genes that associate with intrinsic skin youthfulness. To accomplish this, healthy women (n = 122) of European descent, ages 18–89 years with Fitzpatrick skin type I/II were examined for facial skin aging parameters and clinical covariates, including smoking and ultraviolet exposure. Skin youthfulness was defined as the top 10% of individuals whose assessed skin aging features were most discrepant with their chronological ages. Skin biopsies from sun-protected inner arm were subjected to 3’-end sequencing for expression quantification, with results verified by quantitative reverse transcriptase-polymerase chain reaction. Unbiased clustering revealed gene expression signatures characteristic of older women with skin youthfulness (n = 12) compared to older women without skin youthfulness (n = 33), after accounting for gene expression changes associated with chronological age alone. Gene set analysis was performed using Genomica open-access software. This study identified a novel set of candidate skin youthfulness genes demonstrating differences between the SY and non-SY groups, including pleckstrin homology like domain family A member 1 (PHLDA1) (p = 2.4 × 10^-5), a follicle stem cell marker, and hyaluronan synthase 2-anti-sense 1 (HAS2-AS1) (p = 0.00105), a non-coding RNA that is part of the hyaluronan synthesis pathway. We show that immunologic gene sets are the most significantly altered in skin youthfulness (with the most significant gene set p = 2.4 × 10^-5), suggesting the immune system plays an important role in skin youthfulness, a finding that has not previously been recognized. 
These results are a valuable resource from which multiple future studies may be undertaken to better understand the mechanisms that promote skin youthfulness in humans. PMID:27829007
Lamellar ichthyosis maps to chromosome 14q11
DOE Office of Scientific and Technical Information (OSTI.GOV)
Russell, L.J.; Compton, J.G.; Bale, S.J.
1994-09-01
Lamellar ichthyosis (LI) is a serious skin disorder inherited as an autosomal recessive trait and characterized by large, brown plate-like scales covering the body. Skin involvement is apparent at birth, often as a collodion membrane. Scarring alopecia, ectropion, and secondary hypohidrosis are frequent. We used a panel of candidate genes that are expressed in the epidermis to study seven multiplex Caucasian families in the U.S. and six inbred (multiplex and simplex) families in Egypt. We find no recombination (Z = 9.11 at θ = 0) in either set of families with transglutaminase 1 (TGM1), the gene encoding the enzyme responsible for cross-linking proteins to the cell envelope in the upper-most layer of the epidermis. In addition, striking homozygosity is observed in the inbred families for markers neighboring TGM1, defining a 9.3 cM candidate region which is bounded by MYH7 and D14S275. This is the first report of linkage in LI and suggests that further study of the TGM1 gene may identify the underlying pathogenesis of this severe, disfiguring disorder. Linkage-based genetic counseling and prenatal diagnosis are now available for informative at-risk families.
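Note: the LOD score Z quoted above at recombination fraction θ = 0 can be illustrated with a minimal Python sketch. It assumes the idealized two-point case with phase-known, fully informative meioses, where Z(θ) = log10[θ^R (1 - θ)^(N - R) / 0.5^N] for R recombinants in N meioses. The reported value comes from pedigree likelihoods that this sketch does not reproduce, and the function name is hypothetical.

```python
from math import inf, log10


def lod_score(theta, recombinants, meioses):
    """Two-point LOD score for phase-known meioses at recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    non_recombinants = meioses - recombinants
    likelihood = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    if likelihood == 0.0:
        return -inf                      # e.g. theta = 0 but a recombinant was seen
    # Compare against free recombination (theta = 0.5).
    return log10(likelihood) - meioses * log10(0.5)
```

With no recombinants, each informative meiosis adds log10(2), about 0.30, to the score at θ = 0 under these assumptions.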
Retrieval of Enterobacteriaceae drug targets using singular value decomposition.
Silvério-Machado, Rita; Couto, Bráulio R G M; Dos Santos, Marcos A
2015-04-15
The identification of potential drug target proteins in bacteria is important in pharmaceutical research for the development of new antibiotics to combat bacterial agents that cause diseases. A new model that combines the singular value decomposition (SVD) technique with biological filters composed of a set of protein properties associated with bacterial drug targets and similarity to protein-coding essential genes of Escherichia coli (strain K12) has been created to predict potential antibiotic drug targets in the Enterobacteriaceae family. This model identified 99 potential drug target proteins in the studied family, which exhibit eight different functions and are protein-coding essential genes or similar to protein-coding essential genes of E. coli (strain K12), indicating that the disruption of the activities of these proteins is critical for cells. Proteins from bacteria with described drug resistance were found among the retrieved candidates. These candidates have no similarity to the human proteome, therefore exhibiting the advantage of causing no adverse effects or at least no known adverse effects on humans. Contact: rita_silverio@hotmail.com. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Meta-analysis of shared genetic architecture across ten pediatric autoimmune diseases
Li, Yun R; Li, Jin; Zhao, Sihai D; Bradfield, Jonathan P; Mentch, Frank D; Maggadottir, S Melkorka; Hou, Cuiping; Abrams, Debra J; Chang, Diana; Gao, Feng; Guo, Yiran; Wei, Zhi; Connolly, John J; Cardinale, Christopher J; Bakay, Marina; Glessner, Joseph T; Li, Dong; Kao, Charlly; Thomas, Kelly A; Qiu, Haijun; Chiavacci, Rosetta M; Kim, Cecilia E; Wang, Fengxiang; Snyder, James; Richie, Marylyn D; Flatø, Berit; Førre, Øystein; Denson, Lee A; Thompson, Susan D; Becker, Mara L; Guthery, Stephen L; Latiano, Anna; Perez, Elena; Resnick, Elena; Russell, Richard K; Wilson, David C; Silverberg, Mark S; Annese, Vito; Lie, Benedicte A; Punaro, Marilynn; Dubinsky, Marla C; Monos, Dimitri S; Strisciuglio, Caterina; Staiano, Annamaria; Miele, Erasmo; Kugathasan, Subra; Ellis, Justine A; Munro, Jane E; Sullivan, Kathleen E; Wise, Carol A; Chapel, Helen; Cunningham-Rundles, Charlotte; Grant, Struan F A; Orange, Jordan S; Sleiman, Patrick M A; Behrens, Edward M; Griffiths, Anne M; Satsangi, Jack; Finkel, Terri H; Keinan, Alon; Prak, Eline T Luning; Polychronakos, Constantin; Baldassano, Robert N; Li, Hongzhe; Keating, Brendan J; Hakonarson, Hakon
2016-01-01
Genome-wide association studies (GWASs) have identified hundreds of susceptibility genes, including shared associations across clinically distinct autoimmune diseases. We performed an inverse χ2 meta-analysis across ten pediatric-age-of-onset autoimmune diseases (pAIDs) in a case-control study including more than 6,035 cases and 10,718 shared population-based controls. We identified 27 genome-wide significant loci associated with one or more pAIDs, mapping to in silico–replicated autoimmune-associated genes (including IL2RA) and new candidate loci with established immunoregulatory functions such as ADGRL2, TENM3, ANKRD30A, ADCY7 and CD40LG. The pAID-associated single-nucleotide polymorphisms (SNPs) were functionally enriched for deoxyribonuclease (DNase)-hypersensitivity sites, expression quantitative trait loci (eQTLs), microRNA (miRNA)-binding sites and coding variants. We also identified biologically correlated, pAID-associated candidate gene sets on the basis of immune cell expression profiling and found evidence of genetic sharing. Network and protein-interaction analyses demonstrated converging roles for the signaling pathways of type 1, 2 and 17 helper T cells (TH1, TH2 and TH17), JAK-STAT, interferon and interleukin in multiple autoimmune diseases. PMID:26301688
Viveka Thangaraj, Soundara; Periasamy, Jayaprakash; Bhaskar Rao, Divya; Barnabas, Georgina D.; Raghavan, Swetha; Ganesan, Kumaresan
2013-01-01
Genomic aberrations are common in cancers and the long arm of chromosome 1 is known for its frequent amplifications in breast cancer. However, the key candidate genes of 1q and their contribution to breast cancer pathogenesis remain unexplored. We have analyzed the gene expression profiles of 1635 breast tumor samples using a meta-analysis-based approach and identified clinically significant candidates from chromosome 1q. Seven candidate genes including exonuclease 1 (EXO1) are consistently overexpressed in breast tumors, specifically in high-grade and aggressive breast tumors with poor clinical outcome. We derived an EXO1 co-expression module from the mRNA profiles of breast tumors which comprises 1q candidate genes and their co-expressed genes. By integrative functional genomics investigation, we identified the involvement of EGFR, RAS, PI3K/AKT, MYC, and E2F signaling in the regulation of these selected 1q genes in breast tumors and breast cancer cell lines. Expression of the EXO1 module was found to be indicative of elevated cell proliferation, genomic instability, activated RAS/AKT/MYC/E2F1 signaling pathways and loss of p53 activity in breast tumors. mRNA–drug connectivity analysis indicates inhibition of RAS/PI3K as a possible targeted therapeutic approach for patients with an activated EXO1 module in breast tumors. Thus, we identified seven 1q candidate genes strongly associated with the poor survival of breast cancer patients and identified the possibility of targeting them with EGFR/RAS/PI3K inhibitors. PMID:24147022
Porto, Diogo Denardi; Bruneau, Maryline; Perini, Pâmela; Anzanello, Rafael; Renou, Jean-Pierre; dos Santos, Henrique Pessoa; Fialho, Flávio Bello; Revers, Luís Fernando
2015-05-01
Apple production depends on the fulfilment of a chilling requirement for bud dormancy release. Insufficient winter chilling results in irregular and suboptimal bud break in the spring, with negative impacts on apple yield. Trees from apple cultivars with contrasting chilling requirements for bud break were used to investigate the expression of the entire set of apple genes in response to chilling accumulation in the field and controlled conditions. Total RNA was analysed on the AryANE v.1.0 oligonucleotide microarray chip representing 57,000 apple genes. The data were tested for functional enrichment, and differential expression was confirmed by real-time PCR. The largest number of differentially expressed genes was found in samples treated with cold temperatures. Cold exposure mostly repressed expression of transcripts related to photosynthesis, and long-term cold exposure repressed flavonoid biosynthesis genes. Among the differentially expressed selected candidates, we identified genes whose annotations were related to the circadian clock, hormonal signalling, regulation of growth, and flower development. Two genes, annotated as FLOWERING LOCUS C-like and MADS AFFECTING FLOWERING, showed strong differential expression in several comparisons. One of these two genes was upregulated in most comparisons involving dormancy release, and this gene's chromosomal position co-localized with the confidence interval of a major quantitative trait locus for the timing of bud break. These results indicate that photosynthesis and auxin transport are major regulatory nodes of apple dormancy and unveil strong candidates for the control of bud dormancy. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Validating internal controls for quantitative plant gene expression studies
Brunner, Amy M; Yakovlev, Igor A; Strauss, Steven H
2004-01-01
Background Real-time reverse transcription PCR (RT-PCR) has greatly improved the ease and sensitivity of quantitative gene expression studies. However, accurate measurement of gene expression with this method relies on the choice of a valid reference for data normalization. Studies rarely verify that gene expression levels for reference genes are adequately consistent among the samples used, nor compare alternative genes to assess which are most reliable for the experimental conditions analyzed. Results Using real-time RT-PCR to study the expression of 10 poplar (genus Populus) housekeeping genes, we demonstrate a simple method for determining the degree of stability of gene expression over a set of experimental conditions. Based on a traditional method for analyzing the stability of varieties in plant breeding, it defines measures of gene expression stability from analysis of variance (ANOVA) and linear regression. We found that the potential internal control genes differed widely in their expression stability over the different tissues, developmental stages and environmental conditions studied. Conclusion Our results support that quantitative comparisons of candidate reference genes are an important part of real-time RT-PCR studies that seek to precisely evaluate variation in gene expression. The method we demonstrated facilitates statistical and graphical evaluation of gene expression stability. Selection of the best reference gene for a given set of experimental conditions should enable detection of biologically significant changes in gene expression that are too small to be revealed by less precise methods, or when highly variable reference genes are unknowingly used in real-time RT-PCR experiments. PMID:15317655
Identification of genes from the Treacher Collins candidate region
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dixon, M.; Dixon, J.; Edwards, S.
Treacher Collins syndrome (TCOF1) is an autosomal dominant disorder of craniofacial development. The TCOF1 locus has previously been mapped to chromosome 5q32-33. The candidate gene region has been defined as being between two flanking markers, ribosomal protein S14 (RPS14) and Annexin 6 (ANX6), by analyzing recombination events in affected individuals. It is estimated that the distance between these flanking markers is 500 kb by three separate analysis methods: (1) radiation hybrid mapping; (2) genetic linkage; and (3) YAC contig analysis. A cosmid contig which spans the candidate gene region for TCOF1 has been constructed by screening the Los Alamos National Laboratory flow-sorted chromosome 5 cosmid library. Cosmids were obtained by using a combination of probes generated from YAC end clones, Alu-PCR fragments from YACs, and asymmetric PCR fragments from both T7 and T3 cosmid ends. Exon amplification, the selection of genomic coding sequences based upon the presence of functional splice acceptor and donor sites, was used to identify potential exon sequences. Sequences found to be conserved between species were then used to screen cDNA libraries in order to identify candidate genes. To date, four different cDNAs have been isolated from this region and are being analyzed as potential candidate genes for TCOF1. These include the genes encoding plasma glutathione peroxidase (GPX3), heparin sulfate sulfotransferase (HSST), a gene with homology to the ETS family of proteins and one which shows no homology to any known genes. Work is also in progress to identify and characterize additional cDNAs from the candidate gene region.
Mapping a candidate gene (MdMYB10) for red flesh and foliage colour in apple
Chagné, David; Carlisle, Charmaine M; Blond, Céline; Volz, Richard K; Whitworth, Claire J; Oraguzie, Nnadozie C; Crowhurst, Ross N; Allan, Andrew C; Espley, Richard V; Hellens, Roger P; Gardiner, Susan E
2007-01-01
Background Integrating plant genomics and classical breeding is a challenge for both plant breeders and molecular biologists. Marker-assisted selection (MAS) is a tool that can be used to accelerate the development of novel apple varieties such as cultivars that have fruit with anthocyanin through to the core. In addition, determining the inheritance of novel alleles, such as the one responsible for red flesh, adds to our understanding of allelic variation. Our goal was to map candidate anthocyanin biosynthetic and regulatory genes in a population segregating for the red flesh phenotypes. Results We have identified the Rni locus, a major genetic determinant of the red foliage and red colour in the core of apple fruit. In a population segregating for the red flesh and foliage phenotype we have determined the inheritance of the Rni locus and DNA polymorphisms of candidate anthocyanin biosynthetic and regulatory genes. Simple Sequence Repeats (SSRs) and Single Nucleotide Polymorphisms (SNPs) in the candidate genes were also located on an apple genetic map. We have shown that the MdMYB10 gene co-segregates with the Rni locus and is on Linkage Group (LG) 09 of the apple genome. Conclusion We have performed candidate gene mapping in a fruit tree crop and have provided genetic evidence that red colouration in the fruit core as well as red foliage are both controlled by a single locus named Rni. We have shown that the transcription factor MdMYB10 may be the gene underlying Rni as there were no recombinants between the marker for this gene and the red phenotype in a population of 516 individuals. Associating markers derived from candidate genes with a desirable phenotypic trait has demonstrated the application of genomic tools in a breeding programme of a horticultural crop species. PMID:17608951
Antennal transcriptome analysis of the piercing moth Oraesia emarginata (Lepidoptera: Noctuidae)
Feng, Bo; Guo, Qianshuang; Zheng, Kaidi; Qin, Yuanxia; Du, Yongjun
2017-01-01
The piercing fruit moth Oraesia emarginata is an economically significant pest; however, our understanding of its olfactory mechanisms in infestation is limited. The present study conducted antennal transcriptome analysis of olfactory genes using real-time quantitative reverse transcription PCR analysis (RT-qPCR). We identified a total of 104 candidate chemosensory genes from several gene families, including 35 olfactory receptors (ORs), 41 odorant-binding proteins (OBPs), 20 chemosensory proteins, 6 ionotropic receptors, and 2 sensory neuron membrane proteins. Seven candidate pheromone receptors (PRs) and 3 candidate pheromone-binding proteins (PBPs) for sex pheromone recognition were found. OemaOR29 and OemaPBP1 had the highest fragments per kb per million fragments (FPKM) values among all ORs and OBPs, respectively. Eighteen olfactory genes were upregulated in females, including 5 candidate PRs, and 20 olfactory genes were upregulated in males, including 2 candidate PRs (OemaOR29 and 4) and 2 PBPs (OemaPBP1 and 3). These genes may have roles in mediating sex-specific behaviors. Most candidate olfactory genes of sex pheromone recognition (except OemaOR29 and OemaPBP3) in O. emarginata were not clustered with those of studied noctuid species (type I pheromone). In addition, OemaOR29 belonged to cluster PRIII, which comprises proteins that recognize type II pheromones instead of type I pheromones. The structure and function of olfactory genes that encode sex pheromones in O. emarginata might thus differ from those of other studied noctuids. The findings of the present study may help explain the molecular mechanism underlying olfaction and the evolution of olfactory genes encoding sex pheromones in O. emarginata. PMID:28614384
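Note: the FPKM values used above to rank ORs and OBPs can be sketched in Python with the standard definition: fragments mapped to a transcript, divided by transcript length in kilobases and by total mapped fragments in millions. The function name and the example numbers in the usage note are hypothetical and are not taken from the study.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (fragments -> millions)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)
```

For example, 100 fragments on a 1,000 bp transcript in a library of 1,000,000 mapped fragments gives an FPKM of 100.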
Thanseem, Ismail; Anitha, Ayyappan; Nakamura, Kazuhiko; Suda, Shiro; Iwata, Keiko; Matsuzaki, Hideo; Ohtsubo, Masafumi; Ueki, Takatoshi; Katayama, Taiichi; Iwata, Yasuhide; Suzuki, Katsuaki; Minoshima, Shinsei; Mori, Norio
2012-03-01
Profound changes in gene expression can result from abnormalities in the concentrations of sequence-specific transcription factors like specificity protein 1 (Sp1). Specificity protein 1 binding sites have been reported in the promoter regions of several genes implicated in autism. We hypothesize that dysfunction of Sp1 could affect the expression of multiple autism candidate genes, contributing to the heterogeneity of autism. We assessed any alterations in the expression of Sp1 and that of autism candidate genes in the postmortem brain (anterior cingulate gyrus [ACG], motor cortex, and thalamus) of autism patients (n = 8) compared with healthy control subjects (n = 13). Alterations in the expression of candidate genes upon Sp1/DNA binding inhibition with mithramycin and Sp1 silencing by RNAi were studied in SK-N-SH neuronal cells. We observed elevated expression of Sp1 in ACG of autism patients (p = .010). We also observed altered expression of several autism candidate genes. GABRB3, RELN, and HTR2A showed reduced expression, whereas CD38, ITGB3, MAOA, MECP2, OXTR, and PTEN showed elevated expression in autism. In SK-N-SH cells, OXTR, PTEN, and RELN showed reduced expression upon Sp1/DNA binding inhibition and Sp1 silencing. The RNA integrity number was not available for any of the samples. Transcription factor Sp1 is dysfunctional in the ACG of autistic brain. Consequently, the expression of potential autism candidate genes regulated by Sp1, especially OXTR and PTEN, could be affected. The diverse downstream pathways mediated by the Sp1-regulated genes, along with the environmental and intracellular signal-related regulation of Sp1, could explain the complex phenotypes associated with autism.
EBF factors drive expression of multiple classes of target genes governing neuronal development.
Green, Yangsook S; Vetter, Monica L
2011-04-30
Early B cell factor (EBF) family members are transcription factors known to have important roles in several aspects of vertebrate neurogenesis, including commitment, migration and differentiation. Knowledge of how EBF family members contribute to neurogenesis is limited by a lack of detailed understanding of genes that are transcriptionally regulated by these factors. We performed a microarray screen in Xenopus animal caps to search for targets of EBF transcriptional activity, and identified candidate targets with multiple roles, including transcription factors of several classes. We determined that, among the most upregulated candidate genes with expected neuronal functions, most require EBF activity for some or all of their expression, and most have overlapping expression with ebf genes. We also found that the candidate target genes that had the most strongly overlapping expression patterns with ebf genes were predicted to be direct transcriptional targets of EBF transcriptional activity. The identification of candidate targets that are transcription factor genes, including nscl-1, emx1 and aml1, improves our understanding of how EBF proteins participate in the hierarchy of transcription control during neuronal development, and suggests novel mechanisms by which EBF activity promotes migration and differentiation. Other candidate targets, including pcdh8 and kcnk5, expand our knowledge of the types of terminal differentiated neuronal functions that EBF proteins regulate.
Development of New Candidate Gene and EST-Based Molecular Markers for Gossypium Species
Buyyarapu, Ramesh; Kantety, Ramesh V.; Yu, John Z.; Saha, Sukumar; Sharma, Govind C.
2011-01-01
New sources of molecular markers accelerate efforts to improve cotton fiber traits and aid in developing high-density integrated genetic maps. We developed new markers based on candidate genes and G. arboreum EST sequences that were used for polymorphism detection followed by genetic and physical mapping. Nineteen gene-based markers were surveyed for polymorphism detection in 26 Gossypium species. Cluster analysis generated a phylogenetic tree with four major sub-clusters for 23 species while three species branched out individually. The CAP method enhanced the rate of polymorphism of candidate gene-based markers between G. hirsutum and G. barbadense. Two hundred A-genome based SSR markers were designed after datamining of G. arboreum EST sequences (Mississippi Gossypium arboreum EST-SSR: MGAES). Over 70% of MGAES markers successfully produced amplicons, while 65 of them demonstrated polymorphism between the parents of the G. hirsutum and G. barbadense RIL population and formed 14 linkage groups. Chromosomal localization of both candidate gene-based and MGAES markers was assisted by euploid and hypoaneuploid CS-B analysis. Gene-based and MGAES markers were highly informative as they were designed from candidate genes and fiber transcriptome with a potential to be integrated into the existing cotton genetic and physical maps. PMID:22315588
Lempereur, Laetitia; Larcombe, Stephen D; Durrani, Zeeshan; Karagenc, Tulin; Bilgic, Huseyin Bilgin; Bakirci, Serkan; Hacilarlioglu, Selin; Kinnaird, Jane; Thompson, Joanne; Weir, William; Shiels, Brian
2017-06-05
Vector-borne apicomplexan parasites are a major cause of mortality and morbidity to humans and livestock globally. The most important disease syndromes caused by these parasites are malaria, babesiosis and theileriosis. Strategies for control often target parasite stages in the mammalian host that cause disease, but this can result in reservoir infections that promote pathogen transmission and generate economic loss. Optimal control strategies should protect against clinical disease, block transmission and be applicable across related genera of parasites. We have used bioinformatics and transcriptomics to screen for transmission-blocking candidate antigens in the tick-borne apicomplexan parasite, Theileria annulata. A number of candidate antigen genes were identified which encoded amino acid domains that are conserved across vector-borne Apicomplexa (Babesia, Plasmodium and Theileria), including the Pfs48/45 6-cys domain and a novel cysteine-rich domain. Expression profiling confirmed that selected candidate genes are expressed by life cycle stages within infected ticks. Additionally, putative B cell epitopes were identified in the T. annulata gene sequences encoding the 6-cys and cysteine rich domains, in a gene encoding a putative papain-family cysteine peptidase, with similarity to the Plasmodium SERA family, and the gene encoding the T. annulata major merozoite/piroplasm surface antigen, Tams1. Candidate genes were identified that encode proteins with similarity to known transmission blocking candidates in related parasites, while one is a novel candidate conserved across vector-borne apicomplexans and has a potential role in the sexual phase of the life cycle. The results indicate that a 'One Health' approach could be utilised to develop a transmission-blocking strategy effective against vector-borne apicomplexan parasites of animals and humans.
Telonis-Scott, Marina; Sgrò, Carla M.; Hoffmann, Ary A.; Griffin, Philippa C.
2016-01-01
Repeated attempts to map the genomic basis of complex traits often yield different outcomes because of the influence of genetic background, gene-by-environment interactions, and/or statistical limitations. However, where repeatability is low at the level of individual genes, overlap often occurs in gene ontology categories, genetic pathways, and interaction networks. Here we report on the genomic overlap for natural desiccation resistance from a Pool-genome-wide association study experiment and a selection experiment in flies collected from the same region in southeastern Australia in different years. We identified over 600 single nucleotide polymorphisms associated with desiccation resistance in flies derived from almost 1,000 wild-caught genotypes, a similar number of loci to that observed in our previous genomic study of selected lines, demonstrating the genetic complexity of this ecologically important trait. By harnessing the power of cross-study comparison, we narrowed the candidates from almost 400 genes in each study to a core set of 45 genes, enriched for stimulus, stress, and defense responses. In addition to gene-level overlap, there was higher order congruence at the network and functional levels, suggesting genetic redundancy in key stress sensing, stress response, immunity, signaling, and gene expression pathways. We also identified variants linked to different molecular aspects of desiccation physiology previously verified from functional experiments. Our approach provides insight into the genomic basis of a complex and ecologically important trait and predicts candidate genetic pathways to explore in multiple genetic backgrounds and related species within a functional framework. PMID:26733490
Ontology based molecular signatures for immune cell types via gene expression analysis
2013-01-01
Background New technologies are focusing on characterizing cell types to better understand their heterogeneity. With large volumes of cellular data being generated, innovative methods are needed to structure the resulting data analyses. Here, we describe an ‘Ontologically BAsed Molecular Signature’ (OBAMS) method that identifies novel cellular biomarkers and infers biological functions as characteristics of particular cell types. This method finds molecular signatures for immune cell types based on mapping biological samples to the Cell Ontology (CL) and navigating the space of all possible pairwise comparisons between cell types to find genes whose expression is core to a particular cell type’s identity. Results We illustrate this ontological approach by evaluating expression data available from the Immunological Genome project (IGP) to identify unique biomarkers of mature B cell subtypes. We find that using OBAMS, candidate biomarkers can be identified at every stratum of cellular identity from broad classifications to very granular. Furthermore, we show that Gene Ontology can be used to cluster cell types by shared biological processes in order to find candidate genes responsible for somatic hypermutation in germinal center B cells. Moreover, through in silico experiments based on this approach, we have identified gene sets that represent genes overexpressed in germinal center B cells and genes uniquely expressed in these B cells compared to other B cell types. Conclusions This work demonstrates the utility of incorporating structured ontological knowledge into biological data analysis – providing a new method for defining novel biomarkers and providing an opportunity for new biological insights. PMID:24004649
Takeda, Haruna; Rust, Alistair G.; Ward, Jerrold M.; Yew, Christopher Chin Kuan; Jenkins, Nancy A.; Copeland, Neal G.
2016-01-01
Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4+/− mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC. PMID:27006499
A Frameshift Mutation in KIT is Associated with White Spotting in the Arabian Camel
Holl, Heather; Isaza, Ramiro; Mohamoud, Yasmin; Ahmed, Ayeda; Almathen, Faisal; Youcef, Cherifi; Gaouar, Semir; Antczak, Douglas F.; Brooks, Samantha
2017-01-01
While the typical Arabian camel is characterized by a single colored coat, there are rare populations with white spotting patterns. White spotting coat patterns are found in virtually all domesticated species, but are rare in wild species. Theories suggest that white spotting is linked to the domestication process, and is occasionally associated with health disorders. Though mutations have been found in a diverse array of species, fewer than 30 genes have been associated with spotting patterns, thus providing a key set of candidate genes for the Arabian camel. We obtained 26 spotted camels and 24 solid controls for candidate gene analysis. One spotted and eight solid camels were whole genome sequenced as part of a separate project. The spotted camel was heterozygous for a frameshift deletion in KIT (c.1842delG, named KITW1 for White spotting 1), whereas all other camels were wild-type (KIT+/KIT+). No additional mutations unique to the spotted camel were detected in the EDNRB, EDN3, SOX10, KITLG, PDGFRA, MITF, and PAX3 candidate white spotting genes. Sanger sequencing of the study population identified an additional five KITW1/KIT+ spotted camels. The frameshift results in a premature stop codon five amino acids downstream, thus terminating KIT at the tyrosine kinase domain. An additional 13 spotted camels tested KIT+/KIT+, but due to phenotypic differences when compared to the KITW1/KIT+ camels, they likely represent an independent mutation. Our study suggests that there are at least two causes of white spotting in the Arabian camel, the newly described KITW1 allele and an uncharacterized mutation. PMID:28282952
Oiestad, A J; Martin, J M; Cook, J; Varella, A C; Giroux, M J
2017-07-01
The wheat stem sawfly (WSS) is an economically important pest of wheat in the Northern Great Plains. The primary means of WSS control is resistance associated with a single quantitative trait locus (QTL), which controls most stem solidness variation. The goal of this study was to identify stem solidness candidate genes via RNA-seq. This study made use of 28 single nucleotide polymorphism (SNP) markers derived from expressed sequence tags (ESTs) linked to the QTL and contained within a 5.13 cM region. Allele specific expression of EST markers was examined in stem tissue for solid and hollow-stemmed pairs of two spring wheat near isogenic lines (NILs) differing for the QTL. Of the 28 ESTs, 13 were located within annotated genes and 10 had detectable stem expression. Annotated genes corresponding to four of the ESTs were differentially expressed between solid and hollow-stemmed NILs and represent possible stem solidness gene candidates. Further examination of the 5.13 cM region containing the 28 EST markers identified 260 annotated genes. Twenty of the 260 linked genes were up-regulated in hollow NIL stems, while only seven genes were up-regulated in solid NIL stems. A methyltransferase within the region of interest was identified as a candidate based on differential expression between solid and hollow-stemmed NILs and putative function. Further study of these candidate genes may lead to the identification of the gene(s) controlling stem solidness and an increased ability to select for wheat stem solidness and manage WSS. Copyright © 2017 Crop Science Society of America.
confFuse: High-Confidence Fusion Gene Detection across Tumor Entities.
Huang, Zhiqin; Jones, David T W; Wu, Yonghe; Lichter, Peter; Zapatka, Marc
2017-01-01
Background: Fusion genes play an important role in the tumorigenesis of many cancers. Next-generation sequencing (NGS) technologies have been successfully applied in fusion gene detection for the last several years, and a number of NGS-based tools have been developed for identifying fusion genes during this period. Most fusion gene detection tools based on RNA-seq data report a large number of candidates (mostly false positives), making it hard to prioritize candidates for experimental validation and further analysis. Selection of reliable fusion genes for downstream analysis becomes very important in cancer research. We therefore developed confFuse, a scoring algorithm to reliably select high-confidence fusion genes which are likely to be biologically relevant. Results: confFuse takes multiple parameters into account in order to assign each fusion candidate a confidence score, of which score ≥8 indicates high-confidence fusion gene predictions. These parameters were manually curated based on our experience and on certain structural motifs of fusion genes. Compared with alternative tools, based on 96 published RNA-seq samples from different tumor entities, our method can significantly reduce the number of fusion candidates (301 high-confidence from 8,083 total predicted fusion genes) and keep high detection accuracy (recovery rate 85.7%). Validation of 18 novel, high-confidence fusions detected in three breast tumor samples resulted in a 100% validation rate. Conclusions: confFuse is a novel downstream filtering method that allows selection of highly reliable fusion gene candidates for further downstream analysis and experimental validations. confFuse is available at https://github.com/Zhiqin-HUANG/confFuse.
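The filtering step described in the confFuse abstract can be pictured with a minimal sketch. It assumes each fusion candidate already carries a numeric confidence score; the record layout, the gene names and the function name are hypothetical and are not taken from the confFuse code base. Only the cutoff of 8 and the counts 301 and 8,083 come from the abstract.

```python
# Cutoff stated in the abstract: a score >= 8 marks a high-confidence fusion.
HIGH_CONFIDENCE_THRESHOLD = 8


def select_high_confidence(candidates, threshold=HIGH_CONFIDENCE_THRESHOLD):
    """Keep only candidates whose score reaches the threshold."""
    return [c for c in candidates if c["score"] >= threshold]


# Hypothetical, pre-scored candidates (illustrative names and scores only).
candidates = [
    {"fusion": "GENE_A--GENE_B", "score": 10},
    {"fusion": "GENE_C--GENE_D", "score": 8},
    {"fusion": "GENE_E--GENE_F", "score": 5},
    {"fusion": "GENE_G--GENE_H", "score": 2},
]

kept = select_high_confidence(candidates)

# Fraction of predictions retained in the benchmark reported in the abstract:
# 301 high-confidence fusions out of 8,083 predicted fusion genes.
retained_fraction = 301 / 8083
```

With the figures from the abstract, fewer than 4% of the predicted fusions pass the filter, which is the reduction the authors highlight.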
Systems Biology-Based Identification of Mycobacterium tuberculosis Persistence Genes in Mouse Lungs
Dutta, Noton K.; Bandyopadhyay, Nirmalya; Veeramani, Balaji; Lamichhane, Gyanu; Karakousis, Petros C.; Bader, Joel S.
2014-01-01
Identifying Mycobacterium tuberculosis persistence genes is important for developing novel drugs to shorten the duration of tuberculosis (TB) treatment. We developed computational algorithms that predict M. tuberculosis genes required for long-term survival in mouse lungs. As the input, we used high-throughput M. tuberculosis mutant library screen data, mycobacterial global transcriptional profiles in mice and macrophages, and functional interaction networks. We selected 57 unique, genetically defined mutants (18 previously tested and 39 untested) to assess the predictive power of this approach in the murine model of TB infection. We observed a 6-fold enrichment in the predicted set of M. tuberculosis genes required for persistence in mouse lungs relative to randomly selected mutant pools. Our results also allowed us to reclassify several genes as required for M. tuberculosis persistence in vivo. Finally, the new results implicated additional high-priority candidate genes for testing. Experimental validation of computational predictions demonstrates the power of this systems biology approach for elucidating M. tuberculosis persistence genes. PMID:24549847
A Systematic Analysis of Candidate Genes Associated with Nicotine Addiction
Liu, Meng; Li, Xia; Fan, Rui; Liu, Xinhua; Wang, Ju
2015-01-01
Nicotine, as the major psychoactive component of tobacco, has broad physiological effects within the central nervous system, but our understanding of the molecular mechanism underlying its neuronal effects remains incomplete. In this study, we performed a systematic analysis on a set of nicotine addiction-related genes (NAGenes) to explore their characteristics at network levels. We found that NAGenes tended to have a more moderate degree and weaker clustering coefficient and to be less central in the network compared to alcohol addiction-related genes or cancer genes. Further, clustering of these genes resulted in six clusters with themes in synaptic transmission, signal transduction, metabolic process, and apoptosis, which provided an intuitive view of the major molecular functions of the genes. Moreover, functional enrichment analysis revealed that neurodevelopment, neurotransmission activity, and metabolism related biological processes were involved in nicotine addiction. In summary, by analyzing the overall characteristics of the nicotine addiction related genes, this study provided valuable information for understanding the molecular mechanisms underlying nicotine addiction. PMID:26097843
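The two network measures named in the nicotine abstract, node degree and clustering coefficient, can be illustrated on a toy graph. This is a minimal sketch using the standard definition of the local clustering coefficient; the four-node network is hypothetical and unrelated to the study's interaction data.

```python
from itertools import combinations


def degree(graph, node):
    """Number of direct neighbours of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs that are themselves connected."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; reported as 0 here
    links = sum(1 for a, b in combinations(sorted(neighbors), 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected network stored as adjacency sets.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

for gene in sorted(toy_network):
    print(gene, degree(toy_network, gene), clustering_coefficient(toy_network, gene))
```

Node A has the highest degree but a low clustering coefficient because only one of its three neighbour pairs is linked, which shows why the two measures are reported separately.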
Li, Yongsheng; Sahni, Nidhi; Yi, Song
2016-11-29
Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.
Analyzing gene perturbation screens with nested effects models in R and bioconductor.
Fröhlich, Holger; Beissbarth, Tim; Tresch, Achim; Kostka, Dennis; Jacob, Juby; Spang, Rainer; Markowetz, F
2008-11-01
Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, an open-source software to efficiently infer NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state-of-the-art in NEMs. Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.
Recent progress in the genetics of spontaneously hypertensive rats.
Pravenec, M; Křen, V; Landa, V; Mlejnek, P; Musilová, A; Šilhavý, J; Šimáková, M; Zídek, V
2014-01-01
The spontaneously hypertensive rat (SHR) is the most widely used animal model of essential hypertension and accompanying metabolic disturbances. Recent advances in sequencing of genomes of BN-Lx and SHR progenitors of the BXH/HXB recombinant inbred (RI) strains as well as accumulation of multiple data sets of intermediary phenotypes in the RI strains, including mRNA and microRNA abundance, quantitative metabolomics, proteomics, methylomics or histone modifications, will make it possible to systematically search for genetic variants involved in regulation of gene expression and in the etiology of complex pathophysiological traits. New advances in manipulation of the rat genome, including efficient transgenesis and gene targeting, will enable in vivo functional analyses of selected candidate genes to identify QTL at the molecular level or to provide insight into mechanisms whereby targeted genes affect pathophysiological traits in the SHR.
Genome-wide association study for host response to bovine leukemia virus in Holstein cows.
Brym, P; Bojarojć-Nosowicz, B; Oleński, K; Hering, D M; Ruść, A; Kaczmarczyk, E; Kamiński, S
2016-07-01
The mechanisms of leukemogenesis induced by bovine leukemia virus (BLV) and the processes underlying the phenomenon of differential host response to BLV infection still remain poorly understood. The aim of the study was to screen the entire cattle genome to identify markers and candidate genes that might be involved in host response to bovine leukemia virus infection. A genome-wide association study was performed using Holstein cows naturally infected by BLV. A data set included 43 cows (BLV positive) and 30 cows (BLV negative) genotyped for 54,609 SNP markers (Illumina Bovine SNP50 BeadChip). The BLV status of cows was determined by serum ELISA, nested-PCR and hematological counts. Linear Regression Analysis with a False Discovery Rate and kinship matrix (computed on the autosomal SNPs) was calculated to find out which SNP markers significantly differentiate BLV-positive and BLV-negative cows. Nine markers reached genome-wide significance. The most significant SNPs were located on chromosomes 23 (rs41583098), 3 (rs109405425, rs110785500) and 8 (rs43564499) in close vicinity of a patatin-like phospholipase domain containing 1 (PNPLA1); adaptor-related protein complex 4, beta 1 subunit (AP4B1); tripartite motif-containing 45 (TRIM45) and cell division cycle associated 2 (CDCA2) genes, respectively. Furthermore, a list of 41 candidate genes was composed based on their proximity to significant markers (within a distance of ca. 1 Mb) and functional involvement in processes potentially underlying BLV-induced pathogenesis. In conclusion, it was demonstrated that host response to BLV infection involves nine sub-regions of the cattle genome (represented by 9 SNP markers), containing many genes which, based on the literature, could be involved in enzootic bovine leukemia progression. A new group of promising candidate genes associated with the host response to BLV infection was identified and could therefore be a target for future studies. The functions of candidate genes surrounding significant SNP markers imply that there is no single regulatory process that is solely targeted by BLV infection, but rather the network of interrelated pathways is deregulated, leading to the disruption of the control of B-cell proliferation and programmed cell death. Copyright © 2016 Elsevier B.V. All rights reserved.
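The bovine leukemia virus abstract mentions a False Discovery Rate correction without naming the procedure. The sketch below shows one common choice, the Benjamini-Hochberg step-up procedure, purely as an assumption about how such a correction might work; the p-values are made up and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values for five hypothetical SNP markers.
example_p_values = [0.6, 0.001, 0.039, 0.008, 0.035]
example_rejections = benjamini_hochberg(example_p_values, alpha=0.05)
```

In this example the marker with p = 0.035 misses its own rank threshold (0.03) but is still rejected, because a larger p-value (0.039) passes its threshold (0.04). That step-up behaviour is what separates this procedure from a fixed per-test cutoff.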
Leigh Hawkins; Marilyn Warburton; Juliet Tang; John Tomashek; Dafne Alves Oliveira; Oluwaseun Ogunola; J. Smith; W. Williams
2018-01-01
Many projects have identified candidate genes for resistance to aflatoxin accumulation or Aspergillus flavus infection and growth in maize using genetic mapping, genomics, transcriptomics and/or proteomics studies. However, only a small percentage of these candidates have been validated in field conditions, and their relative contribution to...
Changes in gene expression with sleep.
Thimgan, Matthew S; Duntley, Stephen P; Shaw, Paul J
2011-10-15
There is general agreement within the sleep community and among public health officials of the need for an accessible biomarker of sleepiness. As the foregoing discussions emphasize, however, it may be more difficult to reach consensus on how to define such a biomarker than to identify candidate molecules that can be then evaluated to determine if they might be useful to solve a variety of real-world problems related to insufficient sleep. With that in mind, a goal of our laboratories has been to develop a rational strategy to expedite the identification of candidate biomarkers. We began with the assumption that since both the genetic and environmental context of a gene can influence its behavior, an effective test of sleep loss will likely be composed of a panel of multiple biomarkers. That is, we believe that it is premature to exclude a candidate analyte simply because it might also be modulated in response to other conditions (e.g., illness, metabolism, sympathetic tone, etc.). Our next assumption was that an easily accessible biomarker would be more useful in real-world settings. Thus, we have focused on saliva, as opposed to urine or blood, as a rich source of biological analytes that can be mined to optimize the chances of bringing a biomarker out into the field. Finally, we recognize that conducting validation studies in humans can be expensive and time consuming. Thus, we have exploited genetic and pharmacological tools in the model organism Drosophila melanogaster to more fully characterize the behavior of the most exciting candidate biomarkers.
Liu, Shuang; Wang, Feng; Gao, Li Jun; Li, Jin Hua; Li, Rong Bai; Gao, Han Liang; Deng, Guo Fu; Yang, Jin Shui; Luo, Xiao Jin
2012-01-01
Heading date in rice (Oryza sativa L.) is a critical agronomic trait with a complex inheritance. To investigate the genetic basis and mechanism of gene interaction in heading date, we conducted genetic analysis on segregation populations derived from crosses among the indica cultivars Bo B, Yuefeng B and Baoxuan 2. A set of dominant complementary genes controlling late heading, designated LH1 and LH2, were detected by molecular marker mapping. Genetic analysis revealed that Baoxuan 2 contains both dominant genes, while Bo B and Yuefeng B each possess either LH1 or LH2. Using larger populations with segregant ratios of 3 : 1, we fine-mapped LH1 to a 63-kb region near the centromere of chromosome 7 flanked by markers RM5436 and RM8034, and LH2 to a 177-kb region on the short arm of chromosome 8 between flanking markers Indel22468-3 and RM25. Some candidate genes were identified through sequencing of Bo B and Yuefeng B in these target regions. Our work provides a solid foundation for further study on gene interaction in heading date and has application in marker-assisted breeding of photosensitive hybrid rice in China. PMID:23341744
Liu, Shuang; Wang, Feng; Gao, Li Jun; Li, Jin Hua; Li, Rong Bai; Gao, Han Liang; Deng, Guo Fu; Yang, Jin Shui; Luo, Xiao Jin
2012-12-01
Heading date in rice (Oryza sativa L.) is a critical agronomic trait with a complex inheritance. To investigate the genetic basis and mechanism of gene interaction in heading date, we conducted genetic analysis on segregation populations derived from crosses among the indica cultivars Bo B, Yuefeng B and Baoxuan 2. A set of dominant complementary genes controlling late heading, designated LH1 and LH2, were detected by molecular marker mapping. Genetic analysis revealed that Baoxuan 2 contains both dominant genes, while Bo B and Yuefeng B each possess either LH1 or LH2. Using larger populations with segregant ratios of 3 : 1, we fine-mapped LH1 to a 63-kb region near the centromere of chromosome 7 flanked by markers RM5436 and RM8034, and LH2 to a 177-kb region on the short arm of chromosome 8 between flanking markers Indel22468-3 and RM25. Some candidate genes were identified through sequencing of Bo B and Yuefeng B in these target regions. Our work provides a solid foundation for further study on gene interaction in heading date and has application in marker-assisted breeding of photosensitive hybrid rice in China.
Variation in the oxytocin receptor gene (OXTR) is associated with differences in moral judgment
Chaponis, Jonathan; Siburian, Richie; Gallagher, Patience; Ransohoff, Katherine; Wikler, Daniel; Perlis, Roy H.; Greene, Joshua D.
2016-01-01
Moral judgments are produced through the coordinated interaction of multiple neural systems, each of which relies on a characteristic set of neurotransmitters. Genes that produce or regulate these neurotransmitters may have distinctive influences on moral judgment. Two studies examined potential genetic influences on moral judgment using dilemmas that reliably elicit competing automatic and controlled responses, generated by dissociable neural systems. Study 1 (N = 228) examined 49 common variants (SNPs) within 10 candidate genes and identified a nominal association between a polymorphism (rs237889) of the oxytocin receptor gene (OXTR) and variation in deontological vs utilitarian moral judgment (that is, judgments favoring individual rights vs the greater good). An association was likewise observed for rs1042615 of the arginine vasopressin receptor gene (AVPR1A). Study 2 (N = 322) aimed to replicate these findings using the aforementioned dilemmas as well as a new set of structurally similar medical dilemmas. Study 2 failed to replicate the association with AVPR1A, but replicated the OXTR finding using both the original and new dilemmas. Together, these findings suggest that moral judgment is influenced by variation in the oxytocin receptor gene and, more generally, that single genetic polymorphisms can have a detectable effect on complex decision processes. PMID:27497314
Variation in the oxytocin receptor gene (OXTR) is associated with differences in moral judgment.
Bernhard, Regan M; Chaponis, Jonathan; Siburian, Richie; Gallagher, Patience; Ransohoff, Katherine; Wikler, Daniel; Perlis, Roy H; Greene, Joshua D
2016-12-01
Moral judgments are produced through the coordinated interaction of multiple neural systems, each of which relies on a characteristic set of neurotransmitters. Genes that produce or regulate these neurotransmitters may have distinctive influences on moral judgment. Two studies examined potential genetic influences on moral judgment using dilemmas that reliably elicit competing automatic and controlled responses, generated by dissociable neural systems. Study 1 (N = 228) examined 49 common variants (SNPs) within 10 candidate genes and identified a nominal association between a polymorphism (rs237889) of the oxytocin receptor gene (OXTR) and variation in deontological vs utilitarian moral judgment (that is, judgments favoring individual rights vs the greater good). An association was likewise observed for rs1042615 of the arginine vasopressin receptor gene (AVPR1A). Study 2 (N = 322) aimed to replicate these findings using the aforementioned dilemmas as well as a new set of structurally similar medical dilemmas. Study 2 failed to replicate the association with AVPR1A, but replicated the OXTR finding using both the original and new dilemmas. Together, these findings suggest that moral judgment is influenced by variation in the oxytocin receptor gene and, more generally, that single genetic polymorphisms can have a detectable effect on complex decision processes. © The Author (2016). Published by Oxford University Press.
Transcriptome analysis of trigeminal ganglia following masseter muscle inflammation in rats
Park, Jennifer; Asgar, Jamila; Ro, Jin Y.
2016-01-01
Background: Chronic pain in masticatory muscles is a major medical problem. Although mechanisms underlying persistent pain in masticatory muscles are not fully understood, sensitization of nociceptive primary afferents following muscle inflammation or injury contributes to muscle hyperalgesia. It is well known that craniofacial muscle injury or inflammation induces regulation of multiple genes in trigeminal ganglia, which is associated with muscle hyperalgesia. However, overall transcriptional profiles within trigeminal ganglia following masseter inflammation have not yet been determined. In the present study, we performed RNA sequencing assay in rat trigeminal ganglia to identify transcriptome profiles of genes relevant to hyperalgesia following inflammation of the rat masseter muscle. Results: Masseter inflammation differentially regulated >3500 genes in trigeminal ganglia. Predominant biological pathways were predicted to be related with activation of resident non-neuronal cells within trigeminal ganglia or recruitment of immune cells. To focus our analysis on the genes more relevant to nociceptors, we selected genes implicated in pain mechanisms, genes enriched in small- to medium-sized sensory neurons, and genes enriched in TRPV1-lineage nociceptors. Among the 2320 candidate genes, 622 genes showed differential expression following masseter inflammation. When the analysis was limited to these candidate genes, pathways related with G protein-coupled signaling and synaptic plasticity were predicted to be enriched. Inspection of individual gene expression changes confirmed the transcriptional changes of multiple nociceptor genes associated with masseter hyperalgesia (e.g., Trpv1, Trpa1, P2rx3, Tac1, and Bdnf) and also suggested a number of novel probable contributors (e.g., Piezo2, Tmem100, and Hdac9).
Conclusion: These findings should further advance our understanding of peripheral mechanisms involved in persistent craniofacial muscle pain conditions and provide a rational basis for identifying novel genes or sets of genes that can be potentially targeted for treating such conditions. PMID:27702909
Gong, Xian; Zhang, Chao; Yiliyasi·Aisa, Yiliyasi·Aisa; Shi, Ying; Yang, Xue-wei; NuersimanguliAosiman, NuersimanguliAosiman; Guan, Ya-qun; Xu, Shu-hua
2016-06-20
Over the last decade, a large number of type 2 diabetes mellitus (T2DM) susceptibility candidate genes have been reported by numerous genome-wide association studies (GWAS). Understanding the genetic diversity of these candidate genes among worldwide populations not only helps elucidate the genetic mechanism of T2DM, but also provides guidance for further studies of the pathogenesis of T2DM in any given population. In this study, we identified 170 genes or genomic regions associated with T2DM by searching the GWAS databases and related literature. We next analyzed the genetic diversity of these genes (or genomic regions) among present-day human populations by curating the 1000 Genomes Project phase 1 dataset covering 14 worldwide populations. We further compared the characteristics of T2DM genes in different populations. No significant differences in genetic diversity were observed among the 14 worldwide populations between the T2DM candidate genes and the non-T2DM genes in terms of overall pattern. However, we observed that some genes, such as IL20RA, RNMTL1-NXN, NOTCH2, ADRA2A-BTBD7P2, TBC1D4, RBM38-HMGB1P1, UBE2E2, and PPARD, show considerable differentiation between populations. In particular, IL20RA (FST=0.1521) displays the greatest population difference, which is mainly contributed by that between Africans and non-Africans. Moreover, we revealed genetic differences between East Asians and Europeans in some candidate genes such as DGKB-AGMO (FST=0.173) and JAZF1 (FST=0.182). Our results indicate that some T2DM susceptibility candidate genes harbor highly differentiated variants between populations. These analyses, although preliminary, should advance our understanding of the population differences in susceptibility to T2DM and provide an insightful reference that future studies can rely on.
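As a rough illustration of the FST values quoted in the abstract above, the sketch below computes a simple two-population fixation index for one biallelic SNP from allele frequencies. It uses the textbook definition FST = (HT - HS) / HT with equally weighted populations; this is an assumption made for illustration, and the study itself may have used a different estimator. The function name and the example frequencies are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Simple FST for one biallelic SNP from two allele frequencies.

    Assumes equal population weights and no sample-size correction
    (an illustrative choice, not the estimator used in the study).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    # Mean expected heterozygosity within populations.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        # The SNP is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (ht - hs) / ht


if __name__ == "__main__":
    # Hypothetical frequencies in two populations.
    print(round(fst_two_populations(0.2, 0.8), 3))
```

Identical frequencies give 0 and fixed opposite alleles give 1, so values around 0.15 to 0.18, as reported for IL20RA, DGKB-AGMO and JAZF1, indicate moderate to strong differentiation on this scale.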
Enciso-Rodríguez, Felix E.; González, Carolina; Rodríguez, Edwin A.; López, Camilo E.; Landsman, David; Barrero, Luz Stella; Mariño-Ramírez, Leonardo
2013-01-01
The Cape gooseberry (Physalis peruviana L) is an Andean exotic fruit with high nutritional value and appealing medicinal properties. However, its cultivation faces important phytosanitary problems mainly due to pathogens like Fusarium oxysporum, Cercospora physalidis and Alternaria spp. Here we used the Cape gooseberry foliar transcriptome to search for proteins that encode conserved domains related to plant immunity including: NBS (Nucleotide Binding Site), CC (Coiled-Coil), TIR (Toll/Interleukin-1 Receptor). We identified 74 immunity related gene candidates in P. peruviana which have the typical resistance gene (R-gene) architecture, 17 Receptor like kinase (RLKs) candidates related to PAMP-Triggered Immunity (PTI), eight (TIR-NBS-LRR, or TNL) and nine (CC–NBS-LRR, or CNL) candidates related to Effector-Triggered Immunity (ETI) genes among others. These candidate genes were categorized by molecular function (98%), biological process (85%) and cellular component (79%) using gene ontology. Some of the most interesting predicted roles were those associated with binding and transferase activity. We designed 94 primer pairs from the 74 immunity-related genes (IRGs) to amplify the corresponding genomic regions on six genotypes that included resistant and susceptible materials. From these, we selected 17 single band amplicons and sequenced them in 14 F. oxysporum resistant and susceptible genotypes. Sequence polymorphisms were analyzed through preliminary candidate gene association, which allowed the detection of one SNP at the PpIRG-63 marker revealing a nonsynonymous mutation in the predicted LRR domain suggesting functional roles for resistance. PMID:23844210
Enciso-Rodríguez, Felix E; González, Carolina; Rodríguez, Edwin A; López, Camilo E; Landsman, David; Barrero, Luz Stella; Mariño-Ramírez, Leonardo
2013-01-01
The Cape gooseberry (Physalis peruviana L) is an Andean exotic fruit with high nutritional value and appealing medicinal properties. However, its cultivation faces important phytosanitary problems mainly due to pathogens like Fusarium oxysporum, Cercospora physalidis and Alternaria spp. Here we used the Cape gooseberry foliar transcriptome to search for proteins that encode conserved domains related to plant immunity including: NBS (Nucleotide Binding Site), CC (Coiled-Coil), TIR (Toll/Interleukin-1 Receptor). We identified 74 immunity related gene candidates in P. peruviana which have the typical resistance gene (R-gene) architecture, 17 Receptor like kinase (RLKs) candidates related to PAMP-Triggered Immunity (PTI), eight (TIR-NBS-LRR, or TNL) and nine (CC-NBS-LRR, or CNL) candidates related to Effector-Triggered Immunity (ETI) genes among others. These candidate genes were categorized by molecular function (98%), biological process (85%) and cellular component (79%) using gene ontology. Some of the most interesting predicted roles were those associated with binding and transferase activity. We designed 94 primer pairs from the 74 immunity-related genes (IRGs) to amplify the corresponding genomic regions on six genotypes that included resistant and susceptible materials. From these, we selected 17 single band amplicons and sequenced them in 14 F. oxysporum resistant and susceptible genotypes. Sequence polymorphisms were analyzed through preliminary candidate gene association, which allowed the detection of one SNP at the PpIRG-63 marker revealing a nonsynonymous mutation in the predicted LRR domain suggesting functional roles for resistance.
Le Bail, Aude; Scholz, Sebastian; Kost, Benedikt
2013-01-01
The use of the moss Physcomitrella patens as a model system to study plant development and physiology is rapidly expanding. The strategic position of P. patens within the green lineage between algae and vascular plants, the high efficiency with which transgenes are incorporated by homologous recombination, advantages associated with the haploid gametophyte representing the dominant phase of the P. patens life cycle, the simple structure of protonemata, leafy shoots and rhizoids that constitute the haploid gametophyte, as well as a readily accessible high-quality genome sequence make this moss a very attractive experimental system. The investigation of the genetic and hormonal control of P. patens development heavily depends on the analysis of gene expression patterns by real time quantitative PCR (RT qPCR). This technique requires well characterized sets of reference genes, which display minimal expression level variations under all analyzed conditions, for data normalization. Sets of suitable reference genes have been described for most widely used model systems including e.g. Arabidopsis thaliana, but not for P. patens. Here, we present a RT qPCR based comparison of transcript levels of 12 selected candidate reference genes in a range of gametophytic P. patens structures at different developmental stages, and in P. patens protonemata treated with hormones or hormone transport inhibitors. Analysis of these RT qPCR data using GeNorm and NormFinder software resulted in the identification of sets of P. patens reference genes suitable for gene expression analysis under all tested conditions, and suggested that the two best reference genes are sufficient for effective data normalization under each of these conditions. PMID:23951063
Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Kumar, Vinay; Kale, Sandip M; Sinha, Pallavi; Chitikineni, Annapurna; Pazhamala, Lekha T; Garg, Vanika; Sharma, Mamta; Sameer Kumar, Chanda Venkata; Parupalli, Swathi; Vechalapu, Suryanarayana; Patil, Suyash; Muniswamy, Sonnappa; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Dharmaraj, Pallavi Subbanna; Varshney, Rajeev K
2016-05-01
To map resistance genes for Fusarium wilt (FW) and sterility mosaic disease (SMD) in pigeonpea, sequencing-based bulked segregant analysis (Seq-BSA) was used. Resistant (R) and susceptible (S) bulks from the extreme recombinant inbred lines of ICPL 20096 × ICPL 332 were sequenced. Subsequently, SNP index was calculated between R- and S-bulks with the help of draft genome sequence and reference-guided assembly of ICPL 20096 (resistant parent). Seq-BSA has provided seven candidate SNPs for FW and SMD resistance in pigeonpea. In parallel, four additional genotypes were re-sequenced and their combined analysis with R- and S-bulks has provided a total of 8362 nonsynonymous (ns) SNPs. Of 8362 nsSNPs, 60 were found within the 2-Mb flanking regions of seven candidate SNPs identified through Seq-BSA. Haplotype analysis narrowed these down to eight nsSNPs in seven genes. These eight nsSNPs were further validated by re-sequencing 11 genotypes that are resistant and susceptible to FW and SMD. This analysis revealed association of four candidate nsSNPs in four genes with FW resistance and four candidate nsSNPs in three genes with SMD resistance. Further, in silico protein analysis and expression profiling identified the two most promising candidate genes, namely C.cajan_01839 for SMD resistance and C.cajan_03203 for FW resistance. Identified candidate genomic regions/SNPs will be useful for genomics-assisted breeding in pigeonpea. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
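The SNP-index comparison between bulks mentioned in the abstract above can be sketched as follows. The sketch assumes the definition commonly used in bulked segregant sequencing, namely the fraction of reads at a position that carry the non-reference allele, with the difference between bulks used to flag candidate positions. The abstract does not spell out the exact formula or thresholds, so the function names and read counts here are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads carrying the non-reference allele at one position."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def delta_snp_index(r_alt, r_total, s_alt, s_total):
    """SNP index of the resistant bulk minus that of the susceptible bulk."""
    return snp_index(r_alt, r_total) - snp_index(s_alt, s_total)


if __name__ == "__main__":
    # Hypothetical read counts: the reference is the resistant parent, so the
    # R-bulk mostly matches it while the S-bulk mostly carries the other allele.
    print(delta_snp_index(r_alt=1, r_total=20, s_alt=19, s_total=20))
```

A difference close to -1 or +1 marks a position where the two bulks carry different alleles almost exclusively, which is the pattern expected near a causal locus.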
Using SCOPE to identify potential regulatory motifs in coregulated genes.
Martyanov, Viktor; Gross, Robert H
2011-05-31
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences.
These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail.
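The three motif types named in the abstract above (non-degenerate ACCGGT, degenerate ASCGWT, bipartite ACCnnnnnnnnGGT) can all be written with IUPAC nucleotide codes. The sketch below is not SCOPE and does not reproduce BEAM, PRISM or SPACER; it only illustrates, under that assumption, how such a consensus string can be located on both strands of a sequence. The function names and test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(motif, seq):
    """Return sorted (start, strand, matched_text) tuples.

    Starts are 0-based coordinates on the forward strand; for '-' hits the
    matched text is reported as read on the reverse strand.
    """
    pattern = motif_to_regex(motif)
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_motif("ASCGWT", "TTACCGATGG"))
    print(find_motif("ACCnnnnnnnnGGT", "ACCAAAAAAAAGGT"))
```

Locating instances is only the last step of motif finding; scoring candidates by over-representation or position preference, as the abstract describes, would sit on top of a search like this.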
Cabiati, Manuela; Raucci, Serena; Caselli, Chiara; Guzzardi, Maria Angela; D'Amico, Andrea; Prescimone, Tommaso; Giannessi, Daniela; Del Ry, Silvia
2012-06-01
Obesity is a complex pathology with interacting and confounding causes due to the environment, hormonal signaling patterns, and genetic predisposition. At present, the Zucker rat is an eligible genetic model for research on obesity and metabolic syndrome, allowing scrutiny of gene expression profiles. Real-time PCR is the benchmark method for measuring mRNA expressions, but the accuracy and reproducibility of its data greatly depend on appropriate normalization strategies. In the Zucker rat model, no specific reference genes have been identified in myocardium, kidney, and lung, the main organs involved in this syndrome. The aim of this study was to select among ten candidates (Actb, Gapdh, Polr2a, Ywhag, Rpl13a, Sdha, Ppia, Tbp, Hprt1 and Tfrc) a set of reference genes that can be used for the normalization of mRNA expression data obtained by real-time PCR in obese and lean Zucker rats both at fasting and during acute hyperglycemia. The most stable genes in the heart were Sdha, Tbp, and Hprt1; in kidney, Tbp, Actb, and Gapdh were chosen, while Actb, Ywhag, and Sdha were selected as the most stably expressed set for pulmonary tissue. The normalization strategy was used to analyze mRNA expression of tumor necrosis factor α, the main inflammatory mediator in obesity, whose variations were more significant when normalized with the appropriately selected reference genes. The findings obtained in this study underline the importance of having three stably expressed reference gene sets for use in the cardiac, renal, and pulmonary tissues of an experimental model of obese and hyperglycemic Zucker rats.
Wang, Quan; Jia, Peilin; Cuenco, Karen T.; Feingold, Eleanor; Marazita, Mary L.; Wang, Lily; Zhao, Zhongming
2013-01-01
A number of genetic studies have suggested numerous susceptibility genes for dental caries over the past decade with few definite conclusions. The rapid accumulation of relevant information, along with the complex architecture of the disease, provides a challenging but also unique opportunity to review and integrate the heterogeneous data for follow-up validation and exploration. In this study, we collected and curated candidate genes from four major categories: association studies, linkage scans, gene expression analyses, and literature mining. Candidate genes were prioritized according to the magnitude of evidence related to dental caries. We then searched for dense modules enriched with the prioritized candidate genes through their protein-protein interactions (PPIs). We identified 23 modules comprising 53 genes. Functional analyses of these 53 genes revealed three major clusters: cytokine network relevant genes, matrix metalloproteinases (MMPs) family, and transforming growth factor-beta (TGF-β) family, all of which have been previously implicated in tooth development and carious lesions. Through our extensive data collection and an integrative application of gene prioritization and PPI network analyses, we built a dental caries-specific sub-network for the first time. Our study provided insights into the molecular mechanisms underlying dental caries. The framework we proposed in this work can be applied to other complex diseases. PMID:24146904
Investigating highly replicated asthma genes as candidate genes for allergic rhinitis.
Andiappan, Anand Kumar; Nilsson, Daniel; Halldén, Christer; Yun, Wang De; Säll, Torbjörn; Cardell, Lars Olaf; Tim, Chew Fook
2013-05-10
Asthma genetics has been extensively studied and many genes have been associated with the development or severity of this disease. In contrast, the genetic basis of allergic rhinitis (AR) has not been evaluated as extensively. It is well known that asthma is closely related with AR since a large proportion of individuals with asthma also present symptoms of AR, and patients with AR have a 5-6 fold increased risk of developing asthma. Thus, the relevance of asthma candidate genes as predisposing factors for AR is worth investigating. The present study was designed to investigate if SNPs in highly replicated asthma genes are associated with the occurrence of AR. A total of 192 SNPs from 21 asthma candidate genes reported to be associated with asthma in 6 or more unrelated studies were genotyped in a Swedish population with 246 AR patients and 431 controls. Genotypes for 429 SNPs from the same set of genes were also extracted from a Singapore Chinese genome-wide dataset which consisted of 456 AR cases and 486 controls. All SNPs were subsequently analyzed for association with AR and their influence on allergic sensitization to common allergens. A limited number of potential associations were observed and the overall pattern of P-values corresponds well to the expectations in the absence of an effect. However, in the tests of allele effects in the Chinese population the number of significant P-values exceeds the expectations. The strongest signals were found for SNPs in NPSR1 and CTLA4. In these genes, a total of nine SNPs showed P-values <0.001 with corresponding Q-values <0.05. In the NPSR1 gene some P-values were lower than the Bonferroni correction level. Reanalysis after elimination of all patients with asthmatic symptoms excluded asthma as a confounding factor in our results. Weaker indications were found for IL13 and GSTP1 with respect to sensitization to birch pollen in the Swedish population. 
Genetic variation in the majority of the highly replicated asthma genes was not associated with AR in our populations, which suggests that asthma and AR could have less in common than previously anticipated. However, NPSR1 and CTLA4 can be genetic links between AR and asthma, and associations of polymorphisms in NPSR1 with AR have not been reported previously.
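The abstract above reports P-values alongside Q-values and a Bonferroni threshold. As an illustration of how such multiple-testing adjustments relate, the sketch below computes Benjamini-Hochberg adjusted P-values and a Bonferroni per-test threshold. This is an assumption for illustration: the abstract does not state which Q-value method was used, and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical P-values for four SNPs.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    # Per-test level if 192 SNPs were tested at a family-wise alpha of 0.05.
    print(bonferroni_threshold(0.05, 192))
```

Bonferroni is the stricter of the two, which is consistent with only some NPSR1 SNPs passing it while nine SNPs met the Q-value cut-off.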
Filling gaps in PPAR-alpha signaling through comparative nutrigenomics analysis
2009-01-01
Background: The application of high-throughput genomic tools in nutrition research is a widespread practice. However, it is becoming increasingly clear that the outcome of individual expression studies is insufficient for the comprehensive understanding of such a complex field. Currently, the availability of large amounts of expression data in public repositories has opened up new challenges for microarray data analyses. We have focused on PPARα, a ligand-activated transcription factor functioning as a fatty acid sensor that controls the expression of a large set of genes in various metabolic organs such as liver, small intestine or heart. The function of PPARα is strictly connected to the function of its target genes and, although many of these have already been identified, major elements of its physiological function remain to be uncovered. To further investigate the function of PPARα, we have applied a cross-species meta-analysis approach to integrate sixteen microarray datasets studying high fat diet and PPARα signal perturbations in different organisms. Results: We identified 164 genes (MDEGs) that were consistently differentially expressed in response to a high fat diet or to perturbations in PPARs signalling. In particular, we found five genes in yeast which were highly conserved and homologous to PPARα targets in mammals, potential candidates to be used as models for the equivalent mammalian genes. Moreover, a screening of the MDEGs for all known transcription factor binding sites, and the comparison with a human genome-wide screening of Peroxisome Proliferating Response Elements (PPRE), enabled us to identify 20 new potential candidate genes that show both a binding site and a change in expression in the conditions studied. Lastly, we found a non-random localization of the differentially expressed genes in the genome.
Conclusion: The results presented are potentially of great interest for summarizing the currently available expression data, exploiting the power of in silico analysis filtered by evolutionary conservation. The analysis enabled us to indicate potential gene candidates that could fill in the gaps with regard to the signalling of PPARα and, moreover, the non-random localization of the differentially expressed genes in the genome suggests that epigenetic mechanisms are of importance in the regulation of the transcription operated by PPARα. PMID:20003344
Evaluation of reference genes for insect olfaction studies.
Omondi, Bonaventure Aman; Latorre-Estivalis, Jose Manuel; Rocha Oliveira, Ivana Helena; Ignell, Rickard; Lorenzo, Marcelo Gustavo
2015-04-22
Quantitative reverse transcription PCR (qRT-PCR) is a robust and accessible method to assay gene expression and to infer gene regulation. Being a chain of procedures, this technique is subject to systematic error due to biological and technical limitations mainly set by the starting material and downstream procedures. Thus, rigorous data normalization is critical to grant reliability and repeatability of gene expression quantification by qRT-PCR. A number of 'housekeeping genes', involved in basic cellular functions, have been commonly used as internal controls for this normalization process. However, these genes could themselves be regulated and must therefore be tested a priori. We evaluated eight potential reference genes for their stability as internal controls for RT-qPCR studies of olfactory gene expression in the antennae of Rhodnius prolixus, a Chagas disease vector. The set of genes included were: α-tubulin; β-actin; Glyceraldehyde-3-phosphate dehydrogenase; Eukaryotic initiation factor 1A; Glutathione-S-transferase; Serine protease; Succinate dehydrogenase; and Glucose-6-phosphate dehydrogenase. Five experimental conditions, including changes in age, developmental stage and feeding status, were tested in both sexes. We show that the evaluation of candidate reference genes is necessary for each combination of sex, tissue and physiological condition analyzed in order to avoid inconsistent results and conclusions. Although NormFinder and geNorm software yielded different results between males and females, five genes (SDH, Tub, GAPDH, Act and G6PDH) appeared in the first positions in all rankings obtained. By using gene expression data of a single olfactory coreceptor gene as an example, we demonstrated the extent of changes expected using different internal standards. This work underlines the need for a rigorous selection of internal standards to grant the reliability of normalization processes in qRT-PCR studies.
Furthermore, we show that particular physiological or developmental conditions require independent evaluation of a diverse set of potential reference genes.
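To make the idea of reference-gene "stability" in the abstract above more concrete, the sketch below computes a geNorm-style M value: for each candidate gene, the average, over all other candidates, of the standard deviation of their log2 expression ratios across samples. Lower M means more stable. This is a minimal re-implementation of the published stability measure and an assumption about what the software reports, not the geNorm program itself; the gene names and quantities are hypothetical.

```python
import math
import statistics


def genorm_m(expression):
    """Return {gene: M} for a dict of gene -> relative quantities per sample.

    All genes must be measured in the same samples, in the same order,
    and quantities must be positive (linear scale, not Ct values).
    """
    genes = list(expression)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene_j], expression[gene_k])
            ]
            # Sample standard deviation of the log2 ratio across samples.
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


if __name__ == "__main__":
    # Hypothetical relative quantities in three samples.
    data = {
        "geneA": [1.0, 2.0, 4.0],
        "geneB": [2.0, 4.0, 8.0],   # constant ratio to geneA
        "geneC": [1.0, 4.0, 1.0],   # varies independently
    }
    for gene, m in sorted(genorm_m(data).items(), key=lambda kv: kv[1]):
        print(gene, round(m, 3))
```

Because M is computed relative to the other candidates, rankings can change with the set of genes and conditions included, which is in line with the abstract's point that each sex, tissue and physiological condition needs its own evaluation.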
Integrative Approach to Pain Genetics Identifies Pain Sensitivity Loci across Diseases
Ruau, David; Dudley, Joel T.; Chen, Rong; Phillips, Nicholas G.; Swan, Gary E.; Lazzeroni, Laura C.; Clark, J. David
2012-01-01
Identifying human genes relevant for the processing of pain requires difficult-to-conduct and expensive large-scale clinical trials. Here, we examine a novel integrative paradigm for data-driven discovery of pain gene candidates, taking advantage of the vast amount of existing disease-related clinical literature and gene expression microarray data stored in large international repositories. First, thousands of diseases were ranked according to a disease-specific pain index (DSPI), derived from Medical Subject Heading (MESH) annotations in MEDLINE. Second, gene expression profiles of 121 of these human diseases were obtained from public sources. Third, genes with expression variation significantly correlated with DSPI across diseases were selected as candidate pain genes. Finally, selected candidate pain genes were genotyped in an independent human cohort and prospectively evaluated for significant association between variants and measures of pain sensitivity. The strongest signal was with rs4512126 (5q32, ABLIM3, P = 1.3×10^−10) for the sensitivity to cold pressor pain in males, but not in females. Significant associations were also observed with rs12548828, rs7826700 and rs1075791 on 8q22.2 within NCALD (P = 1.7×10^−4, 1.8×10^−4, and 2.2×10^−4 respectively). Our results demonstrate the utility of a novel paradigm that integrates publicly available disease-specific gene expression data with clinical data curated from MEDLINE to facilitate the discovery of pain-relevant genes. This data-derived list of pain gene candidates enables additional focused and efficient biological studies validating additional candidates. PMID:22685391
Pyun, Jung-A; Kim, Sunshin; Cho, Nam H; Koh, InSong; Lee, Jong-Young; Shin, Chol; Kwack, KyuBum
2014-05-01
The aim of this study was to identify polymorphisms and gene-gene interactions that are significantly associated with age at menarche and age at menopause in a Korean population. A total of 3,452 and 1,827 women participated in studies of age at menarche and age at natural menopause, respectively. Linear regression analyses adjusted for residence area were used to perform genome-wide association studies (GWAS), candidate gene association studies, and interactions between the candidate genes for age at menarche and age at natural menopause. In GWAS, four single nucleotide polymorphisms (SNPs; rs7528241, rs1324329, rs11597068, and rs6495785) were strongly associated with age at natural menopause (lowest P = 9.66 × 10). However, GWAS of age at menarche did not reveal any strong associations. In candidate gene association studies, SNPs with P < 0.01 were selected to test their synergistic interactions. For age at natural menopause, there was a significant interaction between intronic SNPs on ADAM metallopeptidase with thrombospondin type I motif 9 (ADAMTS9) and SMAD family member 3 (SMAD3) genes (P = 9.52 × 10). For age at menarche, there were three significant interactions between three intronic SNPs on follicle-stimulating hormone receptor (FSHR) gene and one SNP located at the 3' flanking region of insulin-like growth factor 2 receptor (IGF2R) gene (lowest P = 1.95 × 10). Novel SNPs and synergistic interactions between candidate genes are significantly associated with age at menarche and age at natural menopause in a Korean population.
Kertai, Miklos D; Qi, Wenjing; Li, Yi-Ju; Lombard, Frederick W; Liu, Yutao; Smith, Michael P; Stafford-Smith, Mark; Newman, Mark F; Milano, Carmelo A; Mathew, Joseph P; Podgoreanu, Mihai V
2016-03-01
Atrial tissue gene expression profiling may help to determine how differentially expressed genes in the human atrium before cardiopulmonary bypass (CPB) are related to subsequent biologic pathway activation patterns, and whether specific expression profiles are associated with an increased risk for postoperative atrial fibrillation (AF) or altered response to β-blocker (BB) therapy after coronary artery bypass grafting (CABG) surgery. Right atrial appendage (RAA) samples were collected from 45 patients who were receiving perioperative BB treatment and underwent CABG surgery. The isolated RNA samples were used for microarray gene expression analysis to identify probes that were expressed differently in patients with and without postoperative AF. Gene set enrichment analysis (GSEA) was performed to determine how sets of genes might be systematically altered in patients with postoperative AF. Of the 45 patients studied, genomic DNA from 42 patients was used for target sequencing of 66 candidate genes potentially associated with AF, and 2,144 single-nucleotide polymorphisms (SNPs) were identified. We then performed expression quantitative trait loci (eQTL) analysis to determine the correlation between SNPs identified in the genotyped patients and RAA expression. Probes that met a false discovery rate<0.25 were selected for eQTL analysis. Of the 17,678 gene expression probes analyzed, 2 probes met our prespecified significance threshold of false discovery rate<0.25.
The most significant probe corresponded to vesicular overexpressed in cancer - prosurvival protein 1 gene (VOPP1; 1.83 fold change; P=3.47×10(-7)), and was up-regulated in patients with postoperative AF, whereas the second most significant probe, which corresponded to the LOC389286 gene (0.49 fold change; P=1.54×10(-5)), was down-regulated in patients with postoperative AF. GSEA highlighted the role of VOPP1 in pathways with biologic relevance to myocardial homeostasis, and oxidative stress and redox modulation. Candidate gene eQTL showed a trans-acting association between variants of G protein-coupled receptor kinase 5 gene, previously linked to altered BB response, and high expression of VOPP1. In patients undergoing CABG surgery, RAA gene expression profiling, and pathway and eQTL analysis suggested that VOPP1 plays a novel etiological role in postoperative AF despite perioperative BB therapy. Copyright © 2016. Published by Elsevier Ltd.
Cusick, Kathleen D; Fitzgerald, Lisa A; Pirlo, Russell K; Cockrell, Allison L; Petersen, Emily R; Biffinger, Justin C
2014-01-01
Neurospora crassa has served as a model organism for studying circadian pathways and more recently has gained attention in the biofuel industry due to its enhanced capacity for cellulase production. However, in order to optimize N. crassa for biotechnological applications, metabolic pathways during growth under different environmental conditions must be addressed. Reverse-transcription quantitative PCR (RT-qPCR) is a technique that provides a high-throughput platform from which to measure the expression of a large set of genes over time. The selection of a suitable reference gene is critical for gene expression studies using relative quantification, as this strategy is based on normalization of target gene expression to a reference gene whose expression is stable under the experimental conditions. This study evaluated twelve candidate reference genes for use with N. crassa when grown in continuous culture bioreactors under different light and temperature conditions. Based on combined stability values from NormFinder and BestKeeper software packages, the following are the most appropriate reference genes under conditions of: (1) light/dark cycling: btl, asl, and vma1; (2) all-dark growth: btl, tbp, vma1, and vma2; (3) temperature flux: btl, vma1, act, and asl; (4) all conditions combined: vma1, vma2, tbp, and btl. Since N. crassa exists as different cell types (uni- or multi-nucleated), expression changes in a subset of the candidate genes were further assessed using absolute quantification. A strong negative correlation was found to exist between ratio and threshold cycle (CT) values, demonstrating that CT changes serve as a reliable reflection of transcript, and not gene copy number, fluctuations. The results of this study identified genes that are appropriate for use as reference genes in RT-qPCR studies with N. crassa and demonstrated that even with the presence of different cell types, relative quantification is an acceptable method for measuring gene expression changes during growth in bioreactors.
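As an illustration of what normalization to a reference gene means in relative quantification, the sketch below uses the widely used 2^(-ΔΔCT) calculation. This is an assumption for illustration only: the abstract does not state which formula the authors applied, the calculation presumes roughly 100% amplification efficiency for both genes, and the function name and CT values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCT) calculation.

    Each argument is a threshold cycle (CT) value. The reference gene is
    assumed to be stably expressed under both conditions, which is the
    property that reference-gene stability studies set out to check.
    """
    # Normalize the target to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical CT values, not taken from the study.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold_change)
```

If the reference gene itself shifts between conditions, the two within-condition differences absorb that shift and the fold change is biased, which is why reference-gene stability is evaluated first.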
Identification of a neuronal transcription factor network involved in medulloblastoma development.
Lastowska, Maria; Al-Afghani, Hani; Al-Balool, Haya H; Sheth, Harsh; Mercer, Emma; Coxhead, Jonathan M; Redfern, Chris P F; Peters, Heiko; Burt, Alastair D; Santibanez-Koref, Mauro; Bacon, Chris M; Chesler, Louis; Rust, Alistair G; Adams, David J; Williamson, Daniel; Clifford, Steven C; Jackson, Michael S
2013-07-11
Medulloblastomas, the most frequent malignant brain tumours affecting children, comprise at least 4 distinct clinicogenetic subgroups. Aberrant sonic hedgehog (SHH) signalling is observed in approximately 25% of tumours and defines one subgroup. Although alterations in SHH pathway genes (e.g. PTCH1, SUFU) are observed in many of these tumours, high throughput genomic analyses have identified few other recurring mutations. Here, we have mutagenised the Ptch+/- murine tumour model using the Sleeping Beauty transposon system to identify additional genes and pathways involved in SHH subgroup medulloblastoma development. Mutagenesis significantly increased medulloblastoma frequency and identified 17 candidate cancer genes, including orthologs of genes somatically mutated (PTEN, CREBBP) or associated with poor outcome (PTEN, MYT1L) in the human disease. Strikingly, these candidate genes were enriched for transcription factors (p=2×10(-5)), the majority of which (6/7; Crebbp, Myt1L, Nfia, Nfib, Tead1 and Tgif2) were linked within a single regulatory network enriched for genes associated with a differentiated neuronal phenotype. Furthermore, activity of this network varied significantly between the human subgroups, was associated with metastatic disease, and predicted poor survival specifically within the SHH subgroup of tumours. Igf2, previously implicated in medulloblastoma, was the most differentially expressed gene in murine tumours with network perturbation, and network activity in both mouse and human tumours was characterised by enrichment for multiple gene-sets indicating increased cell proliferation, IGF signalling, MYC target upregulation, and decreased neuronal differentiation. Collectively, our data support a model of medulloblastoma development in SB-mutagenised Ptch+/- mice which involves disruption of a novel transcription factor network leading to Igf2 upregulation, proliferation of GNPs, and tumour formation.
Moreover, our results identify rational therapeutic targets for SHH subgroup tumours, alongside prognostic biomarkers for the identification of poor-risk SHH patients.
Zadeh Modarres, Shahrzad; Heidar, Zahra; Foroozanfard, Fatemeh; Rahmati, Zahra; Aghadavod, Esmat; Asemi, Zatollah
2018-06-01
This study was conducted to evaluate the effects of selenium supplementation on the expression of genes related to insulin and lipid metabolism in infertile women with polycystic ovary syndrome (PCOS) who were candidates for in vitro fertilization (IVF). This randomized, double-blind, placebo-controlled trial was conducted among 40 infertile women with PCOS who were candidates for IVF. Subjects were randomly allocated into two groups to receive either 200 μg of selenium (n = 20) or placebo (n = 20) per day for 8 weeks. Expression levels of genes related to insulin and lipid metabolism were quantified in the participants' lymphocytes by RT-PCR. RT-PCR results demonstrated that after the 8-week intervention, compared with the placebo, selenium supplementation upregulated gene expression of peroxisome proliferator-activated receptor gamma (PPAR-γ) (1.06 ± 0.15-fold increase vs. 0.94 ± 0.18-fold reduction, P = 0.02) and glucose transporter 1 (GLUT-1) (1.07 ± 0.20-fold increase vs. 0.87 ± 0.18-fold reduction, P = 0.003) in lymphocytes. In addition, compared with the placebo, selenium supplementation downregulated gene expression of low-density lipoprotein receptor (LDLR) (0.88 ± 0.17-fold reduction vs. 1.05 ± 0.22-fold increase, P = 0.01) in lymphocytes. We did not observe any significant effect of selenium supplementation on gene expression levels of lipoprotein(a) [LP(a)] in lymphocytes. Overall, selenium supplementation for 8 weeks in infertile women with PCOS who were candidates for IVF significantly increased lymphocyte gene expression levels of PPAR-γ and GLUT-1 and significantly decreased gene expression levels of LDLR, but did not affect LP(a). http://www.irct.ir : IRCT201704245623N113.
Tohidi, Reza; Idris, Ismail Bin; Panandam, Jothi Malar; Bejo, Mohd Hair
2012-12-01
Salmonella Enteritidis is a major cause of food poisoning worldwide, and poultry products are the main source of S. Enteritidis contamination for humans. Among the numerous strategies for disease control, improving genetic resistance to S. Enteritidis has been the most effective approach. We investigated the association between S. Enteritidis burden in the caecum, spleen, and liver of young indigenous chickens and seven candidate genes, selected on the basis of their critical roles in immunological functions. The genes included those encoding interleukin 2 (IL-2), interferon-γ (IFN-γ), transforming growth factor β2 (TGF-β2), immunoglobulin light chain (IgL), toll-like receptor 4 (TLR-4), myeloid differentiation protein 2 (MD-2), and inducible nitric oxide synthase (iNOS). Two Malaysian indigenous chicken breeds were used as sustainable genetic sources of alleles that are resistant to salmonellosis. The polymerase chain reaction restriction fragment-length polymorphism technique was used to genotype the candidate genes. Three different genotypes were observed in all of the candidate genes, except for MD-2. All of the candidate genes were in Hardy-Weinberg equilibrium in both populations. The IL-2-MnlI polymorphism was associated with S. Enteritidis burden in the caecum and spleen. The TGF-β2-RsaI, TLR-4-Sau96I, and iNOS-AluI polymorphisms were associated with the caecum S. Enteritidis load. The other candidate genes were not associated with S. Enteritidis load in any organ. The results indicate that the IL-2, TGF-β2, TLR-4, and iNOS genes are potential candidates for use in selection programmes for increasing genetic resistance against S. Enteritidis in Malaysian indigenous chickens.
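The Hardy-Weinberg check mentioned above can be sketched for a biallelic marker with three observed genotype classes. The abstract does not say which test the authors used, so the chi-square goodness-of-fit test with one degree of freedom shown here is an assumption, and the function name and genotype counts are hypothetical.

```python
import math


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic locus.

    Arguments are observed counts of the AA, AB and BB genotypes.
    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2_stat, p_val = hardy_weinberg_chi2(25, 50, 25)
print(chi2_stat, p_val)
```

A large p-value is consistent with equilibrium. With small samples or rare alleles an exact test is usually preferred over the chi-square approximation.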
Mitchell, Sabrina; Ellingson, Clint; Coyne, Thomas; Hall, Lynn; Neill, Meaghan; Christian, Natalie; Higham, Catherine; Dobrowolski, Steven F; Tuchman, Mendel; Summar, Marshall
2009-01-01
The urea cycle is the primary means of nitrogen metabolism in humans and other ureotelic organisms. There are five key enzymes in the urea cycle: carbamoyl-phosphate synthetase 1 (CPS1), ornithine transcarbamylase (OTC), argininosuccinate synthetase (ASS1), argininosuccinate lyase (ASL), and arginase 1 (ARG1). Additionally, a sixth enzyme, N-acetylglutamate synthase (NAGS), is critical for urea cycle function, providing CPS1 with its necessary cofactor. Deficiencies in any of these enzymes result in elevated blood ammonia concentrations, which can have detrimental effects, including central nervous system dysfunction, brain damage, coma, and death. Functional variants, which confer susceptibility for disease or dysfunction, have been described for enzymes within the cycle; however, a comprehensive screen of all the urea cycle enzymes has not been performed. We examined the exons and intron/exon boundaries of the five key urea cycle enzymes, NAGS, and two solute carrier transporter genes (SLC25A13 and SLC25A15) for sequence alterations using single-stranded conformational polymorphism (SSCP) analysis and high-resolution melt profiling. SSCP was performed on a set of DNA from 47 unrelated North American individuals with a mixture of ethnic backgrounds. High-resolution melt profiling was performed on a nonoverlapping DNA set of either 47 or 100 unrelated individuals with a mixture of backgrounds. We identified 33 unarchived polymorphisms in this screen that potentially play a role in the variation observed in urea cycle function. Screening all the genes in the pathway provides a catalog of variants that can be used in investigating candidate diseases. Copyright 2008 Wiley-Liss, Inc.
Association Between Germline Mutation in VSIG10L and Familial Barrett Neoplasia
Fecteau, Ryan E.; Kong, Jianping; Kresak, Adam; Brock, Wendy; Song, Yeunjoo; Fujioka, Hisashi; Elston, Robert; Willis, Joseph E.; Lynch, John P.; Markowitz, Sanford D.; Guda, Kishore; Chak, Amitabh
2016-01-01
IMPORTANCE Esophageal adenocarcinoma and its precursor lesion Barrett esophagus have seen a dramatic increase in incidence over the past 4 decades yet marked genetic heterogeneity of this disease has precluded advances in understanding its pathogenesis and improving treatment. OBJECTIVE To identify novel disease susceptibility variants in a familial syndrome of esophageal adenocarcinoma and Barrett esophagus, termed familial Barrett esophagus, by using high-throughput sequencing in affected individuals from a large, multigenerational family. DESIGN, SETTING, AND PARTICIPANTS We performed whole exome sequencing (WES) from peripheral lymphocyte DNA on 4 distant relatives from our multiplex, multigenerational familial Barrett esophagus family to identify candidate disease susceptibility variants. Gene variants were filtered, verified, and segregation analysis performed to identify a single candidate variant. Gene expression analysis was done with both quantitative real-time polymerase chain reaction and in situ RNA hybridization. A 3-dimensional organotypic cell culture model of esophageal maturation was utilized to determine the phenotypic effects of our gene variant. We used electron microscopy on esophageal mucosa from an affected family member carrying the gene variant to assess ultrastructural changes. MAIN OUTCOMES AND MEASURES Identification of a novel, germline disease susceptibility variant in a previously uncharacterized gene. RESULTS A multiplex, multigenerational family with 14 members affected (3 members with esophageal adenocarcinoma and 11 with Barrett esophagus) was identified, and whole-exome sequencing identified a germline mutation (S631G) at a highly conserved serine residue in the uncharacterized gene VSIG10L that segregated in affected members. Transfection of S631G variant into a 3-dimensional organotypic culture model of normal esophageal squamous cells dramatically inhibited epithelial maturation compared with the wild-type. 
VSIG10L exhibited high expression in normal squamous esophagus with marked loss of expression in Barrett-associated lesions. Electron microscopy of squamous esophageal mucosa harboring the S631G variant revealed dilated intercellular spaces and reduced desmosomes. CONCLUSIONS AND RELEVANCE This study presents VSIG10L as a candidate familial Barrett esophagus susceptibility gene, with a putative role in maintaining normal esophageal homeostasis. Further research assessing VSIG10L function may reveal pathways important for esophageal maturation and the pathogenesis of Barrett esophagus and esophageal adenocarcinoma. PMID:27467440
Genetic correlates of insight in schizophrenia.
Xavier, Rose Mary; Vorderstrasse, Allison; Keefe, Richard S E; Dungan, Jennifer R
2018-05-01
Insight in schizophrenia is clinically important as it is associated with several adverse outcomes. Genetic contributions to insight are unknown. We examined genetic contributions to insight by investigating if polygenic risk scores (PRS) and candidate regions were associated with insight. Schizophrenia case-only analysis of the Clinical Antipsychotics Trials of Intervention Effectiveness trial. Schizophrenia PRS was constructed using the Psychiatric Genomics Consortium (PGC) leave-one-out GWAS as the discovery data set. For candidate regions, we selected 105 schizophrenia-associated autosomal loci and 11 schizophrenia-related oligodendrocyte genes. We used regressions to examine PRS associations and set-based testing for candidate analysis. We examined data from 730 subjects. Best-fit PRS at a p-threshold of 1e-07 was associated with total insight (R² = 0.005, P=0.05, empirical P=0.054) and treatment insight (R² = 0.005, P=0.048, empirical P=0.048). For models that controlled for neurocognition, PRS significantly predicted treatment insight only at higher p-thresholds (0.1 to 0.5), and the association did not survive correction. Patients with the highest polygenic burden had 5.9 times increased risk for poor insight compared to patients with the lowest burden. PRS explained 3.2% (P=0.002, empirical P=0.011) of variance in poor insight. Set-based analyses identified two variants associated with poor insight: rs320703, an intergenic variant (within-set P=6e-04, FDR P=0.046), and rs1479165 in SOX2-OT (within-set P=9e-04, FDR P=0.046). To the best of our knowledge, this is the first study examining the genetic basis of insight. We provide evidence for genetic contributions to impaired insight. Relevance of findings and necessity for replication are discussed. Copyright © 2017 Elsevier B.V. All rights reserved.
The Genetic Basis for Variation in Sensitivity to Lead Toxicity in Drosophila melanogaster
Zhou, Shanshan; Morozova, Tatiana V.; Hussain, Yasmeen N.; Luoma, Sarah E.; McCoy, Lenovia; Yamamoto, Akihiko; Mackay, Trudy F.C.; Anholt, Robert R.H.
2016-01-01
Background: Lead toxicity presents a worldwide health problem, especially due to its adverse effects on cognitive development in children. However, identifying genes that give rise to individual variation in susceptibility to lead toxicity is challenging in human populations. Objectives: Our goal was to use Drosophila melanogaster to identify evolutionarily conserved candidate genes associated with individual variation in susceptibility to lead exposure. Methods: To identify candidate genes associated with variation in susceptibility to lead toxicity, we measured effects of lead exposure on development time, viability and adult activity in the Drosophila melanogaster Genetic Reference Panel (DGRP) and performed genome-wide association analyses to identify candidate genes. We used mutants to assess functional causality of candidate genes and constructed a genetic network associated with variation in sensitivity to lead exposure, on which we could superimpose human orthologs. Results: We found substantial heritabilities for all three traits and identified candidate genes associated with variation in susceptibility to lead exposure for each phenotype. The genetic architectures that determine variation in sensitivity to lead exposure are highly polygenic. Gene ontology and network analyses showed enrichment of genes associated with early development and function of the nervous system. Conclusions: Drosophila melanogaster presents an advantageous model to study the genetic underpinnings of variation in susceptibility to lead toxicity. Evolutionary conservation of cellular pathways that respond to toxic exposure allows predictions regarding orthologous genes and pathways across phyla. Thus, studies in the D. melanogaster model system can identify candidate susceptibility genes to guide subsequent studies in human populations. Citation: Zhou S, Morozova TV, Hussain YN, Luoma SE, McCoy L, Yamamoto A, Mackay TF, Anholt RR. 2016. 
The genetic basis for variation in sensitivity to lead toxicity in Drosophila melanogaster. Environ Health Perspect 124:1062–1070; http://dx.doi.org/10.1289/ehp.1510513 PMID:26859824
Xia, Chongjing; Wang, Meinan; Cornejo, Omar E; Jiwan, Derick A; See, Deven R; Chen, Xianming
2017-01-01
Stripe (yellow) rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most destructive diseases of wheat worldwide. Planting resistant cultivars is an effective way to control this disease, but race-specific resistance can be overcome quickly due to the rapidly evolving Pst population. Studying the pathogenicity mechanisms is critical for understanding how Pst virulence changes and how to develop wheat cultivars with durable resistance to stripe rust. We re-sequenced 7 Pst isolates and included 7 additional previously sequenced isolates to represent balanced virulence/avirulence profiles for several avirulence loci in secretome analyses. We observed an uneven distribution of heterozygosity among the isolates. Secretome comparison of Pst with other rust fungi identified a large portion of species-specific secreted proteins, suggesting that they may have specific roles when interacting with the wheat host. Thirty-two effectors of Pst were identified from its secretome. We identified candidates for Avr genes corresponding to six Yr genes by correlating polymorphisms for effector genes to the virulence/avirulence profiles of the 14 Pst isolates. The putative AvYr76 was present in the avirulent isolates, but absent in the virulent isolates, suggesting that deleting the coding region of the candidate avirulence gene has produced races virulent to resistance gene Yr76. We conclude that incorporating avirulence/virulence phenotypes into correlation analysis with variations in genomic structure and secretome, particularly presence/absence polymorphisms of effectors, is an efficient way to identify candidate Avr genes in Pst. The candidate effector genes provide a rich resource for further studies to determine the evolutionary history of Pst populations and the co-evolutionary arms race between Pst and wheat.
The Avr candidates identified in this study will lead to cloning avirulence genes in Pst, which will enable us to understand molecular mechanisms underlying Pst-wheat interactions, to determine the effectiveness of resistance genes and further to develop durable resistance to stripe rust.
Tsai, Pei-Chien; Breen, Matthew
2012-09-01
To identify suitable reference genes for normalization of real-time quantitative PCR (RT-qPCR) assay data for common tumors of dogs. Malignant lymph node (n = 8), appendicular osteosarcoma (9), and histiocytic sarcoma (12) samples and control samples of various nonneoplastic canine tissues. Array-based comparative genomic hybridization (aCGH) data were used to guide selection of 9 candidate reference genes. Expression stability of candidate reference genes and 4 commonly used reference genes was determined for tumor samples with RT-qPCR assays and 3 software programs. LOC611555 was the candidate reference gene with the highest expression stability among the 3 tumor types. Of the commonly used reference genes, expression stability of HPRT was high in histiocytic sarcoma samples, and expression stability of Ubi and RPL32 was high in osteosarcoma samples. Some of the candidate reference genes had higher expression stability than did the commonly used reference genes. Data for constitutively expressed genes with high expression stability are required for normalization of RT-qPCR assay results. Without such data, accurate quantification of gene expression in tumor tissue samples is difficult. Results of the present study indicated LOC611555 may be a useful RT-qPCR assay reference gene for multiple tissue types. Some commonly used reference genes may be suitable for normalization of gene expression data for tumors of dogs, such as lymphomas, osteosarcomas, or histiocytic sarcomas.
2011-01-01
Background Internal control genes with highly uniform expression throughout the experimental conditions are required for accurate gene expression analysis, as no universal reference gene exists. In this study, the expression stability of 24 candidate genes from Triticum aestivum cv. Cubus flag leaves grown under organic and conventional farming systems was evaluated in two locations in order to select suitable genes that can be used for normalization of real-time quantitative reverse-transcription PCR (RT-qPCR) reactions. The genes were selected from among the most commonly used reference genes as well as genes encoding proteins involved in several metabolic pathways. Findings Individual genes displayed different expression rates across all samples assayed. Applying geNorm, a set of three potential reference genes was found suitable for normalization of RT-qPCR reactions in winter wheat flag leaves cv. Cubus: TaFNRII (ferredoxin-NADP(H) oxidoreductase; AJ457980.1), ACT2 (actin 2; TC234027), and rrn26 (a putative homologue to RNA 26S gene; AL827977.1). In addition to these three genes, which were also top-ranked by NormFinder, two extra genes, CYP18-2 (Cyclophilin A, AY456122.1) and TaWIN1 (14-3-3 like protein, AB042193), were most consistently stably expressed. Furthermore, we showed that TaFNRII, ACT2, and CYP18-2 are suitable for gene expression normalization in two other winter wheat varieties (Tommi and Centenaire) grown under three treatments (organic, conventional and no nitrogen) and in a different environment than the one tested with cv. Cubus. Conclusions This study provides a new set of reference genes which should improve the accuracy of gene expression analyses when using wheat flag leaves, such as those related to the improvement of nitrogen use efficiency for cereal production. PMID:21951810
Performance of Polygenic Scores for Predicting Phobic Anxiety
Walter, Stefan; Glymour, M. Maria; Koenen, Karestan; Liang, Liming; Tchetgen Tchetgen, Eric J.; Cornelis, Marilyn; Chang, Shun-Chiao; Rimm, Eric; Kawachi, Ichiro; Kubzansky, Laura D.
2013-01-01
Context Anxiety disorders are common, with a lifetime prevalence of 20% in the U.S., and are responsible for substantial burdens of disability, missed work days and health care utilization. To date, no causal genetic variants have been identified for anxiety, anxiety disorders, or related traits. Objective To investigate whether a phobic anxiety symptom score was associated with 3 alternative polygenic risk scores, derived from external genome-wide association studies of anxiety, an internally estimated agnostic polygenic score, or previously identified candidate genes. Design Longitudinal follow-up study. Using linear and logistic regression we investigated whether phobic anxiety was associated with polygenic risk scores derived from internal, leave-one out genome-wide association studies, from 31 candidate genes, and from out-of-sample genome-wide association weights previously shown to predict depression and anxiety in another cohort. Setting and Participants Study participants (n = 11,127) were individuals from the Nurses' Health Study and Health Professionals Follow-up Study. Main Outcome Measure Anxiety symptoms were assessed via the 8-item phobic anxiety scale of the Crown Crisp Index at two time points, from which a continuous phenotype score was derived. Results We found no genome-wide significant associations with phobic anxiety. Phobic anxiety was also not associated with a polygenic risk score derived from the genome-wide association study beta weights using liberal p-value thresholds; with a previously published genome-wide polygenic score; or with a candidate gene risk score based on 31 genes previously hypothesized to predict anxiety. Conclusion There is a substantial gap between twin-study heritability estimates of anxiety disorders ranging between 20–40% and heritability explained by genome-wide association results. 
New approaches such as improved genome imputations, application of gene expression and biological pathways information, and incorporating social or environmental modifiers of genetic risks may be necessary to identify significant genetic predictors of anxiety. PMID:24278274
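For readers unfamiliar with how a polygenic or candidate-gene risk score is formed, a minimal sketch follows. It assumes the common construction of a weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP) using per-SNP effect weights. The SNP identifiers, weights and function name are hypothetical, and the abstract does not describe the exact scoring procedure used.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages.

    dosages: mapping of SNP id -> allele count (0, 1 or 2) for one person.
    weights: mapping of SNP id -> effect weight (for example a GWAS beta).
    SNPs missing from either mapping are ignored.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical weights and genotypes, for illustration only.
example_weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
example_person = {"snp_a": 2, "snp_b": 1, "snp_c": 0, "snp_d": 1}
score = polygenic_score(example_person, example_weights)
print(round(score, 3))
```

The resulting score per participant would then enter a linear or logistic regression as the predictor, as described in the Design section of the abstract.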
Genetic Marker Discovery in Complex Traits: A Field Example on Fat Content and Composition in Pigs.
Pena, Ramona Natacha; Ros-Freixedes, Roger; Tor, Marc; Estany, Joan
2016-12-14
Among the large number of attributes that define pork quality, fat content and composition have attracted the attention of breeders in the recent years due to their interaction with human health and technological and sensorial properties of meat. In livestock species, fat accumulates in different depots following a temporal pattern that is also recognized in humans. Intramuscular fat deposition rate and fatty acid composition change with life. Despite indication that it might be possible to select for intramuscular fat without affecting other fat depots, to date only one depot-specific genetic marker ( PCK1 c.2456C>A) has been reported. In contrast, identification of polymorphisms related to fat composition has been more successful. For instance, our group has described a variant in the stearoyl-coA desaturase ( SCD ) gene that improves the desaturation index of fat without affecting overall fatness or growth. Identification of mutations in candidate genes can be a tedious and costly process. Genome-wide association studies can help in narrowing down the number of candidate genes by highlighting those which contribute most to the genetic variation of the trait. Results from our group and others indicate that fat content and composition are highly polygenic and that very few genes explain more than 5% of the variance of the trait. Moreover, as the complexity of the genome emerges, the role of non-coding genes and regulatory elements cannot be disregarded. Prediction of breeding values from genomic data is discussed in comparison with conventional best linear predictors of breeding values. An example based on real data is given, and the implications in phenotype prediction are discussed in detail. The benefits and limitations of using large SNP sets versus a few very informative markers as predictors of genetic merit of breeding candidates are evaluated using field data as an example.
In Silico Detection of Sequence Variations Modifying Transcriptional Regulation
Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob
2008-01-01
Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319
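The binding site prediction step described above can be illustrated with a minimal sketch that scores the two alleles of a SNP against a position weight matrix. This is not the RAVEN implementation: the count matrix, the pseudocount, the uniform background and the function names are all assumptions made for illustration, and the phylogenetic footprinting step is not covered.

```python
import math

# Hypothetical count matrix for a 4-bp motif (consensus TGAC); not a real factor.
COUNTS = [
    {"A": 1, "C": 1, "G": 0, "T": 8},
    {"A": 0, "C": 1, "G": 9, "T": 0},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 8, "G": 1, "T": 0},
]
BACKGROUND = 0.25   # assumed uniform base frequencies
PSEUDOCOUNT = 0.5   # assumed smoothing value


def log_odds_matrix(counts):
    """Convert per-position base counts into log2 odds against the background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        matrix.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return matrix


def best_score(seq, pwm, snp_index):
    """Best motif score over all windows that overlap the SNP position."""
    width = len(pwm)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    best = float("-inf")
    for start in range(first, last + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        best = max(best, score)
    return best


def allele_effect(seq, snp_index, alt_base, pwm):
    """Return (reference score, alternative score) for a single-base change."""
    alt_seq = seq[:snp_index] + alt_base + seq[snp_index + 1:]
    return best_score(seq, pwm, snp_index), best_score(alt_seq, pwm, snp_index)


PWM = log_odds_matrix(COUNTS)
# Toy sequence: the reference allele A at index 4 completes the TGAC motif.
ref_score, alt_score = allele_effect("CCTGACGG", 4, "T", PWM)
print(f"reference: {ref_score:.2f}, alternative: {alt_score:.2f}")
```

A large drop from the reference to the alternative score would flag the SNP as a candidate regulatory variant for that factor. As the abstract notes, such scores alone have poor specificity without prior knowledge of which factor regulates the gene.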
Patterns of Piscirickettsia salmonis load in susceptible and resistant families of Salmo salar.
Dettleff, Phillip; Bravo, Cristian; Patel, Alok; Martinez, Victor
2015-07-01
The pathogen Piscirickettsia salmonis produces a systemic aggressive infection that involves several organs and tissues in salmonids. In spite of the great economic losses caused by this pathogen in the Atlantic salmon (Salmo salar) industry, very little is known about the resistance mechanisms of the host to this pathogen. In this paper, for the first time, we aimed to identify the bacterial load in head kidney and muscle of Atlantic salmon exhibiting differential familial mortality. Furthermore, in order to assess the patterns of gene expression of immune related genes in susceptible and resistant families, a set of candidate genes was evaluated using deep sequencing of the transcriptome. The results showed that the bacterial load was significantly lower in resistant fish than in susceptible individuals. Based on the candidate genes analysis, we infer that the resistant hosts triggered up-regulation of specific genes (such as LysC), which may explain a decrease in the bacterial load in head kidney, while the susceptible fish presented an exacerbated innate response, which is unable to exert an effective response against the bacteria. Interestingly, we found a higher bacterial load in muscle when compared with head kidney. We argue that this is possibly due to the availability of an additional source of iron in muscle. In addition, the results show that the resistant fish are unlikely to be a reservoir of the bacteria.
Mixture models for detecting differentially expressed genes in microarrays.
Jones, Liat Ben-Tovim; Bean, Richard; McLachlan, Geoffrey J; Zhu, Justin Xi
2006-10-01
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
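The local FDR mentioned above can be written down compactly for a two-component mixture. The sketch below assumes a standard normal null density for gene-level z-scores, a normal alternative component, and fixed illustrative values for the prior probability pi0 and the alternative parameters; in the mixture model approach these quantities are estimated from the data, and the threshold of 0.2 is likewise only an assumption.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, alt_mean, alt_sd):
    """Posterior probability that a gene with score z is NOT differentially expressed.

    local_fdr(z) = pi0 * f0(z) / (pi0 * f0(z) + (1 - pi0) * f1(z)),
    with f0 the N(0, 1) null density and f1 the alternative density.
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


def select_genes(z_scores, pi0, alt_mean, alt_sd, threshold=0.2):
    """Indices of genes whose local FDR is at or below the threshold."""
    return [
        i for i, z in enumerate(z_scores)
        if local_fdr(z, pi0, alt_mean, alt_sd) <= threshold
    ]


# Illustrative values only: 90% of genes assumed null, alternative N(3, 1).
z_scores = [0.1, 1.0, 3.5, 4.2]
selected = select_genes(z_scores, pi0=0.9, alt_mean=3.0, alt_sd=1.0)
print(selected)
```

The decision rule here thresholds the per-gene posterior probability of being null, which is how an estimate of pi0 enters the rule directly.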
Uddenberg, Daniel; Reimegård, Johan; Clapham, David; Almqvist, Curt; von Arnold, Sara; Emanuelsson, Olof; Sundström, Jens F.
2013-01-01
Conifers normally go through a long juvenile period, for Norway spruce (Picea abies) around 20 to 25 years, before developing male and female cones. We have grown plants from inbred crosses of a naturally occurring spruce mutant (acrocona). One-fourth of the segregating acrocona plants initiate cones already in their second growth cycle, suggesting control by a single locus. The early cone-setting properties of the acrocona mutant were utilized to identify candidate genes involved in vegetative-to-reproductive phase change in Norway spruce. Poly(A+) RNA samples from apical and basal shoots of cone-setting and non-cone-setting plants were subjected to high-throughput sequencing (RNA-seq). We assembled and investigated 33,383 expressed putative protein-coding acrocona transcripts. Eight transcripts were differentially expressed between selected sample pairs. One of these (Acr42124_1) was significantly up-regulated in apical shoot samples from cone-setting acrocona plants, and the encoded protein belongs to the MADS box gene family of transcription factors. Using quantitative real-time polymerase chain reaction with independently derived plant material, we confirmed that the MADS box gene is up-regulated in both needles and buds of cone-inducing shoots when reproductive identity is determined. Our results constitute important steps for the development of a rapid cycling model system that can be used to study gene function in conifers. In addition, our data suggest the involvement of a MADS box transcription factor in the vegetative-to-reproductive phase change in Norway spruce. PMID:23221834
Regulation of behaviorally associated gene networks in worker honey bee ovaries
Wang, Ying; Kocher, Sarah D.; Linksvayer, Timothy A.; Grozinger, Christina M.; Page, Robert E.; Amdam, Gro V.
2012-01-01
SUMMARY Several lines of evidence support genetic links between ovary size and division of labor in worker honey bees. However, it is largely unknown how ovaries influence behavior. To address this question, we first performed transcriptional profiling on worker ovaries from two genotypes that differ in social behavior and ovary size. Then, we contrasted the differentially expressed ovarian genes with six sets of available brain transcriptomes. Finally, we probed behavior-related candidate gene networks in wild-type ovaries of different sizes. We found differential expression in 2151 ovarian transcripts in these artificially selected honey bee strains, corresponding to approximately 20.3% of the predicted gene set of honey bees. Differences in gene expression overlapped significantly with changes in the brain transcriptomes. Differentially expressed genes were associated with neural signal transmission (tyramine receptor, TYR) and ecdysteroid signaling; two independently tested nuclear hormone receptors (HR46 and ftz-f1) were also significantly correlated with ovary size in wild-type bees. We suggest that the correspondence between ovary and brain transcriptomes identified here indicates systemic regulatory networks among hormones (juvenile hormone and ecdysteroids), pheromones (queen mandibular pheromone), reproductive organs and nervous tissues in worker honey bees. Furthermore, robust correlations between ovary size and neural and endocrine response genes are consistent with the hypothesized roles of the ovaries in honey bee behavioral regulation. PMID:22162860
Mutational Landscape of Candidate Genes in Familial Prostate Cancer
Johnson, Anna M.; Zuhlke, Kimberly A.; Plotts, Chris; McDonnell, Shannon K.; Middha, Sumit; Riska, Shaun M.; Thibodeau, Stephen N.; Douglas, Julie A.; Cooney, Kathleen A.
2014-01-01
Background Family history is a major risk factor for prostate cancer (PCa), suggesting a genetic component to the disease. However, traditional linkage and association studies have failed to fully elucidate the underlying genetic basis of familial PCa. Methods Here we use a candidate gene approach to identify potential PCa susceptibility variants in whole exome sequencing data from familial PCa cases. Six hundred ninety-seven candidate genes were identified based on function, location near a known chromosome 17 linkage signal, and/or previous association with prostate or other cancers. Single nucleotide variants (SNVs) in these candidate genes were identified in whole exome sequence data from 33 PCa cases from 11 multiplex PCa families (3 cases/family). Results Overall, 4856 candidate gene SNVs were identified, including 1052 missense and 10 nonsense variants. Twenty missense variants were shared by all 3 family members in each family in which they were observed. Additionally, 15 missense variants were shared by 2 of 3 family members and predicted to be deleterious by 5 different algorithms. Four missense variants, BLM Gln123Arg, PARP2 Arg283Gln, LRCC46 Ala295Thr and KIF2B Pro91Leu, and 1 nonsense variant, CYP3A43 Arg441Ter, showed complete co-segregation with PCa status. Twelve additional variants displayed partial co-segregation with PCa. Conclusions Forty-three nonsense and shared, missense variants were identified in our candidate genes. Further research is needed to determine the contribution of these variants to PCa susceptibility. PMID:25111073
Candidate genes for idiopathic epilepsy in four dog breeds.
Ekenstedt, Kari J; Patterson, Edward E; Minor, Katie M; Mickelson, James R
2011-04-25
Idiopathic epilepsy (IE) is a naturally occurring and significant seizure disorder affecting all dog breeds. Because dog breeds are genetically isolated populations, it is possible that IE is attributable to common founders and is genetically homogeneous within breeds. In humans, a number of mutations, the majority of which are in genes encoding ion channels, neurotransmitters, or their regulatory subunits, have been discovered to cause rare, specific types of IE. It was hypothesized that there are simple genetic bases for IE in some purebred dog breeds, specifically in Vizslas, English Springer Spaniels (ESS), Greater Swiss Mountain Dogs (GSMD), and Beagles, and that the gene(s) responsible may, in some cases, be the same as those already discovered in humans. Candidate genes known to be involved in human epilepsy, along with selected additional genes in the same gene families that are involved in murine epilepsy or are expressed in neural tissue, were examined in populations of affected and unaffected dogs. Microsatellite markers in close proximity to each candidate gene were genotyped and subjected to two-point linkage in Vizslas, and association analysis in ESS, GSMD and Beagles. Most of these candidate genes were not significantly associated with IE in these four dog breeds, while results for a few genes remained inconclusive. Other genes not included in this study may still be causing monogenic IE in these breeds or, like many cases of human IE, the disease in dogs may be polygenic.
Identification of candidate genes in osteoporosis by integrated microarray analysis.
Li, J J; Wang, B Q; Fei, Q; Yang, Y; Li, D
2016-12-01
In order to screen the altered gene expression profile in peripheral blood mononuclear cells of patients with osteoporosis, we performed an integrated analysis of the online microarray studies of osteoporosis. We searched the Gene Expression Omnibus (GEO) database for microarray studies of peripheral blood mononuclear cells in patients with osteoporosis. Subsequently, we integrated gene expression data sets from multiple microarray studies to obtain differentially expressed genes (DEGs) between patients with osteoporosis and normal controls. Gene function analysis was performed to uncover the functions of identified DEGs. A total of three microarray studies were selected for integrated analysis. In all, 1125 genes were found to be significantly differentially expressed between osteoporosis patients and normal controls, with 373 upregulated and 752 downregulated genes. Positive regulation of the cellular amino metabolic process (gene ontology (GO): 0033240, false discovery rate (FDR) = 1.00E + 00) was significantly enriched under the GO category for biological processes, while for molecular functions, flavin adenine dinucleotide binding (GO: 0050660, FDR = 3.66E-01) and androgen receptor binding (GO: 0050681, FDR = 6.35E-01) were significantly enriched. DEGs were enriched in many osteoporosis-related signalling pathways, including those of mitogen-activated protein kinase (MAPK) and calcium. Protein-protein interaction (PPI) network analysis showed that the significant hub proteins included ubiquitin specific peptidase 9, X-linked (Degree = 99), ubiquitin specific peptidase 19 (Degree = 57) and ubiquitin conjugating enzyme E2 B (Degree = 57). Analysis of gene function of identified differentially expressed genes may expand our understanding of fundamental mechanisms leading to osteoporosis. Moreover, significantly enriched pathways, such as MAPK and calcium, may be involved in osteoporosis through osteoblastic differentiation and bone formation. Cite this article: J. J. Li, B. Q. Wang, Q. Fei, Y. Yang, D. Li. Identification of candidate genes in osteoporosis by integrated microarray analysis. Bone Joint Res 2016;5:594-601. DOI: 10.1302/2046-3758.512.BJR-2016-0073.R1.
Genome expansion and gene loss in powdery mildew fungi reveal tradeoffs in extreme parasitism.
Spanu, Pietro D; Abbott, James C; Amselem, Joelle; Burgis, Timothy A; Soanes, Darren M; Stüber, Kurt; Ver Loren van Themaat, Emiel; Brown, James K M; Butcher, Sarah A; Gurr, Sarah J; Lebrun, Marc-Henri; Ridout, Christopher J; Schulze-Lefert, Paul; Talbot, Nicholas J; Ahmadinejad, Nahal; Ametz, Christian; Barton, Geraint R; Benjdia, Mariam; Bidzinski, Przemyslaw; Bindschedler, Laurence V; Both, Maike; Brewer, Marin T; Cadle-Davidson, Lance; Cadle-Davidson, Molly M; Collemare, Jerome; Cramer, Rainer; Frenkel, Omer; Godfrey, Dale; Harriman, James; Hoede, Claire; King, Brian C; Klages, Sven; Kleemann, Jochen; Knoll, Daniela; Koti, Prasanna S; Kreplak, Jonathan; López-Ruiz, Francisco J; Lu, Xunli; Maekawa, Takaki; Mahanil, Siraprapa; Micali, Cristina; Milgroom, Michael G; Montana, Giovanni; Noir, Sandra; O'Connell, Richard J; Oberhaensli, Simone; Parlange, Francis; Pedersen, Carsten; Quesneville, Hadi; Reinhardt, Richard; Rott, Matthias; Sacristán, Soledad; Schmidt, Sarah M; Schön, Moritz; Skamnioti, Pari; Sommer, Hans; Stephens, Amber; Takahara, Hiroyuki; Thordal-Christensen, Hans; Vigouroux, Marielle; Wessling, Ralf; Wicker, Thomas; Panstruga, Ralph
2010-12-10
Powdery mildews are phytopathogens whose growth and reproduction are entirely dependent on living plant cells. The molecular basis of this life-style, obligate biotrophy, remains unknown. We present the genome analysis of barley powdery mildew, Blumeria graminis f.sp. hordei (Blumeria), as well as a comparison with the analysis of two powdery mildews pathogenic on dicotyledonous plants. These genomes display massive retrotransposon proliferation, genome-size expansion, and gene losses. The missing genes encode enzymes of primary and secondary metabolism, carbohydrate-active enzymes, and transporters, probably reflecting their redundancy in an exclusively biotrophic life-style. Among the 248 candidate effectors of pathogenesis identified in the Blumeria genome, very few (less than 10) define a core set conserved in all three mildews, suggesting that most effectors represent species-specific adaptations.
Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.
Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J
2009-02-04
Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate over a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and another predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison with the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.
Language Impairments in ASD Resulting from a Failed Domestication of the Human Brain
Benítez-Burraco, Antonio; Lattanzi, Wanda; Murphy, Elliot
2016-01-01
Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders entailing social and cognitive deficits, including marked problems with language. Numerous genes have been associated with ASD, but it is unclear how language deficits arise from gene mutation or dysregulation. It is also unclear why ASD shows such high prevalence within human populations. Interestingly, the emergence of a modern faculty of language has been hypothesized to be linked to changes in the human brain/skull, but also to the process of self-domestication of the human species. It is our intention to show that people with ASD exhibit less marked domesticated traits at the morphological, physiological, and behavioral levels. We also discuss many ASD candidates represented among the genes known to be involved in the “domestication syndrome” (the constellation of traits exhibited by domesticated mammals, which seemingly results from the hypofunction of the neural crest) and among the set of genes involved in language function closely connected to them. Moreover, many of these genes show altered expression profiles in the brain of autists. In addition, some candidates for domestication and language-readiness show the same expression profile in people with ASD and chimps in different brain areas involved in language processing. Similarities regarding the brain oscillatory behavior of these areas can be expected too. We conclude that ASD may represent an abnormal ontogenetic itinerary for the human faculty of language resulting in part from changes in genes important for the “domestication syndrome” and, ultimately, from the normal functioning of the neural crest. PMID:27621700
Tavtigian, Sean V; Byrnes, Graham B; Goldgar, David E; Thomas, Alun
2008-11-01
Many individually rare missense substitutions are encountered during deep resequencing of candidate susceptibility genes and clinical mutation screening of known susceptibility genes. BRCA1 and BRCA2 are among the most resequenced of all genes, and clinical mutation screening of these genes provides an extensive data set for analysis of rare missense substitutions. Align-GVGD is a mathematically simple missense substitution analysis algorithm, based on the Grantham difference, which has already contributed to classification of missense substitutions in BRCA1, BRCA2, and CHEK2. However, the distribution of genetic risk as a function of Align-GVGD's output variables Grantham variation (GV) and Grantham deviation (GD) has not been well characterized. Here, we used data from the Myriad Genetic Laboratories database of nearly 70,000 full-sequence tests plus two risk estimates, one approximating the odds ratio and the other reflecting strength of selection, to display the distribution of risk in the GV-GD plane as a series of surfaces. We abstracted contours from the surfaces and used the contours to define a sequence of missense substitution grades ordered from greatest risk to least risk. The grades were validated internally using a third, personal and family history-based, measure of risk. The Align-GVGD grades defined here are applicable to both the genetic epidemiology problem of classifying rare missense substitutions observed in known susceptibility genes and the molecular epidemiology problem of analyzing rare missense substitutions observed during case-control mutation screening studies of candidate susceptibility genes. (c) 2008 Wiley-Liss, Inc.
Kim, Yong-June; Yoon, Hyung-Yoon; Kim, Seon-Kyu; Kim, Young-Won; Kim, Eun-Jung; Kim, Isaac Yi; Kim, Wun-Jae
2011-07-01
Abnormal DNA methylation is associated with many human cancers. The aim of the present study was to identify novel methylation markers in prostate cancer (PCa) by microarray analysis and to test whether these markers could discriminate normal and PCa cells. Microarray-based DNA methylation and gene expression profiling was carried out using a panel of PCa cell lines and a control normal prostate cell line. The methylation status of candidate genes in prostate cell lines was confirmed by real-time reverse transcriptase-PCR, bisulfite sequencing analysis, and treatment with a demethylation agent. DNA methylation and gene expression analysis in 203 human prostate specimens, including 106 PCa and 97 benign prostate hyperplasia (BPH), were carried out. Further validation using microarray gene expression data from the Gene Expression Omnibus (GEO) was carried out. Epidermal growth factor-containing fibulin-like extracellular matrix protein 1 (EFEMP1) was identified as a lead candidate methylation marker for PCa. The gene expression level of EFEMP1 was significantly higher in tissue samples from patients with BPH than in those with PCa (P < 0.001). The sensitivity and specificity of EFEMP1 methylation status in discriminating between PCa and BPH reached 95.3% (101 of 106) and 86.6% (84 of 97), respectively. From the GEO data set, we confirmed that the expression level of EFEMP1 was significantly different between PCa and BPH. Genome-wide characterization of DNA methylation profiles enabled the identification of EFEMP1 aberrant methylation patterns in PCa. EFEMP1 might be a useful indicator for the detection of PCa.
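The sensitivity and specificity quoted above follow directly from the reported counts, as this short arithmetic check shows. Only the four counts come from the abstract; the function names are illustrative.

```python
def sensitivity(true_positives, total_cases):
    """Fraction of PCa specimens correctly called by the marker."""
    return true_positives / total_cases


def specificity(true_negatives, total_controls):
    """Fraction of BPH specimens correctly called by the marker."""
    return true_negatives / total_controls


# Counts reported in the abstract: 101 of 106 PCa, 84 of 97 BPH.
sens = sensitivity(101, 106)
spec = specificity(84, 97)
print(f"sensitivity = {sens:.1%}")
print(f"specificity = {spec:.1%}")
```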
Kebede, Aida Z; Johnston, Anne; Schneiderman, Danielle; Bosnich, Whynn; Harris, Linda J
2018-02-09
Gibberella ear rot (GER) is one of the most economically important fungal diseases of maize in the temperate zone due to moldy grain contaminated with health threatening mycotoxins. To develop resistant genotypes and control the disease, understanding the host-pathogen interaction is essential. RNA-Seq-derived transcriptome profiles of fungal- and mock-inoculated developing kernel tissues of two maize inbred lines were used to identify differentially expressed transcripts and propose candidate genes mapping within GER resistance quantitative trait loci (QTL). A total of 1255 transcripts were significantly (P ≤ 0.05) up regulated due to fungal infection in both susceptible and resistant inbreds. A greater number of transcripts were up regulated in the former (1174) than the latter (497) and increased as the infection progressed from 1 to 2 days after inoculation. Focusing on differentially expressed genes located within QTL regions for GER resistance, we identified 81 genes involved in membrane transport, hormone regulation, cell wall modification, cell detoxification, and biosynthesis of pathogenesis related proteins and phytoalexins as candidate genes contributing to resistance. Applying droplet digital PCR, we validated the expression profiles of a subset of these candidate genes from QTL regions contributed by the resistant inbred on chromosomes 1, 2 and 9. By screening global gene expression profiles for differentially expressed genes mapping within resistance QTL regions, we have identified candidate genes for gibberella ear rot resistance on several maize chromosomes which could potentially lead to a better understanding of Fusarium resistance mechanisms.
Morton, Nicholas M.; Nelson, Yvonne B.; Michailidou, Zoi; Di Rollo, Emma M.; Ramage, Lynne; Hadoke, Patrick W. F.; Seckl, Jonathan R.; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J.; Dunbar, Donald R.
2011-01-01
Background Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. Results To enrich for adipose tissue obesity genes a ‘snap-shot’ pooled-sample transcriptome comparison of key fat depots and non adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity. Conclusions A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes contributing to obesity. PMID:21915269
Cannistraci, Carlo V; Ogorevc, Jernej; Zorc, Minja; Ravasi, Timothy; Dovc, Peter; Kunej, Tanja
2013-02-14
Cryptorchidism is the most frequent congenital disorder in male children; however, the genetic causes of cryptorchidism remain poorly investigated. Comparative integratomics combined with a systems biology approach was employed to elucidate genetic factors and molecular pathways underlying testis descent. Literature mining was performed to collect genomic loci associated with cryptorchidism in seven mammalian species. Information regarding the collected candidate genes was stored in a MySQL relational database. A genomic view of the loci was presented using the Flash GViewer web tool (http://gmod.org/wiki/Flashgviewer/). DAVID Bioinformatics Resources 6.7 was used for pathway enrichment analysis. Cytoscape plug-in PiNGO 1.11 was employed for protein-network-based prediction of novel candidate genes. Relevant protein-protein interactions were confirmed and visualized using the STRING database (version 9.0). The developed cryptorchidism gene atlas includes 217 candidate loci (genes, regions involved in chromosomal mutations, and copy number variations) identified at the genomic, transcriptomic, and proteomic level. Human orthologs of the collected candidate loci were presented using a genomic map viewer. The cryptorchidism gene atlas is freely available online: http://www.integratomics-time.com/cryptorchidism/. Pathway analysis suggested the presence of twelve enriched pathways associated with the list of 179 literature-derived candidate genes. Additionally, a list of 43 network-predicted novel candidate genes was significantly associated with four enriched pathways. Joint pathway analysis of the collected and predicted candidate genes revealed the pivotal importance of the muscle-contraction pathway in cryptorchidism and evidence for genomic associations with cardiomyopathy pathways in RASopathies. The developed gene atlas represents an important resource for the scientific community researching genetics of cryptorchidism. The collected data will further facilitate development of novel genetic markers and could be of interest for functional studies in animals and humans. The proposed network-based systems biology approach elucidates molecular mechanisms underlying co-presence of cryptorchidism and cardiomyopathy in RASopathies. Such an approach could also aid in molecular explanation of co-presence of diverse and apparently unrelated clinical manifestations in other syndromes.
Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys
Werner, Jeffrey J; Koren, Omry; Hugenholtz, Philip; DeSantis, Todd Z; Walters, William A; Caporaso, J Gregory; Angenent, Largus T; Knight, Rob; Ley, Ruth E
2012-01-01
Taxonomic classification of the thousands–millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms has been explored in detail, the influence of the training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets (generated from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, which we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, with the greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases. PMID:21716311
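To illustrate why the training set matters for a naïve Bayesian classifier, the following is a minimal sketch of word-based (k-mer) classification. It is not the RDP implementation: the word size `k=4`, the fixed smoothing constants, the taxon labels and the toy sequences are all assumptions made for illustration (the RDP classifier is generally described as using 8-base words and bootstrap confidence estimates, which are omitted here).

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=4):
    """Return the set of overlapping k-base words found in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(references, k=4):
    """Count, per taxon, how many training sequences contain each word.

    references: iterable of (taxon, sequence) pairs (hypothetical toy data).
    """
    word_counts = defaultdict(Counter)
    n_seqs = Counter()
    for taxon, seq in references:
        n_seqs[taxon] += 1
        word_counts[taxon].update(kmers(seq, k))
    return word_counts, n_seqs


def classify(query, word_counts, n_seqs, k=4):
    """Return (taxon, log-score) for the taxon that best explains the query words."""
    best_taxon, best_score = None, -math.inf
    for taxon, n in n_seqs.items():
        # Smoothed probability that a word occurs in a sequence of this taxon.
        score = sum(
            math.log((word_counts[taxon][w] + 0.5) / (n + 1))
            for w in kmers(query, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon, best_score


# Hypothetical training set: two taxa with two reference sequences each.
toy_references = [
    ("TaxonA", "ACGTACGTACGTACGT"),
    ("TaxonA", "ACGTACGTTCGTACGT"),
    ("TaxonB", "GGCCTTAAGGCCTTAA"),
    ("TaxonB", "GGCCTTAAGGCATTAA"),
]
toy_counts, toy_n = train(toy_references)
toy_taxon, toy_score = classify("ACGTACGTAC", toy_counts, toy_n)
print(toy_taxon)
```

Because every score is computed from words seen in the references, a query from a lineage absent from the training set can only be assigned to the nearest represented taxon, which is consistent with the abstract's emphasis on large, diverse training sets.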
Barbesino, G; Tomer, Y; Concepcion, E S; Davies, T F; Greenberg, D A
1998-09-01
Hashimoto's thyroiditis (HT) and Graves' disease (GD) are autoimmune thyroid diseases (AITD) in which multiple genetic factors are suspected to play an important role. Until now, only a few minor risk factors for these diseases have been identified. Susceptibility seems to be stronger in women, pointing toward a possible role for genes related to sex steroid action or mechanisms related to genes on the X-chromosome. We have studied a total of 45 multiplex families, each containing at least 2 members affected with either GD (55 patients) or HT (72 patients), and used linkage analysis to target as candidate susceptibility loci genes involved in estrogen activity, such as the estrogen receptor alpha and beta and the aromatase genes. We then screened the entire X-chromosome using a set of polymorphic microsatellite markers spanning the whole chromosome. We found a region of the X-chromosome (Xq21.33-22) giving positive logarithm of odds (LOD) scores and then reanalyzed this area with dense markers in a multipoint analysis. Our results excluded linkage to the estrogen receptor alpha and aromatase genes when either the patients with GD only, those with HT only, or those with any AITD were considered as affected. Linkage to the estrogen receptor beta could not be totally ruled out, partly due to incomplete mapping information for the gene itself at this time. The X-chromosome data revealed consistently positive LOD scores (maximum of 1.88 for marker DXS8020 and GD patients) when either definition of affectedness was considered. Analysis of the family data using a multipoint analysis with eight closely linked markers generated LOD scores suggestive of linkage to GD in a chromosomal area (Xq21.33-22) extending for about 6 cM and encompassing four markers. The maximum LOD score (2.5) occurred at DXS8020. In conclusion, we ruled out a major role for estrogen receptor alpha and the aromatase genes in the genetic predisposition to AITD. 
Estrogen receptor beta remains a candidate locus. We found a locus on Xq21.33-22 linked to GD that may help to explain the female predisposition to GD. Confirmation of these data in HT may require study of an extended number of families because of possible heterogeneity.
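As a reminder of what the LOD scores above measure, here is a minimal sketch of a two-point LOD calculation for the simplest textbook case (phase-known, fully informative meioses). The study itself used family-based two-point and multipoint analyses, which are considerably more involved; the function name and the counts below are hypothetical and not taken from the paper.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """log10 likelihood ratio of linkage at recombination fraction theta vs. no linkage (0.5).

    Assumes phase-known, fully informative meioses (a simplifying assumption).
    """
    nonrecombinants = meioses - recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** nonrecombinants)
    likelihood_unlinked = 0.5 ** meioses
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical counts: 1 recombinant observed in 10 informative meioses.
for theta in (0.05, 0.1, 0.2, 0.5):
    print(theta, round(two_point_lod(1, 10, theta), 2))
```

A LOD of 3 corresponds to odds of 1000:1 in favor of linkage, which is why maxima of 1.88 and 2.5 are described as suggestive rather than conclusive.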
2010-01-01
Introduction: Various multigene predictors of breast cancer clinical outcome have been commercialized, but proved to be prognostic only for hormone receptor (HR) subsets overexpressing estrogen or progesterone receptors. Hormone receptor negative (HRneg) breast cancers, particularly those lacking HER2/ErbB2 overexpression and known as triple-negative (Tneg) cases, are heterogeneous and generally aggressive breast cancer subsets in need of prognostic subclassification, since most early stage HRneg and Tneg breast cancer patients are cured with conservative treatment yet invariably receive aggressive adjuvant chemotherapy. Methods: An unbiased search for genes predictive of distant metastatic relapse was undertaken using a training cohort of 199 node-negative, adjuvant treatment naïve HRneg (including 154 Tneg) breast cancer cases curated from three public microarray datasets. Prognostic gene candidates were subsequently validated using a different cohort of 75 node-negative, adjuvant naïve HRneg cases curated from three additional datasets. The HRneg/Tneg gene signature was prognostically compared with eight other previously reported gene signatures, and evaluated for cancer network associations by two commercial pathway analysis programs. Results: A novel set of 14 prognostic gene candidates was identified as outcome predictors: CXCL13, CLIC5, RGS4, RPS28, RFX7, EXOC7, HAPLN1, ZNF3, SSX3, HRBL, PRRG3, ABO, PRTN3, MATN1. A composite HRneg/Tneg gene signature index proved more accurate than any individual candidate gene or other reported multigene predictors in identifying cases likely to remain free of metastatic relapse. Significant positive correlations between the HRneg/Tneg index and three independent immune-related signatures (STAT1, IFN, and IR) were observed, as were consistent negative associations between the three immune-related signatures and five other proliferation module-containing signatures (MS-14, ONCO-RS, GGI, CSR/wound and NKI-70). 
Network analysis identified 8 genes within the HRneg/Tneg signature as being functionally linked to immune/inflammatory chemokine regulation. Conclusions: A multigene HRneg/Tneg signature linked to immune/inflammatory cytokine regulation was identified from pooled expression microarray data and shown to be superior to other reported gene signatures in predicting the metastatic outcome of early stage and conservatively managed HRneg and Tneg breast cancer. Further validation of this prognostic signature may lead to new therapeutic insights and spare many newly diagnosed breast cancer patients the need for aggressive adjuvant chemotherapy. PMID:20946665
Ponsuwanna, Patrath; Kümpornsin, Krittikorn; Chookajorn, Thanat
2014-01-01
Even though antigenic variation is employed among parasitic protozoa for host immune evasion, Tetrahymena thermophila, a free-living ciliate, can also change its surface protein antigens. These cysteine-rich glycosylphosphatidylinositol (GPI)-linked surface proteins are encoded by a family of polymorphic Ser genes. Despite the availability of the T. thermophila genome, a comprehensive analysis of the Ser family is limited by its high degree of polymorphism. In order to overcome this problem, a new approach was adopted by searching for Ser candidates with common motif sequences, namely a length-specific repetitive cysteine pattern and a GPI anchor site. The candidate genes were phylogenetically compared with the previously identified Ser genes and classified into subtypes. Ser candidates were often found to be located as tandem arrays of the same subtypes on several chromosomal scaffolds. Certain Ser candidates located in the same chromosomal arrays were transcriptionally expressed at specific T. thermophila developmental stages. These Ser candidates selected by the motif analysis approach can form the foundation for a systematic identification of the entire Ser gene family, which will contribute to the understanding of their function and the basis of T. thermophila antigenic variation. PMID:25133747
Analysis of Craniocardiac Malformations in Xenopus using Optical Coherence Tomography
Deniz, Engin; Jonas, Stephan; Hooper, Michael; N. Griffin, John; Choma, Michael A.; Khokha, Mustafa K.
2017-01-01
Birth defects affect 3% of children in the United States. Among the birth defects, congenital heart disease and craniofacial malformations are major causes of mortality and morbidity. Unfortunately, the genetic mechanisms underlying craniocardiac malformations remain largely uncharacterized. To address this, human genomic studies are identifying sequence variations in patients, resulting in numerous candidate genes. However, the molecular mechanisms of pathogenesis for most candidate genes are unknown. Therefore, there is a need for functional analyses in rapid and efficient animal models of human disease. Here, we coupled the frog Xenopus tropicalis with Optical Coherence Tomography (OCT) to create a fast and efficient system for testing craniocardiac candidate genes. OCT can image cross-sections of microscopic structures in vivo at resolutions approaching histology. Here, we identify optimal OCT imaging planes to visualize and quantitate Xenopus heart and facial structures establishing normative data. Next we evaluate known human congenital heart diseases: cardiomyopathy and heterotaxy. Finally, we examine craniofacial defects by a known human teratogen, cyclopamine. We recapitulate human phenotypes readily and quantify the functional and structural defects. Using this approach, we can quickly test human craniocardiac candidate genes for phenocopy as a critical first step towards understanding disease mechanisms of the candidate genes. PMID:28195132
USDA-ARS's Scientific Manuscript database
Large-scale screens of the maize genome identified 48 genes that show the putative signature of artificial selection during maize domestication or improvement. These selection-candidate genes may act as quantitative trait loci (QTL) that control the phenotypic differences between maize and its proge...
Reddy, Dumbala Srinivas; Bhatnagar-Mathur, Pooja; Reddy, Palakolanu Sudhakar; Sri Cindhuri, Katamreddy; Sivaji Ganesh, Adusumalli; Sharma, Kiran Kumar
2016-01-01
Quantitative Real-Time PCR (qPCR) is a preferred and reliable method for accurate quantification of gene expression to understand precise gene functions. A total of 25 candidate reference genes including traditional and new generation reference genes were selected and evaluated in a diverse set of chickpea samples. The samples used in this study included nine chickpea genotypes (Cicer spp.) comprising cultivated and wild species, six abiotic stress treatments (drought, salinity, high vapor pressure deficit, abscisic acid, cold and heat shock), and five diverse tissues (leaf, root, flower, seedlings and seed). The geNorm, NormFinder and RefFinder algorithms used to identify stably expressed genes in four sample sets revealed stable expression of UCP and G6PD genes across genotypes, while TIP41 and CAC were highly stable under abiotic stress conditions. While PP2A and ABCT genes were ranked as best for different tissues, ABCT, UCP and CAC were most stable across all samples. This study demonstrated the usefulness of new generation reference genes for more accurate qPCR-based gene expression quantification in cultivated as well as wild chickpea species. Validation of the best reference genes was carried out by studying their impact on normalization of aquaporin genes PIP1;4 and TIP3;1, in three contrasting chickpea genotypes under high vapor pressure deficit (VPD) treatment. The chickpea TIP3;1 gene was significantly upregulated under high VPD conditions with higher relative expression in the drought-susceptible genotype, confirming the suitability of the selected reference genes for expression analysis. This is the first comprehensive study on the stability of the new generation reference genes for qPCR studies in chickpea across species, different tissues and abiotic stresses. PMID:26863232
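The role of reference genes in normalization can be made concrete with the widely used 2^-ΔΔCt calculation. The abstract does not state which quantification model the authors applied, so the sketch below is only one common approach; it assumes roughly 100% amplification efficiency, and the function names and Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Target Ct minus the mean Ct of the chosen reference genes."""
    return target_ct - mean(reference_cts)


def fold_change(treated_target, treated_refs, control_target, control_refs):
    """Relative expression (treated vs. control) by the 2^-ddCt method."""
    ddct = delta_ct(treated_target, treated_refs) - delta_ct(control_target, control_refs)
    return 2 ** (-ddct)


# Hypothetical Ct values: one target gene and two reference genes per condition.
example_fc = fold_change(
    treated_target=22.0, treated_refs=[18.0, 20.0],
    control_target=25.0, control_refs=[18.0, 20.0],
)
print(example_fc)
```

If a reference gene itself shifts under the treatment, the shift propagates directly into ΔCt and distorts the fold change, which is why stability ranking of candidate reference genes precedes expression analysis.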
Hassani-Pak, Keywan; Rawlings, Christopher
2017-06-13
Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.
Whole-Exome Sequencing Study of Thyrotropin-Secreting Pituitary Adenomas.
Sapkota, Santosh; Horiguchi, Kazuhiko; Tosaka, Masahiko; Yamada, Syozo; Yamada, Masanobu
2017-02-01
Thyrotropin (TSH)-secreting pituitary adenomas (TSHomas) are a rare cause of hyperthyroidism, and the genetic aberrations responsible remain unknown. To identify somatic genetic abnormalities in TSHomas. A single-nucleotide polymorphism (SNP) array analysis was performed on 8 TSHomas. Four tumors with no allelic losses or limited loss of heterozygosity were selected, and whole-exome sequencing was performed, including their corresponding blood samples. Somatic variants were confirmed by Sanger sequencing. A set of 8 tumors was also assessed to validate candidate genes. Twelve patients with sporadic TSHomas were examined. The overall performance of whole-exome sequencing was good, with an average coverage of each base in the targeted region of 97.6%. Six DNA variants were confirmed as candidate driver mutations, with an average of 1.5 somatic mutations per tumor. No mutations were recurrent. Two of these mutations were found in genes with an established role in malignant tumorigenesis (SMOX and SYTL3), and 4 had unknown roles (ZSCAN23, ASTN2, R3HDM2, and CWH43). Similarly, an SNP array analysis revealed frequent chromosomal regions of copy number gains, including recurrent gains at loci harboring 4 of these 6 genes. Several candidate somatic mutations and changes in copy numbers for TSHomas were identified. The results showed no recurrence of mutations in the tumors studied but a low number of mutations, thereby highlighting their benign nature. Further studies on a larger cohort of TSHomas, along with the use of epigenetic and transcriptomic approaches, may reveal the underlying genetic lesions. Copyright © 2017 by the Endocrine Society
Identifying positive selection candidate loci for high-altitude adaptation in Andean populations
2009-01-01
High-altitude environments (>2,500 m) provide scientists with a natural laboratory to study the physiological and genetic effects of low ambient oxygen tension on human populations. One approach to understanding how life at high altitude has affected human metabolism is to survey genome-wide datasets for signatures of natural selection. In this work, we report on a study to identify selection-nominated candidate genes involved in adaptation to hypoxia in one highland group, Andeans from the South American Altiplano. We analysed dense microarray genotype data using four test statistics that detect departures from neutrality. Using a candidate gene, single nucleotide polymorphism-based approach, we identified genes exhibiting preliminary evidence of recent genetic adaptation in this population. These included genes that are part of the hypoxia-inducible transcription factor (HIF) pathway, a biochemical pathway involved in oxygen homeostasis, as well as three other genomic regions previously not known to be associated with high-altitude phenotypes. In addition to identifying selection-nominated candidate genes, we also tested whether the HIF pathway shows evidence of natural selection. Our results indicate that the genes of this biochemical pathway as a group show no evidence of having evolved in response to hypoxia in Andeans. Results from particular HIF-targeted genes, however, suggest that genes in this pathway could play a role in Andean adaptation to high altitude, even if the pathway as a whole does not show higher relative rates of evolution. These data suggest a genetic role in high-altitude adaptation and provide a basis for genotype/phenotype association studies that are necessary to confirm the role of putative natural selection candidate genes and gene regions in adaptation to altitude. PMID:20038496
Walsh, Kyle M; Anderson, Erik; Hansen, Helen M; Decker, Paul A; Kosel, Matt L; Kollmeyer, Thomas; Rice, Terri; Zheng, Shichun; Xiao, Yuanyuan; Chang, Jeffrey S; McCoy, Lucie S; Bracci, Paige M; Wiemels, Joe L; Pico, Alexander R; Smirnov, Ivan; Lachance, Daniel H; Sicotte, Hugues; Eckel-Passow, Jeanette E; Wiencke, John K; Jenkins, Robert B; Wrensch, Margaret R
2013-02-01
Genomewide association studies (GWAS) and candidate-gene studies have implicated single-nucleotide polymorphisms (SNPs) in at least 45 different genes as putative glioma risk factors. Attempts to validate these associations have yielded variable results and few genetic risk factors have been consistently replicated. We conducted a case-control study of Caucasian glioma cases and controls from the University of California San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls) in an attempt to replicate previously reported genetic risk factors for glioma. Sixty SNPs selected from the literature (eight from GWAS and 52 from candidate-gene studies) were successfully genotyped on an Illumina custom genotyping panel. Eight SNPs in/near seven different genes (TERT, EGFR, CCDC26, CDKN2A, PHLDB1, RTEL1, TP53) were significantly associated with glioma risk in the combined dataset (P < 0.05), with all associations in the same direction as in previous reports. Several SNP associations showed considerable differences across histologic subtype. All eight successfully replicated associations were first identified by GWAS, whereas none of the putative risk SNPs from candidate-gene studies was associated in the full case-control sample (all P values > 0.05). Although several confirmed associations are located near genes long known to be involved in gliomagenesis (e.g., EGFR, CDKN2A, TP53), these associations were first discovered by the GWAS approach and are in noncoding regions. These results highlight that the deficiencies of the candidate-gene approach lie in selecting both appropriate genes and relevant SNPs within these genes. © 2012 WILEY PERIODICALS, INC.
Genomic features of bacterial adaptation to plants
Levy, Asaf; Gonzalez, Isai Salas; Mittelviefhaus, Maximilian; Clingenpeel, Scott; Paredes, Sur Herrera; Miao, Jiamin; Wang, Kunru; Devescovi, Giulia; Stillman, Kyra; Monteiro, Freddy; Alvarez, Bryan Rangel; Lundberg, Derek S.; Lu, Tse-Yuan; Lebeis, Sarah; Jin, Zhao; McDonald, Meredith; Klein, Andrew P.; Feltcher, Meghan E.; del Rio, Tijana Glavina; Grant, Sarah R.; Doty, Sharon L.; Ley, Ruth E.; Zhao, Bingyu; Venturi, Vittorio; Pelletier, Dale A.; Vorholt, Julia A.; Tringe, Susannah G.; Woyke, Tanja; Dangl, Jeffery L.
2017-01-01
Plants intimately associate with diverse bacteria. Plant-associated (PA) bacteria have ostensibly evolved genes enabling adaptation to the plant environment. However, the identities of such genes are mostly unknown and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3837 bacterial genomes to identify thousands of PA gene clusters. Genomes of PA bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant associated genomes. We experimentally validated candidates from two sets of PA genes, one involved in plant colonization, the other serving in microbe-microbe competition between PA bacteria. We also identified 64 PA protein domains that potentially mimic plant domains; some are shared with PA fungi and oomycetes. This work expands the genome-based understanding of plant-microbe interactions and provides leads for efficient and sustainable agriculture through microbiome engineering. PMID:29255260
Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Sinha, Pallavi; Kale, Sandip M; Parupalli, Swathi; Kumar, Vinay; Chitikineni, Annapurna; Vechalapu, Suryanarayana; Sameer Kumar, Chanda Venkata; Sharma, Mamta; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Muniswamy, Sonnappa; Varshney, Rajeev K
2017-07-01
Identification of candidate genomic regions associated with target traits using conventional mapping methods is challenging and time-consuming. In recent years, a number of single nucleotide polymorphism (SNP)-based mapping approaches have been developed and used for identification of candidate/putative genomic regions. However, in the majority of these studies, insertion-deletions (Indels) were largely ignored. For efficient use of Indels in mapping target traits, we propose the Indel-seq approach, which is a combination of whole-genome resequencing (WGRS) and bulked segregant analysis (BSA) and relies on the Indel frequencies in extreme bulks. Deployment of the Indel-seq approach for identification of candidate genomic regions associated with fusarium wilt (FW) and sterility mosaic disease (SMD) resistance in pigeonpea has identified 16 Indels affecting 26 putative candidate genes. Of these 26 affected putative candidate genes, 24 genes showed effect in the upstream/downstream of the genic region and two genes showed effect in the genes. Validation of these 16 candidate Indels in other FW- and SMD-resistant and FW- and SMD-susceptible genotypes revealed a significant association of five Indels (three for FW and two for SMD resistance). Comparative analysis of Indel-seq with other genetic mapping approaches highlighted the importance of the approach in identification of significant genomic regions associated with target traits. Therefore, the Indel-seq approach can be used for quick and precise identification of candidate genomic regions for any target traits in any crop species. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
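The idea of comparing Indel frequencies between extreme bulks can be sketched as follows. The abstract does not give the exact statistic or cut-offs, so the index definition (fraction of reads carrying the Indel allele), the threshold of 0.8, the site names and the read counts are all assumptions for illustration only.

```python
def indel_index(indel_reads, total_reads):
    """Fraction of reads at a site that carry the Indel allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return indel_reads / total_reads


def delta_indel_index(resistant_bulk, susceptible_bulk):
    """Difference in Indel index between the two extreme bulks.

    Each bulk is given as an (indel_reads, total_reads) pair.
    """
    return indel_index(*resistant_bulk) - indel_index(*susceptible_bulk)


# Hypothetical sites: (resistant bulk counts, susceptible bulk counts).
toy_sites = {
    "site_1": ((28, 30), (2, 30)),
    "site_2": ((15, 30), (14, 30)),
}
THRESHOLD = 0.8  # hypothetical cut-off, not taken from the study
toy_candidates = [
    name for name, (res, sus) in toy_sites.items()
    if abs(delta_indel_index(res, sus)) >= THRESHOLD
]
print(toy_candidates)
```

Sites where the two bulks carry nearly opposite alleles produce a large absolute difference and are retained as candidates, while sites with similar frequencies in both bulks are discarded.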
Esibizione, Diana; Cui, Chang-Yi; Schlessinger, David
2009-01-01
EDA, the gene mutated in anhidrotic ectodermal dysplasia, encodes ectodysplasin, a TNF superfamily member that activates NF-kB mediated transcription. To identify EDA target genes, we have earlier used expression profiling to infer genes differentially expressed at various developmental time points in Tabby (Eda-deficient) compared to wild-type mouse skin. To increase the resolution to find genes whose expression may be restricted to epidermal cells, we have now extended studies to primary keratinocyte cultures established from E19 wild-type and Tabby skin. Using microarrays bearing 44,000 gene probes, we found 385 preliminary candidate genes whose expression was significantly affected by Eda loss. By comparing expression profiles to those from Eda-A1 transgenic skin, we restricted the list to 38 “candidate EDA targets”, 14 of which were already known to be expressed in hair follicles or epidermis. We confirmed expression changes for 3 selected genes, Tbx1, Bmp7, and Jag1, both in keratinocytes and in whole skin, by Q-PCR and Western blotting analyses. Thus, by the analysis of keratinocytes, novel candidate pathways downstream of EDA were detected. PMID:18848976
Zhu, Bo; Zhang, Wenli; Jiang, Jiming
2015-01-01
Enhancers are important regulators of gene expression in eukaryotes. Enhancers function independently of their distance and orientation to the promoters of target genes. Thus, enhancers have been difficult to identify. Only a few enhancers, especially distant intergenic enhancers, have been identified in plants. We developed an enhancer prediction system based exclusively on the DNase I hypersensitive sites (DHSs) in the Arabidopsis thaliana genome. A set of 10,044 DHSs located in intergenic regions, which are away from any gene promoters, were predicted to be putative enhancers. We examined the functions of 14 predicted enhancers using the β-glucuronidase gene reporter. Ten of the 14 (71%) candidates were validated by the reporter assay. We also designed 10 constructs using intergenic sequences that are not associated with DHSs, and none of these constructs showed enhancer activities in reporter assays. In addition, the tissue specificity of the putative enhancers can be precisely predicted based on DNase I hypersensitivity data sets developed from different plant tissues. These results suggest that the open chromatin signature-based enhancer prediction system developed in Arabidopsis may serve as a universal system for enhancer identification in plants. PMID:26373455
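The contrast between the two reporter experiments (10 of 14 DHS-based candidates active versus 0 of 10 non-DHS constructs) can be checked with a one-sided Fisher's exact test. This calculation is not reported in the abstract; it is a sketch added here using only the counts stated above, and the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null."""
    row1 = a + b          # constructs in the first group
    col1 = a + c          # constructs showing activity overall
    n = a + b + c + d     # all constructs
    max_a = min(row1, col1)
    total = comb(n, col1)
    tail = sum(
        comb(row1, x) * comb(n - row1, col1 - x)
        for x in range(a, max_a + 1)
    )
    return tail / total


# Rows: DHS-based candidates, non-DHS controls; columns: active, inactive.
enhancer_p = fisher_one_sided(10, 4, 0, 10)
print(f"{enhancer_p:.2e}")
```

Under these counts the one-sided probability is about 5 in 10,000, so the difference between DHS-associated and non-DHS sequences is unlikely to be a chance result of the small sample.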
A comprehensive study of the genomic differentiation between temperate Dent and Flint maize.
Unterseer, Sandra; Pophaly, Saurabh D; Peis, Regina; Westermeier, Peter; Mayer, Manfred; Seidel, Michael A; Haberer, Georg; Mayer, Klaus F X; Ordas, Bernardo; Pausch, Hubert; Tellier, Aurélien; Bauer, Eva; Schön, Chris-Carolin
2016-07-08
Dent and Flint represent two major germplasm pools exploited in maize breeding. Several traits differentiate the two pools, like cold tolerance, early vigor, and flowering time. A comparative investigation of their genomic architecture relevant for quantitative trait expression has not been reported so far. Understanding the genomic differences between germplasm pools may contribute to a better understanding of the complementarity in heterotic patterns exploited in hybrid breeding and of mechanisms involved in adaptation to different environments. We perform whole-genome screens for signatures of selection specific to temperate Dent and Flint maize by comparing high-density genotyping data of 70 American and European Dent and 66 European Flint inbred lines. We find 2.2 % and 1.4 % of the genes are under selective pressure, respectively, and identify candidate genes associated with agronomic traits known to differ between the two pools. Taking flowering time as an example for the differentiation between Dent and Flint, we investigate candidate genes involved in the flowering network by phenotypic analyses in a Dent-Flint introgression library and find that the Flint haplotypes of the candidates promote earlier flowering. Within the flowering network, the majority of Flint candidates are associated with endogenous pathways in contrast to Dent candidate genes, which are mainly involved in response to environmental factors like light and photoperiod. The diversity patterns of the candidates in a unique panel of more than 900 individuals from 38 European landraces indicate a major contribution of landraces from France, Germany, and Spain to the candidate gene diversity of the Flint elite lines. In this study, we report the investigation of pool-specific differences between temperate Dent and Flint on a genome-wide scale. 
The identified candidate genes represent a promising source for the functional investigation of pool-specific haplotypes in different genetic backgrounds and for the evaluation of their potential for future crop improvement like the adaptation to specific environments.
Identification and evaluation of reference genes for qRT-PCR normalization in Ganoderma lucidum.
Xu, Jiang; Xu, ZhiChao; Zhu, YingJie; Luo, HongMei; Qian, Jun; Ji, AiJia; Hu, YuanLei; Sun, Wei; Wang, Bo; Song, JingYuan; Sun, Chao; Chen, ShiLin
2014-01-01
Quantitative real-time reverse transcription PCR (qRT-PCR) is a rapid, sensitive, and reliable technique for gene expression studies. The accuracy and reliability of qRT-PCR results depend on the stability of the reference genes used for gene normalization. Therefore, a systematic process of reference gene evaluation is needed. Ganoderma lucidum is a famous medicinal mushroom in East Asia. In the current study, 10 potential reference genes were selected from the G. lucidum genomic data. The sequences of these genes were manually curated, and primers were designed following strict criteria. The experiment was conducted using qRT-PCR, and the stability of each candidate gene was assessed using four commonly used statistical programs: geNorm, NormFinder, BestKeeper, and RefFinder. According to our results, PP2A was expressed at the most stable levels under different fermentation conditions, and RPL4 was the most stably expressed gene in different tissues. RPL4, PP2A, and β-tubulin are the most commonly recommended reference genes for normalizing gene expression in the entire sample set. The current study provides a foundation for the further use of qRT-PCR in G. lucidum gene analysis.
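Of the four programs named above, geNorm has the simplest core statistic: the stability value M of a gene is the average, over all other candidates, of the standard deviation of their log2 expression ratios across samples. The sketch below implements only that statistic, not the stepwise exclusion or the other programs; the gene names and relative quantities are hypothetical.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return the geNorm-style stability value M for each gene (lower = more stable).

    expression: dict mapping gene name -> list of relative quantities,
    with the same sample order for every gene (at least 2 genes, 2 samples).
    """
    m_values = {}
    for gene_j, quantities_j in expression.items():
        pairwise_variation = []
        for gene_k, quantities_k in expression.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(quantities_j, quantities_k)]
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


# Hypothetical relative quantities in three samples.
toy_expression = {
    "gene_a": [1.0, 2.0, 4.0],
    "gene_b": [2.0, 4.0, 8.0],   # constant ratio to gene_a
    "gene_c": [1.0, 8.0, 2.0],   # varies independently
}
toy_m = genorm_m(toy_expression)
print(min(toy_m, key=toy_m.get))
```

Two genes whose ratio stays constant across samples contribute zero variation to each other, so a gene that drifts independently ends up with the highest M and would be ranked least stable.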
Estévez-López, Fernando; Camiletti-Moirón, Daniel; Aparicio, Virginia A; Segura-Jiménez, Víctor; Álvarez-Gallardo, Inmaculada C; Soriano-Maldonado, Alberto; Borges-Cosic, Milkana; Acosta-Manzano, Pedro; Geenen, Rinie; Delgado-Fernández, Manuel; Martínez-González, Luis J; Ruiz, Jonatan R; Álvarez-Cubero, María J
2018-02-27
Candidate-gene studies on fibromyalgia susceptibility often include a small number of single nucleotide polymorphisms (SNPs), which is a limitation. Moreover, there is a paucity of evidence in Europe. Therefore, we compared genotype frequencies of candidate SNPs in a well-characterised sample of Spanish women with fibromyalgia and healthy non-fibromyalgia women. A total of 314 women with a diagnosis of fibromyalgia (cases) and 112 non-fibromyalgia healthy (controls) women participated in this candidate-gene study. Buccal swabs were collected for DNA extraction. Using TaqMan™ OpenArray™, we analysed 61 SNPs of 33 genes related to fibromyalgia susceptibility, symptoms, or potential mechanisms. We observed that the rs841 and rs1799971 GG genotype was more frequently observed in fibromyalgia than in controls (p = 0.04 and p = 0.02, respectively). The rs2097903 AT/TT genotypes were also more often present in the fibromyalgia participants than in their control peers (p = 0.04). There were no differences for the remaining SNPs. We identified, for the first time, associations of the rs841 (guanosine triphosphate cyclohydrolase 1 gene) and rs2097903 (catechol-O-methyltransferase gene) SNPs with higher risk of fibromyalgia susceptibility. We also confirmed that the rs1799971 SNP (opioid receptor μ1 gene) might confer genetic risk of fibromyalgia. We did not adjust for multiple comparisons, which would be too stringent and yield to non-significant differences in the genotype frequencies between cases and controls. Our findings may be biologically meaningful and informative, and should be further investigated in other populations. Of particular interest is to replicate the present study in a larger independent sample to confirm or refute our findings. 
On the other hand, by including 61 SNPs of 33 candidate-genes with a strong rationale (they were previously investigated in relation to fibromyalgia susceptibility, symptoms or potential mechanisms), the present research is the most comprehensive candidate-gene study on fibromyalgia susceptibility to date.
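The abstract's remark that a multiple-comparison adjustment would leave no significant differences can be checked with simple arithmetic. The sketch below assumes a Bonferroni correction over the 61 SNPs; the abstract does not say which adjustment the authors had in mind, so this is only one plausible reading. The p-values are those reported above, and the variable names are illustrative.

```python
# Illustrative Bonferroni check (an assumption: the abstract names no specific method).
n_tests = 61   # number of SNPs analysed, as stated in the abstract
alpha = 0.05   # conventional family-wise significance level

# Per-test threshold under Bonferroni: alpha divided by the number of tests.
threshold = alpha / n_tests

# p-values reported in the abstract for the three nominally associated SNPs.
reported = {"rs841": 0.04, "rs1799971": 0.02, "rs2097903": 0.04}

# A SNP survives correction only if its p-value is at or below the threshold.
survives = {snp: p <= threshold for snp, p in reported.items()}

print(f"Bonferroni threshold: {threshold:.5f}")
for snp, ok in survives.items():
    print(snp, "significant" if ok else "not significant")
```

Under this assumption, the per-test threshold is roughly 0.0008, so none of the three reported associations would remain significant, which is consistent with the authors' statement.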
Winnier, Deidre A.; Fourcaudot, Marcel; Norton, Luke; Abdul-Ghani, Muhammad A.; Hu, Shirley L.; Farook, Vidya S.; Coletta, Dawn K.; Kumar, Satish; Puppala, Sobha; Chittoor, Geetha; Dyer, Thomas D.; Arya, Rector; Carless, Melanie; Lehman, Donna M.; Curran, Joanne E.; Cromack, Douglas T.; Tripathy, Devjit; Blangero, John; Duggirala, Ravindranath; Göring, Harald H. H.; DeFronzo, Ralph A.; Jenkinson, Christopher P.
2015-01-01
Type 2 diabetes (T2D) is a complex metabolic disease that is more prevalent in ethnic groups such as Mexican Americans, and is strongly associated with the risk factors obesity and insulin resistance. The goal of this study was to perform whole genome gene expression profiling in adipose tissue to detect common patterns of gene regulation associated with obesity and insulin resistance. We used phenotypic and genotypic data from 308 Mexican American participants from the Veterans Administration Genetic Epidemiology Study (VAGES). Basal fasting RNA was extracted from adipose tissue biopsies from a subset of 75 unrelated individuals, and gene expression data generated on the Illumina BeadArray platform. The number of gene probes with significant expression above baseline was approximately 31,000. We performed multiple regression analysis of all probes with 15 metabolic traits. Adipose tissue had 3,012 genes significantly associated with the traits of interest (false discovery rate, FDR ≤ 0.05). The significance of gene expression changes was used to select 52 genes with significant (FDR ≤ 10^-4) gene expression changes across multiple traits. Gene sets/Pathways analysis identified one gene, alcohol dehydrogenase 1B (ADH1B) that was significantly enriched (P < 10^-60) as a prime candidate for involvement in multiple relevant metabolic pathways. Illumina BeadChip derived ADH1B expression data was consistent with quantitative real time PCR data. We observed significant inverse correlations with waist circumference (2.8 × 10^-9), BMI (5.4 × 10^-6), and fasting plasma insulin (P < 0.001). These findings are consistent with a central role for ADH1B in obesity and insulin resistance and provide evidence for a novel genetic regulatory mechanism for human metabolic diseases related to these traits. PMID:25830378
Roberts, Wade R; Roalson, Eric H
2017-03-20
Flowers have an amazingly diverse display of colors and shapes, and these characteristics often vary significantly among closely related species. The evolution of diverse floral form can be thought of as an adaptive response to pollination and reproduction, but it can also be seen through the lens of morphological and developmental constraints. To explore these interactions, we use RNA-seq across species and development to investigate gene expression and sequence evolution as they relate to the evolution of the diverse flowers in a group of Neotropical plants native to Mexico, the magic flowers (Achimenes, Gesneriaceae). The assembled transcriptomes contain between 29,000 and 42,000 genes expressed during development. We combine sequence orthology and coexpression clustering with analyses of protein evolution to identify candidate genes for roles in floral form evolution. Over 25% of transcripts captured were distinctive to Achimenes and overrepresented by genes involved in transcription factor activity. Using a model-based clustering approach we find dynamic, temporal patterns of gene expression among species. Selection tests provide evidence of positive selection in several genes with roles in pigment production, flowering time, and morphology. Combining these approaches to explore genes related to flower color and flower shape, we find distinct patterns that correspond to transitions of floral form among Achimenes species. The floral transcriptomes developed from four species of Achimenes provide insight into the mechanisms involved in the evolution of diverse floral form among closely related species with different pollinators. We identified several candidate genes that will serve as an important and useful resource for future research.
High conservation of sequence structure, patterns of gene coexpression, and detection of positive selection acting on few genes suggests that large phenotypic differences in floral form may be caused by genetic differences in a small set of genes. Our characterized floral transcriptomes provided here should facilitate further analyses into the genomics of flower development and the mechanisms underlying the evolution of diverse flowers in Achimenes and other Neotropical Gesneriaceae.
Dunachie, Susanna; Berthoud, Tamara; Hill, Adrian V.S.; Fletcher, Helen A.
2015-01-01
Introduction: The complexity of immunity to malaria is well known, and clear correlates of protection against malaria have not been established. A better understanding of immune markers induced by candidate malaria vaccines would greatly enhance vaccine development, immunogenicity monitoring and estimation of vaccine efficacy in the field. We have previously reported complete or partial efficacy against experimental sporozoite challenge by several vaccine regimens in healthy malaria-naïve subjects in Oxford. These include a prime-boost regimen with RTS,S/AS02A and modified vaccinia virus Ankara (MVA) expressing the CSP antigen, and a DNA-prime, MVA-boost regimen expressing the ME TRAP antigens. Using samples from these trials we performed transcriptional profiling, allowing a global assessment of responses to vaccination. Methods: We used Human RefSeq8 Bead Chips from Illumina to examine gene expression using PBMC (peripheral blood mononuclear cells) from 16 human volunteers. To focus on antigen-specific changes, comparisons were made between PBMC stimulated with CSP or TRAP peptide pools and unstimulated PBMC post vaccination. We then correlated gene expression with protection against malaria in a human Plasmodium falciparum malaria challenge model. Results: Differentially expressed genes induced by both vaccine regimens were predominantly in the IFN-γ pathway. Gene set enrichment analysis revealed antigen-specific effects on genes associated with IFN induction and proteasome modules after vaccination. Genes associated with IFN induction and antigen presentation modules were positively enriched in subjects with complete protection from malaria challenge, while genes associated with haemopoietic stem cells, regulatory monocytes and the myeloid lineage modules were negatively enriched in protected subjects. Conclusions: These results represent novel insights into the immune repertoires involved in malaria vaccination. PMID:26256523
Banfi, Federica; Colombini, Alessandra; Perucca Orfei, Carlotta; Parazzi, Valentina; Ragni, Enrico
2018-05-26
The molecular profile of human mesenchymal stem cells (MSCs) has emerged as a key factor in defining their identity. Nevertheless, the effect of fetal bovine serum (FBS) batches or origin on the MSC molecular signature has been neglected. In this frame, the chemical fingerprint of FBS batches from unrelated countries showed a strong correlation between chemical composition and country of origin. Thus, the aim of this study was to evaluate, in stem cells isolated from bone marrow (BMMSCs) and umbilical cord-blood (CBMSCs), the effects of independently collected FBS batches on both twelve commonly used reference genes (RGs) and a selected panel of thirty-eight genes crucial for MSC definition in both research and clinical settings. Gene expression stability was estimated by comparing the outcomes of two applets: geNorm and NormFinder. The bioinformatics analysis emphasized that, in a panorama of general balance, a few RG candidates (YWHAZ/UBC for BMMSCs, RPLP0/EF1A for CBMSCs and EF1A/TBP for both MSCs scored together) showed superior stability. In addition, a wider study on genes involved in differentiation/proliferation/stemness processes, often used to define MSC potency, showed that these genes exhibited no major transcriptional modulation after treatment with different FBS, and allowed the identification of genes strongly discriminating between BM- and CBMSC populations. In conclusion, FBS origin does not dramatically impact the general molecular profile of MSCs, although we could identify validated candidates that allow more reliable comparison of data on MSC identity and potency obtained by research laboratories and clinical manufacturers using different sera.
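As a rough illustration of how a geNorm-style stability value is computed, the sketch below uses the commonly described definition: for each gene, the standard deviation of its log2 expression ratio against every other gene is taken across samples, and these are averaged into a stability measure M (lower M means more stable). This is a minimal sketch under that assumption, not the geNorm applet itself; the gene names and expression quantities are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev

def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene.

    expr maps gene name -> list of relative expression quantities,
    with all lists in the same sample order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # pairwise variation: SD of the log ratios across samples
            pairwise_sds.append(stdev(log_ratios))
        # M is the average pairwise variation against all other genes
        m_values[j] = mean(pairwise_sds)
    return m_values

# Hypothetical quantities for three genes across four samples.
expr = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],  # constant ratio to geneA in every sample
    "geneC": [1.0, 4.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m_values = genorm_m(expr)
for gene, m in sorted(m_values.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.3f}")
```

In this toy input, geneA and geneB keep a constant ratio to each other and therefore obtain the same, lower M, while geneC ranks as the least stable candidate.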
Dunachie, Susanna; Berthoud, Tamara; Hill, Adrian V S; Fletcher, Helen A
2015-09-29
The complexity of immunity to malaria is well known, and clear correlates of protection against malaria have not been established. A better understanding of immune markers induced by candidate malaria vaccines would greatly enhance vaccine development, immunogenicity monitoring and estimation of vaccine efficacy in the field. We have previously reported complete or partial efficacy against experimental sporozoite challenge by several vaccine regimens in healthy malaria-naïve subjects in Oxford. These include a prime-boost regimen with RTS,S/AS02A and modified vaccinia virus Ankara (MVA) expressing the CSP antigen, and a DNA-prime, MVA-boost regimen expressing the ME TRAP antigens. Using samples from these trials we performed transcriptional profiling, allowing a global assessment of responses to vaccination. We used Human RefSeq8 Bead Chips from Illumina to examine gene expression using PBMC (peripheral blood mononuclear cells) from 16 human volunteers. To focus on antigen-specific changes, comparisons were made between PBMC stimulated with CSP or TRAP peptide pools and unstimulated PBMC post vaccination. We then correlated gene expression with protection against malaria in a human Plasmodium falciparum malaria challenge model. Differentially expressed genes induced by both vaccine regimens were predominantly in the IFN-γ pathway. Gene set enrichment analysis revealed antigen-specific effects on genes associated with IFN induction and proteasome modules after vaccination. Genes associated with IFN induction and antigen presentation modules were positively enriched in subjects with complete protection from malaria challenge, while genes associated with haemopoietic stem cells, regulatory monocytes and the myeloid lineage modules were negatively enriched in protected subjects. These results represent novel insights into the immune repertoires involved in malaria vaccination. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Dissecting Daily and Circadian Expression Rhythms of Clock-Controlled Genes in Human Blood.
Lech, Karolina; Ackermann, Katrin; Revell, Victoria L; Lao, Oscar; Skene, Debra J; Kayser, Manfred
2016-02-01
The identification and investigation of novel clock-controlled genes (CCGs) has been conducted thus far mainly in model organisms such as nocturnal rodents, with limited information in humans. Here, we aimed to characterize daily and circadian expression rhythms of CCGs in human peripheral blood during a sleep/sleep deprivation (S/SD) study and a constant routine (CR) study. Blood expression levels of 9 candidate CCGs (SREBF1, TRIB1, USF1, THRA1, SIRT1, STAT3, CAPRIN1, MKNK2, and ROCK2), were measured across 48 h in 12 participants in the S/SD study and across 33 h in 12 participants in the CR study. Statistically significant rhythms in expression were observed for STAT3, SREBF1, TRIB1, and THRA1 in samples from both the S/SD and the CR studies, indicating that their rhythmicity is driven by the endogenous clock. The MKNK2 gene was significantly rhythmic in the S/SD but not the CR study, which implies its exogenously driven rhythmic expression. In addition, we confirmed the circadian expression of PER1, PER3, and REV-ERBα in the CR study samples, while BMAL1 and HSPA1B were not significantly rhythmic in the CR samples; all 5 genes previously showed significant expression in the S/SD study samples. Overall, our results demonstrate that rhythmic expression patterns of clock and selected clock-controlled genes in human blood cells are in part determined by exogenous factors (sleep and fasting state) and in part by the endogenous circadian timing system. Knowledge of the exogenous and endogenous regulation of gene expression rhythms is needed prior to the selection of potential candidate marker genes for future applications in medical and forensic settings. © 2015 The Author(s).
Joy, Nisha; Soniya, Eppurathu Vasudevan
2012-06-01
Plant miRNAs (18-24nt) are generated by the RNase III-type Dicer endonuclease from the endogenous hairpin precursors ('pre-miRNAs') with significant regulatory functions. The transcribed regions display a higher frequency of microsatellites, when compared to other regions of the genomic DNA. Simple sequence repeats (SSRs) resulting from replication slippage occurring in transcripts affect the expression of genes. The available experimental evidence for the incidence of SSRs in the miRNA precursors is limited. Considering the potential significance of SSRs in the miRNA genes, we carried out a preliminary analysis to verify the presence of SSRs in the pri-miRNAs of black pepper (Piper nigrum L.). We isolated a (CT) dinucleotide SSR bearing transcript using SMART strategy. The transcript was predicted to be a 'pri-miRNA candidate' with Dicer sites based on miRNA prediction tools and MFOLD structural predictions. The presence of this 'miRNA candidate' was confirmed by real-time TaqMan assays. The upstream sequence of the 'miRNA candidate' by genome walking when subjected to PlantCARE showed the presence of certain promoter elements, and the deduced amino acid showed significant similarity with NAP1 gene, which affects the transcription of many genes. Moreover the hairpin-like precursor overlapped the neighbouring NAP1 gene. In silico analysis revealed distinct putative functions for the 'miRNA candidate', of which majority were related to growth. Hence, we assume that this 'miRNA candidate' may get activated during transcription of NAP gene, thereby regulating the expression of many genes involved in developmental processes.
Forouzanfar, Narjes; Baranova, Ancha; Milanizadeh, Saman; Heravi-Moussavi, Alireza; Jebelli, Amir; Abbaszadegan, Mohammad Reza
2017-05-01
Esophageal squamous cell carcinoma is one of the deadliest of all the cancers. Its metastatic properties portend poor prognosis and high rate of recurrence. A more advanced method to identify new molecular biomarkers predicting disease prognosis can be whole exome sequencing. Here, we report the most effective genetic variants of the Notch signaling pathway in esophageal squamous cell carcinoma susceptibility by whole exome sequencing. We analyzed nine probands in unrelated familial esophageal squamous cell carcinoma pedigrees to identify candidate genes. Genomic DNA was extracted and whole exome sequencing performed to generate information about genetic variants in the coding regions. Bioinformatics software applications were utilized to exploit statistical algorithms to demonstrate protein structure and variants conservation. Polymorphic regions were excluded by false-positive investigations. Gene-gene interactions were analyzed for Notch signaling pathway candidates. We identified novel and damaging variants of the Notch signaling pathway through extensive pathway-oriented filtering and functional predictions, which led to the study of 27 candidate novel mutations in all nine patients. Detection of the trinucleotide repeat containing 6B gene mutation (a splice site alteration) in five of the nine probands, but not in any of the healthy samples, suggested that it may be a susceptibility factor for familial esophageal squamous cell carcinoma. Noticeably, 8 of 27 novel candidate gene mutations (e.g. epidermal growth factor, signal transducer and activator of transcription 3, MET) act in a cascade leading to cell survival and proliferation. Our results suggest that the trinucleotide repeat containing 6B mutation may be a candidate predisposing gene in esophageal squamous cell carcinoma. In addition, some of the Notch signaling pathway genetic mutations may act as key contributors to esophageal squamous cell carcinoma.
Máximo, Wesley P. F.; Zanetti, Ronald; Paiva, Luciano V.
2018-01-01
Although several ant species are important targets for the development of molecular control strategies, only a few studies focus on identifying and validating reference genes for quantitative reverse transcription polymerase chain reaction (RT-qPCR) data normalization. We provide here an extensive study to identify and validate suitable reference genes for gene expression analysis in the ant Atta sexdens, a threatening agricultural pest in South America. The optimal number of reference genes varied according to each sample, and the results generated by RefFinder differed as to which reference gene was the most suitable. Results suggest that the RPS16, NADH and SDHB genes were the best reference genes in the sample pool according to stability values. The SNF7 gene expression pattern was stable in all evaluated sample sets. In contrast, when using less stable reference genes for normalization, a large variability in SNF7 gene expression was recorded. There is no universal reference gene suitable for all conditions under analysis, since these genes can also participate in different cellular functions, thus requiring a systematic validation of possible reference genes for each specific condition. The effect of reference gene choice on SNF7 gene normalization confirmed that unstable reference genes might drastically change the expression profile analysis of target candidate genes. PMID:29419794
Identification of a B cell signature associated with renal transplant tolerance in humans
Newell, Kenneth A.; Asare, Adam; Kirk, Allan D.; Gisler, Trang D.; Bourcier, Kasia; Suthanthiran, Manikkam; Burlingham, William J.; Marks, William H.; Sanz, Ignacio; Lechler, Robert I.; Hernandez-Fuentes, Maria P.; Turka, Laurence A.; Seyfert-Margolis, Vicki L.
2010-01-01
Establishing long-term allograft acceptance without the requirement for continuous immunosuppression, a condition known as allograft tolerance, is a highly desirable therapeutic goal in solid organ transplantation. Determining which recipients would benefit from withdrawal or minimization of immunosuppression would be greatly facilitated by biomarkers predictive of tolerance. In this study, we identified the largest reported cohort to our knowledge of tolerant renal transplant recipients, as defined by stable graft function and receiving no immunosuppression for more than 1 year, and compared their gene expression profiles and peripheral blood lymphocyte subsets with those of subjects with stable graft function who are receiving immunosuppressive drugs as well as healthy controls. In addition to being associated with clinical and phenotypic parameters, renal allograft tolerance was strongly associated with a B cell signature using several assays. Tolerant subjects showed increased expression of multiple B cell differentiation genes, and a set of just 3 of these genes distinguished tolerant from nontolerant recipients in a unique test set of samples. This B cell signature was associated with upregulation of CD20 mRNA in urine sediment cells and elevated numbers of peripheral blood naive and transitional B cells in tolerant participants compared with those receiving immunosuppression. These results point to a critical role for B cells in regulating alloimmunity and provide a candidate set of genes for wider-scale screening of renal transplant recipients. PMID:20501946
Mishra, Ankita; Singh, Anuradha; Sharma, Monica; Kumar, Pankaj; Roy, Joy
2016-10-06
Starch is a major part of cereal grain. It comprises two glucose polymer fractions, amylose (AM) and amylopectin (AP), that make up about 25 and 75 % of total starch, respectively. The ratio of the two affects processing quality and digestibility of starch-based food products. Digestibility determines nutritional quality, as high amylose starch is considered a resistant or healthy starch (RS type 2) and is highly preferred for preventive measures against obesity and related health conditions. The topic of nutrition security is currently receiving much attention and consumer demand for food products with improved nutritional qualities has increased. In bread wheat (Triticum aestivum L.), variation in amylose content is narrow, hence its limited improvement. Therefore, it is necessary to produce wheat lines or populations showing wide variation in amylose/resistant starch content. In this study, a set of EMS-induced M4 mutant lines showing dynamic variation in amylose/resistant starch content were produced. Furthermore, two diverse mutant lines for amylose content were used to study quantitative expression patterns of 20 starch metabolic pathway genes and to identify candidate genes for amylose biosynthesis. A population comprising 101 EMS-induced mutation lines (M4 generation) was produced in a bread wheat (Triticum aestivum) variety. Two methods of amylose measurement in grain starch showed variation in amylose content ranging from ~3 to 76 % in the population. The method of in vitro digestion showed variation in resistant starch content from 1 to 41 %. One-way ANOVA analysis showed significant variation (p < 0.05) in amylose and resistant starch content within the population. A multiple comparison test (Dunnett's test) showed that significant variation in amylose and resistant starch content, with respect to the parent, was observed in about 89 and 38 % of the mutant lines, respectively. 
Expression pattern analysis of 20 starch metabolic pathway genes in two diverse mutant lines (low and high amylose mutants) showed higher expression of key genes of amylose biosynthesis (GBSSI and their isoforms) in the high amylose mutant line, in comparison to the parent. Higher expression of amylopectin biosynthesis (SBE) was observed in the low amylose mutant lines. An additional six candidate genes showed over-expression (BMY, SPA) and reduced-expression (SSIII, SBEI, SBEIII, ISA3) in the high amylose mutant line, indicating that other starch metabolic genes may also contribute to amylose biosynthesis. In this study a set of 101 EMS-induced mutant lines (M4 generation) showing variation in amylose and resistant starch content in seed were produced. This population serves as useful germplasm or pre-breeding material for genome-wide study and improvement of starch-based processing and nutrition quality in wheat. It is also useful for the study of the genetic and molecular basis of amylose/resistant starch variation in wheat. Furthermore, gene expression analysis of 20 starch metabolic genes in the two diverse mutant lines (low and high amylose mutants) indicates that in addition to key genes, several other genes (such as phosphorylases, isoamylases, and pullulanases) may also be involved in contributing to amylose/amylopectin biosynthesis.
A large-scale RNA interference screen identifies genes that regulate autophagy at different stages.
Guo, Sujuan; Pridham, Kevin J; Virbasius, Ching-Man; He, Bin; Zhang, Liqing; Varmark, Hanne; Green, Michael R; Sheng, Zhi
2018-02-12
Dysregulated autophagy is central to the pathogenesis and therapeutic development of cancer. However, how autophagy is regulated in cancer is not well understood and genes that modulate cancer autophagy are not fully defined. To gain more insights into autophagy regulation in cancer, we performed a large-scale RNA interference screen in K562 human chronic myeloid leukemia cells using monodansylcadaverine staining, an autophagy-detecting approach equivalent to immunoblotting of the autophagy marker LC3B or fluorescence microscopy of GFP-LC3B. By coupling monodansylcadaverine staining with fluorescence-activated cell sorting, we successfully isolated autophagic K562 cells where we identified 336 short hairpin RNAs. After candidate validation using Cyto-ID fluorescence spectrophotometry, LC3B immunoblotting, and quantitative RT-PCR, 82 genes were identified as autophagy-regulating genes. 20 genes have been reported previously and the remaining 62 candidates are novel autophagy mediators. Bioinformatic analyses revealed that most candidate genes were involved in molecular pathways regulating autophagy, rather than directly participating in the autophagy process. Further autophagy flux assays revealed that 57 autophagy-regulating genes suppressed autophagy initiation, whereas 21 candidates promoted autophagy maturation. Our RNA interference screen identified genes that regulate autophagy at different stages, which helps decode autophagy regulation in cancer and offers novel avenues to develop autophagy-related therapies for cancer.
Rodriguez-Fernandez, I A; Dell'Angelica, E C
2009-04-01
The study of protein-protein interactions is a powerful approach to uncovering the molecular function of gene products associated with human disease. Protein-protein interaction data are accumulating at an unprecedented pace owing to interactomics projects, although it has been recognized that a significant fraction of these data likely represents false positives. During our studies of biogenesis of lysosome-related organelles complex-1 (BLOC-1), a protein complex involved in protein trafficking and containing the products of genes mutated in Hermansky-Pudlak syndrome, we faced the problem of having too many candidate binding partners to pursue experimentally. In this work, we have explored ways of efficiently gathering high-quality information about candidate binding partners and presenting the information in a visually friendly manner. We applied the approach to rank 70 candidate binding partners of human BLOC-1 and 102 candidates of its counterpart from Drosophila melanogaster. The top candidate for human BLOC-1 was the small GTPase encoded by the RAB11A gene, which is a paralogue of the Rab38 and Rab32 proteins in mammals and the lightoid gene product in flies. Interestingly, genetic analyses in D. melanogaster uncovered a synthetic sick/lethal interaction between Rab11 and lightoid. The data-mining approach described herein can be customized to study candidate binding partners for other proteins or possibly candidates derived from other types of 'omics' data.
Hu, Fengyi; Wang, Di; Zhao, Xiuqin; Zhang, Ting; Sun, Haixi; Zhu, Linghua; Zhang, Fan; Li, Lijuan; Li, Qiong; Tao, Dayun; Fu, Binying; Li, Zhikang
2011-01-24
Rhizomatousness is a key component of perenniality of many grasses that contribute to competitiveness and invasiveness of many noxious grass weeds, but can potentially be used to develop perennial cereal crops for sustainable farmers in hilly areas of tropical Asia. Oryza longistaminata, a perennial wild rice with strong rhizomes, has been used as the model species for genetic and molecular dissection of rhizome development and in breeding efforts to transfer rhizome-related traits into annual rice species. In this study, an effort was taken to get insights into the genes and molecular mechanisms underlying the rhizomatous trait in O. longistaminata by comparative analysis of the genome-wide tissue-specific gene expression patterns of five different tissues of O. longistaminata using the Affymetrix GeneChip Rice Genome Array. A total of 2,566 tissue-specific genes were identified in five different tissues of O. longistaminata, including 58 and 61 unique genes that were specifically expressed in the rhizome tips (RT) and internodes (RI), respectively. In addition, 162 genes were up-regulated and 261 genes were down-regulated in RT compared to the shoot tips. Six distinct cis-regulatory elements (CGACG, GCCGCC, GAGAC, AACGG, CATGCA, and TAAAG) were found to be significantly more abundant in the promoter regions of genes differentially expressed in RT than in the promoter regions of genes uniformly expressed in all other tissues. Many of the RT and/or RI specifically or differentially expressed genes were located in the QTL regions associated with rhizome expression, rhizome abundance and rhizome growth-related traits in O. longistaminata and thus are good candidate genes for these QTLs. The initiation and development of the rhizomatous trait in O. longistaminata are controlled by very complex gene networks involving several plant hormones and regulatory genes, different members of gene families showing tissue specificity and their regulated pathways. 
Auxin/IAA appears to act as a negative regulator in rhizome development, while GA acts as the activator in rhizome development. Co-localization of the genes specifically expressed in rhizome tips and rhizome internodes with the QTLs for rhizome traits identified a large set of candidate genes for rhizome initiation and development in rice for further confirmation.
SFM: A novel sequence-based fusion method for disease genes identification and prioritization.
Yousef, Abdulaziz; Moghadam Charkari, Nasrollah
2015-10-21
The identification of disease genes from the human genome is of great importance to improve diagnosis and treatment of disease. Several machine learning methods have been introduced to identify disease genes. However, these methods mostly differ in the prior knowledge used to construct the feature vector for each instance (gene), the ways of selecting negative data (non-disease genes) where there is no investigational approach to find them, and the classification methods used to make the final decision. In this work, a novel sequence-based fusion method (SFM) is proposed to identify disease genes. Unlike existing methods, instead of using noisy and incomplete prior knowledge, the amino acid sequence of the proteins, which is universal data, is used to represent the genes (proteins) as four different feature vectors. To select more likely negative data from candidate genes, the intersection of four negative sets generated using a distance approach is considered. Then, a Decision Tree (C4.5) is applied as a fusion method to combine the results of four independent state-of-the-art predictors based on the support vector machine (SVM) algorithm, and to make the final decision. The experimental results of the proposed method have been evaluated by some standard measures. The results indicate precision, recall and F-measure of 82.6%, 85.6% and 84%, respectively. These results confirm the efficiency and validity of the proposed method. Copyright © 2015 Elsevier Ltd. All rights reserved.
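The three reported measures can be cross-checked against each other, assuming the F-measure is the usual balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which F variant was used. A minimal sketch with the reported precision and recall:

```python
# Reported values from the abstract, expressed as fractions.
precision = 0.826
recall = 0.856

# Balanced F-measure (F1): harmonic mean of precision and recall.
# Assumption: the abstract's "F-measure" is F1 rather than a weighted F-beta.
f_measure = 2 * precision * recall / (precision + recall)

print(f"F-measure: {f_measure:.3f}")
```

Under that assumption the result is approximately 0.84, which agrees with the reported value of 84.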
USDA-ARS's Scientific Manuscript database
A public candidate gene testing pipeline for resistance to aflatoxin accumulation or Aspergillus flavus infection in maize is presented here. The pipeline consists of steps for identifying, testing, and verifying the association of any maize gene sequence with resistance under field conditions. Reso...
SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate
Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart
2016-01-01
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...
Chapman, Mark A; Pashley, Catherine H; Wenzler, Jessica; Hvala, John; Tang, Shunxue; Knapp, Steven J; Burke, John M
2008-11-01
Genomic scans for selection are a useful tool for identifying genes underlying phenotypic transitions. In this article, we describe the results of a genome scan designed to identify candidates for genes targeted by selection during the evolution of cultivated sunflower. This work involved screening 492 loci derived from ESTs on a large panel of wild, primitive (i.e., landrace), and improved sunflower (Helianthus annuus) lines. This sampling strategy allowed us to identify candidates for selectively important genes and investigate the likely timing of selection. Thirty-six genes showed evidence of selection during either domestication or improvement based on multiple criteria, and a sequence-based test of selection on a subset of these loci confirmed this result. In view of what is known about the structure of linkage disequilibrium across the sunflower genome, these genes are themselves likely to have been targeted by selection, rather than being merely linked to the actual targets. While the selection candidates showed a broad range of putative functions, they were enriched for genes involved in amino acid synthesis and protein catabolism. Given that a similar pattern has been detected in maize (Zea mays), this finding suggests that selection on amino acid composition may be a general feature of the evolution of crop plants. In terms of genomic locations, the selection candidates were significantly clustered near quantitative trait loci (QTL) that contribute to phenotypic differences between wild and cultivated sunflower, and specific instances of QTL colocalization provide some clues as to the roles that these genes may have played during sunflower evolution.
Analysis of protocadherin alpha gene enhancer polymorphism in bipolar disorder and schizophrenia
Pedrosa, Erika; Stefanescu, Radu; Margolis, Benjamin; Petruolo, Oriana; Lo, Yungtai; Nolan, Karen; Novak, Tomas; Stopkova, Pavla; Lachman, Herbert M.
2008-01-01
Cadherins and protocadherins are cell adhesion proteins that play an important role in neuronal migration, differentiation and synaptogenesis, properties that make them targets to consider in schizophrenia (SZ) and bipolar disorder (BD) pathogenesis. Consequently, allelic variation occurring in protocadherin and cadherin encoding genes that map to regions of the genome mapped in SZ and BD linkage studies are particularly strong candidates to consider. One such set of candidate genes is the 5q31-linked PCDH family, which consists of more than 50 exons encoding three related, though distinct family members – α, β, and γ – which can generate thousands of different protocadherin proteins through alternative promoter usage and cis-alternative splicing. In this study, we focused on a SNP, rs31745, which is located in a putative PCDHα enhancer mapped by ChIP-chip using antibodies to covalently modified histone H3. A striking increase in homozygotes for the minor allele at this locus was detected in patients with BD. Molecular analysis revealed that the SNP causes allele-specific changes in binding to a brain protein. The findings suggest that the 5q31-linked PCDH locus should be more thoroughly considered as a disease-susceptibility locus in psychiatric disorders. PMID:18508241
Keyhaninejad, Neda; Curry, Jeanne; Romero, Joslynn; O'Connell, Mary A
2014-02-01
Accumulation of capsaicinoids in the placental tissue of ripening chile (Capsicum spp.) fruit follows the coordinated expression of multiple biosynthetic enzymes producing the substrates for capsaicin synthase. Transcription factors are likely agents to regulate expression of these biosynthetic genes. Placental RNAs from habanero fruit (Capsicum chinense) were screened for expression of candidate transcription factors; with two candidate genes identified, both in the ERF family of transcription factors. Characterization of these transcription factors, Erf and Jerf, in nine chile cultivars with distinct capsaicinoid contents demonstrated a correlation of expression with pungency. Amino acid variants were observed in both ERF and JERF from different chile cultivars; none of these changes involved the DNA binding domains. Little to no transcription of Erf was detected in non-pungent Capsicum annuum or C. chinense mutants. This correlation was characterized at an individual fruit level in a set of jalapeño (C. annuum) lines again with distinct and variable capsaicinoid contents. Both Erf and Jerf are expressed early in fruit development, 16-20 days post-anthesis, at times prior to the accumulation of capsaicinoids in the placental tissues. These data support the hypothesis that these two members of the complex ERF family participate in regulation of the pungency phenotype in chile. Copyright © 2013. Published by Elsevier Ireland Ltd.
Keyhaninejad, Neda; Curry, Jeanne; Romero, Joslynn; O’Connell, Mary A.
2013-01-01
Accumulation of capsaicinoids in the placental tissue of ripening chile (Capsicum spp.) fruit follows the coordinated expression of multiple biosynthetic enzymes producing the substrates for capsaicin synthase. Transcription factors are likely agents to regulate expression of these biosynthetic genes. Placental RNAs from habanero fruit (C. chinense) were screened for expression of candidate transcription factors; with two candidate genes identified, both in the ERF family of transcription factors. Characterization of these transcription factors, Erf and Jerf, in nine chile cultivars with distinct capsaicinoid contents demonstrated a correlation of expression with pungency. Amino acid variants were observed in both ERF and JERF from different chile cultivars; none of these changes involved the DNA binding domains. Little to no transcription of Erf was detected in non-pungent C. annuum or C. chinense mutants. This correlation was characterized at an individual fruit level in a set of jalapeño (C. annuum) lines again with distinct and variable capsaicinoid contents. Both Erf and Jerf are expressed early in fruit development, 16–20 days post-anthesis, at times prior to the accumulation of capsaicinoids in the placental tissues. These data support the hypothesis that these two members of the complex ERF family participate in regulation of the pungency phenotype in chile. PMID:24388515
Marra, Nicholas J; Eo, Soo Hyung; Hale, Matthew C; Waser, Peter M; DeWoody, J Andrew
2012-12-01
One common goal in evolutionary biology is the identification of genes underlying adaptive traits of evolutionary interest. Recently next-generation sequencing techniques have greatly facilitated such evolutionary studies in species otherwise depauperate of genomic resources. Kangaroo rats (Dipodomys sp.) serve as exemplars of adaptation in that they inhabit extremely arid environments, yet require no drinking water because of ultra-efficient kidney function and osmoregulation. As a basis for identifying water conservation genes in kangaroo rats, we conducted a priori bioinformatics searches in model rodents (Mus musculus and Rattus norvegicus) to identify candidate genes with known or suspected osmoregulatory function. We then obtained 446,758 reads via 454 pyrosequencing to characterize genes expressed in the kidney of banner-tailed kangaroo rats (Dipodomys spectabilis). We also determined candidates a posteriori by identifying genes that were overexpressed in the kidney. The kangaroo rat sequences revealed nine different a priori candidate genes predicted from our Mus and Rattus searches, as well as 32 a posteriori candidate genes that were overexpressed in kidney. Mutations in two of these genes, Slc12a1 and Slc12a3, cause human renal diseases that result in the inability to concentrate urine. These genes are likely key determinants of physiological water conservation in desert rodents. Copyright © 2012 Elsevier Inc. All rights reserved.
Henry, Ellen C; Welle, Stephen L; Gasiewicz, Thomas A
2010-03-01
The aryl hydrocarbon receptor (AhR), a ligand-dependent transcription factor, mediates toxicity of several classes of xenobiotics and also has important physiological roles in differentiation, reproduction, and immunity, although the endogenous ligand(s) mediating these functions is/are as yet unidentified. One candidate endogenous ligand, 2-(1'H-indolo-3'-carbonyl)-thiazole-4-carboxylic acid methyl ester (ITE), is a potent AhR agonist in vitro, activates the murine AhR in vivo, but does not induce toxicity. We hypothesized that ITE and the toxic ligand, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), may modify transcription of different sets of genes to account for their different toxicity. To test this hypothesis, primary mouse lung fibroblasts were exposed to 0.5 μM ITE, 0.2 nM TCDD, or vehicle for 4 h, and total gene expression was evaluated using microarrays. After this short-term and low-dose treatment, several hundred genes were changed significantly, and the response to ITE and TCDD was remarkably similar, both qualitatively and quantitatively. Induced gene sets included the expected battery of AhR-dependent xenobiotic-metabolizing enzymes, as well as several sets that reflect the inflammatory role of lung fibroblasts. Real time quantitative RT-qPCR assay of several selected genes confirmed these microarray data and further suggested that there may be kinetic differences in expression between ligands. These data suggest that ITE and TCDD elicit an analogous change in AhR conformation such that the initial transcription response is the same. Furthermore, if the difference in toxicity between TCDD and ITE is mediated by differences in gene expression, then it is likely that secondary changes enabled by the persistent TCDD, but not by the shorter lived ITE, are responsible.
Leveraging lung tissue transcriptome to uncover candidate causal genes in COPD genetic associations.
Lamontagne, Maxime; Bérubé, Jean-Christophe; Obeidat, Ma'en; Cho, Michael H; Hobbs, Brian D; Sakornsakolpat, Phuwanat; de Jong, Kim; Boezen, H Marike; Nickle, David; Hao, Ke; Timens, Wim; van den Berge, Maarten; Joubert, Philippe; Laviolette, Michel; Sin, Don D; Paré, Peter D; Bossé, Yohan
2018-05-15
Causal genes of chronic obstructive pulmonary disease (COPD) remain elusive. The current study aims at integrating genome-wide association studies (GWAS) and lung expression quantitative trait loci (eQTL) data to map COPD candidate causal genes and gain biological insights into the recently discovered COPD susceptibility loci. Two complementary genomic datasets on COPD were studied. The first was the lung eQTL dataset, which included whole-genome gene expression and genotyping data from 1038 individuals. The second was the largest COPD GWAS to date, from the International COPD Genetics Consortium (ICGC), with 13 710 cases and 38 062 controls. Methods that integrated GWAS with eQTL signals, including transcriptome-wide association study (TWAS), colocalization and Mendelian randomization-based (SMR) approaches, were used to map causal genes, i.e. genes with the strongest evidence of being the functional effector at specific loci. These methods were applied at the genome-wide level and at COPD risk loci derived from the GWAS literature. Replication was performed using lung data from GTEx. We collated 129 non-overlapping risk loci for COPD from the GWAS literature. At the genome-wide scale, 12 new COPD candidate genes/loci were revealed and six replicated in GTEx including CAMK2A, DMPK, MYO15A, TNFRSF10A, BTN3A2 and TRBV30. In addition, we mapped candidate causal genes for 60 out of the 129 GWAS-nominated loci and 23 of them were replicated in GTEx. Mapping candidate causal genes in lung tissue represents an important contribution to the genetics of COPD, enriches our biological interpretation of GWAS findings, and brings us closer to clinical translation of genetic associations.
2009-01-01
Background Chickpea (Cicer arietinum L.), an important grain legume crop of the world, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results A total of 20,162 (18,435 high quality) drought- and salinity-responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of the 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs.
Hierarchical clustering of 105 selected contigs provided clues about stress-responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion The generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species. PMID:19912666
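The abstract above reports an average polymorphism information content (PIC) for the validated SSR markers but does not state how PIC was computed. As a minimal sketch, assuming the commonly used definition PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j² over the allele frequencies p_i of a single marker, the per-marker value could be calculated as follows. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
def polymorphism_information_content(freqs):
    """Return PIC for one marker from its allele frequencies.

    Assumes the widely used definition:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `freqs` must be non-negative and sum to 1.
    """
    if any(p < 0 for p in freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    # Expected homozygosity term.
    homozygosity = sum(p * p for p in freqs)

    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical marker with two equally frequent alleles.
    print(polymorphism_information_content([0.5, 0.5]))
```

An average PIC across markers, as reported in the abstract, would then simply be the mean of the per-marker values.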
Moon, Myungjin; Nakai, Kenta
2018-04-01
Currently, cancer biomarker discovery is one of the important research topics worldwide. In particular, detecting significant genes related to cancer is an important task for early diagnosis and treatment of cancer. Conventional studies mostly focus on genes that are differentially expressed in different states of cancer; however, noise in gene expression datasets and insufficient information in limited datasets impede precise analysis of novel candidate biomarkers. In this study, we propose an integrative analysis of gene expression and DNA methylation using normalization and unsupervised feature extractions to identify candidate biomarkers of cancer using renal cell carcinoma RNA-seq datasets. Gene expression and DNA methylation datasets are normalized by Box-Cox transformation and integrated into a one-dimensional dataset that retains the major characteristics of the original datasets by unsupervised feature extraction methods, and differentially expressed genes are selected from the integrated dataset. Use of the integrated dataset demonstrated improved performance as compared with conventional approaches that utilize gene expression or DNA methylation datasets alone. Validation based on the literature showed that a considerable number of top-ranked genes from the integrated dataset have known relationships with cancer, implying that novel candidate biomarkers can also be acquired from the proposed analysis method. Furthermore, we expect that the proposed method can be expanded for applications involving various types of multi-omics datasets.
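The normalization step named in the abstract above, the Box-Cox transformation, can be illustrated with a short sketch. This shows only the standard one-parameter transform, y = (x^λ − 1)/λ for λ ≠ 0 and y = ln x for λ = 0; how the authors chose λ and how they integrated the two data types is not described in the abstract, so the function name, the fixed λ and the sample values below are assumptions for illustration only.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform to strictly positive values.

    y = (x**lam - 1) / lam   if lam != 0
    y = log(x)               if lam == 0
    """
    if any(v <= 0 for v in values):
        # The transform is only defined for positive inputs; a pseudocount
        # would have to be added beforehand (an assumption, not from the source).
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


if __name__ == "__main__":
    # Hypothetical expression values; lam = 0.5 is an arbitrary choice.
    print(box_cox([1.0, 4.0, 9.0], 0.5))
```

In practice λ is usually estimated from the data, for example by maximum likelihood, rather than fixed in advance.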
Kumar, Bharath; Abdel-Ghani, Adel H; Pace, Jordon; Reyes-Matamoros, Jenaro; Hochholdinger, Frank; Lübberstedt, Thomas
2014-07-01
Several genes involved in maize root development have been isolated. Identification of SNPs associated with root traits would enable the selection of maize lines with better root architecture that might help to improve N uptake, and consequently plant growth, particularly under N-deficient conditions. In the present study, an association study (AS) panel consisting of 74 maize inbred lines was screened for seedling root traits in 6-, 10-, and 14-day-old seedlings. Allele re-sequencing of candidate root genes Rtcl, Rth3, Rum1, and Rul1 was also carried out in the same AS panel lines. All four candidate genes displayed different levels of nucleotide diversity, haplotype diversity and linkage disequilibrium. Gene-based association analyses were carried out between individual polymorphisms in candidate genes and root traits measured in 6-, 10-, and 14-day-old maize seedlings. Several nucleotide polymorphisms in Rtcl, Rth3, Rum1, and Rul1 were significantly (P<0.05) associated with seedling root traits in maize, suggesting that all four tested genes are involved in maize root development. Thus, the considerable allelic variation present in these root genes can be exploited for improving maize root characteristics. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A genome-wide scan for signatures of differential artificial selection in ten cattle breeds.
Rothammer, Sophie; Seichter, Doris; Förster, Martin; Medugorac, Ivica
2013-12-21
Since the times of domestication, cattle have been continually shaped by the influence of humans. Relatively recent history, including breed formation and the still enduring enormous improvement of economically important traits, is expected to have left distinctive footprints of selection within the genome. The purpose of this study was to map genome-wide selection signatures in ten cattle breeds and thus improve the understanding of the genome response to strong artificial selection and support the identification of the underlying genetic variants of favoured phenotypes. We analysed 47,651 single nucleotide polymorphisms (SNP) using Cross Population Extended Haplotype Homozygosity (XP-EHH). We set the significance thresholds using the maximum XP-EHH values of two essentially artificially unselected breeds and found up to 229 selection signatures per breed. Through a confirmation process we verified selection for three distinct phenotypes typical for one breed (polledness in Galloway, double muscling in Blanc-Bleu Belge and red coat colour in Red Holstein cattle). Moreover, we detected six genes strongly associated with known QTL for beef or dairy traits (TG, ABCG2, DGAT1, GH1, GHR and the Casein Cluster) within selection signatures of at least one breed. A literature search for genes lying in outstanding signatures revealed further promising candidate genes. However, in concordance with previous genome-wide studies, we also detected a substantial number of signatures without any yet known gene content. These results show the power of XP-EHH analyses in cattle to discover promising candidate genes and raise the hope of identifying phenotypically important variants in the near future. The finding of plausible functional candidates in some short signatures supports this hope. 
For instance, MAP2K6 is the only annotated gene of two signatures detected in Galloway and Gelbvieh cattle and is already known to be associated with carcass weight, back fat thickness and marbling score in Korean beef cattle. Based on the confirmation process and literature search we deduce that XP-EHH is able to uncover numerous artificial selection targets in subpopulations of domesticated animals.
Vrijens, Karen; Winckelmans, Ellen; Tsamou, Maria; Baeyens, Willy; De Boever, Patrick; Jennen, Danyel; de Kok, Theo M; Den Hond, Elly; Lefebvre, Wouter; Plusquin, Michelle; Reynders, Hans; Schoeters, Greet; Van Larebeke, Nicolas; Vanpoucke, Charlotte; Kleinjans, Jos; Nawrot, Tim S
2017-04-01
Particulate matter (PM) exposure leads to premature death, mainly due to respiratory and cardiovascular diseases. We aimed to identify transcriptomic biomarkers of air pollution exposure and effect in a healthy adult population. Microarray analyses were performed in 98 healthy volunteers (48 men, 50 women). The expression of eight sex-specific candidate biomarker genes (significantly associated with PM10 in the discovery cohort and with a reported link to air pollution-related disease) was measured with qPCR in an independent validation cohort (75 men, 94 women). Pathway analysis was performed using Gene Set Enrichment Analysis. Average daily PM2.5 and PM10 exposures over 2 years were estimated for each participant's residential address using spatiotemporal interpolation in combination with a dispersion model. Average long-term PM10 was 25.9 (± 5.4) and 23.7 (± 2.3) μg/m³ in the discovery and validation cohorts, respectively. In discovery analysis, associations between PM10 and the expression of individual genes differed by sex. In the validation cohort, long-term PM10 was associated with the expression of DNAJB5 and EAPP in men and ARHGAP4 (p = 0.053) in women. AKAP6 and LIMK1 were significantly associated with PM10 in women, although associations differed in direction between the discovery and validation cohorts. Expression of the eight candidate genes in the discovery cohort differentiated between validation cohort participants with high versus low PM10 exposure (area under the receiver operating characteristic curve = 0.92; 95% CI: 0.85, 1.00; p = 0.0002 in men, 0.86; 95% CI: 0.76, 0.96; p = 0.004 in women). Expression of the sex-specific candidate genes identified in the discovery population predicted PM10 exposure in an independent cohort of adults from the same area. Confirmation in other populations may further support this as a new approach for exposure assessment, and may contribute to the discovery of molecular mechanisms for PM-induced health effects.
Lai, Y C; Fujikawa, T; Ando, T; Kitahara, G; Koiwa, M; Kubota, C; Miura, N
2017-06-01
Our aim was to identify a suitable microRNA housekeeping gene for real-time PCR analysis of bovine mastitis-related microRNA in milk. We identified , , and as housekeeping gene candidates on the basis of previous Solexa sequencing results. Threshold cycle (CT) values for , , and did not differ between milk from control cows and milk from mastitis-affected cows. NormFinder software identified as the most stable single housekeeping gene. We evaluated the suitability of the housekeeping gene candidates by using them to assess expression levels of the inflammation-related gene . Regardless of the housekeeping gene candidates used for normalization, relative expression levels of were significantly higher in mastitis-affected samples than in control samples. However, of all the housekeeping genes and gene combinations investigated, normalization with alone generated the difference in relative expression between mastitis-affected and control samples with the highest significance. These results suggest that is suitable for use as a housekeeping gene for analysis of bovine mastitis-related microRNA in milk.
A genome-wide association study of corneal astigmatism: The CREAM Consortium
Shah, Rupal L.; Li, Qing; Zhao, Wanting; Tedja, Milly S.; Tideman, J. Willem L.; Khawaja, Anthony P.; Fan, Qiao; Yazar, Seyhan; Williams, Katie M.; Verhoeven, Virginie J.M.; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J.; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W.V.; Hysi, Pirro G.; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R.; Jonas, Jost B.; Mitchell, Paul; Hammond, Christopher J.; Höhn, René; Baird, Paul N.; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A.; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C.W.; Bailey-Wilson, Joan E.
2018-01-01
Purpose To identify genes and genetic markers associated with corneal astigmatism. Methods A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. Results The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI: 1.08–1.16), p=5.55×10⁻⁹. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans—claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). Conclusions In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism. PMID:29422769
Huang, Huiyan; Zhu, Yong; Eliot, Melissa N; Knopik, Valerie S; McGeary, John E; Carskadon, Mary A; Hart, Anne C
2017-06-01
We aimed to test a combined approach to identify conserved genes regulating sleep and to explore the association between DNA methylation and sleep length. We identified candidate genes associated with shorter versus longer sleep duration in college students based on DNA methylation using Illumina Infinium HumanMethylation450 BeadChip arrays. Orthologous genes in Caenorhabditis elegans were identified, and we examined whether their loss of function affected C. elegans sleep. For genes whose perturbation affected C. elegans sleep, we subsequently undertook a small pilot study to re-examine DNA methylation in an independent set of human participants with shorter versus longer sleep durations. Eighty-seven out of 485,577 CpG sites had significant differential methylation in young adults with shorter versus longer sleep duration, corresponding to 52 candidate genes. We identified 34 C. elegans orthologs, including NPY/flp-18 and flp-21, which are known to affect sleep. Loss of five additional genes alters developmentally timed C. elegans sleep (B4GALT6/bre-4, DOCK180/ced-5, GNB2L1/rack-1, PTPRN2/ida-1, ZFYVE28/lst-2). For one of these genes, ZFYVE28 (also known as hLst2), the pilot replication study again found decreased DNA methylation associated with shorter sleep duration at the same two CpG sites in the first intron of ZFYVE28. Using an approach that combines human epigenetics and C. elegans sleep studies, we identified five genes that play previously unidentified roles in C. elegans sleep. We suggest sleep duration in humans may be associated with differential DNA methylation at specific sites and that the conserved genes identified here likely play roles in C. elegans sleep and in other species. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society.
Disease Model Discovery from 3,328 Gene Knockouts by The International Mouse Phenotyping Consortium
Meehan, Terrence F.; Conte, Nathalie; West, David B.; Jacobsen, Julius O.; Mason, Jeremy; Warren, Jonathan; Chen, Chao-Kung; Tudose, Ilinca; Relac, Mike; Matthews, Peter; Karp, Natasha; Santos, Luis; Fiegel, Tanja; Ring, Natalie; Westerberg, Henrik; Greenaway, Simon; Sneddon, Duncan; Morgan, Hugh; Codner, Gemma F; Stewart, Michelle E; Brown, James; Horner, Neil; Haendel, Melissa; Washington, Nicole; Mungall, Christopher J.; Reynolds, Corey L; Gallegos, Juan; Gailus-Durner, Valerie; Sorg, Tania; Pavlovic, Guillaume; Bower, Lynette R; Moore, Mark; Morse, Iva; Gao, Xiang; Tocchini-Valentini, Glauco P; Obata, Yuichi; Cho, Soo Young; Seong, Je Kyung; Seavitt, John; Beaudet, Arthur L.; Dickinson, Mary E.; Herault, Yann; Wurst, Wolfgang; de Angelis, Martin Hrabe; Lloyd, K.C. Kent; Flenniken, Ann M; Nutter, Lauryl MJ; Newbigging, Susan; McKerlie, Colin; Justice, Monica J.; Murray, Stephen A.; Svenson, Karen L.; Braun, Robert E.; White, Jacqueline K.; Bradley, Allan; Flicek, Paul; Wells, Sara; Skarnes, William C.; Adams, David J.; Parkinson, Helen; Mallon, Ann-Marie; Brown, Steve D.M.; Smedley, Damian
2017-01-01
Although next generation sequencing has revolutionised the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by our lack of knowledge of function and pathobiological mechanism for most genes. To address this challenge, the International Mouse Phenotyping Consortium (IMPC) is creating a genome- and phenome-wide catalogue of gene function by characterizing new knockout mouse strains across diverse biological systems through a broad set of standardised phenotyping tests, with all mice made readily available to the biomedical community. Analysing the first 3328 genes reveals models for 360 diseases including the first for type C Bernard-Soulier, Bardet-Biedl-5 and Gordon Holmes syndromes. 90% of our phenotype annotations are novel, providing the first functional evidence for 1092 genes and candidates in unsolved diseases such as Arrhythmogenic Right Ventricular Dysplasia 3. Finally, we describe our role in variant functional validation with the 100,000 Genomes and other projects. PMID:28650483
Glubb, Dylan M.; Johnatty, Sharon E.; Quinn, Michael C.J.; O’Mara, Tracy A.; Tyrer, Jonathan P.; Gao, Bo; Fasching, Peter A.; Beckmann, Matthias W.; Lambrechts, Diether; Vergote, Ignace; Velez Edwards, Digna R.; Beeghly-Fadiel, Alicia; Benitez, Javier; Garcia, Maria J.; Goodman, Marc T.; Thompson, Pamela J.; Dörk, Thilo; Dürst, Matthias; Modungo, Francesmary; Moysich, Kirsten; Heitz, Florian; du Bois, Andreas; Pfisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y.; Lester, Jenny; Goode, Ellen L.; Cunningham, Julie M.; Winham, Stacey J.; Larson, Melissa C.; McCauley, Bryan M.; Kjær, Susanne Krüger; Jensen, Allan; Schildkraut, Joellen M.; Berchuck, Andrew; Cramer, Daniel W.; Terry, Kathryn L.; Salvesen, Helga B.; Bjorge, Line; Webb, Penny M.; Grant, Peter; Pejovic, Tanja; Moffitt, Melissa; Hogdall, Claus K.; Hogdall, Estrid; Paul, James; Glasspool, Rosalind; Bernardini, Marcus; Tone, Alicia; Huntsman, David; Woo, Michelle; Group, AOCS; deFazio, Anna; Kennedy, Catherine J.; Pharoah, Paul D.P.; MacGregor, Stuart; Chenevix-Trench, Georgia
2017-01-01
We previously identified associations with ovarian cancer outcome at five genetic loci. To identify putatively causal genetic variants and target genes, we prioritized two ovarian outcome loci (1q22 and 19p12) for further study. Bioinformatic and functional genetic analyses indicated that MEF2D and ZNF100 are targets of candidate outcome variants at 1q22 and 19p12, respectively. At 19p12, the chromatin interaction of a putative regulatory element with the ZNF100 promoter region correlated with candidate outcome variants. At 1q22, putative regulatory elements enhanced MEF2D promoter activity and haplotypes containing candidate outcome variants modulated these effects. In a public dataset, MEF2D and ZNF100 expression were both associated with ovarian cancer progression-free or overall survival time. In an extended set of 6,162 epithelial ovarian cancer patients, we found that functional candidates at the 1q22 and 19p12 loci, as well as other regional variants, were nominally associated with patient outcome; however, no associations reached our threshold for statistical significance (p<1×10⁻⁵). Larger patient numbers will be needed to convincingly identify any true associations at these loci. PMID:29029385