gene function analysis: Topics by Science.gov

Sample records for gene function analysis

Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

PubMed

Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C

2009-11-24

Computational methods for determining the function of genes in newly sequenced genomes have traditionally been based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context, and it relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
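The co-occurrence idea in the abstract above can be illustrated with a minimal sketch. It is not the IMG implementation: the gene family names, genome identifiers, the Jaccard score and the 0.8 threshold are all assumptions chosen for illustration. Each family is represented by the set of genomes in which it occurs, and families with highly overlapping profiles are reported as candidates for a functional link.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence profiles (sets of genome IDs)."""
    union = profile_a | profile_b
    if not union:
        return 0.0
    return len(profile_a & profile_b) / len(union)


# Hypothetical data: gene family -> genomes in which it is present.
profiles = {
    "famA": {"g1", "g2", "g3", "g5"},
    "famB": {"g1", "g2", "g3", "g5"},
    "famC": {"g4", "g6"},
}


def co_occurring_pairs(profiles, threshold=0.8):
    """Return (family, family, score) for pairs at or above the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = jaccard(profiles[a], profiles[b])
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


pairs = co_occurring_pairs(profiles)
print(pairs)
```

In this toy input only famA and famB share a profile, so they form the single reported pair; a real analysis would also need to correct for phylogenetic relatedness among the genomes.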
Plant functional genomics

NASA Astrophysics Data System (ADS)

Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

2002-04-01

Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly, the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.
Comparative genome analysis of PHB gene family reveals deep evolutionary origins and diverse gene function.

PubMed

Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S

2010-10-07

The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, including development, senescence and defence. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. To better guide gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes are a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion. However, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms. 
The highly diverse biological function indicated that more research needs to be carried out to dissect the PHB gene function. The conserved gene evolution indicated that the study in the model species can be translated to human and mammalian studies.
Identification of key genes associated with the effect of estrogen on ovarian cancer using microarray analysis.

PubMed

Zhang, Shi-tao; Zuo, Chao; Li, Wan-nan; Fu, Xue-qi; Xing, Shu; Zhang, Xiao-ping

2016-02-01

To identify key genes related to the effect of estrogen on ovarian cancer. Microarray data (GSE22600) were downloaded from Gene Expression Omnibus. Eight estrogen and seven placebo treatment samples were obtained using a 2 × 2 factorial design, which contained 2 cell lines (PEO4 and 2008) and 2 treatments (estrogen and placebo). Differentially expressed genes (DEGs) were identified by Bayesian methods, with P < 0.05 and |log2FC (fold change)| ≥ 0.5 as the cut-off criteria. Differentially co-expressed genes (DCGs) and differentially regulated genes (DRGs) were identified by the DCe and DRsort functions, respectively, in the DCGL package. Topological structure analysis was performed on the important transcriptional factors (TFs) and genes in the transcriptional regulatory network using tYNA. Functional enrichment analysis was performed for the DEGs and the important genes using the Gene Ontology and KEGG databases. In total, 465 DEGs were identified. Functional enrichment analysis of DEGs indicated that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. In addition, 2285 DCG pairs and 357 DRGs were identified. Topological structure analysis identified 52 important TFs and 65 important genes. Functional enrichment analysis of the important genes showed that TP53 and MLH1 participated in the DNA damage response and that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. TP53, MLH1, ACVR2B, LTBP1 and BMP7 might participate in the pathogenesis of ovarian cancer.
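The selection rule stated in the abstract above (P < 0.05 and |log2FC| ≥ 0.5) can be sketched as a simple filter. This shows only the thresholding step, not the Bayesian testing that produces the P values; the gene labels and numbers below are made up for illustration and are not from GSE22600.

```python
def is_deg(p_value, log2_fc, p_cutoff=0.05, lfc_cutoff=0.5):
    """Apply the cut-off: P below p_cutoff and |log2FC| at or above lfc_cutoff."""
    return p_value < p_cutoff and abs(log2_fc) >= lfc_cutoff


# Hypothetical test results: (gene label, P value, log2 fold change).
results = [
    ("GENE_A", 0.001, 1.2),   # passes both criteria
    ("GENE_B", 0.20, 2.0),    # large change but not significant
    ("GENE_C", 0.01, -0.7),   # significant down-regulation
    ("GENE_D", 0.03, 0.1),    # significant but change too small
]

degs = [gene for gene, p, lfc in results if is_deg(p, lfc)]
print(degs)
```

Taking the absolute value of the log2 fold change keeps both up- and down-regulated genes, which is why GENE_C is retained.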
Genome-wide analysis of the Dof transcription factor gene family reveals soybean-specific duplicable and functional characteristics.

PubMed

Guo, Yong; Qiu, Li-Juan

2013-01-01

The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
Soybean kinome: functional classification and gene expression patterns

PubMed Central

Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek

2015-01-01

The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
GO-based functional dissimilarity of gene sets.

PubMed

Díaz-Díaz, Norberto; Aguilar-Ruiz, Jesús S

2011-09-01

The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, identifying the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and quantifying functional coherence can help to clarify the role of a group of genes working together. To implement this approach to functional assessment, we present GFD (GO-based Functional Dissimilarity), a novel dissimilarity measure for evaluating groups of genes based on the most relevant functions of the whole set. The measure assigns a numerical value to the gene set for each of the three GO sub-ontologies. Results show that GFD performs robustly when applied to gene sets of known functionality (extracted from KEGG). It performs particularly well on randomly generated gene sets. An ROC analysis reveals that the performance of GFD in evaluating the functional dissimilarity of gene sets is very satisfactory. A comparative analysis against other functional measures, such as GS2 and those presented by Resnik and Wang, also demonstrates the robustness of GFD.
Using scale and feather traits for module construction provides a functional approach to chicken epidermal development.

PubMed

Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H

2017-11-01

Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.
Annotation of gene function in citrus using gene expression information and co-expression networks

PubMed Central

2014-01-01

Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for gene co-expression analysis in citrus. PMID:25023870
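The "guilt-by-association" principle and the guide-gene approach described in this record can be sketched as follows. This is an illustration only, not the NICCE method: the abstract does not state which similarity measure was used, so Pearson correlation, the 0.9 cut-off, the gene labels and the expression values are all assumptions. Genes whose expression across conditions correlates strongly with a guide gene of known function become candidates for a related function.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


# Hypothetical expression values across four conditions.
expression = {
    "known_gene": [1.0, 2.0, 3.0, 4.0],
    "unknown_1": [2.1, 3.9, 6.2, 8.0],
    "unknown_2": [4.0, 1.0, 3.5, 1.2],
}


def coexpressed_with(guide, expression, min_r=0.9):
    """Names of genes whose correlation with the guide gene is >= min_r."""
    guide_profile = expression[guide]
    return sorted(
        gene
        for gene, profile in expression.items()
        if gene != guide and pearson(guide_profile, profile) >= min_r
    )


partners = coexpressed_with("known_gene", expression)
print(partners)
```

Here only unknown_1 tracks the guide gene closely; unknown_2 is negatively correlated and is left out.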
Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment

PubMed Central

Uddin, Raihan; Singh, Shiva M.

2017-01-01

As humans age, many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and to identify novel ASLI-related genes and their networks based on the co-expression relationships of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
Taken together, they provide a new insight and generate new hypotheses into the molecular mechanisms responsible for age associated learning impairment, including spatial learning. PMID:29066959
Genome-Wide Detection and Analysis of Multifunctional Genes

PubMed Central

Pritykin, Yuri; Ghersi, Dario; Singh, Mona

2015-01-01

Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
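The core selection step described above, extracting genes annotated to at least two distinct biological processes, can be sketched as below. This is a simplified illustration rather than the authors' procedure: the annotations are invented, and "distinct" is approximated by a hand-written set of term pairs considered related (the `related` set is a hypothetical stand-in for whatever ontology-based criterion is actually used).

```python
from itertools import combinations

# Hypothetical annotations: gene -> biological process terms.
annotations = {
    "gene1": {"DNA repair", "cell migration"},
    "gene2": {"DNA repair", "response to DNA damage"},
    "gene3": {"translation"},
}

# Hypothetical pairs of terms treated as the same broad process.
related = {frozenset({"DNA repair", "response to DNA damage"})}


def is_multifunctional(terms, related):
    """True if at least one pair of the gene's terms is not marked as related."""
    return any(
        frozenset(pair) not in related
        for pair in combinations(sorted(terms), 2)
    )


multifunctional = sorted(
    gene for gene, terms in annotations.items()
    if is_multifunctional(terms, related)
)
print(multifunctional)
```

gene2 has two annotations but they describe closely related processes, so only gene1 qualifies; this is the reason a plain count of annotations would overestimate multifunctionality.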
Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

PubMed

Xu, Kelin; Jin, Li; Xiong, Momiao

2017-05-18

Epistasis plays an essential role in understanding regulation mechanisms and is an essential component of the genetic architecture of gene expression. However, interaction analysis of gene expression remains fundamentally unexplored due to great computational challenges and limited data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position-level read count curves. A single number for measuring gene expression, which is widely used for microarray-measured gene expression analysis, is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses, where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors, where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairs of SNPs, the FRGM takes a gene as the basic unit for epistasis analysis: it tests for the interaction of all possible pairs of genes and uses all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project. 
The numbers of pairs of significantly interacting genes after Bonferroni correction identified using FRGM, RPKM and DESeq were 16,2361, 260 and 51, respectively, from the 350 European samples. The proposed FRGM for epistasis analysis of RNA-seq can capture isoform and position-level information and will have a broad application. Both simulations and real data analysis highlight the potential for the FRGM to be a good choice of the epistatic analysis with sequencing data.
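The Bonferroni correction mentioned above can be sketched in a few lines. This covers only the multiple-testing step, not the FRGM itself; the gene-pair labels and P values are invented, and the family-wise level of 0.05 is an assumption because the abstract does not state it. With m tests, a pair is called significant only if its P value is at most alpha / m.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the keys whose P value is <= alpha divided by the number of tests."""
    m = len(p_values)
    if m == 0:
        return set()
    threshold = alpha / m
    return {key for key, p in p_values.items() if p <= threshold}


# Hypothetical interaction tests: (gene, gene) -> P value.
p_values = {
    ("geneA", "geneB"): 0.001,
    ("geneA", "geneC"): 0.02,
    ("geneB", "geneC"): 0.012,
    ("geneC", "geneD"): 0.5,
}

significant = bonferroni_significant(p_values)
print(sorted(significant))
```

With four tests the per-test threshold is 0.0125, so the pair with P = 0.02 would pass an uncorrected 0.05 cut-off but fails after correction. When all pairs of genes are tested, m grows quadratically with the number of genes, which makes the threshold very strict.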
Dynamic gene expression analysis in a H1N1 influenza virus mouse pneumonia model.

PubMed

Bao, Yanyan; Gao, Yingjie; Shi, Yujing; Cui, Xiaolan

2017-06-01

H1N1, a major pathogenic subtype of influenza A virus, causes a respiratory infection in humans and livestock that can range from a mild infection to more severe pneumonia associated with acute respiratory distress syndrome. Understanding the dynamic changes in the genome and the related functional changes induced by H1N1 influenza virus infection is essential to elucidating the pathogenesis of this virus and thereby determining strategies to prevent future outbreaks. In this study, we filtered the significantly expressed genes in mouse pneumonia using mRNA microarray analysis. Using STC analysis, seven significant gene clusters were revealed, and using STC-GO analysis, we explored the significant functions of these seven gene clusters. The results revealed GOs related to H1N1 virus-induced inflammatory and immune functions, including innate immune response, inflammatory response, specific immune response, and cellular response to interferon-beta. Furthermore, the dynamic regulation relationships of the key genes in mouse pneumonia were revealed by dynamic gene network analysis, and the most important genes were filtered, including Dhx58, Cxcl10, Cxcl11, Zbp1, Ifit1, Ifih1, Trim25, Mx2, Oas2, Cd274, Irgm1, and Irf7. These results suggested that during mouse pneumonia, changes in the expression of gene clusters and the complex interactions among genes lead to significant changes in function. Dynamic gene expression analysis revealed key genes that performed important functions. These results are a prelude to advancements in mouse H1N1 influenza virus infection biology, as well as the use of mice as a model organism for human H1N1 influenza virus infection studies.
Systems analysis of transcriptome data provides new hypotheses about Arabidopsis root response to nitrate treatments

PubMed Central

Canales, Javier; Moyano, Tomás C.; Villarroel, Eva; Gutiérrez, Rodrigo A.

2014-01-01

Nitrogen (N) is an essential macronutrient for plant growth and development. Plants adapt to changes in N availability partly by changes in global gene expression. We integrated publicly available root microarray data under contrasting nitrate conditions to identify new genes and functions important for adaptive nitrate responses in Arabidopsis thaliana roots. Overall, more than 2000 genes exhibited changes in expression in response to nitrate treatments in Arabidopsis thaliana root organs. Global regulation of gene expression by nitrate depends largely on the experimental context. However, despite significant differences from experiment to experiment in the identity of regulated genes, there is a robust nitrate response of specific biological functions. Integrative gene network analysis uncovered relationships between nitrate-responsive genes and 11 highly co-expressed gene clusters (modules). Four of these gene network modules have robust nitrate-responsive functions such as transport, signaling, and metabolism. Network analysis suggested that G2-like transcription factors are key regulatory factors controlling transport and signaling functions. Our meta-analysis highlights the role of biological processes not studied before in the context of the nitrate response, such as root hair development, and provides testable hypotheses to advance our understanding of nitrate responses in plants. PMID:24570678
Genetic interaction analysis of point mutations enables interrogation of gene function at a residue-level resolution

PubMed Central

Braberg, Hannes; Moehle, Erica A.; Shales, Michael; Guthrie, Christine; Krogan, Nevan J.

2014-01-01

We have achieved a residue-level resolution of genetic interaction mapping – a technique that measures how the function of one gene is affected by the alteration of a second gene – by analyzing point mutations. Here, we describe how to interpret point mutant genetic interactions, and outline key applications for the approach, including interrogation of protein interaction interfaces and active sites, and examination of post-translational modifications. Genetic interaction analysis has proven effective for characterizing cellular processes; however, to date, systematic high-throughput genetic interaction screens have relied on gene deletions or knockdowns, which limits the resolution of gene function analysis and poses problems for multifunctional genes. Our point mutant approach addresses these issues, and further provides a tool for in vivo structure-function analysis that complements traditional biophysical methods. We also discuss the potential for genetic interaction mapping of point mutations in human cells and its application to personalized medicine. PMID:24842270
Pan-Cancer Analysis of Mutation Hotspots in Protein Domains.

PubMed

Miller, Martin L; Reznik, Ed; Gauthier, Nicholas P; Aksoy, Bülent Arman; Korkut, Anil; Gao, Jianjiong; Ciriello, Giovanni; Schultz, Nikolaus; Sander, Chris

2015-09-23

In cancer genomics, recurrence of mutations in independent tumor samples is a strong indicator of functional impact. However, rare functional mutations can escape detection by recurrence analysis owing to lack of statistical power. We enhance statistical power by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. Domain mutation analysis also sharpens the functional interpretation of the impact of mutations, as domains more succinctly embody function than entire genes. By mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of domains, we confirm well-known functional mutation hotspots, identify uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in another gene, detect previously unknown mutation hotspots, and provide hypotheses about molecular mechanisms and downstream effects of domain mutations. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis will likely provide many more leads linking mutations in proteins to the cancer phenotype. Copyright © 2015 Elsevier Inc. All rights reserved.
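The pooling step described above, mapping mutations from different genes to equivalent positions in a domain alignment and counting recurrence per column, can be sketched as follows. This is an illustration under stated assumptions, not the published pipeline: the gene labels, residue positions and the residue-to-column mapping are invented, and no statistical test is applied to the counts.

```python
from collections import Counter

# Hypothetical mapping for one domain family:
# gene -> {protein residue position: alignment column}.
alignment_column = {
    "GENE_X": {10: 3, 11: 4},
    "GENE_Y": {57: 3, 58: 4},
}

# Hypothetical observed mutations: (gene, protein residue position).
mutations = [
    ("GENE_X", 10),
    ("GENE_Y", 57),
    ("GENE_Y", 57),
    ("GENE_X", 11),
    ("GENE_Y", 99),  # falls outside the domain, so it is ignored
]


def domain_column_counts(mutations, alignment_column):
    """Count mutations per alignment column, pooled over all genes."""
    counts = Counter()
    for gene, position in mutations:
        column = alignment_column.get(gene, {}).get(position)
        if column is not None:
            counts[column] += 1
    return counts


counts = domain_column_counts(mutations, alignment_column)
print(counts.most_common(1))
```

Column 3 collects three mutations from two genes even though GENE_X contributes only one, which is how pooling across a domain family can expose a rare variant that single-gene recurrence would miss.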
Reveal genes functionally associated with ACADS by a network study.

PubMed

Chen, Yulong; Su, Zhiguang

2015-09-15

Systematic network construction aims to identify essential human gene-gene and gene-disease pathways by means of network inter-connection patterns and functional annotation analysis. In the present study, we analyzed the functional gene interactions of the short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in the disease-specific ACADS network were derived by integrating gene function and protein interaction data. Among the 8 genes in the ACADS network retrieved from both STRING and GeneMANIA, ACADS is most closely linked with 4 genes: HADHA, HADHB, ECHS1 and ACAT1. Functional analysis was performed via ontological annotation and candidate disease identification. Interestingly, the ontological annotation of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. ACAT1, together with ACADS, participates in ketone metabolism. Our computational gene network analysis also predicts candidate diseases, indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only in lipid metabolism but also in infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of the ACADS network, its functional roles and the candidate diseases associated with it. Copyright © 2015 Elsevier B.V. All rights reserved.
Systematic computation with functional gene-sets among leukemic and hematopoietic stem cells reveals a favorable prognostic signature for acute myeloid leukemia.

PubMed

Yang, Xinan Holly; Li, Meiyi; Wang, Bin; Zhu, Wanqi; Desgardin, Aurelie; Onel, Kenan; de Jong, Jill; Chen, Jianjun; Chen, Luonan; Cunningham, John M

2015-03-24

Genes that regulate stem cell function are suspected to exert adverse effects on prognosis in malignancy. However, diverse cancer stem cell signatures are difficult for physicians to interpret and apply clinically. To connect the transcriptome and stem cell biology, with potential clinical applications, we propose a novel computational "gene-to-function, snapshot-to-dynamics, and biology-to-clinic" framework to uncover core functional gene-sets signatures. This framework incorporates three function-centric gene-set analysis strategies: a meta-analysis of both microarray and RNA-seq data, novel dynamic network mechanism (DNM) identification, and a personalized prognostic indicator analysis. This work uses complex disease acute myeloid leukemia (AML) as a research platform. We introduced an adjustable "soft threshold" to a functional gene-set algorithm and found that two different analysis methods identified distinct gene-set signatures from the same samples. We identified a 30-gene cluster that characterizes leukemic stem cell (LSC)-depleted cells and a 25-gene cluster that characterizes LSC-enriched cells in parallel; both mark favorable-prognosis in AML. Genes within each signature significantly share common biological processes and/or molecular functions (empirical p = 6e-5 and 0.03 respectively). The 25-gene signature reflects the abnormal development of stem cells in AML, such as AURKA over-expression. We subsequently determined that the clinical relevance of both signatures is independent of known clinical risk classifications in 214 patients with cytogenetically normal AML. We successfully validated the prognosis of both signatures in two independent cohorts of 91 and 242 patients respectively (log-rank p < 0.0015 and 0.05; empirical p < 0.015 and 0.08). 
The proposed algorithms and computational framework will harness systems biology research because they efficiently translate gene-sets (rather than single genes) into biological discoveries about AML and other complex diseases.
GeneSCF: a real-time based functional enrichment tool with support for multiple organisms.

PubMed

Subhash, Santhilal; Kanduri, Chandrasekhar

2016-09-13

High-throughput technologies such as ChIP-sequencing, RNA-sequencing, DNA sequencing and quantitative metabolomics generate a huge volume of data. Researchers often rely on functional enrichment tools to interpret the biological significance of the affected genes from these high-throughput studies. However, currently available functional enrichment tools need to be updated frequently to adapt to new entries from the functional database repositories. Hence there is a need for a simplified tool that can perform functional enrichment analysis by using updated information directly from source databases such as KEGG, Reactome or Gene Ontology. In this study, we focused on designing a command-line tool called GeneSCF (Gene Set Clustering based on Functional annotations) that can predict the functionally relevant biological information for a set of genes in a real-time updated manner. It is designed to handle information from more than 4000 organisms from freely available prominent functional databases like KEGG, Reactome and Gene Ontology. We successfully employed our tool on two published datasets to predict the biologically relevant functional information. The core features of this tool were tested on Linux machines without installing additional dependencies. GeneSCF is more reliable than other enrichment tools because of its ability to use reference functional databases in real time to perform enrichment analysis. It is easy to integrate with other pipelines available for downstream analysis of high-throughput data. More importantly, GeneSCF can run multiple gene lists simultaneously on different organisms, thereby saving time for the users. Since the tool is designed to be ready-to-use, there is no need for any complex compilation and installation procedures.
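For readers unfamiliar with functional enrichment, the core calculation can be sketched with an upper-tail hypergeometric test. The abstract does not say which statistic GeneSCF uses, so this is an assumption chosen because it is a common over-representation test. The function name and the toy numbers are hypothetical, and nothing here reflects GeneSCF's command-line interface.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p-value for over-representation.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    Returns P(X >= k) under sampling without replacement.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 20 background genes, 5 annotated to a term,
# a query list of 4 genes of which 2 carry the annotation.
p_value = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
```

In practice one p-value is computed per term and the set is corrected for multiple testing. That step is omitted here to keep the sketch short.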

NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology.

PubMed

Wei, Qing; Khan, Ishita K; Ding, Ziyun; Yerneni, Satwica; Kihara, Daisuke

2017-03-20

The number of genomics and proteomics experiments is growing rapidly, producing an ever-increasing amount of data that are awaiting functional interpretation. A number of function prediction algorithms were developed and improved to enable fast and automatic function annotation. With the well-defined structure and manual curation, Gene Ontology (GO) is the most frequently used vocabulary for representing gene functions. To understand relationship and similarity between GO annotations of genes, it is important to have a convenient pipeline that quantifies and visualizes the GO function analyses in a systematic fashion. NaviGO is a web-based tool for interactive visualization, retrieval, and computation of functional similarity and associations of GO terms and genes. Similarity of GO terms and gene functions is quantified with six different scores including protein-protein interaction and context based association scores we have developed in our previous works. Interactive navigation of the GO function space provides intuitive and effective real-time visualization of functional groupings of GO terms and genes as well as statistical analysis of enriched functions. We developed NaviGO, which visualizes and analyses functional similarity and associations of GO terms and genes. The NaviGO webserver is freely available at: http://kiharalab.org/web/navigo .
Identification and expression profiling analysis of TCP family genes involved in growth and development in maize.

PubMed

Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu

2017-10-01

The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, known as the TCP domain, which is implicated in DNA binding and protein-protein interactions. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) was conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. Moreover, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in developmental stages of plant growth. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

PubMed Central

Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

2009-01-01

Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386
Identification, characterization and expression analysis of lineage-specific genes within sweet orange (Citrus sinensis).

PubMed

Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang

2015-11-23

With the rapidly increasing number of available genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like conserved functional genes, LSGs play important roles in biological evolution and function. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. For the two sets of genes, gene structure and gene expression patterns were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content than evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and that some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.
Transcriptome Analysis of a Premature Leaf Senescence Mutant of Common Wheat (Triticum aestivum L.)

PubMed Central

Xia, Chuan; Zhang, Lichao; Dong, Chunhao; Liu, Xu; Kong, Xiuying

2018-01-01

Leaf senescence is an important agronomic trait that affects both crop yield and quality. In this study, we characterized a premature leaf senescence mutant of wheat (Triticum aestivum L.) obtained by ethylmethane sulfonate (EMS) mutagenesis, named m68. Genetic analysis showed that the leaf senescence phenotype of m68 is controlled by a single recessive nuclear gene. We compared the transcriptome of wheat leaves between the wild type (WT) and the m68 mutant at four time points. Differentially expressed gene (DEG) analysis revealed many genes that were closely related to senescence genes. Gene Ontology (GO) enrichment analysis suggested that transcription factors and protein transport genes might function in the beginning of leaf senescence, while genes that were associated with chlorophyll and carbon metabolism might function in the later stage. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the genes that are involved in plant hormone signal transduction were significantly enriched. Through expression pattern clustering of DEGs, we identified 1012 genes that were induced during senescence, and we found that the WRKY family and zinc finger transcription factors might be more important than other transcription factors in the early stage of leaf senescence. These results will not only support further gene cloning and functional analysis of m68, but also facilitate the study of leaf senescence in wheat. PMID:29534430
BeeSpace Navigator: exploratory analysis of gene function using semantic indexing of biological literature.

PubMed

Sen Sarma, Moushumi; Arcoleo, David; Khetani, Radhika S; Chee, Brant; Ling, Xu; He, Xin; Jiang, Jing; Mei, Qiaozhu; Zhai, ChengXiang; Schatz, Bruce

2011-07-01

With the rapid decrease in cost of genome sequencing, the classification of gene function is becoming a primary problem. Such classification has been performed by human curators who read biological literature to extract evidence. BeeSpace Navigator is a prototype software for exploratory analysis of gene function using biological literature. The software supports an automatic analogue of the curator process to extract functions, with a simple interface intended for all biologists. Since extraction is done on selected collections that are semantically indexed into conceptual spaces, the curation can be task specific. Biological literature containing references to gene lists from expression experiments can be analyzed to extract concepts that are computational equivalents of a classification such as Gene Ontology, yielding discriminating concepts that differentiate gene mentions from other mentions. The functions of individual genes can be summarized from sentences in biological literature, to produce results resembling a model organism database entry that is automatically computed. Statistical frequency analysis based on literature phrase extraction generates offline semantic indexes to support these gene function services. The website with BeeSpace Navigator is free and open to all; there is no login requirement at www.beespace.illinois.edu for version 4. Materials from the 2010 BeeSpace Software Training Workshop are available at www.beespace.illinois.edu/bstwmaterials.php.
Genome-Wide Identification and Structural Analysis of bZIP Transcription Factor Genes in Brassica napus.

PubMed

Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Lu, Kun; Xu, Xinfu; Wang, Rui; Li, Jiana; Qu, Cunmin

2017-10-24

The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus.
Genome-Wide Identification and Structural Analysis of bZIP Transcription Factor Genes in Brassica napus

PubMed Central

Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Xu, Xinfu; Wang, Rui; Li, Jiana

2017-01-01

The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus. PMID:29064393
Familial aggregation analysis of gene expressions

PubMed Central

Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K

2007-01-01

Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
Decoding genes with coexpression networks and metabolomics - 'majority report by precogs'.

PubMed

Saito, Kazuki; Hirai, Masami Y; Yonekura-Sakakibara, Keiko

2008-01-01

Following the sequencing of whole genomes of model plants, high-throughput decoding of gene function is a major challenge in modern plant biology. In view of remarkable technical advances in transcriptomics and metabolomics, integrated analysis of these 'omics' by data-mining informatics is an excellent tool for prediction and identification of gene function, particularly for genes involved in complicated metabolic pathways. The availability of Arabidopsis public transcriptome datasets containing data of >1000 microarrays reinforces the potential for prediction of gene function by transcriptome coexpression analysis. Here, we review the strategy of combining transcriptome and metabolome as a powerful technology for studying the functional genomics of model plants and also crop and medicinal plants.
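Coexpression-based function prediction of the kind reviewed here usually starts from a correlation between expression profiles across many arrays. The following is a minimal sketch assuming Pearson correlation and a fixed cutoff. The gene labels, the `threshold` value, and the function names are hypothetical, and the review itself does not prescribe a particular measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_with(bait, profiles, threshold=0.8):
    """Genes whose profile correlates with the bait's at or above threshold,
    sorted from strongest to weakest correlation."""
    hits = [
        (gene, pearson(profiles[bait], values))
        for gene, values in profiles.items()
        if gene != bait
    ]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Toy expression matrix: one row per gene, one column per array (hypothetical values).
profiles = {
    "bait": [1.0, 2.0, 3.0, 4.0],
    "geneX": [2.0, 4.1, 6.0, 8.2],   # rises with the bait
    "geneY": [4.0, 3.0, 2.0, 1.0],   # anti-correlated
    "geneZ": [1.0, 3.0, 2.0, 2.5],   # weakly correlated
}
partners = coexpressed_with("bait", profiles)
```

Genes that pass the cutoff become candidates for sharing the bait's pathway, and metabolite profiles can then be used to test those candidates.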
Recent Achievement in Gene Cloning and Functional Genomics in Soybean

PubMed Central

Zhai, Hong; Lü, Shixiang; Wu, Hongyan; Zhang, Yupeng

2013-01-01

Soybean is a model plant for photoperiodism as well as for symbiotic nitrogen fixation. However, a rather low efficiency in soybean transformation hampers functional analysis of genes isolated from soybean. In comparison, rapid development and progress in flowering time and photoperiodic response have been achieved in Arabidopsis and rice. As the soybean genomic information has been released since 2008, gene cloning and functional genomic studies have been revived as indicated by successfully characterizing genes involved in maturity and nematode resistance. Here, we review some major achievements in the cloning of some important genes and some specific features at genetic or genomic levels revealed by the analysis of functional genomics of soybean. PMID:24311973
COGNAT: a web server for comparative analysis of genomic neighborhoods.

PubMed

Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y

2017-11-22

In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. Building on the assignment of gene sequences to clusters of orthologous groups of proteins (COGs), we set out to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionarily related genes, together with a corresponding web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
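Comparing genomic neighborhoods can be illustrated by collecting the COG labels that flank a gene of interest in two genomes and measuring their overlap. This is a simplified sketch and not COGNAT's actual algorithm: the Jaccard index, the `flank` window, the linear-contig assumption, and the placeholder labels `cogA` to `cogE` are all choices made for the example. Only COG1009 and COG3002 come from the abstract.

```python
def neighborhood(genome, index, flank=2):
    """COG labels within `flank` genes on either side of genome[index].

    genome: list of COG labels in chromosomal order (treated as a linear contig).
    """
    start = max(0, index - flank)
    return set(genome[start:index] + genome[index + 1:index + 1 + flank])


def jaccard(a, b):
    """Shared labels divided by all labels; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy gene orders for two hypothetical genomes.
genome_1 = ["cogA", "COG3002", "COG1009", "cogB", "cogC"]
genome_2 = ["cogD", "COG3002", "COG1009", "cogB", "cogE"]

context_1 = neighborhood(genome_1, genome_1.index("COG1009"))
context_2 = neighborhood(genome_2, genome_2.index("COG1009"))
similarity = jaccard(context_1, context_2)
```

A label that recurs next to the gene of interest across many genomes, as COG3002 does here, is the kind of signal that suggests functional coupling.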
Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function

PubMed Central

2009-01-01

Background A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. In the first, one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes. In the second, one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined. Results We introduce a scoring function, Gene Set Z-score (GSZ), for the analysis of functional class over-representation that combines two previous analysis methods. GSZ encompasses popular functions such as correlation, hypergeometric test, Max-Mean and Random Sets as limiting cases. GSZ is stable against changes in class size as well as across different positions of the analysed gene list in tests with randomized data. GSZ shows the best overall performance in a detailed comparison to popular functions using artificial data. Likewise, GSZ stands out in a cross-validation of methods using split real data. A comparison of empirical p-values further shows a strong difference in favour of GSZ, which clearly reports better p-values for top classes than the other methods. Furthermore, GSZ detects relevant biological themes that are missed by the other methods. These observations also hold when comparing GSZ with popular program packages. Conclusion GSZ and improved versions of earlier methods are a useful contribution to the analysis of differential gene expression. The methods and supplementary material are available from the website http://ekhidna.biocenter.helsinki.fi/users/petri/public/GSZ/GSZscore.html. PMID:19775443
Integrative analysis for identification of shared markers from various functional cells/tissues for rheumatoid arthritis.

PubMed

Xia, Wei; Wu, Jian; Deng, Fei-Yan; Wu, Long-Fei; Zhang, Yong-Hong; Guo, Yu-Fan; Lei, Shu-Feng

2017-02-01

Rheumatoid arthritis (RA) is a systemic autoimmune disease. So far, it is unclear whether common RA-related genes are shared across different tissues/cells. In this study, we conducted an integrative analysis on multiple datasets to identify potential shared genes that are significant in multiple tissues/cells for RA. Seven microarray gene expression datasets representing various RA-related tissues/cells were downloaded from the Gene Expression Omnibus (GEO). Statistical analyses, testing both marginal and joint effects, were conducted to identify significant genes shared in various samples. Follow-up analyses comprised functional annotation clustering, protein-protein interaction (PPI) analysis, gene-based association analysis, and ELISA validation in in-house samples. We identified 18 shared significant genes, which were mainly involved in the immune response and chemokine signaling pathway. Among the 18 genes, eight genes (PPBP, PF4, HLA-F, S100A8, RNASEH2A, P2RY6, JAG2, and PCBP1) interact with known RA genes. Two genes (HLA-F and PCBP1) are significant in gene-based association analysis (P = 1.03E-31, P = 1.30E-2, respectively). Additionally, PCBP1 also showed differential protein expression levels in in-house case-control plasma samples (P = 2.60E-2). This study represents the first effort to identify shared RA markers from different functional cells or tissues. The results suggest that one of the shared genes, PCBP1, is a promising biomarker for RA.
snpGeneSets: An R Package for Genome-Wide Study Annotation

PubMed Central

Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

2016-01-01

Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
Basic Helix-Loop-Helix Transcription Factor Gene Family Phylogenetics and Nomenclature

PubMed Central

Skinner, Michael K.; Rawls, Alan; Wilson-Rawls, Jeanne; Roalson, Eric H.

2010-01-01

A phylogenetic analysis of the basic helix-loop-helix (bHLH) gene superfamily was performed using seven different species (human, mouse, rat, worm, fly, yeast, and plant Arabidopsis) and involving over 600 bHLH genes [1]. All bHLH genes were identified in the genomes of the various species, including expressed sequence tags, and the entire coding sequence was used in the analysis. Nearly 15% of the gene family has been updated or added since the original publication. A super-tree involving six clades and all structural relationships was established and is now presented for four of the species. The wealth of functional data available for members of the bHLH gene superfamily provides us with the opportunity to use this exhaustive phylogenetic tree to predict potential functions of uncharacterized members of the family. This phylogenetic and genomic analysis of the bHLH gene family has revealed unique elements of the evolution and functional relationships of the different genes in the bHLH gene family. PMID:20219281
Linking disease-associated genes to regulatory networks via promoter organization

PubMed Central

Döhr, S.; Klingenhoff, A.; Maier, H.; de Angelis, M. Hrabé; Werner, T.; Schneider, R.

2005-01-01

Pathway- or disease-associated genes may participate in more than one transcriptional co-regulation network. Such gene groups can be readily obtained by literature analysis or by high-throughput techniques such as microarrays or protein-interaction mapping. We developed a strategy that defines regulatory networks by in silico promoter analysis, finding potentially co-regulated subgroups without a priori knowledge. Pairs of transcription factor binding sites conserved in orthologous genes (vertically) as well as in promoter sequences of co-regulated genes (horizontally) were used as seeds for the development of promoter models representing potential co-regulation. This approach was applied to a Maturity Onset Diabetes of the Young (MODY)-associated gene list, which yielded two models connecting functionally interacting genes within MODY-related insulin/glucose signaling pathways. Additional genes functionally connected to our initial gene list were identified by database searches with these promoter models. Thus, data-driven in silico promoter analysis allowed integrating molecular mechanisms with biological functions of the cell. PMID:15701758
Genomic survey, expression profile and co-expression network analysis of OsWD40 family in rice

PubMed Central

2012-01-01

Background WD40 proteins represent a large family in eukaryotes and are involved in a broad spectrum of crucial functions. Systematic characterization and co-expression analysis of OsWD40 genes enable us to understand the networks of the WD40 proteins and their biological processes and gene functions in rice. Results In this study, we identify and analyze 200 potential OsWD40 genes in rice, describing the gene structure, genome localization, and evolutionary relationships of each member. Expression profiles covering the whole life cycle in rice revealed that transcripts of OsWD40 genes accumulated differentially during vegetative and reproductive development and were preferentially up- or down-regulated in different tissues. In rice seedlings, 25 OsWD40 genes were differentially expressed under treatment with one or more of the phytohormones NAA, KT, or GA3. We also used a combined analysis of expression correlation and Gene Ontology annotation to infer the biological role of the OsWD40 genes in rice. The results suggested that OsWD40 genes may perform their diverse functions through complex networks, and were thus predictive for understanding their biological pathways. The analysis also revealed that OsWD40 genes might interact with each other to take part in metabolic pathways, suggesting a more complex feedback network. Conclusions All of these analyses suggest that the functions of OsWD40 genes are diversified, which provides a useful reference for selecting candidate genes for further functional studies. PMID:22429805
Identification of three duplicated Spin genes in medaka (Oryzias latipes).

PubMed

Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang

2005-05-09

Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequence and function is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA BLAST analysis and phylogenetic analysis indicated that the three OlSpin genes arose by gene duplication. Furthermore, Western blot analysis revealed significant expression differences of the three OlSpins among different tissues and during embryogenesis in medaka, suggesting that sequence and functional divergence might have occurred among them during evolution.
Genome-Wide Comparative Analysis of the Phospholipase D Gene Families among Allotetraploid Cotton and Its Diploid Progenitors

PubMed Central

Tang, Kai; Dong, Chun-Juan; Liu, Jin-Yuan

2016-01-01

In this study, 40 phospholipase D (PLD) genes were identified from allotetraploid cotton Gossypium hirsutum, and 20 PLD genes were examined in diploid cotton Gossypium raimondii. Combining with 19 previously identified Gossypium arboreum PLD genes, a comparative analysis was performed among the PLD gene families among allotetraploid and two diploid cottons. Based on the orthologous relationships, we found that almost each G. hirsutum PLD had a corresponding homolog in the G. arboreum and G. raimondii genomes, except for GhPLDβ3A, whose homolog GaPLDβ3 may have been lost during the evolution of G. arboreum after the interspecific hybridization. Phylogenetic analysis showed that all of the cotton PLDs were unevenly classified into six numbered subgroups: α, β/γ, δ, ε, ζ and φ. An N-terminal C2 domain was found in the α, β/γ, δ and ε subgroups, while phox homology (PX) and pleckstrin homology (PH) domains were identified in the ζ subgroup. The subgroup φ possessed a single peptide instead of a functional domain. In each phylogenetic subgroup, the PLDs showed high conservation in gene structure and amino acid sequences in functional domains. The expansion of GhPLD and GrPLD gene families were mainly attributed to segmental duplication and partly attributed to tandem duplication. Furthermore, purifying selection played a critical role in the evolution of PLD genes in cotton. Quantitative RT-PCR documented that allotetraploid cotton PLD genes were broadly expressed and each had a unique spatial and developmental expression pattern, indicating their functional diversification in cotton growth and development. Further analysis of cis-regulatory elements elucidated transcriptional regulations and potential functions. Our comparative analysis provided valuable information for understanding the putative functions of the PLD genes in cotton fiber. PMID:27213891

Functional relevance for type 1 diabetes mellitus-associated genetic variants by using integrative analyses.

PubMed

Qiu, Ying-Hua; Deng, Fei-Yan; Tang, Zai-Xiang; Jiang, Zhen-Huan; Lei, Shu-Feng

2015-10-01

Type 1 diabetes mellitus (type 1 DM) is an autoimmune disease. Although genome-wide association studies (GWAS) and meta-analyses have successfully identified numerous type 1 DM-associated susceptibility loci, the mechanisms underlying these loci remain largely unclear. Based on publicly available datasets, we performed integrative analyses (i.e., integrated gene relationships among implicated loci, differential gene expression analysis, functional prediction and functional annotation clustering analysis) and combined them with expression quantitative trait loci (eQTL) results to further explore the functional mechanisms underlying the associations between genetic variants and type 1 DM. Among a total of 183 type 1 DM-associated SNPs, eQTL analysis showed that 17 SNPs had cis-regulated eQTL effects on 9 genes. All 9 eQTL genes are enriched in immune-related pathways or Gene Ontology (GO) terms. Functional prediction analysis identified 5 SNPs located in transcription factor (TF) binding sites. Of the 9 eQTL genes, 6 (TAP2, HLA-DOB, HLA-DQB1, HLA-DQA1, HLA-DRB5 and CTSH) were differentially expressed in type 1 DM-related cells. In particular, rs3825932 in CTSH has integrative functional evidence supporting its association with type 1 DM. These findings indicate that integrative analyses can yield important functional information to link genetic variants and type 1 DM. Copyright © 2015 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
Phylogenetic analysis of IDD gene family and characterization of its expression in response to flower induction in Malus.

PubMed

Fan, Sheng; Zhang, Dong; Xing, Libo; Qi, Siyan; Du, Lisha; Wu, Haiqin; Shao, Hongxia; Li, Youmei; Ma, Juanjuan; Han, Mingyu

2017-08-01

Although INDETERMINATE DOMAIN (IDD) genes encoding specific plant transcription factors have important roles in plant growth and development, little is known about apple IDD (MdIDD) genes and their potential functions in the flower induction. In this study, we identified 20 putative IDD genes in apple and named them according to their chromosomal locations. All identified MdIDD genes shared a conserved IDD domain. A phylogenetic analysis separated MdIDDs and other plant IDD genes into four groups. Bioinformatic analysis of chemical characteristics, gene structure, and prediction of protein-protein interactions demonstrated the functional and structural diversity of MdIDD genes. To further uncover their potential functions, we performed analysis of tandem, synteny, and gene duplications, which indicated several paired homologs of IDD genes between apple and Arabidopsis. Additionally, genome duplications also promoted the expansion and evolution of the MdIDD genes. Quantitative real-time PCR revealed that all the MdIDD genes showed distinct expression levels in five different tissues (stems, leaves, buds, flowers, and fruits). Furthermore, the expression levels of candidate MdIDD genes were also investigated in response to various circumstances, including GA treatment (decreased the flowering rate), sugar treatment (increased the flowering rate), alternate-bearing conditions, and two varieties with different-flowering intensities. Parts of them were affected by exogenous treatments and showed different expression patterns. Additionally, changes in response to alternate-bearing and different-flowering varieties of apple trees indicated that they were also responsive to flower induction. Taken together, our comprehensive analysis provided valuable information for further analysis of IDD genes aiming at flower induction.
FunGene: the functional gene pipeline and repository.

PubMed

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

PubMed

Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

2017-10-01

Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 ( www.comparativego.com ). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
A human functional protein interaction network and its application to cancer data analysis

PubMed Central

2010-01-01

Background One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
Identification of candidate genes in osteoporosis by integrated microarray analysis.

PubMed

Li, J J; Wang, B Q; Fei, Q; Yang, Y; Li, D

2016-12-01

In order to screen the altered gene expression profile in peripheral blood mononuclear cells of patients with osteoporosis, we performed an integrated analysis of the online microarray studies of osteoporosis. We searched the Gene Expression Omnibus (GEO) database for microarray studies of peripheral blood mononuclear cells in patients with osteoporosis. Subsequently, we integrated gene expression data sets from multiple microarray studies to obtain differentially expressed genes (DEGs) between patients with osteoporosis and normal controls. Gene function analysis was performed to uncover the functions of identified DEGs. A total of three microarray studies were selected for integrated analysis. In all, 1125 genes were found to be significantly differentially expressed between osteoporosis patients and normal controls, with 373 upregulated and 752 downregulated genes. Positive regulation of the cellular amino metabolic process (gene ontology (GO): 0033240, false discovery rate (FDR) = 1.00E+00) was significantly enriched under the GO category for biological processes, while for molecular functions, flavin adenine dinucleotide binding (GO: 0050660, FDR = 3.66E-01) and androgen receptor binding (GO: 0050681, FDR = 6.35E-01) were significantly enriched. DEGs were enriched in many osteoporosis-related signalling pathways, including those of mitogen-activated protein kinase (MAPK) and calcium. Protein-protein interaction (PPI) network analysis showed that the significant hub proteins contained ubiquitin specific peptidase 9, X-linked (Degree = 99), ubiquitin specific peptidase 19 (Degree = 57) and ubiquitin conjugating enzyme E2 B (Degree = 57). Analysis of gene function of identified differentially expressed genes may expand our understanding of fundamental mechanisms leading to osteoporosis. Moreover, significantly enriched pathways, such as MAPK and calcium, may be involved in osteoporosis through osteoblastic differentiation and bone formation. Cite this article: J. J. Li, B. Q. Wang, Q. Fei, Y. Yang, D. Li. Identification of candidate genes in osteoporosis by integrated microarray analysis. Bone Joint Res 2016;5:594-601. DOI: 10.1302/2046-3758.512.BJR-2016-0073.R1. © 2016 Fei et al.
An improved method for functional similarity analysis of genes based on Gene Ontology.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-12-23

Measures of gene functional similarity are essential tools for gene clustering, gene function prediction, evaluation of protein-protein interaction, disease gene prioritization and other applications. In recent years, many gene functional similarity methods have been proposed based on the semantic similarity of GO terms. However, these leading approaches may make error-prone judgments, especially when they measure the specificity of GO terms as well as the information content (IC) of a term set. Therefore, how to estimate gene functional similarity reliably is still a challenging problem. We propose WIS, an effective method to measure gene functional similarity. First, WIS computes the IC of a term by employing its depth, the number of its ancestors and the topology of its descendants in the GO graph. Second, WIS calculates the IC of a term set by considering the weighted inherited semantics of terms. Finally, WIS estimates the gene functional similarity based on the IC overlap ratio of term sets. WIS is superior to some other representative measures in experiments on functional classification of genes in a biological pathway, collaborative evaluation of GO-based semantic similarity measures, protein-protein interaction prediction and correlation with gene expression. Further analysis suggests that WIS takes full account of the specificity of terms and the weighted inherited semantics of terms between GO terms. The proposed WIS method is an effective and reliable way to compare gene function. The web service of WIS is freely available at http://nclab.hit.edu.cn/WIS/ .
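The idea of an "IC overlap ratio of term sets" can be illustrated with a deliberately simplified Python sketch. This is not the WIS method: here the IC of a term is assumed to come from annotation frequency rather than from graph topology, no inherited-semantics weighting is applied, and each gene's term set is assumed to already include all ancestor terms. Function names and inputs are hypothetical.

```python
from math import log


def information_content(term_counts, total):
    """Frequency-based IC: -ln(fraction of genes annotated to the term)."""
    return {term: -log(count / total) for term, count in term_counts.items()}


def ic_overlap_similarity(terms_a, terms_b, ic):
    """Ratio of the summed IC of shared terms to the summed IC of all terms.

    `terms_a` and `terms_b` are sets of GO term identifiers (assumed to be
    closed under the ancestor relation); `ic` maps each term to its IC.
    Returns 0.0 when neither gene carries any informative term.
    """
    shared = terms_a & terms_b
    union = terms_a | terms_b
    denominator = sum(ic[t] for t in union)
    if denominator == 0:
        return 0.0
    return sum(ic[t] for t in shared) / denominator
```

Two genes that share only a broad, frequently used term score near zero, while genes sharing rare, specific terms score near one, which is the behaviour an IC-based ratio is meant to capture.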
Available nitrogen is the key factor influencing soil microbial functional gene diversity in tropical rainforest.

PubMed

Cong, Jing; Liu, Xueduan; Lu, Hui; Xu, Han; Li, Yide; Deng, Ye; Li, Diqiang; Zhang, Yuguang

2015-08-20

Tropical rainforests harbor over 50% of all known plant and animal species and provide a variety of key resources and ecosystem services to humans, largely mediated by metabolic activities of soil microbial communities. A deep analysis of soil microbial communities and their roles in ecological processes would improve our understanding of biogeochemical elemental cycles. However, soil microbial functional gene diversity in tropical rainforests and its causative factors remain unclear. GeoChip, which contains almost all of the key functional genes related to biogeochemical cycles, can be used as a specific and sensitive tool for studying microbial gene diversity and metabolic potential. In this study, soil microbial functional gene diversity in tropical rainforest was analyzed using GeoChip technology. Gene categories detected in the tropical rainforest soils were related to different biogeochemical processes, such as carbon (C), nitrogen (N) and phosphorus (P) cycling. The detected genes related to C and P cycling were mostly derived from cultured bacteria. C degradation gene categories for substrates ranging from labile C to recalcitrant C were all detected, and gene abundances in many recalcitrant C degradation gene categories were significantly (P < 0.05) different among three sampling sites. The relative abundance of detected genes related to N cycling was significantly (P < 0.05) different, and these genes were mostly derived from uncultured bacteria. The gene categories related to ammonification had a high relative abundance. Both canonical correspondence analysis and multivariate regression tree analysis showed that soil available N was the factor most strongly correlated with soil microbial functional gene structure. Overall, high microbial functional gene diversity and different soil microbial metabolic potential for different biogeochemical processes were considered to exist in tropical rainforest. Soil available N could be the key factor in shaping the soil microbial functional gene structure and metabolic potential.
Functional clustering of time series gene expression data by Granger causality

PubMed Central

2012-01-01

Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
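As a rough illustration of the pairwise building block behind such an analysis, the Python sketch below computes an F statistic for whether lagged values of one expression series improve the prediction of another beyond that series' own past. It is a generic textbook-style Granger test, not the set-based method of the paper; function names and the default lag order are assumptions.

```python
def solve_linear(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - partial) / m[r][r]
    return coef


def rss(design, target):
    """Residual sum of squares of the least-squares fit (via normal equations)."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(p)]
    coef = solve_linear(xtx, xty)
    return sum(
        (t - sum(c * v for c, v in zip(coef, row))) ** 2
        for row, t in zip(design, target)
    )


def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y' with `lag` past values of each series."""
    rows = range(lag, len(y))
    target = [y[t] for t in rows]
    # Restricted model: intercept plus y's own past values.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in rows]
    # Full model: the restricted regressors plus x's past values.
    full = [r + [x[t - k] for k in range(1, lag + 1)] for r, t in zip(restricted, rows)]
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    df_den = len(target) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / df_den)
```

A large value of `granger_f(x, y)` relative to an F distribution with `lag` and `df_den` degrees of freedom suggests that the past of `x` carries predictive information about `y`. Typical expression time series are short, so the lag order has to stay small for the test to be usable.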
Genome-wide analysis of the Solanum tuberosum (potato) trehalose-6-phosphate synthase (TPS) gene family: evolution and differential expression during development and stress.

PubMed

Xu, Yingchun; Wang, Yanjie; Mattson, Neil; Yang, Liu; Jin, Qijiang

2017-12-01

Trehalose-6-phosphate synthase (TPS) serves important functions in plant desiccation tolerance and response to environmental stimuli. At present, a comprehensive analysis of this gene family (i.e. functional classification, molecular evolution, and expression patterns) is still lacking in Solanum tuberosum (potato). In this study, a comprehensive analysis of the TPS gene family was conducted in potato. A total of eight putative potato TPS genes (StTPSs) were identified by searching the latest potato genome sequence. The amino acid identity among the eight StTPSs varied from 59.91 to 89.54%. Analysis of dN/dS ratios suggested that regions in the TPP (trehalose-6-phosphate phosphatase) domains evolved faster than the TPS domains. Although the sequences of the eight StTPSs showed high similarity (2571-2796 bp), their gene lengths are highly differentiated (3189-8406 bp). Many regulatory elements possibly related to phytohormones, abiotic stress and development were identified in different TPS genes. Based on the phylogenetic tree constructed using TPS genes of potato and four other Solanaceae plants, TPS genes could be categorized into 6 distinct groups. Analysis revealed that purifying selection most likely played a major role during the evolution of this family. Amino acid changes detected in specific branches of the phylogenetic tree suggest that relaxed constraints might have contributed to functional divergence among groups. Moreover, analysis of transcriptome data and qRT-PCR showed that StTPSs exhibit tissue- and treatment-specific expression patterns. This study provides a reference for genome-wide identification of the potato TPS gene family and sets a framework for further functional studies of this important gene family in development and stress response.
GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

PubMed

Zheng, Qi; Wang, Xiu-Jie

2008-07-01

Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although there have been a lot of software with GO-related analysis functions, new tools are still needed to meet the requirements for data generated by newly developed technologies or for advanced analysis purpose. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), therefore, provides better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis for data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) One unique feature of GOEAST is to allow cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
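A common way to score whether a GO term is overrepresented in a gene set is a one-sided hypergeometric test. The abstract does not state which test GOEAST applies, so the Python sketch below should be read as a generic illustration of the statistic rather than as the tool's implementation; the function and parameter names are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(pop_size, pop_hits, sample_size, sample_hits):
    """Probability of seeing at least `sample_hits` annotated genes by chance.

    pop_size    -- number of genes in the background (e.g. all genes on the array)
    pop_hits    -- background genes annotated with the GO term
    sample_size -- number of genes in the submitted gene set
    sample_hits -- genes in the submitted set annotated with the GO term
    """
    upper = min(pop_hits, sample_size)
    total = comb(pop_size, sample_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, sample_size - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total
```

Because one such p-value is computed for every GO term, the raw values need a multiple-testing adjustment (for example a false discovery rate procedure) before terms are reported as enriched.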
New Dimensions in Microbial Ecology-Functional Genes in Studies to Unravel the Biodiversity and Role of Functional Microbial Groups in the Environment.

PubMed

Imhoff, Johannes F

2016-05-24

During the past decades, tremendous advances have been made in the possibilities to study the diversity of microbial communities in the environment. The development of methods to study these communities on the basis of 16S rRNA gene sequences analysis was a first step into the molecular analysis of environmental communities and the study of biodiversity in natural habitats. A new dimension in this field was reached with the introduction of functional genes of ecological importance and the establishment of genetic tools to study the diversity of functional microbial groups and their responses to environmental factors. Functional gene approaches are excellent tools to study the diversity of a particular function and to demonstrate changes in the composition of prokaryote communities contributing to this function. The phylogeny of many functional genes largely correlates with that of the 16S rRNA gene, and microbial species may be identified on the basis of functional gene sequences. Functional genes are perfectly suited to link culture-based microbiological work with environmental molecular genetic studies. In this review, the development of functional gene studies in environmental microbiology is highlighted with examples of genes relevant for important ecophysiological functions. Examples are presented for bacterial photosynthesis and two types of anoxygenic phototrophic bacteria, with genes of the Fenna-Matthews-Olson-protein (fmoA) as target for the green sulfur bacteria and of two reaction center proteins (pufLM) for the phototrophic purple bacteria, with genes of adenosine-5'-phosphosulfate (APS) reductase (aprA), sulfate thioesterase (soxB) and dissimilatory sulfite reductase (dsrAB) for sulfur oxidizing and sulfate reducing bacteria, with genes of ammonia monooxygenase (amoA) for nitrifying/ammonia-oxidizing bacteria, with genes of particulate nitrate reductase and nitrite reductases (narH/G, nirS, nirK) for denitrifying bacteria and with genes of methane monooxygenase (pmoA) for methane oxidizing bacteria.
EvoCor: a platform for predicting functionally related genes using phylogenetic and expression profiles.

PubMed

Dittmar, W James; McIver, Lauren; Michalak, Pawel; Garner, Harold R; Valdez, Gregorio

2014-07-01

The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Statistical indicators of collective behavior and functional clusters in gene networks of yeast

NASA Astrophysics Data System (ADS)

Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

2006-03-01

We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
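The network construction step can be sketched in a few lines of Python: genes are linked when the correlation of their expression profiles passes a threshold, and the size of the largest connected (giant) component is tracked as the threshold changes. Using the absolute Pearson correlation and a union-find structure are assumptions of this sketch, not details taken from the paper, and profiles with zero variance are not handled.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def giant_component_size(profiles, threshold):
    """Size of the largest connected component of the thresholded network.

    `profiles` maps a gene name to its expression time series. Two genes are
    linked when |Pearson r| >= threshold (assumed linking rule).
    """
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the root, compressing the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= threshold:
                parent[find(g)] = find(h)

    sizes = {}
    for g in genes:
        root = find(g)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())
```

Scanning `threshold` from 1 downwards and recording the returned size gives a curve on which an abrupt growth of the giant component would mark a percolation-like transition of the kind the abstract refers to.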
Coexpression network based on natural variation in human gene expression reveals gene interactions and functions

PubMed Central

Nayak, Renuka R.; Kearns, Michael; Spielman, Richard S.; Cheung, Vivian G.

2009-01-01

Genes interact in networks to orchestrate cellular processes. Analysis of these networks provides insights into gene interactions and functions. Here, we took advantage of normal variation in human gene expression to infer gene networks, which we constructed using correlations in expression levels of more than 8.5 million gene pairs in immortalized B cells from three independent samples. The resulting networks allowed us to identify biological processes and gene functions. Among the biological pathways, we found processes such as translation and glycolysis that co-occur in the same subnetworks. We predicted the functions of poorly characterized genes, including CHCHD2 and TMEM111, and provided experimental evidence that TMEM111 is part of the endoplasmic reticulum-associated secretory pathway. We also found that IFIH1, a susceptibility gene of type 1 diabetes, interacts with YES1, which plays a role in glucose transport. Furthermore, genes that predispose to the same diseases are clustered nonrandomly in the coexpression network, suggesting that networks can provide candidate genes that influence disease susceptibility. Therefore, our analysis of gene coexpression networks offers information on the role of human genes in normal and disease processes. PMID:19797678
Differential expression of pancreatic protein and chemosensing receptor mRNAs in NKCC1-null intestine.

PubMed

Bradford, Emily M; Vairamani, Kanimozhi; Shull, Gary E

2016-02-15

To investigate the intestinal functions of the NKCC1 Na(+)-K(+)-2Cl cotransporter (SLC12a2 gene), differential mRNA expression changes in NKCC1-null intestine were analyzed. Microarray analysis of mRNA from intestines of adult wild-type mice and gene-targeted NKCC1-null mice (n = 6 of each genotype) was performed to identify patterns of differential gene expression changes. Differential expression patterns were further examined by Gene Ontology analysis using the online GOrilla program, and expression changes of selected genes were verified using northern blot analysis and quantitative real-time polymerase chain reaction. Histological staining and immunofluorescence were performed to identify cell types in which upregulated pancreatic digestive enzymes were expressed. Genes typically associated with pancreatic function were upregulated. These included lipase, amylase, elastase, and serine proteases indicative of pancreatic exocrine function, as well as insulin and regenerating islet genes, representative of endocrine function. Northern blot analysis and immunohistochemistry showed that differential expression of exocrine pancreas mRNAs was specific to the duodenum and localized to a subset of goblet cells. In addition, a major pattern of changes involving differential expression of olfactory receptors that function in chemical sensing, as well as other chemosensing G-protein coupled receptors, was observed. These changes in chemosensory receptor expression may be related to the failure of intestinal function and dependency on parenteral nutrition observed in humans with SLC12a2 mutations. The results suggest that loss of NKCC1 affects not only secretion, but also goblet cell function and chemosensing of intestinal contents via G-protein coupled chemosensory receptors.
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights

PubMed Central

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-01

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify gene sets relevant to autism that could not be found by Fisher. PMID:26750448
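For orientation, the baseline that the abstract contrasts LEGO with can be written down directly. The sketch below computes a one-sided Fisher's exact (hypergeometric upper-tail) p-value for a single gene set; it treats every gene as equivalent, which is exactly the simplification the authors criticize. It does not implement LEGO's network-based weights, and the function name and argument names are illustrative only.

```python
from math import comb


def ora_p_value(n_universe, n_set, n_hits, n_overlap):
    """One-sided Fisher's exact (hypergeometric upper-tail) p-value.

    n_universe: number of genes in the background
    n_set:      number of background genes annotated to the gene set
    n_hits:     number of genes in the query list
    n_overlap:  number of query genes that belong to the gene set
    """
    total = comb(n_universe, n_hits)
    upper = min(n_set, n_hits)
    # Probability of seeing n_overlap or more set members in the query list
    # when the query list is drawn at random from the background.
    tail = sum(comb(n_set, k) * comb(n_universe - n_set, n_hits - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Toy numbers: 20 background genes, a 5-gene set, 4 query genes, 3 in the set.
p = ora_p_value(20, 5, 4, 3)
```

In practice the p-values for many gene sets would also need a multiple-testing correction, which is omitted here.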
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

PubMed

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-11

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify gene sets relevant to autism that could not be found by Fisher.
A Prototype System for Retrieval of Gene Functional Information

PubMed Central

Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

2003-01-01

Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346
Analysis of multiplex gene expression maps obtained by voxelation.

PubMed

An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

2009-04-29

Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined using wavelet features extracted from gene expression maps averaged over the left and right hemispheres, and the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Within each group of similar genes and each cluster, gene function similarity was measured by calculating the average gene function distance in the gene ontology structure. By applying our methodology to find genes similar to certain target genes, we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters that have both very similar gene expression maps and very similar gene functions with respect to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
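The query-by-similarity step can be illustrated with a small sketch. The abstract does not say which wavelet or which coefficients were used, so the choice below (one-level 2D Haar approximation coefficients as the feature vector) is an assumption, and the gene names and maps are invented toy data.

```python
from math import sqrt


def haar_approximation(grid):
    """One-level 2D Haar approximation (LL) coefficients of an even-sized grid."""
    rows, cols = len(grid), len(grid[0])
    features = []
    for i in range(0, rows, 2):
        for j in range(0, cols, 2):
            block_sum = (grid[i][j] + grid[i][j + 1]
                         + grid[i + 1][j] + grid[i + 1][j + 1])
            features.append(block_sum / 2.0)  # orthonormal Haar scaling
    return features


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(query, maps, top=2):
    """Rank the other genes by feature distance to the query gene."""
    q = haar_approximation(maps[query])
    scored = [(euclidean(q, haar_approximation(m)), name)
              for name, m in maps.items() if name != query]
    return [name for _, name in sorted(scored)[:top]]


# Hypothetical 2 x 4 expression maps (rows and columns of voxels).
maps = {
    "geneA": [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]],
    "geneB": [[1.0, 0.8, 0.0, 0.2], [1.0, 1.0, 0.0, 0.0]],
    "geneC": [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]],
}
ranking = most_similar("geneA", maps)
```

Keeping only the approximation coefficients reduces the dimension of each map; a full orthonormal wavelet transform would preserve all pairwise Euclidean distances and so would rank genes exactly as the raw maps do.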

Estimating gene function with least squares nonnegative matrix factorization.

PubMed

Wang, Guoli; Ochs, Michael F

2007-01-01

Nonnegative matrix factorization is a machine learning algorithm that has extracted information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation of the method when linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties on the mRNA levels for each gene in each condition to guide the algorithm to a local minimum in normalized chi-squared, rather than in a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.
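To make the idea of an uncertainty-weighted objective concrete, here is a minimal sketch. It assumes the objective is the sum over genes and conditions of ((D - AP) / sigma)^2 and uses the standard weighted multiplicative update rules for that objective; whether this matches the authors' exact algorithm is not confirmed by the abstract. The matrices `data` and `sigma` are invented toy values.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def elementwise(f, x, y):
    """Apply f entry by entry to two matrices of the same shape."""
    return [[f(u, v) for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def chi2(data, sigma, a, p):
    """Uncertainty-weighted squared error between data and the model A @ P."""
    model = matmul(a, p)
    return sum(((d - m) / s) ** 2
               for rd, rm, rs in zip(data, model, sigma)
               for d, m, s in zip(rd, rm, rs))


def multiplicative_step(x, num, den, eps):
    """x <- x * num / den, entry by entry; keeps nonnegative entries nonnegative."""
    return elementwise(lambda xv, r: xv * r, x,
                       elementwise(lambda u, v: u / (v + eps), num, den))


def weighted_nmf(data, sigma, rank, iters=300, seed=0, eps=1e-12):
    """Weighted multiplicative updates with weights w = 1 / sigma^2."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    a = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    p = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    w = [[1.0 / s ** 2 for s in row] for row in sigma]
    wd = elementwise(lambda u, v: u * v, w, data)
    history = [chi2(data, sigma, a, p)]
    for _ in range(iters):
        wm = elementwise(lambda u, v: u * v, w, matmul(a, p))
        a = multiplicative_step(a, matmul(wd, transpose(p)),
                                matmul(wm, transpose(p)), eps)
        wm = elementwise(lambda u, v: u * v, w, matmul(a, p))
        p = multiplicative_step(p, matmul(transpose(a), wd),
                                matmul(transpose(a), wm), eps)
        history.append(chi2(data, sigma, a, p))
    return a, p, history


# Toy data: 4 genes x 3 conditions with assumed per-entry uncertainties.
data = [[5.0, 1.0, 4.0], [10.0, 2.0, 8.0], [1.0, 6.0, 2.0], [2.0, 12.0, 4.0]]
sigma = [[0.1 * d + 0.1 for d in row] for row in data]
a, p, history = weighted_nmf(data, sigma, rank=2)
```

Entries with large `sigma` contribute little to the objective, so noisy measurements pull less on the recovered patterns than they would under an unweighted Euclidean fit.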
A chronological expression profile of gene activity during embryonic mouse brain development.

PubMed

Goggolidou, P; Soneji, S; Powles-Glover, N; Williams, D; Sethi, S; Baban, D; Simon, M M; Ragoussis, I; Norris, D P

2013-12-01

The brain is a functionally complex organ, the patterning and development of which are key to adult health. To help elucidate the genetic networks underlying mammalian brain patterning, we conducted detailed transcriptional profiling during embryonic development of the mouse brain. A total of 2,400 genes were identified as showing differential expression between three developmental stages. Analysis of the data identified nine gene clusters to demonstrate analogous expression profiles. A significant group of novel genes of as yet undiscovered biological function were detected as being potentially relevant to brain development and function, in addition to genes that have previously identified roles in the brain. Furthermore, analysis for genes that display asymmetric expression between the left and right brain hemispheres during development revealed 35 genes as putatively asymmetric from a combined data set. Our data constitute a valuable new resource for neuroscience and neurodevelopment, exposing possible functional associations between genes, including novel loci, and encouraging their further investigation in human neurological and behavioural disorders.
A comparative analysis of the avirulence and translational transactivator functions of gene VI of Cauliflower mosaic virus.

PubMed

Palanichelvam, Karuppaiah; Schoelz, James E

2002-02-15

The primary function associated at present with the gene VI product of Cauliflower mosaic virus (CaMV) is that of a translational transactivator (TAV). In this capacity, it alters the host translational machinery to allow reinitiation of translation of other CaMV genes on the polycistronic 35S RNA of CaMV. In addition, the gene VI protein can elicit a specific type of plant defense response called the hypersensitive response (HR) in Nicotiana edwardsonii. In this study, we have adapted the agroinfiltration technique to compare the sequences of CaMV gene VI required for TAV function and elicitation of HR. To measure the activity of the TAV, we coagroinfiltrated gene VI of CaMV strain W260 with a bicistronic GUS reporter plasmid. TAV function could be assayed 4 days postinfiltration, before the onset of HR in N. edwardsonii. Through the use of the TAV and HR assays, we could show that the TAV functions of gene VI of CaMV strains W260 and D4 were equivalent, but only W260 gene VI elicited HR. A mutational analysis of W260 gene VI showed that the structural requirements for elicitation of HR were much more stringent than those for TAV function. Small deletions from either the 5' or 3' end of W260 gene VI abolished its ability to elicit HR, although the TAV function was retained in the mutant. The TAV function could also tolerate a small insertion within gene VI; this insertion abolished the elicitor function. This study provides direct evidence that the TAV function of gene VI is separate from its role as an elicitor of HR.
A Systematic Investigation into Aging Related Genes in Brain and Their Relationship with Alzheimer's Disease.

PubMed

Meng, Guofeng; Zhong, Xiaoyan; Mei, Hongkang

2016-01-01

Aging, as a complex biological process, is accompanied by the accumulation of functional losses at different levels, which makes age the biggest risk factor for many neurological diseases. Even after decades of investigation, the process of aging is still far from being fully understood, especially at a systematic level. In this study, we identified aging-related genes in the brain by collecting those with sustained and consistent gene expression or DNA methylation changes during the aging process. Functional analysis of these genes with Gene Ontology suggested that transcriptional regulators are the most affected genes in the aging process. Transcription regulation analysis found that some transcription factors, especially Specificity Protein 1 (SP1), play important roles in regulating aging-related gene expression. Module-based functional analysis indicated that these genes are associated with many well-known aging-related pathways, supporting the validity of our approach to selecting aging-related genes. Finally, we investigated the roles of aging-related genes in Alzheimer's Disease (AD). We found that aging-related and AD-related genes share some common pathways, which provides a possible explanation for why aging makes the brain more vulnerable to Alzheimer's Disease.
fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

PubMed

Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

2015-01-07

Network studies of genes and proteins offer a functional view of the complexity of a gene or protein and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is highly expressed in adipose tissue, and its product is one of the most abundant proteins in mature adipocytes. Our investigations of the functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that a set of eight candidate genes (acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim) is strongly and functionally linked with fabp4. Gene ontological analysis of the network modules of fabp4 provides an explicit picture of the functional aspects of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene-specific processes and functions as well as their compartmentalization in tissues. fabp4 and its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4, and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V, respectively). On the whole, our gene network analysis presents a clear insight into the interactions and functions associated with the fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
[Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

PubMed

Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

2012-07-01

To screen important genes from large gene microarray datasets after nerve injury, we combined the gene ontology (GO) method with computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of the screened genes. Data mining and gene ontology analysis of the gene chip dataset GSE26350 were carried out with MATLAB software. Cd44 was selected from the screened key gene molecular spectrum by comparing the genes' different GO terms and their positions on the principal component score map. Functional interference was used to disrupt the normal binding of Cd44 to one of its ligands, chondroitin sulfate C (CSC), and neurite extension was observed. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity, and protein binding. Cd44 is one of six effector protein genes, and attracted our attention because of its functional diversity. After different reagents were added to the medium to interfere with the normal binding of CSC and Cd44, varying degrees of relief of CSC's inhibition of neurite extension were observed. CSC can inhibit neurite extension by binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

PubMed

Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

2012-08-08

Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
Gene function in early mouse embryonic stem cell differentiation

PubMed Central

Sene, Kagnew Hailesellasse; Porter, Christopher J; Palidwor, Gareth; Perez-Iratxeta, Carolina; Muro, Enrique M; Campbell, Pearl A; Rudnicki, Michael A; Andrade-Navarro, Miguel A

2007-01-01

Background Little is known about the genes that drive embryonic stem cell differentiation. However, such knowledge is necessary if we are to exploit the therapeutic potential of stem cells. To uncover the genetic determinants of mouse embryonic stem cell (mESC) differentiation, we have generated and analyzed 11-point time-series of DNA microarray data for three biologically equivalent but genetically distinct mESC lines (R1, J1, and V6.5) undergoing undirected differentiation into embryoid bodies (EBs) over a period of two weeks. Results We identified the initial 12 hour period as reflecting the early stages of mESC differentiation and studied probe sets showing consistent changes of gene expression in that period. Gene function analysis indicated significant up-regulation of genes related to regulation of transcription and mRNA splicing, and down-regulation of genes related to intracellular signaling. Phylogenetic analysis indicated that the genes showing the largest expression changes were more likely to have originated in metazoans. The probe sets with the most consistent gene changes in the three cell lines represented 24 down-regulated and 12 up-regulated genes, all with closely related human homologues. Whereas some of these genes are known to be involved in embryonic developmental processes (e.g. Klf4, Otx2, Smn1, Socs3, Tagln, Tdgf1), our analysis points to others (such as transcription factor Phf21a, extracellular matrix related Lama1 and Cyr61, or endoplasmic reticulum related Sc4mol and Scd2) that have not been previously related to mESC function. The majority of identified functions were related to transcriptional regulation, intracellular signaling, and cytoskeleton. Genes involved in other cellular functions important in ESC differentiation such as chromatin remodeling and transmembrane receptors were not observed in this set. 
Conclusion Our analysis profiles for the first time gene expression at a very early stage of mESC differentiation, and identifies a functional and phylogenetic signature for the genes involved. The data generated constitute a valuable resource for further studies. All DNA microarray data used in this study are available in the StemBase database of stem cell gene expression data [1] and in the NCBI's GEO database. PMID:17394647
Discovering transnosological molecular basis of human brain diseases using biclustering analysis of integrated gene expression data.

PubMed

Cha, Kihoon; Hwang, Taeho; Oh, Kimin; Yi, Gwan-Su

2015-01-01

It has been reported that several brain diseases can be treated in a transnosological manner, implying a possible common molecular basis underlying those diseases. However, molecular-level commonality among those brain diseases has been largely unexplored. Gene expression analyses of the human brain have been used to find genes associated with brain diseases, but most of those studies were restricted either to an individual disease or to a couple of diseases. In addition, attempts to identify significant genes in such brain diseases mostly failed when they used typical methods that depend on differentially expressed genes. In this study, we used a correlation-based biclustering approach to find coexpressed gene sets in five neurodegenerative diseases and three psychiatric disorders. By using biclustering analysis, we could efficiently and fairly identify various gene sets expressed specifically in both single and multiple brain diseases. We found 4,307 gene sets correlatively expressed in multiple brain diseases and 3,409 gene sets exclusively specified in individual brain diseases. The function enrichment analysis of those gene sets showed many new possible functional bases as well as neurological processes that are common or specific for those eight diseases. This study introduces possible common molecular bases for several brain diseases, which opens the opportunity to clarify the transnosological perspective assumed in brain diseases. It also shows the advantages of correlation-based biclustering analysis and accompanying function enrichment analysis for gene expression data in this type of investigation.
Discovering transnosological molecular basis of human brain diseases using biclustering analysis of integrated gene expression data

PubMed Central

2015-01-01

Background It has been reported that several brain diseases can be treated in a transnosological manner, implying a possible common molecular basis underlying those diseases. However, molecular-level commonality among those brain diseases has been largely unexplored. Gene expression analyses of the human brain have been used to find genes associated with brain diseases, but most of those studies were restricted either to an individual disease or to a couple of diseases. In addition, attempts to identify significant genes in such brain diseases mostly failed when they used typical methods that depend on differentially expressed genes. Results In this study, we used a correlation-based biclustering approach to find coexpressed gene sets in five neurodegenerative diseases and three psychiatric disorders. By using biclustering analysis, we could efficiently and fairly identify various gene sets expressed specifically in both single and multiple brain diseases. We found 4,307 gene sets correlatively expressed in multiple brain diseases and 3,409 gene sets exclusively specified in individual brain diseases. The function enrichment analysis of those gene sets showed many new possible functional bases as well as neurological processes that are common or specific for those eight diseases. Conclusions This study introduces possible common molecular bases for several brain diseases, which opens the opportunity to clarify the transnosological perspective assumed in brain diseases. It also shows the advantages of correlation-based biclustering analysis and accompanying function enrichment analysis for gene expression data in this type of investigation. PMID:26043779
Construction and Analysis of Functional Networks in the Gut Microbiome of Type 2 Diabetes Patients.

PubMed

Li, Lianshuo; Wang, Zicheng; He, Peng; Ma, Shining; Du, Jie; Jiang, Rui

2016-10-01

Although networks of microbial species have been widely used in the analysis of 16S rRNA sequencing data of a microbiome, the construction and analysis of a complete microbial gene network are in general problematic because of the large number of microbial genes in metagenomics studies. To overcome this limitation, we propose to map microbial genes to functional units, including KEGG orthologous groups and the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) orthologous groups, to enable the construction and analysis of a microbial functional network. We devised two statistical methods to infer pairwise relationships between microbial functional units based on a deep sequencing dataset of gut microbiome from type 2 diabetes (T2D) patients as well as healthy controls. Networks containing such functional units and their significant interactions were constructed subsequently. We conducted a variety of analyses of global properties, local properties, and functional modules in the resulting functional networks. Our data indicate that besides the observations consistent with the current knowledge, this study provides novel biological insights into the gut microbiome associated with T2D. Copyright © 2016. Production and hosting by Elsevier Ltd.
Interconnection of Key Microbial Functional Genes for Enhanced Benzo[a]pyrene Biodegradation in Sediments by Microbial Electrochemistry.

PubMed

Yan, Zaisheng; He, Yuhong; Cai, Haiyuan; Van Nostrand, Joy D; He, Zhili; Zhou, Jizhong; Krumholz, Lee R; Jiang, He-Long

2017-08-01

Sediment microbial fuel cells (SMFCs) can stimulate the degradation of polycyclic aromatic hydrocarbons in sediments, but the mechanism of this process is poorly understood at the microbial functional gene level. Here, the use of SMFC resulted in 92% benzo[a]pyrene (BaP) removal over 970 days relative to 54% in the controls. Sediment functions, microbial community structure, and network interactions were dramatically altered by the use of the SMFC. Functional gene analysis showed that c-type cytochrome genes for electron transfer, aromatic degradation genes, and extracellular ligninolytic enzymes involved in lignin degradation were significantly enriched in bulk sediments during SMFC operation. Correspondingly, chemical analysis of the system showed that these genetic changes resulted in increases in the levels of easily oxidizable organic carbon and humic acids, which may have resulted in increased BaP bioavailability and increased degradation rates. Tracking microbial functional genes and corresponding organic matter responses should aid mechanistic understanding of enhanced BaP biodegradation by microbial electrochemistry and the development of sustainable bioremediation strategies.
Evidence of Dynamically Dysregulated Gene Expression Pathways in Hyperresponsive B Cells from African American Lupus Patients

PubMed Central

Dozmorov, Igor; Dominguez, Nicolas; Sestak, Andrea L.; Robertson, Julie M.; Harley, John B.; James, Judith A.; Guthridge, Joel M.

2013-01-01

Recent application of gene expression profiling to the immune system has shown great potential for the characterization of complex regulatory processes. It is becoming increasingly important to characterize functional systems through multigene interactions to provide valuable insights into differences between healthy controls and autoimmune patients. Here we apply an original systematic approach to the analysis of changes in regulatory gene interconnections between Epstein-Barr virus-transformed hyperresponsive B cells from SLE patients and normal control B cells. Both traditional analysis of differential gene expression and analysis of the dynamics of gene expression variations were performed in combination to establish model networks of functional gene expression. This Pathway Dysregulation Analysis identified known transcription factors and transcriptional regulators activated uniquely in stimulated B cells from SLE patients. PMID:23977035
Genome-wide identification, functional and evolutionary analysis of terpene synthases in pineapple.

PubMed

Chen, Xiaoe; Yang, Wei; Zhang, Liqin; Wu, Xianmiao; Cheng, Tian; Li, Guanglin

2017-10-01

Terpene synthases (TPSs) are vital for the biosynthesis of active terpenoids, which have important physiological, ecological and medicinal value. Although terpenoids have been reported in pineapple (Ananas comosus), genome-wide investigations of the TPS genes responsible for pineapple terpenoid synthesis are still lacking. By integrating pineapple genome and proteome data, twenty-one putative terpene synthase genes were found in pineapple and divided into five subfamilies. Tandem duplication is the cause of TPS gene family duplication. Furthermore, functional differentiation between each TPS subfamily may have occurred for several reasons. Sixty-two key amino acid sites were identified as sites of type-II functional divergence between the TPS-a and TPS-c subfamilies. Finally, coevolution analysis indicated that multiple amino acid residues are involved in coevolutionary processes. In addition, the enzyme activity of two TPSs was tested. This genome-wide identification, functional and evolutionary analysis of pineapple TPS genes provides new insight into the roles of the TPS family and lays the basis for further characterizing the function and evolution of the TPS gene family. Copyright © 2017 Elsevier Ltd. All rights reserved.
Microbial Communities and Functional Genes Associated with Soil Arsenic Contamination and the Rhizosphere of the Arsenic-Hyperaccumulating Plant Pteris vittata L.

PubMed Central

Xiong, Jinbo; Wu, Liyou; Tu, Shuxin; Van Nostrand, Joy D.; He, Zhili; Zhou, Jizhong; Wang, Gejiao

2010-01-01

To understand how microbial communities and functional genes respond to arsenic contamination in the rhizosphere of Pteris vittata, five soil samples with different arsenic contamination levels were collected from the rhizosphere of P. vittata and nonrhizosphere areas and investigated by Biolog, geochemical, and functional gene microarray (GeoChip 3.0) analyses. Biolog analysis revealed that the uncontaminated soil harbored the greatest diversity of sole-carbon utilization abilities and that arsenic contamination decreased the metabolic diversity, while rhizosphere soils had higher metabolic diversities than did the nonrhizosphere soils. GeoChip 3.0 analysis showed low proportions of overlapping genes across the five soil samples (16.52% to 45.75%). The uncontaminated soil had a higher heterogeneity and more unique genes (48.09%) than did the arsenic-contaminated soils. Arsenic resistance, sulfur reduction, phosphorus utilization, and denitrification genes were remarkably distinct between P. vittata rhizosphere and nonrhizosphere soils, which provides evidence for a strong linkage among the level of arsenic contamination, the rhizosphere, and the functional gene distribution. Canonical correspondence analysis (CCA) revealed that arsenic is the main driver in reducing the soil functional gene diversity; however, organic matter and phosphorus also have significant effects on the soil microbial community structure. The results implied that rhizobacteria play an important role during soil arsenic uptake and hyperaccumulation processes of P. vittata. PMID:20833780
Integrated analysis of microRNA and gene expression profiles reveals a functional regulatory module associated with liver fibrosis.

PubMed

Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong

2017-12-15

Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Increasing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, the details of miRNA-gene regulation in this disease remain unclear. Publicly available Gene Expression Omnibus (GEO) datasets from patients with cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target genes of the DEMs were predicted using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was used to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the STRING database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in LX-2 cells stimulated with recombinant human TGFβ1 and treated with specific miRNA inhibitors. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared with normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two large sub-networks defined as the down-network and the up-network, respectively.
KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, among which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were significant (adjusted p<0.001). Genes enriched in these pathways, together with their regulatory miRNAs, formed a functional miRNA-gene regulatory module containing 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the STRING database revealed that 8 of the 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which may provide important clues to the underlying pathogenesis of liver fibrosis. Copyright © 2017. Published by Elsevier B.V.
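The two filters this abstract describes for building the regulatory network (targets predicted by every algorithm, then miRNA-gene pairs with opposite expression changes) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the tool labels, and the use of log fold-change signs to define an "inverse" relationship are assumptions.

```python
def consensus_targets(predictions):
    """Return the (miRNA, gene) pairs predicted by every tool.

    `predictions` maps a tool name to an iterable of (miRNA, gene) pairs.
    """
    pair_sets = [set(pairs) for pairs in predictions.values()]
    if not pair_sets:
        return set()
    return set.intersection(*pair_sets)


def inverse_expression_pairs(pairs, mirna_logfc, gene_logfc):
    """Keep pairs whose miRNA and gene change in opposite directions.

    Assumption: "inverse" means the two log fold changes have opposite signs.
    Pairs with a missing measurement are dropped.
    """
    kept = set()
    for mirna, gene in pairs:
        m = mirna_logfc.get(mirna)
        g = gene_logfc.get(gene)
        if m is None or g is None:
            continue
        if m * g < 0:
            kept.add((mirna, gene))
    return kept


# Hypothetical predictions from two tools (names and pairs are placeholders).
toy_predictions = {
    "tool_a": {("miR-x", "G1"), ("miR-x", "G2"), ("miR-y", "G3")},
    "tool_b": {("miR-x", "G1"), ("miR-x", "G2")},
}
toy_network = inverse_expression_pairs(
    consensus_targets(toy_predictions),
    mirna_logfc={"miR-x": -1.0},
    gene_logfc={"G1": 2.0, "G2": -0.5},
)
```

Requiring agreement among all tools trades sensitivity for specificity, which suits a network that is later validated experimentally.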
Analysis of the Prefoldin Gene Family in 14 Plant Species

PubMed Central

Cao, Jun

2016-01-01

Prefoldin is a hexameric molecular chaperone complex present in all eukaryotes and archaea. The evolution of this gene family in plants is unknown. Here, I identified 140 prefoldin genes in 14 plant species. These prefoldin proteins were divided into nine groups through phylogenetic analysis. Highly conserved gene organization and motif distribution exist in each prefoldin group, implying their functional conservation. I also observed segmental duplication in the maize prefoldin gene family. Moreover, a few functional divergence sites were identified within each group pair. Functional network analyses identified 78 co-expressed genes, most of which were involved in carrying, binding and kinase activity. Divergent expression profiles of the maize prefoldin genes were further investigated in different tissues and developmental periods and under auxin and several abiotic stresses. I also found a few cis-elements responsive to abiotic stress and phytohormones in the upstream sequences of the maize prefoldin genes. The results provide a foundation for characterizing the prefoldin genes in plants and will offer insights for additional functional studies. PMID:27014333
Genome-wide analysis of the Hsp70 family genes in pepper (Capsicum annuum L.) and functional identification of CaHsp70-2 involvement in heat stress.

PubMed

Guo, Meng; Liu, Jin-Hong; Ma, Xiao; Zhai, Yu-Fei; Gong, Zhen-Hui; Lu, Ming-Hui

2016-11-01

Hsp70s function as molecular chaperones and are encoded by a multi-gene family whose members play a crucial role in plant response to stress conditions, and in plant growth and development. Pepper (Capsicum annuum L.) is an important vegetable crop whose genome has been sequenced. Nonetheless, no overall analysis of the Hsp70 gene family has been reported in this crop plant to date. To assess the functionality of Capsicum annuum Hsp70 (CaHsp70) genes, the pepper genome database was analyzed in this research. A total of 21 CaHsp70 genes were identified and their characteristics were described. Promoter and transcript expression analysis revealed that CaHsp70s are involved in pepper growth and development and in the heat stress response. Ectopic expression of a cytosolic gene, CaHsp70-2, regulated expression of stress-related genes and conferred increased thermotolerance in transgenic Arabidopsis. Taken together, our results provide the basis for further studies to dissect CaHsp70s' function in response to heat stress as well as other environmental stresses. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Genes Responsive to Low-Intensity Pulsed Ultrasound in MC3T3-E1 Preosteoblast Cells

PubMed Central

Tabuchi, Yoshiaki; Sugahara, Yuuki; Ikegame, Mika; Suzuki, Nobuo; Kitamura, Kei-ichiro; Kondo, Takashi

2013-01-01

Although low-intensity pulsed ultrasound (LIPUS) has been shown to enhance bone fracture healing, the underlying mechanism of LIPUS remains to be fully elucidated. Here, to better understand the molecular mechanism underlying cellular responses to LIPUS, we investigated gene expression profiles in mouse MC3T3-E1 preosteoblast cells exposed to LIPUS using high-density oligonucleotide microarrays and computational gene expression analysis tools. Although treatment of the cells with a single 20-min LIPUS (1.5 MHz, 30 mW/cm2) did not affect the cell growth or alkaline phosphatase activity, the treatment significantly increased the mRNA level of Bglap. Microarray analysis demonstrated that 38 genes were upregulated and 37 genes were downregulated by 1.5-fold or more in the cells at 24-h post-treatment. Ingenuity pathway analysis demonstrated that the gene network U (up) contained many upregulated genes that were mainly associated with bone morphology in the category of biological functions of skeletal and muscular system development and function. Moreover, the biological function of the gene network D (down), which contained downregulated genes, was associated with gene expression, the cell cycle and connective tissue development and function. These results should help to further clarify the molecular basis of the mechanisms of the LIPUS response in osteoblast cells. PMID:24252911
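The 1.5-fold cutoff mentioned in this abstract can be written out as a small filter. The sketch below is illustrative only: the function name, the use of a simple treated/control ratio on untransformed intensities, and the toy values are assumptions, and the authors' microarray software may have normalized and filtered the data differently.

```python
def classify_by_fold_change(treated, control, cutoff=1.5):
    """Split genes into up- and downregulated lists by fold change.

    A gene is "up" if treated/control >= cutoff and "down" if
    treated/control <= 1/cutoff. Genes with missing or non-positive
    intensities are skipped.
    """
    up, down = [], []
    for gene, t in treated.items():
        c = control.get(gene)
        if c is None or c <= 0 or t <= 0:
            continue
        ratio = t / c
        if ratio >= cutoff:
            up.append(gene)
        elif ratio <= 1.0 / cutoff:
            down.append(gene)
    return sorted(up), sorted(down)


# Hypothetical signal intensities (gene names are placeholders).
toy_treated = {"g1": 300.0, "g2": 100.0, "g3": 60.0}
toy_control = {"g1": 100.0, "g2": 100.0, "g3": 100.0}
toy_up, toy_down = classify_by_fold_change(toy_treated, toy_control)
```

Using the reciprocal of the cutoff for downregulation keeps the two directions symmetric on a log scale.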
Chemical-genetic profile analysis in yeast suggests that a previously uncharacterized open reading frame, YBR261C, affects protein synthesis

PubMed Central

Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan

2008-01-01

Background: Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, ~4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. Results: As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, the gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. Conclusion: We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s). PMID:19055778

Chemical-genetic profile analysis in yeast suggests that a previously uncharacterized open reading frame, YBR261C, affects protein synthesis.

PubMed

Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan

2008-12-03

Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, approximately 4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin, are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s).
Differential expression of pancreatic protein and chemosensing receptor mRNAs in NKCC1-null intestine

PubMed Central

Bradford, Emily M; Vairamani, Kanimozhi; Shull, Gary E

2016-01-01

AIM: To investigate the intestinal functions of the NKCC1 Na+-K+-2Cl− cotransporter (SLC12a2 gene), differential mRNA expression changes in NKCC1-null intestine were analyzed. METHODS: Microarray analysis of mRNA from intestines of adult wild-type mice and gene-targeted NKCC1-null mice (n = 6 of each genotype) was performed to identify patterns of differential gene expression changes. Differential expression patterns were further examined by Gene Ontology analysis using the online GOrilla program, and expression changes of selected genes were verified using northern blot analysis and quantitative real-time polymerase chain reaction. Histological staining and immunofluorescence were performed to identify cell types in which upregulated pancreatic digestive enzymes were expressed. RESULTS: Genes typically associated with pancreatic function were upregulated. These included lipase, amylase, elastase, and serine proteases indicative of pancreatic exocrine function, as well as insulin and regenerating islet genes, representative of endocrine function. Northern blot analysis and immunohistochemistry showed that differential expression of exocrine pancreas mRNAs was specific to the duodenum and localized to a subset of goblet cells. In addition, a major pattern of changes involving differential expression of olfactory receptors that function in chemical sensing, as well as other chemosensing G-protein coupled receptors, was observed. These changes in chemosensory receptor expression may be related to the failure of intestinal function and dependency on parenteral nutrition observed in humans with SLC12a2 mutations. CONCLUSION: The results suggest that loss of NKCC1 affects not only secretion, but also goblet cell function and chemosensing of intestinal contents via G-protein coupled chemosensory receptors. PMID:26909237
Functional genomics analysis of low concentration of ethanol in human hepatocellular carcinoma (HepG2) cells. Role of genes involved in transcriptional and translational processes.

PubMed

Castaneda, Francisco; Rosin-Steiner, Sigrid; Jung, Klaus

2006-12-21

We previously found that ethanol at millimolar level (1 mM) activates the expression of transcription factors with subsequent regulation of apoptotic genes in human hepatocellular carcinoma (HCC) HepG2 cells. However, the role of ethanol on the expression of genes implicated in transcriptional and translational processes remains unknown. Therefore, the aim of this study was to characterize the effect of low concentration of ethanol on gene expression profiling in HepG2 cells using cDNA microarrays with especial interest in genes with transcriptional and translational function. The gene expression pattern observed in the ethanol-treated HepG2 cells revealed a relatively similar pattern to that found in the untreated control cells. The pairwise comparison analysis demonstrated four significantly up-regulated (COBRA1, ITGB4, STAU2, and HMGN3) genes and one down-regulated (ANK3) gene. All these genes exert their function on transcriptional and translational processes and until now none of these genes have been associated with ethanol. This functional genomic analysis demonstrates the reported interaction between ethanol and ethanol-regulated genes. Moreover, it confirms the relationship between ethanol-regulated genes and various signaling pathways associated with ethanol-induced apoptosis. The data presented in this study represents an important contribution toward the understanding of the molecular mechanisms of ethanol at low concentration in HepG2 cells, a HCC-derived cell line.
Functional genomics analysis of low concentration of ethanol in human hepatocellular carcinoma (HepG2) cells. Role of genes involved in transcriptional and translational processes

PubMed Central

Castaneda, Francisco; Rosin-Steiner, Sigrid; Jung, Klaus

2007-01-01

We previously found that ethanol at millimolar level (1 mM) activates the expression of transcription factors with subsequent regulation of apoptotic genes in human hepatocellular carcinoma (HCC) HepG2 cells. However, the role of ethanol on the expression of genes implicated in transcriptional and translational processes remains unknown. Therefore, the aim of this study was to characterize the effect of low concentration of ethanol on gene expression profiling in HepG2 cells using cDNA microarrays with especial interest in genes with transcriptional and translational function. The gene expression pattern observed in the ethanol-treated HepG2 cells revealed a relatively similar pattern to that found in the untreated control cells. The pairwise comparison analysis demonstrated four significantly up-regulated (COBRA1, ITGB4, STAU2, and HMGN3) genes and one down-regulated (ANK3) gene. All these genes exert their function on transcriptional and translational processes and until now none of these genes have been associated with ethanol. This functional genomic analysis demonstrates the reported interaction between ethanol and ethanol-regulated genes. Moreover, it confirms the relationship between ethanol-regulated genes and various signaling pathways associated with ethanol-induced apoptosis. The data presented in this study represents an important contribution toward the understanding of the molecular mechanisms of ethanol at low concentration in HepG2 cells, a HCC-derived cell line. PMID:17211498
Extensive complementarity between gene function prediction methods.

PubMed

Vidulin, Vedrana; Šmuc, Tomislav; Supek, Fran

2016-12-01

The number of sequenced genomes rises steadily but we still lack knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions. Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them. The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/. Contact: fran.supek@irb.hr. Supplementary information: Supplementary materials are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
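The integration strategy named in this abstract, keeping the single most-confident prediction per gene/function instead of requiring agreement between methods, can be sketched as below. The record layout, method labels and confidence scale are assumptions for illustration and do not reproduce the published pipeline.

```python
def integrate_most_confident(predictions):
    """Keep the highest-confidence prediction for each (gene, GO term).

    `predictions` is an iterable of (method, gene, go_term, confidence)
    tuples. Returns a dict mapping (gene, go_term) to (method, confidence).
    Ties keep the prediction seen first.
    """
    best = {}
    for method, gene, go_term, confidence in predictions:
        key = (gene, go_term)
        if key not in best or confidence > best[key][1]:
            best[key] = (method, confidence)
    return best


# Hypothetical predictions; method names and scores are placeholders.
toy_predictions = [
    ("method_a", "geneA", "GO:0006412", 0.60),
    ("method_b", "geneA", "GO:0006412", 0.85),
    ("method_a", "geneB", "GO:0006412", 0.40),
]
toy_integrated = integrate_most_confident(toy_predictions)
```

Because the methods tend to cover non-overlapping genes, a maximum-confidence rule keeps predictions that a consensus rule would discard.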
Protoplast isolation, transient transformation of leaf mesophyll protoplasts and improved Agrobacterium-mediated leaf disc infiltration of Phaseolus vulgaris: tools for rapid gene expression analysis.

PubMed

Nanjareddy, Kalpana; Arthikala, Manoj-Kumar; Blanco, Lourdes; Arellano, Elizabeth S; Lara, Miguel

2016-06-24

Phaseolus vulgaris is one of the most extensively studied model legumes in the world. The P. vulgaris genome sequence is available; therefore, the need for an efficient and rapid transformation system is more imperative than ever. The functional characterization of P. vulgaris genes is impeded chiefly due to the non-amenable nature of Phaseolus sp. to stable genetic transformation. Transient transformation systems are convenient and versatile alternatives for rapid gene functional characterization studies. Hence, the present work focuses on standardizing methodologies for protoplast isolation from multiple tissues and transient transformation protocols for rapid gene expression analysis in the recalcitrant grain legume P. vulgaris. Herein, we provide methodologies for the high-throughput isolation of leaf mesophyll-, flower petal-, hypocotyl-, root- and nodule-derived protoplasts from P. vulgaris. The highly efficient polyethylene glycol-mannitol magnesium (PEG-MMG)-mediated transformation of leaf mesophyll protoplasts was optimized using a GUS reporter gene. We used the P. vulgaris SNF1-related protein kinase 1 (PvSnRK1) gene as proof of concept to demonstrate rapid gene functional analysis. An RT-qPCR analysis of protoplasts that had been transformed with PvSnRK1-RNAi and PvSnRK1-OE vectors showed the significant downregulation and ectopic constitutive expression (overexpression), respectively, of the PvSnRK1 transcript. We also demonstrated an improved transient transformation approach, sonication-assisted Agrobacterium-mediated transformation (SAAT), for the leaf disc infiltration of P. vulgaris. Interestingly, this method resulted in a 90% transformation efficiency and transformed 60-85% of the cells in a given area of the leaf surface. The constitutive expression of YFP further confirmed the amenability of the system to gene functional characterization studies. We present simple and efficient methodologies for protoplast isolation from multiple P. vulgaris tissues. We also provide a high-efficiency and amenable method for leaf mesophyll transformation for rapid gene functional characterization studies. Furthermore, a modified SAAT leaf disc infiltration approach aids in validating genes and their functions. Together, these methods help to rapidly unravel novel gene functions and are promising tools for P. vulgaris research.
Analysis of the functional gene structure and metabolic potential of microbial community in high arsenic groundwater.

PubMed

Li, Ping; Jiang, Zhou; Wang, Yanhong; Deng, Ye; Van Nostrand, Joy D; Yuan, Tong; Liu, Han; Wei, Dazhun; Zhou, Jizhong

2017-10-15

Microbial functional potential in high arsenic (As) groundwater ecosystems remains largely unknown. In this study, the microbial community functional composition of nineteen groundwater samples was investigated using a functional gene array (GeoChip 5.0). Samples were divided into low and high As groups based on the clustering analysis of geochemical parameters and microbial functional structures. The results showed that As-related genes (arsC, arrA), sulfate-related genes (dsrA and dsrB), nitrogen cycling-related genes (ureC, amoA, and hzo) and methanogen genes (mcrA, hdrB) in groundwater samples were correlated with As, SO₄²⁻, NH₄⁺ or CH₄ concentrations, respectively. Canonical correspondence analysis (CCA) results indicated that some geochemical parameters including As, total organic content, SO₄²⁻, NH₄⁺, oxidation-reduction potential (ORP) and pH were important factors shaping the functional microbial community structures. Alkaline and reducing conditions with relatively low SO₄²⁻, ORP, and high NH₄⁺, as well as SO₄²⁻ and Fe reduction and ammonification involved in microbially-mediated geochemical processes could be associated with As enrichment in groundwater. This study provides an overall picture of functional microbial communities in high As groundwater aquifers, and also provides insights into the critical role of microorganisms in As biogeochemical cycling. Copyright © 2017 Elsevier Ltd. All rights reserved.
Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia

2014-08-28

The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.
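The idea that a co-expression network groups genes with similar expression profiles into modules can be shown with a deliberately simplified sketch. It is a stand-in, not the method used in the paper: it links genes whose Pearson correlation meets a hard threshold and reports connected components as modules. The threshold, function names and toy profiles are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(vx * vy)


def coexpression_modules(profiles, threshold=0.9):
    """Link genes with correlation >= threshold; return connected components."""
    neighbors = {gene: set() for gene in profiles}
    for g1, g2 in combinations(profiles, 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)
    modules, seen = [], set()
    for gene in profiles:
        if gene in seen:
            continue
        # Depth-first walk collects every gene reachable from this one.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(sorted(component))
    return sorted(modules)


# Hypothetical expression profiles across four conditions.
toy_profiles = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [5, 1, 4, 2]}
toy_modules = coexpression_modules(toy_profiles)
```

Dedicated network packages typically use soft thresholds and hierarchical clustering instead; the hard cutoff here only keeps the example short.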
Genome-wide identification and expression analysis of the ClTCP transcription factors in Citrullus lanatus.

PubMed

Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan

2016-04-12

The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.
Use of DAVID algorithms for gene functional classification in a non-model organism, rainbow trout

USDA-ARS's Scientific Manuscript database

Gene functional clustering is essential in transcriptome data analysis but software programs are not always suitable for use with non-model species. The DAVID Gene Functional Classification Tool has been widely used for soft clustering in model species, but requires adaptations for use in non-model ...
Revealing the Strong Functional Association of adipor2 and cdh13 with adipoq: A Gene Network Study.

PubMed

Bag, Susmita; Anbarasu, Anand

2015-04-01

In the present study, we have analyzed functional gene interactions of the adiponectin gene (adipoq). The key role of adipoq is in regulating energy homeostasis, and it functions as a novel signaling molecule for adipose tissue. Modules of highly interconnected genes in the disease-specific adipoq network are derived by integrating gene function and protein interaction data. Among the twenty genes in the adipoq network, adipoq is most closely connected with two genes: adiponectin receptor 2 (adipor2) and cadherin 13 (cdh13). The functional analysis is done via ontological analysis and candidate disease identification. We observed that the genes most highly interlinked with adipoq are adipor2 and cdh13. Interestingly, the ontological analysis of adipor2 and cdh13 in the adipoq network reveals that adipoq and adipor2 are involved mostly in glucose and lipid metabolic processes. The gene cdh13 participates in cell adhesion processes together with adipoq and adipor2. Our computational gene network analysis also predicts potential candidate diseases, indicating the involvement of adipoq, adipor2, and cdh13 not only with obesity but also with breast cancer, leukemia, renal cancer, lung cancer, and cervical cancer. The current study provides researchers with a comprehensible layout of the adipoq network, its functional strategies and the candidate diseases associated with it.
Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data.

PubMed

Sanzol, Javier

2010-05-14

Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. 
Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.
Comparative genome analysis in the integrated microbial genomes (IMG) system.

PubMed

Markowitz, Victor M; Kyrpides, Nikos C

2007-01-01

Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries, and occurrence profile tools for comparing genes, pathways, and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.
Transcriptome Sequencing Revealed Significant Alteration of Cortical Promoter Usage and Splicing in Schizophrenia

PubMed Central

Wu, Jing Qin; Wang, Xi; Beveridge, Natalie J.; Tooney, Paul A.; Scott, Rodney J.; Carr, Vaughan J.; Cairns, Murray J.

2012-01-01

Background While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. Methodology/Principal Findings The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDR<0.05). Both types of transcriptional isoforms were exemplified by reads aligned to the neurodevelopmentally significant doublecortin-like kinase 1 (DCLK1) gene. Conclusions This study provided the first deep and un-biased analysis of schizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia. PMID:22558445
Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.

PubMed

Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J

2017-01-01

The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository of curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.
Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules.

PubMed

Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P

2013-03-21

Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group, we provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

PubMed Central

He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

2017-01-01

The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test, and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, which were all comprised of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were dominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF. 
PMID:28713939
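The thresholding step described in the record above (log2 fold-change > 2 and false discovery rate < 0.01) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it does not reproduce limma's moderated statistics, it assumes the Benjamini-Hochberg procedure as the FDR correction, and it assumes the fold-change cutoff applies to the absolute value, since both up- and downregulated genes are reported. All function names and the toy data are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fc_cutoff=2.0, fdr_cutoff=0.01):
    """Keep genes passing both cutoffs (absolute fold-change is an assumption)."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, fc, q in zip(genes, log2fc, fdr)
        if abs(fc) > fc_cutoff and q < fdr_cutoff
    ]


# Invented example values, not taken from GSE26125.
genes = ["g1", "g2", "g3", "g4"]
log2fc = [3.1, 2.5, -0.4, -2.8]
pvalues = [0.001, 0.04, 0.03, 0.5]
degs = select_degs(genes, log2fc, pvalues)
```

With cutoffs this strict, a gene needs both a large effect and a small adjusted p-value to be retained, which is why only a subset of nominally significant genes survives.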
Extending bicluster analysis to annotate unclassified ORFs and predict novel functional modules using expression data

PubMed Central

Bryan, Kenneth; Cunningham, Pádraig

2008-01-01

Background Microarrays have the capacity to measure the expressions of thousands of genes in parallel over many experimental samples. The unsupervised classification technique of bicluster analysis has been employed previously to uncover gene expression correlations over subsets of samples with the aim of providing a more accurate model of the natural gene functional classes. This approach also has the potential to aid functional annotation of unclassified open reading frames (ORFs). Until now this aspect of biclustering has been under-explored. In this work we illustrate how bicluster analysis may be extended into a 'semi-supervised' ORF annotation approach referred to as BALBOA. Results The efficacy of the BALBOA ORF classification technique is first assessed via cross validation and compared to a multi-class k-Nearest Neighbour (kNN) benchmark across three independent gene expression datasets. BALBOA is then used to assign putative functional annotations to unclassified yeast ORFs. These predictions are evaluated using existing experimental and protein sequence information. Lastly, we employ a related semi-supervised method to predict the presence of novel functional modules within yeast. Conclusion In this paper we demonstrate how unsupervised classification methods, such as bicluster analysis, may be extended using available annotations to form semi-supervised approaches within the gene expression analysis domain. We show that such methods have the potential to improve upon supervised approaches and shed new light on the functions of unclassified ORFs and their co-regulation. PMID:18831786
A genome scale metabolic network for rice and accompanying analysis of tryptophan, auxin and serotonin biosynthesis regulation under biotic stress

USDA-ARS's Scientific Manuscript database

Functional annotations of large plant genome projects mostly provide information on gene function and gene families based on the presence of protein domains and gene homology, but not necessarily in association with gene expression or metabolic and regulatory networks. These additional annotations a...
Genomewide Analysis of Aryl Hydrocarbon Receptor Binding Targets Reveals an Extensive Array of Gene Clusters that Control Morphogenetic and Developmental Programs

PubMed Central

Sartor, Maureen A.; Schnekenburger, Michael; Marlowe, Jennifer L.; Reichard, John F.; Wang, Ying; Fan, Yunxia; Ma, Ci; Karyala, Saikumar; Halbleib, Danielle; Liu, Xiangdong; Medvedovic, Mario; Puga, Alvaro

2009-01-01

Background The vertebrate aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that regulates cellular responses to environmental polycyclic and halogenated compounds. The naive receptor is believed to reside in an inactive cytosolic complex that translocates to the nucleus and induces transcription of xenobiotic detoxification genes after activation by ligand. Objectives We conducted an integrative genomewide analysis of AHR gene targets in mouse hepatoma cells and determined whether AHR regulatory functions may take place in the absence of an exogenous ligand. Methods The network of AHR-binding targets in the mouse genome was mapped through a multipronged approach involving chromatin immunoprecipitation/chip and global gene expression signatures. The findings were integrated into a prior functional knowledge base from Gene Ontology, interaction networks, Kyoto Encyclopedia of Genes and Genomes pathways, sequence motif analysis, and literature molecular concepts. Results We found the naive receptor in unstimulated cells bound to an extensive array of gene clusters with functions in regulation of gene expression, differentiation, and pattern specification, connecting multiple morphogenetic and developmental programs. Activation by the ligand displaced the receptor from some of these targets toward sites in the promoters of xenobiotic metabolism genes. Conclusions The vertebrate AHR appears to possess unsuspected regulatory functions that may be potential targets of environmental injury. PMID:19654925

Genome wide in silico characterization of Dof gene families of pigeonpea (Cajanus cajan (L) Millsp.).

PubMed

Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D

2015-02-01

The DNA binding with One Finger (Dof) protein is a plant-specific transcription factor involved in the regulation of a wide range of processes. Analysis of the whole genome sequence of pigeonpea identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 out of 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family, including gene structure, chromosome location, protein motif, phylogeny, gene duplication and functional divergence, has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree showed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with a predominance of light-responsive and stress-responsive elements, indicating that these CcDof genes may be associated with photoperiodic control and biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the constructed phylogenetic tree. Our study provides useful information for functional characterization of CcDof genes.
Computational gene expression profiling under salt stress reveals patterns of co-expression

PubMed Central

Sanchita; Sharma, Ashok

2016-01-01

Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken further for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress-related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
Genome-Wide Analysis of the NADK Gene Family in Plants

PubMed Central

Li, Wen-Yan; Wang, Xiang; Li, Ri; Li, Wen-Qiang; Chen, Kun-Ming

2014-01-01

Background NAD(H) kinase (NADK) is the key enzyme that catalyzes de novo synthesis of NADP(H) from NAD(H) for NADP(H)-based metabolic pathways. In plants, NADKs form functional subfamilies. Studies of these families in Arabidopsis thaliana indicate that they have undergone considerable evolutionary selection; however, the detailed evolutionary history and functions of the various NADKs in plants are not clearly understood. Principal Findings We performed a comparative genomic analysis that identified 74 NADK gene homologs from 24 species representing the eight major plant lineages within the supergroup Plantae: glaucophytes, rhodophytes, chlorophytes, bryophytes, lycophytes, gymnosperms, monocots and eudicots. Phylogenetic and structural analysis classified these NADK genes into four well-conserved subfamilies with considerable variety in the domain organization and gene structure among subfamily members. In addition to the typical NAD_kinase domain, additional domains, such as adenylate kinase, dual-specificity phosphatase, and protein tyrosine phosphatase catalytic domains, were found in subfamily II. Interestingly, NADKs in subfamily III exhibited low sequence similarity (∼30%) in the kinase domain within the subfamily and with the other subfamilies. These observations suggest that gene fusion and exon shuffling may have occurred after gene duplication, leading to specific domain organization seen in subfamilies II and III, respectively. Further analysis of the exon/intron structures showed that single intron loss and gain had occurred, yielding the diversified gene structures, during the process of structural evolution of NADK family genes. 
Finally, both available global microarray data analysis and qRT-PCR experiments revealed that the NADK genes in Arabidopsis and Oryza sativa show different expression patterns in different developmental stages and under several different abiotic/biotic stresses and hormone treatments, underscoring the functional diversity and functional divergence of the NADK family in plants. Conclusions These findings will facilitate further studies of the NADK family and provide valuable information for functional validation of this family in plants. PMID:24968225
A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija

Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation.

PubMed

Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija; Auguin, Daniel; Lainé, Éric; Davin, Laurence B; Cort, John R; Lewis, Norman G; Hano, Christophe

2018-05-01

Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
Comprehensive analysis of genetic variations in strictly-defined Leber congenital amaurosis with whole-exome sequencing in Chinese.

PubMed

Wang, Shi-Yuan; Zhang, Qi; Zhang, Xiang; Zhao, Pei-Quan

2016-01-01

To comprehensively analyze the potential pathogenic genes related to Leber congenital amaurosis (LCA) in Chinese subjects. LCA subjects and their families were retrospectively collected from 2013 to 2015. First, whole-exome sequencing was performed in patients who had undergone gene mutation screening with nothing found; homozygous sites were then selected, candidate sites were annotated, and pathogenic analysis was conducted using software including Sorting Tolerant from Intolerant (SIFT), Polyphen-2, Mutation assessor, Condel, and Functional Analysis through Hidden Markov Models (FATHMM). Furthermore, Gene Ontology function and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses of pathogenic genes were performed, followed by co-segregation analysis using the Fisher exact test. Sanger sequencing was used to validate single-nucleotide variations (SNVs). Expanded verification was performed in the remaining patients. In total, 51 LCA families with 53 patients and 24 family members were recruited. A total of 104 SNVs (66 LCA-related genes and 15 co-segregated genes) were submitted for expanded verification. Homozygous mutations of KRT12 and CYP1A1 were simultaneously observed in 3 families. Enrichment analysis showed that the potential pathogenic genes were mainly enriched in functions related to cell adhesion, biological adhesion, retinoid metabolic process, and eye development biological adhesion. Additionally, WFS1 and STAU2 had the highest homozygous frequencies. LCA is a highly heterogeneous disease. Mutations in KRT12, CYP1A1, WFS1, and STAU2 may be involved in the development of LCA.
The Association of Multiple Interacting Genes with Specific Phenotypes in Rice Using Gene Coexpression Networks

PubMed Central

Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex

2010-01-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
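The functional enrichment analysis of modules mentioned in the record above is commonly framed as a hypergeometric test: given a module of n genes drawn from a genome of N genes, of which K carry some annotation, how surprising is it to see k or more annotated genes in the module? The sketch below assumes that framing; the abstract does not name the test the authors used, and the function name and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for a module of n genes drawn from N, K of which are annotated."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy numbers: 10 genes in total, 3 annotated, module of 3 genes, all 3 annotated.
p_value = hypergeom_enrichment_p(N=10, K=3, n=3, k=3)
```

In practice one such p-value would be computed per annotation term and per module, and the resulting set would need multiple-testing correction before any cluster is called cofunctional.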
The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks.

PubMed

Ficklin, Stephen P; Luo, Feng; Feltus, F Alex

2010-09-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

PubMed

Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

2013-12-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

PubMed Central

Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

2013-01-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Dominant selectable markers for Penicillium spp. transformation and gene function studies

USDA-ARS's Scientific Manuscript database

Penicillium spp. has been genetically manipulated and gene function studies have utilized single gene deletion strains for phenotypic analysis. Fungal transformation experiments have relied on hygromycin and hygromycin phosphotransferase (hph) as the main dominant selectable marker (DSM) system in P...
Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

PubMed

Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

2007-09-21

Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. 
Our method of combining classifiers can improve the accuracy of such predictions - in this case, predictions of genes involved in stress response in plants - and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
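The weighted-voting idea described in the record above can be illustrated with a minimal sketch. This is not the authors' implementation: the classifier names, scores and weights below are hypothetical, and the abstract does not say how the weights were derived. Here each classifier is assumed to emit a score per gene, and the weights stand in for something like cross-validated accuracy.

```python
from typing import Dict, List


def weighted_vote(scores: Dict[str, List[float]],
                  weights: Dict[str, float]) -> List[float]:
    """Combine per-gene scores from several classifiers into one score per gene.

    Each classifier's score is multiplied by its weight, and the weighted
    sum is normalised by the total weight.
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_genes = len(next(iter(scores.values())))
    return [
        sum(weights[name] * scores[name][i] for name in scores) / total
        for i in range(n_genes)
    ]


# Hypothetical scores (estimated probability of stress response) for three genes.
scores = {
    "svm": [0.9, 0.2, 0.6],
    "knn": [0.7, 0.4, 0.5],
    "naive_bayes": [0.8, 0.1, 0.3],
}
# Hypothetical weights, assumed here to reflect each classifier's reliability.
weights = {"svm": 0.5, "knn": 0.2, "naive_bayes": 0.3}

combined = weighted_vote(scores, weights)
```

Genes would then be ranked by the combined score. How the weights are chosen is the part that matters in practice, and it is left as an assumption here.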
Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species

PubMed Central

Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj

2015-01-01

The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows a high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed that more than 85% were expressed genes with corresponding EST matches in the databases. This study placed special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and the rest of the genes have maintained their identity. Syntenic relationships across the three genomes indicated that more orthology is shared between the indica and japonica genomes than with the brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and in understanding the molecular mechanism of disease resistance and their evolution in rice and related species. PMID:25902056
Shortening tobacco life cycle accelerates functional gene identification in genomic research.

PubMed

Ning, G; Xiao, X; Lv, H; Li, X; Zuo, Y; Bao, M

2012-11-01

Definitive allocation of function requires the introduction of genetic mutations and analysis of their phenotypic consequences. Novel, rapid and convenient techniques or materials are very important and useful to accelerate gene identification in functional genomics research. Here, over-expression of PmFT (Prunus mume), a novel FT orthologue, and PtFT (Populus tremula) led to shortening of the tobacco life cycle. A series of novel short life cycle stable tobacco lines (30-50 days) were developed through repeated self-crossing selection breeding. Based on the second transformation via a gusA reporter gene, the promoter from BpFULL1 in silver birch (Betula pendula) and the gene (CPC) from Arabidopsis thaliana were effectively tested using short life cycle tobacco lines. Comparative analysis among wild type, short life cycle tobacco and Arabidopsis transformation systems verified that it is feasible to accelerate functional gene studies by shortening the life cycle of the host plant material, at least in these short life cycle tobacco lines. The results verified that the novel short life cycle transgenic tobacco lines not only combine the advantages of economic nursery requirements and a simple transformation system, but also provide a robust, effective and stable host system to accelerate gene analysis. Thus, shortening the tobacco life cycle is a feasible strategy to accelerate heterologous or homologous functional gene identification in genomic research. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.
Functional Analysis of the Polyketide Synthase Genes in the Filamentous Fungus Gibberella zeae (Anamorph Fusarium graminearum)

PubMed Central

Gaffoor, Iffa; Brown, Daren W.; Plattner, Ron; Proctor, Robert H.; Qi, Weihong; Trail, Frances

2005-01-01

Polyketides are a class of secondary metabolites that exhibit a vast diversity of form and function. In fungi, these compounds are produced by large, multidomain enzymes classified as type I polyketide synthases (PKSs). In this study we identified and functionally disrupted 15 PKS genes from the genome of the filamentous fungus Gibberella zeae. Five of these genes are responsible for producing the mycotoxins zearalenone, aurofusarin, and fusarin C and the black perithecial pigment. A comprehensive expression analysis of the 15 genes revealed diverse expression patterns during grain colonization, plant colonization, sexual development, and mycelial growth. Expression of one of the PKS genes was not detected under any of 18 conditions tested. This is the first study to genetically characterize a complete set of PKS genes from a single organism. PMID:16278459
Technological advances and genomics in metazoan parasites.

PubMed

Knox, D P

2004-02-01

Molecular biology has provided the means to identify parasite proteins, to define their function, patterns of expression and the means to produce them in quantity for subsequent functional analyses. Whole genome and expressed sequence tag programmes, and the parallel development of powerful bioinformatics tools, allow the execution of genome-wide between stage or species comparisons and meaningful gene-expression profiling. The latter can be undertaken with several new technologies such as DNA microarray and serial analysis of gene expression. Proteome analysis has come to the fore in recent years providing a crucial link between the gene and its protein product. RNA interference and ballistic gene transfer are exciting developments which can provide the means to precisely define the function of individual genes and, of importance in devising novel parasite control strategies, the effect that gene knockdown will have on parasite survival.
Evolutionary characterization and transcript profiling of β-tubulin genes in flax (Linum usitatissimum L.) during plant development.

PubMed

Gavazzi, Floriana; Pigna, Gaia; Braglia, Luca; Gianì, Silvia; Breviario, Diego; Morello, Laura

2017-12-08

Microtubules, polymerized from alpha and beta-tubulin monomers, play a fundamental role in plant morphogenesis, determining the cell division plane, the direction of cell expansion and the deposition of cell wall material. During polarized pollen tube elongation, microtubules serve as tracks for vesicular transport and deposition of proteins/lipids at the tip membrane. Such functions are controlled by cortical microtubule arrays. The aim of this study was first to characterize the flax β-tubulin family by sequence and phylogenetic analysis and then to investigate differential expression of β-tubulin genes possibly related to fibre elongation and to flower development. We report the cloning and characterization of the complete flax β-tubulin gene family: exon-intron organization, duplicated gene comparison, phylogenetic analysis and expression pattern during stem and hypocotyl elongation and during flower development. Sequence analysis of the fourteen expressed β-tubulin genes revealed that the recent whole genome duplication of the flax genome was followed by massive retention of duplicated tubulin genes. Expression analysis showed that β-tubulin mRNA profiles gradually changed along with phloem fibre development in both the stem and hypocotyl. In flowers, changes in relative tubulin transcript levels took place at anthesis in anthers, but not in carpels. Phylogenetic analysis supports the origin of extant plant β-tubulin genes from four ancestral genes pre-dating angiosperm separation. Expression analysis suggests that particular tubulin subpopulations are more suitable to sustain different microtubule functions such as cell elongation, cell wall thickening or pollen tube growth. Tubulin genes possibly related to different microtubule functions were identified as candidates for more detailed studies.
Identification of potential therapeutic target genes, key miRNAs and mechanisms in oral lichen planus by bioinformatics analysis.

PubMed

Gong, Cuihua; Sun, Shangtong; Liu, Bing; Wang, Jing; Chen, Xiaodong

2017-06-01

The study aimed to identify the potential target genes and key miRNAs as well as to explore the underlying mechanisms in the pathogenesis of oral lichen planus (OLP) by bioinformatics analysis. The microarray data of GSE38617 were downloaded from the Gene Expression Omnibus (GEO) database. A total of 7 OLP and 7 normal samples were used to identify the differentially expressed genes (DEGs) and miRNAs. Functional enrichment analyses were then performed on the DEGs. Furthermore, a DEG-miRNA network and a miRNA-function network were constructed with Cytoscape software. A total of 1758 DEGs (598 up- and 1160 down-regulated genes) and 40 miRNAs (17 up- and 23 down-regulated miRNAs) were selected. The up-regulated genes were related to the nuclear factor-Kappa B (NF-κB) signaling pathway, while down-regulated genes were mainly enriched in the function of ribosome. Tumor necrosis factor (TNF), caspase recruitment domain family, member 11 (CARD11) and mitochondrial ribosomal protein (MRP) genes were identified in these functions. In addition, miR-302 was a hub node in the DEG-miRNA network and regulated cyclin D1 (CCND1). MiR-548a-2 was the key miRNA in the miRNA-function network, regulating multiple functions including ribosomal function. The NF-κB signaling pathway and ribosome function may be the pathogenic mechanisms of OLP. The genes such as TNF, CARD11, MRP genes and CCND1 may be potential therapeutic target genes in OLP. MiR-548a-2 and miR-302 may play important roles in OLP development. Copyright © 2017 Elsevier Ltd. All rights reserved.
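Functional enrichment of a DEG list, as mentioned in the record above, is commonly assessed with a one-sided hypergeometric test. The abstract does not say which test or tool the authors used, so the following is only a sketch of that general technique, with made-up counts and a hypothetical function name.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_annotated: int,
                           n_selected: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_background: all genes measured on the array
    n_annotated:  background genes annotated to the pathway or function
    n_selected:   genes in the DEG list
    n_overlap:    DEGs that carry the annotation
    """
    upper = min(n_annotated, n_selected)
    denom = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Toy numbers (not taken from the study): 20 genes in the background,
# 5 annotated to a pathway, 4 DEGs of which 2 carry the annotation.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
```

In a real analysis the test is repeated for every pathway or GO term and the p-values are corrected for multiple testing. That step is omitted here.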
Molecular characterization and expression analysis of Triticum aestivum squamosa-promoter binding protein-box genes involved in ear development.

PubMed

Zhang, Bin; Liu, Xia; Zhao, Guangyao; Mao, Xinguo; Li, Ang; Jing, Ruilian

2014-06-01

Wheat (Triticum aestivum L.) is one of the most important crops in the world. Squamosa-promoter binding protein (SBP)-box genes play a critical role in regulating flower and fruit development. In this study, 10 novel SBP-box genes (TaSPL genes) were isolated from wheat (Triticum aestivum L. cultivar Yanzhan 4110). Phylogenetic analysis classified the TaSPL genes into five groups (G1-G5). The motif combinations and expression patterns of the TaSPL genes varied among the five groups, each having its own distinctive characteristics: TaSPL20/21 in G1 and TaSPL17 in G2 were mainly expressed in the shoot apical meristem and the young ear, and their expression levels responded to development of the ear; TaSPL6/15 belonging to G3 were upregulated and TaSPL1/23 in G4 were downregulated during grain development; the gene in G5 (TaSPL3) was expressed constitutively. Thus, the consistency of the phylogenetic analysis, motif compositions, and expression patterns of the TaSPL genes revealed specific gene structures and functions. On the other hand, the diverse gene structures and different expression patterns suggested that wheat SBP-box genes have a wide range of functions. The results also suggest a potential role for wheat SBP-box genes in ear development. This study provides a significant beginning of functional analysis of SBP-box genes in wheat. © 2014 The Authors. Journal of Integrative Plant Biology Published by Wiley Publishing Asia Pty Ltd on behalf of Institute of Botany, Chinese Academy of Sciences.
Genome-wide analysis and expression profiling suggest diverse roles of GH3 genes during development and abiotic stress responses in legumes

PubMed Central

Singh, Vikash K.; Jain, Mukesh; Garg, Rohini

2014-01-01

The growth hormone auxin regulates various cellular processes by altering the expression of diverse genes in plants. Among various auxin-responsive genes, GH3 genes maintain endogenous auxin homeostasis by conjugating excess auxin with amino acids. GH3 genes have been characterized in many plant species, but not in legumes. In the present work, we identified members of the GH3 gene family and analyzed their chromosomal distribution, gene structure, gene duplication and phylogenetic relationships in different legumes, including chickpea, soybean, Medicago, and Lotus. A comprehensive expression analysis in different vegetative and reproductive tissues/stages revealed that many of the GH3 genes were expressed in a tissue-specific manner. Notably, chickpea CaGH3-3, soybean GmGH3-8 and -25, and Lotus LjGH3-4, -5, -9 and -18 genes were up-regulated in root, indicating their putative role in root development. In addition, chickpea CaGH3-1 and -7, and Medicago MtGH3-7, -8, and -9 were found to be highly induced under drought and/or salt stresses, suggesting their role in abiotic stress responses. We also observed examples of differential expression patterns of duplicated GH3 genes in soybean, indicating their functional diversification. Furthermore, analyses of three-dimensional structures, active site residues and ligand preferences provided molecular insights into the function of GH3 genes in legumes. The analysis presented here should help in investigating the precise function of GH3 genes in legumes during development and stress conditions. PMID:25642236

FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies.

PubMed

Kim, Jiwoong; Kim, Min Soo; Koh, Andrew Y; Xie, Yang; Zhan, Xiaowei

2016-10-10

Given the lack of a complete and comprehensive library of microbial reference genomes, determining the functional profile of diverse microbial communities is challenging. The available functional analysis pipelines lack several key features: (i) an integrated alignment tool, (ii) operon-level analysis, and (iii) the ability to process large datasets. Here we introduce our open-sourced, stand-alone functional analysis pipeline for analyzing whole metagenomic and metatranscriptomic sequencing data, FMAP (Functional Mapping and Analysis Pipeline). FMAP performs alignment, gene family abundance calculations, and statistical analysis (three levels of analyses are provided: differentially-abundant genes, operons and pathways). The resulting output can be easily visualized with heatmaps and functional pathway diagrams. FMAP functional predictions are consistent with currently available functional analysis pipelines. FMAP is a comprehensive tool for providing functional analysis of metagenomic/metatranscriptomic sequencing data. With the added features of integrated alignment, operon-level analysis, and the ability to process large datasets, FMAP will be a valuable addition to the currently available functional analysis toolbox. We believe that this software will be of great value to the wider biology and bioinformatics communities.
Human microRNA target analysis and gene ontology clustering by GOmir, a novel stand-alone application

PubMed Central

Roubelakis, Maria G; Zotos, Pantelis; Papachristoudis, Georgios; Michalopoulos, Ioannis; Pappa, Kalliopi I; Anagnou, Nicholas P; Kossida, Sophia

2009-01-01

Background: microRNAs (miRNAs) are single-stranded RNA molecules of about 20–23 nucleotides in length found in a wide variety of organisms. miRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying mRNA targets of specific miRNAs is a difficult process of great importance. Results: GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools as well as the experimentally supported targets from TarBase, and also provides a full gene description and functional analysis for each target gene. On the other hand, the TAGGO application is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir represents a new tool incorporating two separate Java applications integrated into one stand-alone Java application. Conclusion: GOmir (by using up to five different databases) introduces miRNA predicted targets accompanied by (a) full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can also be conducted. GOmir can freely be downloaded from BRFAA. PMID:19534746
Human microRNA target analysis and gene ontology clustering by GOmir, a novel stand-alone application.

PubMed

Roubelakis, Maria G; Zotos, Pantelis; Papachristoudis, Georgios; Michalopoulos, Ioannis; Pappa, Kalliopi I; Anagnou, Nicholas P; Kossida, Sophia

2009-06-16

microRNAs (miRNAs) are single-stranded RNA molecules of about 20-23 nucleotides in length found in a wide variety of organisms. miRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying mRNA targets of specific miRNAs is a difficult process of great importance. GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools as well as the experimentally supported targets from TarBase, and also provides a full gene description and functional analysis for each target gene. On the other hand, the TAGGO application is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir represents a new tool incorporating two separate Java applications integrated into one stand-alone Java application. GOmir (by using up to five different databases) introduces miRNA predicted targets accompanied by (a) full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can also be conducted. GOmir can freely be downloaded from BRFAA.
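Combining target predictions from several tools, as described for JTarget above, amounts to merging gene sets and recording which tools support each target. The sketch below is not GOmir code (GOmir is a Java application whose internals are not given in the abstract). The tool names come from the abstract, while the gene identifiers, the function name and the two-tool consensus threshold are placeholders chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def merge_predictions(predictions: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each predicted target gene to the set of tools that reported it."""
    support = defaultdict(set)
    for tool, genes in predictions.items():
        for gene in genes:
            support[gene].add(tool)
    return dict(support)


# Placeholder predictions for a single miRNA; gene names are invented.
predictions = {
    "TargetScan": {"GENE_A", "GENE_B"},
    "miRanda": {"GENE_B", "GENE_C"},
    "PicTar": {"GENE_B"},
}

support = merge_predictions(predictions)

# Keep targets reported by at least two tools (an arbitrary threshold).
consensus = sorted(g for g, tools in support.items() if len(tools) >= 2)
```

Taking the union maximises sensitivity, while requiring agreement between tools trades sensitivity for specificity. Which setting is appropriate depends on the downstream analysis.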
Saliva Microbiota Carry Caries-Specific Functional Gene Signatures

PubMed Central

Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L.; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian

2014-01-01

Human saliva microbiota is phylogenetically divergent among host individuals, yet its roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, although both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis. PMID:24533043
Saliva microbiota carry caries-specific functional gene signatures.

PubMed

Yang, Fang; Ning, Kang; Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian

2014-01-01

Human saliva microbiota is phylogenetically divergent among host individuals, yet its roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, although both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis.
Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

PubMed

Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

2018-04-01

The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package, and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways, and its prognostic function was tested using a training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs may be used to effectively distinguish Luminal A samples with different prognoses. There were 9 pathways screened as significant pathways, and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity for both the training and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1, may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.
Analysis of mammalian gene function through broad based phenotypic screens across a consortium of mouse clinics

PubMed Central

Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl MJ; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, 
Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie

2015-01-01

The function of the majority of genes in the mouse and human genomes remains unknown. The mouse ES cell knockout resource provides a basis for characterisation of relationships between gene and phenotype. The EUMODIC consortium developed and validated robust methodologies for broad-based phenotyping of knockouts through a pipeline comprising 20 disease-orientated platforms. We developed novel statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no prior functional annotation. We captured data from over 27,000 mice finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. Novel phenotypes were uncovered for many genes with unknown function providing a powerful basis for hypothesis generation and further investigation in diverse systems. PMID:26214591
Functional Analysis of the Arabidopsis TETRASPANIN Gene Family in Plant Growth and Development.

PubMed

Wang, Feng; Muto, Antonella; Van de Velde, Jan; Neyt, Pia; Himanen, Kristiina; Vandepoele, Klaas; Van Lijsebettens, Mieke

2015-11-01

TETRASPANIN (TET) genes encode conserved integral membrane proteins that are known in animals to function in cellular communication during gamete fusion, immunity reaction, and pathogen recognition. In plants, functional information is limited to one of the 17 members of the Arabidopsis (Arabidopsis thaliana) TET gene family and to expression data in reproductive stages. Here, the promoter activity of all 17 Arabidopsis TET genes was investigated by pAtTET::NUCLEAR LOCALIZATION SIGNAL-GREEN FLUORESCENT PROTEIN/β-GLUCURONIDASE reporter lines throughout the life cycle, which predicted functional divergence in the paralogous genes per clade. However, partial overlap was observed for many TET genes across the clades, correlating with few phenotypes in single mutants and, therefore, requiring double mutant combinations for functional investigation. Mutational analysis showed a role for TET13 in primary root growth and lateral root development and redundant roles for TET5 and TET6 in leaf and root growth through negative regulation of cell proliferation. Strikingly, a number of TET genes were expressed in embryonic and seedling progenitor cells and remained expressed until the differentiation state in the mature plant, suggesting a dynamic function over developmental stages. The cis-regulatory elements together with transcription factor-binding data provided molecular insight into the sites, conditions, and perturbations that affect TET gene expression and positioned the TET genes in different molecular pathways; the data represent a hypothesis-generating resource for further functional analyses. © 2015 American Society of Plant Biologists. All Rights Reserved.
Functional Analysis of the Arabidopsis TETRASPANIN Gene Family in Plant Growth and Development

PubMed Central

Wang, Feng; Muto, Antonella; Van de Velde, Jan; Neyt, Pia; Himanen, Kristiina; Vandepoele, Klaas; Van Lijsebettens, Mieke

2015-01-01

TETRASPANIN (TET) genes encode conserved integral membrane proteins that are known in animals to function in cellular communication during gamete fusion, immunity reaction, and pathogen recognition. In plants, functional information is limited to one of the 17 members of the Arabidopsis (Arabidopsis thaliana) TET gene family and to expression data in reproductive stages. Here, the promoter activity of all 17 Arabidopsis TET genes was investigated by pAtTET::NUCLEAR LOCALIZATION SIGNAL-GREEN FLUORESCENT PROTEIN/β-GLUCURONIDASE reporter lines throughout the life cycle, which predicted functional divergence in the paralogous genes per clade. However, partial overlap was observed for many TET genes across the clades, correlating with few phenotypes in single mutants and, therefore, requiring double mutant combinations for functional investigation. Mutational analysis showed a role for TET13 in primary root growth and lateral root development and redundant roles for TET5 and TET6 in leaf and root growth through negative regulation of cell proliferation. Strikingly, a number of TET genes were expressed in embryonic and seedling progenitor cells and remained expressed until the differentiation state in the mature plant, suggesting a dynamic function over developmental stages. The cis-regulatory elements together with transcription factor-binding data provided molecular insight into the sites, conditions, and perturbations that affect TET gene expression and positioned the TET genes in different molecular pathways; the data represent a hypothesis-generating resource for further functional analyses. PMID:26417009
[Construction and functional identification of eukaryotic expression vector carrying Sprague-Dawley rat MSX-2 gene].

PubMed

Yang, Xian-Xian; Zhang, Mei; Yan, Zhao-Wen; Zhang, Ru-Hong; Mu, Xiong-Zheng

2008-01-01

To construct a highly effective eukaryotic expression plasmid, pcDNA3.1-MSX-2, encoding the Sprague-Dawley (SD) rat MSX-2 gene for further study of MSX-2 gene function. The full-length SD rat MSX-2 gene was amplified by PCR, and the full-length DNA was inserted into the pMD18-T vector. It was isolated by restriction enzyme digestion with BamHI and XhoI, then ligated into the cloning site of the pcDNA3.1 expression plasmid. The positive recombinant was identified by PCR analysis, restriction endonuclease analysis and sequence analysis. Expression of RNA and protein was detected by RT-PCR and Western blot analysis in pcDNA3.1-MSX-2-transfected HEK293 cells. Sequence analysis and restriction endonuclease analysis of pcDNA3.1-MSX-2 demonstrated that the position and size of the MSX-2 cDNA insertion were consistent with the design. RT-PCR and Western blot analysis showed specific expression of MSX-2 mRNA and protein in the transfected HEK293 cells. The highly effective eukaryotic expression plasmid pcDNA3.1-MSX-2, encoding the Sprague-Dawley rat MSX-2 gene, which is related to craniofacial development, was successfully constructed. It may serve as the basis for further study of MSX-2 gene function.
Conserved Genes Act as Modifiers of Invertebrate SMN Loss of Function Defects

PubMed Central

Chang, Howard C.; Sen, Anindya; Kalloo, Geetika; Harris, Jevede; Barsby, Tom; Walsh, Melissa B.; Satterlee, John S.; Li, Chris; Van Vactor, David; Artavanis-Tsakonas, Spyros; Hart, Anne C.

2010-01-01

Spinal Muscular Atrophy (SMA) is caused by diminished function of the Survival of Motor Neuron (SMN) protein, but the molecular pathways critical for SMA pathology remain elusive. We have used genetic approaches in invertebrate models to identify conserved SMN loss of function modifier genes. Drosophila melanogaster and Caenorhabditis elegans each have a single gene encoding a protein orthologous to human SMN; diminished function of these invertebrate genes causes lethality and neuromuscular defects. To find genes that modulate SMN function defects across species, two approaches were used. First, a genome-wide RNAi screen for C. elegans SMN modifier genes was undertaken, yielding four genes. Second, we tested the conservation of modifier gene function across species; genes identified in one invertebrate model were tested for function in the other invertebrate model. Drosophila orthologs of two genes, which were identified originally in C. elegans, modified Drosophila SMN loss of function defects. C. elegans orthologs of twelve genes, which were originally identified in a previous Drosophila screen, modified C. elegans SMN loss of function defects. Bioinformatic analysis of the conserved, cross-species, modifier genes suggests that conserved cellular pathways, specifically endocytosis and mRNA regulation, act as critical genetic modifiers of SMN loss of function defects across species. PMID:21124729
Differential network analysis reveals the genome-wide landscape of estrogen receptor modulation in hormonal cancers

PubMed Central

Hsiao, Tzu-Hung; Chiu, Yu-Chiao; Hsu, Pei-Yin; Lu, Tzu-Pin; Lai, Liang-Chuan; Tsai, Mong-Hsun; Huang, Tim H.-M.; Chuang, Eric Y.; Chen, Yidong

2016-01-01

Several mutual information (MI)-based algorithms have been developed to identify dynamic gene-gene and function-function interactions governed by key modulators (genes, proteins, etc.). Due to intensive computation, however, these methods rely heavily on prior knowledge and are limited in genome-wide analysis. We present the modulated gene/gene set interaction (MAGIC) analysis to systematically identify genome-wide modulation of interaction networks. Based on a novel statistical test employing conjugate Fisher transformations of correlation coefficients, MAGIC features fast computation and adaption to variations of clinical cohorts. In simulated datasets MAGIC achieved greatly improved computation efficiency and overall superior performance than the MI-based method. We applied MAGIC to construct the estrogen receptor (ER) modulated gene and gene set (representing biological function) interaction networks in breast cancer. Several novel interaction hubs and functional interactions were discovered. ER+ dependent interaction between TGFβ and NFκB was further shown to be associated with patient survival. The findings were verified in independent datasets. Using MAGIC, we also assessed the essential roles of ER modulation in another hormonal cancer, ovarian cancer. Overall, MAGIC is a systematic framework for comprehensively identifying and constructing the modulated interaction networks in a whole-genome landscape. MATLAB implementation of MAGIC is available for academic uses at https://github.com/chiuyc/MAGIC. PMID:26972162
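The abstract above describes a test built on Fisher transformations of correlation coefficients. As a rough illustration of the underlying idea only, the sketch below implements the textbook Fisher z-test for whether two independent samples (for example, a modulator-positive and a modulator-negative cohort) show different correlations between the same pair of genes. It is not the MAGIC statistic itself, and the function names and sample sizes are assumptions made for this example.

```python
import math


def fisher_z(r):
    """Fisher transformation of a correlation coefficient (-1 < r < 1)."""
    return math.atanh(r)


def correlation_difference_z(r1, n1, r2, n2):
    """z statistic for H0: the two population correlations are equal.

    r1, r2 are sample correlations from two independent groups of
    sizes n1 and n2 (each must be larger than 3).
    """
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se


def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical example: a gene pair correlated at 0.8 in one cohort
# and 0.2 in the other, with 103 samples in each cohort.
z = correlation_difference_z(0.8, 103, 0.2, 103)
p = two_sided_p(z)
```

A large absolute z (small p) suggests that the gene-gene correlation depends on the grouping variable, which is the kind of modulated interaction the abstract refers to.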
Improving information retrieval in functional analysis.

PubMed

Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A

2016-12-01

Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.
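Singular Enrichment Analysis, as mentioned in the abstract above, is commonly carried out with a one-sided hypergeometric (Fisher-type) test per Gene Ontology term. The sketch below shows that test in its simplest form; it is an illustrative assumption about one common SEA statistic, not the IFA tool or any of the specific methods compared in the study, and the parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes annotated to the term
    selected:   number of candidate genes
    overlap:    candidate genes annotated to the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 background genes, 200 annotated to a term,
# 500 candidate genes, 15 of which carry the annotation.
p_value = hypergeom_enrichment_p(20000, 200, 500, 15)
```

In practice the resulting p-values would be corrected for multiple testing across all terms; that step is left out here to keep the sketch short.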
A multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors for functional gene analysis.

PubMed

Weber, Kristoffer; Bartsch, Udo; Stocking, Carol; Fehse, Boris

2008-04-01

Functional gene analysis requires the possibility of overexpression, as well as downregulation of one, or ideally several, potentially interacting genes. Lentiviral vectors are well suited for this purpose as they ensure stable expression of complementary DNAs (cDNAs), as well as short-hairpin RNAs (shRNAs), and can efficiently transduce a wide spectrum of cell targets when packaged within the coat proteins of other viruses. Here we introduce a multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors designed according to the "building blocks" principle. Using a wide spectrum of different fluorescent markers, including drug-selectable enhanced green fluorescent protein (eGFP)- and dTomato-blasticidin-S resistance fusion proteins, LeGO vectors allow simultaneous analysis of multiple genes and shRNAs of interest within single, easily identifiable cells. Furthermore, each functional module is flanked by unique cloning sites, ensuring flexibility and individual optimization. The efficacy of these vectors for analyzing multiple genes in a single cell was demonstrated in several different cell types, including hematopoietic, endothelial, and neural stem and progenitor cells, as well as hepatocytes. LeGO vectors thus represent a valuable tool for investigating gene networks using conditional ectopic expression and knock-down approaches simultaneously.
Norrie disease gene: characterization of deletions and possible function.

PubMed

Chen, Z Y; Battinelli, E M; Hendriks, R W; Powell, J F; Middleton-Price, H; Sims, K B; Breakefield, X O; Craig, I W

1993-05-01

Positional cloning experiments have resulted recently in the isolation of a candidate gene for Norrie disease (pseudoglioma; NDP), a severe X-linked neurodevelopmental disorder. Here we report the isolation and analysis of human genomic DNA clones encompassing the NDP gene. The gene spans 28 kb and consists of 3 exons, the first of which is entirely contained within the 5' untranslated region. Detailed analysis of genomic deletions in Norrie patients shows that they are heterogeneous, both in size and in position. By PCR analysis, we found that expression of the NDP gene was not confined to the eye or to the brain. An extensive DNA and protein sequence comparison between the human NDP gene and related genes from the database revealed homology with cysteine-rich protein-binding domains of immediate-early genes implicated in the regulation of cell proliferation. We propose that NDP is a molecule related in function to these genes and may be involved in a pathway that regulates neural cell differentiation and proliferation.
Weighted functional linear regression models for gene-based association analysis.

PubMed

Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

2018-01-01

Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
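The abstract above mentions weights "defined by allele frequencies via the beta distribution". A minimal sketch of that idea follows: each variant receives the beta density evaluated at its minor allele frequency (MAF). The shape parameters a = 1 and b = 25 are an assumption borrowed from common practice in kernel-based tests; the abstract does not state which parameters were used, and this is not the FREGAT code.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def maf_weights(mafs, a=1.0, b=25.0):
    """Variant weights from minor allele frequencies.

    a and b are assumed shape parameters (hypothetical defaults);
    with a=1, b=25 rare variants receive larger weights.
    """
    return [beta_pdf(m, a, b) for m in mafs]


# Hypothetical MAFs for three variants in one gene.
weights = maf_weights([0.01, 0.10, 0.40])
```

With these assumed parameters the weight falls steeply as MAF grows, so rare variants dominate the gene-level statistic; a flat Beta(1, 1) would recover the unweighted model.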
Genome-Wide Screening and Characterization of the Dof Gene Family in Physic Nut (Jatropha curcas L.).

PubMed

Wang, Peipei; Li, Jing; Gao, Xiaoyang; Zhang, Di; Li, Anlin; Liu, Changning

2018-05-29

Physic nut (Jatropha curcas L.) is a species of flowering plant with great potential for biofuel production and as an emerging model organism for functional genomic analysis, particularly in the Euphorbiaceae family. DNA binding with one finger (Dof) transcription factors play critical roles in numerous biological processes in plants. Nevertheless, the knowledge about members, and the evolutionary and functional characteristics of the Dof gene family in physic nut is insufficient. Therefore, we performed a genome-wide screening and characterization of the Dof gene family within the physic nut draft genome. In total, 24 JcDof genes (encoding 33 JcDof proteins) were identified. All the JcDof genes were divided into three major groups based on phylogenetic inference, which was further validated by the subsequent gene structure and motif analysis. Genome comparison revealed that segmental duplication may have played crucial roles in the expansion of the JcDof gene family, and gene expansion was mainly subjected to positive selection. The expression profile demonstrated the broad involvement of JcDof genes in response to various abiotic stresses, hormonal treatments and functional divergence. This study provides valuable information for better understanding the evolution of JcDof genes, and lays a foundation for future functional exploration of JcDof genes.
Bioinformatics tools for quantitative and functional metagenome and metatranscriptome data analysis in microbes.

PubMed

Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin

2017-05-08

Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
[Current status of gene test market].

PubMed

Ohtani, Shinichi

2002-12-01

Technological innovation in gene analysis is expanding the range of clinical diagnoses to which gene tests can be applied. Gene testing has accordingly become increasingly common in clinical laboratory testing, particularly for infectious diseases. In the clinical laboratory field as well, functional analysis of genes has brought growing clinical importance and new test items year by year. Moreover, advances in gene-analysis technology are leading to the development and introduction of new test methods and automated analysis equipment. The spread of gene testing and the growth of the gene test market may well revitalize the present clinical laboratory field.
Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.

PubMed

Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei

2015-05-01

Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminatory power were used to screen for asthma-specific diagnostic markers. At fold-change > 2 and p < 0.05, there were 758 named differential genes. The QRT-PCR results confirmed the array data. Hierarchical clustering divided 29 highly probable candidate genes into seven categories, and genes in the same cluster were likely to have similar expression patterns or functions. Gene ontology analysis showed that the differential genes were primarily enriched in the biological process terms immune response, response to stress or stimulus, and regulation of apoptosis. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT had excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.
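The selection rule quoted in the abstract above (fold-change > 2 and p < 0.05) can be sketched as a simple filter. The record layout, the gene labels and the choice to treat a two-fold decrease the same as a two-fold increase are assumptions for illustration; the abstract does not say how fold-change or p-values were computed.

```python
def differential_genes(records, min_fold=2.0, max_p=0.05):
    """Select genes changed by more than min_fold with p below max_p.

    records: iterable of (gene, mean_case, mean_control, p_value),
    where the means are on a linear (non-log) scale. This layout is
    hypothetical and chosen only for the example.
    """
    selected = []
    for gene, case, control, p in records:
        if case <= 0 or control <= 0:
            continue  # a ratio is not defined for non-positive means
        fold = case / control
        changed = fold > min_fold or fold < 1.0 / min_fold
        if changed and p < max_p:
            selected.append((gene, fold))
    return selected


# Hypothetical expression summaries for four genes.
records = [
    ("geneA", 8.0, 2.0, 0.01),   # up 4-fold, significant
    ("geneB", 3.0, 2.0, 0.001),  # significant but only 1.5-fold
    ("geneC", 1.0, 4.0, 0.04),   # down 4-fold, significant
    ("geneD", 10.0, 2.0, 0.20),  # 5-fold but not significant
]
result = differential_genes(records)
```

Only geneA and geneC pass both thresholds here, which shows why the two criteria are applied jointly rather than either one alone.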

Systematic analysis of microarray datasets to identify Parkinson's disease‑associated pathways and genes.

PubMed

Feng, Yinling; Wang, Xuefeng

2017-03-01

In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.
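The co-pathway step in the abstract above relies on Pearson's correlation coefficient. As a small reminder of what that statistic computes, the sketch below evaluates it for two equal-length numeric profiles. How the study summarized each pathway into a profile is not stated in the abstract, so the inputs here are purely hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sample activity scores for two pathways.
pathway_a = [1.0, 2.0, 3.0, 4.0]
pathway_b = [2.1, 3.9, 6.2, 8.0]
r = pearson_r(pathway_a, pathway_b)
```

A value of r near 1 or -1 would flag the two pathways as a candidate pair for the follow-up overlap test described in the abstract.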
Microbial Functional Gene Diversity with a Shift of Subsurface Redox Conditions during In Situ Uranium Reduction

PubMed Central

Liang, Yuting; Van Nostrand, Joy D.; N′Guessan, Lucie A.; Peacock, Aaron D.; Deng, Ye; Long, Philip E.; Resch, C. Tom; Wu, Liyou; He, Zhili; Li, Guanghe; Hazen, Terry C.; Lovley, Derek R.

2012-01-01

To better understand the microbial functional diversity changes with subsurface redox conditions during in situ uranium bioremediation, key functional genes were studied with GeoChip, a comprehensive functional gene microarray, in field experiments at a uranium mill tailings remedial action (UMTRA) site (Rifle, CO). The results indicated that functional microbial communities altered with a shift in the dominant metabolic process, as documented by hierarchical cluster and ordination analyses of all detected functional genes. The abundance of dsrAB genes (dissimilatory sulfite reductase genes) and methane generation-related mcr genes (methyl coenzyme M reductase coding genes) increased when redox conditions shifted from Fe-reducing to sulfate-reducing conditions. The cytochrome genes detected were primarily from Geobacter sp. and decreased with lower subsurface redox conditions. Statistical analysis of environmental parameters and functional genes indicated that acetate, U(VI), and redox potential (Eh) were the most significant geochemical variables linked to microbial functional gene structures, and changes in microbial functional diversity were strongly related to the dominant terminal electron-accepting process following acetate addition. The study indicates that the microbial functional genes clearly reflect the in situ redox conditions and the dominant microbial processes, which in turn influence uranium bioreduction. Microbial functional genes thus could be very useful for tracking microbial community structure and dynamics during bioremediation. PMID:22327592
Deduction and Analysis of the Interacting Stress Response Pathways of Metal/Radionuclide-reducing Bacteria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Jizhong; He, Zhili

2010-02-28

Project Title: Deduction and Analysis of the Interacting Stress Response Pathways of Metal/Radionuclide-reducing Bacteria. DOE Grant Number: DE-FG02-06ER64205. Principal Investigator: Jizhong (Joe) Zhou (University of Oklahoma). Key members: Zhili He, Aifen Zhou, Christopher Hemme, Joy Van Nostrand, Ye Deng, and Qichao Tu. Collaborators: Terry Hazen, Judy Wall, Adam Arkin, Matthew Fields, Aindrila Mukhopadhyay, and David Stahl. Summary: Three major objectives have been pursued in the Zhou group at the University of Oklahoma (OU): (i) understanding of gene function, regulation, network and evolution of Desulfovibrio vulgaris Hildenborough in response to environmental stresses, (ii) development of metagenomics technologies for microbial community analysis, and (iii) functional characterization of microbial communities with metagenomic approaches. In the past few years, we characterized four CRP/FNR regulators, sequenced ancestor and evolved D. vulgaris strains, and functionally analyzed those mutated genes identified in salt-adapted strains. Also, a new version of GeoChip 4.0 has been developed, which also includes stress response genes (StressChip), and a random matrix theory-based conceptual framework for identifying functional molecular ecological networks has been developed with the high throughput functional gene array hybridization data as well as pyrosequencing data from 16S rRNA genes. In addition, GeoChip and sequencing technologies as well as network analysis approaches have been used to analyze microbial communities from different habitats. Those studies provide a comprehensive understanding of gene function, regulation, network, and evolution in D. vulgaris, and microbial community diversity, composition and structure as well as their linkages with environmental factors and ecosystem functioning, which has resulted in more than 60 publications.
Guidelines for the functional annotation of microRNAs using the Gene Ontology

PubMed Central

D'Eustachio, Peter; Smith, Jennifer R.; Zampetaki, Anna

2016-01-01

MicroRNA regulation of developmental and cellular processes is a relatively new field of study, and the available research data have not been organized to enable its inclusion in pathway and network analysis tools. The association of gene products with terms from the Gene Ontology is an effective method to analyze functional data, but until recently there has been no substantial effort dedicated to applying Gene Ontology terms to microRNAs. Consequently, when performing functional analysis of microRNA data sets, researchers have had to rely instead on the functional annotations associated with the genes encoding microRNA targets. In consultation with experts in the field of microRNA research, we have created comprehensive recommendations for the Gene Ontology curation of microRNAs. This curation manual will enable provision of a high-quality, reliable set of functional annotations for the advancement of microRNA research. Here we describe the key aspects of the work, including development of the Gene Ontology to represent this data, standards for describing the data, and guidelines to support curators making these annotations. The full microRNA curation guidelines are available on the GO Consortium wiki (http://wiki.geneontology.org/index.php/MicroRNA_GO_annotation_manual). PMID:26917558
Characterization and Functional Analysis of PEBP Family Genes in Upland Cotton (Gossypium hirsutum L.).

PubMed

Zhang, Xiaohong; Wang, Congcong; Pang, Chaoyou; Wei, Hengling; Wang, Hantao; Song, Meizhen; Fan, Shuli; Yu, Shuxun

2016-01-01

Upland cotton (Gossypium hirsutum L.) is a naturally occurring photoperiod-sensitive perennial plant species. However, sensitivity to the day length was lost during domestication. The phosphatidylethanolamine-binding protein (PEBP) gene family, of which three subclades have been identified in angiosperms, functions to promote and suppress flowering in the photoperiod pathway. Recent evidence indicates that PEBP family genes play an important role in generating mobile flowering signals. We isolated homologues of the PEBP gene family in upland cotton and examined their regulation and function. Nine PEBP-like genes were cloned and phylogenetic analysis indicated the genes belonged to four subclades (FT, MFT, TFL1 and PEBP). Cotton PEBP-like genes showed distinct expression patterns in relation to different cotton genotypes, photoperiod response and cultivar maturity. GhFT gene expression in a semi-wild race of upland cotton was strongly induced under short-day conditions, whereas GhPEBP2 gene expression was induced under long days. We also showed that GhFT, but not GhPEBP2, interacted with the FD-like bZIP transcription factor GhFD and promoted flowering under both long- and short-day conditions. The present results indicate that GhPEBP-like genes may perform different functions. This work corroborates the involvement of PEBP-like genes in photoperiod response and regulation of flowering time in different cotton genotypes, and contributes to an improved understanding of the function of PEBP-like genes in cotton.
Characterization and Functional Analysis of PEBP Family Genes in Upland Cotton (Gossypium hirsutum L.)

PubMed Central

Wang, Congcong; Pang, Chaoyou; Wei, Hengling; Wang, Hantao; Song, Meizhen; Fan, Shuli; Yu, Shuxun

2016-01-01

Upland cotton (Gossypium hirsutum L.) is a naturally occurring photoperiod-sensitive perennial plant species. However, sensitivity to the day length was lost during domestication. The phosphatidylethanolamine-binding protein (PEBP) gene family, of which three subclades have been identified in angiosperms, functions to promote and suppress flowering in the photoperiod pathway. Recent evidence indicates that PEBP family genes play an important role in generating mobile flowering signals. We isolated homologues of the PEBP gene family in upland cotton and examined their regulation and function. Nine PEBP-like genes were cloned and phylogenetic analysis indicated the genes belonged to four subclades (FT, MFT, TFL1 and PEBP). Cotton PEBP-like genes showed distinct expression patterns in relation to different cotton genotypes, photoperiod response and cultivar maturity. GhFT gene expression in a semi-wild race of upland cotton was strongly induced under short-day conditions, whereas GhPEBP2 gene expression was induced under long days. We also showed that GhFT, but not GhPEBP2, interacted with the FD-like bZIP transcription factor GhFD and promoted flowering under both long- and short-day conditions. The present results indicate that GhPEBP-like genes may perform different functions. This work corroborates the involvement of PEBP-like genes in photoperiod response and regulation of flowering time in different cotton genotypes, and contributes to an improved understanding of the function of PEBP-like genes in cotton. PMID:27552108
Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model.

PubMed

Li, Edward B; Truong, Dawn; Hallett, Shawn A; Mukherjee, Kusumika; Schutte, Brian C; Liao, Eric C

2017-09-01

Large-scale sequencing efforts have captured a rapidly growing catalogue of genetic variations. However, the accurate establishment of gene variant pathogenicity remains a central challenge in translating personal genomics information to clinical decisions. Interferon Regulatory Factor 6 (IRF6) gene variants are significant genetic contributors to orofacial clefts. Although approximately three hundred IRF6 gene variants have been documented, their effects on protein functions remain difficult to interpret. Here, we demonstrate the protein functions of human IRF6 missense gene variants could be rapidly assessed in detail by their abilities to rescue the irf6 -/- phenotype in zebrafish through variant mRNA microinjections at the one-cell stage. The results revealed many missense variants previously predicted by traditional statistical and computational tools to be loss-of-function and pathogenic retained partial or full protein function and rescued the zebrafish irf6 -/- periderm rupture phenotype. Through mRNA dosage titration and analysis of the Exome Aggregation Consortium (ExAC) database, IRF6 missense variants were grouped by their abilities to rescue at various dosages into three functional categories: wild type function, reduced function, and complete loss-of-function. This sensitive and specific biological assay was able to address the nuanced functional significances of IRF6 missense gene variants and overcome many limitations faced by current statistical and computational tools in assigning variant protein function and pathogenicity. Furthermore, it unlocked the possibility for characterizing yet undiscovered human IRF6 missense gene variants from orofacial cleft patients, and illustrated a generalizable functional genomics paradigm in personalized medicine.
Molecular cloning and expression analysis of WRKY transcription factor genes in Salvia miltiorrhiza.

PubMed

Li, Caili; Li, Dongqiao; Shao, Fenjuan; Lu, Shanfa

2015-03-17

WRKY proteins comprise a large family of transcription factors and play important regulatory roles in plant development and defense response. The WRKY gene family in Salvia miltiorrhiza has not been characterized. A total of 61 SmWRKYs were cloned from S. miltiorrhiza. Multiple sequence alignment showed that SmWRKYs could be classified into 3 groups and 8 subgroups. Sequence features, the WRKY domain and other motifs of SmWRKYs are largely conserved with Arabidopsis AtWRKYs. Each group of WRKY domains contains characteristic conserved sequences, and group-specific motifs might contribute to functional divergence of WRKYs. A total of 17 pairs of orthologous SmWRKY and AtWRKY genes and 21 pairs of paralogous SmWRKY genes were identified. Maximum likelihood analysis showed that SmWRKYs had undergone strong selective pressure for adaptive evolution. Functional divergence analysis suggested that the SmWRKY subgroup genes and many paralogous SmWRKY gene pairs were divergent in functions. Various critical amino acids contributing to functional divergence among subgroups were detected. Of the 61 SmWRKYs, 22, 13, 4 and 1 were predominantly expressed in roots, stems, leaves, and flowers, respectively. The other 21 were mainly expressed in at least two of the tissues analyzed. In S. miltiorrhiza roots treated with MeJA, significant changes of gene expression were observed for 49 SmWRKYs, of which 26 were up-regulated, 18 were down-regulated, while the other 5 were either up-regulated or down-regulated at different time-points of treatment. Analysis of published RNA-seq data showed that 42 of the 61 identified SmWRKYs were yeast extract and Ag(+)-responsive. Through a systematic analysis, SmWRKYs potentially involved in tanshinone biosynthesis were predicted. These results provide insights into functional conservation and diversification of SmWRKYs and are useful information for further elucidating SmWRKY functions.
Mining, identification and function analysis of microRNAs and target genes in peanut (Arachis hypogaea L.).

PubMed

Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua

2017-02-01

In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 miRNAs with Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. A poly(A) tailing test verified 55 miRNA sequences, a success rate of 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression levels of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 miRNA-target groups were found to fit the model in which a miRNA negatively regulates the expression of its target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
GOMA: functional enrichment analysis tool based on GO modules

PubMed Central

Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun

2013-01-01

Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results. PMID:23237213
Analysis of global gene expression in Brachypodium distachyon reveals extensive network plasticity in response to abiotic stress.

PubMed

Priest, Henry D; Fox, Samuel E; Rowley, Erik R; Murray, Jessica R; Michael, Todd P; Mockler, Todd C

2014-01-01

Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules that were up-regulated by each abiotic stress fell into diverse and unique Gene Ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
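The weighted co-expression step named in the abstract above can be illustrated with a minimal sketch. It assumes the common unsigned soft-threshold form, where the adjacency between two genes is the absolute Pearson correlation raised to a power beta. The abstract does not state which power or network type the authors used, so `beta=6`, the function names, and the toy expression values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency |cor|^beta for every gene pair.

    expr maps a gene name to its expression values across samples.
    beta=6 is an illustrative choice, not a value taken from the study.
    """
    genes = sorted(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }

# Toy (invented) expression profiles across four samples.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
adj = soft_adjacency(toy_expr)
```

Raising the correlation to a power keeps strongly co-expressed pairs (here g1 and g2) close to 1 while pushing weak correlations (g1 and g3) toward 0, which is what lets modules be cut out of the resulting network by clustering.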
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress

PubMed Central

Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming

2017-01-01

The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family and analyzed the expression of its members in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. A total of 76 pairs of duplicated genes were identified, including genome-wide segmental and tandem duplication events, which led to the expansion of the number of F-box genes. The in silico expression analysis suggested that these genes may be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance. PMID:28417911
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress.

PubMed

Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming

2017-04-12

The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family and analyzed the expression of its members in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. A total of 76 pairs of duplicated genes were identified, including genome-wide segmental and tandem duplication events, which led to the expansion of the number of F-box genes. The in silico expression analysis suggested that these genes may be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance.
A Genome-Wide RNAi Screen for Modifiers of the Circadian Clock in Human Cells

PubMed Central

Zhang, Eric E.; Liu, Andrew C.; Hirota, Tsuyoshi; Miraglia, Loren J.; Welch, Genevieve; Pongsawakul, Pagkapol Y.; Liu, Xianzhong; Atwood, Ann; Huss, Jon W.; Janes, Jeff; Su, Andrew I.; Hogenesch, John B.; Kay, Steve A.

2009-01-01

Summary Two decades of research identified more than a dozen clock genes and defined a biochemical feedback mechanism of circadian oscillator function. To identify additional clock genes and modifiers, we conducted a genome-wide siRNA screen in a human cellular clock model. Knockdown of nearly a thousand genes reduced rhythm amplitude. Potent effects on period length or increased amplitude were less frequent; we found hundreds of these and confirmed them in secondary screens. Characterization of a subset of these genes demonstrated a dosage-dependent effect on oscillator function. Protein interaction network analysis showed that dozens of gene products directly or indirectly associate with known clock components. Pathway analysis revealed these genes are overrepresented for components of insulin and hedgehog signaling, the cell cycle, and the folate metabolism. Coupled with data showing many of these pathways are clock-regulated, we conclude the clock is interconnected with many aspects of cellular function. PMID:19765810
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.

PubMed

Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A

2015-01-01

Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are located far from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. One approach to this analysis is the projection of the amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the exons encoding functional sites in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons or in nearby exons. We observed a tendency for the exons that encode functional sites to cluster; such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This clustering is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of evolutionary limitations on exon shuffling. These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
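The projection described in the abstract above (residue positions mapped onto exon boundaries, plus intron phase) can be illustrated with a small sketch. It is not the authors' software: the function names and the toy exon lengths are hypothetical, and it assumes each exon is given as the length of its coding portion, in transcript order, starting at the first base of the start codon.

```python
from itertools import accumulate

def intron_phases(cds_exon_lengths):
    """Phase of each intron: coding bases upstream of it, modulo 3.

    Phase 0 means the intron falls between two codons (an intercodon gap);
    phases 1 and 2 mean it interrupts a codon after its first or second base.
    """
    ends = list(accumulate(cds_exon_lengths))
    return [end % 3 for end in ends[:-1]]

def exons_for_residue(cds_exon_lengths, residue):
    """0-based indices of the exon(s) holding the codon of a 1-based residue."""
    ends = list(accumulate(cds_exon_lengths))
    if residue < 1 or 3 * residue > ends[-1]:
        raise ValueError("residue lies outside the coding sequence")
    first_base, last_base = 3 * (residue - 1) + 1, 3 * residue
    hits, start = [], 1
    for idx, end in enumerate(ends):
        # The exon covers coding bases start..end; keep it if the codon overlaps.
        if first_base <= end and last_base >= start:
            hits.append(idx)
        start = end + 1
    return hits

# Toy gene: three coding exons of 90, 100 and 110 bases (100 codons in total).
toy_exons = [90, 100, 110]
phases = intron_phases(toy_exons)                       # [0, 1]
site_residues = [30, 31, 64]                            # hypothetical functional site
site_exons = {r: exons_for_residue(toy_exons, r) for r in site_residues}
```

In this toy gene, residues 30 and 31 sit on either side of a phase 0 intron, while the codon of residue 64 is split by a phase 1 intron and therefore maps to two exons. Counting how many distinct exons a site touches is one simple way to express the discontinuity the abstract refers to.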
The human RHOX gene cluster: target genes and functional analysis of gene variants in infertile men.

PubMed

Borgmann, Jennifer; Tüttelmann, Frank; Dworniczak, Bernd; Röpke, Albrecht; Song, Hye-Won; Kliesch, Sabine; Wilkinson, Miles F; Laurentino, Sandra; Gromoll, Jörg

2016-11-15

The X-linked reproductive homeobox (RHOX) gene cluster encodes transcription factors preferentially expressed in reproductive tissues. This gene cluster has important roles in male fertility based on phenotypic defects of Rhox-mutant mice and the finding that aberrant RHOX promoter methylation is strongly associated with abnormal human sperm parameters. However, little is known about the molecular mechanism of RHOX function in humans. Using gene expression profiling, we identified genes regulated by members of the human RHOX gene cluster. Some genes were uniquely regulated by RHOXF1 or RHOXF2/2B, while others were regulated by both of these transcription factors. Several of these regulated genes encode proteins involved in processes relevant to spermatogenesis, e.g. stress protection and cell survival. One of the target genes of RHOXF2/2B is RHOXF1, suggesting cross-regulation to enhance transcriptional responses. The potential role of RHOX in human infertility was addressed by sequencing all RHOX exons in a group of 250 patients with severe oligozoospermia. This revealed two mutations in RHOXF1 (c.515G > A and c.522C > T) and four in RHOXF2/2B (-73C > G, c.202G > A, c.411C > T and c.679G > A), of which only one (c.202G > A) was found in a control group of men with normal sperm concentration. Functional analysis demonstrated that c.202G > A and c.679G > A significantly impaired the ability of RHOXF2/2B to regulate downstream genes. Molecular modelling suggested that these mutations alter RHOXF2/2B protein conformation. By combining clinical data with in vitro functional analysis, we demonstrate how the X-linked RHOX gene cluster may function in normal human spermatogenesis and we provide evidence that it is impaired in human male fertility.
Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.

PubMed

Liu, Ying; Navathe, Shamkant B; Pivoshenko, Alex; Dasigi, Venu G; Dingledine, Ray; Ciliax, Brian J

2006-01-01

One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
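To make the TFIDF weighting compared in the abstract above concrete, here is a minimal sketch of the textbook form, in which a keyword's weight for a gene is its relative frequency in that gene's literature times the log of the inverse fraction of genes mentioning it. The abstract does not give the exact variant the authors used, and the function name, gene names, and keywords below are invented for illustration.

```python
import math
from collections import Counter

def tfidf_weights(gene_keywords):
    """Textbook TF-IDF weights per gene.

    gene_keywords maps a gene name to the list of keyword tokens extracted
    from the abstracts associated with that gene.
    """
    n_genes = len(gene_keywords)
    # Document frequency: number of genes whose keyword list contains the term.
    doc_freq = Counter()
    for words in gene_keywords.values():
        doc_freq.update(set(words))
    weights = {}
    for gene, words in gene_keywords.items():
        term_counts = Counter(words)
        total = len(words)
        weights[gene] = {
            term: (count / total) * math.log(n_genes / doc_freq[term])
            for term, count in term_counts.items()
        }
    return weights

# Toy (invented) keyword lists for three genes.
toy_keywords = {
    "geneA": ["kinase", "synapse", "kinase", "protein"],
    "geneB": ["receptor", "synapse", "protein"],
    "geneC": ["kinase", "protein", "apoptosis"],
}
w = tfidf_weights(toy_keywords)
```

A keyword shared by every gene ("protein" here) receives weight 0 and so cannot drive the clustering, while a keyword unique to one gene ("apoptosis") is weighted most heavily. That property is the usual motivation for preferring TF-IDF over raw keyword counts when building keyword lists for gene clustering.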
Comprehensive analysis of genetic variations in strictly-defined Leber congenital amaurosis with whole-exome sequencing in Chinese

PubMed Central

Wang, Shi-Yuan; Zhang, Qi; Zhang, Xiang; Zhao, Pei-Quan

2016-01-01

AIM: To comprehensively analyze the potential pathogenic genes related to Leber congenital amaurosis (LCA) in Chinese patients. METHODS: LCA subjects and their families were retrospectively collected from 2013 to 2015. First, whole-exome sequencing was performed in patients who had undergone gene mutation screening with nothing found; homozygous sites were then selected, candidate sites were annotated, and pathogenicity analysis was conducted using software tools including Sorting Intolerant from Tolerant (SIFT), PolyPhen-2, MutationAssessor, Condel, and Functional Analysis through Hidden Markov Models (FATHMM). Furthermore, Gene Ontology function and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses of pathogenic genes were performed, followed by co-segregation analysis using Fisher's exact test. Sanger sequencing was used to validate single-nucleotide variations (SNVs). Expanded verification was performed in the remaining patients. RESULTS: In total, 51 LCA families with 53 patients and 24 family members were recruited. A total of 104 SNVs (66 LCA-related genes and 15 co-segregated genes) were submitted for expanded verification. Homozygous mutations of KRT12 and CYP1A1 were simultaneously observed in 3 families. Enrichment analysis showed that the potential pathogenic genes were mainly enriched in functions related to cell adhesion, biological adhesion, retinoid metabolic process, and eye development biological adhesion. Additionally, WFS1 and STAU2 had the highest homozygous frequencies. CONCLUSION: LCA is a highly heterogeneous disease. Mutations in KRT12, CYP1A1, WFS1, and STAU2 may be involved in the development of LCA. PMID:27672588
Separate enrichment analysis of pathways for up- and downregulated genes.

PubMed

Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

2014-03-06

Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
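The power argument in the abstract above can be illustrated with a toy over-representation test. The sketch assumes a one-sided hypergeometric test, a common choice for pathway enrichment that the abstract does not name. All counts are invented, chosen so that the pathway's differentially expressed genes are almost all upregulated.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: background genes, K: genes in the pathway,
    n: genes in the tested list, k: tested genes that are in the pathway.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Invented toy numbers: 1000 background genes, a 50-gene pathway,
# 100 upregulated genes (15 in the pathway), 100 downregulated (1 in the pathway).
N, K = 1000, 50
p_up = hypergeom_pvalue(15, 100, K, N)     # upregulated genes tested alone
p_down = hypergeom_pvalue(1, 100, K, N)    # downregulated genes tested alone
p_all = hypergeom_pvalue(16, 200, K, N)    # all DE genes pooled
```

With these numbers the pooled test dilutes the signal: the 16 pathway hits are spread over 200 genes, whereas the upregulated list concentrates 15 hits in 100 genes, so `p_up` comes out smaller than `p_all`. This is the imbalance effect the abstract describes, reproduced here only on made-up counts.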
Computational analysis of microRNA function in heart development.

PubMed

Liu, Ganqiang; Ding, Min; Chen, Jiajia; Huang, Jinyan; Wang, Haiyun; Jing, Qing; Shen, Bairong

2010-09-01

Emerging evidence suggests that specific spatio-temporal microRNA (miRNA) expression is required for heart development. In recent years, hundreds of miRNAs have been discovered. In contrast, functional annotations are available only for a very small fraction of these regulatory molecules. In order to provide a global perspective for the biologists who study the relationship between differentially expressed miRNAs and heart development, we employed computational analysis to uncover the specific cellular processes and biological pathways targeted by miRNAs in mouse heart development. Here, we utilized Gene Ontology (GO) categories, KEGG Pathway, and GeneGo Pathway Maps as a gene functional annotation system for miRNA target enrichment analysis. The target genes of miRNAs were found to be enriched in functional categories and pathway maps in which miRNAs could play important roles during heart development. Meanwhile, we developed miRHrt (http://sysbio.suda.edu.cn/mirhrt/), a database aiming to provide a comprehensive resource of miRNA function in regulating heart development. These computational analysis results effectively illustrated the correlation of differentially expressed miRNAs with cellular functions and heart development. We hope that the identified novel heart development-associated pathways and the database presented here would facilitate further understanding of the roles and mechanisms of miRNAs in heart development.

Evolutionary analysis of the jacalin-related lectin family genes in 11 fishes.

PubMed

Cao, Jun; Lv, Yueqing

2016-09-01

Jacalin-related lectins are a type of carbohydrate-binding protein that is distributed across a wide variety of organisms and involved in some important biological processes. The evolution of this gene family in fishes is unknown. Here, 47 putative jacalin genes in 11 fish species were identified and divided into 4 groups through phylogenetic analysis. Conserved gene organization and motif distribution existed in each group, suggesting their functional conservation. Some fishes have eleven jacalin genes, while others have only one or none in their genomes, suggesting dynamic changes in the number of jacalin genes during the evolution of fishes. Intragenic recombination played a key role in the evolution of jacalin genes. Synteny analyses of jacalin genes in some fishes implied conserved and dynamic evolution characteristics of this gene family and related genome segments. Moreover, a few functional divergence sites were identified within each group pair. Divergent expression profiles of the zebrafish jacalin genes were further investigated under different stresses. The results provide a foundation for exploring the characterization of the jacalin genes in fishes and will offer insights for additional functional studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
ProbFAST: Probabilistic functional analysis system tool.

PubMed

Silva, Israel T; Vêncio, Ricardo Z N; Oliveira, Thiago Y K; Molfetta, Greice A; Silva, Wilson A

2010-03-30

The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression initially designed for paired analysis. We proposed a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when applied to public expression data, we demonstrated its flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.
ProbFAST: Probabilistic Functional Analysis System Tool

PubMed Central

2010-01-01

Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression initially designed for paired analysis. Results: We proposed a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when applied to public expression data, we demonstrated its flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast. PMID:20353576
Transcriptome-wide analysis of WRKY transcription factors in wheat and their leaf rust responsive expression profiling.

PubMed

Satapathy, Lopamudra; Singh, Dharmendra; Ranjan, Prashant; Kumar, Dhananjay; Kumar, Manish; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal

2014-12-01

WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues and phytohormone signaling, yet little is known about the roles and molecular mechanisms of its members in response to rust diseases in wheat. We identified 100 TaWRKY sequences using the wheat Expressed Sequence Tag database, of which 22 WRKY sequences were novel. The identified proteins were characterized based on their zinc finger motifs, and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants, and in response to stimuli, signaling and defense in pathogen-inoculated plants, their major molecular function being binding to DNA. Tag-based expression analysis of the identified genes revealed differential expression between mock- and Puccinia triticina-inoculated wheat near-isogenic lines. Gene expression analysis was also performed with six rust-related microarray experiments from the Gene Expression Omnibus database. TaWRKY10, 15, 17 and 56 were common to both tag-based and microarray-based differential expression analyses and could represent rust-specific WRKY genes. The results provide insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis, which can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.
Identifying arsenic trioxide (ATO) functions in leukemia cells by using time series gene expression profiles.

PubMed

Yang, Hong; Lin, Shan; Cui, Jingru

2014-02-10

Arsenic trioxide (ATO) is presently the most active single agent in the treatment of acute promyelocytic leukemia (APL). To explore the molecular mechanism of ATO in leukemia cells over time, we adopted a bioinformatics strategy to analyze expression change patterns and changes in transcription regulation modules of time series genes filtered from the Gene Expression Omnibus database (GSE24946). In total, we screened out 1847 time series genes for subsequent analysis. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of these genes showed that oxidative phosphorylation and ribosome were the top 2 significantly enriched pathways. STEM software was employed to compare the patterns of gene expression change with 50 assigned expression patterns. We screened out 7 significantly enriched patterns and 4 tendency charts of time series genes. Gene Ontology analysis showed that the functions of time series genes were mainly distributed in profiles 41, 40, 39 and 38. Seven genes with positive regulation of cell adhesion function were enriched in profile 40 and showed the same pattern as profile 40, first increasing and then decreasing. The transcription module analysis showed that they were mainly involved in the oxidative phosphorylation and ribosome pathways. Overall, our data summarized the gene expression changes in ATO-treated K562-r cell lines over time and suggested that time series genes mainly regulated cell adhesion. Furthermore, our results may provide a theoretical molecular biology basis for treating acute promyelocytic leukemia. Copyright © 2013 Elsevier B.V. All rights reserved.
Systematic Analysis of Zn2Cys6 Transcription Factors Required for Development and Pathogenicity by High-Throughput Gene Knockout in the Rice Blast Fungus

PubMed Central

Huang, Pengyun; Lin, Fucheng

2014-01-01

Because of the great challenges and workload involved in deleting genes on a large scale, the functions of most genes in pathogenic fungi are still unclear. In this study, we developed a high-throughput gene knockout system using a novel yeast-Escherichia-Agrobacterium shuttle vector, pKO1B, in the rice blast fungus Magnaporthe oryzae. Using this method, we deleted 104 fungal-specific Zn2Cys6 transcription factor (TF) genes in M. oryzae. We then analyzed the phenotypes of these mutants with regard to growth, asexual and infection-related development, pathogenesis, and 9 abiotic stresses. The resulting data provide new insights into how this rice pathogen of global significance regulates important traits in the infection cycle through Zn2Cys6 TF genes. A large variation in biological functions of Zn2Cys6 TF genes was observed under the conditions tested. Sixty-one of 104 Zn2Cys6 TF genes were found to be required for fungal development. In-depth analysis of TF genes revealed that TF genes involved in pathogenicity frequently tend to function in multiple development stages, and disclosed many highly conserved but unidentified functional TF genes of importance in the fungal kingdom. We further found that the virulence-required TF genes GPF1 and CNF2 have similar regulation mechanisms in the gene expression involved in pathogenicity. These experimental validations clearly demonstrated the value of a high-throughput gene knockout system in understanding the biological functions of genes on a genome scale in fungi, and provided a solid foundation for elucidating the gene expression network that regulates the development and pathogenicity of M. oryzae. PMID:25299517
Genome-wide identification, characterisation and expression analysis of the MADS-box gene family in Prunus mume.

PubMed

Xu, Zongda; Zhang, Qixiang; Sun, Lidan; Du, Dongliang; Cheng, Tangren; Pan, Huitang; Yang, Weiru; Wang, Jia

2014-10-01

MADS-box genes encode transcription factors that play crucial roles in plant development, especially in flower and fruit development. To gain insight into this gene family in Prunus mume, an important ornamental and fruit plant in East Asia, and to elucidate their roles in flower organ determination and fruit development, we performed a genome-wide identification, characterisation and expression analysis of MADS-box genes in this Rosaceae tree. In this study, 80 MADS-box genes were identified in P. mume and categorised into MIKC, Mα, Mβ, Mγ and Mδ groups based on gene structures and phylogenetic relationships. The MIKC group could be further classified into 12 subfamilies. The FLC subfamily was absent in P. mume, and the six tandemly arranged DAM genes may have undergone a species-specific evolutionary process in P. mume. The MADS-box gene family may have evolved from MIKC genes to Mδ genes to Mα, Mβ and Mγ genes. The expression analysis suggests that P. mume MADS-box genes have diverse functions in P. mume development and that the functions of duplicated genes diverged after the duplication events. In addition to their involvement in the development of female gametophytes, type I genes also play roles in male gametophyte development. In conclusion, this study adds to our understanding of the roles that the MADS-box genes play in flower and fruit development and lays a foundation for selecting candidate genes for functional studies in P. mume and other species. Furthermore, this study also provides a basis for studying the evolution of the MADS-box family.
Proof of Concept Study to Assess Fetal Gene Expression in Amniotic Fluid by NanoArray PCR

PubMed Central

Massingham, Lauren J.; Johnson, Kirby L.; Bianchi, Diana W.; Pei, Shermin; Peter, Inga; Cowan, Janet M.; Tantravahi, Umadevi; Morrison, Tom B.

2011-01-01

Microarray analysis of cell-free RNA in amniotic fluid (AF) supernatant has revealed differential fetal gene expression as a function of gestational age and karyotype. Once informative genes are identified, research moves to a more focused platform such as quantitative reverse transcriptase-PCR. Standardized NanoArray PCR (SNAP) is a recently developed gene profiling technology that enables the measurement of transcripts from samples containing reduced quantities or degraded nucleic acids. We used a previously developed SNAP gene panel as proof of concept to determine whether fetal functional gene expression could be ascertained from AF supernatant. RNA was extracted and converted to cDNA from 19 AF supernatant samples of euploid fetuses between 15 and 20 weeks of gestation, and transcript abundance of 21 genes was measured. Statistically significant differences in expression, as a function of advancing gestational age, were observed for 5 of 21 genes. ANXA5, GUSB, and PPIA showed decreasing gene expression over time, whereas CASC3 and ZNF264 showed increasing gene expression over time. Statistically significantly increased expression of MTOR and STAT2 was seen in female compared with male fetuses. This study demonstrates the feasibility of focused fetal gene expression analysis using SNAP technology. In the future, this technique could be optimized to examine specific genes instrumental in fetal organ system function, which could be a useful addition to prenatal care. PMID:21827969
Microarray analysis of gene expression patterns in the leaf during potato tuberization in the potato somatic hybrid Solanum tuberosum and Solanum etuberosum.

PubMed

Tiwari, Jagesh Kumar; Devi, Sapna; Sundaresha, S; Chandel, Poonam; Ali, Nilofer; Singh, Brajesh; Bhardwaj, Vinay; Singh, Bir Pal

2015-06-01

Genes involved in photoassimilate partitioning and changes in hormonal balance are important for potato tuberization. In the present study, we investigated gene expression patterns in the tuber-bearing potato somatic hybrid (E1-3) and control non-tuberous wild species Solanum etuberosum (Etb) by microarray. Plants were grown under controlled conditions and leaves were collected at eight tuber developmental stages for microarray analysis. A t-test analysis identified a total of 468 genes (94 up-regulated and 374 down-regulated) that were statistically significant (p ≤ 0.05) and differentially expressed in E1-3 and Etb. Gene Ontology (GO) characterization of the 468 genes revealed that 145 were annotated and 323 were of unknown function. Further, these 145 genes were grouped based on GO biological processes followed by molecular function and (or) PGSC description into 15 gene sets, namely (1) transport, (2) metabolic process, (3) biological process, (4) photosynthesis, (5) oxidation-reduction, (6) transcription, (7) translation, (8) binding, (9) protein phosphorylation, (10) protein folding, (11) ubiquitin-dependent protein catabolic process, (12) RNA processing, (13) negative regulation of protein, (14) methylation, and (15) mitosis. RT-PCR analysis of 10 selected highly significant genes (p ≤ 0.01) confirmed the microarray results. Overall, we show that candidate genes induced in leaves of E1-3 were implicated in tuberization processes such as transport, carbohydrate metabolism, phytohormones, and transcription/translation/binding functions. Hence, our results provide an insight into the candidate genes induced in leaf tissues during tuberization in E1-3.
Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects.

PubMed

Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling; Wang, Xianhui; Kang, Le

2017-06-01

The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain-containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. © The Authors 2017. Published by Oxford University Press.
Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects

PubMed Central

Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling

2017-01-01

The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain–containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. PMID:28444351
Gene Expression Correlated with Severe Asthma Characteristics Reveals Heterogeneous Mechanisms of Severe Disease.

PubMed

Modena, Brian D; Bleecker, Eugene R; Busse, William W; Erzurum, Serpil C; Gaston, Benjamin M; Jarjour, Nizar N; Meyers, Deborah A; Milosevic, Jadranka; Tedrow, John R; Wu, Wei; Kaminski, Naftali; Wenzel, Sally E

2017-06-01

Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Identify networks of genes reflective of underlying biological processes that define SA. Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12-21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes.
Gene Expression Correlated with Severe Asthma Characteristics Reveals Heterogeneous Mechanisms of Severe Disease

PubMed Central

Modena, Brian D.; Bleecker, Eugene R.; Busse, William W.; Erzurum, Serpil C.; Gaston, Benjamin M.; Jarjour, Nizar N.; Meyers, Deborah A.; Milosevic, Jadranka; Tedrow, John R.; Wu, Wei; Kaminski, Naftali

2017-01-01

Rationale: Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Objectives: Identify networks of genes reflective of underlying biological processes that define SA. Methods: Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Measurements and Main Results: Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12–21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. 
Conclusions: In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes. PMID:27984699
WGCNA: an R package for weighted correlation network analysis.

PubMed

Langfelder, Peter; Horvath, Steve

2008-12-29

Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. 
The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
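Illustrative sketch (not taken from the WGCNA package itself): the core idea of a weighted correlation network can be shown in a few lines of plain Python. The sketch assumes the commonly described unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, with beta = 6 as an assumed default; the function names `pearson`, `soft_threshold_adjacency` and `connectivity` are hypothetical and are not the R package's API.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    `profiles` maps a gene name to its expression values across samples.
    beta = 6 is an assumption made for this sketch, not a recommendation.
    """
    genes = list(profiles)
    adjacency = {gene: {} for gene in genes}
    for i, gene in enumerate(genes):
        for other in genes[i + 1:]:
            weight = abs(pearson(profiles[gene], profiles[other])) ** beta
            adjacency[gene][other] = weight
            adjacency[other][gene] = weight
    return adjacency


def connectivity(adjacency):
    """Sum of connection weights per gene; high values suggest hub genes."""
    return {gene: sum(nbrs.values()) for gene, nbrs in adjacency.items()}


if __name__ == "__main__":
    toy = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [4, 3, 2, 1],
        "geneD": [1, 3, 2, 4],
    }
    print(connectivity(soft_threshold_adjacency(toy)))
```

Raising the correlation to a power keeps every gene pair in the network but shrinks weak correlations towards zero, which is what distinguishes a weighted network from one built with a hard correlation cut-off. Module detection, eigengenes and the other steps listed in the abstract are not covered here.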
WGCNA: an R package for weighted correlation network analysis

PubMed Central

Langfelder, Peter; Horvath, Steve

2008-01-01

Background: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion: The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA. PMID:19114008
Structural, functional and evolutionary characterization of major drought transcription factors families in maize

NASA Astrophysics Data System (ADS)

Mittal, Shikha; Banduni, Pooja; Mallikarjuna, Mallana G.; Rao, Atmakuri R.; Jain, Prashant A.; Dash, Prasanta K.; Thirunavukkarasu, Nepolean

2018-05-01

Drought is one of the major threats to maize production. In order to improve production and to breed tolerant hybrids, it is important to understand the genes and regulatory mechanisms active during drought stress. Transcription factors (TFs) play a major role in gene regulation, and many TFs have been identified in response to drought stress. In our experiment, a set of 15 major TF families comprising 1436 genes was structurally and functionally characterized using in-silico tools and a gene expression assay. All 1436 genes were mapped on the 10 chromosomes of maize. The functional annotation indicated the involvement of these genes in ABA signaling, ROS scavenging, photosynthesis, stomatal regulation, and sucrose metabolism. Duplication was identified as the primary force in the divergence and expansion of TF families. Phylogenetic relationships were inferred individually for each TF family as well as for the combined TF families. Phylogenetic analysis grouped the TF family genes into TF-specific and mixed groups. Phylogenetic analysis of genes belonging to various TF families suggested that the origin of TFs occurred in the lineage of maize evolution. Gene structure analysis revealed that more genes were intron-rich than intronless. Drought-responsive CREs such as ABREA, ABREB, DRE1 and DRECRTCOREAT were identified. Expression and interaction analyses identified a leaf-specific bZIP TF, GRMZM2G140355, as a potential contributor toward drought tolerance in maize. We also analyzed the protein-protein interaction network of 269 drought-responsive genes belonging to different drought-related TFs. The information generated on structural and functional characteristics, expression and interaction of the drought-related TF families will be useful to decipher the drought tolerance mechanisms and to derive drought-tolerant genotypes in maize.
Functional Interaction Network Construction and Analysis for Disease Discovery.

PubMed

Wu, Guanming; Haw, Robin

2017-01-01

Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of the data through network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures by which this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis for a list of genes.
Functional network analysis of genes differentially expressed during xylogenesis in soc1ful woody Arabidopsis plants.

PubMed

Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic

2016-06-01

Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Using the gene ontology for microarray data mining: a comparison of methods and application to age effects in human prefrontal cortex.

PubMed

Pavlidis, Paul; Qin, Jie; Arango, Victoria; Mann, John J; Sibille, Etienne

2004-06-01

One of the challenges in the analysis of gene expression data is placing the results in the context of other data available about genes and their relationships to each other. Here, we approach this problem in the study of gene expression changes associated with age in two areas of the human prefrontal cortex, comparing two computational methods. The first method, "overrepresentation analysis" (ORA), is based on statistically evaluating the fraction of genes in a particular gene ontology class found among the set of genes showing age-related changes in expression. The second method, "functional class scoring" (FCS), examines the statistical distribution of individual gene scores among all genes in the gene ontology class and does not involve an initial gene selection step. We find that FCS yields more consistent results than ORA, and the results of ORA depended strongly on the gene selection threshold. Our findings highlight the utility of functional class scoring for the analysis of complex expression data sets and emphasize the advantage of considering all available genomic information rather than sets of genes that pass a predetermined "threshold of significance."
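Illustrative sketch (an assumption-laden example, not the authors' code): overrepresentation analysis is often implemented as an upper-tail hypergeometric test, which is the variant assumed here. Because the selected gene set changes with the significance cut-off, the result for a single gene ontology class can be recomputed at several thresholds. The function names and the toy data are hypothetical.

```python
from math import comb


def ora_p_value(n_genes, n_in_class, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_genes:    all genes measured
    n_in_class: measured genes annotated to the ontology class
    n_selected: genes passing the selection threshold
    n_overlap:  selected genes that are also in the class
    """
    upper = min(n_in_class, n_selected)
    total = comb(n_genes, n_selected)
    tail = sum(
        comb(n_in_class, k) * comb(n_genes - n_in_class, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def select_genes(gene_p_values, threshold):
    """Gene selection step: keep genes whose p-value passes the cut-off."""
    return {gene for gene, p in gene_p_values.items() if p <= threshold}


def ora_for_class(gene_p_values, go_class, threshold):
    """Run ORA for one ontology class at a given selection threshold."""
    measured_class = go_class & set(gene_p_values)
    selected = select_genes(gene_p_values, threshold)
    return ora_p_value(
        len(gene_p_values),
        len(measured_class),
        len(selected),
        len(selected & measured_class),
    )


if __name__ == "__main__":
    # Toy per-gene p-values for an age effect (made-up numbers).
    p_values = {"g1": 0.001, "g2": 0.02, "g3": 0.04,
                "g4": 0.2, "g5": 0.5, "g6": 0.8}
    go_class = {"g1", "g2", "g4"}
    for cutoff in (0.01, 0.05, 0.25):
        print(cutoff, ora_for_class(p_values, go_class, cutoff))
```

Functional class scoring, by contrast, would skip `select_genes` altogether and summarise the scores of all genes in the class; that variant is not sketched here.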
GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies

PubMed Central

Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay

2004-01-01

Background: Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results: We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion: GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175

Genome-wide profiling of 24 hr diel rhythmicity in the water flea, Daphnia pulex: network analysis reveals rhythmic gene expression and enhances functional gene annotation.

PubMed

Rund, Samuel S C; Yoo, Boyoung; Alam, Camille; Green, Taryn; Stephens, Melissa T; Zeng, Erliang; George, Gary F; Sheppard, Aaron D; Duffield, Giles E; Milenković, Tijana; Pfrender, Michael E

2016-08-18

Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. 
Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel functional annotation.
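Illustrative sketch (the abstract does not specify the cosine-fitting algorithm, so this is an assumption): one common way to test a time series for a 24 h rhythm is a least-squares fit of y(t) = M + a·cos(2πt/T) + b·sin(2πt/T) with the period T fixed at 24 h. The amplitude is then sqrt(a² + b²) and the peak time follows from atan2(b, a). All function names are hypothetical.

```python
from math import atan2, cos, hypot, pi, sin


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - known) / a[r][r]
    return x


def fit_cosine(times, values, period=24.0):
    """Least-squares cosine fit with a fixed period.

    Returns (mesor, amplitude, peak_time): the mean level, the
    half peak-to-trough height, and the time of the fitted maximum.
    """
    rows = [
        (1.0, cos(2 * pi * t / period), sin(2 * pi * t / period))
        for t in times
    ]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, a, b = solve3(xtx, xty)
    amplitude = hypot(a, b)
    peak_time = (atan2(b, a) * period / (2 * pi)) % period
    return mesor, amplitude, peak_time


if __name__ == "__main__":
    # Assumed sampling every 4 h over 44 h; synthetic gene peaking at 6 h.
    times = list(range(0, 44, 4))
    values = [5 + 2 * cos(2 * pi * (t - 6) / 24) for t in times]
    print(fit_cosine(times, values))
```

A real analysis would add a significance test for the fitted amplitude and correct for multiple testing across genes; neither is shown.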
An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

PubMed

Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

2014-06-01

In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. 
Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
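Illustrative sketch (not the authors' implementation): random walk with restart, one of the classical prioritization algorithms named above, can be written as the iteration p ← (1 − r)·W·p + r·p0, where p0 spreads probability evenly over the known disease genes and W moves probability along network edges. The restart probability r = 0.3, the unweighted graph, and the function name are assumptions made for this example.

```python
def random_walk_with_restart(graph, seeds, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visit probability.

    graph: dict mapping each node to a set of neighbouring nodes
           (assumed undirected: every edge is listed in both directions).
    seeds: known disease genes at which the walker restarts.
    """
    nodes = sorted(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # Restart step: a fraction of the walker jumps back to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        # Walk step: remaining probability is split among neighbours.
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                updated[n] += (1.0 - restart) * scores[n]
                continue
            share = (1.0 - restart) * scores[n] / len(neighbours)
            for m in neighbours:
                updated[m] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


if __name__ == "__main__":
    # Toy chain network: seed - b - c - d.
    toy = {"seed": {"b"}, "b": {"seed", "c"}, "c": {"b", "d"}, "d": {"c"}}
    scores = random_walk_with_restart(toy, {"seed"})
    candidates = sorted((g for g in toy if g != "seed"),
                        key=scores.get, reverse=True)
    print(candidates)
```

Genes that are not yet annotated to the disease are ranked by their score, so candidates close to many seed genes rise to the top. Network integration and the kernelized score functions described in the abstract are beyond this sketch.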
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

PubMed

Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

2013-08-01

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Global analysis of gene expression in response to L-Cysteine deprivation in the anaerobic protozoan parasite Entamoeba histolytica

PubMed Central

2011-01-01

Background: Entamoeba histolytica, an enteric protozoan parasite, causes amebic colitis and extraintestinal abscesses in millions of inhabitants of endemic areas. E. histolytica completely lacks glutathione metabolism but possesses L-cysteine as the principal low molecular weight thiol. L-Cysteine is essential for the structure, stability, and various protein functions, including catalysis, electron transfer, redox regulation, nitrogen fixation, and sensing for regulatory processes. Recently, we demonstrated that in E. histolytica, L-cysteine regulates various metabolic pathways including energy, amino acid, and phospholipid metabolism. Results: In this study, employing custom-made Affymetrix microarrays, we performed time course (3, 6, 12, 24, and 48 h) gene expression analysis upon L-cysteine deprivation. We identified that out of 9,327 genes represented on the array, 290 genes encoding proteins with functions in metabolism, signalling, DNA/RNA regulation, electron transport, stress response, membrane transport, vesicular trafficking/secretion, and cytoskeleton were differentially expressed (≥3 fold) at one or more time points upon L-cysteine deprivation. Approximately 60% of these modulated genes encoded proteins of no known function and were annotated as hypothetical proteins. We also attempted further functional analysis of some of the genes most highly modulated by L-cysteine depletion. Conclusions: To our surprise, L-cysteine depletion caused only limited changes in the expression of genes involved in sulfur-containing amino acid metabolism and oxidative stress defense. In contrast, we observed significant changes in the expression of several genes encoding iron sulfur flavoproteins, a major facilitator super-family transporter, regulator of nonsense transcripts, NADPH-dependent oxido-reductase, short chain dehydrogenase, acetyltransferases, and various other genes involved in diverse cellular functions. 
This study represents the first genome-wide analysis of transcriptional changes induced by L-cysteine deprivation in protozoan parasites, and in eukaryotic organisms where L-cysteine represents the major intracellular thiol. PMID:21627801
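The selection rule described in the abstract above (differential expression of at least 3 fold at one or more time points) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the data layout, the use of a time-zero baseline, and the symmetric treatment of up- and down-regulation are all assumptions made here for the example.

```python
from typing import Dict, List


def differentially_expressed(
    expression: Dict[str, Dict[int, float]],
    baseline: Dict[str, float],
    min_fold: float = 3.0,
) -> List[str]:
    """Return genes that change by at least `min_fold` at any time point.

    expression: gene -> {hours of deprivation: linear-scale signal} (hypothetical layout)
    baseline:   gene -> linear-scale signal in the undeprived control (assumed reference)
    """
    selected = []
    for gene, series in expression.items():
        base = baseline[gene]
        if base <= 0:
            continue  # a ratio is undefined without a positive baseline
        for value in series.values():
            if value <= 0:
                continue
            ratio = value / base
            # Assumption: both induction and repression count as "modulated".
            if ratio >= min_fold or ratio <= 1.0 / min_fold:
                selected.append(gene)
                break  # one qualifying time point is enough
    return selected


# Toy values, invented for illustration only.
toy_expression = {
    "gene_a": {3: 110.0, 6: 350.0},   # induced 3.5 fold at 6 h
    "gene_b": {3: 90.0, 6: 120.0},    # never reaches 3 fold
    "gene_c": {3: 20.0, 6: 85.0},     # repressed 4.5 fold at 3 h
}
toy_baseline = {"gene_a": 100.0, "gene_b": 100.0, "gene_c": 90.0}
hits = differentially_expressed(toy_expression, toy_baseline)
```

A real microarray analysis would normally add normalization and a statistical test on replicates before applying such a threshold; the sketch only shows the "any time point" logic.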
Identification and Functional Analysis of Healing Regulators in Drosophila

PubMed Central

Álvarez-Fernández, Carmen; Tamirisa, Srividya; Prada, Federico; Chernomoretz, Ariel; Podhajcer, Osvaldo; Blanco, Enrique; Martín-Blanco, Enrique

2015-01-01

Wound healing is an essential homeostatic mechanism that maintains the epithelial barrier integrity after tissue damage. Although we know the overall steps in wound healing, many of the underlying molecular mechanisms remain unclear. Genetically amenable systems, such as wound healing in Drosophila imaginal discs, do not model all aspects of the repair process. However, they do allow the less understood aspects of the healing response to be explored, e.g., which signal(s) are responsible for initiating tissue remodeling? How is sealing of the epithelia achieved? Or, what inhibitory cues cancel the healing machinery upon completion? Answering these and other questions first requires the identification and functional analysis of wound-specific genes. A variety of microarray analyses in mice and humans have identified characteristic profiles of gene expression at the wound site; however, very few functional studies of healing regulation have been carried out. We developed an experimentally controlled method that is healing-permissive and that allows live imaging and biochemical analysis of cultured imaginal discs. We performed comparative genome-wide profiling between Drosophila imaginal cells actively involved in healing versus their non-engaged siblings. Sets of potential wound-specific genes were subsequently identified. Importantly, besides identifying and categorizing new genes, we functionally tested many of their gene products by genetic interference and overexpression in healing assays. This non-saturated analysis defines a relevant set of genes whose changes in expression level are functionally significant for proper tissue repair. Amongst these we identified the TCP1 chaperonin complex as a key regulator of the actin cytoskeleton essential for the wound healing response. There is promise that our newly identified wound-healing genes will guide future work in the more complex mammalian wound healing response. PMID:25647511
Use of transcriptome sequencing to understand the pistillate flowering in hickory (Carya cathayensis Sarg.).

PubMed

Huang, You-Jun; Liu, Li-Li; Huang, Jian-Qin; Wang, Zheng-Jia; Chen, Fang-Fang; Zhang, Qi-Xiang; Zheng, Bing-Song; Chen, Ming

2013-10-10

Unlike herbaceous plants, woody plants undergo a long vegetative stage before achieving floral transition. They then become seasonal plants, flowering annually. In this study, a preliminary model of gene regulation for seasonal pistillate flowering in hickory (Carya cathayensis) was proposed. The genome-wide dynamic transcriptome was characterized via a joint approach of RNA sequencing and microarray analysis. Differential transcript abundance analysis uncovered the dynamic transcript abundance patterns of flowering-correlated genes and their major functions based on Gene Ontology (GO) analysis. To explore the pistillate flowering mechanism in hickory, a comprehensive flowering gene regulatory network based on Arabidopsis thaliana was constructed by additional literature mining. A total of 114 putative flowering or floral genes, including 31 with differential transcript abundance, were identified in hickory. The locations, functions and dynamic transcript abundances were analyzed in the gene regulatory networks. A genome-wide co-expression network for the putative flowering or floral genes shows three flowering regulatory modules corresponding to response to light abiotic stimulus, cold stress, and reproductive development process, respectively. In total, 27 potential flowering or floral genes were recruited, which help to better understand the hickory-specific seasonal flowering mechanism. The flowering event of the pistillate flower bud in hickory is triggered synchronously by several pathways, including the photoperiod, autonomous, vernalization, gibberellin, and sucrose pathways. In total, 27 potential flowering or floral genes were recruited from the genome-wide co-expression network function module analysis. Moreover, the analysis provides a potential FLC-like gene-based vernalization pathway and an 'AC' model for pistillate flower development in hickory.
This work provides an available framework for pistillate flower development in hickory, which is significant for insight into regulation of flowering and floral development of woody plants.
Use of transcriptome sequencing to understand the pistillate flowering in hickory (Carya cathayensis Sarg.)

PubMed Central

2013-01-01

Background: Unlike herbaceous plants, woody plants undergo a long vegetative stage before achieving floral transition. They then become seasonal plants, flowering annually. In this study, a preliminary model of gene regulation for seasonal pistillate flowering in hickory (Carya cathayensis) was proposed. The genome-wide dynamic transcriptome was characterized via a joint approach of RNA sequencing and microarray analysis. Results: Differential transcript abundance analysis uncovered the dynamic transcript abundance patterns of flowering-correlated genes and their major functions based on Gene Ontology (GO) analysis. To explore the pistillate flowering mechanism in hickory, a comprehensive flowering gene regulatory network based on Arabidopsis thaliana was constructed by additional literature mining. A total of 114 putative flowering or floral genes, including 31 with differential transcript abundance, were identified in hickory. The locations, functions and dynamic transcript abundances were analyzed in the gene regulatory networks. A genome-wide co-expression network for the putative flowering or floral genes shows three flowering regulatory modules corresponding to response to light abiotic stimulus, cold stress, and reproductive development process, respectively. In total, 27 potential flowering or floral genes were recruited, which help to better understand the hickory-specific seasonal flowering mechanism. Conclusions: The flowering event of the pistillate flower bud in hickory is triggered synchronously by several pathways, including the photoperiod, autonomous, vernalization, gibberellin, and sucrose pathways. In total, 27 potential flowering or floral genes were recruited from the genome-wide co-expression network function module analysis. Moreover, the analysis provides a potential FLC-like gene-based vernalization pathway and an 'AC' model for pistillate flower development in hickory.
This work provides an available framework for pistillate flower development in hickory, which is significant for insight into regulation of flowering and floral development of woody plants. PMID:24106755
Dual Analysis of the Murine Cytomegalovirus and Host Cell Transcriptomes Reveal New Aspects of the Virus-Host Cell Interface

PubMed Central

Juranic Lisnic, Vanda; Babic Cac, Marina; Lisnic, Berislav; Trsan, Tihana; Mefferd, Adam; Das Mukhopadhyay, Chitrangada; Cook, Charles H.; Jonjic, Stipan; Trgovcich, Joanne

2013-01-01

Major gaps in our knowledge of pathogen genes and how these gene products interact with host gene products to cause disease represent a major obstacle to progress in vaccine and antiviral drug development for the herpesviruses. To begin to bridge these gaps, we conducted a dual analysis of Murine Cytomegalovirus (MCMV) and host cell transcriptomes during lytic infection. We analyzed the MCMV transcriptome during lytic infection using both classical cDNA cloning and sequencing of viral transcripts and next generation sequencing of transcripts (RNA-Seq). We also investigated the host transcriptome using RNA-Seq combined with differential gene expression analysis, biological pathway analysis, and gene ontology analysis. We identify numerous novel spliced and unspliced transcripts of MCMV. Unexpectedly, the most abundantly transcribed viral genes are of unknown function. We found that the most abundant viral transcript, recently identified as a noncoding RNA regulating cellular microRNAs, also codes for a novel protein. To our knowledge, this is the first viral transcript that functions both as a noncoding RNA and an mRNA. We also report that lytic infection elicits a profound cellular response in fibroblasts. Highly upregulated and induced host genes included those involved in inflammation and immunity, but also many unexpected transcription factors and host genes related to development and differentiation. Many top downregulated and repressed genes are associated with functions whose roles in infection are obscure, including host long intergenic noncoding RNAs, antisense RNAs or small nucleolar RNAs. Correspondingly, many differentially expressed genes cluster in biological pathways that may shed new light on cytomegalovirus pathogenesis. Together, these findings provide new insights into the molecular warfare at the virus-host interface and suggest new areas of research to advance the understanding and treatment of cytomegalovirus-associated diseases. PMID:24086132
Functional and bioinformatics analysis of an exopolysaccharide-related gene (epsN) from Lactobacillus kefiranofaciens ZW3.

PubMed

Wang, Jingrui; Tang, Wei; Zheng, Yongna; Xing, Zhuqing; Wang, Yanping

2016-09-01

A novel lactic acid bacterium strain, Lactobacillus kefiranofaciens ZW3, exhibits high production of exopolysaccharide (EPS). The epsN gene, located in the eps gene cluster of this strain, is associated with EPS biosynthesis. Bioinformatics analysis of this gene was performed. The conserved domain analysis showed that the EpsN protein contained MATE-Wzx-like domains. The epsN gene was then amplified to construct the recombinant expression vector pMG36e-epsN. The results showed that the EPS yields of the recombinants were significantly improved. From the yields of EPS and intracellular polysaccharide, it was concluded that the epsN gene could play a Wzx flippase role in EPS biosynthesis. This is the first demonstration of the effect of EpsN on L. kefiranofaciens EPS biosynthesis, and it further supports the functional role of the protein.
A Modified ABCDE Model of Flowering in Orchids Based on Gene Expression Profiling Studies of the Moth Orchid Phalaenopsis aphrodite

PubMed Central

Lee, Ann-Ying; Chen, Chun-Yi; Chang, Yao-Chien Alex; Chao, Ya-Ting; Shih, Ming-Che

2013-01-01

Previously we developed genomic resources for orchids, including transcriptomic analyses using next-generation sequencing techniques and construction of a web-based orchid genomic database. Here, we report a modified molecular model of flower development in the Orchidaceae based on functional analysis of gene expression profiles in Phalaenopsis aphrodite (a moth orchid) that revealed novel roles for the transcription factors involved in floral organ pattern formation. Phalaenopsis orchid floral organ-specific genes were identified by microarray analysis. Several critical transcription factors, including AP3, PI, AP1 and AGL6, displayed distinct spatial distribution patterns. Phylogenetic analysis of orchid MADS box genes was conducted to infer the evolutionary relationship among floral organ-specific genes. The results suggest that duplication of MADS box genes in orchids may have resulted in their gaining novel functions during evolution. Based on these analyses, a modified model of orchid flowering was proposed. Comparison of the expression profiles of flowers of a peloric mutant and wild-type Phalaenopsis orchid further identified genes associated with lip morphology and peloric effects. Large-scale investigation of gene expression profiles revealed that homeotic genes from the ABCDE model of flower development classes A and B in the Phalaenopsis orchid have novel functions due to evolutionary diversification, and display differential expression patterns. PMID:24265826
Tissue-Specific Transcriptomic Profiling of Sorghum propinquum using a Rice Genome Array

PubMed Central

Zhang, Ting; Zhao, Xiuqin; Huang, Liyu; Liu, Xiaoyue; Zong, Ying; Zhu, Linghua; Yang, Daichang; Fu, Binying

2013-01-01

Sorghum (Sorghum bicolor) is one of the world's most important cereal crops. S. propinquum is a perennial wild relative of S. bicolor with well-developed rhizomes. Functional genomics analysis of S. propinquum, especially with respect to molecular mechanisms related to rhizome growth and development, can contribute to the development of more sustainable grain, forage, and bioenergy cropping systems. In this study, we used a whole rice genome oligonucleotide microarray to obtain tissue-specific gene expression profiles of S. propinquum with special emphasis on rhizome development. A total of 548 tissue-enriched genes were detected, including 31 and 114 unique genes that were expressed predominantly in the rhizome tips (RT) and internodes (RI), respectively. Further GO analysis indicated that the functions of these tissue-enriched genes corresponded to their characteristic biological processes. A few distinct cis-elements, including ABA-responsive RY repeat CATGCA, sugar-repressive TTATCC, and GA-responsive TAACAA, were found to be prevalent in RT-enriched genes, implying an important role in rhizome growth and development. Comprehensive comparative analysis of these rhizome-enriched genes and rhizome-specific genes previously identified in Oryza longistaminata and S. propinquum indicated that phytohormones, including ABA, GA, and SA, are key regulators of gene expression during rhizome development. Co-localization of rhizome-enriched genes with rhizome-related QTLs in rice and sorghum generated functional candidates for future cloning of genes associated with rhizome growth and development. PMID:23536906
Characteristics of functional enrichment and gene expression level of human putative transcriptional target genes.

PubMed

Osato, Naoki

2018-01-19

Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. 
Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characteristics of human putative transcriptional target genes would be useful for examining the criteria of enhancer-promoter assignments and for predicting novel mechanisms and factors, such as DNA binding proteins and DNA sequences, of enhancer-promoter interactions.
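The normalization described in the abstract above (the number of functional enrichments expressed as a ratio of the total number of transcriptional target genes) can be illustrated with a short sketch. The function name and the example counts are hypothetical and chosen only to show why a ratio is needed when conditions yield different numbers of target genes.

```python
def normalized_enrichment_count(n_enrichments: int, n_target_genes: int) -> float:
    """Number of functional enrichments per transcriptional target gene."""
    if n_target_genes <= 0:
        raise ValueError("n_target_genes must be positive")
    return n_enrichments / n_target_genes


# Invented counts for two enhancer-promoter assignment criteria.
# Condition B finds more enrichments in absolute terms, but fewer per target gene.
condition_a = normalized_enrichment_count(n_enrichments=120, n_target_genes=2000)
condition_b = normalized_enrichment_count(n_enrichments=150, n_target_genes=5000)
```

With these toy numbers the raw counts would rank condition B first, while the normalized values rank condition A first, which is the kind of comparison the ratio is meant to make fair.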
Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks

PubMed Central

Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

2017-01-01

Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested a correlation between COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network, as assessed by leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD. PMID:29262568
Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks.

PubMed

Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

2017-11-28

Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested a correlation between COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network, as assessed by leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.
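The leave-one-out cross-validation mentioned in the COPD abstracts above can be sketched roughly as follows. The abstract does not describe the authors' scoring function, so the score used here (the sum of edge weights linking a gene to the remaining known disease genes) is a stand-in chosen purely for illustration, and all names and the toy network are hypothetical.

```python
from typing import Dict, List, Set

# Weighted network as an adjacency map: node -> {neighbour: edge weight}.
Graph = Dict[str, Dict[str, float]]


def score(graph: Graph, gene: str, seeds: Set[str]) -> float:
    """Stand-in score: total weight of edges from `gene` to seed genes."""
    return sum(w for nb, w in graph.get(gene, {}).items() if nb in seeds)


def leave_one_out_ranks(
    graph: Graph, disease_genes: Set[str], candidates: List[str]
) -> Dict[str, int]:
    """Hold out each known disease gene in turn and report its rank.

    The held-out gene is ranked among the candidate genes using only the
    remaining disease genes as seeds. Rank 1 is best. Ties are broken
    alphabetically because Python's sort is stable.
    """
    ranks = {}
    for held_out in sorted(disease_genes):
        seeds = disease_genes - {held_out}
        pool = sorted(set(candidates) | {held_out})
        ranked = sorted(pool, key=lambda g: -score(graph, g, seeds))
        ranks[held_out] = ranked.index(held_out) + 1
    return ranks


# Toy network, invented for illustration only.
toy_graph: Graph = {
    "A": {"B": 1.0, "C": 0.5},
    "B": {"A": 1.0, "C": 1.0, "X": 0.2},
    "C": {"A": 0.5, "B": 1.0},
    "D": {},
    "X": {"B": 0.2},
    "Y": {},
}
toy_ranks = leave_one_out_ranks(toy_graph, {"A", "B", "C", "D"}, ["X", "Y"])
```

A method that recovers held-out disease genes near the top of the ranking more often than a competing method would be judged to perform better under this kind of validation.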
Microarray characterization of gene expression changes in blood during acute ethanol exposure

PubMed Central

2013-01-01

Background: As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples from a five-time point study of subjects administered ethanol orally, followed by breathalyzer analysis to monitor blood alcohol concentration (BAC), to discover significant gene expression changes in response to the ethanol exposure. Methods: Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data were summarized and normalized in a pipeline fashion, and the results were evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm that relative transcript levels across time matched those detected in microarrays. Results: Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by qRT-PCR. The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters.
These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway. Conclusions The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance. PMID:23883607
Global transcriptomic analysis of model human cell lines exposed to surface-modified gold nanoparticles: the effect of surface chemistry

NASA Astrophysics Data System (ADS)

Grzincic, E. M.; Yang, J. A.; Drnevich, J.; Falagan-Lotsch, P.; Murphy, C. J.

2015-01-01

Gold nanoparticles (Au NPs) are attractive for biomedical applications not only for their remarkable physical properties, but also for the ease with which their surface chemistry can be manipulated. Many applications involve functionalization of the Au NP surface in order to improve biocompatibility, attach targeting ligands or carry drugs. However, changes in cells exposed to Au NPs of different surface chemistries have been observed, and little is known about how Au NPs and their surface coatings may impact cellular gene expression. The gene expression of two model human cell lines, human dermal fibroblasts (HDF) and prostate cancer cells (PC3), was interrogated by microarray analysis of over 14 000 human genes. The cell lines were exposed to four differently functionalized Au NPs: citrate, poly(allylamine hydrochloride) (PAH), and lipid coatings combined with alkanethiols or PAH. Gene functional annotation categories and weighted gene correlation network analysis were used in order to connect gene expression changes to common cellular functions and to elucidate expression patterns between Au NP samples. Coated Au NPs affect genes implicated in proliferation, angiogenesis, and metabolism in HDF cells, and inflammation, angiogenesis, proliferation apoptosis regulation, survival and invasion in PC3 cells. Subtle changes in surface chemistry, such as the initial net charge, lability of the ligand, and underlying layers greatly influence the degree of expression change and the type of cellular pathway affected. Electronic supplementary information (ESI) available: UV-Vis spectra of Au NPs, the most significantly changed genes of HDF cells after Au NP incubation under GO accession number GO:0007049 "cell cycle", detailed information about the primer/probe sets used for RT-PCR validation of results. See DOI: 10.1039/c4nr05166a
Transcriptomic meta-analysis identifies gene expression characteristics in various samples of HIV-infected patients with nonprogressive disease.

PubMed

Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong

2017-09-12

A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38, lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment in DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, the number of downregulated DEGs was more than the upregulated DEGs, and almost all genes were downregulated DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle in the three types of samples.
Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new insights in the understanding of HIV pathogenesis and developing strategies to delay HIV disease progression.
Genome-wide analysis of the WRKY gene family in cotton.

PubMed

Dou, Lingling; Zhang, Xiaohong; Pang, Chaoyou; Song, Meizhen; Wei, Hengling; Fan, Shuli; Yu, Shuxun

2014-12-01

WRKY proteins are major transcription factors involved in regulating plant growth and development. Although many studies have focused on the functional identification of WRKY genes, our knowledge concerning many areas of WRKY gene biology is limited. For example, in cotton, the phylogenetic characteristics, global expression patterns, molecular mechanisms regulating expression, and target genes/pathways of WRKY genes are poorly characterized. Therefore, in this study, we present a genome-wide analysis of the WRKY gene family in cotton (Gossypium raimondii and Gossypium hirsutum). We identified 116 WRKY genes in G. raimondii from the completed genome sequence, and we cloned 102 WRKY genes in G. hirsutum. Chromosomal location analysis indicated that WRKY genes in G. raimondii evolved mainly from segmental duplication followed by tandem amplifications. Phylogenetic analysis of alga, bryophyte, lycophyta, monocot and eudicot WRKY domains revealed family member expansion with increasing complexity of the plant body. Microarray, expression profiling and qRT-PCR data revealed that WRKY genes in G. hirsutum may regulate the development of fibers, anthers, tissues (roots, stems, leaves and embryos), and are involved in the response to stresses. Expression analysis showed that most group II and III GhWRKY genes are highly expressed under diverse stresses. Group I members, representing the ancestral form, seem to be insensitive to abiotic stress, with low expression divergence. Our results indicate that cotton WRKY genes might have evolved by adaptive duplication, leading to sensitivity to diverse stresses. This study provides fundamental information to inform further analysis and understanding of WRKY gene functions in cotton species.
Comparative expression profiling reveals gene functions in female meiosis and gametophyte development in Arabidopsis.

PubMed

Zhao, Lihua; He, Jiangman; Cai, Hanyang; Lin, Haiyan; Li, Yanqiang; Liu, Renyi; Yang, Zhenbiao; Qin, Yuan

2014-11-01

Megasporogenesis is essential for female fertility, and requires the accomplishment of meiosis and the formation of functional megaspores. The inaccessibility and low abundance of female meiocytes make it particularly difficult to elucidate the molecular basis underlying megasporogenesis. We used high-throughput tag-sequencing analysis to identify genes expressed in female meiocytes (FMs) by comparing gene expression profiles from wild-type ovules undergoing megasporogenesis with those from the spl mutant ovules, which lack megasporogenesis. A total of 862 genes were identified as FMs, with levels that are consistently reduced in spl ovules in two biological replicates. Fluorescence-assisted cell sorting followed by RNA-seq analysis of DMC1:GFP-labeled female meiocytes confirmed that 90% of the FMs are indeed detected in the female meiocyte protoplast profiling. We performed reverse genetic analysis of 120 candidate genes and identified four FM genes with a function in female meiosis progression in Arabidopsis. We further revealed that KLU, a putative cytochrome P450 monooxygenase, is involved in chromosome pairing during female meiosis, most likely by affecting the normal expression pattern of DMC1 in ovules during female meiosis. Our studies provide valuable information for functional genomic analyses of plant germline development as well as insights into meiosis. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Comparative and Evolutionary Analysis of the HES/HEY Gene Family Reveal Exon/Intron Loss and Teleost Specific Duplication Events

PubMed Central

Ma, Zhaowu; Zhou, Yang; Abbood, Nibras Najm; Liu, Jianfeng; Su, Li; Jia, Haibo; Guo, An-Yuan

2012-01-01

Background: HES/HEY genes encode a family of basic helix-loop-helix (bHLH) transcription factors with both bHLH and Orange domains. HES/HEY proteins are direct targets of the Notch signaling pathway and play an essential role in developmental decisions, such as nervous system development, somitogenesis, and blood vessel and heart development. Despite their important functions, the origin and evolution of the HES/HEY gene family have yet to be elucidated. Methods and Findings: In this study, we identified genes of the HES/HEY family in representative species and performed evolutionary analysis to elucidate their origin and evolutionary process. Our results showed that HES/HEY genes exist only in metazoans and may have originated from the common ancestor of metazoans. We identified HES/HEY genes in more than 10 species representing the main lineages. Combining the bHLH and Orange domain sequences, we constructed phylogenetic trees by different methods (Bayesian, ML, NJ and ME) and classified the HES/HEY gene family into four groups. Our results indicated that this gene family had undergone three expansions, which coincided with the origins of Eumetazoa, vertebrates, and teleosts. Gene structure analysis revealed that the HES/HEY genes underwent exon and/or intron loss in different species lineages. Genes of this family were duplicated in bony fishes and are twice as numerous as in other vertebrates. Furthermore, we studied the teleost-specific duplications in zebrafish and investigated the expression pattern of duplicated genes in different tissues by RT-PCR. Finally, we proposed a model to show the evolution of this gene family with processes of expansion, exon/intron loss, and motif loss. Conclusions: Our study revealed the evolution of the HES/HEY gene family and the divergence in expression and function of duplicated genes, which also provides clues for research on Notch function in development. This study shows a model of gene family analysis with gene structure evolution and duplication. PMID:22808219

Comparative and evolutionary analysis of the HES/HEY gene family reveal exon/intron loss and teleost specific duplication events.

PubMed

Zhou, Mi; Yan, Jun; Ma, Zhaowu; Zhou, Yang; Abbood, Nibras Najm; Liu, Jianfeng; Su, Li; Jia, Haibo; Guo, An-Yuan

2012-01-01

HES/HEY genes encode a family of basic helix-loop-helix (bHLH) transcription factors with both bHLH and Orange domains. HES/HEY proteins are direct targets of the Notch signaling pathway and play an essential role in developmental decisions, such as nervous system development, somitogenesis, and blood vessel and heart development. Despite their important functions, the origin and evolution of the HES/HEY gene family have yet to be elucidated. In this study, we identified genes of the HES/HEY family in representative species and performed evolutionary analysis to elucidate their origin and evolutionary process. Our results showed that HES/HEY genes exist only in metazoans and may have originated from the common ancestor of metazoans. We identified HES/HEY genes in more than 10 species representing the main lineages. Combining the bHLH and Orange domain sequences, we constructed phylogenetic trees by different methods (Bayesian, ML, NJ and ME) and classified the HES/HEY gene family into four groups. Our results indicated that this gene family had undergone three expansions, which coincided with the origins of Eumetazoa, vertebrates, and teleosts. Gene structure analysis revealed that the HES/HEY genes underwent exon and/or intron loss in different species lineages. Genes of this family were duplicated in bony fishes and are twice as numerous as in other vertebrates. Furthermore, we studied the teleost-specific duplications in zebrafish and investigated the expression pattern of duplicated genes in different tissues by RT-PCR. Finally, we proposed a model to show the evolution of this gene family with processes of expansion, exon/intron loss, and motif loss. Our study revealed the evolution of the HES/HEY gene family and the divergence in expression and function of duplicated genes, which also provides clues for research on Notch function in development. This study shows a model of gene family analysis with gene structure evolution and duplication.
Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

PubMed

Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

2015-02-10

Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes.

PubMed

Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun

2017-09-01

Identifying differentially expressed genes from among thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method for identifying differentially expressed genes. The RPCA method uses the nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best approximation of the rank function. The truncated nuclear norm is defined as the sum of the smaller singular values, which may achieve a better approximation of the rank function than the nuclear norm. In this paper, a novel method is proposed by replacing the nuclear norm of RPCA with the truncated nuclear norm; the method is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
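For orientation, the quantities named in the abstract above can be written in standard notation. This is a sketch based on the usual textbook definitions, not the paper's own formulation: the symbols X (observation matrix, m x n), L (low-rank part), S (sparse part), the weight \lambda, and the truncation index r are all assumptions introduced here, and the singular values \sigma_1 \ge \sigma_2 \ge \dots are taken in descending order.

```latex
% Standard RPCA: split the data matrix X into a low-rank part L and a sparse part S.
\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad X = L + S,
\qquad
\|L\|_{*} = \sum_{i=1}^{\min(m,n)} \sigma_i(L)

% Truncated nuclear norm: only the min(m,n) - r smallest singular values are penalized.
\|L\|_{r} = \sum_{i=r+1}^{\min(m,n)} \sigma_i(L)
```

Under these definitions the r largest singular values are left unpenalized, which is why the truncated norm can track the rank function more closely than the full nuclear norm.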
[Endoplasmic reticulum stress in INS-1-3 cell associated with the expression changes of MODY gene pathway].

PubMed

Liu, Y T; Li, S R; Wang, Z; Xiao, J Z

2016-09-13

Objective: To profile the gene expression changes associated with endoplasmic reticulum stress in INS-1-3 cells induced by thapsigargin (TG) and tunicamycin (TM). Methods: Normal cultured INS-1-3 cells were used as a control. TG and TM were used to induce endoplasmic reticulum stress in INS-1-3 cells. Digital gene expression profiling was used to detect differentially expressed genes. The changes in gene expression were analyzed by expression pattern clustering analysis, gene ontology (GO) function analysis and pathway enrichment analysis. Real-time polymerase chain reaction (RT-PCR) was used to verify the key changes in gene expression. Results: Compared with the control group, there were 57 (45 up-regulated, 12 down-regulated) and 135 (99 up-regulated, 36 down-regulated) differentially expressed genes in the TG and TM groups, respectively. GO function enrichment analyses indicated that the main enrichment was in the endoplasmic reticulum. In signaling pathway analysis, the identified pathways were related to endoplasmic reticulum stress, antigen processing and presentation, protein export, and most of all, the maturity onset diabetes of the young (MODY) pathway. Conclusion: Under the condition of endoplasmic reticulum stress, the related expression changes of transcription factors in the MODY signaling pathway may be related to the impaired function of islet beta cells.
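The GO and pathway enrichment step mentioned above is commonly scored with a one-sided hypergeometric test. The abstract does not say which test was used, so the following is only a minimal sketch under that assumption; the function name and all counts other than the 57 differentially expressed genes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, de_genes, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    total_genes: size of the background gene set
    term_genes:  background genes annotated with the GO term or pathway
    de_genes:    number of differentially expressed genes drawn
    overlap:     differentially expressed genes that carry the annotation
    """
    max_overlap = min(term_genes, de_genes)
    denominator = comb(total_genes, de_genes)
    # Sum the upper tail; math.comb returns 0 when k > n, so impossible
    # configurations contribute nothing.
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, de_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 57 DE genes (as in the TG group), an assumed
    # background of 20,000 genes, 300 of them annotated, 8 in the overlap.
    p = hypergeom_enrichment_p(20000, 300, 57, 8)
    print(f"enrichment p-value: {p:.3g}")
```

In practice the p-values for all tested terms would also be corrected for multiple testing, which this sketch leaves out.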
Altered expression of four miRNA (miR-1238-3p, miR-202-3p, miR-630 and miR-766-3p) and their potential targets in peripheral blood from vitiligo patients.

PubMed

Shang, Zhiwei; Li, Hongwen

2017-10-01

Vitiligo is an acquired skin disease with pigmentary disorder. Autoimmune destruction of melanocytes is thought to be major factor in the etiology of vitiligo. miRNA-based regulators of gene expression have been reported to play crucial roles in autoimmune disease. Therefore, we attempt to profile the miRNA expressions and predict their potential targets, assessing the biological functions of differentially expressed miRNA. Total RNA was extracted from peripheral blood of vitiligo (experimental group, n = 5) and non-vitiligo (control group, n = 5) age-matched patients. Samples were hybridized to a miRNA array. Box, scatter and principal component analysis plots were performed, followed by unsupervised hierarchical clustering analysis to classify the samples. Quantitative reverse transcription polymerase chain reaction (RT-PCR) was conducted for validation of microarray data. Three different databases, TargetScan, PITA and microRNA.org, were used to predict the potential target genes. Gene ontology (GO) annotation and pathway analysis were performed to assess the potential functions of predicted genes of identified miRNA. A total of 100 (29 upregulated and 71 downregulated) miRNA were filtered by volcano plot analysis. Four miRNA were validated by quantitative RT-PCR as significantly downregulated in the vitiligo group. The functions of predicted target genes associated with differentially expressed miRNA were assessed by GO analysis, showing that the GO term with most significantly enriched target genes was axon guidance, and that the axon guidance pathway was most significantly correlated with these miRNA. In conclusion, we identified four downregulated miRNA in vitiligo and assessed the potential functions of target genes related to these differentially expressed miRNA. © 2017 Japanese Dermatological Association.
Functional Annotation, Genome Organization and Phylogeny of the Grapevine (Vitis vinifera) Terpene Synthase Gene Family Based on Genome Assembly, FLcDNA Cloning, and Enzyme Assays

PubMed Central

2010-01-01

Background: Terpenoids are among the most important constituents of grape flavour and wine bouquet, and serve as useful metabolite markers in viticulture and enology. Based on the initial 8-fold sequencing of a nearly homozygous Pinot noir inbred line, 89 putative terpenoid synthase genes (VvTPS) were predicted by in silico analysis of the grapevine (Vitis vinifera) genome assembly [1]. The finding of this very large VvTPS family, combined with the importance of terpenoid metabolism for the organoleptic properties of grapevine berries and finished wines, prompted a detailed examination of this gene family at the genomic level as well as an investigation into VvTPS biochemical functions. Results: We present findings from the analysis of the updated 12-fold sequencing and assembly of the grapevine genome that place the number of predicted VvTPS genes at 69 putatively functional VvTPS, 20 partial VvTPS, and 63 VvTPS probable pseudogenes. Gene discovery and annotation included information about gene architecture and chromosomal location. A dense cluster of 45 VvTPS is localized on chromosome 18. Extensive FLcDNA cloning, gene synthesis, and protein expression enabled functional characterization of 39 VvTPS; this is the largest number of functionally characterized TPS for any species reported to date. Of these enzymes, 23 have unique functions and/or phylogenetic locations within the plant TPS gene family. Phylogenetic analyses of the TPS gene family showed that while most VvTPS form species-specific gene clusters, there are several examples of gene orthology with TPS of other plant species, representing perhaps more ancient VvTPS, which have maintained functions independent of speciation. Conclusions: The highly expanded VvTPS gene family underpins the prominence of terpenoid metabolism in grapevine. We provide a detailed experimental functional annotation of 39 members of this important gene family in grapevine and comprehensive information about gene structure and phylogeny for the entire currently known VvTPS gene family. PMID:20964856
Differential Effect of Active Smoking on Gene Expression in Male and Female Smokers

PubMed Central

Paul, Sunirmal; Amundson, Sally A

2015-01-01

Smoking is the second leading cause of preventable death in the United States. Cohort epidemiological studies have demonstrated that women are more vulnerable to cigarette-smoking-induced diseases than their male counterparts; however, the molecular basis of these differences has remained unknown. In this study, we explored whether there were differences in the gene expression patterns between male and female smokers, and how these patterns might reflect different sex-specific responses to the stress of smoking. Using whole genome microarray gene expression profiling, we found that a substantial number of oxidant-related genes were expressed in both male and female smokers; however, smoking-responsive genes did indeed differ greatly between male and female smokers. Gene set enrichment analysis (GSEA) against reference oncogenic signature gene sets identified a large number of oncogenic pathway gene sets that were significantly altered in female smokers compared to male smokers. In addition, functional annotation with Ingenuity Pathway Analysis (IPA) identified smoking-correlated genes associated with biological functions in male and female smokers that are directly relevant to well-known smoking-related pathologies. However, these relevant biological functions were strikingly overrepresented in female smokers compared to male smokers. IPA network analysis with the functional categories of immune and inflammatory response gene products suggested potential interactions between smoking response and female hormones. Our results demonstrate a striking dichotomy between male and female gene expression responses to smoking. This is the first genome-wide expression study to compare the sex-specific impacts of smoking at a molecular level and suggests a novel potential connection between sex hormone signaling and smoking-induced diseases in female smokers. PMID:25621181
Gene expression profiles analysis identifies key genes for acute lung injury in patients with sepsis.

PubMed

Guo, Zhiqiang; Zhao, Chuncheng; Wang, Zheng

2014-09-26

To identify critical genes and biological pathways in acute lung injury (ALI), a comparative analysis of gene expression profiles of patients with ALI + sepsis and patients with sepsis alone was performed with bioinformatic tools. GSE10474 was downloaded from Gene Expression Omnibus, comprising 13 whole blood samples from patients with ALI + sepsis and 21 whole blood samples from patients with sepsis alone. After pre-processing with the robust multichip averaging (RMA) method, differential analysis was conducted using the simpleaffy package, based on t-test and fold change. Hierarchical clustering was also performed using the function hclust from the package stats. Besides, functional enrichment analysis was conducted using iGepros. Moreover, the gene regulatory network was constructed with information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and then visualized with Cytoscape. A total of 128 differentially expressed genes (DEGs) were identified, including 47 up- and 81 down-regulated genes. The significantly enriched functions included negative regulation of cell proliferation, regulation of response to stimulus and cellular component morphogenesis. A total of 27 DEGs were significantly enriched in 16 KEGG pathways, such as protein digestion and absorption, fatty acid metabolism, amoebiasis, etc. Furthermore, the regulatory network of these 27 DEGs was constructed, which involved several key genes, including protein tyrosine kinase 2 (PTK2), v-src avian sarcoma (SRC) and Caveolin 2 (CAV2). PTK2, SRC and CAV2 may be potential markers for diagnosis and treatment of ALI. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5865162912987143.
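The differential analysis step described above (a t-test combined with a fold-change cutoff) can be illustrated in a few lines. This is a minimal sketch and not the simpleaffy implementation: it assumes log2-scale expression values (as RMA produces), uses Welch's t-statistic rather than a p-value, and the function names, thresholds, and toy data are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances.

    Raises ZeroDivisionError if both samples have zero variance.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def flag_degs(expr_case, expr_ctrl, min_abs_log2fc=1.0, min_abs_t=2.0):
    """Flag genes that pass both a fold-change and a t-statistic cutoff.

    expr_case / expr_ctrl map gene name -> list of log2 expression values.
    Both thresholds are illustrative assumptions, not values from the study.
    """
    flagged = {}
    for gene, case in expr_case.items():
        ctrl = expr_ctrl[gene]
        # On the log2 scale, the fold change is a difference of means.
        log2fc = mean(case) - mean(ctrl)
        t = welch_t(case, ctrl)
        if abs(log2fc) >= min_abs_log2fc and abs(t) >= min_abs_t:
            flagged[gene] = ("up" if log2fc > 0 else "down", log2fc, t)
    return flagged


if __name__ == "__main__":
    # Toy data: G1 is clearly shifted between groups, G2 is not.
    case = {"G1": [8.0, 8.2, 7.9], "G2": [5.0, 5.1, 4.9]}
    ctrl = {"G1": [6.0, 6.1, 5.9], "G2": [5.0, 4.9, 5.2]}
    for gene, (direction, log2fc, t) in flag_degs(case, ctrl).items():
        print(gene, direction, round(log2fc, 2), round(t, 1))
```

A real analysis would convert the t-statistic to a p-value and adjust for the number of probes tested; the cutoff on |t| here only stands in for that step.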
Characteristics of allelic gene expression in human brain cells from single-cell RNA-seq data analysis.

PubMed

Zhao, Dejian; Lin, Mingyan; Pedrosa, Erika; Lachman, Herbert M; Zheng, Deyou

2017-11-10

Monoallelic expression of autosomal genes has been implicated in human psychiatric disorders. However, there is a paucity of allelic expression studies in human brain cells at the single cell and genome wide levels. In this report, we reanalyzed a previously published single-cell RNA-seq dataset from several postmortem human brains and observed pervasive monoallelic expression in individual cells, largely in a random manner. Examining single nucleotide variants with a predicted functional disruption, we found that the "damaged" alleles were overall expressed in fewer brain cells than their counterparts, and at a lower level in cells where their expression was detected. We also identified many brain cell type-specific monoallelically expressed genes. Interestingly, many of these cell type-specific monoallelically expressed genes were enriched for functions important for those brain cell types. In addition, function analysis showed that genes displaying monoallelic expression and correlated expression across neuronal cells from different individual brains were implicated in the regulation of synaptic function. Our findings suggest that monoallelic gene expression is prevalent in human brain cells, which may play a role in generating cellular identity and neuronal diversity and thus increasing the complexity and diversity of brain cell functions.
A study of structural properties of gene network graphs for mathematical modeling of integrated mosaic gene networks.

PubMed

Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

2017-04-01

Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a mosaic gene networks modeling method based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of graphs of generated mosaic gene regulatory networks has revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on expression of genes corresponding to the vertices with high properties of centrality.
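The abstract above singles out vertices with high centrality as the most informative for model building, without saying which centrality measures were analyzed. As a minimal sketch, assuming plain degree centrality on an undirected view of the regulatory graph, such vertices could be ranked as follows; the function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality, treating regulatory edges as undirected.

    edges: iterable of (regulator, target) pairs; self-loops are ignored
    when counting neighbors.
    """
    neighbors = defaultdict(set)
    for regulator, target in edges:
        # Touch both keys so that every vertex is registered.
        neighbors[regulator]
        neighbors[target]
        if regulator != target:
            neighbors[regulator].add(target)
            neighbors[target].add(regulator)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_central(edges, k):
    """Return the k vertices with the highest degree centrality (ties by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda gene: (-scores[gene], gene))[:k]


if __name__ == "__main__":
    toy_network = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
    print(top_central(toy_network, 2))
```

Other measures, such as betweenness or closeness centrality, would slot into the same ranking step.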
Comprehensive analysis of alternative splicing and functionality in neuronal differentiation of P19 cells.

PubMed

Suzuki, Hitoshi; Osaki, Ken; Sano, Kaori; Alam, A H M Khurshid; Nakamura, Yuichiro; Ishigaki, Yasuhito; Kawahara, Kozo; Tsukahara, Toshifumi

2011-02-18

Alternative splicing, which produces multiple mRNAs from a single gene, occurs in most human genes and contributes to protein diversity. Many alternative isoforms are expressed in a spatio-temporal manner, and function in diverse processes, including in the neural system. The purpose of the present study was to comprehensively investigate neural-splicing using P19 cells. GeneChip Exon Array analysis was performed using total RNAs purified from cells during neuronal cell differentiation. To efficiently and readily extract the alternative exon candidates, 9 filtering conditions were prepared, yielding 262 candidate exons (236 genes). Semiquantitative RT-PCR results in 30 randomly selected candidates suggested that 87% of the candidates were differentially alternatively spliced in neuronal cells compared to undifferentiated cells. Gene ontology and pathway analyses suggested that many of the candidate genes were associated with neural events. Together with 66 genes whose functions in neural cells or organs were reported previously, 47 candidate genes were found to be linked to 189 events in the gene-level profile of neural differentiation. By text-mining for the alternative isoform, distinct functions of the isoforms of 9 candidate genes indicated by the result of Exon Array were confirmed. Alternative exons were successfully extracted. Results from the informatics analyses suggested that neural events were primarily governed by genes whose expression was increased and whose transcripts were differentially alternatively spliced in the neuronal cells. In addition to known functions in neural cells or organs, the uninvestigated alternative splicing events of 11 genes among 47 candidate genes suggested that cell cycle events are also potentially important. These genes may help researchers to differentiate the roles of alternative splicing in cell differentiation and cell proliferation.
Bioinformatic analysis of Msx1 and Msx2 involved in craniofacial development.

PubMed

Dai, Jiewen; Mou, Zhifang; Shen, Shunyao; Dong, Yuefu; Yang, Tong; Shen, Steve Guofang

2014-01-01

Msx1 and Msx2 have been identified as candidate genes for some craniofacial deformities, such as cleft lip with/without cleft palate (CL/P) and craniosynostosis. Many other genes have been shown to cross-talk with MSX genes in causing these defects. However, there has been no systematic evaluation of these MSX gene-related factors. In this study, we performed a systematic bioinformatic analysis of MSX genes by combining the GeneDecks, DAVID, and STRING databases. The results showed that numerous genes are related to MSX genes, such as Irf6, TP63, Dlx2, Dlx5, Pax3, Pax9, Bmp4, Tgf-beta2, and Tgf-beta3, which have been demonstrated to be involved in CL/P, and Fgfr2, Fgfr1, Fgfr3, and Twist1, which are involved in craniosynostosis. Many of these genes could be enriched into different gene groups involved in different signaling pathways, different craniofacial deformities, and different biological processes. These findings allow us to analyze the function of MSX genes in a gene network. In addition, our findings showed that Sumo, a novel gene whose polymorphisms were demonstrated to be associated with nonsyndromic CL/P by genome-wide association study, has a protein-protein interaction with MSX1. This may offer an alternative method for bioinformatic analysis of genes found by genome-wide association studies and may help predict the disruption of protein function caused by a mutation in a gene's DNA sequence. These findings may guide further functional studies in the future.
Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics.

PubMed

de Angelis, Martin Hrabě; Nicholson, George; Selloum, Mohammed; White, Jacqui; Morgan, Hugh; Ramirez-Solis, Ramiro; Sorg, Tania; Wells, Sara; Fuchs, Helmut; Fray, Martin; Adams, David J; Adams, Niels C; Adler, Thure; Aguilar-Pimentel, Antonio; Ali-Hadji, Dalila; Amann, Gregory; André, Philippe; Atkins, Sarah; Auburtin, Aurelie; Ayadi, Abdel; Becker, Julien; Becker, Lore; Bedu, Elodie; Bekeredjian, Raffi; Birling, Marie-Christine; Blake, Andrew; Bottomley, Joanna; Bowl, Mike; Brault, Véronique; Busch, Dirk H; Bussell, James N; Calzada-Wack, Julia; Cater, Heather; Champy, Marie-France; Charles, Philippe; Chevalier, Claire; Chiani, Francesco; Codner, Gemma F; Combe, Roy; Cox, Roger; Dalloneau, Emilie; Dierich, André; Di Fenza, Armida; Doe, Brendan; Duchon, Arnaud; Eickelberg, Oliver; Esapa, Chris T; El Fertak, Lahcen; Feigel, Tanja; Emelyanova, Irina; Estabel, Jeanne; Favor, Jack; Flenniken, Ann; Gambadoro, Alessia; Garrett, Lilian; Gates, Hilary; Gerdin, Anna-Karin; Gkoutos, George; Greenaway, Simon; Glasl, Lisa; Goetz, Patrice; Da Cruz, Isabelle Goncalves; Götz, Alexander; Graw, Jochen; Guimond, Alain; Hans, Wolfgang; Hicks, Geoff; Hölter, Sabine M; Höfler, Heinz; Hancock, John M; Hoehndorf, Robert; Hough, Tertius; Houghton, Richard; Hurt, Anja; Ivandic, Boris; Jacobs, Hughes; Jacquot, Sylvie; Jones, Nora; Karp, Natasha A; Katus, Hugo A; Kitchen, Sharon; Klein-Rodewald, Tanja; Klingenspor, Martin; Klopstock, Thomas; Lalanne, Valerie; Leblanc, Sophie; Lengger, Christoph; le Marchand, Elise; Ludwig, Tonia; Lux, Aline; McKerlie, Colin; Maier, Holger; Mandel, Jean-Louis; Marschall, Susan; Mark, Manuel; Melvin, David G; Meziane, Hamid; Micklich, Kateryna; Mittelhauser, Christophe; Monassier, Laurent; Moulaert, David; Muller, Stéphanie; Naton, Beatrix; Neff, Frauke; Nolan, Patrick M; Nutter, Lauryl Mj; Ollert, Markus; Pavlovic, Guillaume; Pellegata, Natalia S; Peter, Emilie; Petit-Demoulière, Benoit; Pickard, Amanda; Podrini, Christine; Potter, Paul; Pouilly, Laurent; 
Puk, Oliver; Richardson, David; Rousseau, Stephane; Quintanilla-Fend, Leticia; Quwailid, Mohamed M; Racz, Ildiko; Rathkolb, Birgit; Riet, Fabrice; Rossant, Janet; Roux, Michel; Rozman, Jan; Ryder, Ed; Salisbury, Jennifer; Santos, Luis; Schäble, Karl-Heinz; Schiller, Evelyn; Schrewe, Anja; Schulz, Holger; Steinkamp, Ralf; Simon, Michelle; Stewart, Michelle; Stöger, Claudia; Stöger, Tobias; Sun, Minxuan; Sunter, David; Teboul, Lydia; Tilly, Isabelle; Tocchini-Valentini, Glauco P; Tost, Monica; Treise, Irina; Vasseur, Laurent; Velot, Emilie; Vogt-Weisenhorn, Daniela; Wagner, Christelle; Walling, Alison; Weber, Bruno; Wendling, Olivia; Westerberg, Henrik; Willershäuser, Monja; Wolf, Eckhard; Wolter, Anne; Wood, Joe; Wurst, Wolfgang; Yildirim, Ali Önder; Zeh, Ramona; Zimmer, Andreas; Zimprich, Annemarie; Holmes, Chris; Steel, Karen P; Herault, Yann; Gailus-Durner, Valérie; Mallon, Ann-Marie; Brown, Steve Dm

2015-09-01

The function of the majority of genes in the mouse and human genomes remains unknown. The mouse embryonic stem cell knockout resource provides a basis for the characterization of relationships between genes and phenotypes. The EUMODIC consortium developed and validated robust methodologies for the broad-based phenotyping of knockouts through a pipeline comprising 20 disease-oriented platforms. We developed new statistical methods for pipeline design and data analysis aimed at detecting reproducible phenotypes with high power. We acquired phenotype data from 449 mutant alleles, representing 320 unique genes, of which half had no previous functional annotation. We captured data from over 27,000 mice, finding that 83% of the mutant lines are phenodeviant, with 65% demonstrating pleiotropy. Surprisingly, we found significant differences in phenotype annotation according to zygosity. New phenotypes were uncovered for many genes with previously unknown function, providing a powerful basis for hypothesis generation and further investigation in diverse systems.
Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

PubMed

Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

2007-06-01

As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
Genomic and Proteomic Profiling Reveals Reduced Mitochondrial Function and Disruption of the Neuromuscular Junction Driving Rat Sarcopenia

PubMed Central

Ibebunjo, Chikwendu; Chick, Joel M.; Kendall, Tracee; Eash, John K.; Li, Christine; Zhang, Yunyu; Vickers, Chad; Wu, Zhidan; Clarke, Brian A.; Shi, Jun; Cruz, Joseph; Fournier, Brigitte; Brachat, Sophie; Gutzwiller, Sabine; Ma, QiCheng; Markovits, Judit; Broome, Michelle; Steinkrauss, Michelle; Skuba, Elizabeth; Galarneau, Jean-Rene; Gygi, Steven P.

2013-01-01

Molecular mechanisms underlying sarcopenia, the age-related loss of skeletal muscle mass and function, remain unclear. To identify molecular changes that correlated best with sarcopenia and might contribute to its pathogenesis, we determined global gene expression profiles in muscles of rats aged 6, 12, 18, 21, 24, and 27 months. These rats exhibit sarcopenia beginning at 21 months. Correlation of the gene expression versus muscle mass or age changes, and functional annotation analysis identified gene signatures of sarcopenia distinct from gene signatures of aging. Specifically, mitochondrial energy metabolism (e.g., tricarboxylic acid cycle and oxidative phosphorylation) pathway genes were the most downregulated and most significantly correlated with sarcopenia. Also, perturbed were genes/pathways associated with neuromuscular junction patency (providing molecular evidence of sarcopenia-related functional denervation and neuromuscular junction remodeling), protein degradation, and inflammation. Proteomic analysis of samples at 6, 18, and 27 months confirmed the depletion of mitochondrial energy metabolism proteins and neuromuscular junction proteins. Together, these findings suggest that therapeutic approaches that simultaneously stimulate mitochondrogenesis and reduce muscle proteolysis and inflammation have potential for treating sarcopenia. PMID:23109432
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio).

PubMed

Liu, Xiang; Li, Shangqi; Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Xu, Peng

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp.
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio)

PubMed Central

Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A.

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp. PMID:27058731
Transcriptional profiling by DDRT-PCR analysis reveals gene expression during seed development in Carya cathayensis Sarg.

PubMed

Huang, You-Jun; Zhou, Qin; Huang, Jian-Qin; Zeng, Yan-Ru; Wang, Zheng-Jia; Zhang, Qi-Xiang; Zhu, Yi-Hang; Shen, Chen; Zheng, Bing-Song

2015-06-01

Hickory (Carya cathayensis Sarg.) seed has one of the highest oil contents and is rich in polyunsaturated fatty acids (PUFAs); its kernel is beneficial to human health, particularly to brain function. A better understanding of the lipid accumulation mechanism would help to improve hickory production and seed quality. DDRT-PCR analysis was used to examine gene expression in hickory at thirteen time points during seed development. A total of 67 unique genes involved in seed development were obtained, and their expression patterns were further confirmed by semi-quantitative RT-PCR and real-time RT-PCR analysis. Of these, the genes with known functions were involved in signal transduction, amino acid metabolism, nuclear metabolism, fatty acid metabolism, protein metabolism, carbon metabolism, secondary metabolism, oxidation of fatty acids and stress response, suggesting that hickory undergoes a complex metabolic process during seed development. Furthermore, 6 genes related to fatty acid synthesis were explored, and their functions in seed development were further discussed. The data obtained here provide the first clues for guiding further functional studies of fatty acid synthesis in hickory. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Evaluation of RNAi and CRISPR technologies by large-scale gene expression profiling in the Connectivity Map.

PubMed

Smith, Ian; Greenside, Peyton G; Natoli, Ted; Lahr, David L; Wadden, David; Tirosh, Itay; Narayan, Rajiv; Root, David E; Golub, Todd R; Subramanian, Aravind; Doench, John G

2017-11-01

The application of RNA interference (RNAi) to mammalian cells has provided the means to perform phenotypic screens to determine the functions of genes. Although RNAi has revolutionized loss-of-function genetic experiments, it has been difficult to systematically assess the prevalence and consequences of off-target effects. The Connectivity Map (CMAP) represents an unprecedented resource to study the gene expression consequences of expressing short hairpin RNAs (shRNAs). Analysis of signatures for over 13,000 shRNAs applied in 9 cell lines revealed that microRNA (miRNA)-like off-target effects of RNAi are far stronger and more pervasive than generally appreciated. We show that mitigating off-target effects is feasible in these datasets via computational methodologies to produce a consensus gene signature (CGS). In addition, we compared RNAi technology to clustered regularly interspaced short palindromic repeat (CRISPR)-based knockout by analysis of 373 single guide RNAs (sgRNAs) in 6 cell lines and show that the on-target efficacies are comparable, but CRISPR technology is far less susceptible to systematic off-target effects. These results will help guide the proper use and analysis of loss-of-function reagents for the determination of gene function.
Partial ROC Reveals Superiority of Mutual Rank of Pearson's Correlation Coefficient as a Coexpression Measure to Elucidate Functional Association of Genes

NASA Astrophysics Data System (ADS)

Obayashi, Takeshi; Kinoshita, Kengo

2013-01-01

Gene coexpression analysis is a powerful approach to elucidate gene function. We have established and developed this approach using the vast amount of publicly available gene expression data measured by microarray techniques. Coexpressed genes are used to estimate the function of the guide gene or to construct gene coexpression networks. When constructing gene networks, researchers must introduce an arbitrary threshold of gene coexpression, because the coexpression value is continuous. With a view to introducing a common threshold of gene coexpression, we previously reported that the rank of Pearson's correlation coefficient (PCC) is more useful than the original PCC value. In this manuscript, we re-assessed the measures of gene coexpression used to construct gene coexpression networks, and found that the mutual rank (MR) of PCC performed better than the rank of PCC and the original PCC in the low false-positive-rate range.
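
The abstract above does not define the mutual rank. A minimal Python sketch follows, assuming the commonly cited definition: MR is the geometric mean of the rank of gene B in gene A's PCC-ordered list and the rank of gene A in gene B's list. The function names and the input layout (a dict of expression vectors) are hypothetical and are not taken from the authors' software.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mutual_rank(expr):
    """Return {(gene_a, gene_b): MR} for every unordered gene pair.

    Assumed definition: MR(a, b) = sqrt(rank of b seen from a * rank of a
    seen from b), where rank 1 is the most strongly correlated partner.
    """
    genes = list(expr)
    rank = {}
    for g in genes:
        pcc = {h: pearson(expr[g], expr[h]) for h in genes if h != g}
        ordered = sorted(pcc, key=pcc.get, reverse=True)
        rank[g] = {h: i + 1 for i, h in enumerate(ordered)}
    return {
        (g, h): math.sqrt(rank[g][h] * rank[h][g])
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }
```

Because each rank is taken within one gene's own list, a single MR cutoff can be applied across genes whose raw PCC distributions differ, which is the motivation for a common threshold given in the abstract.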

Gene Fusion: A Genome Wide Survey

NASA Technical Reports Server (NTRS)

Liang, Ping; Riley, Monica

2001-01-01

It is well known that organisms form larger and more complex multimodular (composite or chimeric), mostly multi-functional proteins through fusion of two or more individual genes that have independent evolutionary histories and functions. We call each of these components a module. The existence of multimodular proteins may improve the efficiency of gene regulation and cellular functions, and thus may give the host organism advantages in adapting to its environment. Analysis of all gene fusions in present-day organisms should allow us to examine the patterns of gene fusion in the context of cellular functions, to trace the evolutionary process from the ancient smaller, uni-functional proteins to the present-day larger, complex multi-functional proteins, and to estimate the minimal number of ancestral proteins that existed in the last common ancestor of all life on Earth. Although many multimodular proteins are known experimentally, systematic identification of gene fusion events at genome scale was not possible until recently, when a large number of completed genome sequences became available. In addition, technical difficulties for such analysis exist because of the complexity of this biological and evolutionary process. We report a new strategy to computationally identify multimodular proteins using completed genome sequences, together with the results of a survey of 22 organisms; data from over 40 organisms are to be presented during the meeting. Additional information is contained in the original extended abstract.
Analytical workflow profiling gene expression in murine macrophages

PubMed Central

Nixon, Scott E.; González-Peña, Dianelys; Lawson, Marcus A.; McCusker, Robert H.; Hernandez, Alvaro G.; O’Connor, Jason C.; Dantzer, Robert; Kelley, Keith W.

2015-01-01

Comprehensive and simultaneous analysis of all genes in a biological sample is a capability of RNA-Seq technology. Analysis of the entire transcriptome benefits from summarization of genes at the functional level. As a cellular response of interest not previously explored with RNA-Seq, peritoneal macrophages from mice under two conditions (control and immunologically challenged) were analyzed for gene expression differences. Quantification of individual transcripts modeled RNA-Seq read distribution and uncertainty (using a Beta Negative Binomial distribution), then tested for differential transcript expression (False Discovery Rate-adjusted p-value < 0.05). Enrichment of functional categories utilized the list of differentially expressed genes. A total of 2079 differentially expressed transcripts representing 1884 genes were detected. Enrichment of 92 categories from Gene Ontology Biological Processes and Molecular Functions, and KEGG pathways were grouped into 6 clusters. Clusters included defense and inflammatory response (Enrichment Score = 11.24) and ribosomal activity (Enrichment Score = 17.89). Our work provides a context to the fine detail of individual gene expression differences in murine peritoneal macrophages during immunological challenge with high throughput RNA-Seq. PMID:25708305
Molecular Phylogenetic and Expression Analysis of the Complete WRKY Transcription Factor Family in Maize

PubMed Central

Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

2012-01-01

The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance. PMID:22279089
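
The stable/variable split by coefficient of variation described above can be sketched in Python as follows. This is only an illustration: the sample standard deviation, the function names, and the handling of a CV of exactly 15% (which the abstract leaves unspecified) are assumptions.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

def split_by_cv(expr, threshold=15.0):
    """Split transcripts into (stable, variable) name lists.

    expr maps a transcript name to its expression values across tissues.
    Transcripts with CV below the threshold are called stable; a CV exactly
    at the threshold is placed with the variable set (an assumption).
    """
    stable, variable = [], []
    for name, values in expr.items():
        if coefficient_of_variation(values) < threshold:
            stable.append(name)
        else:
            variable.append(name)
    return stable, variable
```

A transcript with a mean expression of zero would raise a division error here, so real data would need filtering first.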
Molecular phylogenetic and expression analysis of the complete WRKY transcription factor family in maize.

PubMed

Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

2012-04-01

The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance.
A Systematic Analysis of Candidate Genes Associated with Nicotine Addiction

PubMed Central

Liu, Meng; Li, Xia; Fan, Rui; Liu, Xinhua; Wang, Ju

2015-01-01

Nicotine, as the major psychoactive component of tobacco, has broad physiological effects within the central nervous system, but our understanding of the molecular mechanism underlying its neuronal effects remains incomplete. In this study, we performed a systematic analysis on a set of nicotine addiction-related genes (NAGenes) to explore their characteristics at the network level. We found that NAGenes tended to have a more moderate degree and a weaker clustering coefficient, and to be less central in the network, compared to alcohol addiction-related genes or cancer genes. Further, clustering of these genes resulted in six clusters with themes in synaptic transmission, signal transduction, metabolic process, and apoptosis, which provided an intuitive view of the major molecular functions of the genes. Moreover, functional enrichment analysis revealed that neurodevelopment, neurotransmission activity, and metabolism-related biological processes were involved in nicotine addiction. In summary, by analyzing the overall characteristics of the nicotine addiction-related genes, this study provided valuable information for understanding the molecular mechanisms underlying nicotine addiction. PMID:26097843
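
The two node-level network measures named above, degree and clustering coefficient, can be illustrated with a short Python sketch. It assumes an undirected, unweighted interaction network given as an edge list; the function names are hypothetical and the abstract does not say which software the authors used.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    """Number of direct interaction partners of a node."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of a node that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to test
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))
```

Averaging these values over a gene set and comparing them with another gene set is one simple way to make the kind of comparison the abstract reports.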
Genome-wide analysis of the homeodomain-leucine zipper (HD-ZIP) gene family in peach (Prunus persica).

PubMed

Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L

2014-04-08

In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
Genome-wide analysis of DUF221 domain-containing gene family in Oryza species and identification of its salinity stress-responsive members in rice.

PubMed

Ganie, Showkat Ahmad; Pani, Dipti Ranjan; Mondal, Tapan Kumar

2017-01-01

DUF221 domain-containing genes (DDP genes) play important roles in developmental biology, hormone signalling transduction, and responses to abiotic stress. Therefore, to understand their structural and evolutionary relationships, we performed a genome-wide analysis of this important gene family in rice. Further, through comparative genomics, DDP genes from Oryza sativa subsp. (indica), nine different wild species of rice and Arabidopsis were also identified. We also found an expansion of the DDP gene families in rice and Arabidopsis, which is due to segmental duplication events in some of the gene family members. In general, strong purifying selection was found to act on all the deduced paralogous and orthologous DDP gene pairs. The data from microarray and subsequent qRT-PCR analysis revealed that, although several OsDDPs were differentially regulated under salinity stress, OsDDP6 was upregulated at all the developmental stages in the salt-tolerant rice genotype FL478. Interestingly, OsDDP6 was found to be involved in the proline metabolism pathway, as indicated by protein network analysis. The diverse gene structures, varied transmembrane topologies and differential expression patterns imply functional diversity among DDP genes. Therefore, the comprehensive evolutionary analysis of DDP genes from different Oryza species and Arabidopsis performed in this study will provide the basis for further functional validation studies vis-à-vis DDP genes of rice and other plant species.
Genome-wide analysis of DUF221 domain-containing gene family in Oryza species and identification of its salinity stress-responsive members in rice

PubMed Central

Ganie, Showkat Ahmad; Pani, Dipti Ranjan

2017-01-01

DUF221 domain-containing genes (DDP genes) play important roles in developmental biology, hormone signalling transduction, and responses to abiotic stress. Therefore, to understand their structural and evolutionary relationships, we performed a genome-wide analysis of this important gene family in rice. Further, through comparative genomics, DDP genes from Oryza sativa subsp. (indica), nine different wild species of rice and Arabidopsis were also identified. We also found an expansion of the DDP gene families in rice and Arabidopsis, which is due to segmental duplication events in some of the gene family members. In general, strong purifying selection was found to act on all the deduced paralogous and orthologous DDP gene pairs. The data from microarray and subsequent qRT-PCR analysis revealed that, although several OsDDPs were differentially regulated under salinity stress, OsDDP6 was upregulated at all the developmental stages in the salt-tolerant rice genotype FL478. Interestingly, OsDDP6 was found to be involved in the proline metabolism pathway, as indicated by protein network analysis. The diverse gene structures, varied transmembrane topologies and differential expression patterns imply functional diversity among DDP genes. Therefore, the comprehensive evolutionary analysis of DDP genes from different Oryza species and Arabidopsis performed in this study will provide the basis for further functional validation studies vis-à-vis DDP genes of rice and other plant species. PMID:28846681
dbCPG: A web resource for cancer predisposition genes.

PubMed

Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng

2016-06-21

Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown-type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.
Comprehensive Genomic Analysis and Expression Profiling of the NOX Gene Families under Abiotic Stresses and Hormones in Plants.

PubMed

Chang, Yan-Li; Li, Wen-Yan; Miao, Hai; Yang, Shuai-Qi; Li, Ri; Wang, Xiang; Li, Wen-Qiang; Chen, Kun-Ming

2016-02-23

Plasma membrane NADPH oxidases (NOXs) are key producers of reactive oxygen species under both normal and stress conditions in plants, and they form functional subfamilies. Studies of these subfamilies indicated that they show considerable evolutionary selection. We performed a comparative genomic analysis that identified 50 ferric reduction oxidase (FRO) and 77 NOX gene homologs from 20 species representing the eight major plant lineages within the supergroup Plantae: glaucophytes, rhodophytes, chlorophytes, bryophytes, lycophytes, gymnosperms, monocots, and eudicots. Phylogenetic and structural analysis classified these FRO and NOX genes into four well-conserved groups, represented as NOX, FRO I, FRO II, and FRO III. Further analysis of the phylogeny and exon/intron structures of NOXs showed that single intron loss and gain had occurred during the evolution of NOX family genes, yielding diversified gene structures, and that the NOX genes could be classified into four conserved subfamilies, represented as Sub.I, Sub.II, Sub.III, and Sub.IV. Additionally, both analysis of available global microarray data and quantitative real-time PCR experiments revealed that the NOX genes in Arabidopsis and rice (Oryza sativa) have different expression patterns in different developmental stages and under various abiotic stresses and hormone treatments. Finally, coexpression network analysis of NOX genes in Arabidopsis and rice revealed that NOXs have significantly correlated expression profiles with genes that are involved in plant metabolic and resistance processes. All these results underscore the functional diversity and divergence of the NOX family in plants. This finding will facilitate further studies of the NOX family and provide valuable information for functional validation of this family in plants. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Identification and characterization of nuclear genes involved in photosynthesis in Populus

PubMed Central

2014-01-01

Background: The gap between the real and potential photosynthetic rate under field conditions suggests that photosynthesis could potentially be improved. Nuclear genes provide possible targets for improving photosynthetic efficiency. Hence, genome-wide identification and characterization of the nuclear genes affecting photosynthetic traits in woody plants would provide key insights on genetic regulation of photosynthesis and identify candidate processes for improvement of photosynthesis. Results: Using microarray and bulked segregant analysis strategies, we identified differentially expressed nuclear genes for photosynthesis traits in a segregating population of poplar. We identified 515 differentially expressed genes in this population (FC ≥ 2 or FC ≤ 0.5, P < 0.05), 163 up-regulated and 352 down-regulated. Real-time PCR expression analysis confirmed the microarray data. Singular Enrichment Analysis identified 48 significantly enriched GO terms for molecular functions (28), biological processes (18) and cell components (2). Furthermore, we selected six candidate genes for functional examination by a single-marker association approach, which demonstrated that 20 SNPs in five candidate genes significantly associated with photosynthetic traits, and the phenotypic variance explained by each SNP ranged from 2.3% to 12.6%. This revealed that regulation of photosynthesis by the nuclear genome mainly involves transport, metabolism and response to stimulus functions. Conclusions: This study provides new genome-scale strategies for the discovery of potential candidate genes affecting photosynthesis in Populus, and for identification of the functions of genes involved in regulation of photosynthesis. This work also suggests that improving photosynthetic efficiency under field conditions will require the consideration of multiple factors, such as stress responses. PMID:24673936
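
The selection rule quoted above (FC ≥ 2 or FC ≤ 0.5, P < 0.05) can be written as a small Python sketch. It assumes that FC is a linear (not log-transformed) expression ratio and that FC ≥ 2 corresponds to up-regulation; the function name and default parameters are hypothetical.

```python
def classify_gene(fold_change, p_value, fc_up=2.0, fc_down=0.5, alpha=0.05):
    """Label one gene as 'up', 'down' or 'unchanged'.

    Assumes fold_change is a linear ratio between the two groups compared,
    so 2.0 means doubled and 0.5 means halved expression.
    """
    if p_value >= alpha:
        return "unchanged"  # not significant, whatever the ratio
    if fold_change >= fc_up:
        return "up"
    if fold_change <= fc_down:
        return "down"
    return "unchanged"

def count_regulated(records):
    """Count 'up' and 'down' genes in an iterable of (fold_change, p_value)."""
    labels = [classify_gene(fc, p) for fc, p in records]
    return labels.count("up"), labels.count("down")
```

With log2-transformed ratios the same thresholds become +1 and -1.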
Characterization and phylogenetic analysis of lectin gene cDNA isolated from sea cucumber ( Apostichopus japonicus) body wall

NASA Astrophysics Data System (ADS)

Xue, Zhuang; Li, Hui; Liu, Yang; Zhou, Wei; Sun, Jing; Wang, Xiuli

2017-12-01

As a 'living fossil' of species origin and a 'rich treasure' for food and nutrition development, sea cucumber has received considerable attention from researchers. cDNA library construction and EST sequencing of blood had been conducted previously in our lab. The bioinformatic analysis provided a gene fragment that is highly homologous with genes of the lectin family, named AjL (Apostichopus japonicus lectin). To characterize and determine the phylogeny of AjL genes in early evolution, we isolated a full-length cDNA of the lectin gene from the body wall of A. japonicus. The open reading frame of this gene contained 489 bp and encoded a 163-amino-acid secretory protein homologous to lectins of mammals and aquatic organisms. The deduced protein included a lectin-like domain. SDS-PAGE analysis showed that AjL migrated as a specific band (about 36.09 kDa under reducing conditions), and AjL agglutinated rabbit red blood cells. AjL was similar to chain A of CEL-IV in spatial structure. We predicted that AjL may play the same role as CEL-IV. Our results suggested that more than one lectin gene functions in sea cucumber and most other species, having been fused with uncertain sequences during evolution and encoding different proteins with diverse functions. Our findings provide insights into the function and characteristics of lectin genes in invertebrates. The results will also be helpful for the identification and structural, functional, and evolutionary analyses of lectin genes.
Gene expression profiles in rainbow trout, Onchorynchus mykiss, exposed to a simple chemical mixture.

PubMed

Hook, Sharon E; Skillman, Ann D; Gopalan, Banu; Small, Jack A; Schultz, Irvin R

2008-03-01

Among proposed uses for microarrays in environmental toxicology is the identification of key contributors to toxicity within a mixture. However, it remains uncertain whether the transcriptomic profiles resulting from exposure to a mixture have patterns of altered gene expression that contain identifiable contributions from each toxicant component. We exposed isogenic rainbow trout, Onchorynchus mykiss, to sublethal levels of ethynylestradiol, 2,2,4,4-tetrabromodiphenyl ether, and chromium VI, or to a mixture of all three toxicants. Fluorescently labeled complementary DNAs (cDNAs) were generated and hybridized against a commercially available Salmonid array spotted with 16,000 cDNAs. Data were analyzed using analysis of variance (p<0.05) with a Benjamini-Hochberg multiple test correction (Genespring [Agilent] software package) to identify up- and downregulated genes. Gene clustering patterns that can be used as "expression signatures" were determined using hierarchical cluster analysis. The gene ontology terms associated with significantly altered genes were also used to identify functional groups that were associated with toxicant exposure. A cross-ontological analytics approach was used to assign functional annotations to genes with "unknown" function. Our analysis indicates that transcriptomic profiles resulting from the mixture exposure resemble those of the individual contaminant exposures, but are not a simple additive list. However, patterns of altered genes representative of each component of the mixture are clearly discernible, and the functional classes of genes altered represent the individual components of the mixture. These findings indicate that the use of microarrays to identify transcriptomic profiles may aid in the identification of key stressors within a chemical mixture, ultimately improving environmental assessment.
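
The Benjamini-Hochberg correction named above can be sketched in plain Python as follows. This is a generic illustration of the standard step-up procedure and makes no claim about how the Genespring package implements it; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downward keeps the adjusted
    values monotone in the raw p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted value falls below the chosen level (0.05 in the abstract) would then be reported as significantly altered.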
Analysis of barosensitive mechanisms in yeast for Pressure Regulated Fermentation

NASA Astrophysics Data System (ADS)

Nomura, Kazuki; Iwahashi, Hitoshi; Iguchi, Akinori; Shigematsu, Toru

2013-06-01

Introduction: We intend to develop a novel food processing technology, Pressure Regulated Fermentation (PReF), using pressure-sensitive (barosensitive) fermentation microorganisms. The objectives of our study are to clarify barosensitive mechanisms for application to PReF technology. We isolated the Saccharomyces cerevisiae barosensitive mutant a924E1, which was derived from the parent KA31a. Methods: Gene expression levels were analyzed by DNA microarray. Genes with altered expression levels were classified according to gene function. Mutated genes were estimated by mating and producing diploid strains and confirmed by PCR of mitochondrial DNA (mtDNA). Results and Discussion: Gene expression profiles showed that genes of the 'Energy' function and genes encoding proteins localized in 'Mitochondria' were significantly down-regulated in the mutant. These results suggest respiratory deficiency in the mutant and a relationship between barosensitivity and respiratory deficiency. Since the respiratory functions of diploids showed non-Mendelian inheritance, the respiratory deficiency was indicated to be due to an mtDNA mutation. PCR analysis showed that a region of the COX1 locus was deleted. The COX1 gene encodes subunit 1 of cytochrome c oxidase. For this reason, barosensitivity is strongly correlated with mitochondrial functions.
A Reverse-Genetics Mutational Analysis of the Barley HvDWARF Gene Results in Identification of a Series of Alleles and Mutants with Short Stature of Various Degree and Disturbance in BR Biosynthesis Allowing a New Insight into the Process.

PubMed

Gruszka, Damian; Gorniak, Malgorzata; Glodowska, Ewelina; Wierus, Ewa; Oklestkova, Jana; Janeczko, Anna; Maluszynski, Miroslaw; Szarejko, Iwona

2016-04-22

Brassinosteroids (BRs) are plant steroid hormones that regulate a broad range of physiological processes. The largest amount of data related to BR biosynthesis has been gathered in Arabidopsis thaliana; however, this process is far less well understood in monocot crops. Up to now, only four barley genes implicated in BR biosynthesis have been identified. Two of them, HvDWARF and HvBRD, encode BR-6-oxidases catalyzing biosynthesis of castasterone, but their relation is not yet understood. In the present study, the identification of the HvDWARF genomic sequence, its mutational and functional analysis and the characterization of new mutants are reported. Various types of mutations located in different positions within functional domains were identified and characterized. Analysis of their impact on the phenotype of the mutants was performed. The identified homozygous mutants show reduced height to various degrees and disrupted skotomorphogenesis. Mutational analysis of the HvDWARF gene with the "reverse genetics" approach allowed for its detailed functional analysis at the level of protein functional domains. The HvDWARF gene function and the mutants' phenotypes were also validated by measurement of endogenous BR concentration. These results allowed a new insight into BR biosynthesis in barley.
Agrobacterium-mediated virus-induced gene silencing assay in cotton.

PubMed

Gao, Xiquan; Britt, Robert C; Shan, Libo; He, Ping

2011-08-20

Cotton (Gossypium hirsutum) is one of the most important crops worldwide. Considerable efforts have been made on molecular breeding of new varieties. Large-scale gene functional analysis in cotton has lagged behind most modern plant species, likely due to its large genome size, gene duplication and polyploidy, long growth cycle and recalcitrance to genetic transformation(1). To facilitate high-throughput functional genetic/genomic study in cotton, we attempt to develop rapid and efficient transient assays to assess cotton gene functions. Virus-Induced Gene Silencing (VIGS) is a powerful technique that was developed based on the host Post-Transcriptional Gene Silencing (PTGS) to repress viral proliferation(2,3). Agrobacterium-mediated VIGS has been successfully applied in a wide range of dicot species such as Solanaceae, Arabidopsis and legume species, and monocot species including barley, wheat and maize, for various functional genomic studies(3,4). As this rapid and efficient approach avoids plant transformation and overcomes functional redundancy, it is particularly attractive and suitable for functional genomic study in crop species like cotton that are not amenable to transformation. In this study, we report the detailed protocol of the Agrobacterium-mediated VIGS system in cotton. Among the several viral VIGS vectors, the tobacco rattle virus (TRV) invades a wide range of hosts and is able to spread vigorously throughout the entire plant yet produce mild symptoms on the hosts(5). To monitor the silencing efficiency, GrCLA1, a homolog of the Arabidopsis Cloroplastos alterados 1 gene (AtCLA1) in cotton, has been cloned and inserted into the VIGS binary vector pYL156. The CLA1 gene is involved in chloroplast development(6), and previous studies have shown that loss-of-function of AtCLA1 resulted in an albino phenotype on true leaves(7), providing an excellent visual marker for silencing efficiency.
At approximately two weeks post Agrobacterium infiltration, the albino phenotype started to appear on the true leaves, with 100% silencing efficiency in all replicated experiments. The silencing of endogenous gene expression was also confirmed by RT-PCR analysis. Significantly, silencing could potently occur in all the cultivars we tested, including various commercially grown varieties in Texas. This rapid and efficient Agrobacterium-mediated VIGS assay provides a very powerful tool for rapid large-scale analysis of gene functions at genome-wide level in cotton.
Agrobacterium-Mediated Virus-Induced Gene Silencing Assay In Cotton

PubMed Central

Gao, Xiquan; Britt Jr., Robert C.; Shan, Libo; He, Ping

2011-01-01

Cotton (Gossypium hirsutum) is one of the most important crops worldwide. Considerable efforts have been made on molecular breeding of new varieties. Large-scale gene functional analysis in cotton has lagged behind most modern plant species, likely due to its large genome size, gene duplication and polyploidy, long growth cycle and recalcitrance to genetic transformation(1). To facilitate high-throughput functional genetic/genomic study in cotton, we attempt to develop rapid and efficient transient assays to assess cotton gene functions. Virus-Induced Gene Silencing (VIGS) is a powerful technique that was developed based on the host Post-Transcriptional Gene Silencing (PTGS) to repress viral proliferation(2,3). Agrobacterium-mediated VIGS has been successfully applied in a wide range of dicot species such as Solanaceae, Arabidopsis and legume species, and monocot species including barley, wheat and maize, for various functional genomic studies(3,4). As this rapid and efficient approach avoids plant transformation and overcomes functional redundancy, it is particularly attractive and suitable for functional genomic study in crop species like cotton that are not amenable to transformation. In this study, we report the detailed protocol of the Agrobacterium-mediated VIGS system in cotton. Among the several viral VIGS vectors, the tobacco rattle virus (TRV) invades a wide range of hosts and is able to spread vigorously throughout the entire plant yet produce mild symptoms on the hosts(5). To monitor the silencing efficiency, GrCLA1, a homolog of the Arabidopsis Cloroplastos alterados 1 gene (AtCLA1) in cotton, has been cloned and inserted into the VIGS binary vector pYL156. The CLA1 gene is involved in chloroplast development(6), and previous studies have shown that loss-of-function of AtCLA1 resulted in an albino phenotype on true leaves(7), providing an excellent visual marker for silencing efficiency.
At approximately two weeks post Agrobacterium infiltration, the albino phenotype started to appear on the true leaves, with 100% silencing efficiency in all replicated experiments. The silencing of endogenous gene expression was also confirmed by RT-PCR analysis. Significantly, silencing could potently occur in all the cultivars we tested, including various commercially grown varieties in Texas. This rapid and efficient Agrobacterium-mediated VIGS assay provides a very powerful tool for rapid large-scale analysis of gene functions at genome-wide level in cotton. PMID:21876527
Functional dissection of drought-responsive gene expression patterns in Cynodon dactylon L.

PubMed

Kim, Changsoo; Lemke, Cornelia; Paterson, Andrew H

2009-05-01

Water deficit is one of the main abiotic factors that affect plant productivity in subtropical regions. To identify genes induced during the water stress response in Bermudagrass (Cynodon dactylon), cDNA macroarrays were used. The macroarray analysis identified 189 drought-responsive candidate genes from C. dactylon, of which 120 were up-regulated and 69 were down-regulated. The candidate genes were classified into seven groups by cluster analysis of expression levels across two intensities and three durations of imposed stress. Annotation using BLASTX suggested that up-regulated genes may be involved in proline biosynthesis, signal transduction pathways, protein repair systems, and removal of toxins, while down-regulated genes were mostly related to basic plant metabolism such as photosynthesis and glycolysis. The functional classification of gene ontology (GO) was consistent with the BLASTX results, also suggesting some crosstalk between abiotic and biotic stress. Comparative analysis of cis-regulatory elements from the candidate genes implicated specific elements in drought response in Bermudagrass. Although only a subset of genes was studied, Bermudagrass shared many drought-responsive genes and cis-regulatory elements with other botanical models, supporting a strategy of cross-taxon application of drought-responsive genes, regulatory cues, and physiological-genetic information.
Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

PubMed

Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

2017-01-01

Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).
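The gene/protein list enrichment analysis mentioned above is commonly computed as a one-sided hypergeometric test. The abstract does not say which statistic MetaCore or KPA actually uses, so the following is only a minimal sketch of that general technique; the function name and its parameters are hypothetical and are not part of either product.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes that belong to the category
    sample:     size of the experimental gene list
    overlap:    genes in the list that belong to the category
    (All four names are illustrative assumptions.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every outcome at least as extreme
    # as the observed overlap.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value indicates that the category is over-represented in the gene list relative to the background; in practice the p-values for many categories would also need a multiple-testing correction.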
Integrated annotation and analysis of in situ hybridization images using the ImAnno system: application to the ear and sensory organs of the fetal mouse.

PubMed

Romand, Raymond; Ripp, Raymond; Poidevin, Laetitia; Boeglin, Marcel; Geffers, Lars; Dollé, Pascal; Poch, Olivier

2015-01-01

An in situ hybridization (ISH) study was performed on 2000 murine genes representing around 10% of the protein-coding genes present in the mouse genome using data generated by the EURExpress consortium. This study was carried out in 25 tissues of late gestation embryos (E14.5), with a special emphasis on the developing ear and on five distinct developing sensory organs, including the cochlea, the vestibular receptors, the sensory retina, the olfactory organ, and the vibrissae follicles. The results obtained from an analysis of more than 11,000 micrographs have been integrated in a newly developed knowledgebase, called ImAnno. In addition to managing the multilevel micrograph annotations performed by human experts, ImAnno provides public access to various integrated databases and tools. Thus, it facilitates the analysis of complex ISH gene expression patterns, as well as functional annotation and interaction of gene sets. It also provides direct links to human pathways and diseases. Hierarchical clustering of expression patterns in the 25 tissues revealed three main branches corresponding to tissues with common functions and/or embryonic origins. To illustrate the integrative power of ImAnno, we explored the expression, function and disease traits of the sensory epithelia of the five presumptive sensory organs. The study identified 623 genes (out of 2000) concomitantly expressed in the five embryonic epithelia, among which many (∼12%) were involved in human disorders. Finally, various multilevel interaction networks were characterized, highlighting differential functional enrichments of directly or indirectly interacting genes. These analyses reveal an under-representation of "sensory" functions in the sensory gene set, which suggests that E14.5 is a pivotal stage between the developmental stage and the functional phase that will be fully reached only after birth.

GeoChip-based analysis of microbial functional gene diversity in a landfill leachate-contaminated aquifer

USGS Publications Warehouse

Lu, Zhenmei; He, Zhili; Parisi, Victoria A.; Kang, Sanghoon; Deng, Ye; Van Nostrand, Joy D.; Masoner, Jason R.; Cozzarelli, Isabelle M.; Suflita, Joseph M.; Zhou, Jizhong

2012-01-01

The functional gene diversity and structure of microbial communities in a shallow landfill leachate-contaminated aquifer were assessed using a comprehensive functional gene array (GeoChip 3.0). Water samples were obtained from eight wells at the same aquifer depth immediately below a municipal landfill or along the predominant downgradient groundwater flowpath. Functional gene richness and diversity immediately below the landfill and the closest well were considerably lower than those in downgradient wells. Mantel tests and canonical correspondence analysis (CCA) suggested that various geochemical parameters had a significant impact on the subsurface microbial community structure. That is, leachate from the unlined landfill impacted the diversity, composition, structure, and functional potential of groundwater microbial communities as a function of groundwater pH, and concentrations of sulfate, ammonia, and dissolved organic carbon (DOC). Historical geochemical records indicate that all sampled wells chronically received leachate, and the increase in microbial diversity as a function of distance from the landfill is consistent with mitigation of the impact of leachate on the groundwater system by natural attenuation mechanisms.
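Functional gene richness and diversity of the kind compared across wells above are often summarized as the count of detected genes and the Shannon index of their signal intensities. The abstract does not state which indices were used, so this is a minimal sketch under that assumption; the function names and the idea of passing one list of normalized probe intensities per sample are illustrative only.

```python
from math import log


def richness(intensities):
    """Number of functional genes detected (signal above zero)."""
    return sum(1 for x in intensities if x > 0)


def shannon_index(intensities):
    """Shannon diversity H = -sum(p_i * ln p_i) over detected genes,
    where p_i is a gene's share of the total signal in the sample."""
    detected = [x for x in intensities if x > 0]
    total = sum(detected)
    if total == 0:
        return 0.0
    return -sum((x / total) * log(x / total) for x in detected)
```

With these definitions, a sample dominated by a few genes scores lower than a sample whose signal is spread evenly over many genes, which is the direction of the contrast reported between the wells near the landfill and the downgradient wells.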
Diverse types of genetic variation converge on functional gene networks involved in schizophrenia.

PubMed

Gilman, Sarah R; Chang, Jonathan; Xu, Bin; Bawa, Tejdeep S; Gogos, Joseph A; Karayiorgou, Maria; Vitkup, Dennis

2012-12-01

Despite the successful identification of several relevant genomic loci, the underlying molecular mechanisms of schizophrenia remain largely unclear. We developed a computational approach (NETBAG+) that allows an integrated analysis of diverse disease-related genetic data using a unified statistical framework. The application of this approach to schizophrenia-associated genetic variations, obtained using unbiased whole-genome methods, allowed us to identify several cohesive gene networks related to axon guidance, neuronal cell mobility, synaptic function and chromosomal remodeling. The genes forming the networks are highly expressed in the brain, with higher brain expression during prenatal development. The identified networks are functionally related to genes previously implicated in schizophrenia, autism and intellectual disability. A comparative analysis of copy number variants associated with autism and schizophrenia suggests that although the molecular networks implicated in these distinct disorders may be related, the mutations associated with each disease are likely to lead, at least on average, to different functional consequences.
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence

PubMed Central

Nepal, Madhav P; Benson, Benjamin V

2015-01-01

Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for future soybean crop improvement. PMID:25922568
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence.

PubMed

Nepal, Madhav P; Benson, Benjamin V

2015-01-01

Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for future soybean crop improvement.
[Analysis of horizontal transfer gene of Bombyx mori NPV].

PubMed

Duan, Hai-Rong; Qiu, De-Bin; Gong, Cheng-Liang; Huang, Mo-Li

2011-06-01

To study the genetic characters and evolutionary origin of the baculovirus genome, a comprehensive homology search and phylogenetic analysis of the complete genomes of Bombyx mori NPV and Bombyx mori were performed. Three horizontally transferred genes (inhibitor of apoptosis, chitinase, and UDP-glucosyltransferase) were identified, and there was evidence that all of these genes were derived from the insect host. The analysis showed many differences between the horizontally transferred genes and the genes of the whole genome in features such as nucleotide composition, codon usage bias and selection pressure. These results reconfirmed that the horizontally transferred genes are exogenous. The analysis of gene function suggested that horizontally transferred genes acquired from an ancestral host insect can increase the efficiency of baculovirus transmission.
Transcriptional profiling of host gene expression in chicken embryo lung cells infected with laryngotracheitis virus

PubMed Central

2010-01-01

Background Infection by infectious laryngotracheitis virus (ILTV; gallid herpesvirus 1) causes acute respiratory disease in chickens, often with high mortality. To better understand host-ILTV interactions at the host transcriptional level, a microarray analysis was performed using 4 × 44 K Agilent chicken custom oligo microarrays. Results Microarrays were hybridized using the two-color hybridization method with total RNA extracted from ILTV-infected chicken embryo lung cells at 0, 1, 3, 5, and 7 days post infection (dpi). Results showed that 789 genes were differentially expressed in response to ILTV infection, including genes involved in the immune system (cytokines, chemokines, MHC, and NF-κB), cell cycle regulation (cyclin B2, CDK1, and CKI3), matrix metalloproteinases (MMPs) and cellular metabolism. Differential expression of 20 out of the 789 genes was confirmed by quantitative reverse transcription-PCR (qRT-PCR). A bioinformatics tool (Ingenuity Pathway Analysis) used to analyze biological functions and pathways in the group of 789 differentially expressed genes revealed 21 possible gene networks with intermolecular connections among 275 functionally identified genes. These 275 genes were classified into a number of functional groups that included cancer, genetic disorder, cellular growth and proliferation, and cell death. Conclusion The results of this study provide comprehensive knowledge of global gene expression and of the biological functionalities of differentially expressed genes in chicken embryo lung cells in response to ILTV infection. PMID:20663125
Transcriptomic Analysis and Meta-Analysis of Human Granulosa and Cumulus Cells

PubMed Central

Burnik Papler, Tanja; Vrtacnik Bokal, Eda; Maver, Ales; Kopitar, Andreja Natasa; Lovrečić, Luca

2015-01-01

Specific gene expression in oocytes and their surrounding cumulus (CC) and granulosa (GC) cells is needed for successful folliculogenesis and oocyte maturation. The aim of the present study was to compare genome-wide gene expression and biological functions of human GC and CC. Individual GC and CC were derived from 37 women undergoing IVF procedures. Gene expression analysis was performed using microarrays, followed by a meta-analysis. Results were validated using quantitative real-time PCR. There were 6029 differentially expressed genes (q < 10^-4), of which 650 genes had a log2 FC ≥ 2. After the meta-analysis there were 3156 differentially expressed genes. Among these were genes that have not previously been reported in human somatic follicular cells, such as prokineticin 2 (PROK2), more highly expressed in GC, and pregnancy up-regulated nonubiquitous CaM kinase (PNCK), more highly expressed in CC. Pathways such as inflammatory response and angiogenesis were enriched in GC, whereas in CC, cell differentiation and multicellular organismal development were among the enriched pathways. In conclusion, the transcriptomes of GC and CC, as well as their biological functions, are distinctive for each cell subpopulation. By describing novel genes such as PROK2 and PNCK, expressed in GC and CC, we upgraded the existing data on human follicular biology. PMID:26313571
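The two thresholds quoted in this abstract (a q-value cutoff of 10^-4 and a log2 fold change of at least 2) amount to a simple filter over a table of per-gene results. The sketch below assumes a hypothetical input of (gene, q-value, log2 fold change) tuples and assumes the fold-change cutoff is applied to the absolute value so that both directions are kept; the abstract does not specify either point.

```python
def filter_de_genes(results, q_max=1e-4, min_abs_log2fc=2.0):
    """Return the names of genes passing both thresholds.

    results: iterable of (gene, q_value, log2_fold_change) tuples
             (a hypothetical layout chosen for illustration).
    """
    return [
        gene
        for gene, q_value, log2fc in results
        if q_value < q_max and abs(log2fc) >= min_abs_log2fc
    ]
```

Calling it with `min_abs_log2fc=0.0` keeps every gene that passes the q-value cutoff alone, which corresponds to the larger of the two gene counts reported.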
Probing the Xenopus laevis inner ear transcriptome for biological function

PubMed Central

2012-01-01

Background The senses of hearing and balance depend upon mechanoreception, a process that originates in the inner ear and shares features across species. Amphibians have been widely used for physiological studies of mechanotransduction by sensory hair cells. In contrast, much less is known of the genetic basis of auditory and vestibular function in this class of animals. Among amphibians, the genus Xenopus is a well-characterized genetic and developmental model that offers unique opportunities for inner ear research because of the amphibian capacity for tissue and organ regeneration. For these reasons, we implemented a functional genomics approach as a means to undertake a large-scale analysis of the Xenopus laevis inner ear transcriptome through microarray analysis. Results Microarray analysis uncovered genes within the X. laevis inner ear transcriptome associated with inner ear function and impairment in other organisms, thereby supporting the inclusion of Xenopus in cross-species genetic studies of the inner ear. The use of gene categories (inner ear tissue; deafness; ion channels; ion transporters; transcription factors) facilitated the assignment of functional significance to probe set identifiers. We enhanced the biological relevance of our microarray data by using a variety of curation approaches to increase the annotation of the Affymetrix GeneChip® Xenopus laevis Genome array. In addition, annotation analysis revealed the prevalence of inner ear transcripts represented by probe set identifiers that lack functional characterization. Conclusions We identified an abundance of targets for genetic analysis of auditory and vestibular function. The orthologues to human genes with known inner ear function and the highly expressed transcripts that lack annotation are particularly interesting candidates for future analyses. 
We used informatics approaches to impart biologically relevant information to the Xenopus inner ear transcriptome, thereby addressing the impediment imposed by insufficient gene annotation. These findings heighten the relevance of Xenopus as a model organism for genetic investigations of inner ear organogenesis, morphogenesis, and regeneration. PMID:22676585
Cell cloning-based transcriptome analysis in Rett patients: relevance to the pathogenesis of Rett syndrome of new human MeCP2 target genes.

PubMed

Nectoux, J; Fichou, Y; Rosas-Vargas, H; Cagnard, N; Bahi-Buisson, N; Nusbaum, P; Letourneur, F; Chelly, J; Bienvenu, T

2010-07-01

More than 90% of Rett syndrome (RTT) patients have heterozygous mutations in the X-linked methyl-CpG binding protein 2 (MECP2) gene that encodes the methyl-CpG-binding protein 2, a transcriptional modulator. Because MECP2 is subjected to X chromosome inactivation (XCI), girls with RTT either express the wild-type or mutant allele in each individual cell. To test the consequences of MECP2 mutations resulting from a genome-wide transcriptional dysregulation and to identify its target genes in a system that circumvents the functional mosaicism resulting from XCI, we carried out gene expression profiling of clonal populations derived from fibroblast primary cultures expressing exclusively either the wild-type or the mutant MECP2 allele. Clonal cultures were obtained from skin biopsy of three RTT patients carrying either a non-sense or a frameshift MECP2 mutation. For each patient, gene expression profiles of wild-type and mutant clones were compared by oligonucleotide expression microarray analysis. Firstly, clustering analysis classified the RTT patients according to their genetic background and MECP2 mutation. Secondly, expression profiling by microarray analysis and quantitative RT-PCR indicated four up-regulated genes and five down-regulated genes significantly dysregulated in all our statistical analysis, including excellent potential candidate genes for the understanding of the pathophysiology of this neurodevelopmental disease. Thirdly, chromatin immunoprecipitation analysis confirmed MeCP2 binding to respective CpG islands in three out of four up-regulated candidate genes and sequencing of bisulphite-converted DNA indicated that MeCP2 preferentially binds to methylated-DNA sequences. Most importantly, the finding that at least two of these genes (BMCC1 and RNF182) were shown to be involved in cell survival and/or apoptosis may suggest that impaired MeCP2 function could alter the survival of neurons thus compromising brain function without inducing cell death.
Characteristics of microbial community functional structure of a biological coking wastewater treatment system.

PubMed

Joshi, Dev Raj; Zhang, Yu; Zhang, Hong; Gao, Yingxin; Yang, Min

2018-01-01

Nitrogenous heterocyclic compounds are key pollutants in coking wastewater; however, the functional potential of microbial communities for biodegradation of such contaminants during biological treatment is still elusive. Herein, a high-throughput functional gene array (GeoChip 5.0) in combination with Illumina HiSeq2500 sequencing was used to compare and characterize the microbial community functional structure in a long-running (500 days) bench-scale bioreactor treating coking wastewater, with a control system treating synthetic wastewater. Despite the inhibitory toxic pollutants, GeoChip 5.0 detected almost all key functional gene categories (average 61,940 genes) in the coking wastewater sludge. With higher abundance, aromatic ring cleavage dioxygenase genes including multi ring1,2diox; one ring2,3diox; catechol represented significant functional potential for degradation of aromatic pollutants, which was further confirmed by the Illumina HiSeq2500 analysis results. Response ratio analysis revealed that three nitrogenous compound degrading genes, nbzA (nitro-aromatics), tdnB (aniline), and scnABC (thiocyanate), were unique to coking wastewater treatment, which might be a strong cause of the increased ammonia level during the aerobic process. Additionally, HiSeq2500 elucidated carbazole and isoquinoline degradation genes in the system. These findings expand our understanding of the functional potential of microbial communities to remove organic nitrogenous pollutants; hence they will be useful in optimization strategies for biological treatment of coking wastewater.
Molecular analysis of the glucocerebrosidase gene locus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Winfield, S.L.; Martin, B.M.; Fandino, A.

1994-09-01

Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing ~360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5′, central, or 3′ sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5′ end of the functional gene and 16 kb of 5′ flanking region and approximately 15 kb of 3′ flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3′ flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5′ flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.
Clinical and multiple gene expression variables in survival analysis of breast cancer: Analysis with the hypertabastic survival model

PubMed Central

2012-01-01

Background We explore the benefits of applying a new proportional hazard model to analyze survival of breast cancer patients. As a parametric model, the hypertabastic survival model offers a closer fit to experimental data than Cox regression, and furthermore provides explicit survival and hazard functions which can be used as additional tools in the survival analysis. In addition, one of our main concerns is utilization of multiple gene expression variables. Our analysis treats the important issue of interaction of different gene signatures in the survival analysis. Methods The hypertabastic proportional hazards model was applied in survival analysis of breast cancer patients. This model was compared, using statistical measures of goodness of fit, with models based on the semi-parametric Cox proportional hazards model and the parametric log-logistic and Weibull models. The explicit functions for hazard and survival were then used to analyze the dynamic behavior of hazard and survival functions. Results The hypertabastic model provided the best fit among all the models considered. Use of multiple gene expression variables also provided a considerable improvement in the goodness of fit of the model, as compared to use of only one. By utilizing the explicit survival and hazard functions provided by the model, we were able to determine the magnitude of the maximum rate of increase in hazard, and the maximum rate of decrease in survival, as well as the times when these occurred. We explore the influence of each gene expression variable on these extrema. Furthermore, in the cases of continuous gene expression variables, represented by a measure of correlation, we were able to investigate the dynamics with respect to changes in gene expression. Conclusions We observed that use of three different gene signatures in the model provided a greater combined effect and allowed us to assess the relative importance of each in determination of outcome in this data set. 
These results point to the potential to combine gene signatures to a greater effect in cases where each gene signature represents some distinct aspect of the cancer biology. Furthermore we conclude that the hypertabastic survival models can be an effective survival analysis tool for breast cancer patients. PMID:23241496
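As a rough illustration of how explicit parametric survival and hazard functions can be interrogated for extrema, the sketch below uses the Weibull proportional hazards model, one of the comparison models named in the abstract, as a stand-in; it is not the hypertabastic model. The function names, parameter values and the linear predictor `lin_pred` (standing in for a weighted sum of gene expression covariates) are illustrative assumptions, not taken from the study.

```python
import math

def weibull_survival(t, shape, scale, lin_pred=0.0):
    """S(t | x) = exp(-(t / scale)^shape * exp(lin_pred)) under proportional hazards."""
    return math.exp(-((t / scale) ** shape) * math.exp(lin_pred))

def weibull_hazard(t, shape, scale, lin_pred=0.0):
    """h(t | x) = h0(t) * exp(lin_pred), with h0 the Weibull baseline hazard."""
    return (shape / scale) * (t / scale) ** (shape - 1) * math.exp(lin_pred)

def time_of_steepest_survival_drop(shape, scale, lin_pred=0.0,
                                   t_max=50.0, steps=50000):
    """Grid search for the time at which -dS/dt = h(t) * S(t) is largest."""
    best_t, best_rate = 0.0, -1.0
    for i in range(1, steps + 1):
        t = t_max * i / steps
        rate = (weibull_hazard(t, shape, scale, lin_pred)
                * weibull_survival(t, shape, scale, lin_pred))
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t, best_rate

def closed_form_mode(shape, scale, lin_pred=0.0):
    """Analytic location of the same maximum (valid for shape > 1)."""
    # Under proportional hazards the Weibull keeps its shape; only the scale shrinks.
    eff_scale = scale * math.exp(-lin_pred / shape)
    return eff_scale * ((shape - 1) / shape) ** (1 / shape)

# Hypothetical parameters: a larger linear predictor moves the steepest drop earlier.
baseline_t, _ = time_of_steepest_survival_drop(2.0, 10.0)
high_risk_t, _ = time_of_steepest_survival_drop(2.0, 10.0, lin_pred=0.5)
```

Because the functions are explicit, the extremum can be cross-checked against the closed form; a semi-parametric Cox fit does not provide such a formula, since its baseline hazard is left unspecified.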
Effect of the absolute statistic on gene-sampling gene-set analysis methods.

PubMed

Nam, Dougu

2017-06-01

Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of the absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating characteristic curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
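The mechanics of a gene-sampling test with an absolute gene statistic can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the set score is taken to be the mean of per-gene statistics, the null distribution comes from random gene sets of the same size, and all gene names and values are simulated. It shows only how the absolute statistic captures bidirectional change; it does not reproduce the reported effect on the false-positive rate.

```python
import random

def gene_sampling_pvalue(gene_stats, gene_set, use_absolute=True,
                         n_perm=2000, seed=0):
    """Empirical p-value from resampling random gene sets of equal size."""
    rng = random.Random(seed)
    transform = abs if use_absolute else (lambda v: v)
    scores = {g: transform(s) for g, s in gene_stats.items()}
    genes = list(scores)
    k = len(gene_set)
    observed = sum(scores[g] for g in gene_set) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)  # random gene set, drawn without replacement
        if sum(scores[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Simulated data (hypothetical): 200 genes with standard-normal statistics,
# of which a 10-gene set is strongly changed in both directions.
toy_rng = random.Random(1)
stats = {f"g{i}": toy_rng.gauss(0, 1) for i in range(200)}
bidirectional_set = [f"g{i}" for i in range(10)]
for i, g in enumerate(bidirectional_set):
    stats[g] = 3.0 if i % 2 == 0 else -3.0

p_abs = gene_sampling_pvalue(stats, bidirectional_set, use_absolute=True)
p_signed = gene_sampling_pvalue(stats, bidirectional_set, use_absolute=False)
```

In this toy case the signed mean of the set is zero, so the signed test sees nothing unusual, while the absolute statistic flags the set.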
Filling gaps in PPAR-alpha signaling through comparative nutrigenomics analysis.

PubMed

Cavalieri, Duccio; Calura, Enrica; Romualdi, Chiara; Marchi, Emmanuela; Radonjic, Marijana; Van Ommen, Ben; Müller, Michael

2009-12-11

The application of high-throughput genomic tools in nutrition research is a widespread practice. However, it is becoming increasingly clear that the outcome of individual expression studies is insufficient for a comprehensive understanding of such a complex field. The availability of large amounts of expression data in public repositories has opened up new challenges in microarray data analysis. We have focused on PPARalpha, a ligand-activated transcription factor that functions as a fatty acid sensor and controls the expression of a large set of genes in various metabolic organs such as liver, small intestine and heart. The function of PPARalpha is strictly connected to the function of its target genes and, although many of these have already been identified, major elements of its physiological function remain to be uncovered. To further investigate the function of PPARalpha, we applied a cross-species meta-analysis approach to integrate sixteen microarray datasets studying high fat diet and PPARalpha signal perturbations in different organisms. We identified 164 genes (MDEGs) that were consistently differentially expressed in response to a high fat diet or to perturbations in PPAR signalling. In particular, we found five genes in yeast that were highly conserved and homologous to PPARalpha targets in mammals, making them potential candidates for use as models of the equivalent mammalian genes. Moreover, screening the MDEGs for all known transcription factor binding sites, and comparing the results with a human genome-wide screen for Peroxisome Proliferator Response Elements (PPRE), enabled us to identify 20 new potential candidate genes that show both a binding site and a change in expression in the conditions studied. Lastly, we found a non-random localization of the differentially expressed genes in the genome.
The results presented are potentially of great interest for summarizing the currently available expression data, exploiting the power of in silico analysis filtered by evolutionary conservation. The analysis enabled us to indicate potential gene candidates that could fill in the gaps with regard to the signalling of PPARalpha. Moreover, the non-random localization of the differentially expressed genes in the genome suggests that epigenetic mechanisms are of importance in the regulation of the transcription operated by PPARalpha.
RoMo: An efficient strategy for functional mosaic analysis via stochastic Cre recombination and gene targeting in the ROSA26 locus.

PubMed

Movahedi, Kiavash; Wiegmann, Robert; De Vlaminck, Karen; Van Ginderachter, Jo A; Nikolaev, Viacheslav O

2018-07-01

Functional mosaic analysis allows for the direct comparison of mutant cells with differentially marked control cells in the same organism. While this offers a powerful approach for elucidating the role of specific genes or signalling pathways in cell populations of interest, genetic strategies for generating functional mosaicism remain challenging. We describe a novel and streamlined approach for functional mosaic analysis, which combines stochastic Cre/lox recombination with gene targeting in the ROSA26 locus. With the RoMo strategy a cell population of interest is randomly split into a cyan fluorescent and red fluorescent subset, of which the latter overexpresses a chosen transgene. To integrate this approach into high-throughput gene targeting initiatives, we developed a procedure that utilizes Gateway cloning for the generation of new targeting vectors. RoMo can be used for gain-of-function experiments or for altering signaling pathways in a mosaic fashion. To demonstrate this, we developed RoMo-dnGs mice, in which Cre-recombined red fluorescent cells co-express a dominant-negative Gs protein. RoMo-dnGs mice allowed us to inhibit G protein-coupled receptor activation in a fraction of cells, which could then be directly compared to differentially marked control cells in the same animal. We demonstrate how RoMo-dnGs mice can be used to obtain mosaicism in the brain and in peripheral organs for various cell types. RoMo offers an efficient new approach for functional mosaic analysis that extends the current toolbox and may reveal important new insights into in vivo gene function. © 2018 Wiley Periodicals, Inc.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.

PubMed

Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida

2012-07-20

Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet not only identifies functional modules and pathways related to the studied diseases but also yields information that can be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
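The abstract does not describe how atBioNet implements SCAN, so the following is only a minimal sketch of the structural similarity and core test on which SCAN-style clustering is built, as that algorithm is commonly defined. The toy graph, the node names and the `eps` and `mu` values are arbitrary assumptions for illustration.

```python
import math

def structural_similarity(adj, u, v):
    """Shared closed neighbourhood, normalised by the geometric mean of sizes."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

def core_nodes(adj, eps=0.7, mu=3):
    """Nodes with at least mu members (self included) at similarity >= eps."""
    cores = set()
    for u in adj:
        closed = adj[u] | {u}
        eps_nbrs = {v for v in closed if structural_similarity(adj, u, v) >= eps}
        if len(eps_nbrs) >= mu:
            cores.add(u)
    return cores

# Hypothetical interaction graph: two triangles (A-B-C and D-E-F) joined by
# the bridge C-D, plus a peripheral protein G attached only to C.
toy_adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D", "G"},
    "D": {"C", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
    "G": {"C"},
}

toy_cores = core_nodes(toy_adj)
bridge_similarity = structural_similarity(toy_adj, "C", "D")
```

Edges inside a triangle score high because their endpoints share most neighbours, while the bridge C-D scores low, which is what lets a similarity threshold separate the two modules; G fails the core test and would be treated as peripheral.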
Virus induced gene silencing (VIGS) for functional analysis of wheat genes involved in Zymoseptoria tritici susceptibility and resistance.

PubMed

Lee, Wing-Sham; Rudd, Jason J; Kanyuka, Kostya

2015-06-01

Virus-induced gene silencing (VIGS) has emerged as a powerful reverse genetic technology in plants supplementary to stable transgenic RNAi and, in certain species, as a viable alternative approach for gene functional analysis. The RNA virus Barley stripe mosaic virus (BSMV) was developed as a VIGS vector in the early 2000s and since then it has been used to study the function of wheat genes. Several variants of BSMV vectors are available, with some requiring in vitro transcription of infectious viral RNA, while others rely on in planta production of viral RNA from DNA-based vectors delivered to plant cells either by particle bombardment or Agrobacterium tumefaciens. We adapted the latest generation of binary BSMV VIGS vectors for the identification and study of wheat genes of interest involved in interactions with Zymoseptoria tritici and here present detailed, up-to-date protocols. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
A Compendium of Canine Normal Tissue Gene Expression

PubMed Central

Chen, Qing-Rong; Wen, Xinyu; Khan, Javed; Khanna, Chand

2011-01-01

Background Our understanding of disease is increasingly informed by changes in gene expression between normal and abnormal tissues. The release of the canine genome sequence in 2005 provided an opportunity to better understand human health and disease using the dog as a clinically relevant model. Accordingly, we now present the first genome-wide, canine normal tissue gene expression compendium with corresponding human cross-species analysis. Methodology/Principal Findings The Affymetrix platform was utilized to catalogue gene expression signatures of 10 normal canine tissues including: liver, kidney, heart, lung, cerebrum, lymph node, spleen, jejunum, pancreas and skeletal muscle. The quality of the database was assessed in several ways. Organ-defining gene sets were identified for each tissue and functional enrichment analysis revealed themes consistent with known physio-anatomic functions for each organ. In addition, a comparison of orthologous gene expression between matched canine and human normal tissues uncovered remarkable similarity. To demonstrate the utility of this dataset, novel canine gene annotations were established based on comparative analysis of dog and human tissue-selective gene expression and manual curation of canine probeset mapping. Public access, using infrastructure identical to that currently in use for human normal tissues, has been established and allows for additional comparisons across species. Conclusions/Significance These data advance our understanding of the canine genome through a comprehensive analysis of gene expression in a diverse set of tissues, contributing to improved functional annotation that has been lacking. Importantly, it will be used to inform future studies of disease in the dog as a model for human translational research and provides a novel resource to the community at large. PMID:21655323
Genome-wide identification of the MADS-box transcription factor family in pear (Pyrus bretschneideri) reveals evolution and functional divergence.

PubMed

Wang, Runze; Ming, Meiling; Li, Jiaming; Shi, Dongqing; Qiao, Xin; Li, Leiting; Zhang, Shaoling; Wu, Jun

2017-01-01

MADS-box transcription factors play significant roles in plant developmental processes such as floral organ conformation, flowering time, and fruit development. Pear (Pyrus), the third most important temperate fruit crop, has been fully sequenced. However, there is limited information about the MADS family and its functional divergence in pear. In this study, a total of 95 MADS-box genes were identified in the pear genome and classified into two types by phylogenetic analysis. Type I MADS-box genes were divided into three subfamilies and type II genes into 14 subfamilies. Synteny analysis suggested that whole-genome duplications have played key roles in the expansion of the MADS family, followed by rearrangement events. Purifying selection was the primary force driving MADS-box gene evolution in pear, and one gene pair presented three codon sites under positive selection. Full-scale expression information for PbrMADS genes in vegetative and reproductive organs was provided and confirmed by transcriptional and reverse transcription PCR analysis. Furthermore, the PbrMADS11(12) gene, together with partners PbMYB10 and PbbHLH3, was confirmed to activate the promoters of the structural genes in the anthocyanin pathway of red pear through dual luciferase assay. In addition, PbrMADS11 and PbrMADS12 were deduced to be involved in the regulation of anthocyanin synthesis in response to light and temperature changes. These results provide a solid foundation for future functional analysis of PbrMADS genes in different biological processes, especially pigmentation in pear.
Genome-wide identification of the MADS-box transcription factor family in pear (Pyrus bretschneideri) reveals evolution and functional divergence

PubMed Central

Li, Jiaming; Shi, Dongqing; Qiao, Xin; Li, Leiting; Zhang, Shaoling

2017-01-01

MADS-box transcription factors play significant roles in plant developmental processes such as floral organ conformation, flowering time, and fruit development. Pear (Pyrus), the third most important temperate fruit crop, has been fully sequenced. However, there is limited information about the MADS family and its functional divergence in pear. In this study, a total of 95 MADS-box genes were identified in the pear genome and classified into two types by phylogenetic analysis. Type I MADS-box genes were divided into three subfamilies and type II genes into 14 subfamilies. Synteny analysis suggested that whole-genome duplications have played key roles in the expansion of the MADS family, followed by rearrangement events. Purifying selection was the primary force driving MADS-box gene evolution in pear, and one gene pair presented three codon sites under positive selection. Full-scale expression information for PbrMADS genes in vegetative and reproductive organs was provided and confirmed by transcriptional and reverse transcription PCR analysis. Furthermore, the PbrMADS11(12) gene, together with partners PbMYB10 and PbbHLH3, was confirmed to activate the promoters of the structural genes in the anthocyanin pathway of red pear through dual luciferase assay. In addition, PbrMADS11 and PbrMADS12 were deduced to be involved in the regulation of anthocyanin synthesis in response to light and temperature changes. These results provide a solid foundation for future functional analysis of PbrMADS genes in different biological processes, especially pigmentation in pear. PMID:28924499

Regulatory network analysis of Epstein-Barr virus identifies functional modules and hub genes involved in infectious mononucleosis.

PubMed

Poorebrahim, Mansour; Salarian, Ali; Najafi, Saeideh; Abazari, Mohammad Foad; Aleagha, Maryam Nouri; Dadras, Mohammad Nasr; Jazayeri, Seyed Mohammad; Ataei, Atousa; Poortahmasebi, Vahdat

2017-05-01

Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis (IM) and establishes lifetime infection associated with a variety of cancers and autoimmune diseases. The aim of this study was to develop an integrative gene regulatory network (GRN) approach that overlays gene expression data to identify the representative subnetworks for IM and EBV latent infection (LI). After identifying differentially expressed genes (DEGs) in both IM and LI gene expression profiles, functional annotations were applied using gene ontology (GO) and BiNGO tools, and construction of GRNs, topological analysis and identification of modules were carried out using several plugins of Cytoscape. In parallel, a human-EBV GRN was generated using the Hu-Vir database for further analyses. Our analysis revealed that the majority of DEGs in both IM and LI were involved in cell-cycle and DNA repair processes. However, these genes showed a significant negative correlation in the IM and LI states. Furthermore, cyclin-dependent kinase 2 (CDK2), a hub gene with the highest centrality score, appeared to be the key player in cell cycle regulation in IM disease. The most significant functional modules in the IM and LI states were involved in the regulation of the cell cycle and apoptosis, respectively. Human-EBV network analysis revealed several direct targets of EBV proteins during IM disease. Our study provides an important first report on the response to IM/LI EBV infection in humans. An important aspect of our data was the upregulation of genes associated with cell cycle progression and proliferation.
Genome-Wide Gene Expression in relation to Age in Large Laboratory Cohorts of Drosophila melanogaster

PubMed Central

Carlson, Kimberly A.; Gardner, Kylee; Pashaj, Anjeza; Carlson, Darby J.; Yu, Fang; Eudy, James D.; Zhang, Chi; Harshman, Lawrence G.

2015-01-01

Aging is a complex process characterized by a steady decline in an organism's ability to perform life-sustaining tasks. In the present study, two cages of approximately 12,000 mated Drosophila melanogaster females were used as a source of RNA from individuals sampled frequently as a function of age. A linear model for microarray data method was used for the microarray analysis to adjust for the box effect; it identified 1,581 candidate aging genes. Cluster analyses using a self-organizing map algorithm on the 1,581 significant genes identified gene expression patterns across different ages. Genes involved in immune system function and regulation, chorion assembly and function, and metabolism were all significantly differentially expressed as a function of age. The temporal pattern of data indicated that gene expression related to aging is affected relatively early in life span. In addition, the temporal variance in gene expression in immune function genes was compared to a random set of genes. There was an increase in the variance of gene expression within each cohort, which was not observed in the set of random genes. This observation is compatible with the hypothesis that D. melanogaster immune function genes lose control of gene expression as flies age. PMID:26090231
Microarray RNA expression analysis of cerebral white matter lesions reveals changes in multiple functional pathways.

PubMed

Simpson, Julie E; Hosny, Ola; Wharton, Stephen B; Heath, Paul R; Holden, Hazel; Fernando, Malee S; Matthews, Fiona; Forster, Gill; O'Brien, John T; Barber, Robert; Kalaria, Raj N; Brayne, Carol; Shaw, Pamela J; Lewis, Claire E; Ince, Paul G

2009-02-01

White matter lesions (WML) in brain aging are linked to dementia and depression. Ischemia contributes to their pathogenesis but other mechanisms may contribute. We used RNA microarray analysis with functional pathway grouping as an unbiased approach to investigate evidence for additional pathogenetic mechanisms. WML were identified by MRI and pathology in brains donated to the Medical Research Council Cognitive Function and Ageing Study. RNA was extracted to compare WML with nonlesional white matter samples from cases with lesions (WM[L]), and from cases with no lesions (WM[C]) using RNA microarray and pathway analysis. Functional pathways were validated for selected genes by quantitative real-time polymerase chain reaction and immunocytochemistry. We identified 8 major pathways in which multiple genes showed altered RNA transcription (immune regulation, cell cycle, apoptosis, proteolysis, ion transport, cell structure, electron transport, metabolism) among 502 genes that were differentially expressed in WML compared to WM[C]. In WM[L], 409 genes were altered involving the same pathways. Genes selected to validate these microarray data all showed the expected changes in RNA levels and immunohistochemical expression of protein. WML represent areas with a complex molecular phenotype. From this and previous evidence, WML may arise through tissue ischemia but may also reflect the contribution of additional factors like blood-brain barrier dysfunction. Differential expression of genes in WM[L] compared to WM[C] indicates a "field effect" in the seemingly normal surrounding white matter.
[FANCA gene mutation analysis in Fanconi anemia patients].

PubMed

Chen, Fei; Peng, Guang-Jie; Zhang, Kejian; Hu, Qun; Zhang, Liu-Qing; Liu, Ai-Guo

2005-10-01

To screen the FANCA gene mutation and explore the FANCA protein function in Fanconi anemia (FA) patients. FANCA protein expression and its interaction with FANCF were analyzed using Western blot and immunoprecipitation in 3 cases of FA-A. Genomic DNA was used for MLPA analysis followed by sequencing. FANCA protein was undetectable and FANCA and FANCF protein interaction was impaired in these 3 cases of FA-A. Each case of FA-A contained biallelic pathogenic mutations in FANCA gene. No functional FANCA protein was found in these 3 cases of FA-A, and intragenic deletion, frame shift and splice site mutation were the major pathogenic mutations found in FANCA gene.
The gene and the genon concept: a functional and information-theoretic analysis

PubMed Central

Scherrer, Klaus; Jost, Jürgen

2007-01-01

‘Gene’ has become a vague and ill-defined concept. To set the stage for mathematical analysis of gene storage and expression, we return to the original concept of the gene as a function encoded in the genome, basis of genetic analysis, that is a polypeptide or other functional product. The additional information needed to express a gene is contained within each mRNA as an ensemble of signals, added to or superimposed onto the coding sequence. To designate this programme, we introduce the term ‘genon’. Individual genons are contained in the pre-mRNA forming a pre-genon. A genomic domain contains a proto-genon, with the signals of transcription activation in addition to the pre-genon in the transcripts. Some contain several mRNAs and hence genons, to be singled out by RNA processing and differential splicing. The programme in the genon in cis is implemented by corresponding factors of protein or RNA nature contained in the transgenon of the cell or organism. The gene, the cis programme contained in the individual domain and transcript, and the trans programme of factors, can be analysed by information theory. PMID:17353929
[Cloning and functional characterization of phytoene desaturase in Andrographis paniculata].

PubMed

Shen, Qin-qin; Li, Li-xia; Zhan, Peng-lin; Wang, Qiang

2015-10-01

A full-length cDNA of the phytoene desaturase (PDS) gene from Andrographis paniculata was obtained through RACE-PCR. The cDNA sequence consists of 2,224 bp with an intact ORF of 1,752 bp (GenBank: KP982892), encoding a polypeptide of 584 amino acids. Homology analysis showed that the deduced protein has extensive sequence similarity to PDS from other plants and contains a conserved NAD(H)-binding domain of the plant dehydrase cofactor-binding domain at the N-terminus. Phylogenetic analysis demonstrated that ApPDS was more closely related to PDS of Sesamum indicum and Pogostemon cablin. Semi-quantitative RT-PCR analysis revealed that ApPDS was expressed in all aboveground tissues, with the highest expression in leaves. Virus-induced gene silencing (VIGS) was performed to characterize the function of ApPDS in planta. Significant photobleaching was not observed in infiltrated leaves, although the PDS gene was significantly down-regulated in the yellowish areas. To the best of our knowledge, this represents the first report of PDS gene cloning and functional characterization from A. paniculata, which lays the foundation for further investigation of new genes, especially those related to the andrographolide biosynthetic pathway.
Functional analysis of regulatory single-nucleotide polymorphisms.

PubMed

Pampín, Sandra; Rodríguez-Rey, José C

2007-04-01

The identification of regulatory polymorphisms has become a key problem in human genetics. In the past few years there has been a conceptual change in the way in which regulatory single-nucleotide polymorphisms are studied. We review the new approaches and discuss how gene expression studies can contribute to a better knowledge of the genetics of common diseases. New techniques for the association of single-nucleotide polymorphisms with changes in gene expression have been recently developed. This, together with a more comprehensive use of the old in-vitro methods, has produced a great amount of genetic information. When added to current databases, it will help to design better tools for the detection of regulatory single-nucleotide polymorphisms. The identification of functional regulatory single-nucleotide polymorphisms cannot be done by the simple inspection of DNA sequence. In-vivo techniques, based on primer-extension, and the more recently developed 'haploChIP' allow the association of gene variants to changes in gene expression. Gene expression analysis by conventional in-vitro techniques is the only way to identify the functional consequences of regulatory single-nucleotide polymorphisms. The amount of information produced in the last few years will help to refine the tools for the future analysis of regulatory gene variants.
Development of resources for the analysis of gene function in Pucciniomycotina red yeasts.

PubMed

Ianiri, Giuseppe; Wright, Sandra A I; Castoria, Raffaello; Idnurm, Alexander

2011-07-01

The Pucciniomycotina is an important subphylum of basidiomycete fungi but with limited tools to analyze gene functions. Transformation protocols were established for a Sporobolomyces species (strain IAM 13481), the first Pucciniomycotina species with a completed draft genome sequence, to enable assessment of gene function through phenotypic characterization of mutant strains. Transformation markers were the URA3 and URA5 genes that enable selection and counter-selection based on uracil auxotrophy and resistance to 5-fluoroorotic acid. The wild type copies of these genes were cloned into plasmids that were used for transformation of Sporobolomyces sp. by both biolistic and Agrobacterium-mediated approaches. These resources have been deposited and are available from the Fungal Genetics Stock Center. To show that these techniques could be used to elucidate gene functions, the LEU1 gene was targeted for specific homologous replacement, which also demonstrated that this gene is required for the biosynthesis of leucine in basidiomycete fungi. T-DNA insertional mutants were isolated and further characterized, revealing insertions in genes that encode the homologs of Chs7, Erg3, Kre6, Kex1, Pik1, Sad1, Ssu1 and Tlg1. Phenotypic analysis of these mutants reveals both conserved and divergent functions compared with other fungi. Some of these strains exhibit reduced resistance to detergents, the antifungal agent fluconazole or sodium sulfite, or lower recovery from heat stress. While there are current experimental limitations for Sporobolomyces sp. such as the lack of Mendelian genetics for conventional mating, these findings demonstrate the facile nature of at least one Pucciniomycotina species for genetic manipulation and the potential to develop these organisms into new models for understanding gene function and evolution in the fungi. Copyright © 2011 Elsevier Inc. All rights reserved.
Large-scale inference of gene function through phylogenetic annotation of Gene Ontology terms: case study of the apoptosis and autophagy cellular processes.

PubMed

Feuermann, Marc; Gaudet, Pascale; Mi, Huaiyu; Lewis, Suzanna E; Thomas, Paul D

2016-01-01

We previously reported a paradigm for large-scale phylogenomic analysis of gene families that takes advantage of the large corpus of experimentally supported Gene Ontology (GO) annotations. This 'GO Phylogenetic Annotation' approach integrates GO annotations from evolutionarily related genes across ∼100 different organisms in the context of a gene family tree, in which curators build an explicit model of the evolution of gene functions. GO Phylogenetic Annotation models the gain and loss of functions in a gene family tree, which is used to infer the functions of uncharacterized (or incompletely characterized) gene products, even for human proteins that are relatively well studied. Here, we report our results from applying this paradigm to two well-characterized cellular processes, apoptosis and autophagy. This revealed several important observations with respect to GO annotations and how they can be used for function inference. Notably, we applied only a small fraction of the experimentally supported GO annotations to infer function in other family members. The majority of other annotations describe indirect effects, phenotypes or results from high throughput experiments. In addition, we show here how feedback from phylogenetic annotation leads to significant improvements in the PANTHER trees, the GO annotations and GO itself. Thus GO phylogenetic annotation both increases the quantity and improves the accuracy of the GO annotations provided to the research community. We expect these phylogenetically based annotations to be of broad use in gene enrichment analysis as well as other applications of GO annotations. Database URL: http://amigo.geneontology.org/amigo. © The Author(s) 2016. Published by Oxford University Press.
Functional characterization of an apple (Malus x domestica) LysM domain receptor encoding gene for its role in defense response

USDA-ARS's Scientific Manuscript database

Apple gene MDP0000136494 was identified as the only LysM-containing protein-encoding gene that was specifically up-regulated in P. ultimum-infected apple root by a previous transcriptome analysis. In the current study, the functional identity of MDP0000136494 was investigated using combined genomic, tr...
TMEM88, CCL14 and CLEC3B as prognostic biomarkers for prognosis and palindromia of human hepatocellular carcinoma.

PubMed

Zhang, Xin; Wan, Jin-Xiang; Ke, Zun-Ping; Wang, Feng; Chai, Hai-Xia; Liu, Jia-Qiang

2017-07-01

Hepatocellular carcinoma is one of the deadliest and most prevalent cancers, with increasing incidence worldwide. Elucidating genetic driver genes for prognosis and palindromia (recurrence) of hepatocellular carcinoma helps in managing clinical decisions for patients. In this study, high-throughput RNA sequencing data for hepatocellular carcinoma, generated on the IlluminaHiSeq platform, were downloaded from The Cancer Genome Atlas, comprising 330 primary hepatocellular carcinoma patient samples. Stable key genes with differential expression were identified and subjected to Kaplan-Meier survival analysis and Cox proportional hazards testing in R. Driver genes influencing the prognosis of this disease were determined using clustering analysis. Functional analysis of driver genes was performed by literature search and Gene Set Enrichment Analysis. Finally, the selected driver genes were verified using the external dataset GSE40873. A total of 5781 stable key genes were identified, including 156 genes definitely related to the prognosis of hepatocellular carcinoma. Based on the significant key genes, samples were grouped into five clusters, which were further integrated into high- and low-risk classes based on clinical features. TMEM88, CCL14, and CLEC3B were selected as driver genes that clustered high-/low-risk patients successfully (generally, p = 0.0005124445). Finally, survival analysis of the high-/low-risk samples from the external database showed a significant difference, with a p value of 0.0198. In conclusion, the TMEM88, CCL14, and CLEC3B genes were stable and usable for predicting the survival and palindromia time of hepatocellular carcinoma. These genes could function as potential prognostic genes, contributing to improved patient outcomes and survival.
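The study ran its survival analysis in R; for illustration only, the Kaplan-Meier product-limit estimate it relies on can be sketched in a few lines of Python. The follow-up times, event flags and group labels below are invented toy values, not data from the study, and the function name is an assumption.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Events and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve

# Hypothetical follow-up (months) for two risk groups.
high_risk_curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
low_risk_curve = kaplan_meier([10, 20, 25, 30, 36], [0, 1, 0, 0, 0])
```

Comparing two such curves for significance would normally use a log-rank test or a Cox model, which this sketch does not attempt.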
Comprehensive analysis of coding-lncRNA gene co-expression network uncovers conserved functional lncRNAs in zebrafish.

PubMed

Chen, Wen; Zhang, Xuan; Li, Jing; Huang, Shulan; Xiang, Shuanglin; Hu, Xiang; Liu, Changning

2018-05-09

Zebrafish is a well-developed model system for studying developmental processes and human disease. Recent deep-sequencing studies have discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish. However, only a few of them have been functionally characterized. How to take advantage of the mature zebrafish system to investigate lncRNA function and conservation in depth is therefore an intriguing question. We systematically collected and analyzed a series of zebrafish RNA-seq data, then combined them with resources from known databases and the literature. As a result, we obtained by far the most complete dataset of zebrafish lncRNAs, containing 13,604 lncRNA genes (21,128 transcripts) in total. Based on that, a co-expression network of zebrafish coding and lncRNA genes was constructed and analyzed, and used to predict the Gene Ontology (GO) and KEGG annotations of lncRNAs. Meanwhile, we performed a conservation analysis of zebrafish lncRNAs, identifying 1828 conserved zebrafish lncRNA genes (1890 transcripts) that have putative mammalian orthologs. We also found that zebrafish lncRNAs play important roles in regulating the development and function of the nervous system; these conserved lncRNAs show significant sequence and functional conservation with their mammalian counterparts. By integrative data analysis and construction of a coding-lncRNA gene co-expression network, we obtained the most comprehensive dataset of zebrafish lncRNAs to date, as well as their systematic annotations and comprehensive analyses of function and conservation. Our study provides a reliable zebrafish-based platform to explore lncRNA function and mechanism in depth, as well as the lncRNA commonality between zebrafish and human.
Microarray-based determination of anti-inflammatory genes targeted by 6-(methylsulfinyl)hexyl isothiocyanate in macrophages.

PubMed

Chen, Jihua; Uto, Takuhiro; Tanigawa, Shunsuke; Yamada-Kato, Tomeo; Fujii, Makoto; Hou, DE-Xing

2010-01-01

6-(Methylsulfinyl)hexyl isothiocyanate (6-MSITC) is a bioactive ingredient of wasabi [Wasabia japonica (Miq.) Matsumura], which is a popular pungent spice of Japan. To evaluate the anti-inflammatory function and underlying genes targeted by 6-MSITC, gene expression profiling through DNA microarray was performed in mouse macrophages. Among 22,050 oligonucleotides, the expression levels of 406 genes were increased by ≥3-fold in lipopolysaccharide (LPS)-activated RAW264 cells, 238 gene signals of which were attenuated by 6-MSITC (≥2-fold). Expression levels of 717 genes were decreased by ≥3-fold in LPS-activated cells, of which 336 gene signals were restored by 6-MSITC (≥2-fold). Utilizing group analysis, 206 genes affected by 6-MSITC with a ≥2-fold change were classified into 35 categories relating to biological processes (81), molecular functions (108) and signaling pathways (17). The genes were further categorized as 'defense, inflammatory response, cytokine activities and receptor activities' and some were confirmed by real-time polymerase chain reaction. Ingenuity pathway analysis further revealed that wasabi 6-MSITC regulated the relevant networks of chemokines, interleukins and interferons to exert its anti-inflammatory function.
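The two-stage fold-change filter described in this abstract (genes increased ≥3-fold by LPS, then attenuated ≥2-fold by 6-MSITC) can be sketched as follows. This is an illustration only: the function name, the gene labels and the signal values are hypothetical, and the abstract does not say how the authors normalised or computed fold changes.

```python
def attenuated_genes(expr, induce=3.0, attenuate=2.0):
    """Return genes induced by LPS and attenuated by the treatment.

    expr maps gene -> (control, lps, lps_plus_treatment) signal values.
    A gene qualifies when lps / control >= induce and
    lps / lps_plus_treatment >= attenuate.
    """
    hits = []
    for gene, (control, lps, treated) in expr.items():
        induced = lps / control >= induce
        reduced = lps / treated >= attenuate
        if induced and reduced:
            hits.append(gene)
    return sorted(hits)


# Hypothetical signal intensities (arbitrary units).
expr = {
    "geneA": (100.0, 400.0, 150.0),  # induced 4x, attenuated about 2.7x
    "geneB": (100.0, 350.0, 300.0),  # induced 3.5x, attenuated about 1.2x
    "geneC": (100.0, 150.0, 60.0),   # induced only 1.5x
}
hits = attenuated_genes(expr)
```

The mirror-image case in the abstract (genes decreased by LPS and restored by 6-MSITC) would use the inverse ratios.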
Microarray-based determination of anti-inflammatory genes targeted by 6-(methylsulfinyl)hexyl isothiocyanate in macrophages

PubMed Central

CHEN, JIHUA; UTO, TAKUHIRO; TANIGAWA, SHUNSUKE; YAMADA-KATO, TOMEO; FUJII, MAKOTO; HOU, DE-XING

2010-01-01

6-(Methylsulfinyl)hexyl isothiocyanate (6-MSITC) is a bioactive ingredient of wasabi [Wasabia japonica (Miq.) Matsumura], which is a popular pungent spice of Japan. To evaluate the anti-inflammatory function and underlying genes targeted by 6-MSITC, gene expression profiling through DNA microarray was performed in mouse macrophages. Among 22,050 oligonucleotides, the expression levels of 406 genes were increased by ≥3-fold in lipopolysaccharide (LPS)-activated RAW264 cells, 238 gene signals of which were attenuated by 6-MSITC (≥2-fold). Expression levels of 717 genes were decreased by ≥3-fold in LPS-activated cells, of which 336 gene signals were restored by 6-MSITC (≥2-fold). Utilizing group analysis, 206 genes affected by 6-MSITC with a ≥2-fold change were classified into 35 categories relating to biological processes (81), molecular functions (108) and signaling pathways (17). The genes were further categorized as ‘defense, inflammatory response, cytokine activities and receptor activities’ and some were confirmed by real-time polymerase chain reaction. Ingenuity pathway analysis further revealed that wasabi 6-MSITC regulated the relevant networks of chemokines, interleukins and interferons to exert its anti-inflammatory function. PMID:23136589
A single preovulatory administration of ulipristal acetate affects the decidualization process of the human endometrium during the receptive period of the menstrual cycle.

PubMed

Lira-Albarrán, Saúl; Durand, Marta; Barrera, David; Vega, Claudia; Becerra, Rocio García; Díaz, Lorenza; García-Quiroz, Janice; Rangel, Claudia; Larrea, Fernando

2018-04-27

To obtain further information on the effects of ulipristal acetate (UPA) on endometrial decidualization, a functional analysis of the differentially expressed genes (DEG) in endometrium from UPA-treated versus control cycles of normal ovulatory women was performed. A list of 1183 endometrial DEG, from a previously published study by our group, was submitted to gene ontology, gene enrichment and ingenuity pathway analyses (IPA). This functional analysis showed that decidualization was an overrepresented biological process. Gene set enrichment analysis identified LIF, PRL, IL15 and STAT3 among the most down-regulated genes within the JAK STAT canonical pathway. IPA showed that decidualization of uterus was a bio-function predicted as inhibited by UPA. The results demonstrated that this selective progesterone receptor modulator, when administered during the periovulatory phase of the menstrual cycle, may affect the molecular mechanisms leading to endometrial decidualization in response to progesterone during the period of maximum embryo receptivity. Copyright © 2018 Elsevier B.V. All rights reserved.
Computation and application of tissue-specific gene set weights.

PubMed

Frost, H Robert

2018-04-06

Gene set testing, or pathway analysis, has become a critical tool for the analysis of high-dimensional genomic data. Although the function and activity of many genes and higher-level processes is tissue-specific, gene set testing is typically performed in a tissue-agnostic fashion, which impacts statistical power and the interpretation and replication of results. To address this challenge, we have developed a bioinformatics approach to compute tissue-specific weights for individual gene sets using information on tissue-specific gene activity from the Human Protein Atlas (HPA). We used this approach to create a public repository of tissue-specific gene set weights for 37 different human tissue types from the HPA and all collections in the Molecular Signatures Database (MSigDB). To demonstrate the validity and utility of these weights, we explored three different applications: the functional characterization of human tissues, multi-tissue analysis for systemic diseases and tissue-specific gene set testing. All data used in the reported analyses is publicly available. An R implementation of the method and tissue-specific weights for MSigDB gene set collections can be downloaded at http://www.dartmouth.edu/~hrfrost/TissueSpecificGeneSets. Contact: rob.frost@dartmouth.edu.
Structural and Functional Analysis of the GRAS Gene Family in Grapevine Indicates a Role of GRAS Proteins in the Control of Development and Stress Responses

PubMed Central

Grimplet, Jérôme; Agudelo-Romero, Patricia; Teixeira, Rita T.; Martinez-Zapater, Jose M.; Fortes, Ana M.

2016-01-01

GRAS transcription factors are involved in many processes of plant growth and development (e.g., axillary shoot meristem formation, root radial patterning, nodule morphogenesis, arbuscular development) as well as in plant disease resistance and abiotic stress responses. However, little information is available concerning this gene family in grapevine (Vitis vinifera L.), an economically important woody crop. We performed a model curation of GRAS genes identified in the latest genome annotation leading to the identification of 52 genes. Gene models were improved and three new genes were identified that could be grapevine- or woody-plant specific. Phylogenetic analysis showed that GRAS genes could be classified into 13 groups that mapped on the 19 V. vinifera chromosomes. Five new subfamilies, previously not characterized in other species, were identified. Multiple sequence alignment showed typical GRAS domain in the proteins and new motifs were also described. As observed in other species, both segmental and tandem duplications contributed significantly to the expansion and evolution of the GRAS gene family in grapevine. Expression patterns across a variety of tissues and upon abiotic and biotic conditions revealed possible divergent functions of GRAS genes in grapevine development and stress responses. By comparing the information available for tomato and grapevine GRAS genes, we identified candidate genes that might constitute conserved transcriptional regulators of both climacteric and non-climacteric fruit ripening. Altogether this study provides valuable information and robust candidate genes for future functional analysis aiming at improving the quality of fleshy fruits. PMID:27065316
Identification of hub subnetwork based on topological features of genes in breast cancer

PubMed Central

ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO

2015-01-01

The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded and analyzed from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape, and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, and that hub nodes were mostly distributed in them. The most important hub nodes, with top-ranked centrality, were also similar to the common genes in the intersection of the above three subnetworks, which was viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to the same biological process, which is essential in cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
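Degree centrality, the simplest of the topological measures this abstract refers to, can be sketched in a few lines. The study used Cytoscape and MCODE; the code below is not that pipeline, only an illustration of the measure, and the node names and edges are hypothetical placeholders rather than real interactions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return node -> degree / (n - 1) for an undirected edge list.

    Self-loops are ignored and duplicate edges are counted once.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    return {node: len(nb) / (n - 1) for node, nb in neighbours.items()}


# Hypothetical PPI edges; G1 is connected to every other node.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G1", "G5"), ("G2", "G4")]
centrality = degree_centrality(edges)
hub = max(centrality, key=centrality.get)
```

Other centralities mentioned in such analyses (betweenness, closeness) require shortest-path computations and are usually taken from a graph library rather than written by hand.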
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa

PubMed Central

2015-01-01

Background: Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a relevant problem. Results: One approach to this analysis is the projection of amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. Conclusions: These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
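The notion of exon boundary phase used in this abstract can be made concrete with a small sketch. Under the usual convention, the phase of an intron is the number of coding bases before it taken modulo 3, so phase 0 means the boundary falls between two codons (the "intercodon gap" above). The function name and exon lengths below are hypothetical.

```python
def intron_phases(cds_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron in a coding sequence.

    cds_exon_lengths -- coding length of each exon, in transcript order.
    Phase 0: the intron lies between two codons.
    Phase 1: the intron follows the first base of a codon.
    Phase 2: the intron follows the second base of a codon.
    """
    phases = []
    coding_bases = 0
    # The last exon is followed by no intron, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical gene with four coding exons.
phases = intron_phases([90, 100, 62, 48])
```

An exon flanked by introns of the same phase can be inserted or removed without shifting the reading frame, which is why phase distributions are commonly discussed in connection with exon shuffling.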
Global transcriptome analysis of eukaryotic genes affected by gromwell extract.

PubMed

Bang, Soohyun; Lee, Dohyun; Kim, Hanhe; Park, Jiyong; Bahn, Yong-Sun

2014-02-01

Gromwell is known to have diverse pharmacological, cosmetic and nutritional benefits for humans. Nevertheless, the biological influence of gromwell extract (GE) on the general physiology of eukaryotic cells remains unknown. In this study a global transcriptome analysis was performed to identify genes affected by the addition of GE with Cryptococcus neoformans as the model system. In response to GE treatment, genes involved in signal transduction were immediately regulated, and the evolutionarily conserved sets of genes involved in the core cellular functions, including DNA replication, RNA transcription/processing and protein translation/processing, were generally up-regulated. In contrast, a number of genes involved in carbohydrate metabolism and transport, inorganic ion transport and metabolism, post-translational modification/protein turnover/chaperone functions and signal transduction were down-regulated. Among the GE-responsive genes that are also evolutionarily conserved in the human genome, the expression patterns of YSA1, TPO2, CFO1 and PZF1 were confirmed by northern blot analysis. Based on the functional characterization of some GE-responsive genes, it was found that GE treatment may promote cellular tolerance against a variety of environmental stresses in eukaryotes. GE treatment affects the expression levels of a significant portion of the Cryptococcus genome, implying that GE significantly affects the general physiology of eukaryotic cells. © 2013 Society of Chemical Industry.

Cancerouspdomains: comprehensive analysis of cancer type-specific recurrent somatic mutations in proteins and domains.

PubMed

Hashemi, Seirana; Nowzari Dalini, Abbas; Jalali, Adrin; Banaei-Moghaddam, Ali Mohammad; Razaghi-Moghadam, Zahra

2017-08-16

Discriminating driver mutations from the ones that play no role in cancer is a severe bottleneck in elucidating molecular mechanisms underlying cancer development. Since protein domains are representatives of functional regions within proteins, mutations on them may disturb the protein functionality. Therefore, studying mutations at domain level may point researchers to more accurate assessment of the functional impact of the mutations. This article presents a comprehensive study to map mutations from 29 cancer types to both sequence- and structure-based domains. Statistical analysis was performed to identify candidate domains in which mutations occur with high statistical significance. For each cancer type, the corresponding type-specific domains were distinguished among all candidate domains. Subsequently, cancer type-specific domains facilitated the identification of specific proteins for each cancer type. Besides, performing interactome analysis on specific proteins of each cancer type showed high levels of interconnectivity among them, which implies their functional relationship. To evaluate the role of mitochondrial genes, stem cell-specific genes and DNA repair genes in cancer development, their mutation frequency was determined via further analysis. This study has provided researchers with a publicly available data repository for studying both CATH and Pfam domain regions on protein-coding genes. Moreover, the associations between different groups of genes/domains and various cancer types have been clarified. The work is available at http://www.cancerouspdomains.ir .
Genome-wide analysis of the R2R3-MYB transcription factor gene family in sweet orange (Citrus sinensis).

PubMed

Liu, Chaoyang; Wang, Xia; Xu, Yuantao; Deng, Xiuxin; Xu, Qiang

2014-10-01

MYB transcription factors represent one of the largest gene families in plant genomes. Sweet orange (Citrus sinensis) is one of the most important fruit crops worldwide, and its genome has recently been sequenced. This provides an opportunity to investigate the organization and evolutionary characteristics of sweet orange MYB genes from a whole-genome view. In the present study, we identified 100 R2R3-MYB genes in the sweet orange genome. A comprehensive analysis of this gene family was performed, including phylogeny, gene structure, chromosomal localization and expression pattern analyses. The 100 genes were divided into 29 subfamilies based on sequence similarity and phylogeny, and the classification was also well supported by the highly conserved exon/intron structures and motif composition. The phylogenomic comparison of the MYB gene family among sweet orange and the related plant species Arabidopsis, cacao and papaya suggested the existence of functional divergence during evolution. Expression profiling indicated that sweet orange R2R3-MYB genes exhibited distinct temporal and spatial expression patterns. Our analysis suggested that the sweet orange MYB genes may play important roles in different plant biological processes, some of which may be potentially involved in citrus fruit quality. These results will be useful for future functional analysis of the MYB gene family in sweet orange.
Identification and function analysis of contrary genes in Dupuytren's contracture.

PubMed

Ji, Xianglu; Tian, Feng; Tian, Lijie

2015-07-01

The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples derived from fibroblasts and six healthy control samples derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.
Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

PubMed Central

2012-01-01

Background: GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family includes numerous members in many fully sequenced plant genomes. Only two genes from the large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results: In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of the identified putative conserved clade-common and clade-specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetically specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures and expression profiling, was used to predict the possible biological functions of the rice OsGELP genes. Conclusions: Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. By combining the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members of the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis.

PubMed

Brown, William M

2015-12-01

Epigenetics is the study of processes--beyond DNA sequence alteration--producing heritable characteristics. For example, DNA methylation modifies gene expression without altering the nucleotide sequence. A well-studied DNA methylation-based phenomenon is genomic imprinting (ie, genotype-independent parent-of-origin effects). We aimed to elucidate: (1) the effect of exercise on DNA methylation and (2) the role of imprinted genes in skeletal muscle gene networks (ie, gene group functional profiling analyses). Gene ontology (ie, gene product elucidation)/meta-analysis. 26 skeletal muscle and 86 imprinted genes were subjected to g:Profiler ontology analysis. Meta-analysis assessed exercise-associated DNA methylation change. g:Profiler found four muscle gene networks with imprinted loci. Meta-analysis identified 16 articles (387 genes/1580 individuals) associated with exercise. Age, method, sample size, sex and tissue variation could elevate effect size bias. Only skeletal muscle gene networks including imprinted genes were reported. Exercise-associated effect sizes were calculated by gene. Age, method, sample size, sex and tissue variation were moderators. Six imprinted loci (RB1, MEG3, UBE3A, PLAGL1, SGCE, INS) were important for muscle gene networks, while meta-analysis uncovered five exercise-associated imprinted loci (KCNQ1, MEG3, GRB10, L3MBTL1, PLAGL1). DNA methylation decreased with exercise (60% of loci). Exercise-associated DNA methylation change was stronger among older people (ie, age accounted for 30% of the variation). Among older people, genes exhibiting DNA methylation decreases were part of a microRNA-regulated gene network functioning to suppress cancer. Imprinted genes were identified in skeletal muscle gene networks and exercise-associated DNA methylation change. Exercise-associated DNA methylation modification could rewind the 'epigenetic clock' as we age. CRD42014009800. Published by the BMJ Publishing Group Limited. 
Suppression subtractive hybridization and comparative expression analysis to identify developmentally regulated genes in filamentous fungi.

PubMed

Gesing, Stefan; Schindler, Daniel; Nowrousian, Minou

2013-09-01

Ascomycetes differentiate four major morphological types of fruiting bodies (apothecia, perithecia, pseudothecia and cleistothecia) that are derived from an ancestral fruiting body. Thus, fruiting body differentiation is most likely controlled by a set of common core genes. One way to identify such genes is to search for genes with evolutionary conserved expression patterns. Using suppression subtractive hybridization (SSH), we selected differentially expressed transcripts in Pyronema confluens (Pezizales) by comparing two cDNA libraries specific for sexual and for vegetative development, respectively. The expression patterns of selected genes from both libraries were verified by quantitative real time PCR. Expression of several corresponding homologous genes was found to be conserved in two members of the Sordariales (Sordaria macrospora and Neurospora crassa), a derived group of ascomycetes that is only distantly related to the Pezizales. Knockout studies with N. crassa orthologues of differentially regulated genes revealed a functional role during fruiting body development for the gene NCU05079, encoding a putative MFS peptide transporter. These data indicate conserved gene expression patterns and a functional role of the corresponding genes during fruiting body development; such genes are candidates of choice for further functional analysis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The aquatic animals' transcriptome resource for comparative functional analysis.

PubMed

Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da

2018-05-09

Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, there is currently no comprehensive resource for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/ .
Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2011-01-01

Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
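The idea of linking genes whose expression profiles are closely correlated can be sketched as below. The abstract describes a gene-to-gene weighted correlation coefficient; this sketch swaps in a plain Pearson correlation with a fixed cutoff, so it is a simplification, not the authors' method. The gene labels, expression values and the 0.9 cutoff are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, cutoff=0.9):
    """Return (gene1, gene2, r) for every pair with r >= cutoff."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= cutoff:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values across four samples.
profiles = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.1, 5.9, 8.0],  # rises with gene1
    "gene3": [4.0, 3.0, 2.0, 1.0],  # falls as gene1 rises
}
edges = coexpression_edges(profiles)
```

Clustering the resulting edge list into modules, as the abstract describes, is a separate step that this sketch leaves out.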
A genome-wide analysis of the expansin genes in Malus × Domestica.

PubMed

Zhang, Shizhong; Xu, Ruirui; Gao, Zheng; Chen, Changtian; Jiang, Zesheng; Shu, Huairui

2014-04-01

Expansins were first identified as cell wall-loosening proteins; they are involved in regulating cell expansion, fruit softening and many other physiological processes. However, our knowledge about the expansin family members and their evolutionary relationships in fruit trees, such as apple, is limited. In this study, we identified 41 members of the expansin gene family in the genome of apple (Malus × Domestica L. Borkh). Phylogenetic analysis revealed that expansin genes in apple could be divided into four subfamilies according to their gene structures and protein motifs. By phylogenetic analysis of the expansins in five plants (Arabidopsis, rice, poplar, grape and apple), the expansins were divided into 17 subgroups. Our gene duplication analysis revealed that whole-genome and chromosomal-segment duplications contributed to the expansion of Mdexpansins. The microarray and expressed sequence tag (EST) data showed that 34 Mdexpansin genes could be divided into five groups by the EST analysis; they may also play different roles during fruit development. An expression model for MdEXPA16 and MdEXPA20 showed their potential role in developing fruit. Overall, our study provides useful data and novel insights into the functions and regulatory mechanisms of the expansin genes in apple, as well as their evolution and divergence. As the first step towards genome-wide analysis of the expansin genes in apple, our results have established a solid foundation for future studies on the function of the expansin genes in fruit development.
Association of presence/absence and on/off patterns of Helicobacter pylori oipA gene with peptic ulcer disease and gastric cancer risks: a meta-analysis.

PubMed

Liu, Jingwei; He, Caiyun; Chen, Moye; Wang, Zhenning; Xing, Chengzhong; Yuan, Yuan

2013-11-20

A growing number of studies have examined the relationship between the status of the H. pylori oipA gene and peptic ulcer disease (PUD) and gastric cancer (GC), but the results remain controversial. We attempted to clarify whether oipA gene status is linked with PUD and/or GC risks. A systematic literature search was performed through four electronic databases. According to the specific inclusion and exclusion criteria, seven articles were ultimately available for the meta-analysis of oipA presence/absence with PUD and GC, and eleven articles were included for the meta-analysis of oipA on/off status with PUD and GC. For the on/off functional status analysis of the oipA gene, the "on" status showed significant associations with increased risks of PUD (OR = 3.97, 95% CI: 2.89, 5.45; P < 0.001) and GC (OR = 2.43, 95% CI: 1.45, 4.07; P = 0.001) compared with gastritis and functional dyspepsia controls. Results of the homogeneity test indicated different effects of oipA "on" status on PUD risk between children and adult subgroups and on GC risk between PCR-sequencing and immunoblot subgroups. For the presence/absence analysis of the oipA gene, we found no significant association of the presence of the oipA gene with the risks of PUD (OR = 1.93, 95% CI: 0.60, 6.25; P = 0.278) and GC (OR = 2.09, 95% CI: 0.51, 8.66; P = 0.308) compared with gastritis and functional dyspepsia controls. In conclusion, when oipA exists, the functional "on" status of this gene showed association with increased risks for PUD and GC compared with gastritis and FD controls. However, merely investigating the presence/absence of oipA would overlook the importance of its functional on/off status and would not reliably predict risks of PUD and GC. Further large-scale and well-designed studies concerning on/off status of oipA are required to confirm our meta-analysis results.
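For readers unfamiliar with how odds ratios (OR) and 95% confidence intervals of the kind reported above are obtained, the following is a minimal sketch. It assumes the standard Woolf (log) method for a single 2x2 table and an inverse-variance fixed-effect model for pooling; the abstract does not state which pooling model the authors used, and the counts below are invented, not taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and CI for one 2x2 table.

    a = cases with the factor, b = controls with the factor,
    c = cases without it,      d = controls without it.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def pooled_odds_ratio(tables, z=1.96):
    """Inverse-variance fixed-effect pooling of log odds ratios (an assumed model)."""
    weighted_sum = total_weight = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        weight = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # weight = 1 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled = weighted_sum / total_weight
    se = math.sqrt(1.0 / total_weight)
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)

# Invented counts for two hypothetical studies.
single = odds_ratio_ci(20, 10, 10, 20)
pooled = pooled_odds_ratio([(20, 10, 10, 20), (20, 10, 10, 20)])
print(single, pooled)
```

Pooling two identical tables leaves the point estimate unchanged and narrows the interval, which is the basic reason a meta-analysis can detect associations that individual studies cannot.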
Lymphocyte signaling: beyond knockouts.

PubMed

Saveliev, Alexander; Tybulewicz, Victor L J

2009-04-01

The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Although this gene 'knockout' approach is often informative, in many cases, the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to 'knock in' subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully based on structural and biophysical data.
Quantitative analysis of bristle number in Drosophila mutants identifies genes involved in neural development

NASA Technical Reports Server (NTRS)

Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.

2003-01-01

BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.
Rye B chromosomes encode a functional Argonaute-like protein with in vitro slicer activities similar to its A chromosome paralog.

PubMed

Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas

2017-01-01

B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.
Structural and transcriptional analysis of plant genes encoding the bifunctional lysine ketoglutarate reductase saccharopine dehydrogenase enzyme.

PubMed

Anderson, Olin D; Coleman-Derr, Devin; Gu, Yong Q; Heath, Sekou

2010-06-16

Among the dietary essential amino acids, the most severely limiting in the cereals is lysine. Since cereals make up half of the human diet, lysine limitation has quality/nutritional consequences. The breakdown of lysine is controlled mainly by the catabolic bifunctional enzyme lysine ketoglutarate reductase - saccharopine dehydrogenase (LKR/SDH). The LKR/SDH gene has been reported to produce transcripts for the bifunctional enzyme and separate monofunctional transcripts. In addition to lysine metabolism, this gene has been implicated in a number of metabolic and developmental pathways, which along with its production of multiple transcript types and complex exon/intron structure suggest an important node in plant metabolism. Understanding more about the LKR/SDH gene is thus of interest both from an applied standpoint and for basic plant metabolism. The current report describes a wheat genomic fragment containing an LKR/SDH gene and adjacent genes. The wheat LKR/SDH genomic segment was found to originate from the A-genome of wheat, and EST analysis indicates all three LKR/SDH genes in hexaploid wheat are transcriptionally active. A comparison of a set of plant LKR/SDH genes suggests regions of greater sequence conservation likely related to critical enzymatic functions and metabolic controls. Although most plants contain only a single LKR/SDH gene per genome, poplar contains at least two functional bifunctional genes in addition to a monofunctional LKR gene. Analysis of ESTs finds evidence for monofunctional LKR transcripts in switchgrass, and monofunctional SDH transcripts in wheat, Brachypodium, and poplar. The analysis of a wheat LKR/SDH gene and comparative structural and functional analyses among available plant genes provides new information on this important gene.
Both the structure of the LKR/SDH gene and the immediately adjacent genes show lineage-specific differences between monocots and dicots, and findings suggest variation in activity of LKR/SDH genes among plants. Although most plant genomes seem to contain a single conserved LKR/SDH gene per genome, poplar possesses multiple contiguous genes. A preponderance of SDH transcripts suggests the LKR region may be more rate-limiting. Only switchgrass has EST evidence for LKR monofunctional transcripts. Evidence for monofunctional SDH transcripts shows a novel intron in wheat, Brachypodium, and poplar.
Developmentally distinct MYB genes encode functionally equivalent proteins in Arabidopsis.

PubMed

Lee, M M; Schiefelbein, J

2001-05-01

The duplication and divergence of developmental control genes is thought to have driven morphological diversification during the evolution of multicellular organisms. To examine the molecular basis of this process, we analyzed the functional relationship between two paralogous MYB transcription factor genes, WEREWOLF (WER) and GLABROUS1 (GL1), in Arabidopsis. The WER and GL1 genes specify distinct cell types and exhibit non-overlapping expression patterns during Arabidopsis development. Nevertheless, reciprocal complementation experiments with a series of gene fusions showed that WER and GL1 encode functionally equivalent proteins, and their unique roles in plant development are entirely due to differences in their cis-regulatory sequences. Similar experiments with a distantly related MYB gene (MYB2) showed that its product cannot functionally substitute for WER or GL1. Furthermore, an analysis of the WER and GL1 proteins shows that conserved sequences correspond to specific functional domains. These results provide new insights into the evolution of the MYB gene family in Arabidopsis, and, more generally, they demonstrate that novel developmental gene function may arise solely by the modification of cis-regulatory sequences.
Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders.

PubMed

Novarino, Gaia; Fenstermaker, Ali G; Zaki, Maha S; Hofree, Matan; Silhavy, Jennifer L; Heiberg, Andrew D; Abdellateef, Mostafa; Rosti, Basak; Scott, Eric; Mansour, Lobna; Masri, Amira; Kayserili, Hulya; Al-Aama, Jumana Y; Abdel-Salam, Ghada M H; Karminejad, Ariana; Kara, Majdi; Kara, Bulent; Bozorgmehri, Bita; Ben-Omran, Tawfeg; Mojahedi, Faezeh; El Din Mahmoud, Iman Gamal; Bouslam, Naima; Bouhouche, Ahmed; Benomar, Ali; Hanein, Sylvain; Raymond, Laure; Forlani, Sylvie; Mascaro, Massimo; Selim, Laila; Shehata, Nabil; Al-Allawi, Nasir; Bindu, P S; Azam, Matloob; Gunel, Murat; Caglayan, Ahmet; Bilguvar, Kaya; Tolun, Aslihan; Issa, Mahmoud Y; Schroth, Jana; Spencer, Emily G; Rosti, Rasim O; Akizu, Naiara; Vaux, Keith K; Johansen, Anide; Koh, Alice A; Megahed, Hisham; Durr, Alexandra; Brice, Alexis; Stevanin, Giovanni; Gabriel, Stacy B; Ideker, Trey; Gleeson, Joseph G

2014-01-31

Hereditary spastic paraplegias (HSPs) are neurodegenerative motor neuron diseases characterized by progressive age-dependent loss of corticospinal motor tract function. Although the genetic basis is partly understood, only a fraction of cases can receive a genetic diagnosis, and a global view of HSP is lacking. By using whole-exome sequencing in combination with network analysis, we identified 18 previously unknown putative HSP genes and validated nearly all of these genes functionally or genetically. The pathways highlighted by these mutations link HSP to cellular transport, nucleotide metabolism, and synapse and axon development. Network analysis revealed a host of further candidate genes, of which three were mutated in our cohort. Our analysis links HSP to other neurodegenerative disorders and can facilitate gene discovery and mechanistic understanding of disease.
Widespread antisense transcription of Populus genome under drought.

PubMed

Yuan, Yinan; Chen, Su

2018-06-06

Antisense transcription is widespread in many genomes and plays important regulatory roles in gene expression. The objective of our study was to investigate the extent and functional relevance of antisense transcription in forest trees. We employed Populus, a model tree species, to probe the antisense transcriptional response of the tree genome under drought, through stranded RNA-seq analysis. We detected nearly 48% of annotated Populus gene loci with antisense transcripts and 44% of them with co-transcription from both DNA strands. The global distribution of reads across annotated gene regions revealed that antisense transcription was enriched in untranslated regions while sense reads were predominantly mapped in coding exons. We further detected 1185 drought-responsive sense and antisense gene loci and identified a strong positive correlation between the expression of antisense and sense transcripts. Additionally, we assessed the antisense expression in introns and found a strong correlation between intronic expression and exonic expression, confirming that antisense transcription of introns contributes to transcriptional activity of the Populus genome under drought. Finally, we functionally characterized drought-responsive sense-antisense transcript pairs through gene ontology analysis and discovered that functional groups including transcription factors and histones were concordantly regulated at both the sense and antisense transcriptional levels. Overall, our study demonstrated the extensive occurrence of antisense transcripts of Populus genes under drought and provided insights into genome structure, regulation pattern and functional significance of drought-responsive antisense genes in forest trees. Datasets generated in this study serve as a foundation for future genetic analysis to improve our understanding of gene regulation by antisense transcription.
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

PubMed Central

Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

2008-01-01

Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at . PMID:18366802
Nitrogen Cycle Evaluation (NiCE) Chip for the Simultaneous Analysis of Multiple N-Cycle Associated Genes.

PubMed

Oshiki, Mamoru; Segawa, Takahiro; Ishii, Satoshi

2018-02-02

Various microorganisms play key roles in the nitrogen (N) cycle. Quantitative PCR (qPCR) and PCR-amplicon sequencing of the N cycle functional genes allow us to analyze the abundance and diversity of microbes responsible for the N-transforming reactions in various environmental samples. However, analysis of multiple target genes can be cumbersome and expensive. PCR-independent analysis, such as metagenomics and metatranscriptomics, is useful but expensive, especially when we analyze multiple samples and try to detect N cycle functional genes present at relatively low abundance. Here, we present the application of microfluidic qPCR chip technology to simultaneously quantify and prepare amplicon sequence libraries for multiple N cycle functional genes as well as taxon-specific 16S rRNA gene markers for many samples. This approach, named the N cycle evaluation (NiCE) chip, was evaluated by using DNA from pure and artificially mixed bacterial cultures and by comparing the results with those obtained by conventional qPCR and amplicon sequencing methods. Quantitative results obtained by the NiCE chip were comparable to those obtained by conventional qPCR. In addition, the NiCE chip was successfully applied to examine abundance and diversity of N cycle functional genes in wastewater samples. Although non-specific amplification was detected on the NiCE chip, this could be overcome by optimizing the primer sequences in the future. As the NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes, this tool should advance our ability to explore N cycling in various samples. Importance. We report a novel approach, the Nitrogen Cycle Evaluation (NiCE) chip, which uses microfluidic qPCR chip technology. By sequencing the amplicons recovered from the NiCE chip, we can assess diversities of the N cycle functional genes.
The NiCE chip technology can be applied to analyze the temporal dynamics of N cycle gene transcription in wastewater treatment bioreactors. The NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes. While there is room for future improvement, this tool should significantly advance our ability to explore the N cycle in various environmental samples. Copyright © 2018 American Society for Microbiology.
Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

PubMed Central

Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

2005-01-01

Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicate that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level.
Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helix and signal peptide prediction, provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747

Study of Staphylococcus aureus N315 Pathogenic Genes by Text Mining and Enrichment Analysis of Pathways and Operons.

PubMed

Yang, Chun-Feng; Gou, Wei-Hui; Dai, Xin-Lun; Li, Yu-Mei

2018-06-01

Staphylococcus aureus (S. aureus) is a versatile pathogen found in many environments and can cause nosocomial infections in the community and hospitals. S. aureus infection is an increasingly serious threat to global public health that requires action across many government bodies, medical and health sectors, and scientific research institutions. In the present study, S. aureus N315 genes that have been shown in the literature to be pathogenic were extracted using a bibliometric method for functional enrichment analysis of pathways and operons to statistically discover novel pathogenic genes associated with S. aureus N315. A total of 383 pathogenic genes were mined from the literature using bibliometrics, and subsequently a few new pathogenic genes of S. aureus N315 were identified by functional enrichment analysis of pathways and operons. The discovery of these novel S. aureus N315 pathogenic genes is of great significance for treating S. aureus-induced diseases and identifying potential diagnostic markers, thus providing a theoretical basis for epidemiological prevention.
Genome-wide identification and expression analysis of TCP transcription factors in Gossypium raimondii.

PubMed

Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong

2014-10-16

Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors perform versatile functions in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence.
Genome-wide identification and expression analysis of TCP transcription factors in Gossypium raimondii

PubMed Central

Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C.; Zhang, Baohong

2014-01-01

Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors perform versatile functions in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
Systematic exploration of essential yeast gene function with temperature-sensitive mutants

PubMed Central

Li, Zhijian; Vizeacoumar, Franco J; Bahr, Sondra; Li, Jingjing; Warringer, Jonas; Vizeacoumar, Frederick S; Min, Renqiang; VanderSluis, Benjamin; Bellay, Jeremy; DeVit, Michael; Fleming, James A; Stephens, Andrew; Haase, Julian; Lin, Zhen-Yuan; Baryshnikova, Anastasia; Lu, Hong; Yan, Zhun; Jin, Ke; Barker, Sarah; Datti, Alessandro; Giaever, Guri; Nislow, Corey; Bulawa, Chris; Myers, Chad L; Costanzo, Michael; Gingras, Anne-Claude; Zhang, Zhaolei; Blomberg, Anders; Bloom, Kerry; Andrews, Brenda; Boone, Charles

2012-01-01

Conditional temperature-sensitive (ts) mutations are valuable reagents for studying essential genes in the yeast Saccharomyces cerevisiae. We constructed 787 ts strains, covering 497 (~45%) of the 1,101 essential yeast genes, with ~30% of the genes represented by multiple alleles. All of the alleles are integrated into their native genomic locus in the S288C common reference strain and are linked to a kanMX selectable marker, allowing further genetic manipulation by synthetic genetic array (SGA)–based, high-throughput methods. We show two such manipulations: barcoding of 440 strains, which enables chemical-genetic suppression analysis, and the construction of arrays of strains carrying different fluorescent markers of subcellular structure, which enables quantitative analysis of phenotypes using high-content screening. Quantitative analysis of a GFP-tubulin marker identified roles for cohesin and condensin genes in spindle disassembly. This mutant collection should facilitate a wide range of systematic studies aimed at understanding the functions of essential genes. PMID:21441928
Genome-wide STAT3 binding analysis after histone deacetylase inhibition reveals novel target genes in dendritic cells

PubMed Central

Sun, Yaping; Iyer, Matthew; McEachin, Richard; Zhao, Meng; Wu, Yi-Mi; Cao, Xuhong; Oravecz-Wilson, Katherine; Zajac, Cynthia; Mathewson, Nathan; Wu, Shin-Rong Julia; Rossi, Corinne; Toubai, Tomomi; Qin, Zhaohui S.; Chinnaiya, Arul M.; Reddy, Pavan

2016-01-01

STAT3 is a master transcriptional regulator that plays an important role in the induction of both immune activation and immune tolerance in dendritic cells (DCs). The transcriptional targets of STAT3 in promoting DC activation are becoming increasingly understood; however, the mechanisms underpinning its role in causing DC suppression remain largely unknown. To determine the functional gene targets of STAT3, we compared the genome-wide binding of STAT3 using ChIP-seq coupled with gene expression microarrays to determine STAT3-dependent gene regulation in DCs after histone deacetylase (HDAC) inhibition. HDAC inhibition boosted the ability of STAT3 to bind to distinct DNA targets and regulate gene expression. Among the top 500 STAT3 binding sites, the frequency of canonical motifs was significantly higher than that of non-canonical motifs. Functional analysis revealed that after treatment with an HDAC inhibitor, the upregulated STAT3 target genes were those that were primarily the negative regulators of pro-inflammatory cytokines and those in the IL-10 signaling pathway. The downregulated STAT3-dependent targets were those involved in immune effector processes and antigen processing/presentation. The expression and functional relevance of these genes were validated. Specifically, functional studies confirmed that the upregulation of IL-10Ra by STAT3 contributed to the suppressive function of DCs following HDAC inhibition. PMID:27866206
Non-Gaussian Distributions Affect Identification of Expression Patterns, Functional Annotation, and Prospective Classification in Human Cancer Genomes

PubMed Central

Marko, Nicholas F.; Weil, Robert J.

2012-01-01

Introduction: Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods: We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results: Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions: Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
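As a rough illustration of what a central moments analysis measures, the sketch below computes sample skewness and excess kurtosis from the second, third and fourth central moments. It is not the authors' pipeline: the function names and the expression values are hypothetical, and the population (divide-by-n) form of the moments is an assumption made for simplicity.

```python
def central_moment(values, order):
    """Return the order-th central moment of a sample (population form, divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardized moment; 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; positive values indicate heavy tails."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical log-expression values for one gene across samples (illustrative only).
symmetric = [4.0, 5.0, 6.0, 7.0, 8.0]
heavy_tailed = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 12.0]

if __name__ == "__main__":
    for name, sample in (("symmetric", symmetric), ("heavy_tailed", heavy_tailed)):
        print(name, skewness(sample), excess_kurtosis(sample))
```

A sample with one extreme value, like the second list, shows both positive skewness and positive excess kurtosis, which is the kind of departure from normality the abstract describes.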
Genome-wide Identification and Expression Analysis of the CDPK Gene Family in Grape, Vitis spp.

PubMed

Zhang, Kai; Han, Yong-Tao; Zhao, Feng-Li; Hu, Yang; Gao, Yu-Rong; Ma, Yan-Fei; Zheng, Yi; Wang, Yue-Jin; Wen, Ying-Qiang

2015-06-30

Calcium-dependent protein kinases (CDPKs) play vital roles in plant growth and development, biotic and abiotic stress responses, and hormone signaling. Little is known about the CDPK gene family in grapevine. In this study, we performed a genome-wide analysis of the 12X grape genome (Vitis vinifera) and identified nineteen CDPK genes. Comparison of the structures of grape CDPK genes allowed us to examine their functional conservation and differentiation. Segmentally duplicated grape CDPK genes showed high structural conservation and contributed to gene family expansion. Additional comparisons between grape and Arabidopsis thaliana demonstrated that several grape CDPK genes occurred in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grapevine and Arabidopsis. Phylogenetic analysis divided the grape CDPK genes into four groups. Furthermore, we examined the expression of the corresponding nineteen homologous CDPK genes in the Chinese wild grape (Vitis pseudoreticulata) under various conditions, including biotic stress, abiotic stress, and hormone treatments. The expression profiles derived from reverse transcription and quantitative PCR suggested that a large number of VpCDPKs responded to various stimuli on the transcriptional level, indicating their versatile roles in the responses to biotic and abiotic stresses. Moreover, we examined the subcellular localization of VpCDPKs by transiently expressing six VpCDPK-GFP fusion proteins in Arabidopsis mesophyll protoplasts; this revealed high variability consistent with potential functional differences. Taken as a whole, our data provide significant insights into the evolution and function of grape CDPKs and a framework for future investigation of grape CDPK genes.
Transcriptome analysis of salinity stress responses in common wheat using a 22k oligo-DNA microarray.

PubMed

Kawaura, Kanako; Mochida, Keiichi; Yamazaki, Yukiko; Ogihara, Yasunari

2006-04-01

In this study, we constructed a 22k wheat oligo-DNA microarray. A total of 148,676 expressed sequence tags of common wheat were collected from the database of the Wheat Genomics Consortium of Japan. These were grouped into 34,064 contigs, which were then used to design an oligonucleotide DNA microarray. Following a multistep selection of the sense strand, 21,939 60-mer oligo-DNA probes were selected for attachment on the microarray slide. This 22k oligo-DNA microarray was used to examine the transcriptional response of wheat to salt stress. More than 95% of the probes gave reproducible hybridization signals when targeted with RNAs extracted from salt-treated wheat shoots and roots. With the microarray, we identified 1,811 genes whose expressions changed more than 2-fold in response to salt. These included genes known to mediate response to salt, as well as unknown genes, and they were classified into 12 major groups by hierarchical clustering. These gene expression patterns were also confirmed by real-time reverse transcription-PCR. Many of the genes with unknown function were clustered together with genes known to be involved in response to salt stress. Thus, analysis of gene expression patterns combined with gene ontology should help identify the function of the unknown genes. Also, functional analysis of these wheat genes should provide new insight into the response to salt stress. Finally, these results indicate that the 22k oligo-DNA microarray is a reliable method for monitoring global gene expression patterns in wheat.
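The "more than 2-fold" selection criterion mentioned above can be sketched as a simple filter on log2 ratios. This is a minimal example under stated assumptions, not the study's actual analysis: the function name, probe identifiers and signal intensities are hypothetical, and real microarray workflows would normalize the data and use replicates before applying such a cutoff.

```python
from math import log2


def fold_change_hits(control, treated, threshold=2.0):
    """Return {probe: log2 ratio} for probes whose signal changes more than `threshold`-fold."""
    cutoff = log2(threshold)
    hits = {}
    for probe, base in control.items():
        ratio = log2(treated[probe] / base)
        # Keep both up-regulated (ratio > cutoff) and down-regulated (ratio < -cutoff) probes.
        if abs(ratio) > cutoff:
            hits[probe] = ratio
    return hits


# Hypothetical normalized signal intensities (illustrative values only).
control = {"probe_A": 100.0, "probe_B": 200.0, "probe_C": 150.0}
salt = {"probe_A": 450.0, "probe_B": 210.0, "probe_C": 50.0}
hits = fold_change_hits(control, salt)

if __name__ == "__main__":
    for probe, ratio in sorted(hits.items()):
        print(probe, round(ratio, 2))
```

Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the filter compares the absolute value of the ratio with the cutoff.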
Pathogenic Gene Screening of Mycobacterium tuberculosis by Literature Data Mining and Information Pathway Enrichment Analysis.

PubMed

Xu, Guangyu; Wen, Simin; Pan, Yuchen; Zhang, Nan; Wang, Yuanyi

2018-05-01

Recent studies have unraveled mutations which have led to changes in the original conformation of functional proteins targeted by frontline drugs against Mycobacterium tuberculosis. These mutations are likely responsible for the emergence of drug-resistant strains of M. tuberculosis. Identification of new therapeutic targets is fundamental to the development of novel anti-TB drugs. Boost evolution analysis of interactome data with use of high-throughput biological experimental technologies provides opportunities for identification of pathogenic genes and for screening out novel therapeutic targets. In this study, we identified 584 proven pathogenic genes of M. tuberculosis and new pathogenic genes via bibliometrics and relevant websites such as PubMed, KEGG, and DOOR websites. We identified 13 new genes that are most likely to be pathogenic. This study may contribute to the discovery of new pathogenic genes and help unravel new functions of known pathogenic genes of M. tuberculosis.
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Functional organization of a single nif cluster in the mesophilic archaeon Methanosarcina mazei strain Gö1

PubMed Central

Ehlers, Claudia; Veit, Katharina; Gottschalk, Gerhard; Schmitz, Ruth A.

2002-01-01

The mesophilic methanogenic archaeon Methanosarcina mazei strain Gö1 is able to utilize molecular nitrogen (N2) as its sole nitrogen source. We have identified and characterized a single nitrogen fixation (nif) gene cluster in M. mazei Gö1 with an approximate length of 9 kbp. Sequence analysis revealed seven genes with sequence similarities to nifH, nifI1, nifI2, nifD, nifK, nifE and nifN, similar to other diazotrophic methanogens and certain bacteria such as Clostridium acetobutylicum, with the two glnB-like genes (nifI1 and nifI2) located between nifH and nifD. Phylogenetic analysis of deduced amino acid sequences for the nitrogenase structural genes of M. mazei Gö1 showed that they are most closely related to Methanosarcina barkeri nif2 genes, and also closely resemble those for the corresponding nif products of the gram-positive bacterium C. acetobutylicum. Northern blot analysis and reverse transcription PCR analysis demonstrated that the M. mazei nif genes constitute an operon transcribed only under nitrogen starvation as a single 8 kb transcript. Sequence analysis revealed a palindromic sequence at the transcriptional start site in front of the M. mazei nifH gene, which may have a function in transcriptional regulation of the nif operon. PMID:15803652
Identification of Novel Tissue-Specific Genes by Analysis of Microarray Databases: A Human and Mouse Model

PubMed Central

Suh, Yeunsu; Davis, Michael E.; Lee, Kichoon

2013-01-01

Understanding the tissue-specific pattern of gene expression is critical in elucidating the molecular mechanisms of tissue development, gene function, and transcriptional regulations of biological processes. Although tissue-specific gene expression information is available in several databases, follow-up strategies to integrate and use these data are limited. The objective of the current study was to identify and evaluate novel tissue-specific genes in human and mouse tissues by performing comparative microarray database analysis and semi-quantitative PCR analysis. We developed a powerful approach to predict tissue-specific genes by analyzing existing microarray data from the NCBI's Gene Expression Omnibus (GEO) public repository. We investigated and confirmed tissue-specific gene expression in the human and mouse kidney, liver, lung, heart, muscle, and adipose tissue. Applying our novel comparative microarray approach, we confirmed 10 kidney, 11 liver, 11 lung, 11 heart, 8 muscle, and 8 adipose specific genes. The accuracy of this approach was further verified by employing semi-quantitative PCR reaction and by searching for gene function information in existing publications. Three novel tissue-specific genes were discovered by this approach including AMDHD1 (amidohydrolase domain containing 1) in the liver, PRUNE2 (prune homolog 2) in the heart, and ACVR1C (activin A receptor, type IC) in adipose tissue. We further confirmed the tissue-specific expression of these 3 novel genes by real-time PCR. Among them, ACVR1C is adipose tissue-specific and adipocyte-specific in adipose tissue, and can be used as an adipocyte developmental marker. From GEO profiles, we predicted the processes in which AMDHD1 and PRUNE2 may participate. Our approach provides a novel way to identify new sets of tissue-specific genes and to predict functions in which they may be involved. PMID:23741331
A novel gene expression-based prognostic scoring system to predict survival in gastric cancer

DOE PAGES

Wang, Pin; Wang, Yunshan; Hang, Bo; ...

2016-07-11

Analysis of gene expression patterns in gastric cancer (GC) can help to identify a comprehensive panel of gene biomarkers for predicting clinical outcomes and to discover potential new therapeutic targets. Here, a multi-step bioinformatics analytic approach was developed to establish a novel prognostic scoring system for GC. We first identified 276 genes that were robustly differentially expressed between normal and GC tissues, of which, 249 were found to be significantly associated with overall survival (OS) by univariate Cox regression analysis. The biological functions of 249 genes are related to cell cycle, RNA/ncRNA process, acetylation and extracellular matrix organization. A network was generated for view of the gene expression architecture of 249 genes in 265 GCs. Finally, we applied a canonical discriminant analysis approach to identify a 53-gene signature and a prognostic scoring system was established based on a canonical discriminant function of 53 genes. The prognostic scores strongly predicted patients with GC to have either a poor or good OS. Our study raises the prospect that the practicality of GC patient prognosis can be assessed by this prognostic scoring system.
Integrating text mining, data mining, and network analysis for identifying genetic breast cancer trends.

PubMed

Jurca, Gabriela; Addam, Omar; Aksac, Alper; Gao, Shang; Özyer, Tansel; Demetrick, Douglas; Alhajj, Reda

2016-04-26

Breast cancer is a serious disease which affects many women and may lead to death. It has received considerable attention from the research community. Thus, biomedical researchers aim to find genetic biomarkers indicative of the disease. Novel biomarkers can be elucidated from the existing literature. However, the vast number of scientific publications on breast cancer makes this a daunting task. This paper presents a framework which investigates existing literature data for informative discoveries. It integrates text mining and social network analysis in order to identify new potential biomarkers for breast cancer. We utilized PubMed for the testing. We investigated gene-gene interactions, as well as novel interactions such as gene-year, gene-country, and abstract-country, to find out how the discoveries varied over time and how much the discoveries and interests of research groups in different countries overlap or diverge. Interesting trends have been identified and discussed, e.g., different genes are highlighted in relationship to different countries though the various genes were found to share functionality. Some text analysis based results have been validated against results from other tools that predict gene-gene relations and gene functions.
A novel gene expression-based prognostic scoring system to predict survival in gastric cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Pin; Wang, Yunshan; Hang, Bo

Analysis of gene expression patterns in gastric cancer (GC) can help to identify a comprehensive panel of gene biomarkers for predicting clinical outcomes and to discover potential new therapeutic targets. Here, a multi-step bioinformatics analytic approach was developed to establish a novel prognostic scoring system for GC. We first identified 276 genes that were robustly differentially expressed between normal and GC tissues, of which, 249 were found to be significantly associated with overall survival (OS) by univariate Cox regression analysis. The biological functions of 249 genes are related to cell cycle, RNA/ncRNA process, acetylation and extracellular matrix organization. A network was generated for view of the gene expression architecture of 249 genes in 265 GCs. Finally, we applied a canonical discriminant analysis approach to identify a 53-gene signature and a prognostic scoring system was established based on a canonical discriminant function of 53 genes. The prognostic scores strongly predicted patients with GC to have either a poor or good OS. Our study raises the prospect that the practicality of GC patient prognosis can be assessed by this prognostic scoring system.
Identification and expression analysis of the SQUAMOSA promoter-binding protein (SBP)-box gene family in Prunus mume.

PubMed

Xu, Zongda; Sun, Lidan; Zhou, Yuzhen; Yang, Weiru; Cheng, Tangren; Wang, Jia; Zhang, Qixiang

2015-10-01

SQUAMOSA promoter-binding protein (SBP)-box family genes encode plant-specific transcription factors that play crucial roles in plant development, especially flower and fruit development. However, little information on this gene family is available for Prunus mume, an ornamental and fruit tree widely cultivated in East Asia. To explore the evolution of SBP-box genes in Prunus and explore their functions in flower and fruit development, we performed a genome-wide analysis of the SBP-box gene family in P. mume. Fifteen SBP-box genes were identified, and 11 of them contained an miR156 target site. Phylogenetic and comprehensive bioinformatics analyses revealed that different groups of SBP-box genes have undergone different evolutionary processes and varied in their length, structure, and motif composition. Purifying selection has been the main selective constraint on both paralogous and orthologous SBP-box genes. In addition, the sequences of orthologous SBP-box genes did not diverge widely after the split of P. mume and Prunus persica. Expression analysis of P. mume SBP-box genes revealed their diverse spatiotemporal expression patterns. Three duplicated SBP-box genes may have undergone subfunctionalization in Prunus. Most of the SBP-box genes showed high transcript levels in flower buds and young fruit. The four miR156-nontargeted genes were upregulated during fruit ripening. Together, these results provide information about the evolution of SBP-box genes in Prunus. The expression analysis lays the foundation for further research on the functions of SBP-box genes in P. mume and other Prunus species, especially during flower and fruit development.
GSCALite: A Web Server for Gene Set Cancer Analysis.

PubMed

Liu, Chun-Jie; Hu, Fei-Fei; Xia, Mengxuan; Han, Leng; Zhang, Qiong; Guo, An-Yuan

2018-05-22

The availability of cancer genomic data makes it possible to analyze genes related to cancer. Cancer is usually the result of a set of genes and the signal of a single gene could be covered by background noise. Here, we present a web server named Gene Set Cancer Analysis (GSCALite) to analyze a set of genes in cancers with the following functional modules. (i) Differential expression in tumor vs normal, and the survival analysis; (ii) Genomic variations and their survival analysis; (iii) Gene expression associated cancer pathway activity; (iv) miRNA regulatory network for genes; (v) Drug sensitivity for genes; (vi) Normal tissue expression and eQTL for genes. GSCALite is a user-friendly web server for dynamic analysis and visualization of gene sets in cancer and drug sensitivity correlation, which will be of broad utility to cancer researchers. GSCALite is available on http://bioinfo.life.hust.edu.cn/web/GSCALite/. Contact: guoay@hust.edu.cn or zhangqiong@hust.edu.cn. Supplementary data are available at Bioinformatics online.
Genome-wide analysis of WRKY transcription factors in wheat (Triticum aestivum L.) and differential expression under water deficit condition.

PubMed

Ning, Pan; Liu, Congcong; Kang, Jingquan; Lv, Jinyin

2017-01-01

WRKY proteins, which comprise one of the largest transcription factor (TF) families in the plant kingdom, play crucial roles in plant development and stress responses. Despite several studies on WRKYs in wheat (Triticum aestivum L.), functional annotation information about wheat WRKYs is limited. Here, 171 TaWRKY TFs were identified from the whole wheat genome and compared with proteins from 19 other species representing nine major plant lineages. A phylogenetic analysis, coupled with gene structure analysis and motif determination, divided these TaWRKYs into seven subgroups (Group I, IIa-e, and III). Chromosomal location showed that most TaWRKY genes were enriched on four chromosomes, especially on chromosome 3B. In addition, 85 (49.7%) genes were either tandem (5) or segmental duplication (80), which suggested that though tandem duplication has contributed to the expansion of TaWRKY family, segmental duplication probably played a more pivotal role. Analysis of cis-acting elements revealed putative functions of WRKYs in wheat during development as well as under numerous biotic and abiotic stresses. Finally, the expression of TaWRKY genes in flag leaves, glumes, and lemmas under water-deficit condition were analyzed. Results showed that different TaWRKY genes preferentially express in specific tissue during the grain-filling stage. Our results provide a more extensive insight on WRKY gene family in wheat, and also contribute to the screening of more candidate genes for further investigation on function characterization of WRKYs under various stresses.
Gene Set−Based Integrative Analysis Revealing Two Distinct Functional Regulation Patterns in Four Common Subtypes of Epithelial Ovarian Cancer

PubMed Central

Chang, Chia-Ming; Chuang, Chi-Mu; Wang, Mong-Lien; Yang, Yi-Ping; Chuang, Jen-Hua; Yang, Ming-Jie; Yen, Ming-Shyen; Chiou, Shih-Hwa; Chang, Cheng-Chang

2016-01-01

Clear cell (CCC), endometrioid (EC), mucinous (MC) and high-grade serous carcinoma (SC) are the four most common subtypes of epithelial ovarian carcinoma (EOC). The widely accepted dualistic model of ovarian carcinogenesis divided EOCs into type I and II categories based on the molecular features. However, this hypothesis has not been experimentally demonstrated. We carried out a gene set-based analysis by integrating the microarray gene expression profiles downloaded from the publicly available databases. These quantified biological functions of EOCs were defined by 1454 Gene Ontology (GO) term and 674 Reactome pathway gene sets. The pathogenesis of the four EOC subtypes was investigated by hierarchical clustering and exploratory factor analysis. The patterns of functional regulation among the four subtypes containing 1316 cases could be accurately classified by machine learning. The results revealed that the ERBB and PI3K-related pathways played important roles in the carcinogenesis of CCC, EC and MC; while deregulation of cell cycle was more predominant in SC. The study revealed that two different functional regulation patterns exist among the four EOC subtypes, which were compatible with the type I and II classifications proposed by the dualistic model of ovarian carcinogenesis. PMID:27527159
dbCPG: A web resource for cancer predisposition genes

PubMed Central

Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng

2016-01-01

Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rats, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the understanding of the CPGs data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119

NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

PubMed

Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun

2017-09-21

High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in the functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea publicly available at GitHub (http://github.com/wulingyun/CopTea/). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.
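For context, the individual term-based methods that the abstract contrasts with NetGen are commonly built on a one-sided hypergeometric test applied to each GO term separately. The sketch below shows that conventional test only; it is not NetGen's generative model and does not use a PPI network. The function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population: number of genes in the background
    annotated:  background genes annotated with the GO term
    selected:   size of the gene list under study
    overlap:    genes in the list that carry the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible combinations drop out of the sum.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 background genes, 50 carry the term,
# and 5 of a 20-gene list carry it (about 1 would be expected by chance).
p_value = hypergeom_enrichment_p(1000, 50, 20, 5)

if __name__ == "__main__":
    print(p_value)
```

Because each term is tested on its own, a parent term and its descendants can all pass the threshold from the same genes, which is the redundancy problem the abstract describes.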
Involvement of astrocyte metabolic coupling in Tourette syndrome pathogenesis.

PubMed

de Leeuw, Christiaan; Goudriaan, Andrea; Smit, August B; Yu, Dongmei; Mathews, Carol A; Scharf, Jeremiah M; Verheijen, Mark H G; Posthuma, Danielle

2015-11-01

Tourette syndrome is a heritable neurodevelopmental disorder whose pathophysiology remains unknown. Recent genome-wide association studies suggest that it is a polygenic disorder influenced by many genes of small effect. We tested whether these genes cluster in cellular function by applying gene-set analysis using expert curated sets of brain-expressed genes in the current largest available Tourette syndrome genome-wide association data set, involving 1285 cases and 4964 controls. The gene sets included specific synaptic, astrocytic, oligodendrocyte and microglial functions. We report association of Tourette syndrome with a set of genes involved in astrocyte function, specifically in astrocyte carbohydrate metabolism. This association is driven primarily by a subset of 33 genes involved in glycolysis and glutamate metabolism through which astrocytes support synaptic function. Our results indicate for the first time that the process of astrocyte-neuron metabolic coupling may be an important contributor to Tourette syndrome pathogenesis.
Involvement of astrocyte metabolic coupling in Tourette syndrome pathogenesis

PubMed Central

de Leeuw, Christiaan; Goudriaan, Andrea; Smit, August B; Yu, Dongmei; Mathews, Carol A; Scharf, Jeremiah M; Scharf, J M; Pauls, D L; Yu, D; Illmann, C; Osiecki, L; Neale, B M; Mathews, C A; Reus, V I; Lowe, T L; Freimer, N B; Cox, N J; Davis, L K; Rouleau, G A; Chouinard, S; Dion, Y; Girard, S; Cath, D C; Posthuma, D; Smit, J H; Heutink, P; King, R A; Fernandez, T; Leckman, J F; Sandor, P; Barr, C L; McMahon, W; Lyon, G; Leppert, M; Morgan, J; Weiss, R; Grados, M A; Singer, H; Jankovic, J; Tischfield, J A; Heiman, G A; Verheijen, Mark H G; Posthuma, Danielle

2015-01-01

Tourette syndrome is a heritable neurodevelopmental disorder whose pathophysiology remains unknown. Recent genome-wide association studies suggest that it is a polygenic disorder influenced by many genes of small effect. We tested whether these genes cluster in cellular function by applying gene-set analysis using expert curated sets of brain-expressed genes in the current largest available Tourette syndrome genome-wide association data set, involving 1285 cases and 4964 controls. The gene sets included specific synaptic, astrocytic, oligodendrocyte and microglial functions. We report association of Tourette syndrome with a set of genes involved in astrocyte function, specifically in astrocyte carbohydrate metabolism. This association is driven primarily by a subset of 33 genes involved in glycolysis and glutamate metabolism through which astrocytes support synaptic function. Our results indicate for the first time that the process of astrocyte-neuron metabolic coupling may be an important contributor to Tourette syndrome pathogenesis. PMID:25735483
Discovering the Deregulated Molecular Functions Involved in Malignant Transformation of Endometriosis to Endometriosis-Associated Ovarian Carcinoma Using a Data-Driven, Function-Based Analysis

PubMed Central

Chang, Chia-Ming; Yang, Yi-Ping; Chuang, Jen-Hua; Chuang, Chi-Mu; Lin, Tzu-Wei; Wang, Peng-Hui; Yu, Mu-Hsien

2017-01-01

The clinical characteristics of clear cell carcinoma (CCC) and endometrioid carcinoma (EC) are concomitant with endometriosis (ES), which has led to the postulation of malignant transformation of ES to endometriosis-associated ovarian carcinoma (EAOC). Different deregulated functional areas have been proposed to account for the pathogenesis of EAOC transformation, but there is still a lack of a data-driven analysis that uses the accumulated experimental data in publicly available databases to integrate the deregulated functions involved in the malignant transformation of EAOC. We used the microarray gene expression datasets of ES, CCC and EC downloaded from the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) database. Then, we investigated the pathogenesis of EAOC by a data-driven, function-based analytic model with the quantified molecular functions defined by 1454 Gene Ontology (GO) term gene sets. This model converts the gene expression profiles to the functionome consisting of 1454 quantified GO functions, and then, the key functions involved in the malignant transformation of EAOC can be extracted by a series of filters. Our results demonstrate that deregulated oxidoreductase activity, metabolism, hormone activity, inflammatory response, innate immune response and cell-cell signaling play key roles in the malignant transformation of EAOC. These results provide evidence supporting the specific molecular pathways involved in the malignant transformation of EAOC. PMID:29113136
Evidence against the selfish operon theory.

PubMed

Pál, Csaba; Hurst, Laurence D

2004-06-01

According to the selfish operon hypothesis, the clustering of genes and their subsequent organization into operons is beneficial for the constituent genes because it enables the horizontal gene transfer of weakly selected, functionally coupled genes. The majority of these are expected to be non-essential genes. From our analysis of the Escherichia coli genome, we conclude that the selfish operon hypothesis is unlikely to provide a general explanation for clustering nor can it account for the gene composition of operons. Contrary to expectations, essential genes with related functions have an especially strong tendency to cluster, even if they are not in operons. Moreover, essential genes are particularly abundant in operons.
Comparative phylogenetic analysis and transcriptional profiling of MADS-box gene family identified DAM and FLC-like genes in apple (Malus × domestica)

PubMed Central

Kumar, Gulshan; Arya, Preeti; Gupta, Khushboo; Randhawa, Vinay; Acharya, Vishal; Singh, Anil Kumar

2016-01-01

The MADS-box transcription factors play essential roles in various processes of plant growth and development. In the present study, phylogenetic analysis of 142 apple MADS-box proteins with that of other dicotyledonous species identified six putative Dormancy-Associated MADS-box (DAM) and four putative Flowering Locus C-like (FLC-like) proteins. In order to study the expression of apple MADS-box genes, RNA-seq analysis of 3 apical and 5 spur bud stages during dormancy, 6 flower stages and 7 fruit development stages was performed. The dramatic reduction in expression of two MdDAMs, MdMADS063 and MdMADS125 and two MdFLC-like genes, MdMADS135 and MdMADS136 during dormancy release suggests their role as flowering-repressors in apple. Apple orthologs of Arabidopsis genes, FLOWERING LOCUS T, FRIGIDA, SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 and LEAFY exhibit similar expression patterns as reported in Arabidopsis, suggesting functional conservation in floral signal integration and meristem determination pathways. Gene ontology enrichment analysis of predicted targets of DAM revealed their involvement in regulation of reproductive processes and meristematic activities, indicating functional conservation of SVP orthologs (DAM) in apple. This study provides valuable insights into the functions of MADS-box proteins during apple phenology, which may help in devising strategies to improve important traits in apple. PMID:26856238
Horizontal gene transfer in silkworm, Bombyx mori.

PubMed

Zhu, Bo; Lou, Miao-Miao; Xie, Guan-Lin; Zhang, Guo-Qing; Zhou, Xue-Ping; Li, Bin; Jin, Gu-Lei

2011-05-19

The domesticated silkworm, Bombyx mori, is the model insect for the order Lepidoptera, is economically important, and has gained some representative behavioral characteristics compared to its wild ancestor. The genome of B. mori has been fully sequenced, and functional analysis of the BmChi-h and BmSuc1 genes revealed that horizontal gene transfer (HGT) may bestow a clear selective advantage on B. mori. However, the role of HGT in the evolutionary history of B. mori is largely unexplored. In this study, we compare the whole genome of B. mori with those of 382 prokaryotic and eukaryotic species to investigate potential HGTs. Ten candidate HGT events were defined in B. mori by comprehensive sequence analysis using Maximum Likelihood and Bayesian methods combined with EST checking. Phylogenetic analysis of the candidate HGT genes suggested that one HGT was a plant-to-B. mori transfer while nine were bacteria-to-B. mori transfers. Furthermore, functional analysis based on expression, coexpression and related literature searching revealed that several HGT candidate genes have added important characters, such as resistance to pathogens, to B. mori. Results from this study clearly demonstrate that HGTs play an important role in the evolution of B. mori, although the number of HGT events in B. mori is in general smaller than those of microbes and other insects. In particular, interdomain HGTs in B. mori may give rise to functional, persistent, and possibly evolutionarily significant new genes.
Comparative phylogenetic analysis and transcriptional profiling of MADS-box gene family identified DAM and FLC-like genes in apple (Malus × domestica).

PubMed

Kumar, Gulshan; Arya, Preeti; Gupta, Khushboo; Randhawa, Vinay; Acharya, Vishal; Singh, Anil Kumar

2016-02-09

The MADS-box transcription factors play essential roles in various processes of plant growth and development. In the present study, phylogenetic analysis of 142 apple MADS-box proteins with that of other dicotyledonous species identified six putative Dormancy-Associated MADS-box (DAM) and four putative Flowering Locus C-like (FLC-like) proteins. In order to study the expression of apple MADS-box genes, RNA-seq analysis of 3 apical and 5 spur bud stages during dormancy, 6 flower stages and 7 fruit development stages was performed. The dramatic reduction in expression of two MdDAMs, MdMADS063 and MdMADS125 and two MdFLC-like genes, MdMADS135 and MdMADS136 during dormancy release suggests their role as flowering-repressors in apple. Apple orthologs of Arabidopsis genes, FLOWERING LOCUS T, FRIGIDA, SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 and LEAFY exhibit similar expression patterns as reported in Arabidopsis, suggesting functional conservation in floral signal integration and meristem determination pathways. Gene ontology enrichment analysis of predicted targets of DAM revealed their involvement in regulation of reproductive processes and meristematic activities, indicating functional conservation of SVP orthologs (DAM) in apple. This study provides valuable insights into the functions of MADS-box proteins during apple phenology, which may help in devising strategies to improve important traits in apple.
RAS oncogene-mediated deregulation of the transcriptome: from molecular signature to function.

PubMed

Schäfer, Reinhold; Sers, Christine

2011-01-01

Transcriptome analysis of cancer cells has developed into a standard procedure to elucidate multiple features of the malignant process and to link gene expression to clinical properties. Gene expression profiling based on microarrays provides essentially correlative information and needs to be transferred to the functional level in order to understand the activity and contribution of individual genes or sets of genes as elements of the gene signature. To date, there exist significant gaps in the functional understanding of gene expression profiles. Moreover, the processes that drive the profound transcriptional alterations that characterize cancer cells remain largely elusive. We have used pathway-restricted gene expression profiles derived from RAS oncogene-transformed cells and from RAS-expressing cancer cells to identify regulators downstream of the MAPK pathway. We describe the role of epigenetic regulation exemplified by the control of several immune genes in generic cell lines and colorectal cancer cells, particularly the functional interaction between signaling and DNA methylation. Moreover, we assess the role of the architectural transcription factor high mobility group AT-hook 2 (HMGA2) as a regulator of the RAS-responsive transcriptome in ovarian epithelial cells. Finally, we describe an integrated approach combining pathway interference in colorectal cancer cells, gene expression profiling and computational analysis of regulatory elements of deregulated target genes. This strategy resulted in the identification of Y-box binding protein 1 (YBX1) as a regulator of MAPK-dependent proliferation and gene expression. The implications for a therapeutic application of HMGA2 gene silencing and the role of YBX1 as a prognostic factor are discussed.
Identification of possible genetic polymorphisms involved in cancer cachexia: a systematic review.

PubMed

Tan, Benjamin H L; Ross, James A; Kaasa, Stein; Skorpen, Frank; Fearon, Kenneth C H

2011-04-01

Cancer cachexia is a polygenic and complex syndrome. Genetic variations in regulation of the inflammatory response, muscle and fat metabolic pathways, and pathways in appetite regulation are likely to contribute to the susceptibility or resistance to developing cancer cachexia. A systematic search of Medline and EmBase databases, covering 1986-2008 was performed for potential candidate genes/genetic polymorphisms relating to cancer cachexia. Related genes were then identified using pathway functional analysis software. All candidate genes were reviewed for functional polymorphisms or clinically significant polymorphisms associated with cachexia using the OMIM and GeneRIF databases. Genes with variants which had functional or clinical associations with cachexia and replicated in at least one study were entered into pathway analysis software to reveal possible network associations between genes. A total of 184 polymorphisms with functional or clinical relevance to cancer cachexia were identified in 92 candidate genes. Of these, 42 polymorphisms (in 33 genes) were replicated in more than one study with 13 polymorphisms found to influence two or more hallmarks of cachexia (i.e. inflammation, loss of fat mass and/or lean mass and reduced survival). Thirty-three genes were found to be significantly interconnected in two major networks with four genes (ADIPOQ, IL6, NFKB1 and TLR4) interlinking both networks. Selection of candidate genes and polymorphisms is a key element of multigene study design. The present study provides an initial framework to select genes/polymorphisms for further study in cancer cachexia, and to develop their potential as susceptibility biomarkers of developing cachexia.
A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

PubMed Central

Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

2017-01-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238
A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

PubMed

Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

2017-09-01

Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

PubMed Central

2010-01-01

Background: Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results: We have identified a total of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress-responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expression of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions: The genome-scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop, although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
Identification of gene expression profiles and key genes in subchondral bone of osteoarthritis using weighted gene coexpression network analysis.

PubMed

Guo, Sheng-Min; Wang, Jian-Xiong; Li, Jin; Xu, Fang-Yuan; Wei, Quan; Wang, Hai-Ming; Huang, Hou-Qiang; Zheng, Si-Lin; Xie, Yu-Jie; Zhang, Chi

2018-06-15

Osteoarthritis (OA) significantly influences the quality of life of people around the world. It is urgent to find an effective way to understand the genetic etiology of OA. We used weighted gene coexpression network analysis (WGCNA) to explore the key genes involved in the subchondral bone pathological process of OA. Fifty gene expression profiles of GSE51588 were downloaded from the Gene Expression Omnibus database. The OA-associated genes and gene ontologies were acquired from JuniorDoc. Weighted gene coexpression network analysis was used to find disease-related networks based on 21756 gene expression correlation coefficients, hub-genes with the highest connectivity in each module were selected, and the correlation between module eigengene and clinical traits was calculated. The genes in the traits-related gene coexpression modules were subject to functional annotation and pathway enrichment analysis using ClusterProfiler. A total of 73 gene modules were identified, of which 12 modules were found to have high connectivity with clinical traits. Five modules were found to be enriched in OA-associated genes. Moreover, 310 OA-associated genes were found, and 34 of them were among the hub-genes in each module. Consequently, enrichment results indicated some key metabolic pathways, such as extracellular matrix (ECM)-receptor interaction (hsa04512), focal adhesion (hsa04510), the phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway (PI3K-AKT) (hsa04151), transforming growth factor beta pathway, and Wnt pathway. We intended to identify some core genes, collagen (COL)6A3, COL6A1, ITGA11, BAMBI, and HCK, which could influence downstream signaling pathways once they were activated. In this study, we identified important genes within key coexpression modules, which associate with a pathological process of subchondral bone in OA. Functional analysis results could provide important information to understand the mechanism of OA. © 2018 Wiley Periodicals, Inc.
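The weighted-network step named in the abstract above can be illustrated with a small sketch. It is not the WGCNA R package and does not reproduce the authors' pipeline: it only shows the general idea of raising absolute Pearson correlations to a soft-threshold power and picking the gene with the highest connectivity. The gene names, expression values and the power beta=6 are assumptions chosen for illustration, and module detection and eigengene-trait correlation are left out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: |cor(gene_i, gene_j)| ** beta.

    `expression` maps a gene name to its expression values across samples.
    `beta` is an assumed soft-threshold power, not a value from the abstract.
    """
    genes = sorted(expression)
    adjacency = {gene: {} for gene in genes}
    for i, gene in enumerate(genes):
        for other in genes[i + 1:]:
            weight = abs(pearson(expression[gene], expression[other])) ** beta
            adjacency[gene][other] = weight
            adjacency[other][gene] = weight
    return adjacency


def connectivity(adjacency):
    """Sum of edge weights from each gene to all other genes."""
    return {gene: sum(edges.values()) for gene, edges in adjacency.items()}


def hub_gene(adjacency):
    """Gene with the highest connectivity in the network."""
    k = connectivity(adjacency)
    return max(k, key=k.get)


# Hypothetical expression profiles (four samples per gene).
toy_expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [1.0, 3.0, 2.0, 4.0],
    "gene_c": [1.1, 1.9, 3.2, 3.9],
    "gene_d": [4.0, 1.0, 3.0, 2.0],
}

if __name__ == "__main__":
    adj = soft_threshold_adjacency(toy_expression)
    print(hub_gene(adj))
```

Raising the correlation to a power suppresses weak correlations while keeping the network weighted rather than applying a hard cutoff. In a full analysis the hub gene would be selected within each module, not across the whole network as in this sketch.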
GIANT 2.0: genome-scale integrated analysis of gene networks in tissues.

PubMed

Wong, Aaron K; Krishnan, Arjun; Troyanskaya, Olga G

2018-05-25

GIANT2 (Genome-wide Integrated Analysis of gene Networks in Tissues) is an interactive web server that enables biomedical researchers to analyze their proteins and pathways of interest and generate hypotheses in the context of genome-scale functional maps of human tissues. The precise actions of genes are frequently dependent on their tissue context, yet direct assay of tissue-specific protein function and interactions remains infeasible in many normal human tissues and cell-types. With GIANT2, researchers can explore predicted tissue-specific functional roles of genes and reveal changes in those roles across tissues, all through interactive multi-network visualizations and analyses. Additionally, the NetWAS approach available through the server uses tissue-specific/cell-type networks predicted by GIANT2 to re-prioritize statistical associations from GWAS studies and identify disease-associated genes. GIANT2 predicts tissue-specific interactions by integrating diverse functional genomics data from now over 61 400 experiments for 283 diverse tissues and cell-types. GIANT2 does not require any registration or installation and is freely available for use at http://giant-v2.princeton.edu.
De Novo Transcriptome Assembly and Characterization of Lithospermum officinale to Discover Putative Genes Involved in Specialized Metabolites Biosynthesis.

PubMed

Rai, Amit; Nakaya, Taiki; Shimizu, Yohei; Rai, Megha; Nakamura, Michimi; Suzuki, Hideyuki; Saito, Kazuki; Yamazaki, Mami

2018-05-29

Lithospermum officinale is a valuable source of bioactive metabolites with medicinal and industrial value. However, little is known about the genes involved in the biosynthesis of these metabolites, primarily due to the lack of genome or transcriptome resources. This study presents the first effort to establish and characterize a de novo transcriptome assembly resource for L. officinale and expression analysis for three of its tissues, namely leaf, stem, and root. Using over 4 Gbps of RNA-sequencing datasets, we obtained a de novo transcriptome assembly of L. officinale, consisting of 77,047 unigenes with an assembly N50 value of 1524 bp. Based on transcriptome annotation and functional classification, 52,766 unigenes were assigned putative gene functions, gene ontology terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. KEGG pathway and gene ontology enrichment analysis using highly expressed unigenes across three tissues and targeted metabolome analysis showed active secondary metabolic processes enriched specifically in the root of L. officinale. Using co-expression analysis, we also identified 20 and 48 unigenes representing different enzymes of lithospermic/chlorogenic acid and shikonin biosynthesis pathways, respectively. We further identified 15 candidate unigenes annotated as cytochrome P450 with the highest expression in the root of L. officinale as novel genes with a role in key biochemical reactions toward shikonin biosynthesis. Thus, through this study, we not only generated a high-quality genomic resource for L. officinale but also propose candidate genes involved in shikonin biosynthesis pathways for further functional characterization. Georg Thieme Verlag KG Stuttgart · New York.
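The assembly N50 reported above is a standard summary statistic: the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of that calculation follows; the unigene lengths are made-up toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and walk down the list until the
    running total reaches half of the summed length; the length at which
    that happens is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical unigene lengths in base pairs.
toy_unigene_lengths = [200, 300, 500, 800, 1200, 2000]

if __name__ == "__main__":
    print(n50(toy_unigene_lengths))
```

A higher N50 generally indicates that more of the assembly sits in long contigs or unigenes, although it says nothing by itself about correctness of the assembled sequences.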
Novel candidate genes of the PARK7 interactome as mediators of apoptosis and acetylation in multiple sclerosis: An in silico analysis.

PubMed

Vavougios, George D; Zarogiannis, Sotirios G; Krogfelt, Karen Angeliki; Gourgoulianis, Konstantinos; Mitsikostas, Dimos Dimitrios; Hadjigeorgiou, Georgios

2018-01-01

Currently, only 4 studies have explored the potential role of PARK7's dysregulation in MS pathophysiology. No study has yet evaluated the potential role of the PARK7 interactome in MS. The aim of our study was to assess the differential expression of PARK7 mRNA in peripheral blood mononuclear cells (PBMCs) donated from MS versus healthy patients using data mining techniques. The PARK7 interactome data from the GDS3920 profile were scrutinized for differentially expressed genes (DEGs); Gene Enrichment Analysis (GEA) was used to detect significantly enriched biological functions. 27 differentially expressed genes in the MS dataset were detected; 12 of these (NDUFA4, UBA2, TDP2, NPM1, NDUFS3, SUMO1, PIAS2, KIAA0101, RBBP4, NONO, RBBP7 and HSPA4) are reported for the first time in MS. Stepwise Linear Discriminant Function Analysis constructed a predictive model (Wilks' λ = 0.176, χ² = 45.204, p = 1.5275e-10) with 2 variables (TDP2, RBBP4) that achieved 96.6% accuracy when discriminating between patients and controls. Gene Enrichment Analysis revealed that induction and regulation of programmed / intrinsic cell death represented the most salient Gene Ontology annotations. Cross-validation on systemic lupus erythematosus and ischemic stroke datasets revealed that these functions are unique to the MS dataset. Based on our results, novel potential target genes are revealed; these differentially expressed genes regulate epigenetic and apoptotic pathways that may further elucidate underlying mechanisms of autoreactivity in MS. Copyright © 2017 Elsevier B.V. All rights reserved.
Text mining and network analysis to find functional associations of genes in high altitude diseases.

PubMed

Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar

2018-05-02

Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
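The three steps described above (extract co-occurring gene pairs, rank them by frequency, build a network) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: gene recognition in the text is assumed to have been done already, raw co-occurrence counts stand in for the unspecified "statistical measures", and the per-abstract gene lists are invented toy data that merely reuse gene symbols from the abstract.

```python
from collections import Counter
from itertools import combinations


def rank_cooccurring_pairs(gene_mentions_per_abstract):
    """Count how many abstracts mention each gene pair, most frequent first.

    Each input item is the list of gene symbols found in one abstract.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    pair_counts = Counter()
    for genes in gene_mentions_per_abstract:
        unique_genes = sorted(set(genes))  # count a pair once per abstract
        pair_counts.update(combinations(unique_genes, 2))
    return sorted(pair_counts.items(), key=lambda item: (-item[1], item[0]))


def partner_counts(ranked_pairs, min_count=1):
    """Number of distinct co-occurrence partners per gene (network degree).

    Only pairs seen in at least `min_count` abstracts become edges.
    """
    partners = {}
    for (gene_a, gene_b), count in ranked_pairs:
        if count >= min_count:
            partners.setdefault(gene_a, set()).add(gene_b)
            partners.setdefault(gene_b, set()).add(gene_a)
    return {gene: len(found) for gene, found in partners.items()}


# Hypothetical gene mentions, one list per abstract.
toy_abstracts = [
    ["EPAS1", "EGLN1"],
    ["EPAS1", "EGLN1", "EPO"],
    ["EPO", "IL6"],
    ["EGLN1", "EPAS1"],
]

if __name__ == "__main__":
    ranked = rank_cooccurring_pairs(toy_abstracts)
    for pair, count in ranked:
        print(pair, count)
    print(partner_counts(ranked))
```

Genes with many partners correspond to the highly connected genes the abstract reports, while a pair with a high count corresponds to a strongly substantiated association.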
Comparative analysis of protein interactome networks prioritizes candidate genes with cancer signatures.

PubMed

Li, Yongsheng; Sahni, Nidhi; Yi, Song

2016-11-29

Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.
Ortholog-based screening and identification of genes related to intracellular survival.

PubMed

Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

2018-04-20

Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.

GeoChip-Based Analysis of the Functional Gene Diversity and Metabolic Potential of Microbial Communities in Acid Mine Drainage

PubMed Central

Xie, Jianping; He, Zhili; Liu, Xinxing; Liu, Xueduan; Van Nostrand, Joy D.; Deng, Ye; Wu, Liyou; Zhou, Jizhong; Qiu, Guanzhou

2011-01-01

Acid mine drainage (AMD) is an extreme environment, usually with low pH and high concentrations of metals. Although the phylogenetic diversity of AMD microbial communities has been examined extensively, little is known about their functional gene diversity and metabolic potential. In this study, a comprehensive functional gene array (GeoChip 2.0) was used to analyze the functional diversity, composition, structure, and metabolic potential of AMD microbial communities from three copper mines in China. GeoChip data indicated that these microbial communities were functionally diverse as measured by the number of genes detected, gene overlapping, unique genes, and various diversity indices. Almost all key functional gene categories targeted by GeoChip 2.0 were detected in the AMD microbial communities, including carbon fixation, carbon degradation, methane generation, nitrogen fixation, nitrification, denitrification, ammonification, nitrogen reduction, sulfur metabolism, metal resistance, and organic contaminant degradation, which suggested that the functional gene diversity was higher than was previously thought. Mantel test results indicated that AMD microbial communities are shaped largely by surrounding environmental factors (e.g., S, Mg, and Cu). Functional genes (e.g., narG and norB) and several key functional processes (e.g., methane generation, ammonification, denitrification, sulfite reduction, and organic contaminant degradation) were significantly (P < 0.10) correlated with environmental variables. This study presents an overview of functional gene diversity and the structure of AMD microbial communities and also provides insights into our understanding of metabolic potential in AMD ecosystems. PMID:21097602
Transcriptome profiling in engrailed-2 mutant mice reveals common molecular pathways associated with autism spectrum disorders.

PubMed

Sgadò, Paola; Provenzano, Giovanni; Dassi, Erik; Adami, Valentina; Zunino, Giulia; Genovesi, Sacha; Casarosa, Simona; Bozzi, Yuri

2013-12-19

Transcriptome analysis has been used in autism spectrum disorder (ASD) to unravel common pathogenic pathways based on the assumption that distinct rare genetic variants or epigenetic modifications affect common biological pathways. To unravel recurrent ASD-related neuropathological mechanisms, we took advantage of the En2-/- mouse model and performed transcriptome profiling on cerebellar and hippocampal adult tissues. Cerebellar and hippocampal tissue samples from three En2-/- and wild type (WT) littermate mice were assessed for differential gene expression using microarray hybridization followed by RankProd analysis. To identify functional categories overrepresented in the differentially expressed genes, we used integrated gene-network analysis, gene ontology enrichment and mouse phenotype ontology analysis. Furthermore, we performed direct enrichment analysis of ASD-associated genes from the SFARI repository in our differentially expressed genes. Given the limited number of animals used in the study, we used permissive criteria and identified 842 differentially expressed genes in En2-/- cerebellum and 862 in the En2-/- hippocampus. Our functional analysis revealed that the molecular signature of En2-/- cerebellum and hippocampus shares convergent pathological pathways with ASD, including abnormal synaptic transmission, altered developmental processes and increased immune response. Furthermore, when directly compared to the repository of the SFARI database, our differentially expressed genes in the hippocampus showed enrichment of ASD-associated genes significantly higher than previously reported. qPCR was performed for representative genes to confirm relative transcript levels compared to those detected in microarrays. Despite the limited number of animals used in the study, our bioinformatic analysis indicates the En2-/- mouse is a valuable tool for investigating molecular alterations related to ASD.
Large-scale gene function analysis with the PANTHER classification system.

PubMed

Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

2013-08-01

The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
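One common statistical test for this kind of gene-list analysis is an overrepresentation test, which asks whether a functional category appears in an uploaded gene list more often than expected by chance. The sketch below computes a one-sided hypergeometric tail probability in plain Python. It is an illustration of the general idea only, not PANTHER's implementation, and the function name and the toy numbers are assumptions.

```python
from math import comb


def overrepresentation_p(population, annotated, sample, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, sample).

    population: number of genes in the reference set
    annotated:  reference genes carrying the functional term
    sample:     number of genes in the uploaded list
    overlap:    uploaded genes carrying the functional term
    """
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(population, sample)


# Hypothetical toy case: 10 reference genes, 4 annotated, a list of 3 with 2 hits.
p_toy = overrepresentation_p(population=10, annotated=4, sample=3, overlap=2)
```

For the toy case the tail sums to 40 of 120 possible draws, so `p_toy` is 1/3. In a genome-wide analysis many terms are tested at once, so a multiple-testing correction would normally follow.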
Smooth Muscle Cell Genome Browser: Enabling the Identification of Novel Serum Response Factor Target Genes

PubMed Central

Lee, Moon Young; Park, Chanjae; Berent, Robyn M.; Park, Paul J.; Fuchs, Robert; Syn, Hannah; Chin, Albert; Townsend, Jared; Benson, Craig C.; Redelman, Doug; Shen, Tsai-wei; Park, Jong Kun; Miano, Joseph M.; Sanders, Kenton M.; Ro, Seungil

2015-01-01

Genome-scale expression data on the absolute numbers of gene isoforms offers essential clues in cellular functions and biological processes. Smooth muscle cells (SMCs) perform a unique contractile function through expression of specific genes controlled by serum response factor (SRF), a transcription factor that binds to DNA sites known as the CArG boxes. To identify SRF-regulated genes specifically expressed in SMCs, we isolated SMC populations from mouse small intestine and colon, obtained their transcriptomes, and constructed an interactive SMC genome and CArGome browser. To our knowledge, this is the first online resource that provides a comprehensive library of all genetic transcripts expressed in primary SMCs. The browser also serves as the first genome-wide map of SRF binding sites. The browser analysis revealed novel SMC-specific transcriptional variants and SRF target genes, which provided new and unique insights into the cellular and biological functions of the cells in gastrointestinal (GI) physiology. The SRF target genes in SMCs, which were discovered in silico, were confirmed by proteomic analysis of SMC-specific Srf knockout mice. Our genome browser offers a new perspective into the alternative expression of genes in the context of SRF binding sites in SMCs and provides a valuable reference for future functional studies. PMID:26241044
Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics

PubMed Central

Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed

2016-01-01

In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of next-generation sequencing combined with effective computational approaches, a wide variety of plant species have been fully sequenced, yielding a wealth of sequence data on the structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may provide a complete view at the systems biology level. Reverse genetics methodologies have been extensively investigated and deployed for assigning biological function to a specific gene or gene product. We provide here an updated overview of these high-throughput strategies, highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003
Comparative genomic analysis of the WRKY III gene family in populus, grape, arabidopsis and rice.

PubMed

Wang, Yiyi; Feng, Lin; Zhu, Yuxin; Li, Yuan; Yan, Hanwei; Xiang, Yan

2015-09-08

WRKY III genes have significant functions in regulating plant development and resistance. The WRKY gene family has been studied in many plant species; however, a comprehensive analysis of WRKY III genes in the woody plant species poplar is still lacking. Three representative lineages of flowering plant species are incorporated in most analyses: Arabidopsis (a model plant for annual herbaceous dicots), grape (one model plant for perennial dicots) and Oryza sativa (a model plant for monocots). In this study, we identified 10, 6, 13 and 28 WRKY III genes in the genomes of Populus trichocarpa, grape (Vitis vinifera), Arabidopsis thaliana and rice (Oryza sativa), respectively. Phylogenetic analysis revealed that the WRKY III proteins could be divided into four clades. By microsynteny analysis, we found that the duplicated regions were more conserved between poplar and grape than between poplar and Arabidopsis or rice. We dated their duplications by Ks analysis of Populus WRKY III genes and demonstrated that all the blocks were formed after the divergence of monocots and dicots. Strong purifying selection has played a key role in the maintenance of WRKY III genes in Populus. Tissue expression analysis of the WRKY III genes in Populus revealed that five were most highly expressed in the xylem. We also performed quantitative real-time reverse transcription PCR analysis of WRKY III genes in Populus treated with salicylic acid, abscisic acid and polyethylene glycol to explore their stress-related expression patterns. This study highlighted the duplication and diversification of the WRKY III gene family in Populus and provided a comprehensive analysis of this gene family in the Populus genome. Our results indicated that the majority of WRKY III genes of Populus were expanded by large-scale gene duplication. The expression patterns of the PtrWRKYIII genes indicated that these genes play important roles in the xylem during poplar growth and development and may play a crucial role in defense against drought stress. The results presented here may aid in the selection of appropriate candidate genes for further characterization of their biological functions in poplar.
Comparative analysis of gene expression profiles of OPN signaling pathway in four kinds of liver diseases.

PubMed

Wang, Gaiping; Chen, Shasha; Zhao, Congcong; Li, Xiaofang; Zhao, Weiming; Yang, Jing; Chang, Cuifang; Xu, Cunshuan

2016-09-01

To explore the relevance of the OPN signalling pathway to the occurrence and development of nonalcoholic fatty liver disease (NAFLD), liver cirrhosis (LC), hepatic cancer (HC) and acute hepatic failure (AHF) at the transcriptional level, the Rat Genome 230 2.0 Array was used to detect expression profiles of OPN signalling pathway-related genes in four kinds of liver diseases. The results showed that 23, 33, 59 and 74 genes were significantly changed in the above four kinds of liver diseases, respectively. H-clustering analysis showed that the expression profiles of OPN signalling-related genes were notably different in the four kinds of liver diseases. Subsequently, the 147 above-mentioned genes in total were categorized into four clusters by k-means according to the similarity of gene expression, and expression analysis systematic explorer (EASE) functional enrichment analysis revealed that OPN signalling pathway-related genes were involved in cell adhesion and migration, cell proliferation, apoptosis, stress and inflammatory reaction, etc. Finally, ingenuity pathway analysis (IPA) software was used to predict the functions of OPN signalling-related genes, and the results indicated that the activities of ROS production, cell adhesion and migration, and cell proliferation were remarkably increased, while those of apoptosis, stress and inflammatory reaction were reduced in the four kinds of liver diseases. In summary, the above physiological activities changed more obviously in LC, HC and AHF than in NAFLD.
Systematic Analysis of the 4-Coumarate:Coenzyme A Ligase (4CL) Related Genes and Expression Profiling during Fruit Development in the Chinese Pear

PubMed Central

Cao, Yunpeng; Han, Yahui; Li, Dahui; Lin, Yi; Cai, Yongping

2016-01-01

In plants, 4-coumarate:coenzyme A ligases (4CLs), comprising some of the adenylate-forming enzymes, are key enzymes involved in regulating lignin metabolism and the biosynthesis of flavonoids and other secondary metabolites. Although several 4CL-related proteins were shown to play roles in secondary metabolism, no comprehensive study on 4CL-related genes in the pear and other Rosaceae species has been reported. In this study, we identified 4CL-related genes in the apple, peach, yangmei, and pear genomes using DNATOOLS software and inferred their evolutionary relationships using phylogenetic analysis, collinearity analysis, conserved motif analysis, and structure analysis. A total of 149 4CL-related genes in four Rosaceous species (pear, apple, peach, and yangmei) were identified, with 30 members in the pear. We explored the functions of several 4CL and acyl-coenzyme A synthetase (ACS) genes during the development of pear fruit by quantitative real-time PCR (qRT-PCR). We found that duplication events had occurred in the 30 4CL-related genes in the pear. These duplicated 4CL-related genes are distributed unevenly across all pear chromosomes except chromosomes 4, 8, 11, and 12. The results of this study provide a basis for further investigation of both the functions and evolutionary history of 4CL-related genes. PMID:27775579
Systematic Analysis of the 4-Coumarate:Coenzyme A Ligase (4CL) Related Genes and Expression Profiling during Fruit Development in the Chinese Pear.

PubMed

Cao, Yunpeng; Han, Yahui; Li, Dahui; Lin, Yi; Cai, Yongping

2016-10-19

In plants, 4-coumarate:coenzyme A ligases (4CLs), comprising some of the adenylate-forming enzymes, are key enzymes involved in regulating lignin metabolism and the biosynthesis of flavonoids and other secondary metabolites. Although several 4CL-related proteins were shown to play roles in secondary metabolism, no comprehensive study on 4CL-related genes in the pear and other Rosaceae species has been reported. In this study, we identified 4CL-related genes in the apple, peach, yangmei, and pear genomes using DNATOOLS software and inferred their evolutionary relationships using phylogenetic analysis, collinearity analysis, conserved motif analysis, and structure analysis. A total of 149 4CL-related genes in four Rosaceous species (pear, apple, peach, and yangmei) were identified, with 30 members in the pear. We explored the functions of several 4CL and acyl-coenzyme A synthetase (ACS) genes during the development of pear fruit by quantitative real-time PCR (qRT-PCR). We found that duplication events had occurred in the 30 4CL-related genes in the pear. These duplicated 4CL-related genes are distributed unevenly across all pear chromosomes except chromosomes 4, 8, 11, and 12. The results of this study provide a basis for further investigation of both the functions and evolutionary history of 4CL-related genes.
Expression and functional analysis of genes encoding cytokinin receptor-like histidine kinase in maize (Zea mays L.).

PubMed

Wang, Bo; Chen, Yanhong; Guo, Baojian; Kabir, Muhammad Rezaul; Yao, Yingyin; Peng, Huiru; Xie, Chaojie; Zhang, Yirong; Sun, Qixin; Ni, Zhongfu

2014-08-01

Cytokinin signaling, which functions via the two-component system (TCS), is vital for plant growth and development. As one of the key components of TCS, transmembrane histidine kinases (HK) are encoded by a small gene family in plants. In this study, we focused on expression and functional analysis of cytokinin receptor-like HK genes (ZmHK) in maize. Firstly, bioinformatics analysis revealed that seven cloned ZmHK genes have different expression patterns during maize development. Secondly, ectopic expression driven by the CaMV35S promoter in Arabidopsis further revealed that functional differentiation exists among these seven members. Among them, the ZmHK1a2-OX transgenic line has the lowest germination rate in the dark, ZmHK1-OX and ZmHK2a2-OX can delay leaf senescence, and seed size of ZmHK1-OX, ZmHK1a2-OX, ZmHK2-OX, ZmHK3b-OX and ZmHK2a2-OX was obviously reduced as compared to wild type. Additionally, ZmHK genes play opposite roles in shoot and root development; all ZmHK-OX transgenic lines display obviously shorter roots and a reduced number of lateral roots, but enhanced shoot development compared with the wild type. Most notably, the Arabidopsis response regulator gene ARR5 was up-regulated in ZmHK1-OX, ZmHK1a2-OX, ZmHK2-OX, ZmHK3b-OX and ZmHK2a2-OX as compared to wild type. Although the causal link between ZmHK genes and the cytokinin signaling pathway remains to be further elucidated, these findings indicate that diversification of ZmHK gene expression patterns and functions occurred in the course of maize evolution, suggesting that some ZmHK genes might play different roles during maize development.
Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy.

PubMed

O'Reilly, Paul; Ortutay, Csaba; Gernon, Grainne; O'Connell, Enda; Seoighe, Cathal; Boyce, Susan; Serrano, Luis; Szegezdi, Eva

2014-12-19

Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach however is not well suited to identify interaction between genes whose protein products potentially influence each other, which limits its power to identify molecular wiring of tumour cells dictating response to a drug. Due to the fact that signal transduction pathways are not linear and highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operator curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC=0·84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. 
Importantly, only 12% of the TRAIL-predictor genes were differentially expressed highlighting the importance of functional interactions in predicting the biological response. The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct- and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.
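The abstract above reports prediction accuracy as the area under the receiver operating characteristic curve (AUC). As a hedged illustration of what that number measures, the sketch below computes AUC as the fraction of (sensitive, resistant) pairs in which the sensitive sample receives the higher score. The labels and scores are invented toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = TRAIL-sensitive, 0 = TRAIL-resistant.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In the toy data, 8 of the 9 sensitive-resistant pairs are ranked correctly, giving an AUC of 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.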
Genome-Wide Identification of the Invertase Gene Family in Populus.

PubMed

Chen, Zhong; Gao, Kai; Su, Xiaoxing; Rao, Pian; An, Xinmin

2015-01-01

Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to neutral/alkaline sub-family expansion. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials.
Genome-Wide Identification of the Invertase Gene Family in Populus

PubMed Central

Su, Xiaoxing; Rao, Pian; An, Xinmin

2015-01-01

Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to neutral/alkaline sub-family expansion. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials. PMID:26393355
Ossification of the posterior longitudinal ligament related genes identification using microarray gene expression profiling and bioinformatics analysis.

PubMed

He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian

2014-01-10

Ossification of the posterior longitudinal ligament (OPLL) is a disease associated with physical barriers and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in OPLL patient ligament cells and identify target sites for the prevention and treatment of OPLL in the clinic. Gene expression data set GSE5464 was downloaded from Gene Expression Omnibus; then DEGs were screened with the limma package in R, and the functions and pathways altered in OPLL cells compared to normal cells were identified with DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of DEGs was constructed with STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. The response to wounding function and the Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response to wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene with the highest degree in the network of DEGs; INSR was one of the genes most closely related to it. OPLL-related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL. © 2013.
Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

PubMed

Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

2012-07-15

Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
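The core measurement in the abstract above is the Pearson correlation between the expression profiles of chromosomal neighbors. The sketch below illustrates that step in plain Python and groups consecutive genes into candidate clusters. It is a deliberate simplification: a fixed correlation threshold (an assumed value of 0.8) stands in for the FDR-based comparison against random gene pairs that the authors describe, and the gene names and expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_runs(profiles, threshold=0.8, min_genes=3):
    """Group consecutive genes whose neighbor-to-neighbor correlation >= threshold.

    profiles: list of (gene_name, expression_values) in chromosomal order.
    Returns a list of runs, each a list of gene names with at least min_genes members.
    """
    runs = []
    current = [profiles[0][0]]
    for (_, left), (name, right) in zip(profiles, profiles[1:]):
        if pearson(left, right) >= threshold:
            current.append(name)
        else:
            if len(current) >= min_genes:
                runs.append(current)
            current = [name]
    if len(current) >= min_genes:
        runs.append(current)
    return runs


# Hypothetical toy profiles for five neighboring genes across four experiments.
toy_profiles = [
    ("g1", [1.0, 2.0, 3.0, 4.0]),
    ("g2", [2.0, 4.0, 6.0, 8.0]),
    ("g3", [1.1, 2.0, 3.2, 3.9]),
    ("g4", [4.0, 1.0, 3.0, 2.0]),
    ("g5", [2.0, 3.0, 1.0, 4.0]),
]
toy_runs = coexpressed_runs(toy_profiles)
```

In the toy data the first three genes rise together across experiments and form one run, while g4 and g5 correlate negatively with their neighbors and are left out. The minimum run length of 3 mirrors the smallest cluster size reported in the abstract.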
Partitioning of functional gene expression data using principal points.

PubMed

Kim, Jaehee; Kim, Haseong

2017-10-12

DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process that allows variations in expression levels with a characterized gene function over a period of time. Temporal gene expression curves can be treated as functional data since they are considered as independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. The partitioning of the functional data can find homogeneous subgroups of entities for the massive genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on the orthonormal basis system. A principal points based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity in connectedness after clustering for simulated data and finds significant subsets of genes with increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. The proposed method benefits from the use of principal points, dimension reduction, and the choice of an orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to each set of cell-cycle-regulated time-course yeast genes and E. coli genes. 
The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.
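The abstract above summarizes each time-course profile by a small number of Legendre coefficients. As a rough illustration of that dimension-reduction step only (not the authors' partitioning method), the sketch below projects a densely sampled profile onto Legendre polynomials using the trapezoid rule. The function names and the synthetic profile are assumptions; with the handful of time points typical of real experiments, a least-squares fit would be the more usual choice.

```python
def legendre(k, x):
    """Evaluate the Legendre polynomial P_k(x) with Bonnet's recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def legendre_coefficients(values, degree):
    """Project evenly spaced samples, rescaled to [-1, 1], onto P_0..P_degree.

    Uses c_k = (2k + 1) / 2 * integral of f(x) * P_k(x) over [-1, 1],
    with the integral approximated by the trapezoid rule.
    """
    n = len(values)
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    h = 2.0 / (n - 1)
    coeffs = []
    for k in range(degree + 1):
        terms = [v * legendre(k, x) for v, x in zip(values, xs)]
        integral = h * (sum(terms) - 0.5 * (terms[0] + terms[-1]))
        coeffs.append((2 * k + 1) / 2.0 * integral)
    return coeffs


# Hypothetical synthetic profile: 0.5 * P_1 + 1.0 * P_2 sampled at 2001 points.
grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]
profile = [0.5 * legendre(1, x) + legendre(2, x) for x in grid]
coeffs = legendre_coefficients(profile, degree=3)
```

The recovered coefficients are close to 0, 0.5, 1 and 0, so the whole 2001-point curve is represented by four numbers. Profiles reduced in this way can then be clustered in coefficient space.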
Knowledge Driven Variable Selection (KDVS) – a new approach to enrichment analysis of gene signatures obtained from high-throughput data

PubMed Central

2013-01-01

Background: High-throughput (HT) technologies provide huge amounts of gene expression data that can be used to identify biomarkers useful in clinical practice. The most frequently used approaches first select a set of genes (i.e. a gene signature) able to characterize differences between two or more phenotypical conditions, and then provide a functional assessment of the selected genes with an a posteriori enrichment analysis, based on biological knowledge. However, this approach comes with some drawbacks. First, the gene selection procedure often requires tunable parameters that affect the outcome, typically producing many false hits. Second, a posteriori enrichment analysis is based on a mapping between biological concepts and gene expression measurements, which is hard to compute because of constant changes in biological knowledge and genome analysis. Third, such a mapping is typically used in the assessment of the coverage of the gene signature by biological concepts, which is either score-based or requires tunable parameters as well, limiting its power. Results: We present Knowledge Driven Variable Selection (KDVS), a framework that uses a priori biological knowledge in HT data analysis. The expression data matrix is transformed, according to prior knowledge, into smaller matrices that are easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike most approaches, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Unlike the standard approach, where gene selection and functional assessment are applied independently, KDVS embeds these two steps into a unified statistical framework, decreasing the variability derived from the threshold-dependent selection, the mapping to the biological concepts, and the signature coverage. We present three case studies to assess the usefulness of the method. 
Conclusions: We showed that KDVS not only enables the accurate selection of known biological functionalities, but also the identification of new ones. An efficient implementation of KDVS was devised to obtain results in a fast and robust way. Computing time is drastically reduced by the effective use of distributed resources. Finally, integrated visualization techniques immediately increase the interpretability of results. Overall, the KDVS approach can be considered a viable alternative to enrichment-based approaches. PMID:23302187
Origin and Functional Prediction of Pollen Allergens in Plants

PubMed Central

Chen, Miaolin; Xu, Jie; Ren, Kang; Searle, Iain

2016-01-01

Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved through gene duplication followed by functional specification. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful to future characterization and epitope screening of pollen allergens. PMID:27436829
Origin and Functional Prediction of Pollen Allergens in Plants.

PubMed

Chen, Miaolin; Xu, Jie; Devis, Deborah; Shi, Jianxin; Ren, Kang; Searle, Iain; Zhang, Dabing

2016-09-01

Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved by gene duplication followed by functional specification. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful to future characterization and epitope screening of pollen allergens. © 2016 American Society of Plant Biologists. All rights reserved.
Comparative genomic analysis of the PKS genes in five species and expression analysis in upland cotton

PubMed Central

Cheng, Xi; Wang, Yanan; Abdullah, Muhammad; Li, Manli; Li, Dahui; Gao, Junshan

2017-01-01

Plant type III polyketide synthase (PKS) can catalyse the formation of a series of secondary metabolites with different structures and different biological functions; the enzyme plays an important role in plant growth, development and resistance to stress. At present, the PKS gene has been identified and studied in a variety of plants. Here, we identified 11 PKS genes from upland cotton (Gossypium hirsutum) and compared them with 41 PKS genes in Populus tremula, Vitis vinifera, Malus domestica and Arabidopsis thaliana. According to the phylogenetic tree, a total of 52 PKS genes can be divided into four subfamilies (I–IV). The analysis of gene structures and conserved motifs revealed that most of the PKS genes were composed of two exons and one intron and that there are two characteristic conserved domains (Chal_sti_synt_N and Chal_sti_synt_C) in the PKS gene family. In our study of the five species, gene duplication was found in addition to Arabidopsis thaliana and we determined that purifying selection has been of great significance in maintaining the function of the PKS gene family. From qRT-PCR analysis, combined with the role of proanthocyanidin (PA) accumulation in brown cotton fibers, we concluded that five PKS genes are candidate genes involved in brown cotton fiber pigment synthesis. These results are important for the further study of brown cotton PKS genes. They not only reveal the relationship between the PKS gene family and pigment in brown cotton, but also create conditions for improving the quality of brown cotton fiber. PMID:29104824

Identification and functional analysis of the BIM interactome; new clues on its possible involvement in Epstein-Barr Virus-associated diseases.

PubMed

Rouka, Erasmia; Kyriakou, Despoina

2015-12-01

Epigenetic deregulation is a common feature in the pathogenesis of Epstein-Barr Virus (EBV)-related lymphomas and carcinomas. Previous studies have demonstrated a strong association between EBV latency in B-cells and epigenetic silencing of the tumor suppressor gene BIM. This study aimed to construct and functionally analyze the BIM interactome in order to identify novel host genes that may be targeted by EBV. Fifty-nine unique interactors were found to compose the BIM gene network. Ontological analysis at the pathway level highlighted infectious diseases along with neuropathologies. These results underline the possible interplay between the BIM interactome and EBV-associated disorders.
From data to function: functional modeling of poultry genomics data.

PubMed

McCarthy, F M; Lyons, E

2013-09-01

One of the challenges of functional genomics is to create a better understanding of the biological system being studied so that the data produced are leveraged to provide gains for agriculture, human health, and the environment. Functional modeling enables researchers to make sense of these data as it reframes a long list of genes or gene products (mRNA, ncRNA, and proteins) by grouping based upon function, be it individual molecular functions or interactions between these molecules or broader biological processes, including metabolic and signaling pathways. However, poultry researchers have been hampered by a lack of functional annotation data, tools, and training to use these data and tools. Moreover, this lack is becoming more critical as new sequencing technologies enable us to generate data not only for an increasingly diverse range of species but also individual genomes and populations of individuals. We discuss the impact of these new sequencing technologies on poultry research, with a specific focus on what functional modeling resources are available for poultry researchers. We also describe key strategies for researchers who wish to functionally model their own data, providing background information about functional modeling approaches, the data and tools to support these approaches, and the strengths and limitations of each. Specifically, we describe methods for functional analysis using Gene Ontology (GO) functional summaries, functional enrichment analysis, and pathways and network modeling. 
As annotation efforts begin to provide the fundamental data that underpin poultry functional modeling (such as improved gene identification, standardized gene nomenclature, temporal and spatial expression data and gene product function), tool developers are incorporating these data into new and existing tools that are used for functional modeling, and cyberinfrastructure is being developed to provide the necessary extendibility and scalability for storing and analyzing these data. This process will support the efforts of poultry researchers to make sense of their functional genomics data sets, and we provide here a starting point for researchers who wish to take advantage of these tools.
Evolution of the bHLH Genes Involved in Stomatal Development: Implications for the Expansion of Developmental Complexity of Stomata in Land Plants

PubMed Central

Ran, Jin-Hua; Shen, Ting-Ting; Liu, Wen-Juan; Wang, Xiao-Quan

2013-01-01

Stomata play significant roles in plant evolution. A trio of closely related basic Helix-Loop-Helix (bHLH) subgroup Ia genes, SPCH, MUTE and FAMA, mediate sequential steps of stomatal development, and their functions may be conserved in land plants. However, the evolutionary history of the putative SPCH/MUTE/FAMA genes is still greatly controversial, especially the phylogenetic positions of the bHLH Ia members from basal land plants. To better understand the evolutionary pattern and functional diversity of the bHLH genes involved in stomatal development, we made a comprehensive evolutionary analysis of the homologous genes from 54 species representing the major lineages of green plants. The phylogenetic analysis indicated: (1) All bHLH Ia genes from the two basal land plants Physcomitrella and Selaginella were closely related to the FAMA genes of seed plants; and (2) the gymnosperm ‘SPCH’ genes were sister to a clade comprising the angiosperm SPCH and MUTE genes, while the FAMA genes of gymnosperms and angiosperms had a sister relationship. The revealed phylogenetic relationships are also supported by the distribution of gene structures and previous functional studies. Therefore, we deduce that the function of FAMA might be ancestral in the bHLH Ia subgroup. In addition, the gymnosperm “SPCH” genes may represent an ancestral state and have a dual function of SPCH and MUTE, two genes that could have originated from a duplication event in the common ancestor of angiosperms. Moreover, in angiosperms, SPCHs have experienced more duplications and harbor more copies than MUTEs and FAMAs, which, together with variation of the stomatal development in the entry division, implies that SPCH might have contributed greatly to the diversity of stomatal development. Based on the above, we proposed a model for the correlation between the evolution of stomatal development and the genes involved in this developmental process in land plants. PMID:24244399
Neurotactin functions in concert with other identified CAMs in growth cone guidance in Drosophila.

PubMed

Speicher, S; García-Alonso, L; Carmena, A; Martín-Bermudo, M D; de la Escalera, S; Jiménez, F

1998-02-01

We have isolated and characterized mutations in Drosophila neurotactin, a gene that encodes a cell adhesion protein widely expressed during neural development. Analysis of both loss and gain of gene function conditions during embryonic and postembryonic development revealed specific requirements for neurotactin during axon outgrowth, fasciculation, and guidance. Furthermore, embryos of some double mutant combinations of neurotactin and other genes encoding adhesion/signaling molecules, including neuroglian, derailed, and kekkon1, displayed phenotypic synergy. This result provides evidence for functional cooperativity in vivo between the adhesion and signaling pathways controlled by neurotactin and the other three genes.
Screening of Critical Genes and MicroRNAs in Blood Samples of Patients with Ruptured Intracranial Aneurysms by Bioinformatic Analysis of Gene Expression Data.

PubMed

Bo, Lijuan; Wei, Bo; Wang, Zhanfeng; Kong, Daliang; Gao, Zheng; Miao, Zhuang

2017-09-20

BACKGROUND This study aimed to identify more potential genes and miRNAs associated with the pathogenesis of intracranial aneurysms (IAs). MATERIAL AND METHODS The GSE36791 dataset (accession number) was downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were screened for in the blood samples from patients with ruptured IAs and controls, followed by functional and pathway enrichment analyses. In addition, a gene co-expression network was constructed and significant modules were extracted from the network with the WGCNA R package. Screening for miRNAs that could regulate DEGs in the modules was performed and an analysis of regulatory relationships was conducted. RESULTS A total of 304 DEGs (167 up-regulated and 137 down-regulated genes) were screened for in blood samples from patients with ruptured IAs compared with those from controls. Functional enrichment analysis showed that the up-regulated genes were mainly associated with immune response and the down-regulated DEGs were mainly concerned with the structure of the ribosome and translation. In addition, six significant functional modules were identified, including four modules enriched in up-regulated genes and two modules enriched in down-regulated genes. Among these, the blue, yellow, and turquoise modules of up-regulated genes were all linked with immune response. Additionally, 16 miRNAs were predicted to regulate DEGs in the three modules associated with immune response, such as hsa-miR-1304, hsa-miR-33b, hsa-miR-125b, and hsa-miR-125a-5p. CONCLUSIONS Several genes and miRNAs (such as miR-1304, miR-33b, IRS2 and KCNJ2) may take part in the pathogenesis of IAs.
Aldehyde Dehydrogenase Gene Superfamily in Populus: Organization and Expression Divergence between Paralogous Gene Pairs.

PubMed

Tian, Feng-Xia; Zang, Jian-Lei; Wang, Tan; Xie, Yu-Li; Zhang, Jin; Hu, Jian-Jun

2015-01-01

Aldehyde dehydrogenases (ALDHs) constitute a superfamily of NAD(P)+-dependent enzymes that catalyze the irreversible oxidation of a wide range of reactive aldehydes to their corresponding nontoxic carboxylic acids. ALDHs have been studied in many organisms from bacteria to mammals; however, no systematic analyses incorporating genome organization, gene structure, expression profiles, and cis-acting elements have been conducted in the model tree species Populus trichocarpa thus far. In this study, a comprehensive analysis of the Populus ALDH gene superfamily was performed. A total of 26 Populus ALDH genes were found to be distributed across 12 chromosomes. Genomic organization analysis indicated that purifying selection may have played a pivotal role in the retention and maintenance of PtALDH gene families. The exon-intron organizations of PtALDHs were highly conserved within the same family, suggesting that the members of the same family also may have conserved functionalities. Microarray data and qRT-PCR analysis indicated that most PtALDHs had distinct tissue-specific expression patterns. The specificity of cis-acting elements in the promoter regions of the PtALDHs and the divergence of expression patterns between nine paralogous PtALDH gene pairs suggested that gene duplications may have freed the duplicate genes from the functional constraints. The expression levels of some ALDHs were up- or down-regulated by various abiotic stresses, implying that the products of these genes may be involved in the adaptation of Populus to abiotic stresses. Overall, the data obtained from our investigation contribute to a better understanding of the complexity of the Populus ALDH gene superfamily and provide insights into the function and evolution of ALDH gene families in vascular plants.
Different functional classes of genes are characterized by different compositional properties.

PubMed

D'Onofrio, Giuseppe; Ghosh, Tapash Chandra; Saccone, Salvatore

2007-12-22

A compositional analysis of a set of human genes classified into several functional classes was performed. We found that the GC3, i.e. the GC level at third codon positions, of the genes involved in cellular metabolism was significantly higher than that of the genes involved in information storage and processing. Analyses of human/Xenopus orthologous genes showed that: (i) the GC3 increment of the genes involved in cellular metabolism was significantly higher than that of the genes involved in information storage and processing; and (ii) a strong correlation between the GC3 and the corresponding GCi, i.e. the GC level of introns, was found in each functional class. The non-randomness of the GC increments favours the selective hypothesis of gene/genome evolution.
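The GC3 statistic defined in this abstract can be illustrated with a minimal Python sketch. The function name `gc3` and the example sequence are hypothetical and not taken from the paper; the sketch assumes an in-frame coding sequence that starts at the first codon position and ignores any trailing partial codon.

```python
def gc3(cds: str) -> float:
    """Return the fraction of G or C bases at third codon positions.

    Assumes `cds` is an in-frame coding sequence (hypothetical input);
    a trailing partial codon, if any, is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # length covered by complete codons
    thirds = seq[2:usable:3]              # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Hypothetical toy sequence: codons ATG GCT AAG TTA, third positions G, T, G, A
print(gc3("ATGGCTAAGTTA"))
```

The same slicing idea, applied to intron sequence without the codon step, would give the GCi value that the abstract correlates with GC3.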
Lymphocyte signaling: beyond knockouts

PubMed Central

Saveliev, Alexander; Tybulewicz, Victor L. J.

2016-01-01

The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Whereas this gene ‘knockout’ approach is often informative, in many cases the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to ‘knockin’ subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully designed based on structural and biophysical data. PMID:19295633
NEAT: an efficient network enrichment analysis test.

PubMed

Signorelli, Mirko; Vinciotti, Veronica; Wit, Ernst C

2016-09-05

Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN (https://cran.r-project.org/package=neat).
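The idea of a hypergeometric null for arrows between two gene sets can be sketched in plain Python. This is an illustrative assumption-laden sketch for a directed network, not the neat package's implementation: the function names, the toy edge list, and the choice of a one-sided upper-tail p-value are all hypothetical. It assumes that, under the null, the number of arrows from set A to set B follows a hypergeometric distribution with population equal to the total number of arrows, successes equal to the arrows entering B, and draws equal to the arrows leaving A.

```python
from math import comb


def count_arrows(edges, set_a, set_b):
    """Count arrows A->B, arrows leaving A, arrows entering B, and all arrows."""
    n_ab = sum(1 for u, v in edges if u in set_a and v in set_b)
    out_a = sum(1 for u, _ in edges if u in set_a)
    in_b = sum(1 for _, v in edges if v in set_b)
    return n_ab, out_a, in_b, len(edges)


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) for a hypergeometric variable, using exact integer binomials."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def upper_tail_pvalue(n_ab, out_a, in_b, total):
    """P(X >= n_ab): evidence that A sends more arrows to B than expected."""
    k_max = min(out_a, in_b)
    return sum(hypergeom_pmf(k, total, in_b, out_a)
               for k in range(n_ab, k_max + 1))


# Hypothetical toy directed network (gene names are placeholders)
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
         ("g4", "g1"), ("g4", "g5"), ("g5", "g2")]
counts = count_arrows(edges, {"g1", "g4"}, {"g2", "g3"})
print(counts, upper_tail_pvalue(*counts))
```

Because the tail probability is an exact finite sum, no resampling is needed, which is consistent with the speed advantage the abstract reports over resampling-based tests.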
Comparative analysis of Homo sapiens and Mus musculus cyclin-dependent kinase (CDK) inhibitor genes p16 (MTS1) and p15 (MTS2).

PubMed

Jiang, P; Stone, S; Wagner, R; Wang, S; Dayananth, P; Kozak, C A; Wold, B; Kamb, A

1995-12-01

Cyclin-dependent kinase inhibitors are a growing family of molecules that regulate important transitions in the cell cycle. At least one of these molecules, p16, has been implicated in human tumorigenesis while its close homolog, p15, is induced by cell contact and transforming growth factor-beta (TGF-beta). To investigate the evolutionary and functional features of p15 and p16, we have isolated mouse (Mus musculus) homologs of each gene. Comparative analysis of these sequences provides evidence that the genes have similar functions in mouse and human. In addition, the comparison suggests that a gene conversion event is part of the evolution of the human p15 and p16 genes.
An in silico assessment of gene function and organization of the phenylpropanoid pathway metabolic networks in Arabidopsis thaliana and limitations thereof

NASA Technical Reports Server (NTRS)

Costa, Michael A.; Collins, R. Eric; Anterola, Aldwin M.; Cochrane, Fiona C.; Davin, Laurence B.; Lewis, Norman G.

2003-01-01

The Arabidopsis genome sequencing in 2000 gave science the first blueprint of a vascular plant. Its successful completion also prompted the US National Science Foundation to launch the Arabidopsis 2010 initiative, the goal of which is to identify the function of each gene by 2010. In this study, an exhaustive analysis of The Institute for Genomic Research (TIGR) and The Arabidopsis Information Resource (TAIR) databases, together with all currently compiled EST sequence data, was carried out in order to determine to what extent the various metabolic networks from phenylalanine ammonia lyase (PAL) to the monolignols were organized and/or could be predicted. In these databases, there are some 65 genes which have been annotated as encoding putative enzymatic steps in monolignol biosynthesis, although many of them have only very low homology to monolignol pathway genes of known function in other plant systems. Our detailed analysis revealed that presently only 13 genes (two PALs, a cinnamate-4-hydroxylase, a p-coumarate-3-hydroxylase, a ferulate-5-hydroxylase, three 4-coumarate-CoA ligases, a cinnamic acid O-methyl transferase, two cinnamoyl-CoA reductases and two cinnamyl alcohol dehydrogenases) can be classified as having a bona fide (definitive) function; the remaining 52 genes currently have undetermined physiological roles. The EST database entries for this particular set of genes also provided little new insight into how the monolignol pathway was organized in the different tissues and organs, this being perhaps a consequence of both limitations in how tissue samples were collected and in the incomplete nature of the EST collections. 
This analysis thus underscores the fact that even with genomic sequencing, presumed to provide the entire suite of putative genes in the monolignol-forming pathway, a very large effort needs to be conducted to establish actual catalytic roles (including enzyme versatility), as well as the physiological function(s) for each member of the (multi)gene families present and the metabolic networks that are operative. Additionally, one key to identifying physiological functions for many of these (and other) unknown genes, and their corresponding metabolic networks, awaits the development of technologies to comprehensively study molecular processes at the single cell level in particular tissues and organs, in order to establish the actual metabolic context.
On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report

PubMed Central

Thomas, Paul D.; Wood, Valerie; Mungall, Christopher J.; Lewis, Suzanna E.; Blake, Judith A.

2012-01-01

A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the “functional similarity” between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the “ortholog conjecture” (or, more properly, the “ortholog functional conservation hypothesis”). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an “open world assumption” (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis. PMID:22359495
Association between MASP-2 gene polymorphism and risk of infection diseases: A meta-analysis.

PubMed

Fu, Jie; Wang, Jingqiu; Luo, Yanping; Zhang, Lifeng; Zhang, Yuan; Dong, Xinfang; Yu, Hongjuan; Cao, Mingqiang; Ma, Xingming

2016-11-01

MASP-2 plays a vital role in complement activation via the lectin pathway. It is generally considered that functional activation of MASP-2 contributes to the development of infectious disease. The aim of this study was to analyze the association between the MASP-2 functional gene polymorphism (rs72550870) and infectious disease risk by meta-analysis. Relevant case-control studies were identified by searching the Cochrane Library, PubMed, Embase, DOAJ, CAB Abstracts, CSA, CINAHL, EBSCO, Scopus, Global Health, Index Copernicus, CA, and China National Knowledge Infrastructure (CNKI) databases up to 10 January 2016. The data were extracted and the methodological quality of the studies was evaluated. STATA 12.0 software was used to perform the statistical analysis. Nine studies were included. There was no significant association between the MASP-2 gene (p.D120G, rs72550870) polymorphism and the risk of infectious disease under the allele model (G vs. A: OR = 0.89, 95% CI = 0.66-1.21; P = 0.445 > 0.05) or the recessive model (AG + GG vs. AA: OR = 0.88, 95% CI = 0.65-1.20; P = 0.428 > 0.05). This first comprehensive meta-analysis indicates that the MASP-2 functional gene polymorphism (rs72550870) is not associated with infectious diseases and does not increase susceptibility to them. Similarly, subgroup analyses based on geographical area and pathogenic microorganism showed no obvious differences. Copyright © 2016 Elsevier Ltd. All rights reserved.
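The odds ratios and 95% confidence intervals quoted in this abstract are pooled estimates from nine studies. As a rough illustration of the underlying quantity only, the Python sketch below computes an odds ratio and a Wald confidence interval for a single 2x2 case-control table. The function name and the counts are hypothetical and are not data from the meta-analysis, and the sketch does not reproduce the pooling performed in STATA.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) is the root of the summed reciprocal counts
    se_log_or = sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts: G and A alleles in cases, then in controls
or_, lo, hi = odds_ratio_ci(20, 80, 25, 75)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that contains 1, as in both models reported above, is what corresponds to the absence of a significant association at the 0.05 level.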
Predicted Arabidopsis Interactome Resource and Gene Set Linkage Analysis: A Transcriptomic Analysis Resource.

PubMed

Yao, Heng; Wang, Xiaoxuan; Chen, Pengcheng; Hai, Ling; Jin, Kang; Yao, Lixia; Mao, Chuanzao; Chen, Xin

2018-05-01

An advanced functional understanding of omics data is important for elucidating the design logic of physiological processes in plants and effectively controlling desired traits in plants. We present the latest versions of the Predicted Arabidopsis Interactome Resource (PAIR) and of the gene set linkage analysis (GSLA) tool, which enable the interpretation of an observed transcriptomic change (differentially expressed genes [DEGs]) in Arabidopsis (Arabidopsis thaliana) with respect to its functional impact for biological processes. PAIR version 5.0 integrates functional association data between genes in multiple forms and infers 335,301 putative functional interactions. GSLA relies on this high-confidence inferred functional association network to expand our perception of the functional impacts of an observed transcriptomic change. GSLA then interprets the biological significance of the observed DEGs using established biological concepts (annotation terms), describing not only the DEGs themselves but also their potential functional impacts. This unique analytical capability can help researchers gain deeper insights into their experimental results and highlight prospective directions for further investigation. We demonstrate the utility of GSLA with two case studies in which GSLA uncovered how molecular events may have caused physiological changes through their collective functional influence on biological processes. Furthermore, we showed that typical annotation-enrichment tools were unable to produce similar insights to PAIR/GSLA. The PAIR version 5.0-inferred interactome and GSLA Web tool both can be accessed at http://public.synergylab.cn/pair/. © 2018 American Society of Plant Biologists. All Rights Reserved.
Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

PubMed

Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

2017-11-25

Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but this is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Moreover, in most cases such co-localization does not originate from local duplication events. 
Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns. This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).
Cross disease analysis of co-functional microRNA pairs on a reconstructed network of disease-gene-microRNA tripartite.

PubMed

Peng, Hui; Lan, Chaowang; Zheng, Yi; Hutvagner, Gyorgy; Tao, Dacheng; Li, Jinyan

2017-03-24

MicroRNAs always function cooperatively in their regulation of gene expression. Dysfunctions of these co-functional microRNAs can play significant roles in disease development. We are interested in those multi-disease associated co-functional microRNAs that regulate their common dysfunctional target genes cooperatively in the development of multiple diseases. The research is potentially useful for human disease studies at the transcriptional level and for the study of multi-purpose microRNA therapeutics. We designed a computational method to detect multi-disease associated co-functional microRNA pairs and conducted cross disease analysis on a reconstructed disease-gene-microRNA (DGR) tripartite network. The DGR tripartite network is constructed by integrating newly predicted disease-microRNA associations with the relationships among diseases, microRNAs and genes maintained in existing databases. The prediction method uses a set of reliable negative samples of disease-microRNA association and a pre-computed kernel matrix instead of kernel functions. From this reconstructed DGR tripartite network, multi-disease associated co-functional microRNA pairs are detected together with their common dysfunctional target genes and ranked by a novel scoring method. We also conducted proof-of-concept case studies on cancer-related co-functional microRNA pairs as well as on non-cancer disease-related microRNA pairs. With the prioritization of the co-functional microRNAs that relate to a series of diseases, we found that the co-function phenomenon is not unusual. We also confirmed that the regulation of the microRNAs for the development of cancers is more complex and has more unique properties than that for non-cancer diseases.
Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data.

PubMed

Oakley, Todd H; Gu, Zhenglong; Abouheif, Ehab; Patel, Nipam H; Li, Wen-Hsiung

2005-01-01

Understanding the evolution of gene function is a primary challenge of modern evolutionary biology. Despite an expanding database from genomic and developmental studies, we are lacking quantitative methods for analyzing the evolution of some important measures of gene function, such as gene-expression patterns. Here, we introduce phylogenetic comparative methods to compare different models of gene-expression evolution in a maximum-likelihood framework. We find that expression of duplicated genes has evolved according to a nonphylogenetic model, where closely related genes are no more likely than more distantly related genes to share common expression patterns. These results are consistent with previous studies that found rapid evolution of gene expression during the history of yeast. The comparative methods presented here are general enough to test a wide range of evolutionary hypotheses using genomic-scale data from any organism.
Analysis of hairpin RNA transgene-induced gene silencing in Fusarium oxysporum

PubMed Central

2013-01-01

Background Hairpin RNA (hpRNA) transgenes can be effective at inducing RNA silencing and have been exploited as a powerful tool for gene function analysis in many organisms. However, in fungi, expression of hairpin RNA transcripts can induce post-transcriptional gene silencing, but in some species can also lead to transcriptional gene silencing, suggesting a more complex interplay of the two pathways at least in some fungi. Because many fungal species are important pathogens, RNA silencing is a powerful technique to understand gene function, particularly when gene knockouts are difficult to obtain. We investigated whether the plant pathogenic fungus Fusarium oxysporum possesses a functional gene silencing machinery and whether hairpin RNA transcripts can be employed to effectively induce gene silencing. Results Here we show that, in the phytopathogenic fungus F. oxysporum, hpRNA transgenes targeting either a β-glucuronidase (Gus) reporter transgene (hpGus) or the endogenous gene Frp1 (hpFrp) did not induce significant silencing of the target genes. Expression analysis suggested that the hpRNA transgenes are prone to transcriptional inactivation, resulting in low levels of hpRNA and siRNA production. However, the hpGus RNA can be efficiently transcribed by promoters acquired either by recombination with a pre-existing, actively transcribed Gus transgene or by fortuitous integration near an endogenous gene promoter allowing siRNA production. These siRNAs effectively induced silencing of a target Gus transgene, which in turn appeared to also induce secondary siRNA production. Furthermore, our results suggested that hpRNA transcripts without poly(A) tails are efficiently processed into siRNAs to induce gene silencing. A convergent promoter transgene, designed to express poly(A)-minus sense and antisense Gus RNAs, without an inverted-repeat DNA structure, induced consistent Gus silencing in F. oxysporum. Conclusions These results indicate that F. oxysporum possesses functional RNA silencing machineries for siRNA production and target mRNA cleavage, but hpRNA transgenes may induce transcriptional self-silencing due to their inverted-repeat structure. Our results suggest that F. oxysporum possesses a similar gene silencing pathway to other fungi like fission yeast, and indicate a need for developing more effective RNA silencing technology for gene function studies in this fungal pathogen. PMID:23819794
MorphDB: Prioritizing Genes for Specialized Metabolism Pathways and Gene Ontology Categories in Plants.

PubMed

Zwaenepoel, Arthur; Diels, Tim; Amar, David; Van Parys, Thomas; Shamir, Ron; Van de Peer, Yves; Tzfadia, Oren

2018-01-01

Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
Genome-wide association and network analysis of lung function in the Framingham Heart Study.

PubMed

Liao, Shu-Yi; Lin, Xihong; Christiani, David C

2014-09-01

Single nucleotide polymorphisms have been found to be associated with pulmonary function using genome-wide association studies. However, lung function is a complex trait that is likely to be influenced by multiple gene-gene interactions besides individual genes. Our goal is to build a cellular network to explore the relationship between pulmonary function and genotypes by combining SNP level and network analyses using longitudinal lung function data from the Framingham Heart Study. We analyzed 2,698 genotyped participants from the Offspring cohort that had an average of 3.35 spirometry measurements per person for a mean length of 13 years. Repeated forced expiratory volume in one second (FEV1) and the ratio of FEV1 to forced vital capacity (FVC) were used as outcomes. Data were analyzed using linear-mixed models for the association between lung function and alleles by accounting for the correlation among repeated measures over time within the same subject and within-family correlation. Network analyses were performed using dmGWAS and validated with data from the Third Generation cohort. Analyses identified SMAD3, TGFBR2, CD44, CTGF, VCAN, CTNNB1, SCGB1A1, PDE4D, NRG1, EPHB1, and LYN as contributors to pulmonary function. Most of these genes were novel and had not been found previously using solely SNP-level analysis. These novel genes are involved in the transforming growth factor beta (TGFB)-SMAD pathway, the Wnt/beta-catenin pathway, etc. Therefore, combining SNP-level and network analyses using longitudinal lung function data is a useful alternative strategy to identify risk genes. © 2014 WILEY PERIODICALS, INC.

Structural, evolutionary and genetic analysis of the histidine biosynthetic "core" in the genus Burkholderia.

PubMed

Papaleo, Maria Cristiana; Russo, Edda; Fondi, Marco; Emiliani, Giovanni; Frandi, Antonio; Brilli, Matteo; Pastorelli, Roberta; Fani, Renato

2009-12-01

In this work a detailed analysis of the structure, the expression and the organization of his genes belonging to the core of histidine biosynthesis (hisBHAF) in 40 newly determined and 13 available sequences of Burkholderia strains was carried out. Data obtained revealed a strong conservation of the structure and organization of these genes through the entire genus. The phylogenetic analysis showed the monophyletic origin of this gene cluster and indicated that it did not undergo horizontal gene transfer events. The analysis of the intergenic regions, based on the substitution rate, entropy plot and bendability, suggested the existence of a putative transcription promoter upstream of hisB, which was supported by the genetic analysis showing that this cluster was able to complement Escherichia coli hisA, hisB, and hisF mutations. Moreover, a preliminary transcriptional analysis and the analysis of microarray data revealed that the expression of the his core was constitutive. These findings are in agreement with the fact that the entire Burkholderia his operon is heterogeneous, in that it contains "alien" genes apparently not involved in histidine biosynthesis. Besides, they also support the idea that the proteobacterial his operon was assembled piecewise, i.e. through accretion of smaller units containing only some of the genes (eventually together with their own promoters) involved in this biosynthetic route. The correlation existing between the structure, organization and regulation of his "core" genes and the function(s) they perform in cellular metabolism is discussed.
Genome-Wide Identification, Characterization and Expression Analysis of the Chalcone Synthase Family in Maize

PubMed Central

Han, Yahui; Ding, Ting; Su, Bo; Jiang, Haiyang

2016-01-01

Members of the chalcone synthase (CHS) family participate in the synthesis of a series of secondary metabolites in plants, fungi and bacteria. The metabolites play important roles in protecting land plants against various environmental stresses during the evolutionary process. Our research was conducted on comprehensive investigation of CHS genes in maize (Zea mays L.), including their phylogenetic relationships, gene structures, chromosomal locations and expression analysis. Fourteen CHS genes (ZmCHS01–14) were identified in the genome of maize, representing one of the largest numbers of CHS family members identified in one organism to date. The gene family was classified into four major classes (classes I–IV) based on their phylogenetic relationships. Most of them contained two exons and one intron. The 14 genes were unevenly located on six chromosomes. Two segmental duplication events were identified, which might contribute to the expansion of the maize CHS gene family to some extent. In addition, quantitative real-time PCR and microarray data analyses suggested that ZmCHS genes exhibited various expression patterns, indicating functional diversification of the ZmCHS genes. Our results will contribute to future studies of the complexity of the CHS gene family in maize and provide valuable information for the systematic analysis of the functions of the CHS gene family. PMID:26828478
Genome-wide analysis of TCP family in tobacco.

PubMed

Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

2016-05-23

The TCP family is a transcription factor family, members of which are extensively involved in plant growth and development as well as in signal transduction in the response against many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed for predicting and analyzing the gene structure, gene expression, phylogenetic relationships, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP also demonstrated that the majority of these genes play important roles in all the tissues, while some special genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.
Isolation and Molecular Characterization of 1-Aminocyclopropane-1-carboxylic Acid Synthase Genes in Hevea brasiliensis

PubMed Central

Zhu, Jia-Hong; Xu, Jing; Chang, Wen-Jun; Zhang, Zhi-Li

2015-01-01

Ethylene is an important factor that stimulates Hevea brasiliensis to produce natural rubber. 1-Aminocyclopropane-1-carboxylic acid synthase (ACS) is a rate-limiting enzyme in ethylene biosynthesis. However, knowledge of the ACS gene family of H. brasiliensis is limited. In this study, nine ACS-like genes were identified in H. brasiliensis. Sequence and phylogenetic analysis results confirmed that seven isozymes (HbACS1–7) of these nine ACS-like genes were similar to ACS isozymes with ACS activity in other plants. Expression analysis results showed that seven ACS genes were differentially expressed in roots, barks, flowers, and leaves of H. brasiliensis. However, no or low ACS gene expression was detected in the latex of H. brasiliensis. Moreover, seven genes were differentially up-regulated by ethylene treatment. These results provided relevant information to help determine the functions of the ACS gene in H. brasiliensis, particularly the functions in regulating ethylene stimulation of latex production. PMID:25690030
A high efficiency gene disruption strategy using a positive-negative split selection marker and electroporation for Fusarium oxysporum.

PubMed

Liang, Liqin; Li, Jianqiang; Cheng, Lin; Ling, Jian; Luo, Zhongqin; Bai, Miao; Xie, Bingyan

2014-11-01

The Fusarium oxysporum species complex consists of fungal pathogens that cause serious vascular wilt disease on more than 100 cultivated species throughout the world. Gene function analysis is rapidly becoming more and more important as the whole-genome sequences of various F. oxysporum strains are being completed. Gene-disruption techniques are a common molecular tool for studying gene function, yet are often a limiting step in gene function identification. In this study we have developed a F. oxysporum high-efficiency gene-disruption strategy based on split-marker homologous recombination cassettes with dual selection and electroporation transformation. The method was efficiently used to delete three RNA-dependent RNA polymerase (RdRP) genes. The gene-disruption cassettes of three genes can be constructed simultaneously within a short time using this technique. The optimal condition for electroporation is 10μF capacitance, 300Ω resistance, 4kV/cm field strength, with 1μg of DNA (gene-disruption cassettes). Under these optimal conditions, we were able to obtain 95 transformants per μg DNA. After positive-negative selection, the transformants were efficiently screened by PCR; screening efficiency averaged 85%: 90% (RdRP1), 85% (RdRP2) and 77% (RdRP3). This gene-disruption strategy should pave the way for high-throughput genetic analysis in F. oxysporum. Copyright © 2014 Elsevier GmbH. All rights reserved.
HoloVir: A Workflow for Investigating the Diversity and Function of Viruses in Invertebrate Holobionts

PubMed Central

Laffy, Patrick W.; Wood-Charlson, Elisha M.; Turaev, Dmitrij; Weynberg, Karen D.; Botté, Emmanuelle S.; van Oppen, Madeleine J. H.; Webster, Nicole S.; Rattei, Thomas

2016-01-01

Abundant bioinformatics resources are available for the study of complex microbial metagenomes, however their utility in viral metagenomics is limited. HoloVir is a robust and flexible data analysis pipeline that provides an optimized and validated workflow for taxonomic and functional characterization of viral metagenomes derived from invertebrate holobionts. Simulated viral metagenomes comprising varying levels of viral diversity and abundance were used to determine the optimal assembly and gene prediction strategy, and multiple sequence assembly methods and gene prediction tools were tested in order to optimize our analysis workflow. HoloVir performs pairwise comparisons of single read and predicted gene datasets against the viral RefSeq database to assign taxonomy and additional comparison to phage-specific and cellular markers is undertaken to support the taxonomic assignments and identify potential cellular contamination. Broad functional classification of the predicted genes is provided by assignment of COG microbial functional category classifications using EggNOG and higher resolution functional analysis is achieved by searching for enrichment of specific Swiss-Prot keywords within the viral metagenome. Application of HoloVir to viral metagenomes from the coral Pocillopora damicornis and the sponge Rhopaloeides odorabile demonstrated that HoloVir provides a valuable tool to characterize holobiont viral communities across species, environments, or experiments. PMID:27375564
Curd development associated gene (CDAG1) in cauliflower (Brassica oleracea L. var. botrytis) could result in enlarged organ size and increased biomass.

PubMed

Li, Hui; Liu, Qian; Zhang, Qingli; Qin, Erjun; Jin, Chuan; Wang, Yu; Wu, Mei; Shen, Guangshuang; Chen, Chengbin; Song, Wenqin; Wang, Chunguo

2017-01-01

The curd is a specialized organ and the most important product organ of cauliflower (Brassica oleracea L. var. botrytis). However, the mechanism underlying the regulation of curd formation and development remains largely unknown. In the present study, a novel homologous gene containing the Organ Size Related (OSR) domain, namely, CDAG1 (Curd Development Associated Gene 1) was identified in cauliflower. Quantitative analysis indicated that CDAG1 showed significantly higher transcript levels in young tissues. Functional analysis demonstrated that the ectopic overexpression of CDAG1 in Arabidopsis and cauliflower could significantly promote organ growth and result in larger organ size and increased biomass. Organ enlargement was predominantly due to increased cell number. In addition, 228 genes involved in the CDAG1-mediated regulatory network were discovered by transcriptome analysis. Among these genes, CDAG1 was confirmed to inhibit the transcriptional expression of the endogenous OSR genes, ARGOS and ARL, while a series of ethylene-responsive transcription factors (ERFs) were found to have increased expression in 35S:CDAG1 transgenic Arabidopsis plants. This implies that CDAG1 may function in the ethylene-mediated signal pathway. These findings provide new insight into the function of OSR genes, and suggest potential applications of CDAG1 in breeding high-yielding crops. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Genome-wide classification, evolutionary analysis and gene expression patterns of the kinome in Gossypium

PubMed Central

Yan, Jun; Li, Guilin; Guo, Xingqi; Li, Yang; Cao, Xuecheng

2018-01-01

The protein kinase (PK, kinome) family is one of the largest families in plants and regulates almost all aspects of plant processes, including plant development and stress responses. Despite their important functions, comprehensive functional classification, evolutionary analysis and expression pattern analysis of the cotton PK gene family have yet to be performed. In this study, we identified the cotton kinomes in the Gossypium raimondii, Gossypium arboreum, Gossypium hirsutum and Gossypium barbadense genomes and classified them into 7 groups and 122–24 subfamilies using software HMMER v3.0 scanning and neighbor-joining (NJ) phylogenetic analysis. Some conserved exon-intron structures were identified not only in cotton species but also in primitive plants, ferns and moss, suggesting the significant function and ancient origination of these PK genes. Collinearity analysis revealed that 16.6 million years ago (Mya) cotton-specific whole genome duplication (WGD) events may have played a partial role in the expansion of the cotton kinomes, whereas tandem duplication (TD) events mainly contributed to the expansion of the cotton RLK group. Synteny analysis revealed that tetraploidization of G. hirsutum and G. barbadense contributed to the expansion of G. hirsutum and G. barbadense PKs. Global expression analysis of cotton PKs revealed stress-specific and fiber development-related expression patterns, suggesting that many cotton PKs might be involved in the regulation of the stress response and fiber development processes. This study provides foundational information for further studies on the evolution and molecular function of cotton PKs. PMID:29768506
An integrative approach to inferring biologically meaningful gene modules.

PubMed

Cho, Ji-Hoon; Wang, Kai; Galas, David J

2011-07-26

The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method, the Semantic Similarity-Integrated approach for Modularization (SSIM), which directly incorporates gene ontology (GO) annotation in the construction of gene modules in order to gain better functional association. SSIM integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
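As a rough illustration of how several gene-gene similarity sources might be merged before clustering, the Python sketch below takes a weighted average of three toy similarity matrices. The abstract does not say how SSIM combines its inputs, so the weighted average, the function name `combine_similarities`, the equal weights, and all matrix values are assumptions made for illustration only.

```python
def combine_similarities(matrices, weights):
    """Weighted average of equally sized square similarity matrices.

    This is an assumed integration rule, not the published SSIM formula.
    """
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) / total for j in range(n)]
        for i in range(n)
    ]


# Hypothetical similarities among three genes from three evidence types.
expression = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
ppi = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
go_semantic = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.3], [0.3, 0.3, 1.0]]

combined = combine_similarities([expression, ppi, go_semantic], [1.0, 1.0, 1.0])
```

The combined matrix could then be handed to an affinity propagation implementation that accepts precomputed similarities; that clustering step is left out here to keep the sketch dependency-free.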
Inducible repression of multiple expansin genes leads to growth suppression during leaf development.

PubMed

Goh, Hoe-Han; Sloan, Jennifer; Dorca-Fornell, Carmen; Fleming, Andrew

2012-08-01

Expansins are cell wall proteins implicated in the control of plant growth via loosening of the extracellular matrix. They are encoded by a large gene family, and data linked to loss of single gene function to support a role of expansins in leaf growth remain limited. Here, we provide a quantitative growth analysis of transgenics containing an inducible artificial microRNA construct designed to down-regulate the expression of a number of expansin genes that an expression analysis indicated are expressed during the development of Arabidopsis (Arabidopsis thaliana) leaf 6. The results support the hypothesis that expansins are required for leaf growth and show that decreased expansin gene expression leads to a more marked repression of growth during the later stage of leaf development. In addition, a histological analysis of leaves in which expansin gene expression was suppressed indicates that, despite smaller leaves, mean cell size was increased. These data provide functional evidence for a role of expansins in leaf growth, indicate the importance of tissue/organ developmental context for the outcome of altered expansin gene expression, and highlight the separation of the outcome of expansin gene expression at the cellular and organ levels.
Analysis of Msx1 and Msx2 transactivation function in the context of the heat shock 70 (Hspa1b) gene promoter.

PubMed

Zhuang, Fengfeng; Nguyen, Manuel P; Shuler, Charles; Liu, Yi-Hsin

2009-04-03

Previous studies have shown that Msx proteins control gene transcription predominantly through repression mechanisms. However, gene expression studies using either the gain-of-function or the loss-of-function mutants revealed many gene targets whose expression require functional Msx proteins. To date, investigations into the mechanisms of Msx-dependent transactivation have been hindered by the lack of a responsive promoter. Here, we demonstrated the usefulness of the mouse Hspa1b promoter in probing Msx-dependent mechanisms of gene activation. We showed that Msx protein activates Hspa1b promoter via its C-terminal domain. The activation absolutely depends on the HSEs and physical interactions between Msx proteins and heat shock factors may play a contributing role.
The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

PubMed

Gaji, Rajshekhar Y; Howe, Daniel K

2009-07-01

The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
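The kind of motif search described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names and the `upstream` sequence are hypothetical (it is not the real SnSAG1 promoter), and the scan simply reports exact matches on both strands, treating the IUPAC code W as A or T.

```python
import re

# IUPAC code W stands for A or T; only the codes needed here are listed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) hits; positions are 0-based on the forward strand."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    size = len(motif)
    hits += [(len(seq) - m.start() - size, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical 5'-flanking sequence used only to exercise the scan.
upstream = "TTGAGACGCAAGCGTCTCATTTAGAGACGCC"
hits = find_motif(upstream, "GAGACGC")
```

Recording the strand alongside the position matters here because the abstract reports that both the presence and the orientation of the 5'-most motif affect promoter function.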
Genome-wide analysis of basic helix-loop-helix (bHLH) transcription factors in Brachypodium distachyon.

PubMed

Niu, Xin; Guan, Yuxiang; Chen, Shoukun; Li, Haifeng

2017-08-15

As a superfamily of transcription factors (TFs), the basic helix-loop-helix (bHLH) proteins have been characterized functionally in many plants with a vital role in the regulation of diverse biological processes including growth, development, response to various stresses, and so on. However, no systemic analysis of the bHLH TFs has been reported in Brachypodium distachyon, an emerging model plant in Poaceae. A total of 146 bHLH TFs were identified in the Brachypodium distachyon genome and classified into 24 subfamilies. BdbHLHs in the same subfamily share similar protein motifs and gene structures. Gene duplication events showed a close relationship to rice, maize and sorghum, and segment duplications might play a key role in the expansion of this gene family. The amino acid sequences of the bHLH domains were highly conserved, especially Leu-27 and Leu-54. Based on the predicted binding activities, the BdbHLHs were divided into DNA binding and non-DNA binding types. According to the gene ontology (GO) analysis, BdbHLHs were speculated to function in homodimer or heterodimer manner. By integrating the available high throughput data in public database and results of quantitative RT-PCR, we found the expression profiles of BdbHLHs were different, implying their differentiated functions. One hundred forty-six BdbHLHs were identified and their conserved domains, sequence features, phylogenetic relationship, chromosomal distribution, GO annotations, gene structures, gene duplication and expression profiles were investigated. Our findings lay a foundation for further evolutionary and functional elucidation of BdbHLH genes.
Identification of giant Mimivirus protein functions using RNA interference

PubMed Central

Sobhy, Haitham; Scola, Bernard La; Pagnier, Isabelle; Raoult, Didier; Colson, Philippe

2015-01-01

Genomic analysis of giant viruses, such as Mimivirus, has revealed that more than half of the putative genes have no known functions (ORFans). We knocked down Mimivirus genes using short interfering RNA as a proof of concept to determine the functions of giant virus ORFans. As fibers are easy to observe, we targeted a gene encoding a protein absent in a Mimivirus mutant devoid of fibers as well as three genes encoding products identified in a protein concentrate of fibers, including one ORFan and one gene of unknown function. We found that knocking down these four genes was associated with depletion or modification of the fibers. Our strategy of silencing ORFan genes in giant viruses opens a way to identify its complete gene repertoire and may clarify the role of these genes, differentiating between junk DNA and truly used genes. Using this strategy, we were able to annotate four proteins in Mimivirus and 30 homologous proteins in other giant viruses. In addition, we were able to annotate >500 proteins from cellular organisms and 100 from metagenomic databases. PMID:25972846
Genome-wide analysis of trans-splicing in the nematode Pristionchus pacificus unravels conserved gene functions for germline and dauer development in divergent operons.

PubMed

Sinha, Amit; Langnick, Claudia; Sommer, Ralf J; Dieterich, Christoph

2014-09-01

Discovery of trans-splicing in multiple metazoan lineages led to the identification of operon-like gene organization in diverse organisms, including trypanosomes, tunicates, and nematodes, but the functional significance of such operons is not completely understood. To see whether the content or organization of operons serves similar roles across species, we experimentally defined operons in the nematode model Pristionchus pacificus. We performed affinity capture experiments on mRNA pools to specifically enrich for transcripts that are trans-spliced to either the SL1- or SL2-spliced leader, using spliced leader-specific probes. We obtained distinct trans-splicing patterns from the analysis of three mRNA pools (total mRNA, SL1 and SL2 fraction) by RNA-seq. This information was combined with a genome-wide analysis of gene orientation and spacing. We could confirm 2219 operons by RNA-seq data out of 6709 candidate operons, which were predicted by sequence information alone. Our gene order comparison of the Caenorhabditis elegans and P. pacificus genomes shows major changes in operon organization in the two species. Notably, only 128 out of 1288 operons in C. elegans are conserved in P. pacificus. However, analysis of gene-expression profiles identified conserved functions such as an enrichment of germline-expressed genes and higher expression levels of operonic genes during recovery from dauer arrest in both species. These results provide support for the model that a necessity for increased transcriptional efficiency in the context of certain developmental processes could be a selective constraint for operon evolution in metazoans. Our method is generally applicable to other metazoans to see if similar functional constraints regulate gene organization into operons. © 2014 Sinha et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
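The idea of predicting candidate operons from gene orientation and spacing alone can be sketched in Python as below. This is a hedged illustration rather than the authors' method: the function name `candidate_operons`, the 1000 bp `max_gap` cutoff, and the toy gene coordinates are all hypothetical, and the abstract does not state which spacing threshold was used.

```python
def candidate_operons(genes, max_gap=1000):
    """Group neighbouring same-strand genes separated by at most `max_gap` bases.

    `genes` is a list of (name, start, end, strand) tuples sorted by start
    position on one chromosome. Groups with a single gene are not reported.
    """
    operons, current = [], []
    for gene in genes:
        if current:
            previous = current[-1]
            same_strand = previous[3] == gene[3]
            gap = gene[1] - previous[2]
            if same_strand and gap <= max_gap:
                current.append(gene)
                continue
            if len(current) > 1:
                operons.append([g[0] for g in current])
        current = [gene]
    if len(current) > 1:
        operons.append([g[0] for g in current])
    return operons


# Hypothetical gene models on a single chromosome.
genes = [
    ("g1", 100, 900, "+"),
    ("g2", 1000, 1800, "+"),
    ("g3", 5000, 6000, "+"),
    ("g4", 6100, 7000, "-"),
    ("g5", 7100, 8000, "-"),
]
operons = candidate_operons(genes)
```

Candidates produced this way would still need experimental support, which in the study came from the SL1 and SL2 trans-splicing patterns observed by RNA-seq.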
Deletion of the transcriptional coactivator PGC1α in skeletal muscles is associated with reduced expression of genes related to oxidative muscle function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hatazawa, Yukino; Research Fellow of Japan Society for the Promotion of Science, Tokyo; Minami, Kimiko

The expression of the transcriptional coactivator PGC1α is increased in skeletal muscles during exercise. Previously, we showed that increased PGC1α leads to prolonged exercise performance (the duration for which running can be continued) and, at the same time, increases the expression of branched-chain amino acid (BCAA) metabolism-related enzymes and genes that are involved in supplying substrates for the TCA cycle. We recently created mice with PGC1α knockout specifically in the skeletal muscles (PGC1α KO mice), which show decreased mitochondrial content. In this study, global gene expression (microarray) analysis was performed in the skeletal muscles of PGC1α KO mice compared with that of wild-type control mice. As a result, decreased expression of genes involved in the TCA cycle, oxidative phosphorylation, and BCAA metabolism were observed. Compared with previously obtained microarray data on PGC1α-overexpressing transgenic mice, each gene showed the completely opposite direction of expression change. Bioinformatic analysis of the promoter region of genes with decreased expression in PGC1α KO mice predicted the involvement of several transcription factors, including a nuclear receptor, ERR, in their regulation. As PGC1α KO microarray data in this study show opposing findings to the PGC1α transgenic data, a loss-of-function experiment, as well as a gain-of-function experiment, revealed PGC1α’s function in the oxidative energy metabolism of skeletal muscles. - Highlights: • Microarray analysis was performed in the skeletal muscle of PGC1α KO mice. • Expression of genes in the oxidative energy metabolism was decreased. • Bioinformatic analysis of promoter region of the genes predicted involvement of ERR. • PGC1α KO microarray data in this study show the mirror image of transgenic data.
Genetic mechanisms involved in the evolution of the cephalopod camera eye revealed by transcriptomic and developmental studies

PubMed Central

2011-01-01

Background Coleoid cephalopods (squids and octopuses) have evolved a camera eye, the structure of which is very similar to that found in vertebrates and which is considered a classic example of convergent evolution. Other molluscs, however, possess mirror, pin-hole, or compound eyes, all of which differ from the camera eye in the degree of complexity of the eye structures and neurons participating in the visual circuit. Therefore, genes expressed in the cephalopod eye after divergence from the common molluscan ancestor could be involved in eye evolution through association with the acquisition of new structural components. To clarify the genetic mechanisms that contributed to the evolution of the cephalopod camera eye, we applied comprehensive transcriptomic analysis and conducted developmental validation of candidate genes involved in coleoid cephalopod eye evolution. Results We compared gene expression in the eyes of 6 molluscan (3 cephalopod and 3 non-cephalopod) species and selected 5,707 genes as cephalopod camera eye-specific candidate genes on the basis of homology searches against 3 molluscan species without camera eyes. First, we confirmed the expression of these 5,707 genes in the cephalopod camera eye formation processes by developmental array analysis. Second, using molecular evolutionary (dN/dS) analysis to detect positive selection in the cephalopod lineage, we identified 156 of these genes in which functions appeared to have changed after the divergence of cephalopods from the molluscan ancestor and which contributed to structural and functional diversification. Third, we selected 1,571 genes, expressed in the camera eyes of both cephalopods and vertebrates, which could have independently acquired a function related to eye development at the expression level. 
Finally, as experimental validation, we identified three functionally novel cephalopod camera eye genes related to optic lobe formation in cephalopods by in situ hybridization analysis of embryonic pygmy squid. Conclusion We identified 156 genes positively selected in the cephalopod lineage and 1,571 genes commonly found in the cephalopod and vertebrate camera eyes from the analysis of cephalopod camera eye specificity at the expression level. Experimental validation showed that the cephalopod camera eye-specific candidate genes include those expressed in the outer part of the optic lobes, which are unique to coleoid cephalopods. The results of this study suggest that changes in gene expression and in the primary structure of proteins (through positive selection) from those in the common molluscan ancestor could have contributed, at least in part, to cephalopod camera eye acquisition. PMID:21702923
Transcriptional Network Analysis Identifies BACH1 as a Master Regulator of Breast Cancer Bone Metastasis

PubMed Central

Liang, Yajun; Wu, Heng; Lei, Rong; Chong, Robert A.; Wei, Yong; Lu, Xin; Tagkopoulos, Ilias; Kung, Sun-Yuan; Yang, Qifeng; Hu, Guohong; Kang, Yibin

2012-01-01

The application of functional genomic analysis of breast cancer metastasis has led to the identification of a growing number of organ-specific metastasis genes, which often function in concert to facilitate different steps of the metastatic cascade. However, the gene regulatory network that controls the expression of these metastasis genes remains largely unknown. Here, we demonstrate a computational approach for the deconvolution of transcriptional networks to discover master regulators of breast cancer bone metastasis. Several known regulators of breast cancer bone metastasis such as Smad4 and HIF1 were identified in our analysis. Experimental validation of the networks revealed BACH1, a basic leucine zipper transcription factor, as the common regulator of several functional metastasis genes, including MMP1 and CXCR4. Ectopic expression of BACH1 enhanced the malignancy of breast cancer cells, and conversely, BACH1 knockdown significantly reduced bone metastasis. The expression of BACH1 and its target genes was linked to a higher risk of breast cancer recurrence in patients. This study established BACH1 as the master regulator of breast cancer bone metastasis and provided a paradigm to identify molecular determinants in complex pathological processes. PMID:22875853
Manganese-Induced Neurotoxicity and Alterations in Gene Expression in Human Neuroblastoma SH-SY5Y Cells.

PubMed

Gandhi, Deepa; Sivanesan, Saravanadevi; Kannan, Krishnamurthi

2018-06-01

Manganese (Mn) is an essential trace element required for many physiological functions including proper biochemical and cellular functioning of the central nervous system (CNS). However, exposure to excess levels of Mn in occupational settings or from environmental sources has been associated with neurotoxicity. The cellular and molecular mechanism of Mn-induced neurotoxicity remains unclear. In the current study, we investigated the effects of 30-day exposure to a sub-lethal concentration of Mn (100 μM) in human neuroblastoma cells (SH-SY5Y) using a transcriptomic approach. Microarray analysis revealed differential expression of 1057 transcripts in Mn-exposed SH-SY5Y cells as compared to control cells. Gene functional annotation cluster analysis showed that the differentially expressed genes were associated with several biological pathways. Specifically, genes involved in neuronal pathways including neuron differentiation and development, regulation of neurogenesis, synaptic transmission, and neuronal cell death (apoptosis) were found to be significantly altered. KEGG pathway analysis showed upregulation of p53 signaling pathways and neuroactive ligand-receptor interaction pathways, and downregulation of the neurotrophin signaling pathway. On the basis of the gene expression profile, possible molecular mechanisms underlying Mn-induced neuronal toxicity were predicted.
Genome-Wide Investigation and Expression Profiling of AP2/ERF Transcription Factor Superfamily in Foxtail Millet (Setaria italica L.)

PubMed Central

Lata, Charu; Mishra, Awdhesh Kumar; Muthamilarasan, Mehanathan; Bonthala, Venkata Suresh; Khan, Yusuf; Prasad, Manoj

2014-01-01

The APETALA2/ethylene-responsive element binding factor (AP2/ERF) family is one of the largest transcription factor (TF) families in plants and includes four major sub-families, namely AP2, DREB (dehydration responsive element binding), ERF (ethylene responsive factors) and RAV (Related to ABI3/VP). AP2/ERFs are known to play significant roles in various plant processes including growth and development and biotic and abiotic stress responses. Considering this, a comprehensive genome-wide study was conducted in foxtail millet (Setaria italica L.). A total of 171 AP2/ERF genes were identified by systematic sequence analysis and were physically mapped onto nine chromosomes. Phylogenetic analysis grouped AP2/ERF genes into six classes (I to VI). Duplication analysis revealed that 12 (∼7%) SiAP2/ERF genes were tandemly repeated and 22 (∼13%) were segmentally duplicated. Comparative physical mapping between foxtail millet AP2/ERF genes and their orthologs in sorghum (18 genes), maize (14 genes), rice (9 genes) and Brachypodium (6 genes) provided evolutionary insights into the AP2/ERF gene family and also showed a decrease in orthology with increasing phylogenetic distance. The evolutionary significance in terms of gene duplication and divergence was analyzed by estimating synonymous and non-synonymous substitution rates. Expression profiling of candidate AP2/ERF genes against drought, salt and phytohormones revealed insights into their precise and/or overlapping expression patterns which could be responsible for their functional divergence in foxtail millet. The study showed that the genes SiAP2/ERF-069, SiAP2/ERF-103 and SiAP2/ERF-120 may be considered as potential candidate genes for further functional validation as well as for utilization in crop improvement programs for stress resistance, since these genes were up-regulated under drought and salinity stresses in an ABA-dependent manner. 
Altogether the present study provides new insights into evolution, divergence and systematic functional analysis of AP2/ERF gene family at genome level in foxtail millet which may be utilized for improving stress adaptation and tolerance in millets, cereals and bioenergy grasses. PMID:25409524

Genome-wide investigation and expression profiling of AP2/ERF transcription factor superfamily in foxtail millet (Setaria italica L.).

PubMed

Lata, Charu; Mishra, Awdhesh Kumar; Muthamilarasan, Mehanathan; Bonthala, Venkata Suresh; Khan, Yusuf; Prasad, Manoj

2014-01-01

The APETALA2/ethylene-responsive element binding factor (AP2/ERF) family is one of the largest transcription factor (TF) families in plants and includes four major sub-families, namely AP2, DREB (dehydration responsive element binding), ERF (ethylene responsive factors) and RAV (Related to ABI3/VP). AP2/ERFs are known to play significant roles in various plant processes including growth and development and biotic and abiotic stress responses. Considering this, a comprehensive genome-wide study was conducted in foxtail millet (Setaria italica L.). A total of 171 AP2/ERF genes were identified by systematic sequence analysis and were physically mapped onto nine chromosomes. Phylogenetic analysis grouped AP2/ERF genes into six classes (I to VI). Duplication analysis revealed that 12 (∼7%) SiAP2/ERF genes were tandemly repeated and 22 (∼13%) were segmentally duplicated. Comparative physical mapping between foxtail millet AP2/ERF genes and their orthologs in sorghum (18 genes), maize (14 genes), rice (9 genes) and Brachypodium (6 genes) provided evolutionary insights into the AP2/ERF gene family and also showed a decrease in orthology with increasing phylogenetic distance. The evolutionary significance in terms of gene duplication and divergence was analyzed by estimating synonymous and non-synonymous substitution rates. Expression profiling of candidate AP2/ERF genes against drought, salt and phytohormones revealed insights into their precise and/or overlapping expression patterns which could be responsible for their functional divergence in foxtail millet. The study showed that the genes SiAP2/ERF-069, SiAP2/ERF-103 and SiAP2/ERF-120 may be considered as potential candidate genes for further functional validation as well as for utilization in crop improvement programs for stress resistance, since these genes were up-regulated under drought and salinity stresses in an ABA-dependent manner. 
Altogether the present study provides new insights into evolution, divergence and systematic functional analysis of AP2/ERF gene family at genome level in foxtail millet which may be utilized for improving stress adaptation and tolerance in millets, cereals and bioenergy grasses.
Genome-wide identification and transcriptional profiling analysis of auxin response-related gene families in cucumber

PubMed Central

2014-01-01

Background Auxin signaling has a vital function in the regulation of plant growth and development, both of which are known to be mediated by auxin-responsive genes. So far, significant progress has been made toward the identification and characterization of auxin-response genes in several model plants, while no systematic analysis of these families has been reported in cucumber (Cucumis sativus L.), a reference species for Cucurbitaceae crops. Comprehensive analyses will help design experiments for functional validation of their precise roles in plant development and stress responses. Results A genome-wide search for auxin-response gene homologues identified 16 auxin-response factors (ARFs), 27 auxin/indole acetic acids (Aux/IAAs), 10 Gretchen Hagen 3 (GH3s), 61 small auxin-up mRNAs (SAURs), and 39 lateral organ boundaries (LBDs) in cucumber. Sequence analysis together with the organization of putative motifs indicated the potential diverse functions of these five auxin-related family members. The distribution and density of auxin response-related genes on chromosomes were not uniform. Evolutionary analysis showed that chromosomal segment duplications mainly contributed to the expansion of the CsARF, CsIAA, CsGH3, and CsLBD gene families. Quantitative real-time RT-PCR analysis demonstrated that many ARF, Aux/IAA, GH3, SAUR, and LBD genes were expressed in diverse patterns within different organs/tissues and during different developmental stages. They were also implicated in IAA, methyl jasmonic acid, or salicylic acid response, which is consistent with the finding that a great number of diverse cis-elements involved in a variety of signal transduction pathways are present in their promoter regions. Conclusion Genome-wide comparative analysis of auxin response-related family genes and their expression analysis provide new evidence for the potential role of auxin in development and hormone response of plants. 
Our data imply that the auxin response genes may be involved in various vegetative and reproductive developmental processes. Furthermore, they will be involved in different signal pathways and may mediate the crosstalk between various hormone responses. PMID:24708619
Filling gaps in PPAR-alpha signaling through comparative nutrigenomics analysis

PubMed Central

2009-01-01

Background The application of high-throughput genomic tools in nutrition research is a widespread practice. However, it is becoming increasingly clear that the outcome of individual expression studies is insufficient for the comprehensive understanding of such a complex field. Currently, the availability of large amounts of expression data in public repositories has opened up new challenges in microarray data analysis. We have focused on PPARα, a ligand-activated transcription factor that functions as a fatty acid sensor controlling the expression of a large set of genes in various metabolic organs such as liver, small intestine or heart. The function of PPARα is strictly connected to the function of its target genes and, although many of these have already been identified, major elements of its physiological function remain to be uncovered. To further investigate the function of PPARα, we have applied a cross-species meta-analysis approach to integrate sixteen microarray datasets studying high fat diet and PPARα signal perturbations in different organisms. Results We identified 164 genes (MDEGs) that were differentially expressed in a consistent way in response to a high fat diet or to perturbations in PPARs signalling. In particular, we found five genes in yeast which were highly conserved and homologous to PPARα targets in mammals, potential candidates to be used as models for the equivalent mammalian genes. Moreover, screening the MDEGs for all known transcription factor binding sites and comparing the results with a human genome-wide screening of Peroxisome Proliferating Response Elements (PPRE) enabled us to identify 20 new potential candidate genes that show both a binding site and a change in expression in the conditions studied. Lastly, we found a non-random localization of the differentially expressed genes in the genome. 
Conclusion The results presented are potentially of great interest for summarizing the currently available expression data, exploiting the power of in silico analysis filtered by evolutionary conservation. The analysis enabled us to indicate potential gene candidates that could fill in the gaps with regard to PPARα signalling; moreover, the non-random localization of the differentially expressed genes in the genome suggests that epigenetic mechanisms are important in the regulation of transcription by PPARα. PMID:20003344
Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer.

PubMed

Yin, Rui; Zhao, Mingzhu; Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi; Zhang, Meiping

2017-01-01

Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukin-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes having originated before ginseng originated. In spite of their belonging to one family, the PgNBS genes have functionally differentiated dramatically and have been categorized into numerous functional categories. The expression of the PgNBS genes varies across tissues, differently aged roots and the roots of different genotypes. However, they are coordinated in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution, functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng in particular, and an NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species.
Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer

PubMed Central

Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi

2017-01-01

Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukin-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes having originated before ginseng originated. In spite of their belonging to one family, the PgNBS genes have functionally differentiated dramatically and have been categorized into numerous functional categories. The expression of the PgNBS genes varies across tissues, differently aged roots and the roots of different genotypes. However, they are coordinated in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution, functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng in particular, and an NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species. PMID:28727829
High-Throughput Analysis of Promoter Occupancy Reveals New Targets for Arx, a Gene Mutated in Mental Retardation and Interneuronopathies

PubMed Central

Quillé, Marie-Lise; Hirchaud, Edouard; Baron, Daniel; Benech, Caroline; Guihot, Jeanne; Placet, Morgane; Mignen, Olivier; Férec, Claude; Houlgatte, Rémi; Friocourt, Gaëlle

2011-01-01

Genetic investigations of X-linked intellectual disabilities have implicated the ARX (Aristaless-related homeobox) gene in a wide spectrum of disorders extending from phenotypes characterised by severe neuronal migration defects such as lissencephaly, to mild or moderate forms of mental retardation without apparent brain abnormalities but with associated features of dystonia and epilepsy. Analysis of Arx spatio-temporal localisation profile in mouse revealed expression in telencephalic structures, mainly restricted to populations of GABAergic neurons at all stages of development. Furthermore, studies of the effects of ARX loss of function in humans and animal models revealed varying defects, suggesting multiple roles of this gene during brain development. However, to date, little is known about how ARX functions as a transcription factor and the nature of its targets. To better understand its role, we combined chromatin immunoprecipitation and mRNA expression with microarray analysis and identified a total of 1006 gene promoters bound by Arx in transfected neuroblastoma (N2a) cells and in mouse embryonic brain. Approximately 24% of Arx-bound genes were found to show expression changes following Arx overexpression or knock-down. Several of the Arx target genes we identified are known to be important for a variety of functions in brain development and some of them suggest new functions for Arx. Overall, these results identified multiple new candidate targets for Arx and should help to better understand the pathophysiological mechanisms of intellectual disability and epilepsy associated with ARX mutations. PMID:21966449
Advances and perspectives on the use of CRISPR/Cas9 systems in plant genomics research

DOE PAGES

Liu, Degao; Hu, Rongbin; Palla, Kaitlin J.; ...

2016-02-18

Genome editing with site-specific nucleases has become a powerful tool for functional characterization of plant genes and genetic improvement of agricultural crops. Among the various site-specific nuclease-based technologies available for genome editing, the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems have shown the greatest potential for rapid and efficient editing of genomes in plant species. Here, this article reviews the current status of application of CRISPR/Cas9 to plant genomics research, with a focus on loss-of-function and gain-of-function analysis of individual genes in the context of perennial plants and the potential application of CRISPR/Cas9 to perturbation of gene expression, as well as identification and analysis of gene modules as part of an accelerated domestication and synthetic biology effort.
Advances and perspectives on the use of CRISPR/Cas9 systems in plant genomics research

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Degao; Hu, Rongbin; Palla, Kaitlin J.

Genome editing with site-specific nucleases has become a powerful tool for functional characterization of plant genes and genetic improvement of agricultural crops. Among the various site-specific nuclease-based technologies available for genome editing, the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems have shown the greatest potential for rapid and efficient editing of genomes in plant species. Here, this article reviews the current status of application of CRISPR/Cas9 to plant genomics research, with a focus on loss-of-function and gain-of-function analysis of individual genes in the context of perennial plants and the potential application of CRISPR/Cas9 to perturbation of gene expression, as well as identification and analysis of gene modules as part of an accelerated domestication and synthetic biology effort.
SGFSC: speeding the gene functional similarity calculation based on hash tables.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-11-04

In recent years, many measures of gene functional similarity have been proposed and widely used in many kinds of research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they depend on combining the semantic similarities of individual term pairs. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. To speed up current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. With the help of the hash table, there is no need to traverse the GO graph repeatedly for each method. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. The proposed strategy is successful in speeding up current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC. The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ.
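The two-step strategy described in the SGFSC record can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy term graph and its term names are hypothetical, the Jaccard overlap of ancestor sets is a simple stand-in rather than one of the seven measures bundled with SGFSC, and the best-match average is only one common way a pairwise approach combines term similarities. The point is the structure: the graph is traversed once to fill a hash table, and every later similarity query is answered from that table.

```python
# Hypothetical toy GO-like graph: each term maps to its direct parents.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
}


def build_ancestor_table(parents):
    """Step 1: traverse the graph once and store, in a hash table (dict),
    the set of ancestors of every term (the term itself included)."""
    table = {}

    def visit(term):
        if term not in table:
            ancestors = {term}
            for parent in parents[term]:
                ancestors |= visit(parent)
            table[term] = frozenset(ancestors)
        return table[term]

    for term in parents:
        visit(term)
    return table


def term_similarity(table, t1, t2):
    """Step 2a: term similarity from table lookups only (no graph traversal).
    Jaccard overlap of ancestor sets, used here as an illustrative stand-in."""
    a, b = table[t1], table[t2]
    return len(a & b) / len(a | b)


def gene_similarity(table, terms1, terms2):
    """Step 2b: pairwise gene similarity as a best-match average over the
    terms annotating each gene (both term lists must be non-empty)."""
    best1 = [max(term_similarity(table, s, t) for t in terms2) for s in terms1]
    best2 = [max(term_similarity(table, s, t) for s in terms1) for t in terms2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))


ANCESTORS = build_ancestor_table(PARENTS)
```

In this sketch the cost of walking the graph is paid once in `build_ancestor_table`; scoring a large number of gene pairs afterwards only involves dictionary lookups and set operations.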
Integrative Transcriptomic Analysis Uncovers Novel Gene Modules That Underlie the Sulfate Response in Arabidopsis thaliana

PubMed Central

Henríquez-Valencia, Carlos; Arenas-M, Anita; Medina, Joaquín; Canales, Javier

2018-01-01

Sulfur is an essential nutrient for plant growth and development. Sulfur is a constituent of proteins, the plasma membrane and cell walls, among other important cellular components. To obtain new insights into the gene regulatory networks underlying the sulfate response, we performed an integrative meta-analysis of transcriptomic data from five different sulfate experiments available in public databases. This bioinformatic approach allowed us to identify a robust set of genes whose expression depends only on sulfate availability, indicating that those genes play an important role in the sulfate response. In relation to sulfate metabolism, the biological function of approximately 45% of these genes is currently unknown. Moreover, we found several consistent Gene Ontology terms related to biological processes that have not been extensively studied in the context of the sulfate response; these processes include cell wall organization, carbohydrate metabolism, nitrogen compound transport, and the regulation of proteolysis. Gene co-expression network analyses revealed relationships between the sulfate-responsive genes that were distributed among seven function-specific co-expression modules. The most connected genes in the sulfate co-expression network belong to a module related to the carbon response, suggesting that this biological function plays an important role in the control of the sulfate response. Temporal analyses of the network suggest that sulfate starvation generates a biphasic response, in which major changes in gene expression occur during both the early and late responses. Network analyses predicted that the sulfate response is regulated by a limited number of transcription factors, including MYBs, bZIPs, and NF-YAs. In conclusion, our analysis identified new candidate genes and provided new hypotheses to advance our understanding of the transcriptional regulation of sulfate metabolism in plants. PMID:29692794
Integrative Transcriptomic Analysis Uncovers Novel Gene Modules That Underlie the Sulfate Response in Arabidopsis thaliana.

PubMed

Henríquez-Valencia, Carlos; Arenas-M, Anita; Medina, Joaquín; Canales, Javier

2018-01-01

Sulfur is an essential nutrient for plant growth and development. Sulfur is a constituent of proteins, the plasma membrane and cell walls, among other important cellular components. To obtain new insights into the gene regulatory networks underlying the sulfate response, we performed an integrative meta-analysis of transcriptomic data from five different sulfate experiments available in public databases. This bioinformatic approach allowed us to identify a robust set of genes whose expression depends only on sulfate availability, indicating that those genes play an important role in the sulfate response. In relation to sulfate metabolism, the biological function of approximately 45% of these genes is currently unknown. Moreover, we found several consistent Gene Ontology terms related to biological processes that have not been extensively studied in the context of the sulfate response; these processes include cell wall organization, carbohydrate metabolism, nitrogen compound transport, and the regulation of proteolysis. Gene co-expression network analyses revealed relationships between the sulfate-responsive genes that were distributed among seven function-specific co-expression modules. The most connected genes in the sulfate co-expression network belong to a module related to the carbon response, suggesting that this biological function plays an important role in the control of the sulfate response. Temporal analyses of the network suggest that sulfate starvation generates a biphasic response, in which major changes in gene expression occur during both the early and late responses. Network analyses predicted that the sulfate response is regulated by a limited number of transcription factors, including MYBs, bZIPs, and NF-YAs. In conclusion, our analysis identified new candidate genes and provided new hypotheses to advance our understanding of the transcriptional regulation of sulfate metabolism in plants.
Identifying osteosarcoma metastasis associated genes by weighted gene co-expression network analysis (WGCNA).

PubMed

Tian, Honglai; Guan, Donghui; Li, Jianmin

2018-06-01

Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS is usually associated with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. From the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to identify the module most strongly correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. Among these genes, WGCNA identified 142 genes in the module most strongly correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that the genes significantly associated with OS metastasis were involved in pathways related to insulin-like growth factor binding. Our research identified several molecules that may participate in the metastatic process and factors that may act as biomarkers. This study may help to better explore the mechanism of OS metastasis and to discover further therapeutic targets.
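As a rough illustration of the first step of a weighted co-expression analysis such as the one named in this record, the Python sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^β, and sums each gene's connection weights. The expression values and gene names are invented, and β = 6 is only an assumed illustrative power; the abstract does not report the settings used in the study, and the later steps of such an analysis (topological overlap, module detection, module-trait correlation) are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.
    beta=6 is an assumed value chosen for illustration."""
    genes = list(expr)
    return {
        g: {
            h: 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
            for h in genes
        }
        for g in genes
    }


def connectivity(adjacency):
    """Sum of a gene's connection weights to all other genes."""
    return {
        g: sum(w for h, w in row.items() if h != g)
        for g, row in adjacency.items()
    }


# Invented expression profiles (4 samples per gene), for illustration only.
EXPR = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],
    "gene3": [1.0, 3.0, 2.0, 4.0],
}
ADJ = soft_threshold_adjacency(EXPR)
K = connectivity(ADJ)
```

Raising the correlation to a power keeps the network weighted while shrinking weak correlations towards zero: here the gene1/gene3 correlation of 0.8 becomes a weight of about 0.26, whereas the perfectly correlated gene1/gene2 pair keeps a weight of 1.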
Gene expression analysis reveals schizophrenia-associated dysregulation of immune pathways in peripheral blood mononuclear cells.

PubMed

Gardiner, Erin J; Cairns, Murray J; Liu, Bing; Beveridge, Natalie J; Carr, Vaughan; Kelly, Brian; Scott, Rodney J; Tooney, Paul A

2013-04-01

Peripheral blood mononuclear cells (PBMCs) represent an accessible tissue source for gene expression profiling in schizophrenia that could provide insight into the molecular basis of the disorder. This study used the Illumina HT_12 microarray platform and quantitative real time PCR (QPCR) to perform mRNA expression profiling on 114 patients with schizophrenia or schizoaffective disorder and 80 non-psychiatric controls from the Australian Schizophrenia Research Bank (ASRB). Differential expression analysis revealed altered expression of 164 genes (59 up-regulated and 105 down-regulated) in the PBMCs from patients with schizophrenia compared to controls. Bioinformatic analysis indicated significant enrichment of differentially expressed genes known to be involved or associated with immune function and regulating the immune response. The differential expression of 6 genes, EIF2C2 (Ago 2), MEF2D, EVL, PI3, S100A12 and DEFA4 was confirmed by QPCR. Genome-wide expression analysis of PBMCs from individuals with schizophrenia was characterized by the alteration of genes with immune system function, supporting the hypothesis that the disorder has a significant immunological component in its etiology. Copyright © 2012 Elsevier Ltd. All rights reserved.
Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949)

PubMed Central

Davies, G; Armstrong, N; Bis, J C; Bressler, J; Chouraki, V; Giddaluru, S; Hofer, E; Ibrahim-Verbaas, C A; Kirin, M; Lahti, J; van der Lee, S J; Le Hellard, S; Liu, T; Marioni, R E; Oldmeadow, C; Postmus, I; Smith, A V; Smith, J A; Thalamuthu, A; Thomson, R; Vitart, V; Wang, J; Yu, L; Zgaga, L; Zhao, W; Boxall, R; Harris, S E; Hill, W D; Liewald, D C; Luciano, M; Adams, H; Ames, D; Amin, N; Amouyel, P; Assareh, A A; Au, R; Becker, J T; Beiser, A; Berr, C; Bertram, L; Boerwinkle, E; Buckley, B M; Campbell, H; Corley, J; De Jager, P L; Dufouil, C; Eriksson, J G; Espeseth, T; Faul, J D; Ford, I; Scotland, Generation; Gottesman, R F; Griswold, M E; Gudnason, V; Harris, T B; Heiss, G; Hofman, A; Holliday, E G; Huffman, J; Kardia, S L R; Kochan, N; Knopman, D S; Kwok, J B; Lambert, J-C; Lee, T; Li, G; Li, S-C; Loitfelder, M; Lopez, O L; Lundervold, A J; Lundqvist, A; Mather, K A; Mirza, S S; Nyberg, L; Oostra, B A; Palotie, A; Papenberg, G; Pattie, A; Petrovic, K; Polasek, O; Psaty, B M; Redmond, P; Reppermund, S; Rotter, J I; Schmidt, H; Schuur, M; Schofield, P W; Scott, R J; Steen, V M; Stott, D J; van Swieten, J C; Taylor, K D; Trollor, J; Trompet, S; Uitterlinden, A G; Weinstein, G; Widen, E; Windham, B G; Jukema, J W; Wright, A F; Wright, M J; Yang, Q; Amieva, H; Attia, J R; Bennett, D A; Brodaty, H; de Craen, A J M; Hayward, C; Ikram, M A; Lindenberger, U; Nilsson, L-G; Porteous, D J; Räikkönen, K; Reinvang, I; Rudan, I; Sachdev, P S; Schmidt, R; Schofield, P R; Srikanth, V; Starr, J M; Turner, S T; Weir, D R; Wilson, J F; van Duijn, C; Launer, L; Fitzpatrick, A L; Seshadri, S; Mosley, T H; Deary, I J

2015-01-01

General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N=53 949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, P=3.93 × 10⁻⁹, MIR2113; rs17522122, P=2.55 × 10⁻⁸, AKAP6; rs10119, P=5.67 × 10⁻⁹, APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (P=1 × 10⁻⁶). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N=6617) and the Health and Retirement Study (N=5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e.=5%) and 28% (s.e.=7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N=5487; P=1.5 × 10⁻¹⁷). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer's disease: TOMM40, APOE, ABCG1 and MEF2C. PMID:25644384
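Combining per-cohort association estimates for a single SNP can be illustrated with a short sketch. The abstract does not state which weighting scheme was used, so the fixed-effect inverse-variance method below is an assumption chosen for illustration; the function name and the cohort effect sizes are hypothetical.

```python
from math import sqrt

def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance meta-analysis (assumed scheme, for illustration).

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns the pooled beta, its standard error, and the z statistic.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = sqrt(1.0 / total)
    return beta, se, beta / se

# Hypothetical per-cohort effect estimates for one SNP.
cohorts = [(0.10, 0.05), (0.06, 0.04), (0.12, 0.08)]
pooled_beta, pooled_se, z = inverse_variance_meta(cohorts)
```

Each cohort is weighted by the inverse of its squared standard error, so larger, more precise cohorts contribute more, and the pooled standard error is smaller than that of any single cohort.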
Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

PubMed Central

Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

2009-01-01

The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150
Gene-Transformation-Induced Changes in Chemical Functional Group Features and Molecular Structure Conformation in Alfalfa Plants Co-Expressing Lc-bHLH and C1-MYB Transcriptive Flavanoid Regulatory Genes: Effects of Single-Gene and Two-Gene Insertion.

PubMed

Heendeniya, Ravindra G; Yu, Peiqiang

2017-03-20

Alfalfa (Medicago sativa L.) genotypes transformed with Lc-bHLH and Lc transcription genes were developed with the intention of stimulating proanthocyanidin synthesis in the aerial parts of the plant. To our knowledge, there are no studies on the effect of single-gene and two-gene transformation on chemical functional groups and molecular structure changes in these plants. The objective of this study was to use advanced molecular spectroscopy with multivariate chemometrics to determine chemical functional group intensity and molecular structure changes in alfalfa plants when co-expressing Lc-bHLH and C1-MYB transcriptive flavanoid regulatory genes in comparison with non-transgenic (NT) and AC Grazeland (ACGL) genotypes. The results showed that compared to NT genotype, the presence of double genes (Lc and C1) increased ratios of both the area and peak height of protein structural Amide I/II and the height ratio of α-helix to β-sheet. In carbohydrate-related spectral analysis, the double gene-transformed alfalfa genotypes exhibited lower peak heights at 1370, 1240, 1153, and 1020 cm⁻¹ compared to the NT genotype. Furthermore, the effect of double gene transformation on carbohydrate molecular structure was clearly revealed in the principal component analysis of the spectra. In conclusion, single or double transformation of Lc and C1 genes resulted in changing functional groups and molecular structure related to proteins and carbohydrates compared to the NT alfalfa genotype. The current study provided molecular structural information on the transgenic alfalfa plants and provided an insight into the impact of transgenes on protein and carbohydrate properties and their molecular structure's changes.
High-resolution DNA melting analysis in plant research

USDA-ARS's Scientific Manuscript database

Genetic and genomic studies provide valuable insight into the inheritance, structure, organization, and function of genes. The knowledge gained from the analysis of plant genes is beneficial to all aspects of plant research, including crop improvement. New methods and tools are continually developed...
Meta-analysis of chicken--salmonella infection experiments.

PubMed

Te Pas, Marinus F W; Hulsegge, Ina; Schokker, Dirkjan; Smits, Mari A; Fife, Mark; Zoorob, Rima; Endale, Marie-Laure; Rebel, Johanna M J

2012-04-24

Chicken meat and eggs can be a source of human zoonotic pathogens, especially Salmonella species. These food items contain a potential hazard for humans. Chicken lines differ in susceptibility to Salmonella and can harbor Salmonella pathogens without showing clinical signs of illness. Many investigations, including genomic studies, have examined the mechanisms by which chickens react to infection. Apart from the innate immune response, many physiological mechanisms and pathways are reported to be involved in the chicken host response to Salmonella infection. The objective of this study was to perform a meta-analysis of diverse experiments to identify general and host-specific mechanisms of the response to Salmonella challenge. Diverse chicken lines differing in susceptibility to Salmonella infection were challenged with different Salmonella serovars at several time points. Various tissues were sampled at different time points post-infection, and the resulting host transcriptional differences were investigated using different microarray platforms. The meta-analysis was performed with the R-package metaMA to create lists of differentially regulated genes. These gene lists showed many similarities for different chicken breeds and tissues, and also for different Salmonella serovars measured at different times post infection. Functional biological analysis of these differentially expressed gene lists revealed several common mechanisms for the chicken host response to Salmonella infection. The meta-analysis-specific genes (i.e. genes found differentially expressed only in the meta-analysis) confirmed and expanded the biological functional mechanisms. The meta-analysis combination of heterogeneous expression profiling data provided useful insights into the common metabolic pathways and functions of different chicken lines infected with different Salmonella serovars.
Meta-analysis of Chicken – Salmonella infection experiments

PubMed Central

2012-01-01

Background Chicken meat and eggs can be a source of human zoonotic pathogens, especially Salmonella species. These food items contain a potential hazard for humans. Chicken lines differ in susceptibility to Salmonella and can harbor Salmonella pathogens without showing clinical signs of illness. Many investigations, including genomic studies, have examined the mechanisms by which chickens react to infection. Apart from the innate immune response, many physiological mechanisms and pathways are reported to be involved in the chicken host response to Salmonella infection. The objective of this study was to perform a meta-analysis of diverse experiments to identify general and host-specific mechanisms of the response to Salmonella challenge. Results Diverse chicken lines differing in susceptibility to Salmonella infection were challenged with different Salmonella serovars at several time points. Various tissues were sampled at different time points post-infection, and the resulting host transcriptional differences were investigated using different microarray platforms. The meta-analysis was performed with the R-package metaMA to create lists of differentially regulated genes. These gene lists showed many similarities for different chicken breeds and tissues, and also for different Salmonella serovars measured at different times post infection. Functional biological analysis of these differentially expressed gene lists revealed several common mechanisms for the chicken host response to Salmonella infection. The meta-analysis-specific genes (i.e. genes found differentially expressed only in the meta-analysis) confirmed and expanded the biological functional mechanisms. Conclusions The meta-analysis combination of heterogeneous expression profiling data provided useful insights into the common metabolic pathways and functions of different chicken lines infected with different Salmonella serovars. PMID:22531008
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

PubMed

Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

2016-04-01

To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

Isolation and Functional Analyses of a Putative Floral Homeotic C-Function Gene in a Basal Eudicot London Plane Tree (Platanus acerifolia)

PubMed Central

Liu, Guofeng; Bao, Manzhu

2013-01-01

The identification of mutants in model plant species has led to the isolation of the floral homeotic function genes that play crucial roles in flower organ specification. However, floral homeotic C-function genes are rarely studied in basal eudicots. Here, we report the isolation and characterization of the AGAMOUS (AG) orthologous gene (PaAG) from a basal eudicot London plane tree (Platanus acerifolia Willd). Phylogenetic analysis showed that PaAG belongs to the C- clade AG group of genes. PaAG was found to be expressed predominantly in the later developmental stages of male and female inflorescences. Ectopic expression of PaAG-1 in tobacco (Nicotiana tabacum) resulted in morphological alterations of the outer two flower whorls, as well as some defects in vegetative growth. Scanning electron micrographs (SEMs) confirmed homeotic sepal-to-carpel transformation in the transgenic plants. Protein interaction assays in yeast cells indicated that PaAG could interact directly with PaAP3 (a B-class MADS-box protein in P. acerifolia), and also PaSEP1 and PaSEP3 (E-class MADS-box proteins in P. acerifolia). This study performed the functional analysis of AG orthologous genes outside core eudicots and monocots. Our findings demonstrate a conserved functional role of AG homolog in London plane tree, which also represent a contribution towards understanding the molecular mechanisms of flower development in this monoecious tree species. PMID:23691041
APOE polymorphism as a potential determinant of functional fitness in the elderly regardless of nutritional status.

PubMed

Snejdrlova, Michaela; Kalvach, Zdenek; Topinkova, Eva; Vrablik, Michal; Prochazkova, Renata; Kvasilova, Marie; Lanska, Vera; Zlatohlavek, Lukas; Prusikova, Martina; Ceska, Richard

2011-01-01

Life expectancy is determined by a combination of genetic predisposition (~25%) and environmental influences (~75%). Nevertheless, a stronger genetic influence is anticipated in long-living individuals. The apolipoprotein E (APOE) gene is among the most studied candidate genes of longevity. We evaluated the relation of APOE polymorphism and fitness status in the elderly. We examined a total of 128 subjects over 80 years of age. Their fitness status was assessed using a battery of functional tests, and the subjects were stratified into 5 functional categories according to Spirduso's classification. Biochemistry analysis was performed by enzymatic method using automated analyzers. APOE gene polymorphism was analysed using PCR-RFLP. APOE4 allele carriers had significantly worse fitness status compared to non-carriers (p=0.025). Multiple logistic regression analysis showed that APOE4 carriers had a higher risk (p=0.05) of functional unfitness compared to APOE2/E3 individuals. APOE gene polymorphism seems to be an important genetic contributor to frailty development in the elderly. While APOE2 carriers tend to remain functionally fit until a higher age, the functional status of APOE4 carriers deteriorates more rapidly. © 2011 Neuroendocrinology Letters
Genome-wide analysis of carotenoid cleavage oxygenase genes and their responses to various phytohormones and abiotic stresses in apple (Malus domestica).

PubMed

Chen, Hongfei; Zuo, Xiya; Shao, Hongxia; Fan, Sheng; Ma, Juanjuan; Zhang, Dong; Zhao, Caiping; Yan, Xiangyan; Liu, Xiaojie; Han, Mingyu

2018-02-01

Carotenoid cleavage oxygenases (CCOs) are able to cleave carotenoids to produce apocarotenoids and their derivatives, which are important for plant growth and development. In this study, 21 apple CCO genes were identified and divided into six groups based on their phylogenetic relationships. We further characterized the apple CCO genes in terms of chromosomal distribution, structure and the presence of cis-elements in the promoter. We also predicted the cellular localization of the encoded proteins. An analysis of the synteny within the apple genome revealed that tandem, segmental, and whole-genome duplication events likely contributed to the expansion of the apple carotenoid oxygenase gene family. An additional integrated synteny analysis identified orthologous carotenoid oxygenase genes between apple and Arabidopsis thaliana, which served as references for the functional analysis of the apple CCO genes. The net photosynthetic rate, transpiration rate, and stomatal conductance of leaves decreased, while leaf stomatal density increased under drought and saline conditions. Tissue-specific gene expression analyses revealed diverse spatiotemporal expression patterns. Finally, hormone and abiotic stress treatments indicated that many apple CCO genes are responsive to various phytohormones as well as drought and salinity stresses. The genome-wide identification of apple CCO genes and the analyses of their expression patterns described herein may provide a solid foundation for future studies examining the regulation and functions of this gene family. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Network inference analysis identifies an APRR2-like gene linked to pigment accumulation in tomato and pepper fruits.

PubMed

Pan, Yu; Bradley, Glyn; Pyke, Kevin; Ball, Graham; Lu, Chungui; Fray, Rupert; Marshall, Alexandra; Jayasuta, Subhalai; Baxter, Charles; van Wijk, Rik; Boyden, Laurie; Cade, Rebecca; Chapman, Natalie H; Fraser, Paul D; Hodgman, Charlie; Seymour, Graham B

2013-03-01

Carotenoids represent some of the most important secondary metabolites in the human diet, and tomato (Solanum lycopersicum) is a rich source of these health-promoting compounds. In this work, a novel and fruit-related regulator of pigment accumulation in tomato has been identified by artificial neural network inference analysis and its function validated in transgenic plants. A tomato fruit gene regulatory network was generated using artificial neural network inference analysis and transcription factor gene expression profiles derived from fruits sampled at various points during development and ripening. One of the transcription factor gene expression profiles with a sequence related to an Arabidopsis (Arabidopsis thaliana) ARABIDOPSIS PSEUDO RESPONSE REGULATOR2-LIKE gene (APRR2-Like) was up-regulated at the breaker stage in wild-type tomato fruits and, when overexpressed in transgenic lines, increased plastid number, area, and pigment content, enhancing the levels of chlorophyll in immature unripe fruits and carotenoids in red ripe fruits. Analysis of the transcriptome of transgenic lines overexpressing the tomato APRR2-Like gene revealed up-regulation of several ripening-related genes in the overexpression lines, providing a link between the expression of this tomato gene and the ripening process. A putative ortholog of the tomato APRR2-Like gene in sweet pepper (Capsicum annuum) was associated with pigment accumulation in fruit tissues. We conclude that the function of this gene is conserved across taxa and that it encodes a protein that has an important role in ripening.
Class I KNOX genes are associated with organogenesis during bulbil formation in Agave tequilana.

PubMed

Abraham-Juárez, María Jazmín; Martínez-Hernández, Aída; Leyva-González, Marco Antonio; Herrera-Estrella, Luis; Simpson, June

2010-09-01

Bulbil formation in Agave tequilana was analysed with the objective of understanding this phenomenon at the molecular and cellular levels. Bulbils formed 14-45 d after induction and were associated with rearrangements in tissue structure and accelerated cell multiplication. Changes at the cellular level during bulbil development were documented by histological analysis. In addition, several cDNA libraries produced from different stages of bulbil development were generated and partially sequenced. Sequence analysis led to the identification of candidate genes potentially involved in the initiation and development of bulbils in Agave, including two putative class I KNOX genes. Real-time reverse transcription-PCR and in situ hybridization revealed that expression of the putative Agave KNOXI genes occurs at bulbil initiation and specifically in tissue where meristems will develop. Functional analysis of Agave KNOXI genes in Arabidopsis thaliana showed the characteristic lobed phenotype of KNOXI ectopic expression in leaves, although a slightly different phenotype was observed for each of the two Agave genes. An Arabidopsis KNOXI (knat1) mutant line (CS30) was successfully complemented with one of the Agave KNOX genes and partially complemented by the other. Analysis of the expression of the endogenous Arabidopsis genes KNAT1, KNAT6, and AS1 in the transformed lines ectopically expressing or complemented by the Agave KNOX genes again showed different regulatory patterns for each Agave gene. These results show that Agave KNOX genes are functionally similar to class I KNOX genes and suggest that spatial and temporal control of their expression is essential during bulbil formation in A. tequilana.
Gene expression factor analysis to differentiate pathways linked to fibromyalgia, chronic fatigue syndrome, and depression in a diverse patient sample

PubMed Central

Iacob, Eli; Light, Alan R.; Donaldson, Gary W.; Okifuji, Akiko; Hughen, Ronald W.; White, Andrea T.; Light, Kathleen C.

2015-01-01

Objective To determine if independent candidate genes can be grouped into meaningful biological factors and if these factors are associated with the diagnosis of chronic fatigue syndrome (CFS) and fibromyalgia (FMS) while controlling for co-morbid depression, sex, and age. Methods We included leukocyte mRNA gene expression from a total of 261 individuals including healthy controls (n=61), patients with FMS only (n=15), CFS only (n=33), co-morbid CFS and FMS (n=79), and medication-resistant (n=42) or medication-responsive (n=31) depression. We used Exploratory Factor Analysis (EFA) on 34 candidate genes to determine factor scores and regression analysis to examine if these factors were associated with specific diagnoses. Results EFA resulted in four independent factors with minimal overlap of genes between factors explaining 51% of the variance. We labeled these factors by function as: 1) Purinergic and cellular modulators; 2) Neuronal growth and immune function; 3) Nociception and stress mediators; 4) Energy and mitochondrial function. Regression analysis predicting these biological factors using FMS, CFS, depression severity, age, and sex revealed that greater expression in Factors 1 and 3 was positively associated with CFS and negatively associated with depression severity (QIDS score), but not associated with FMS. Conclusion Expression of candidate genes can be grouped into meaningful clusters, and CFS and depression are associated with the same 2 clusters but in opposite directions when controlling for co-morbid FMS. Given high co-morbid disease and interrelationships between biomarkers, EFA may help determine patient subgroups in this population based on gene expression. PMID:26097208
Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences

PubMed Central

2012-01-01

Background The first draft assembly and gene prediction of the grapevine genome (8X base coverage) was made available to the scientific community in 2007, and functional annotation was developed on this gene prediction. Since then additional Sanger sequences were added to the 8X sequences pool and a new version of the genomic sequence with superior base coverage (12X) was produced. Results In order to more efficiently annotate the function of the genes predicted in the new assembly, it is important to build on as much of the previous work as possible, by transferring 8X annotation of the genome to the 12X version. The 8X and 12X assemblies and gene predictions of the grapevine genome were compared to answer the question, “Can we uniquely map 8X predicted genes to 12X predicted genes?” The results show that while the assemblies and gene structure predictions are too different to make a complete mapping between them, most genes (18,725) showed a one-to-one relationship between 8X predicted genes and the last version of 12X predicted genes. In addition, reshuffled genomic sequence structures appeared. These highlight regions of the genome where the gene predictions need to be taken with caution. Based on the new grapevine gene functional annotation and in-depth functional categorization, twenty eight new molecular networks have been created for VitisNet while the existing networks were updated. Conclusions The outcomes of this study provide a functional annotation of the 12X genes, an update of VitisNet, the system of the grapevine molecular networks, and a new functional categorization of genes. Data are available at the VitisNet website (http://www.sdstate.edu/ps/research/vitis/pathways.cfm). PMID:22554261
Analysis of the Citrullus colocynthis Transcriptome during Water Deficit Stress

PubMed Central

Wang, Zhuoyu; Hu, Hongtao; Goertzen, Leslie R.; McElroy, J. Scott; Dane, Fenny

2014-01-01

Citrullus colocynthis is a very drought-tolerant species, closely related to watermelon (C. lanatus var. lanatus), an economically important cucurbit crop. Drought is a threat to plant growth and development, and the discovery of drought-inducible genes with various functions is of great importance. We used high-throughput mRNA Illumina sequencing technology and bioinformatic strategies to analyze the C. colocynthis leaf transcriptome under drought treatment. Leaf samples at four different time points (0, 24, 36, or 48 hours of withholding water) were used for RNA extraction and Illumina sequencing. qRT-PCR of several drought-responsive genes was performed to confirm the accuracy of RNA sequencing. Leaf transcriptome analysis provided the first glimpse of the drought-responsive transcriptome of this unique cucurbit species. A total of 5038 full-length cDNAs were detected, with 2545 genes showing significant changes during drought stress. Principal component analysis indicated that drought was the major contributing factor regulating transcriptome changes. Up-regulation of many transcription factors, stress signaling factors, detoxification genes, and genes involved in phytohormone signaling and citrulline metabolism occurred under the water deficit conditions. The C. colocynthis transcriptome data highlight the activation of a large set of drought-related genes in this species, thus providing a valuable resource for future functional analysis of candidate genes in defense of drought stress. PMID:25118696
Cloning and expression profile of ionotropic receptors in the parasitoid wasp Microplitis mediator (Hymenoptera: Braconidae).

PubMed

Wang, Shan-Ning; Peng, Yong; Lu, Zi-Yun; Dhiloo, Khalid Hussain; Zheng, Yao; Shan, Shuang; Li, Rui-Jun; Zhang, Yong-Jun; Guo, Yu-Yuan

2016-07-01

Ionotropic receptors (IRs) mainly detect the acids and amines having great importance in many insect species, representing an ancient olfactory receptor family in insects. In the present work, we performed RNAseq of Microplitis mediator antennae and identified seventeen IRs. Full-length MmedIRs were cloned and sequenced. Phylogenetic analysis of the Hymenoptera IRs revealed that ten MmedIR genes encoded "antennal IRs" and seven encoded "divergent IRs". Among the IR25a orthologous groups, two genes, MmedIR25a.1 and MmedIR25a.2, were found in M. mediator. Gene structure analysis of MmedIR25a revealed a tandem duplication of IR25a in M. mediator. The tissue distribution and development specific expression of the MmedIR genes suggested that these genes showed a broad expression profile. Quantitative gene expression analysis showed that most of the genes are highly enriched in adult antennae, indicating the candidate chemosensory function of this family in parasitic wasps. Using immunocytochemistry, we confirmed that one co-receptor, MmedIR8a, was expressed in the olfactory sensory neurons. Our data will supply fundamental information for functional analysis of the IRs in parasitoid wasp chemoreception. Copyright © 2016 Elsevier Ltd. All rights reserved.
Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

DOE PAGES

Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; ...

2015-03-27

Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
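The comparison between "topologically important" and "functionally important" genes described in this abstract can be illustrated with a minimal sketch. It assumes degree centrality as the centrality measure (the abstract does not say which measures were used), and the gene names, edge list, and essential-gene set are hypothetical toy data, not results from the study.

```python
def degree_centrality(edges):
    """Return {gene: degree / (n - 1)} for an undirected co-expression edge list."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        raise ValueError("need at least two distinct genes")
    return {gene: len(nb) / (n - 1) for gene, nb in neighbors.items()}


def top_central(centrality, k):
    """Return the k most central genes; ties are broken by gene name."""
    ranked = sorted(centrality, key=lambda g: (-centrality[g], g))
    return set(ranked[:k])


# Hypothetical toy network: each pair is a co-expression link.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
# Hypothetical outcome of in silico knock-outs (genes whose deletion affects growth).
essential = {"g1", "g2"}

central = top_central(degree_centrality(edges), k=2)
# Fraction of topologically central genes that are also functionally essential.
fraction_essential = len(central & essential) / len(central)
```

In a real analysis the edge list would come from a thresholded co-expression matrix and the essential set from a genome-scale metabolic model; here both are stand-ins.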
Variation analysis of transcriptome changes reveals cochlear genes and their associated functions in cochlear susceptibility to acoustic overstimulation.

PubMed

Yang, Shuzhi; Cai, Qunfeng; Bard, Jonathan; Jamison, Jennifer; Wang, Jianmin; Yang, Weiping; Hu, Bo Hua

2015-12-01

Individual variation in the susceptibility of the auditory system to acoustic overstimulation has been well-documented at both the functional and structural levels. However, the molecular mechanism responsible for this variation is unclear. The current investigation was designed to examine the variation patterns of cochlear gene expression using RNA-seq data and to identify the genes with expression variation that increased following acoustic trauma. This study revealed that the constitutive expressions of cochlear genes displayed diverse levels of gene-specific variation. These variation patterns were altered by acoustic trauma; approximately one-third of the examined genes displayed marked increases in their expression variation. Bioinformatics analyses revealed that the genes that exhibited increased variation were functionally related to cell death, biomolecule metabolism, and membrane function. In contrast, the stable genes were primarily related to basic cellular processes, including protein and macromolecular syntheses and transport. There was no functional overlap between the stable and variable genes. Importantly, we demonstrated that glutamate metabolism is related to the variation in the functional response of the cochlea to acoustic overstimulation. Taken together, the results indicate that our analyses of the individual variations in transcriptome changes of cochlear genes provide important information for the identification of genes that potentially contribute to the generation of individual variation in cochlear responses to acoustic overstimulation. Copyright © 2015 Elsevier B.V. All rights reserved.
Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.

Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
The petunia AGL6 gene has a SEPALLATA-like function in floral patterning.

PubMed

Rijpkema, Anneke S; Zethof, Jan; Gerats, Tom; Vandenbussche, Michiel

2009-10-01

SEPALLATA (SEP) MADS-box genes are required for the regulation of floral meristem determinacy and the specification of sepals, petals, stamens, carpels and ovules, specifically in angiosperms. The SEP subfamily is closely related to the AGAMOUS LIKE6 (AGL6) and SQUAMOSA (SQUA) subfamilies. So far, of these three groups only AGL6-like genes have been found in extant gymnosperms. AGL6 genes are more similar to SEP than to SQUA genes, both in sequence and in expression pattern. Despite the ancestry and wide distribution of AGL6-like MADS-box genes, not a single loss-of-function mutant exhibiting a clear phenotype has yet been reported; consequently the function of AGL6-like genes has remained elusive. Here, we characterize the Petunia hybrida AGL6 (PhAGL6, formerly called PETUNIA MADS BOX GENE4/pMADS4) gene, and show that it functions redundantly with the SEP genes FLORAL BINDING PROTEIN2 (FBP2) and FBP5 in petal and anther development. Moreover, expression analysis suggests a function for PhAGL6 in ovary and ovule development. The PhAGL6 and FBP2 proteins interact in in vitro experiments overall with the same partners, indicating that the two proteins are biochemically quite similar. It will be interesting to determine the functions of AGL6-like genes of other species, especially those of gymnosperms.
Korean Red Ginseng Up-regulates C21-Steroid Hormone Metabolism via Cyp11a1 Gene in Senescent Rat Testes

PubMed Central

Kim, In-Hye; Kim, Si-Kwan; Kim, Eun-Hye; Kim, Sung-Won; Sohn, Sang-Hyun; Lee, Soo Cheol; Choi, Sangdun; Pyo, Suhkneung; Rhee, Dong-Kwon

2011-01-01

Ginseng (Panax ginseng Meyer) has been shown to have anti-aging effects in animal and clinical studies. However, the molecular mechanisms by which ginseng exerts these effects remain unknown. Here, the anti-aging effect of Korean red ginseng (KRG) in rat testes was examined by systems biology analysis. KRG water extract prepared in feed pellets was administered orally to 12 month old rats for 4 months, and gene expression in testes was determined by microarray analysis. Microarray analysis identified 33 genes that significantly changed. Compared to the 2 month old young rats, 13 genes (Rps9, Cyp11a1, RT1-A2, LOC365778, Sv2b, RGD1565959, RGD1304748, etc.) were up-regulated and 20 genes (RT1-Db1, Cldn5, Svs5, Degs1, Vdac3, Hbb, LOC684355, Svs5, Tmem97, Orai1, Insl3, LOC497959, etc.) were down-regulated by KRG in the older rats. Ingenuity Pathway Analysis of untreated aged rats versus aged rats treated with KRG showed that the most affected gene was Cyp11a1, responsible for C21-steroid hormone metabolism, and that the top molecular and cellular functions were organ morphology and reproductive system development and function. When genes in the young rat were compared with those in the aged rat, sperm capacitation related genes were down-regulated in the old rat. However, when genes in the old rat were compared with those in the old rat treated with KRG, KRG treatment up-regulated C21-steroid hormone metabolism. Taken together, Cyp11a1 expression is decreased in the aged rat; however, it is up-regulated by KRG, suggesting that KRG seems to enhance testes function via Cyp11a1. PMID:23717070
SNP in starch biosynthesis genes associated with nutritional and functional properties of rice

PubMed Central

Kharabian-Masouleh, Ardashir; Waters, Daniel L. E.; Reinke, Russell F.; Ward, Rachelle; Henry, Robert J.

2012-01-01

Starch is a major component of human diets. The relative contribution of variation in the genes of starch biosynthesis to the nutritional and functional properties of the rice was evaluated in a rice breeding population. Sequencing 18 genes involved in starch synthesis in a population of 233 rice breeding lines discovered 66 functional SNPs in exonic regions. Five genes, AGPS2b, Isoamylase1, SPHOL, SSIIb and SSIVb showed no polymorphism. Association analysis found 31 of the SNP were associated with differences in pasting and cooking quality properties of the rice lines. Two genes appear to be the major loci controlling traits under human selection in rice, GBSSI (waxy gene) and SSIIa. GBSSI influenced amylose content and retrogradation. Other genes contributing to retrogradation were GPT1, SSI, BEI and SSIIIa. SSIIa explained much of the variation in cooking characteristics. Other genes had relatively small effects. PMID:22870386
Genome-wide identification of bacterial plant colonization genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.

Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.
Genome-wide identification of bacterial plant colonization genes

DOE PAGES

Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.; ...

2017-09-22

Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.
The Reconstruction and Analysis of Gene Regulatory Networks.

PubMed

Zheng, Guangyong; Huang, Tao

2018-01-01

In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review the inference methods for GRN reconstruction and the analysis approaches for network structure. The applications of the network method, a powerful tool for studying complex diseases and biological processes, in pathway analysis and disease gene identification are also introduced.
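The graph view of a GRN described in this abstract (molecules as vertices, regulatory connections as edges) can be sketched as a small directed graph. This is only an illustration under stated assumptions: the molecule names and interactions are hypothetical, and treating edges as directed regulator-to-target links is a modelling choice made here, not something the abstract specifies.

```python
from collections import defaultdict


def build_grn(interactions):
    """Build a directed network from (regulator, target) pairs.

    Returns two dicts: regulator -> set of targets, and target -> set of regulators.
    """
    targets_of = defaultdict(set)
    regulators_of = defaultdict(set)
    for regulator, target in interactions:
        targets_of[regulator].add(target)
        regulators_of[target].add(regulator)
    return targets_of, regulators_of


def hubs(targets_of, min_targets):
    """Return regulators with at least `min_targets` outgoing edges, sorted by name."""
    return sorted(r for r, t in targets_of.items() if len(t) >= min_targets)


# Hypothetical interactions: a transcription factor and a noncoding RNA acting on genes.
interactions = [
    ("TF_A", "gene1"),
    ("TF_A", "gene2"),
    ("TF_A", "gene3"),
    ("ncRNA_B", "gene2"),
]
targets_of, regulators_of = build_grn(interactions)
hub_regulators = hubs(targets_of, min_targets=3)
```

Out-degree is only one of many structural summaries; it is used here because it is the simplest way to show how network structure can point at candidate key regulators.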
Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns.

PubMed

Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S

2012-05-08

The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. 
Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax diverged. Flax has a large number of UGT genes including few flax diverged ones. Phylogenetic analysis and expression profiles of these genes identified tissue and condition specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and their further characterization of substrate specificities and in planta functions.
Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns

PubMed Central

2012-01-01

Background: The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave way for the development of functional genomics approaches. Results: Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax diverged. Conclusions: Flax has a large number of UGT genes including few flax diverged ones. Phylogenetic analysis and expression profiles of these genes identified tissue and condition specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and their further characterization of substrate specificities and in planta functions. PMID:22568875
Analysis of the role of Arabidopsis class I TCP genes AtTCP7, AtTCP8, AtTCP22, and AtTCP23 in leaf development

PubMed Central

Aguilar-Martínez, José A.; Sinha, Neelima

2013-01-01

TCP family of plant-specific transcription factors regulates plant form through control of cell proliferation and differentiation. This gene family is comprised of two groups, class I and class II. While the role of class II TCP genes in plant development is well known, data about the function of some class I TCP genes is lacking. We studied a group of phylogenetically related class I TCP genes: AtTCP7, AtTCP8, AtTCP22, and AtTCP23. The similar expression pattern in young growing leaves found for this group suggests similarity in gene function. Gene redundancy is characteristic in this group, as also seen in the class II TCP genes. We generated a pentuple mutant tcp8 tcp15 tcp21 tcp22 tcp23 and show that loss of function of these genes results in changes in leaf developmental traits. We also determined that these factors are able to mutually interact in a yeast two-hybrid assay and regulate the expression of KNOX1 genes. To circumvent the issue of genetic redundancy, dominant negative forms with SRDX repressor domain were used. Analysis of transgenic plants expressing AtTCP7-SRDX and AtTCP23-SRDX indicate a role of these factors in the control of cell proliferation. PMID:24137171
Analysis of the role of Arabidopsis class I TCP genes AtTCP7, AtTCP8, AtTCP22, and AtTCP23 in leaf development.

PubMed

Aguilar-Martínez, José A; Sinha, Neelima

2013-01-01

TCP family of plant-specific transcription factors regulates plant form through control of cell proliferation and differentiation. This gene family is comprised of two groups, class I and class II. While the role of class II TCP genes in plant development is well known, data about the function of some class I TCP genes is lacking. We studied a group of phylogenetically related class I TCP genes: AtTCP7, AtTCP8, AtTCP22, and AtTCP23. The similar expression pattern in young growing leaves found for this group suggests similarity in gene function. Gene redundancy is characteristic in this group, as also seen in the class II TCP genes. We generated a pentuple mutant tcp8 tcp15 tcp21 tcp22 tcp23 and show that loss of function of these genes results in changes in leaf developmental traits. We also determined that these factors are able to mutually interact in a yeast two-hybrid assay and regulate the expression of KNOX1 genes. To circumvent the issue of genetic redundancy, dominant negative forms with SRDX repressor domain were used. Analysis of transgenic plants expressing AtTCP7-SRDX and AtTCP23-SRDX indicate a role of these factors in the control of cell proliferation.
A functional U-statistic method for association analysis of sequencing data.

PubMed

Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

2017-11-01

Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we apply our method to the sequencing data from Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, associated with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.

PubMed

Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas

2017-01-21

We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses at various tissues, respectively. We tested differential gene expression by empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also presented that 12.8% and 59.0% pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% pairwise studies had independent expression association for genes, but no studies were observed significantly different for expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to the gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes at different tissues.
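The abstract says only that "meta-analysis by different methods" was applied across studies. As one common way of combining per-study evidence for a single gene, the sketch below uses Fisher's combined probability test; this choice is an assumption made for illustration and is not confirmed by the abstract, and the example p-values are hypothetical.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i  # builds (X/2)^i / i! incrementally
        total += term
    return exp(-half_x) * total


# Hypothetical per-study p-values for one gene across three tissues.
combined = fisher_combined_p([0.04, 0.20, 0.01])
```

Fisher's method assumes the studies are independent; when the same samples contribute to several datasets, that assumption fails and the combined p-value is too optimistic.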
Passenger mutations and aberrant gene expression in congenic tissue plasminogen activator-deficient mouse strains.

PubMed

Szabo, R; Samson, A L; Lawrence, D A; Medcalf, R L; Bugge, T H

2016-08-01

Essentials: C57BL/6J-tissue plasminogen activator (tPA)-deficient mice are widely used to study tPA function. Congenic C57BL/6J-tPA-deficient mice harbor large 129-derived chromosomal segments. The 129-derived chromosomal segments contain gene mutations that may confound data interpretation. Passenger mutation-free isogenic tPA-deficient mice were generated for study of tPA function. Background: The ability to generate defined null mutations in mice revolutionized the analysis of gene function in mammals. However, gene-deficient mice generated by using 129-derived embryonic stem cells may carry large segments of 129 DNA, even when extensively backcrossed to reference strains, such as C57BL/6J, and this may confound interpretation of experiments performed in these mice. Tissue plasminogen activator (tPA), encoded by the PLAT gene, is a fibrinolytic serine protease that is widely expressed in the brain. A number of neurological abnormalities have been reported in tPA-deficient mice. Objectives: To study genetic contamination of tPA-deficient mice. Materials and methods: Whole genome expression array analysis, RNAseq expression profiling, low- and high-density single nucleotide polymorphism (SNP) analysis, bioinformatics and genome editing were used to analyze gene expression in tPA-deficient mouse brains. Results and conclusions: Genes differentially expressed in the brain of Plat(-/-) mice from two independent colonies highly backcrossed onto the C57BL/6J strain clustered near Plat on chromosome 8. SNP analysis attributed this anomaly to about 20 Mbp of DNA flanking Plat being of 129 origin in both strains. Bioinformatic analysis of these 129-derived chromosomal segments identified a significant number of mutations in genes co-segregating with the targeted Plat allele, including several potential null mutations. Using zinc finger nuclease technology, we generated novel 'passenger mutation'-free isogenic C57BL/6J-Plat(-/-) and FVB/NJ-Plat(-/-) mouse strains by introducing an 11 bp deletion into the exon encoding the signal peptide. These novel mouse strains will be a useful community resource for further exploration of tPA function in physiological and pathological processes. © 2016 International Society on Thrombosis and Haemostasis.
Integration of biological data by kernels on graph nodes allows prediction of new genes involved in mitotic chromosome condensation

PubMed Central

Hériché, Jean-Karim; Lees, Jon G.; Morilla, Ian; Walter, Thomas; Petrova, Boryana; Roberti, M. Julia; Hossain, M. Julius; Adler, Priit; Fernández, José M.; Krallinger, Martin; Haering, Christian H.; Vilo, Jaak; Valencia, Alfonso; Ranea, Juan A.; Orengo, Christine; Ellenberg, Jan

2014-01-01

The advent of genome-wide RNA interference (RNAi)–based screens puts us in the position to identify genes for all functions human cells carry out. However, for many functions, assay complexity and cost make genome-scale knockdown experiments impossible. Methods to predict genes required for cell functions are therefore needed to focus RNAi screens from the whole genome on the most likely candidates. Although different bioinformatics tools for gene function prediction exist, they lack experimental validation and are therefore rarely used by experimentalists. To address this, we developed an effective computational gene selection strategy that represents public data about genes as graphs and then analyzes these graphs using kernels on graph nodes to predict functional relationships. To demonstrate its performance, we predicted human genes required for a poorly understood cellular function—mitotic chromosome condensation—and experimentally validated the top 100 candidates with a focused RNAi screen by automated microscopy. Quantitative analysis of the images demonstrated that the candidates were indeed strongly enriched in condensation genes, including the discovery of several new factors. By combining bioinformatics prediction with experimental validation, our study shows that kernels on graph nodes are powerful tools to integrate public biological data and predict genes involved in cellular functions of interest. PMID:24943848
Serial analysis of gene expression in a rat lung model of asthma.

PubMed

Yin, Lei-Miao; Jiang, Gong-Hao; Wang, Yu; Wang, Yan; Liu, Yan-Yan; Jin, Wei-Rong; Zhang, Zen; Xu, Yu-Dong; Yang, Yong-Qing

2008-11-01

The pathogenesis and molecular mechanism underlying asthma remain undetermined. The purpose of this study was to identify genes and pathways involved in the early airway response (EAR) phase of asthma by using serial analysis of gene expression (SAGE). Two SAGE tag libraries of lung tissues derived from a rat model of asthma and controls were generated. Bioinformatic analyses were carried out using the Database for Annotation, Visualization and Integrated Discovery Functional Annotation Tool, Gene Ontology (GO) TreeMachine and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. A total of 26 552 SAGE tags of asthmatic rat lung were obtained, of which 12 221 were unique tags. Of the unique tags, 55.5% were matched with known genes. By comparison of the two libraries, 186 differentially expressed tags (P < 0.05) were identified, of which 103 were upregulated and 83 were downregulated. Using the bioinformatic tools, these genes were classified into 23 functional groups, 15 KEGG pathways and 37 enriched GO categories. The bioinformatic analyses of gene distribution, enriched categories and the involvement of specific pathways in the SAGE libraries have provided information on regulatory networks of the EAR phase of asthma. Analyses of the regulated genes of interest may inform new hypotheses, increase our understanding of the disease and provide a foundation for future research.
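The abstract reports tags that differ between the two SAGE libraries at P < 0.05 but does not name the statistical test. The sketch below shows one way such a per-tag comparison could be made, assuming a two-sided Fisher exact test on a 2x2 table of tag counts; the test choice, the helper names, and the example counts are all assumptions for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def tag_p_value(count_a, total_a, count_b, total_b):
    """Compare one tag's count in library A and library B, given library sizes."""
    return fisher_exact_two_sided(count_a, total_a - count_a,
                                  count_b, total_b - count_b)


# Hypothetical tag: seen 30 times in one library and 8 times in the other,
# with both libraries assumed to hold 20 000 tags.
p = tag_p_value(30, 20000, 8, 20000)
```

With many thousands of tags tested, a multiple-testing correction would normally be applied on top of the per-tag p-values; that step is omitted here.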
A critical assessment of Mus musculus gene function prediction using integrated genomic evidence

PubMed Central

Peña-Castillo, Lourdes; Tasan, Murat; Myers, Chad L; Lee, Hyunju; Joshi, Trupti; Zhang, Chao; Guan, Yuanfang; Leone, Michele; Pagnani, Andrea; Kim, Wan Kyu; Krumpelman, Chase; Tian, Weidong; Obozinski, Guillaume; Qi, Yanjun; Mostafavi, Sara; Lin, Guan Ning; Berriz, Gabriel F; Gibbons, Francis D; Lanckriet, Gert; Qiu, Jian; Grant, Charles; Barutcuoglu, Zafer; Hill, David P; Warde-Farley, David; Grouios, Chris; Ray, Debajyoti; Blake, Judith A; Deng, Minghua; Jordan, Michael I; Noble, William S; Morris, Quaid; Klein-Seetharaman, Judith; Bar-Joseph, Ziv; Chen, Ting; Sun, Fengzhu; Troyanskaya, Olga G; Marcotte, Edward M; Xu, Dong; Hughes, Timothy R; Roth, Frederick P

2008-01-01

Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized. PMID:18613946
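The figure "41% precision at a recall rate of 20%" can be made concrete with a small sketch: rank the predictions for one GO term by score, walk down the list until the chosen fraction of true annotations has been recovered, and report the precision at that point. The function name, gene labels and scores below are hypothetical, and the assessment's actual evaluation code is not described in the abstract.

```python
def precision_at_recall(scored, positives, target_recall):
    """Return precision at the first rank where recall >= target_recall.

    scored: list of (gene, score) pairs for one GO term.
    positives: set of genes truly annotated with that term.
    Returns None if the target recall is never reached."""
    if not positives:
        raise ValueError("positives must not be empty")
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    hits = 0
    for rank, (gene, _) in enumerate(ranked, start=1):
        if gene in positives:
            hits += 1
            if hits / len(positives) >= target_recall:
                return hits / rank
    return None


# Hypothetical predictions for a single GO term
scored = [("g1", 0.9), ("g2", 0.8), ("g3", 0.7), ("g4", 0.6), ("g5", 0.5)]
positives = {"g2", "g3", "g5"}
p_half = precision_at_recall(scored, positives, 0.5)  # 2 hits in top 3
p_full = precision_at_recall(scored, positives, 1.0)  # 3 hits in top 5
```

Averaging such per-term values over all evaluated GO terms would give a summary number of the kind quoted above.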
Genome Wide Identification, Evolutionary, and Expression Analysis of VQ Genes from Two Pyrus Species.

PubMed

Cao, Yunpeng; Meng, Dandan; Abdullah, Muhammad; Jin, Qing; Lin, Yi; Cai, Yongping

2018-04-23

The VQ motif-containing gene, a member of the plant-specific genes, is involved in the plant developmental process and various stress responses. The VQ motif-containing gene family has been studied in several plants, such as rice (Oryza sativa), maize (Zea mays), and Arabidopsis (Arabidopsis thaliana). However, no systematic study has been performed in Pyrus species, which have important economic value. In our study, we identified 41 and 28 VQ motif-containing genes in Pyrus bretschneideri and Pyrus communis, respectively. Phylogenetic trees were calculated using A. thaliana and O. sativa VQ motif-containing genes as a template, allowing us to categorize these genes into nine subfamilies. Thirty-two and eight paralogs of VQ motif-containing genes were found in P. bretschneideri and P. communis, respectively, showing that the VQ motif-containing genes had a more remarkable expansion in P. bretschneideri than in P. communis. A total of 31 orthologous pairs were identified from the P. bretschneideri and P. communis VQ motif-containing genes. Additionally, among the paralogs, we found that these duplicated gene pairs probably derived from segmental duplication/whole-genome duplication (WGD) events in the genomes of P. bretschneideri and P. communis, respectively. The gene expression profiles in both P. bretschneideri and P. communis fruits suggested functional redundancy for some orthologous gene pairs derived from a common ancestry, and sub-functionalization or neo-functionalization for some of them. Our study provided the first systematic evolutionary analysis of the VQ motif-containing genes in Pyrus, and highlighted the diversification and duplication of VQ motif-containing genes in both P. bretschneideri and P. communis.
Horizontal gene transfer in silkworm, Bombyx mori

PubMed Central

2011-01-01

Background: The domesticated silkworm, Bombyx mori, is the model insect for the order Lepidoptera, is economically important, and has gained some representative behavioral characteristics compared to its wild ancestor. The genome of B. mori has been fully sequenced, while functional analysis of the BmChi-h and BmSuc1 genes revealed that horizontal gene transfer (HGT) may bestow a clear selective advantage on B. mori. However, the role of HGT in the evolutionary history of B. mori is largely unexplored. In this study, we compare the whole genome of B. mori with those of 382 prokaryotic and eukaryotic species to investigate potential HGTs. Results: Ten candidate HGT events were defined in B. mori by comprehensive sequence analysis using Maximum Likelihood and Bayesian methods combined with EST checking. Phylogenetic analysis of the candidate HGT genes suggested that one HGT was a plant-to-B. mori transfer while nine were bacteria-to-B. mori transfers. Furthermore, functional analysis based on expression, coexpression and related literature searching revealed that several HGT candidate genes have added important characters, such as resistance to pathogens, to B. mori. Conclusions: Results from this study clearly demonstrated that HGTs play an important role in the evolution of B. mori, although the number of HGT events in B. mori is in general smaller than those of microbes and other insects. In particular, interdomain HGTs in B. mori may give rise to functional, persistent, and possibly evolutionarily significant new genes. PMID:21595916
Genome wide analysis reveals Zic3 interaction with distal regulatory elements of stage specific developmental genes in zebrafish.

PubMed

Winata, Cecilia L; Kondrychyn, Igor; Kumar, Vibhor; Srinivasan, Kandhadayar G; Orlov, Yuriy; Ravishankar, Ashwini; Prabhakar, Shyam; Stanton, Lawrence W; Korzh, Vladimir; Mathavan, Sinnakaruppan

2013-10-01

Zic3 regulates early embryonic patterning in vertebrates. Loss of Zic3 function is known to disrupt gastrulation, left-right patterning, and neurogenesis. However, molecular events downstream of this transcription factor are poorly characterized. Here we use the zebrafish as a model to study the developmental role of Zic3 in vivo, by applying a combination of two powerful genomics approaches--ChIP-seq and microarray. Besides confirming direct regulation of previously implicated Zic3 targets of the Nodal and canonical Wnt pathways, analysis of gastrula stage embryos uncovered a number of novel candidate target genes, among which were members of the non-canonical Wnt pathway and the neural pre-pattern genes. A similar analysis in zic3-expressing cells obtained by FACS at segmentation stage revealed a dramatic shift in Zic3 binding site locations and identified an entirely distinct set of target genes associated with later developmental functions such as neural development. We demonstrate cis-regulation of several of these target genes by Zic3 using an in vivo enhancer assay. Analysis of Zic3 binding sites revealed a distribution biased towards distal intergenic regions, indicative of a long-distance regulatory mechanism; some of these binding sites are highly conserved during evolution and act as functional enhancers. This demonstrated that Zic3 regulation of developmental genes is achieved predominantly through a long-distance regulatory mechanism and revealed that developmental transitions could be accompanied by dramatic changes in the regulatory landscape.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

PubMed

Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

2010-07-02

An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, makes it possible to examine INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating such reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
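The tag-count measure described above (how many sequenced reads fall within the genomic range of each annotation) can be sketched in a few lines. This is not TASE itself, which the abstract describes as a Java application with a SQL Server backend; the function name, the tuple layouts and the coordinates are assumptions chosen for illustration, and the nested loop ignores the indexing a real tool would need at sequencing scale.

```python
from collections import Counter


def count_tags(read_positions, annotations):
    """Count reads that fall inside each annotated genomic range.

    read_positions: iterable of (chromosome, position) pairs.
    annotations: list of (gene, chromosome, start, end), ends inclusive.
    Returns a dict mapping gene -> tag count (zero counts included)."""
    counts = Counter({gene: 0 for gene, _, _, _ in annotations})
    for chrom, pos in read_positions:
        for gene, a_chrom, start, end in annotations:
            if chrom == a_chrom and start <= pos <= end:
                counts[gene] += 1
    return dict(counts)


# Hypothetical aligned read positions and annotation ranges
reads = [("chr1", 150), ("chr1", 250), ("chr2", 150), ("chr1", 200)]
annotations = [
    ("geneA", "chr1", 100, 200),
    ("geneB", "chr1", 201, 300),
    ("geneC", "chr2", 500, 600),
]
tag_counts = count_tags(reads, annotations)
```

For large datasets an interval index or a database range query would replace the inner loop; the counting logic stays the same.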
Analysis of functional polymorphisms in three synaptic plasticity-related genes (BDNF, COMT AND UCHL1) in Alzheimer's disease in Colombia.

PubMed

Forero, Diego A; Benítez, Bruno; Arboleda, Gonzalo; Yunis, Juan J; Pardo, Rodrigo; Arboleda, Humberto

2006-07-01

In recent years, it has been proposed that synaptic dysfunction may be an important etiological factor for Alzheimer's disease (AD). This hypothesis has important implications for the analysis of AD genetic risk in case-control studies. In the present work, we analyzed common functional polymorphisms in three synaptic plasticity-related genes (brain-derived neurotrophic factor, BDNF Val66Met; catechol-O-methyl transferase, COMT Val158; ubiquitin carboxyl-terminal hydrolase, UCHL1 S18Y) in a sample of 102 AD cases and 168 age- and sex-matched controls living in Bogotá, Colombia. There was no association between the UCHL1 polymorphism and AD in our sample. We found an initial association with the BDNF polymorphism in familial cases and with the COMT polymorphism in male and sporadic patients. These initial associations were lost after Bonferroni correction for multiple testing. Unadjusted results may be compatible with the expected functional effect of variations in these genes on pathological memory and cognitive dysfunction, as has been implicated in animal and cell models and also from neuropsychological analysis of normal subjects carrying the AD-associated genotypes. An exploration of functional variants in these and in other synaptic plasticity-related genes (a synaptogenomics approach) in independent larger samples will be important to discover new genes associated with AD.
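The loss of nominal associations after Bonferroni correction follows directly from how the correction works: each p-value is multiplied by the number of tests (capped at 1) before being compared with the significance level. The sketch below uses made-up p-values, since the abstract reports none, and the function name is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, is_significant) for each test.

    The adjusted p-value is p * m, capped at 1.0, where m is the
    number of tests performed."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj <= alpha) for p_adj in adjusted]


# Hypothetical p-values for three polymorphism tests (not from the study)
raw = [0.03, 0.04, 0.20]
corrected = bonferroni(raw)
# The first two tests are nominally significant at 0.05, but after
# multiplying by m = 3 none of the adjusted values is <= 0.05.
```

The number of tests actually corrected for in the study may differ, for example if subgroup analyses by sex and familial status were included.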
A resource for functional profiling of noncoding RNA in the yeast Saccharomyces cerevisiae.

PubMed

Parker, Steven; Fraczek, Marcin G; Wu, Jian; Shamsah, Sara; Manousaki, Alkisti; Dungrattanalert, Kobchai; de Almeida, Rogerio Alves; Estrada-Rivadeneyra, Diego; Omara, Walid; Delneri, Daniela; O'Keefe, Raymond T

2017-08-01

Eukaryotic genomes are extensively transcribed, generating many different RNAs with no known function. We have constructed 1502 molecular barcoded ncRNA gene deletion strains encompassing 443 ncRNAs in the yeast Saccharomyces cerevisiae as tools for ncRNA functional analysis. This resource includes deletions of small nuclear RNAs (snRNAs), transfer RNAs (tRNAs), small nucleolar RNAs (snoRNAs), and other annotated ncRNAs as well as the more recently identified stable unannotated transcripts (SUTs) and cryptic unstable transcripts (CUTs) whose functions are largely unknown. Specifically, deletions have been constructed for ncRNAs found in the intergenic regions, not overlapping genes or their promoters (i.e., at least 200 bp minimum distance from the closest gene start codon). The deletion strains carry molecular barcodes designed to be complementary with the protein gene deletion collection enabling parallel analysis experiments. These strains will be useful for the numerous genomic and molecular techniques that utilize deletion strains, including genome-wide phenotypic screens under different growth conditions, pooled chemogenomic screens with drugs or chemicals, synthetic genetic array analysis to uncover novel genetic interactions, and synthetic dosage lethality screens to analyze gene dosage. Overall, we created a valuable resource for the RNA community and for future ncRNA research. © 2017 Parker et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
The transcription factor p53: Not a repressor, solely an activator

PubMed Central

Fischer, Martin; Steiner, Lydia; Engeland, Kurt

2014-01-01

The predominant function of the tumor suppressor p53 is transcriptional regulation. It is generally accepted that p53-dependent transcriptional activation occurs by binding to a specific recognition site in promoters of target genes. Additionally, several models for p53-dependent transcriptional repression have been postulated. Here, we evaluate these models based on a computational meta-analysis of genome-wide data. Surprisingly, several major models of p53-dependent gene regulation are implausible. Meta-analysis of large-scale data is unable to confirm reports on directly repressed p53 target genes and falsifies models of direct repression. This notion is supported by experimental re-analysis of representative genes reported as directly repressed by p53. Therefore, p53 is not a direct repressor of transcription, but solely activates its target genes. Moreover, models based on interference of p53 with activating transcription factors as well as models based on the function of ncRNAs are also not supported by the meta-analysis. As an alternative to models of direct repression, the meta-analysis leads to the conclusion that p53 represses transcription indirectly by activation of the p53-p21-DREAM/RB pathway. PMID:25486564
An interactional network of genes involved in chitin synthesis in Saccharomyces cerevisiae.

PubMed

Lesage, Guillaume; Shapiro, Jesse; Specht, Charles A; Sdicu, Anne-Marie; Ménard, Patrice; Hussein, Shamiza; Tong, Amy Hin Yan; Boone, Charles; Bussey, Howard

2005-02-16

In S. cerevisiae the beta-1,4-linked N-acetylglucosamine polymer, chitin, is synthesized by a family of 3 specialized but interacting chitin synthases encoded by CHS1, CHS2 and CHS3. Chs2p makes chitin in the primary septum, while Chs3p makes chitin in the lateral cell wall and in the bud neck, and can partially compensate for the lack of Chs2p. Chs3p requires a pathway of Bni4p, Chs4p, Chs5p, Chs6p and Chs7p for its localization and activity. Chs1p is thought to have a septum repair function after cell separation. To further explore interactions in the chitin synthase family and to find processes buffering chitin synthesis, we compiled a genetic interaction network of genes showing synthetic interactions with CHS1, CHS3 and genes involved in Chs3p localization and function and made a phenotypic analysis of their mutants. Using deletion mutants in CHS1, CHS3, CHS4, CHS5, CHS6, CHS7 and BNI4 in a synthetic genetic array analysis we assembled a network of 316 interactions among 163 genes. The interaction network with CHS3, CHS4, CHS5, CHS6, CHS7 or BNI4 forms a dense neighborhood, with many genes functioning in cell wall assembly or polarized secretion. Chitin levels were altered in 54 of the mutants in individually deleted genes, indicating a functional relationship between them and chitin synthesis. 32 of these mutants triggered the chitin stress response, with elevated chitin levels and a dependence on CHS3. A large fraction of the CHS1-interaction set was distinct from that of the CHS3 network, indicating broad roles for Chs1p in buffering both Chs2p function and more global cell wall robustness. Based on their interaction patterns and chitin levels we group interacting mutants into functional categories. Genes interacting with CHS3 are involved in the amelioration of cell wall defects and in septum or bud neck chitin synthesis, and we newly assign a number of genes to these functions. 
Our genetic analysis of genes not interacting with CHS3 indicates expanded roles for Chs4p, Chs5p and Chs6p in secretory protein trafficking and of Bni4p in bud neck organization.
System analysis identifies distinct and common functional networks governed by transcription factor ASCL1, in glioma and small cell lung cancer.

PubMed

Donakonda, Sainitin; Sinha, Swati; Dighe, Shrinivas Nivrutti; Rao, Manchanahalli R Satyanarayana

2017-07-25

ASCL1 is a basic Helix-Loop-Helix transcription factor (TF), which is involved in various cellular processes like neuronal development and signaling pathways. Transcriptome profiling has shown that ASCL1 overexpression plays an important role in the development of glioma and Small Cell Lung Carcinoma (SCLC), but distinct and common molecular mechanisms regulated by ASCL1 in these cancers are unknown. In order to understand how it drives the cellular functional network in these two tumors, we generated a gene expression profile in a glioma cell line (U87MG) to identify ASCL1 gene targets by an siRNA silencing approach and then compared this with a publicly available dataset of similarly silenced SCLC (NCI-H1618 cells). We constructed TF-TF and gene-gene interactions, as well as protein interaction networks of ASCL1 regulated genes in glioma and SCLC cells. Detailed network analysis uncovered various biological processes governed by ASCL1 target genes in these two tumor cell lines. We find that novel ASCL1 functions related to mitosis and signaling pathways influencing development and tumor growth are affected in both glioma and SCLC cells. In addition, we also observed ASCL1 governed functional networks that are distinct to glioma and SCLC.
Rapid deletion plasmid construction methods for protoplast and Agrobacterium based fungal transformation systems

USDA-ARS's Scientific Manuscript database

Increasing availability of genomic data and sophistication of analytical methodology in fungi has elevated the need for functional genomics tools in these organisms. Gene deletion is a critical tool for functional analysis. The targeted deletion of genes requires both a suitable method for the trans...
Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.

PubMed

Xu, Lijing; Furlotte, Nicholas; Lin, Yunyue; Heinrich, Kevin; Berry, Michael W; George, Ebenezer O; Homayouni, Ramin

2011-04-14

High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.
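The abstract states that the literature cohesion p-value (LPv) is obtained with a Fisher's exact test but does not describe how the contingency table is built. As a hedged sketch only, the code below computes a one-sided Fisher's exact p-value for a generic 2x2 table; the interpretation of the cells (gene pairs above or below a similarity threshold, inside or outside the gene set) and the counts are assumptions for illustration, not the GCAT implementation.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null with fixed
    margins, of seeing a value >= a in the top-left cell."""
    row1 = a + b          # e.g. pairs within the gene set (assumed meaning)
    col1 = a + c          # e.g. pairs above the similarity threshold
    n = a + b + c + d     # all pairs considered
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical counts: 3 of 4 in-set pairs are "similar",
# versus 1 of 4 background pairs.
p_value = fisher_exact_greater(3, 1, 1, 3)  # 17/70, about 0.243
```

A small p-value here would mean the gene set contains more literature-similar pairs than expected by chance, which is the general idea behind calling a set functionally cohesive.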
Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

PubMed Central

Skrzypek, Marek S.; Hirschman, Jodi

2011-01-01

Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739

Molecular characterization and functional analysis of three pathogenesis-related cytochrome P450 genes from Bursaphelenchus xylophilus (Tylenchida: Aphelenchoidoidea).

PubMed

Xu, Xiao-Lu; Wu, Xiao-Qin; Ye, Jian-Ren; Huang, Lin

2015-03-06

Bursaphelenchus xylophilus, the causal agent of pine wilt disease, causes huge economic losses in pine forests. The high expression of cytochrome P450 genes in B. xylophilus during infection in P. thunbergii indicated that these genes had a certain relationship with the pathogenic process of B. xylophilus. Thus, we attempted to identify the molecular characterization and functions of cytochrome P450 genes in B. xylophilus. In this study, full-length cDNA of three cytochrome P450 genes, BxCYP33C9, BxCYP33C4 and BxCYP33D3 were first cloned from B. xylophilus using 3' and 5' RACE PCR amplification. Sequence analysis showed that all of them contained a highly-conserved cytochrome P450 domain. The characteristics of the three putative proteins were analyzed with bioinformatic methods. RNA interference (RNAi) was used to assess the functions of BxCYP33C9, BxCYP33C4 and BxCYP33D3. The results revealed that these cytochrome P450 genes were likely to be associated with the vitality, dispersal ability, reproduction, pathogenicity and pesticide metabolism of B. xylophilus. This discovery confirmed the molecular characterization and functions of three cytochrome P450 genes from B. xylophilus and provided fundamental information in elucidating the molecular interaction mechanism between B. xylophilus and its host plant.
Genomic analysis of expressed sequence tags in American black bear Ursus americanus

PubMed Central

2010-01-01

Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Characterization of a Crabs Claw Gene in basal eudicot species Epimedium sagittatum (Berberidaceae).

PubMed

Sun, Wei; Huang, Wenjun; Li, Zhineng; Lv, Haiyan; Huang, Hongwen; Wang, Ying

2013-01-08

The Crabs Claw (CRC) YABBY gene is required for regulating carpel development in angiosperms and has played an important role in nectary evolution during core eudicot speciation. The function or expression of CRC-like genes has been explored in two basal eudicots, Eschscholzia californica and Aquilegia formosa. To further investigate the function of CRC orthologous genes related to evolution of carpel and nectary development in basal eudicots, a CRC ortholog, EsCRC, was isolated and characterized from Epimedium sagittatum (Sieb. and Zucc.) Maxim. A phylogenetic analysis of EsCRC and previously identified CRC-like genes placed EsCRC within the basal eudicot lineage. Gene expression results suggest that EsCRC is involved in the development of sepals and carpels, but not nectaries. Phenotypic complementation of the Arabidopsis mutant crc-1 was achieved by constitutive expression of EsCRC. In addition, over-expression of EsCRC in Arabidopsis and tobacco gave rise to abaxially curled leaves. Transgenic results together with the gene expression analysis suggest that EsCRC may maintain a conserved function in carpel development and also play a novel role related to sepal formation. Absence of EsCRC and ElCRC expression in nectaries further indicates that nectary development in non-core eudicots is unrelated to expression of CRC-like genes.
Characterization of a Crabs Claw Gene in Basal Eudicot Species Epimedium sagittatum (Berberidaceae)

PubMed Central

Sun, Wei; Huang, Wenjun; Li, Zhineng; Lv, Haiyan; Huang, Hongwen; Wang, Ying

2013-01-01

The Crabs Claw (CRC) YABBY gene is required for regulating carpel development in angiosperms and has played an important role in nectary evolution during core eudicot speciation. The function or expression of CRC-like genes has been explored in two basal eudicots, Eschscholzia californica and Aquilegia formosa. To further investigate the function of CRC orthologous genes related to evolution of carpel and nectary development in basal eudicots, a CRC ortholog, EsCRC, was isolated and characterized from Epimedium sagittatum (Sieb. and Zucc.) Maxim. A phylogenetic analysis of EsCRC and previously identified CRC-like genes placed EsCRC within the basal eudicot lineage. Gene expression results suggest that EsCRC is involved in the development of sepals and carpels, but not nectaries. Phenotypic complementation of the Arabidopsis mutant crc-1 was achieved by constitutive expression of EsCRC. In addition, over-expression of EsCRC in Arabidopsis and tobacco gave rise to abaxially curled leaves. Transgenic results together with the gene expression analysis suggest that EsCRC may maintain a conserved function in carpel development and also play a novel role related to sepal formation. Absence of EsCRC and ElCRC expression in nectaries further indicates that nectary development in non-core eudicots is unrelated to expression of CRC-like genes. PMID:23299438
Identification of aluminum-regulated genes by cDNA-AFLP analysis of roots in two contrasting genotypes of highbush blueberry (Vaccinium corymbosum L.).

PubMed

Inostroza-Blancheteau, Claudio; Aquea, Felipe; Reyes-Díaz, Marjorie; Alberdi, Miren; Arce-Johnson, Patricio

2011-09-01

To investigate the molecular mechanisms of Al(3+)-stress in blueberry, a cDNA-amplified fragment length polymorphism (cDNA-AFLP) analysis was employed to identify Al-regulated genes in roots of contrasting genotypes of highbush blueberry (Brigitta, Al(3+)-resistant and Bluegold, Al(3+)-sensitive). Plants grown in hydroponic culture were treated with 0 and 100 μM Al(3+) and collected at different times over 48 h. Seventy transcript-derived fragments (TDFs) were identified as being Al(3+) responsive, 31 of which showed significant homology to genes with known or putative functions. Twelve TDFs were homologous to uncharacterized genes and 27 did not have significant matches. The expression pattern of several of the genes with known functions in other species was confirmed by quantitative relative real-time RT-PCR. Twelve genes of known or putative function were related to cellular metabolism, nine associated to stress responses and other transcription and transport facilitation processes. Genes involved in signal transduction, photosynthetic and energy processes were also identified, suggesting that a multitude of processes are implicated in the Al(3+)-stress response as reported previously for other species. The Al(3+)-stress response genes identified in this study could be involved in Al(3+)-resistance in woody plants.
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

PubMed

Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

2010-03-26

Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
Common variants in Mendelian kidney disease genes and their association with renal function.

PubMed

Parsa, Afshin; Fuchsberger, Christian; Köttgen, Anna; O'Seaghdha, Conall M; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Hofer, Edith; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H-Erich; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; 
Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; van Duijn, Cornelia M; Borecki, Ingrid; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Bochud, Murielle; Heid, Iris M; Siscovick, David S; Fox, Caroline S; Kao, W Linda; Böger, Carsten A

2013-12-01

Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research.
Genome-wide analysis of Glycine soja ubiquitin (UBQ) genes and functional analysis of GsUBQ10 in response to alkaline stress.

PubMed

Chen, Chao; Chen, Ranran; Wu, Shengyang; Zhu, Dan; Sun, Xiaoli; Liu, Beidong; Li, Qiang; Zhu, Yanming

2018-03-26

Ubiquitin is a highly conserved protein with multiple essential regulatory functions through the ubiquitin-proteasome system. Although its functions in the ubiquitin-mediated protein degradation pathway are very well characterized, the functions of ubiquitin genes in regulating the alkaline stress response are not fully established. In this study, we identified 12 potential UBQ genes in the Glycine soja genome, and analyzed their evolutionary relationships, conserved domains and promoter cis-elements. We also explored the expression profiles of G. soja UBQ genes under alkaline stress, based on transcriptome sequencing. We found that the expression of GsUBQ10 was significantly induced by alkaline stress, and the function of GsUBQ10 was characterized using overexpression transgenic alfalfa (Medicago sativa). Our results suggested that GsUBQ10 overexpression significantly improved alkaline tolerance in alfalfa. The GsUBQ10 transgenic lines showed lower relative membrane permeability, lower malondialdehyde content and higher catalase activity than the wild-type plants. This indicates that GsUBQ10 is involved in regulating reactive oxygen species accumulation under alkaline stress. Taken together, we identified a ubiquitin gene, GsUBQ10, from G. soja that plays a positive role in responses to alkaline stress in alfalfa.
Alu Elements as Novel Regulators of Gene Expression in Type 1 Diabetes Susceptibility Genes?

PubMed

Kaur, Simranjeet; Pociot, Flemming

2015-07-13

Despite numerous studies implicating Alu repeat elements in various diseases, there is sparse information available with respect to the potential functional and biological roles of the repeat elements in Type 1 diabetes (T1D). Therefore, we performed a genome-wide sequence analysis of T1D candidate genes to identify embedded Alu elements within these genes. We observed significant enrichment of Alu elements within the T1D genes (p-value < 10e-16), which highlights their importance in T1D. Functional annotation of T1D genes harboring Alus revealed significant enrichment for immune-mediated processes (p-value < 10e-6). We also identified eight T1D genes harboring inverted Alus (IRAlus) within their 3' untranslated regions (UTRs) that are known to regulate the expression of host mRNAs by generating double stranded RNA duplexes. Our in silico analysis predicted the formation of duplex structures by IRAlus within the 3'UTRs of T1D genes. We propose that IRAlus might be involved in regulating the expression levels of the host T1D genes.
Altered Pathway Analyzer: A gene expression dataset analysis tool for identification and prioritization of differentially regulated and network rewired pathways

PubMed Central

Kaushik, Abhinav; Ali, Shakir; Gupta, Dinesh

2017-01-01

Gene connection rewiring is an essential feature of gene network dynamics. Apart from its normal functional role, it may also lead to dysregulated functional states by disturbing pathway homeostasis. Very few computational tools measure rewiring within gene co-expression and its corresponding regulatory networks in order to identify and prioritize altered pathways which may or may not be differentially regulated. We have developed Altered Pathway Analyzer (APA), a microarray dataset analysis tool for identification and prioritization of altered pathways, including those which are differentially regulated by TFs, by quantifying rewired sub-network topology. Moreover, APA also helps in re-prioritization of APA shortlisted altered pathways enriched with context-specific genes. We performed APA analysis of simulated datasets and p53 status NCI-60 cell line microarray data to demonstrate potential of APA for identification of several case-specific altered pathways. APA analysis reveals several altered pathways not detected by other tools evaluated by us. APA analysis of unrelated prostate cancer datasets identifies sample-specific as well as conserved altered biological processes, mainly associated with lipid metabolism, cellular differentiation and proliferation. APA is designed as a cross platform tool which may be transparently customized to perform pathway analysis in different gene expression datasets. APA is freely available at http://bioinfo.icgeb.res.in/APA. PMID:28084397
Functional Abstraction as a Method to Discover Knowledge in Gene Ontologies

PubMed Central

Ultsch, Alfred; Lötsch, Jörn

2014-01-01

Computational analyses of functions of gene sets obtained in microarray analyses or by topical database searches are increasingly important in biology. To understand their functions, the sets are usually mapped to Gene Ontology knowledge bases by means of over-representation analysis (ORA). Its result represents the specific knowledge of the functionality of the gene set. However, the specific ontology typically consists of many terms and relationships, hindering the understanding of the ‘main story’. We developed a methodology to identify a comprehensibly small number of GO terms as “headlines” of the specific ontology, making it possible to understand all central aspects of the roles of the involved genes. The Functional Abstraction method finds a set of headlines that is specific enough to cover all details of a specific ontology and is abstract enough for human comprehension. This method outperforms the classical approaches to ORA abstraction and, by focusing on information rather than decorrelation of GO terms, directly targets human comprehension. Functional abstraction provides, with a maximum of certainty, information value, coverage and conciseness, a representation of the biological functions in which a gene set plays a role. This is the necessary means to interpret complex Gene Ontology results, thus strengthening the role of functional genomics in biomarker and drug discovery. PMID:24587272
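As a rough illustration of the over-representation analysis (ORA) step mentioned in this abstract, the sketch below computes the upper-tail hypergeometric p-value that is commonly used for ORA. The abstract does not say which test the authors used, so the choice of test, the function name `ora_p_value` and the example counts are all assumptions made for illustration only.

```python
from math import comb


def ora_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated to the GO term
    sample:     size of the gene set under study
    hits:       genes in the gene set annotated to the GO term
    (Hypothetical helper; not taken from the abstract.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Illustrative counts only: 40 of 200 study genes carry a term
# that annotates 500 of 20000 background genes.
p = ora_p_value(population=20000, annotated=500, sample=200, hits=40)
print(f"p = {p:.3g}")
```

In practice one such p-value is computed per GO term and then corrected for multiple testing; the Functional Abstraction method described above operates on the resulting set of significant terms.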
Human Intellectual Disability Genes Form Conserved Functional Modules in Drosophila

PubMed Central

Oortveld, Merel A. W.; Keerthikumar, Shivakumar; Oti, Martin; Nijhof, Bonnie; Fernandes, Ana Clara; Kochinke, Korinna; Castells-Nobau, Anna; van Engelen, Eva; Ellenkamp, Thijs; Eshuis, Lilian; Galy, Anne; van Bokhoven, Hans; Habermann, Bianca; Brunner, Han G.; Zweier, Christiane; Verstreken, Patrik; Huynen, Martijn A.; Schenck, Annette

2013-01-01

Intellectual Disability (ID) disorders, defined by an IQ below 70, are genetically and phenotypically highly heterogeneous. Identification of common molecular pathways underlying these disorders is crucial for understanding the molecular basis of cognition and for the development of therapeutic intervention strategies. To systematically establish their functional connectivity, we used transgenic RNAi to target 270 ID gene orthologs in the Drosophila eye. Assessment of neuronal function in behavioral and electrophysiological assays and multiparametric morphological analysis identified phenotypes associated with knockdown of 180 ID gene orthologs. Most of these genotype-phenotype associations were novel. For example, we uncovered 16 genes that are required for basal neurotransmission and have not previously been implicated in this process in any system or organism. ID gene orthologs with morphological eye phenotypes, in contrast to genes without phenotypes, are relatively highly expressed in the human nervous system and are enriched for neuronal functions, suggesting that eye phenotyping can distinguish different classes of ID genes. Indeed, grouping genes by Drosophila phenotype uncovered 26 connected functional modules. Novel links between ID genes successfully predicted that MYCN, PIGV and UPF3B regulate synapse development. Drosophila phenotype groups show, in addition to ID, significant phenotypic similarity also in humans, indicating that functional modules are conserved. The combined data indicate that ID disorders, despite their extreme genetic diversity, are caused by disruption of a limited number of highly connected functional modules. PMID:24204314
Human intellectual disability genes form conserved functional modules in Drosophila.

PubMed

Oortveld, Merel A W; Keerthikumar, Shivakumar; Oti, Martin; Nijhof, Bonnie; Fernandes, Ana Clara; Kochinke, Korinna; Castells-Nobau, Anna; van Engelen, Eva; Ellenkamp, Thijs; Eshuis, Lilian; Galy, Anne; van Bokhoven, Hans; Habermann, Bianca; Brunner, Han G; Zweier, Christiane; Verstreken, Patrik; Huynen, Martijn A; Schenck, Annette

2013-10-01

Intellectual Disability (ID) disorders, defined by an IQ below 70, are genetically and phenotypically highly heterogeneous. Identification of common molecular pathways underlying these disorders is crucial for understanding the molecular basis of cognition and for the development of therapeutic intervention strategies. To systematically establish their functional connectivity, we used transgenic RNAi to target 270 ID gene orthologs in the Drosophila eye. Assessment of neuronal function in behavioral and electrophysiological assays and multiparametric morphological analysis identified phenotypes associated with knockdown of 180 ID gene orthologs. Most of these genotype-phenotype associations were novel. For example, we uncovered 16 genes that are required for basal neurotransmission and have not previously been implicated in this process in any system or organism. ID gene orthologs with morphological eye phenotypes, in contrast to genes without phenotypes, are relatively highly expressed in the human nervous system and are enriched for neuronal functions, suggesting that eye phenotyping can distinguish different classes of ID genes. Indeed, grouping genes by Drosophila phenotype uncovered 26 connected functional modules. Novel links between ID genes successfully predicted that MYCN, PIGV and UPF3B regulate synapse development. Drosophila phenotype groups show, in addition to ID, significant phenotypic similarity also in humans, indicating that functional modules are conserved. The combined data indicate that ID disorders, despite their extreme genetic diversity, are caused by disruption of a limited number of highly connected functional modules.
EPConDB: a web resource for gene expression related to pancreatic development, beta-cell function and diabetes.

PubMed

Mazzarelli, Joan M; Brestelli, John; Gorski, Regina K; Liu, Junmin; Manduchi, Elisabetta; Pinney, Deborah F; Schug, Jonathan; White, Peter; Kaestner, Klaus H; Stoeckert, Christian J

2007-01-01

EPConDB (http://www.cbil.upenn.edu/EPConDB) is a public web site that supports research in diabetes, pancreatic development and beta-cell function by providing information about genes expressed in cells of the pancreas. EPConDB displays expression profiles for individual genes and information about transcripts, promoter elements and transcription factor binding sites. Gene expression results are obtained from studies examining tissue expression, pancreatic development and growth, differentiation of insulin-producing cells, islet or beta-cell injury, and genetic models of impaired beta-cell function. The expression datasets are derived using different microarray platforms, including the BCBC PancChips and Affymetrix gene expression arrays. Other datasets include semi-quantitative RT-PCR and MPSS expression studies. For selected microarray studies, lists of differentially expressed genes, derived from PaGE analysis, are displayed on the site. EPConDB provides database queries and tools to examine the relationship between a gene, its transcriptional regulation, protein function and expression in pancreatic tissues.
Linking Advanced Visualization and MATLAB for the Analysis of 3D Gene Expression Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruebel, Oliver; Keranen, Soile V.E.; Biggin, Mark

Three-dimensional gene expression PointCloud data generated by the Berkeley Drosophila Transcription Network Project (BDTNP) provides quantitative information about the spatial and temporal expression of genes in early Drosophila embryos at cellular resolution. The BDTNP team visualizes and analyzes PointCloud data using the software application PointCloudXplore (PCX). To maximize the impact of novel, complex data sets, such as PointClouds, the data needs to be accessible to biologists and comprehensible to developers of analysis functions. We address this challenge by linking PCX and Matlab via a dedicated interface, thereby providing biologists seamless access to advanced data analysis functions and giving bioinformatics researchers the opportunity to integrate their analysis directly into the visualization application. To demonstrate the usefulness of this approach, we computationally model parts of the expression pattern of the gene even skipped using a genetic algorithm implemented in Matlab and integrated into PCX via our Matlab interface.
Metagenomes reveal microbial structures, functional potentials, and biofouling-related genes in a membrane bioreactor.

PubMed

Ma, Jinxing; Wang, Zhiwei; Li, Huan; Park, Hee-Deung; Wu, Zhichao

2016-06-01

Metagenomic sequencing was used to investigate the microbial structures, functional potentials, and biofouling-related genes in a membrane bioreactor (MBR). The results showed that the microbial community in the MBR was highly diverse. Notably, function analysis of the dominant genera indicated that common genes from different phylotypes were identified for important functional potentials with the observation of variation of abundances of genes in a certain taxon (e.g., Dechloromonas). Despite maintaining similar metabolic functional potentials with a parallel full-scale conventional activated sludge (CAS) system due to treating the identical wastewater, the MBR had more abundant nitrification-related bacteria and coding genes of ammonia monooxygenase, which could well explain its excellent ammonia removal in the low-temperature period. Furthermore, according to quantification of the genes involved in exopolysaccharide and extracellular polymeric substance (EPS) protein metabolism, the MBR did not show a much different potential in producing EPS compared to the CAS system, and bacteria from the membrane biofilm had lower abundances of genes associated with EPS biosynthesis and transport compared to the activated sludge in the MBR.
Gene set analysis of purine and pyrimidine antimetabolites cancer therapies.

PubMed

Fridley, Brooke L; Batzler, Anthony; Li, Liang; Li, Fang; Matimba, Alice; Jenkins, Gregory D; Ji, Yuan; Wang, Liewei; Weinshilboum, Richard M

2011-11-01

Responses to therapies, either with regard to toxicities or efficacy, are expected to involve complex relationships of gene products within the same molecular pathway or functional gene set. Therefore, pathways or gene sets, as opposed to single genes, may better reflect the true underlying biology and may be more appropriate units for analysis of pharmacogenomic studies. Application of such methods to pharmacogenomic studies may enable the detection of more subtle effects of multiple genes in the same pathway that may be missed by assessing each gene individually. A gene set analysis of 3821 gene sets is presented assessing the association between basal messenger RNA expression and drug cytotoxicity using ethnically defined human lymphoblastoid cell lines for two classes of drugs: pyrimidines [gemcitabine (dFdC) and arabinoside] and purines [6-thioguanine and 6-mercaptopurine]. The gene set nucleoside-diphosphatase activity was found to be significantly associated with both dFdC and arabinoside, whereas gene set γ-aminobutyric acid catabolic process was associated with dFdC and 6-thioguanine. These gene sets were significantly associated with the phenotype even after adjusting for multiple testing. In addition, five associated gene sets were found in common between the pyrimidines and two gene sets for the purines (3',5'-cyclic-AMP phosphodiesterase activity and γ-aminobutyric acid catabolic process) with a P value of less than 0.0001. Functional validation was attempted with four genes each in gene sets for thiopurine and pyrimidine antimetabolites. All four genes selected from the pyrimidine gene sets (PSME3, CANT1, ENTPD6, ADRM1) were validated, but only one (PDE4D) was validated for the thiopurine gene sets. In summary, results from the gene set analysis of pyrimidine and purine therapies, used often in the treatment of various cancers, provide novel insight into the relationship between genomic variation and drug response.
Employing conservation of co-expression to improve functional inference

PubMed Central

Daub, Carsten O; Sonnhammer, Erik LL

2008-01-01

Background: Observing co-expression between genes suggests that they are functionally coupled. Co-expression of orthologous gene pairs across species may improve function prediction beyond the level achieved in a single species. Results: We used orthology between genes of the three different species S. cerevisiae, D. melanogaster, and C. elegans to combine co-expression across two species at a time. This led to increased function prediction accuracy when we incorporated expression data from either of the other two species, and accuracy increased even further when conservation across both of the other two species was considered at the same time. Employing the conservation across species to incorporate abundant model organism data for the prediction of protein interactions in poorly characterized species constitutes a very powerful annotation method. Conclusion: To be able to employ the most suitable co-expression distance measure for our analysis, we evaluated the ability of four popular gene co-expression distance measures to detect biologically relevant interactions between pairs of genes. For the expression datasets employed in our co-expression conservation analysis above, we used the GO and the KEGG PATHWAY databases as gold standards. While the differences between distance measures were small, Spearman correlation proved to give the most robust results. PMID:18808668
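The Spearman-based co-expression measure named in this abstract can be sketched as follows. This is a minimal standard-library version, assuming each gene is represented by a list of expression values across the same samples; the function names and the toy profiles are hypothetical, and the abstract does not describe how the authors converted correlations to distances or combined them across species.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy expression profiles (made-up numbers) for two genes over five samples.
gene_a = [0.1, 0.4, 0.35, 2.0, 5.1]
gene_b = [1.0, 1.9, 1.5, 3.3, 9.8]
print(spearman(gene_a, gene_b))
```

Because it works on ranks, the measure is insensitive to monotone rescaling of the expression values, which is one plausible reason for the robustness reported above.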
RefEx, a reference gene expression dataset as a web tool for the functional analysis of genes.

PubMed

Ono, Hiromasa; Ogasawara, Osamu; Okubo, Kosaku; Bono, Hidemasa

2017-08-29

Gene expression data are exponentially accumulating; thus, the functional annotation of such sequence data from metadata is urgently required. However, life scientists have difficulty utilizing the available data due to its sheer magnitude and complicated access. We have developed a web tool for browsing reference gene expression pattern of mammalian tissues and cell lines measured using different methods, which should facilitate the reuse of the precious data archived in several public databases. The web tool is called Reference Expression dataset (RefEx), and RefEx allows users to search by the gene name, various types of IDs, chromosomal regions in genetic maps, gene family based on InterPro, gene expression patterns, or biological categories based on Gene Ontology. RefEx also provides information about genes with tissue-specific expression, and the relative gene expression values are shown as choropleth maps on 3D human body images from BodyParts3D. Combined with the newly incorporated Functional Annotation of Mammals (FANTOM) dataset, RefEx provides insight regarding the functional interpretation of unfamiliar genes. RefEx is publicly available at http://refex.dbcls.jp/.
RefEx, a reference gene expression dataset as a web tool for the functional analysis of genes

PubMed Central

Ono, Hiromasa; Ogasawara, Osamu; Okubo, Kosaku; Bono, Hidemasa

2017-01-01

Gene expression data are exponentially accumulating; thus, the functional annotation of such sequence data from metadata is urgently required. However, life scientists have difficulty utilizing the available data due to its sheer magnitude and complicated access. We have developed a web tool for browsing reference gene expression pattern of mammalian tissues and cell lines measured using different methods, which should facilitate the reuse of the precious data archived in several public databases. The web tool is called Reference Expression dataset (RefEx), and RefEx allows users to search by the gene name, various types of IDs, chromosomal regions in genetic maps, gene family based on InterPro, gene expression patterns, or biological categories based on Gene Ontology. RefEx also provides information about genes with tissue-specific expression, and the relative gene expression values are shown as choropleth maps on 3D human body images from BodyParts3D. Combined with the newly incorporated Functional Annotation of Mammals (FANTOM) dataset, RefEx provides insight regarding the functional interpretation of unfamiliar genes. RefEx is publicly available at http://refex.dbcls.jp/. PMID:28850115

A Role for MORE AXILLARY GROWTH1 (MAX1) in Evolutionary Diversity in Strigolactone Signaling Upstream of MAX2

PubMed Central

Challis, Richard J.; Hepworth, Jo; Mouchel, Céline; Waites, Richard; Leyser, Ottoline

2013-01-01

Strigolactones (SLs) are carotenoid-derived phytohormones with diverse roles. They are secreted from roots as attractants for arbuscular mycorrhizal fungi and have a wide range of endogenous functions, such as regulation of root and shoot system architecture. To date, six genes associated with SL synthesis and signaling have been molecularly identified using the shoot-branching mutants more axillary growth (max) of Arabidopsis (Arabidopsis thaliana) and dwarf (d) of rice (Oryza sativa). Here, we present a phylogenetic analysis of the MAX/D genes to clarify the relationships of each gene with its wider family and to allow the correlation of events in the evolution of the genes with the evolution of SL function. Our analysis suggests that the notion of a distinct SL pathway is inappropriate. Instead, there may be a diversity of SL-like compounds, the response to which requires a D14/D14-like protein. This ancestral system could have been refined toward distinct ligand-specific pathways channeled through MAX2, the most downstream known component of SL signaling. MAX2 is tightly conserved among land plants and is more diverged from its nearest sister clade than any other SL-related gene, suggesting a pivotal role in the evolution of SL signaling. By contrast, the evidence suggests much greater flexibility upstream of MAX2. The MAX1 gene is a particularly strong candidate for contributing to diversification of inputs upstream of MAX2. Our functional analysis of the MAX1 family demonstrates the early origin of its catalytic function and both redundancy and functional diversification associated with its duplication in angiosperm lineages. PMID:23424248
Comparative fecal metagenomics unveils unique functional capacity of the swine gut

PubMed Central

2011-01-01

Background: Uncovering the taxonomic composition and functional capacity within the swine gut microbial consortia is of great importance to animal physiology and health as well as to food and water safety due to the presence of human pathogens in pig feces. Nonetheless, limited information on the functional diversity of the swine gut microbiome is available. Results: Analysis of 637,722 pyrosequencing reads (130 megabases) generated from Yorkshire pig fecal DNA extracts was performed to help better understand the microbial diversity and largely unknown functional capacity of the swine gut microbiome. Swine fecal metagenomic sequences were annotated using both MG-RAST and JGI IMG/M-ER pipelines. Taxonomic analysis of metagenomic reads indicated that swine fecal microbiomes were dominated by Firmicutes and Bacteroidetes phyla. At a finer phylogenetic resolution, Prevotella spp. dominated the swine fecal metagenome, while some genes associated with Treponema and Anaerovibrio species were found exclusively within the pig fecal metagenomic sequences analyzed. Functional analysis revealed that carbohydrate metabolism was the most abundant SEED subsystem, representing 13% of the swine metagenome. Genes associated with stress, virulence, cell wall and cell capsule were also abundant. Virulence factors associated with antibiotic resistance genes with highest sequence homology to genes in Bacteroidetes, Clostridia, and Methanosarcina were numerous within the gene families unique to the swine fecal metagenomes. Other abundant proteins unique to the distal swine gut shared high sequence homology to putative carbohydrate membrane transporters. Conclusions: The results from this metagenomic survey demonstrated the presence of genes associated with resistance to antibiotics and carbohydrate metabolism, suggesting that the swine gut microbiome may be shaped by husbandry practices. PMID:21575148
[Gene deletion and functional analysis of the heptyl glycosyltransferase (waaF) gene in Vibrio parahemolyticus O-antigen cluster].

PubMed

Zhao, Feng; Meng, Songsong; Zhou, Deqing

2016-02-04

This study aimed to construct a deletion mutant of the heptyl glycosyltransferase II gene (waaF) of Vibrio parahaemolyticus and to explore the function of waaF in this species. The waaF deletion mutant was constructed from clinical isolates by chitin-based transformation, and its growth rate, morphology and serotype were then characterized. Complemented strains carrying waaF from different sources (O3, O5 and O10) were constructed by conjugative transfer from E. coli S17λpir into Vibrio parahaemolyticus, and the function of waaF was further verified by serotyping. The waaF deletion mutant was successfully constructed and grew normally. Its growth rate and morphology were similar to those of the wild-type (WT) strains, but the mutant did not agglutinate with O antisera. Strains complemented with waaF from the O3 and O5 sources agglutinated with O antisera, whereas the strain complemented with waaF from the O10 source did not. The waaF gene is related to O-antigen synthesis and is a key gene of the O-antigen synthesis pathway in Vibrio parahaemolyticus. The waaF genes from different sources do not all function in the same way.
In Silico Prediction and Validation of Gfap as an miR-3099 Target in Mouse Brain.

PubMed

Abidin, Shahidee Zainal; Leong, Jia-Wen; Mahmoudi, Marzieh; Nordin, Norshariza; Abdullah, Syahril; Cheah, Pike-See; Ling, King-Hwa

2017-08-01

MicroRNAs are small non-coding RNAs that play crucial roles in the regulation of gene expression and protein synthesis during brain development. MiR-3099 is highly expressed throughout embryogenesis, especially in the developing central nervous system. Moreover, miR-3099 is also expressed at a higher level in differentiating neurons in vitro, suggesting that it is a potential regulator during neuronal cell development. This study aimed to predict the target genes of miR-3099 via in-silico analysis using four independent prediction algorithms (miRDB, miRanda, TargetScan, and DIANA-micro-T-CDS) with emphasis on target genes related to brain development and function. Based on the analysis, a total of 3,174 miR-3099 target genes were predicted. Those predicted by at least three algorithms (324 genes) were subjected to DAVID bioinformatics analysis to understand their overall functional themes and representation. The analysis revealed that nearly 70% of the target genes were expressed in the nervous system and a significant proportion were associated with transcriptional regulation and protein ubiquitination mechanisms. Comparison of in situ hybridization (ISH) expression patterns of miR-3099 in both published and in-house-generated ISH sections with the ISH sections of target genes from the Allen Brain Atlas identified 7 target genes (Dnmt3a, Gabpa, Gfap, Itga4, Lxn, Smad7, and Tbx18) having expression patterns complementary to miR-3099 in the developing and adult mouse brain samples. Of these, we validated Gfap as a direct downstream target of miR-3099 using the luciferase reporter gene system. In conclusion, we report the successful prediction and validation of Gfap as an miR-3099 target gene using a combination of bioinformatics resources with enrichment of annotations based on functional ontologies and a spatio-temporal expression dataset.
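The "predicted by at least three algorithms" filter described in this abstract amounts to a simple vote across the tools' prediction lists. The sketch below shows one way to do that; the tool names come from the abstract, but the function name `consensus_targets` and the gene identifiers are placeholders, not real predictions.

```python
from collections import Counter


def consensus_targets(predictions, min_support=3):
    """Return genes predicted by at least `min_support` algorithms.

    predictions: dict mapping algorithm name -> iterable of gene symbols.
    (Hypothetical helper written for illustration.)
    """
    counts = Counter(
        gene
        for targets in predictions.values()
        for gene in set(targets)  # count each gene once per algorithm
    )
    return {gene for gene, n in counts.items() if n >= min_support}


# Placeholder gene names; these are NOT actual miR-3099 predictions.
predictions = {
    "miRDB": ["geneA", "geneB", "geneC"],
    "miRanda": ["geneA", "geneB", "geneD"],
    "TargetScan": ["geneA", "geneC", "geneD"],
    "DIANA-micro-T-CDS": ["geneA", "geneB"],
}
print(sorted(consensus_targets(predictions)))
```

Requiring agreement between several tools trades sensitivity for specificity, which fits the reduction from 3,174 predicted targets to 324 reported in the abstract.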
Genome-Wide Characterization of bHLH Genes in Grape and Analysis of their Potential Relevance to Abiotic Stress Tolerance and Secondary Metabolite Biosynthesis

PubMed Central

Wang, Pengfei; Su, Ling; Gao, Huanhuan; Jiang, Xilong; Wu, Xinying; Li, Yi; Zhang, Qianqian; Wang, Yongmei; Ren, Fengshan

2018-01-01

Basic helix-loop-helix (bHLH) transcription factors are involved in many abiotic stress responses as well as flavonol and anthocyanin biosynthesis. In grapes (Vitis vinifera L.), flavonols including anthocyanins and condensed tannins are most abundant in the skins of the berries. Flavonols are important phytochemicals for viticulture and enology, but grape bHLH genes have rarely been examined. We identified 94 grape bHLH genes in a genome-wide analysis and performed Nr and GO function analyses for these genes. Phylogenetic analyses placed the genes into 15 clades, with some remaining orphans. Forty-one duplicate gene pairs and nine triplicate gene groups were found in the grape bHLH gene family, and all of them underwent purifying selection. Twenty-two grape bHLH genes could be induced by PEG treatment and 17 could be induced by cold stress treatment, including a homolog of MYC2, VvbHLH007. Based on the GO or Nr function annotations, we found three other genes that are potentially related to anthocyanin or flavonol biosynthesis: VvbHLH003, VvbHLH007, and VvbHLH010. We also performed a cis-acting regulatory element analysis on some genes involved in flavonoid or anthocyanin biosynthesis and our results showed that most of these gene promoters contained G-box or E-box elements that could be recognized by bHLH family members. PMID:29449854
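The cis-element analysis mentioned above comes down to scanning promoter sequences for short motifs. The sketch below assumes the canonical consensus sequences (E-box CANNTG, G-box CACGTG); the abstract does not say which tool or motif definitions the authors used, and the function name `find_motifs` and the toy sequence are hypothetical.

```python
import re

# Canonical consensus sequences (an assumption; the abstract gives no definitions).
MOTIFS = {
    "E-box": "CA[ACGT]{2}TG",  # CANNTG
    "G-box": "CACGTG",         # a specific E-box variant
}


def find_motifs(promoter):
    """Return 0-based start positions of each motif in a promoter sequence."""
    seq = promoter.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping sites are also reported.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


toy_promoter = "ttgaCACGTGaacCAATTGgg"  # made-up sequence for illustration
print(find_motifs(toy_promoter))
```

Both consensus sequences equal their own reverse complement, so scanning one strand is enough for these two motifs.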
An integrative approach to inferring biologically meaningful gene modules

PubMed Central

2011-01-01

Background The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. Results We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level. PMID:21791051
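One way to picture the integration step is as a weighted average of pairwise similarity matrices from each data source, with the combined matrix then passed to a clustering routine. The sketch below is an assumption-laden illustration, not the published SSIM method: the weighting scheme, the function name `integrate_similarities`, and the toy matrices are all hypothetical.

```python
def integrate_similarities(matrices, weights):
    """Weighted average of same-sized square pairwise similarity matrices."""
    total = sum(weights)
    n = len(matrices[0])
    return [
        [
            sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy 3-gene similarity matrices (made-up values, scaled to [0, 1]).
expression = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
interaction = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
go_semantic = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.1], [0.2, 0.1, 1.0]]

combined = integrate_similarities(
    [expression, interaction, go_semantic], weights=[1.0, 1.0, 1.0]
)
print(combined[0][1])  # combined similarity between gene 0 and gene 1
```

A matrix like `combined` could then be handed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`.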
Identification of learning and memory genes in canine; promoter investigation and determining the selective pressure.

PubMed

Seifi Moroudi, Reihane; Masoudi, Ali Akbar; Vaez Torshizi, Rasoul; Zandi, Mohammad

2014-12-01

Trainability is an important behavior in dogs and is affected by learning and memory genes. These genes have not yet been identified in dogs. In the current research, candidate genes were identified in animal models by mining biological data and the scientific literature. The corresponding proteins were obtained from the UniProt database for dogs and humans. Because not all homologous proteins perform similar functions, these proteins were compared in terms of protein families, domains, biological processes, molecular functions, and cellular location of metabolic pathways using the InterPro, KEGG, QuickGO and PSORT databases. The results showed that some of these proteins have the same function in the rat or mouse, dog, and human. It is anticipated that these proteins may be involved in learning and memory in dogs. Next, the expression pattern of the identified genes was investigated in the dog hippocampus using existing information in GEO Profiles. The results showed that the BDNF, TAC1 and CCK genes are expressed in the dog hippocampus; therefore, these genes could be strong candidates associated with learning and memory in dogs. Subsequently, because of the importance of promoter regions in gene function, the promoters of these three genes were investigated. Promoter analysis indicated that the HNF-4 site of the BDNF gene and the transcription start site of the CCK gene are exposed to methylation. Phylogenetic analysis of the protein sequences showed high similarity for each of these three genes among the studied species. The dN/dS ratios for the BDNF, TAC1 and CCK genes indicate purifying selection during the evolution of these genes.
Genome-wide survey and expression analysis of F-box genes in chickpea.

PubMed

Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata

2015-02-13

The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea and were classified into 10 subfamilies based on their C-terminal domain structures. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes, and an investigation of duplication events revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. The greatest degree of synteny was observed with soybean, followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues and under abiotic stress conditions, using the available chickpea transcriptome data, revealed differential expression patterns, with several F-box genes expressed specifically in each tissue; a few of these were validated by quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.
Genome-wide characterization of GRAS family genes in Medicago truncatula reveals their evolutionary dynamics and functional diversification

PubMed Central

Zhang, Hailing; Cao, Yingping; Shang, Chen; Li, Jikai; Wang, Jianli; Wu, Zhenying; Ma, Lichao; Qi, Tianxiong; Fu, Chunxiang; Hu, Baozhong

2017-01-01

The GRAS gene family is a large plant-specific family of transcription factors that are involved in diverse processes during plant development. Medicago truncatula is an ideal model plant for genetic research in legumes, and specifically for studying nodulation, which is crucial for nitrogen fixation. In this study, 59 MtGRAS genes were identified and classified into eight distinct subgroups based on phylogenetic relationships. Motifs located in the C-termini were conserved across the subgroups, while motifs in the N-termini were subfamily specific. Gene duplication was the main evolutionary force for MtGRAS expansion, especially proliferation of the LISCL subgroup. Seventeen duplicated genes showed strong effects of purifying selection and diverse expression patterns, highlighting their functional importance and diversification after duplication. Thirty MtGRAS genes, including NSP1 and NSP2, were preferentially expressed in nodules, indicating possible roles in the process of nodulation. A transcriptome study, combined with gene expression analysis under different stress conditions, suggested potential functions of MtGRAS genes in various biological pathways and stress responses. Taken together, these comprehensive analyses provide basic information for understanding the potential functions of GRAS genes, and will facilitate further discovery of MtGRAS gene functions. PMID:28945786
Comparative metabolic pathway analysis with special reference to nucleotide metabolism-related genes in chicken primordial germ cells.

PubMed

Rengaraj, Deivendran; Lee, Bo Ram; Jang, Hyun-Jun; Kim, Young Min; Han, Jae Yong

2013-01-01

Metabolism provides energy and nutrients required for the cellular growth, maintenance, and reproduction. When compared with genomics and proteomics, metabolism studies provide novel findings in terms of cellular functions. In this study, we examined significant and differentially expressed genes in primordial germ cells (PGCs), gonadal stromal cells, and chicken embryonic fibroblasts compared with blastoderms using microarray. All upregulated genes (1001, 1118, and 974, respectively) and downregulated genes (504, 627, and 1317, respectively) in three test samples were categorized into functional groups according to gene ontology. Then all selected genes were tested to examine their involvement in metabolic pathways through Kyoto Encyclopedia of Genes and Genomes pathway database using overrepresentation analysis. In our results, most of the upregulated and downregulated genes were involved in at least one subcategory of seven major metabolic pathways. The main objective of this study is to compare the PGC expressed genes and their metabolic pathways with blastoderms, gonadal stromal cells, and chicken embryonic fibroblasts. Among the genes involved in metabolic pathways, a higher number of PGC upregulated genes were identified in retinol metabolism, and a higher number of PGC downregulated genes were identified in sphingolipid metabolism. In terms of the fold change, acyl-CoA synthetase medium-chain family member 3 (ACSM3), which is involved in butanoate metabolism, and N-acetyltransferase, pineal gland isozyme NAT-10 (PNAT10), which is involved in energy metabolism, showed higher expression in PGCs. To validate these gene changes, the expression of 12 nucleotide metabolism-related genes in chicken PGCs was examined by real-time polymerase chain reaction. The results of this study provide new information on the expression of genes associated with metabolism function of PGCs and will facilitate more basic research on animal PGC differentiation and function. 
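Overrepresentation analysis of the kind described above is commonly framed as a one-sided hypergeometric test: given a background of N genes of which K belong to a pathway, how likely is it that a list of n selected genes contains k or more pathway members by chance? The sketch below uses that standard formulation; the abstract does not state which statistic or software the authors used, and the function name and the numbers in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes in the pathway,
    n: genes in the selected list, k: selected genes in the pathway.
    """
    upper = min(n, K)
    numerator = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return numerator / comb(N, n)


# Toy numbers (made up): 20 background genes, 5 in the pathway,
# 4 genes selected, 2 of them in the pathway.
p = overrepresentation_p(k=2, n=4, K=5, N=20)
print(round(p, 4))
```

When many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing before any pathway is called enriched.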
Genome-wide identification and expression analysis of SBP-like transcription factor genes in Moso Bamboo (Phyllostachys edulis).

PubMed

Pan, Feng; Wang, Yue; Liu, Huanglong; Wu, Min; Chu, Wenyuan; Chen, Danmei; Xiang, Yan

2017-06-27

The SQUAMOSA promoter binding protein-like (SPL) proteins are plant-specific transcription factors (TFs) that function in a variety of developmental processes including growth, flower development, and signal transduction. SPL proteins are encoded by a gene family, and these genes have been characterized in two model grass species, Zea mays and Oryza sativa. The SPL gene family has not been well studied in moso bamboo (Phyllostachys edulis), a woody grass species. We identified 32 putative PeSPL genes in the P. edulis genome. Phylogenetic analysis arranged the PeSPL protein sequences in eight groups. Similarly, phylogenetic analysis of the SBP-like and SBP proteins from rice and maize clustered them into eight groups analogous to those from P. edulis. Furthermore, the deduced PeSPL proteins in each group contained very similar conserved sequence motifs. Our analyses indicate that the PeSPL genes experienced a large-scale duplication event ~15 million years ago (MYA), and that divergence between the PeSPL and OsSPL genes occurred 34 MYA. The stress-response expression profiles and tissue-specificity of the putative PeSPL gene promoter regions showed that SPL genes in moso bamboo have potential biological functions in stress resistance as well as in growth and development. We therefore examined PeSPL gene expression in response to different plant hormone and drought (polyethylene glycol-6000; PEG) treatments to mimic biotic and abiotic stresses. Expression of three (PeSPL10, -12, -17), six (PeSPL1, -10, -12, -17, -20, -31), and nine (PeSPL5, -8, -9, -14, -15, -19, -20, -31, -32) genes remained relatively stable after treatment with salicylic acid (SA), gibberellic acid (GA), and PEG, respectively, while the expression patterns of other genes changed. In addition, analysis of tissue-specific expression of the moso bamboo SPL genes during development showed differences in their spatiotemporal expression patterns, and many were expressed at high levels in flowers and leaves. The PeSPL genes play important roles in plant growth and development, including responses to stresses, and most of the genes are expressed in different tissues. Our study provides a comprehensive understanding of the PeSPL gene family and may enable future studies on the function and evolution of SPL genes in moso bamboo.
Genome-wide identification, evolutionary and expression analysis of the aspartic protease gene superfamily in grape

PubMed Central

2013-01-01

Background Aspartic proteases (APs) are a large family of proteolytic enzymes found in almost all organisms. In plants, they are involved in many biological processes, such as senescence, stress responses, programmed cell death, and reproduction. Prior to the present study, no grape AP genes had been reported, and research on APs in woody species was very limited. Results In this study, a total of 50 AP genes (VvAP) were identified in the grape genome, among which 30 contained the complete ASP domain. Synteny analysis within grape indicated that segmental and tandem duplication events contributed to the expansion of the grape AP family. Additional analysis between grape and Arabidopsis demonstrated that several grape AP genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grape and Arabidopsis. Phylogenetic relationships of the 30 VvAPs with the complete ASP domain and their Arabidopsis orthologs, as well as their gene and protein features, were analyzed and their cellular localization was predicted. Moreover, expression profiles of VvAP genes in six different tissues were determined, and their transcript abundance under various stresses and hormone treatments was measured. Twenty-seven VvAP genes were expressed in at least one of the six tissues examined; nineteen VvAPs responded to at least one abiotic stress, 12 VvAPs responded to powdery mildew infection, and most of the VvAPs responded to SA and ABA treatments. Furthermore, integrated synteny and phylogenetic analysis identified orthologous AP genes between grape and Arabidopsis, providing a unique starting point for investigating the function of grape AP genes. Conclusions The genome-wide identification, evolutionary and expression analyses of grape AP genes provide a framework for future analysis of AP genes in defining their roles during stress response. Integrated synteny and phylogenetic analyses provide novel insight into the functions of less well-studied genes using information from their better understood orthologs. PMID:23945092
Identification and Transcript Analysis of the TCP Transcription Factors in the Diploid Woodland Strawberry Fragaria vesca

PubMed Central

Wei, Wei; Hu, Yang; Cui, Meng-Yuan; Han, Yong-Tao; Gao, Kuan; Feng, Jia-Yue

2016-01-01

Plant-specific TEOSINTE BRANCHED 1, CYCLOIDEA, and PROLIFERATING CELL FACTORS (TCP) transcription factors play versatile functions in multiple processes of plant growth and development. However, no systematic study has been performed in strawberry. In this study, 19 FvTCP genes were identified in the diploid woodland strawberry (Fragaria vesca) accession Heilongjiang-3. Phylogenetic analysis suggested that the FvTCP genes were classified into two main classes, with the second class further divided into two subclasses, which was supported by the exon-intron organizations and the conserved motif structures. Promoter analysis revealed various cis-acting elements related to growth and development, hormone and/or stress responses. We analyzed FvTCP gene transcript accumulation patterns in different tissues and fruit developmental stages. Among them, 12 FvTCP genes exhibited distinct tissue-specific transcript accumulation patterns. Eleven FvTCP genes were down-regulated in different fruit developmental stages, while five FvTCP genes were up-regulated. Transcripts of FvTCP genes also varied with different subcultural propagation periods and were induced by hormone treatments and biotic and abiotic stresses. Subcellular localization analysis showed that six FvTCP-GFP fusion proteins showed distinct localizations in Arabidopsis mesophyll protoplasts. Notably, transient over-expression of FvTCP9 in strawberry fruits dramatically affected the expression of a series of genes implicated in fruit development and ripening. Taken together, the present study may provide the basis for functional studies to reveal the role of this gene family in strawberry growth and development. PMID:28066489
Comparison between smaller ruptured intracranial aneurysm and larger un-ruptured intracranial aneurysm: gene expression profile analysis.

PubMed

Li, Hao; Li, Haowen; Yue, Haiyan; Wang, Wen; Yu, Lanbing; Wang, Shuo; Cao, Yong; Zhao, Jizong

2017-07-01

As it grows in size, an intracranial aneurysm (IA) is prone to rupture. In this study, we compared two extreme groups of IAs, ruptured IAs (RIAs) smaller than 10 mm and un-ruptured IAs (UIAs) larger than 10 mm, to investigate the genes involved in the facilitation and prevention of IA rupture. The aneurysmal walls of 6 smaller saccular RIAs, 6 larger saccular UIAs and 12 paired control arteries were obtained during surgery. The transcription profiles of these samples were studied by microarray analysis. RT-qPCR was used to confirm the expression of the genes of interest. In addition, functional group analysis of the differentially expressed genes was performed. In the comparison between smaller RIAs and larger UIAs, 101 genes were significantly over-expressed in smaller RIAs and 179 genes in larger UIAs. Functional group analysis demonstrated that the up-regulated genes in smaller RIAs mainly participated in the cellular response to metal ions and inorganic substances, while most of the up-regulated genes in larger UIAs were involved in inflammation and extracellular matrix (ECM) organization. Moreover, compared with control arteries, inflammation was up-regulated and muscle-related biological processes were down-regulated in both smaller RIAs and larger UIAs. The genes involved in the cellular response to metal ions and inorganic substances may facilitate the rupture of IAs. In addition, the healing process, involving inflammation and ECM organization, may protect IAs from rupture.
In Silico Analysis Identifies a Novel Role for Androgens in the Regulation of Human Endometrial Apoptosis

PubMed Central

Marshall, Elaine; Lowrey, Jacqueline; MacPherson, Sheila; Maybin, Jacqueline A.; Collins, Frances; Critchley, Hilary O. D.

2011-01-01

Context: The endometrium is a multicellular, steroid-responsive tissue that undergoes dynamic remodeling every menstrual cycle in preparation for implantation and, in the absence of pregnancy, menstruation. Androgen receptors are present in the endometrium. Objective: The objective of the study was to investigate the impact of androgens on human endometrial stromal cells (hESC). Design: Bioinformatics was used to identify an androgen-regulated gene set and processes associated with their function. Regulation of target genes and impact of androgens on cell function were validated using primary hESC. Setting: The study was conducted at the University Research Institute. Patients: Endometrium was collected from women with regular menses; tissues were used for recovery of cells, total mRNA, or protein and for immunohistochemistry. Results: A new endometrial androgen target gene set (n = 15) was identified. Bioinformatics revealed that 12 of these genes interacted in one pathway and identified an association with control of cell survival. Dynamic androgen-dependent changes in expression of the gene set were detected in hESC, with nine genes significantly down-regulated at 2 and/or 8 h. Treatment of hESC with dihydrotestosterone reduced staurosporine-induced apoptosis and cell migration/proliferation. Conclusions: Rigorous in silico analysis resulted in identification of a group of androgen-regulated genes expressed in human endometrium. Pathway analysis and functional assays suggest androgen-dependent changes in gene expression may have a significant impact on stromal cell proliferation, migration, and survival. These data provide the platform for further studies on the role of circulatory or local androgens in the regulation of endometrial function and identify androgens as candidates in the pathogenesis of common endometrial disorders including polycystic ovarian syndrome, cancer, and endometriosis. PMID:21865353
Microarray analysis of gene expression profiles in ripening pineapple fruits.

PubMed

Koia, Jonni H; Moyle, Richard L; Botella, Jose R

2012-12-18

Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277-element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. This is the first report of a microarray-based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general.
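The 1.5-fold criterion used in this study can be illustrated with a small classifier that compares expression at the two ripening stages. This is a sketch only: it assumes positive, already-normalized intensities, ignores the statistical testing a real microarray analysis would include, and the function name and example values are hypothetical.

```python
def classify_fold_change(green, yellow, threshold=1.5):
    """Label a gene by the ratio of mature-yellow to mature-green expression.

    Assumes both intensities are positive and already normalized.
    """
    ratio = yellow / green
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Made-up intensities for three hypothetical probes.
print(classify_fold_change(100.0, 200.0))  # ratio 2.0
print(classify_fold_change(300.0, 100.0))  # ratio about 0.33
print(classify_fold_change(100.0, 120.0))  # ratio 1.2
```

The counts reported in the abstract are internally consistent with this kind of partition: 184 + 53 + 34 = 271 differentially expressed cDNAs, and 160 + 77 = 237 sequences with homologs.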
Microarray analysis of gene expression profiles in ripening pineapple fruits

PubMed Central

2012-01-01

Background Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277-element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Results Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. Conclusions This is the first report of a microarray-based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general. PMID:23245313
Cardiomyocyte-specific deletion of the G protein-coupled estrogen receptor (GPER) leads to left ventricular dysfunction and adverse remodeling: A sex-specific gene profiling analysis.

PubMed

Wang, Hao; Sun, Xuming; Chou, Jeff; Lin, Marina; Ferrario, Carlos M; Zapata-Sudo, Gisele; Groban, Leanne

2017-08-01

Activation of G protein-coupled estrogen receptor (GPER) by its agonist, G1, protects the heart from stressors such as pressure-overload, ischemia, a high-salt diet, estrogen loss, and aging, in various male and female animal models. Due to nonspecific effects of G1, the exact functions of cardiac GPER cannot be concluded from studies using systemic G1 administration. Moreover, global knockdown of GPER affects glucose homeostasis, blood pressure, and many other cardiovascular-related systems, thereby confounding interpretation of its direct cardiac actions. We generated a cardiomyocyte-specific GPER knockout (KO) mouse model to specifically investigate the functions of GPER in cardiomyocytes. Compared to wild type mice, cardiomyocyte-specific GPER KO mice exhibited adverse alterations in cardiac structure and impaired systolic and diastolic function, as measured by echocardiography. Gene deletion effects on left ventricular dimensions were more profound in male KO mice compared to female KO mice. Analysis of DNA microarray data from isolated cardiomyocytes of wild type and KO mice revealed sex-based differences in gene expression profiles affecting multiple transcriptional networks. Gene Set Enrichment Analysis (GSEA) revealed that mitochondrial genes are enriched in GPER KO females, whereas inflammatory response genes are enriched in GPER KO males, compared to their wild type counterparts of the same sex. The cardiomyocyte-specific GPER KO mouse model provides us with a powerful tool to study the functions of GPER in cardiomyocytes. The gene expression profiles of the GPER KO mice provide foundational information for further study of the mechanisms underlying sex-specific cardioprotection by GPER.
FOXO3 Modulates Endothelial Gene Expression and Function by Classical and Alternative Mechanisms*

PubMed Central

Czymai, Tobias; Viemann, Dorothee; Sticht, Carsten; Molema, Grietje; Goebeler, Matthias; Schmidt, Marc

2010-01-01

FOXO transcription factors represent targets of the phosphatidylinositol 3-kinase/protein kinase B survival pathway controlling important biological processes, such as cell cycle progression, apoptosis, vascular remodeling, stress responses, and metabolism. Recent studies suggested the existence of alternative mechanisms of FOXO-dependent gene expression beyond classical binding to a FOXO-responsive DNA-binding element (FRE). Here we analyzed the relative contribution of those mechanisms to vascular function by comparing the transcriptional and cellular responses to conditional activation of FOXO3 and a corresponding FRE-binding mutant in human primary endothelial cells. We demonstrate that FOXO3 controls expression of vascular remodeling genes in an FRE-dependent manner. In contrast, FOXO3-induced cell cycle arrest and apoptosis occurs independently of FRE binding, albeit FRE-dependent gene expression augments the proapoptotic response. These findings are supported by bioinformatical analysis, which revealed a statistical overrepresentation of cell cycle regulators and apoptosis-related genes in the group of co-regulated genes. Molecular analysis of FOXO3-induced endothelial apoptosis excluded modulators of the extrinsic death receptor pathway and demonstrated important roles for the BCL-2 family members BIM and NOXA in this process. Although NOXA essentially contributed to FRE-dependent apoptosis, BIM was effectively induced in the absence of FRE-binding, and small interfering RNA-mediated BIM depletion could rescue apoptosis induced by both FOXO3 mutants. These data suggest BIM as a critical cell type-specific mediator of FOXO3-induced endothelial apoptosis, whereas NOXA functions as an amplifying factor. Our study provides the first comprehensive analysis of alternatively regulated FOXO3 targets in relevant primary cells and underscores the importance of such genes for endothelial function and integrity. PMID:20123982
CRISPR-Cas9 and CRISPR-Cpf1 mediated targeting of a stomatal developmental gene EPFL9 in rice.

PubMed

Yin, Xiaojia; Biswal, Akshaya K; Dionora, Jacqueline; Perdigon, Kristel M; Balahadia, Christian P; Mazumdar, Shamik; Chater, Caspar; Lin, Hsiang-Chun; Coe, Robert A; Kretzschmar, Tobias; Gray, Julie E; Quick, Paul W; Bandyopadhyay, Anindya

2017-05-01

The CRISPR-Cas9/Cpf1 system, with its unique gene targeting efficiency, could be an important tool for the functional study of early developmental genes through the generation of knockout plants. Systems biology approaches have identified several genes that are involved in the early development of a plant, and a robust tool is required for the functional validation of the putative candidate genes thus obtained. The development of the CRISPR-Cas9/Cpf1 genome editing system has provided a convenient tool for creating loss-of-function mutants for genes of interest. The present study utilized CRISPR-Cas9 and CRISPR-Cpf1 technology to knock out the rice orthologue of an early developmental gene, EPFL9 (Epidermal Patterning Factor like-9, a positive regulator of stomatal development in Arabidopsis). The germ-line mutants that were generated showed edits that were carried forward into the T2 generation, when Cas9-free homozygous mutants were obtained. The homozygous mutant plants showed more than an eightfold reduction in stomatal density on the abaxial leaf surface. Potential off-target analysis showed no significant off-target effects. This study also utilized CRISPR-LbCpf1 (Lachnospiraceae bacterium Cpf1) to target the same OsEPFL9 gene to test the activity of this class 2 CRISPR system in rice and found that Cpf1 is also capable of genome editing and that edits are transmitted through generations with phenotypic changes similar to those seen with CRISPR-Cas9. This study demonstrates the application of CRISPR-Cas9/Cpf1 to precisely target genomic locations and develop transgene-free, homozygous, heritable gene edits. It also confirms that loss-of-function analysis of candidate genes emerging from different systems biology based approaches can be performed with this system, which therefore adds value to the validation of gene function.
