genes: Topics by Science.gov

Sample records for genes

Olaparib in Treating Patients With Metastatic or Advanced Urothelial Cancer With DNA-Repair Defects

ClinicalTrials.gov

2018-06-14

Abnormal DNA Repair; ATM Gene Mutation; ATR Gene Mutation; BAP1 Gene Mutation; BARD1 Gene Mutation; BLM Gene Mutation; BRCA1 Gene Mutation; BRCA2 Gene Mutation; BRIP1 Gene Mutation; CHEK1 Gene Mutation; CHEK2 Gene Mutation; FANCC Gene Mutation; FANCD2 Gene Mutation; FANCE Gene Mutation; FANCF Gene Mutation; MEN1 Gene Mutation; Metastatic Urothelial Carcinoma; MLH1 Gene Mutation; MSH2 Gene Mutation; MSH6 Gene Mutation; MUTYH Gene Mutation; NPM1 Gene Mutation; PALB2 Gene Mutation; PMS2 Gene Mutation; POLD1 Gene Mutation; POLE Gene Mutation; PRKDC Gene Mutation; RAD50 Gene Mutation; RAD51 Gene Mutation; SMARCB1 Gene Mutation; Stage III Bladder Urothelial Carcinoma AJCC v6 and v7; Stage IV Bladder Urothelial Carcinoma AJCC v7; STK11 Gene Mutation; Urothelial Carcinoma
Lung-MAP: Talazoparib in Treating Patients With HRRD Positive Recurrent Stage IV Squamous Cell Lung Cancer

ClinicalTrials.gov

2018-05-31

ATM Gene Mutation; ATR Gene Mutation; BARD1 Gene Mutation; BRCA1 Gene Mutation; BRCA2 Gene Mutation; BRIP1 Gene Mutation; CHEK1 Gene Mutation; CHEK2 Gene Mutation; FANCA Gene Mutation; FANCC Gene Mutation; FANCD2 Gene Mutation; FANCF Gene Mutation; FANCM Gene Mutation; NBN Gene Mutation; PALB2 Gene Mutation; RAD51 Gene Mutation; RAD51B Gene Mutation; RAD54L Gene Mutation; Recurrent Squamous Cell Lung Carcinoma; RPA1 Gene Mutation; Stage IV Squamous Cell Lung Carcinoma AJCC v7
GeneSigDB—a curated database of gene expression signatures

PubMed Central

Culhane, Aedín C.; Schwarzl, Thomas; Sultana, Razvan; Picard, Kermshlise C.; Picard, Shaita C.; Lu, Tim H.; Franklin, Katherine R.; French, Simon J.; Papenhausen, Gerald; Correll, Mick; Quackenbush, John

2010-01-01

The primary objective of most gene expression studies is the identification of one or more gene signatures: lists of genes whose transcriptional levels are uniquely associated with a specific biological phenotype. Whilst thousands of experimentally derived gene signatures are published, their potential value to the community is limited by their computational inaccessibility. Gene signatures are embedded in published article figures, tables or in supplementary materials, and are frequently presented using non-standard gene or probeset nomenclature. We present GeneSigDB (http://compbio.dfci.harvard.edu/genesigdb), a manually curated database of gene expression signatures. GeneSigDB release 1.0 focuses on cancer and stem cell gene signatures and was constructed from more than 850 publications from which we manually transcribed 575 gene signatures. Most gene signatures (n = 560) were successfully mapped to the genome to extract standardized lists of EnsEMBL gene identifiers. GeneSigDB provides the original gene signature, the standardized gene list and a fully traceable gene mapping history for each gene from the original transcribed data table through to the standardized list of genes. The GeneSigDB web portal is easy to search, allows users to compare their own gene list to those in the database, and download gene signatures in most common gene identifier formats. PMID:19934259
[Genome-wide identification and analysis of WRKY transcription factors in Medicago truncatula].

PubMed

Song, Hui; Nan, Zhibiao

2014-02-01

The WRKY gene family plays important roles in plants through its involvement in transcriptional regulation during various physiological processes such as development, metabolism and responses to biotic and abiotic stresses. WRKY genes have been identified in various plants. However, only a few WRKY genes in Medicago truncatula have been identified through systematic analysis and comparison. In this study, we identified 93 WRKY genes through analyses of the M. truncatula genome. These genes include 19 type-I genes, 49 type-II genes, 13 type-III genes and 12 non-regular type genes. All of these genes were characterized through analyses of gene duplication, chromosomal locations, structural diversity, conserved protein motifs and phylogenetic relations. The results showed that 11 gene duplication events occurred in the WRKY gene family, involving 24 genes. WRKY genes, which form 6 gene clusters, are unevenly distributed across chromosomes 1 to 6, and WRKY group III genes are under purifying selection pressure.
GeneNetFinder2: Improved Inference of Dynamic Gene Regulatory Relations with Multiple Regulators.

PubMed

Han, Kyungsook; Lee, Jeonghoon

2016-01-01

A gene involved in complex regulatory interactions may have multiple regulators since gene expression in such interactions is often controlled by more than one gene. Another thing that makes gene regulatory interactions complicated is that regulatory interactions are not static, but change over time during the cell cycle. Most research so far has focused on identifying gene regulatory relations between individual genes in a particular stage of the cell cycle. In this study we developed a method for identifying dynamic gene regulations of several types from the time-series gene expression data. The method can find gene regulations with multiple regulators that work in combination or individually as well as those with single regulators. The method has been implemented as the second version of GeneNetFinder (hereafter called GeneNetFinder2) and tested on several gene expression datasets. Experimental results with gene expression data revealed the existence of genes that are not regulated by individual genes but rather by a combination of several genes. Such gene regulatory relations cannot be found by conventional methods. Our method finds such regulatory relations as well as those with multiple, independent regulators or single regulators, and represents gene regulatory relations as a dynamic network in which different gene regulatory relations are shown in different stages of the cell cycle. GeneNetFinder2 is available at http://bclab.inha.ac.kr/GeneNetFinder and will be useful for modeling dynamic gene regulations with multiple regulators.
A survey of disease connections for CD4+ T cell master genes and their directly linked genes.

PubMed

Li, Wentian; Espinal-Enríquez, Jesús; Simpfendorfer, Kim R; Hernández-Lemus, Enrique

2015-12-01

Genome-wide association studies and other genetic analyses have identified a large number of genes and variants implicating a variety of disease etiological mechanisms. It is imperative for the study of human diseases to put these genetic findings into a coherent functional context. Here we use systems biology tools to examine disease connections of five master genes for CD4+ T cell subtypes (TBX21, GATA3, RORC, BCL6, and FOXP3). We compiled a list of genes functionally interacting (protein-protein interaction, or by acting in the same pathway) with the master genes, then we surveyed the disease connections, either by experimental evidence or by genetic association. Embryonic lethal genes (also known as essential genes) are over-represented in master genes and their interacting genes (55% versus 40% in other genes). Transcription factors are significantly enriched among genes interacting with the master genes (63% versus 10% in other genes). Predicted haploinsufficiency is a feature of most of these genes. Disease-connected genes are enriched in this list of genes: 42% of these genes have a disease connection according to Online Mendelian Inheritance in Man (OMIM) (versus 23% in other genes), and 74% are associated with some disease or phenotype in a Genome Wide Association Study (GWAS) (versus 43% in other genes). Seemingly, not all of the diseases connected to the genes surveyed were immune related, which may indicate pleiotropic functions of the master regulator genes and associated genes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

PubMed

Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

2012-01-01

Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
Comparative genome analysis of PHB gene family reveals deep evolutionary origins and diverse gene function.

PubMed

Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S

2010-10-07

The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, including development, senescence and defence. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. In order to better guide the gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion. However, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms.
The highly diverse biological function indicated that more research needs to be carried out to dissect the PHB gene function. The conserved gene evolution indicated that the study in the model species can be translated to human and mammalian studies.
GoGene: gene annotation in the fast lane.

PubMed

Plake, Conrad; Royer, Loic; Winnenburg, Rainer; Hakenberg, Jörg; Schroeder, Michael

2009-07-01

High-throughput screens such as microarrays and RNAi screens produce huge amounts of data. They typically result in hundreds of genes, which are often further explored and clustered via enriched GeneOntology terms. The strength of such analyses is that they build on high-quality manual annotations provided with the GeneOntology. However, the weakness is that annotations are restricted to process, function and location and that they do not cover all known genes in model organisms. GoGene addresses this weakness by complementing high-quality manual annotation with high-throughput text mining extracting co-occurrences of genes and ontology terms from literature. GoGene contains over 4,000,000 associations between genes and gene-related terms for 10 model organisms extracted from more than 18,000,000 PubMed entries. It does not cover only process, function and location of genes, but also biomedical categories such as diseases, compounds, techniques and mutations. By bringing it all together, GoGene provides the most recent and most complete facts about genes and can rank them according to novelty and importance. GoGene accepts keywords, gene lists, gene sequences and protein sequences as input and supports search for genes in PubMed, EntrezGene and via BLAST. Since all associations of genes to terms are supported by evidence in the literature, the results are transparent and can be verified by the user. GoGene is available at http://gopubmed.org/gogene.
GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

PubMed Central

Culhane, Aedín C.; Schröder, Markus S.; Sultana, Razvan; Picard, Shaita C.; Martinelli, Enzo N.; Kelly, Caroline; Haibe-Kains, Benjamin; Kapushesky, Misha; St Pierre, Anne-Alyssa; Flahive, William; Picard, Kermshlise C.; Gusenleitner, Daniel; Papenhausen, Gerald; O'Connor, Niall; Correll, Mick; Quackenbush, John

2012-01-01

GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease to the community so they can compare the predictive power of gene signatures or use these in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication quality image. All data in GeneSigDB can be downloaded in numerous formats including .gmt file format for gene set enrichment analysis or as a R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org. PMID:22110038
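The overlap analysis mentioned in this record can be illustrated with a minimal sketch. It parses .gmt-style lines (set name, description, then gene symbols, all tab separated) and scores the overlap between a query list and each signature with a hypergeometric tail probability. The signature names, gene lists and background size below are made up for illustration, and the scoring is an assumed, generic approach rather than GeneSigDB's own implementation.

```python
from math import comb


def parse_gmt(lines):
    """Parse .gmt lines: name <TAB> description <TAB> gene1 <TAB> gene2 ..."""
    signatures = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        signatures[fields[0]] = set(fields[2:])
    return signatures


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the signature."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def overlap_report(query, signatures, background_size):
    """Return (name, shared genes, p-value) rows, smallest p-value first."""
    query = set(query)
    rows = []
    for name, genes in signatures.items():
        shared = query & genes
        p = hypergeom_tail(len(shared), len(genes), len(query), background_size)
        rows.append((name, sorted(shared), p))
    return sorted(rows, key=lambda row: row[2])


# Hypothetical toy signatures, not real GeneSigDB entries.
toy_gmt = [
    "sig_A\ttoy signature A\tBRCA1\tBRCA2\tATM\tCHEK2",
    "sig_B\ttoy signature B\tGATA3\tFOXP3\tRORC",
]
signatures = parse_gmt(toy_gmt)
report = overlap_report(["BRCA1", "ATM", "TP53"], signatures, background_size=100)
for name, shared, p in report:
    print(name, shared, round(p, 5))
```

The background size strongly affects the p-value, so in practice it should reflect the set of genes that could have appeared in the query, not an arbitrary constant as used here.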
Structure of a gene encoding a murine thymus leukemia antigen, and organization of Tla genes in the BALB/c mouse

PubMed Central

1985-01-01

We have determined the DNA sequence of a gene encoding a thymus leukemia (TL) antigen in the BALB/c mouse, and have more definitively mapped the cloned BALB/c Tla-region class I gene clusters. Analysis of the sequence shows that the Tla gene is less closely related to the H-2 genes than H-2 genes are to one another or to the Qa-2,3-region genes. The Tla gene, 17.3A, contains an apparent gene conversion. Comparison of the BALB/c Tla genes with those from C57BL shows that BALB/c has more Tla-region class I genes, and that one of the genes absent in C57BL is gene 17.3A. PMID:3894562
Gene-gene and gene-environment interactions: new insights into the prevention, detection and management of coronary artery disease.

PubMed

Lanktree, Matthew B; Hegele, Robert A

2009-02-26

Despite the recent success of genome-wide association studies (GWASs) in identifying loci consistently associated with coronary artery disease (CAD), a large proportion of the genetic components of CAD and its metabolic risk factors, including plasma lipids, type 2 diabetes and body mass index, remain unattributed. Gene-gene and gene-environment interactions might produce a meaningful improvement in quantification of the genetic determinants of CAD. Testing for gene-gene and gene-environment interactions is thus a new frontier for large-scale GWASs of CAD. There are several anecdotal examples of monogenic susceptibility to CAD in which the phenotype was worsened by an adverse environment. In addition, small-scale candidate gene association studies with functional hypotheses have identified gene-environment interactions. For future evaluation of gene-gene and gene-environment interactions to achieve the same success as the single gene associations reported in recent GWASs, it will be important to pre-specify agreed standards of study design and statistical power, environmental exposure measurement, phenomic characterization and analytical strategies. Here we discuss these issues, particularly in relation to the investigation and potential clinical utility of gene-gene and gene-environment interactions in CAD.
Gene Expression Profile Analysis is Directly Affected by the Selected Reference Gene: The Case of Leaf-Cutting Atta Sexdens

PubMed Central

Máximo, Wesley P. F.; Zanetti, Ronald; Paiva, Luciano V.

2018-01-01

Although several ant species are important targets for the development of molecular control strategies, only a few studies focus on identifying and validating reference genes for quantitative reverse transcription polymerase chain reaction (RT-qPCR) data normalization. We provide here an extensive study to identify and validate suitable reference genes for gene expression analysis in the ant Atta sexdens, a threatening agricultural pest in South America. The optimal number of reference genes varies according to each sample, and the results generated by RefFinder differed as to which reference gene is the most suitable. Results suggest that the RPS16, NADH and SDHB genes were the best reference genes in the sample pool according to stability values. The SNF7 gene expression pattern was stable in all evaluated sample sets. In contrast, when using less stable reference genes for normalization, a large variability in SNF7 gene expression was recorded. There is no universal reference gene suitable for all conditions under analysis, since these genes can also participate in different cellular functions, thus requiring a systematic validation of possible reference genes for each specific condition. The effect of reference gene choice on SNF7 gene normalization confirmed that unstable reference genes might drastically change the expression profile analysis of target candidate genes. PMID:29419794
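A minimal sketch can show how the choice of reference gene changes the apparent expression of a target. It uses the common 2^(-ΔCt) normalization and assumes 100% amplification efficiency. The Ct values and the reference gene names `stable_ref` and `unstable_ref` are hypothetical; the record does not give the authors' actual calculation.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs):
    """2^(-dCt), where dCt is the target Ct minus the mean Ct of the reference genes."""
    delta_ct = ct_target - mean(ct_refs)
    return 2.0 ** (-delta_ct)


def fold_change(treated, control, target, refs):
    """Ratio of normalized target expression, treated over control."""
    rel_treated = relative_expression(treated[target], [treated[r] for r in refs])
    rel_control = relative_expression(control[target], [control[r] for r in refs])
    return rel_treated / rel_control


# Hypothetical Ct values: SNF7 itself does not change between conditions,
# but the unstable reference gene shifts by two cycles.
control = {"SNF7": 24.0, "stable_ref": 18.0, "unstable_ref": 20.0}
treated = {"SNF7": 24.0, "stable_ref": 18.0, "unstable_ref": 22.0}

fc_stable = fold_change(treated, control, "SNF7", ["stable_ref"])
fc_unstable = fold_change(treated, control, "SNF7", ["unstable_ref"])
print(fc_stable, fc_unstable)
```

With the stable reference the target shows no change, whereas the drifting reference produces an apparent fourfold change that is purely an artifact of normalization.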
Differentially Coexpressed Disease Gene Identification Based on Gene Coexpression Network.

PubMed

Jiang, Xue; Zhang, Han; Quan, Xiongwen

2016-01-01

Screening disease-related genes by analyzing gene expression data has become a popular theme. Traditional disease-related gene selection methods focus on identifying differentially expressed genes between case samples and a control group. These traditional methods may not fully consider the changes in interactions between genes at different cell states or the dynamics of gene expression levels during disease progression. However, in order to understand the mechanism of disease, it is important to explore the dynamic changes in interactions between genes in biological networks at different cell states. In this study, we designed a novel framework to identify disease-related genes and developed a differentially coexpressed disease-related gene identification method based on gene coexpression networks (DCGN) to screen differentially coexpressed genes. We first constructed phase-specific gene coexpression networks using time-series gene expression data and defined the concept of differential coexpression of genes in a coexpression network. Then, we designed two metrics to measure the value of gene differential coexpression according to the change of local topological structures between different phase-specific networks. Finally, we conducted a meta-analysis of gene differential coexpression based on the rank-product method. Experimental results on real-world gene expression data sets demonstrated the feasibility and effectiveness of DCGN and its superior performance over other popular disease-related gene selection methods.
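The rank-product step can be sketched generically: each gene is ranked within each score table, and its rank product is the geometric mean of those ranks, so genes that rank near the top consistently get the smallest values. The two score tables below are hypothetical stand-ins for the two differential-coexpression metrics; this is a generic rank product, not the DCGN code.

```python
from math import prod


def ranks_descending(scores):
    """Rank genes so the largest score gets rank 1 (ties broken by gene name)."""
    ordered = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def rank_product(score_tables):
    """Geometric mean of each gene's ranks across tables that share the same genes."""
    rank_tables = [ranks_descending(table) for table in score_tables]
    k = len(rank_tables)
    return {
        gene: prod(ranks[gene] for ranks in rank_tables) ** (1.0 / k)
        for gene in score_tables[0]
    }


# Hypothetical differential-coexpression scores from two metrics.
metric_1 = {"g1": 0.9, "g2": 0.5, "g3": 0.1}
metric_2 = {"g1": 0.7, "g2": 0.8, "g3": 0.2}

rp = rank_product([metric_1, metric_2])
for gene in sorted(rp, key=rp.get):
    print(gene, round(rp[gene], 3))
```

Significance for a rank product is usually assessed by permuting the ranks, which this sketch omits.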
Bioinformatic analysis of the nucleotide binding site-encoding disease-resistance genes in foxtail millet (Setaria italica (L.) Beauv.).

PubMed

Zhu, Y B; Xie, X Q; Li, Z Y; Bai, H; Dong, L; Dong, Z P; Dong, J G

2014-08-28

The nucleotide-binding site (NBS) disease-resistance genes are the largest category of plant disease-resistance gene analogs. The complete set of disease-resistant candidate genes, which encode the NBS sequence, was filtered in the genomes of two varieties of foxtail millet (Yugu1 and 'Zhang gu'). This study investigated a number of characteristics of the putative NBS genes, such as structural diversity and phylogenetic relationships. A total of 269 and 281 NBS-coding sequences were identified in Yugu1 and 'Zhang gu', respectively. When the two databases were compared, 72 genes were found to be identical and 164 genes showed more than 90% similarity. Physical positioning and gene family analysis of the NBS disease-resistance genes in the genome revealed that the number of genes on each chromosome was similar in both varieties. The eighth chromosome contained the largest number of genes and the ninth chromosome contained the lowest number of genes. Exactly 34 gene clusters containing the 161 genes were found in the Yugu1 genome, with each cluster containing 4.7 genes on average. In comparison, the 'Zhang gu' genome possessed 28 gene clusters, which had 151 genes, with an average of 5.4 genes in each cluster. The largest gene cluster, located on the eighth chromosome, contained 12 genes in the Yugu1 database, whereas it contained 16 genes in the 'Zhang gu' database. The classification results showed that the CC-NBS-LRR gene made up the largest part of each chromosome in the two databases. Two TIR-NBS genes were also found in the Yugu1 genome.
Differential gene expression analysis in glioblastoma cells and normal human brain cells based on GEO database.

PubMed

Wang, Anping; Zhang, Guibin

2017-11-01

The differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated in order to perform pathway analysis and protein interaction network analysis for the differentially expressed genes. GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in 'R' software was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and the genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING and an expression network was established. A key gene, called CALM3, was identified by Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. 
KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in the endoplasmic reticulum. The results showed that: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provides new insights for studies on GBM at the gene level.
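The thresholding step described in this record can be sketched as follows. The sketch assumes that "fold-change >1" means an absolute log2 fold change greater than 1, which the abstract does not state explicitly, and the gene names and values are made up. It stands in for the filtering of a results table only, not for the limma model fit or the RobustRankAggreg integration.

```python
def classify_genes(rows, p_cutoff=0.01, lfc_cutoff=1.0):
    """Split (gene, log2 fold change, p-value) rows into up- and downregulated lists."""
    up, down = [], []
    for gene, log2_fc, p_value in rows:
        # Keep only genes that pass both the significance and effect-size cutoffs.
        if p_value >= p_cutoff or abs(log2_fc) <= lfc_cutoff:
            continue
        (up if log2_fc > 0 else down).append(gene)
    return up, down


# Hypothetical results table.
toy_rows = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.8, 0.004),
    ("geneC", 0.4, 0.0005),  # significant but small effect
    ("geneD", 3.1, 0.2),     # large effect but not significant
]
up, down = classify_genes(toy_rows)
print(len(up) + len(down), up, down)
```

The counts reported in the abstract are internally consistent under this kind of split: 548 + 154 = 702, 1,068 + 786 = 1,854, and 100 + 67 = 167.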
Random forests-based differential analysis of gene sets for gene expression data.

PubMed

Hsueh, Huey-Miin; Zhou, Da-Wei; Tsai, Chen-An

2013-04-10

In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori-defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients. In this study, we propose a method of gene set analysis, in which gene sets are used to develop classifications of patients based on the Random Forest (RF) algorithm. The corresponding empirical p-value of an observed out-of-bag (OOB) error rate of the classifier is introduced to identify differentially expressed gene sets using an adequate resampling method. In addition, we discuss the impacts and correlations of genes within each gene set based on the measures of variable importance in the RF algorithm. Significant classifications are reported and visualized together with the underlying gene sets and their contribution to the phenotypes of interest. Numerical studies using both synthesized data and a series of publicly available gene expression data sets are conducted to evaluate the performance of the proposed methods. Compared with other hypothesis testing approaches, our proposed methods are reliable and successful in identifying enriched gene sets and in discovering the contributions of genes within a gene set. The classification results of identified gene sets can provide a valuable alternative to gene set testing to reveal the unknown, biologically relevant classes of samples or patients. 
In summary, our proposed method allows one to simultaneously assess the discriminatory ability of gene sets and the importance of genes for interpretation of data in complex biological systems. The classifications of biologically defined gene sets can reveal the underlying interactions of gene sets associated with the phenotypes, and provide an insightful complement to conventional gene set analyses. Copyright © 2012 Elsevier B.V. All rights reserved.
Analysis of multiplex gene expression maps obtained by voxelation.

PubMed

An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

2009-04-29

Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from gene expression maps averaged over the left and right hemispheres, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find genes similar to certain target genes, we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions with respect to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
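The similarity search described in this record can be sketched under stated assumptions. Each expression map is treated as a flat list of voxel values, the two hemispheres are averaged, a single-level Haar transform supplies the wavelet features, and the gene nearest to a query by Euclidean distance is returned. The record does not say which wavelet or how many levels the authors used, so the Haar choice, the map layout and the toy values are all assumptions.

```python
from math import dist, sqrt


def haar_level(signal):
    """One Haar level: scaled pairwise sums followed by scaled pairwise differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / sqrt(2) for a, b in pairs]
    detail = [(a - b) / sqrt(2) for a, b in pairs]
    return approx + detail


def wavelet_features(left_map, right_map):
    """Average the two hemispheres voxel by voxel, then apply the Haar level."""
    averaged = [(left + right) / 2 for left, right in zip(left_map, right_map)]
    return haar_level(averaged)


def most_similar(query, feature_table):
    """Return the gene whose feature vector is closest to the query's."""
    q = feature_table[query]
    candidates = [(dist(q, f), gene) for gene, f in feature_table.items() if gene != query]
    return min(candidates)[1]


# Hypothetical four-voxel maps per hemisphere.
maps = {
    "gene1": ([1.0, 1.0, 5.0, 5.0], [1.0, 1.0, 5.0, 5.0]),
    "gene2": ([1.2, 0.8, 5.1, 4.9], [1.0, 1.0, 5.0, 5.0]),
    "gene3": ([5.0, 5.0, 1.0, 1.0], [5.0, 5.0, 1.0, 1.0]),
}
features = {gene: wavelet_features(left, right) for gene, (left, right) in maps.items()}
nearest = most_similar("gene1", features)
print(nearest)
```

Because this Haar level is orthonormal, distances on the full coefficient vector equal distances on the averaged maps; a practical pipeline would typically keep only a subset of coefficients as features.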
The Association of Multiple Interacting Genes with Specific Phenotypes in Rice Using Gene Coexpression Networks

PubMed Central

Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex

2010-01-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks.

PubMed

Ficklin, Stephen P; Luo, Feng; Feltus, F Alex

2010-09-01

Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.

Partial ROC Reveals Superiority of Mutual Rank of Pearson's Correlation Coefficient as a Coexpression Measure to Elucidate Functional Association of Genes

NASA Astrophysics Data System (ADS)

Obayashi, Takeshi; Kinoshita, Kengo

2013-01-01

Gene coexpression analysis is a powerful approach to elucidate gene function. We have established and developed this approach using the vast amount of publicly available gene expression data measured by microarray techniques. Coexpressed genes are used to estimate the function of a guide gene or to construct gene coexpression networks. To construct gene networks, researchers must introduce an arbitrary threshold of gene coexpression, because the coexpression value is continuous. With a view to introducing a common threshold of gene coexpression, we previously reported that the rank of Pearson's correlation coefficient (PCC) is more useful than the original PCC value. In this manuscript, we re-assessed the measures of gene coexpression used to construct gene coexpression networks, and found that the mutual rank (MR) of PCC performed better than the rank of PCC and the original PCC at low false positive rates.
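As an illustration of the measure compared above, the following is a minimal sketch of mutual rank, assuming the usual definition of MR as the geometric mean of the two directional PCC ranks (the abstract itself does not spell out the formula). The function names and the toy expression values are hypothetical and are not taken from the paper or from ATTED-II.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mutual_rank(expr):
    """expr maps gene -> expression profile; returns {(a, b): MR} for a < b."""
    genes = sorted(expr)
    rank = {}
    for a in genes:
        # Rank every other gene by descending PCC with gene a (1 = most coexpressed).
        partners = [b for b in genes if b != a]
        partners.sort(key=lambda b: pearson(expr[a], expr[b]), reverse=True)
        rank[a] = {b: i + 1 for i, b in enumerate(partners)}
    # Assumed definition: MR(a, b) = sqrt(rank of b seen from a * rank of a seen from b).
    return {(a, b): math.sqrt(rank[a][b] * rank[b][a])
            for a in genes for b in genes if a < b}

# Hypothetical toy profiles (five samples per gene).
toy = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 3.0, 4.0, 1.0, 2.0],
    "g4": [1.0, 3.0, 2.0, 5.0, 4.0],
}
mr = mutual_rank(toy)
```

Because MR is built from ranks rather than raw correlation values, a single cutoff (for example, keeping pairs with a small MR) can be applied to every gene, which is the kind of common threshold the abstract argues for.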
Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

PubMed

Obayashi, Takeshi; Kinoshita, Kengo

2010-05-01

Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as guide gene approach and narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.
Gene network biological validity based on gene-gene interaction relevance.

PubMed

Gómez-Vela, Francisco; Díaz-Díaz, Norberto

2014-01-01

In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many gene network inference algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledge sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. Hence, a complete KEGG pathway conversion into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise is increased in the input network. Finally, the ability of GeneNetVal to detect biological functionality of the network is shown.
Evaluation and selection of reliable reference genes for gene expression under abiotic stress in cotton (Gossypium hirsutum L.).

PubMed

Wang, Min; Wang, Qinglian; Zhang, Baohong

2013-11-01

Reference genes are critical for normalizing the expression level of target genes. The widely used housekeeping genes may change their expression levels in different tissues under different treatment or stress conditions. Therefore, systematic evaluation of the housekeeping genes is required for gene expression analysis. To date, no work has evaluated the housekeeping genes in cotton under stress treatment. In this study, we chose 10 housekeeping genes to systematically assess their expression levels in two different tissues (leaves and roots) under two different abiotic stresses (salt and drought) with three different concentrations. Our results show that there is no best reference gene for all tissues under all stress conditions. A reliable reference gene should be selected based on the specific condition. For example, under salt stress, UBQ7, GAPDH and EF1A8 are better reference genes in leaves; TUA10, UBQ7, CYP1, GAPDH and EF1A8 were better in roots. Under drought stress, UBQ7, EF1A8, TUA10, and GAPDH showed less variation in expression level in leaves and roots. Thus, it is better to identify reliable reference genes first before performing any gene expression analysis. However, using a combination of housekeeping genes as the reference may provide a new strategy for normalization of gene expression. In this study, we found that a combination of four housekeeping genes worked well as reference genes under all the stress conditions.
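To make the idea of normalizing against a combination of reference genes concrete, here is a minimal sketch using the common delta-delta-Ct calculation with the mean Ct of several reference genes. This is an assumed, generic qPCR approach and not necessarily the procedure used in the study; the function name, the reference labels and all Ct values are hypothetical, and 100% amplification efficiency is assumed.

```python
from statistics import mean

def relative_expression(target_ct, ref_cts, target_ct_ctrl, ref_cts_ctrl):
    """Fold change of a target gene (treated vs. control), normalized to the
    mean Ct of several reference genes. Assumes perfect doubling per cycle."""
    d_ct_treated = target_ct - mean(ref_cts)            # delta-Ct, treated sample
    d_ct_control = target_ct_ctrl - mean(ref_cts_ctrl)  # delta-Ct, control sample
    return 2.0 ** -(d_ct_treated - d_ct_control)        # 2^(-delta-delta-Ct)

# Hypothetical Ct values for four reference genes in stressed and control leaves.
refs_stress = {"ref1": 18.0, "ref2": 19.0, "ref3": 20.0, "ref4": 19.0}
refs_control = {"ref1": 18.0, "ref2": 19.0, "ref3": 20.0, "ref4": 19.0}

fold = relative_expression(
    target_ct=22.0, ref_cts=list(refs_stress.values()),
    target_ct_ctrl=24.0, ref_cts_ctrl=list(refs_control.values()),
)
```

Averaging Ct values across references corresponds to taking the geometric mean of their relative quantities, so an unstable single reference has less influence on the normalized result.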
MAGMA: Generalized Gene-Set Analysis of GWAS Data

PubMed Central

de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle

2015-01-01

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
MAGMA: generalized gene-set analysis of GWAS data.

PubMed

de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

2015-04-01

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
Evolutionary dynamics of olfactory and other chemosensory receptor genes in vertebrates

PubMed Central

Niimura, Yoshihito

2007-01-01

The numbers of functional olfactory receptor (OR) genes in humans and mice are about 400 and 1,000 respectively. In both humans and mice, these genes exist as genomic clusters and are scattered over almost all chromosomes. The difference in the number of genes between the two species is apparently caused by massive inactivation of OR genes in the human lineage and a substantial increase of OR genes in the mouse lineage after the human–mouse divergence. Compared with mammals, fishes have a much smaller number of OR genes. However, the OR gene family in fishes is much more divergent than that in mammals. Fishes have many different groups of genes that are absent in mammals, suggesting that the mammalian OR gene family is characterized by the loss of many group genes that existed in the ancestor of vertebrates and the subsequent expansion of specific groups of genes. Therefore, this gene family apparently changed dynamically depending on the evolutionary lineage and evolved under the birth-and-death model of evolution. Study of the evolutionary changes of two gene families for vomeronasal receptors and two gene families for taste receptors, which are structurally similar, but remotely related to OR genes, showed that some of the gene families evolved in the same fashion as the OR gene family. It appears that the number and types of genes in chemosensory receptor gene families have evolved in response to environmental needs, but they are also affected by fortuitous factors. PMID:16607462
Down-weighting overlapping genes improves gene set analysis

PubMed Central

2012-01-01

Background The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org. PMID:22713124
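The scoring rule described above (mean of absolute weighted t-scores, with larger weights for genes that appear in few gene sets) can be sketched as follows. The weighting function here is an assumption chosen to have the stated property, ranging from 1 for the most frequent gene to 2 for the rarest; the abstract does not give the formula. The function names, gene sets and t-scores are hypothetical, and this sketch omits any permutation-based standardization the actual R package may perform.

```python
import math

def gene_weights(gene_sets):
    """Weight each gene by how rarely it appears across gene sets (assumed form)."""
    freq = {}
    for genes in gene_sets.values():
        for g in set(genes):
            freq[g] = freq.get(g, 0) + 1
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:                      # every gene equally frequent: no down-weighting
        return {g: 1.0 for g in freq}
    return {g: 1.0 + math.sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}

def set_score(genes, t_scores, weights):
    """Mean of absolute values of weighted gene t-scores for one gene set."""
    return sum(abs(t_scores[g]) * weights[g] for g in genes) / len(genes)

# Hypothetical gene sets: g2 is shared by all three sets, the others are specific.
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": -1.0}  # made-up moderated t-scores

weights = gene_weights(gene_sets)
score_a = set_score(gene_sets["A"], t_scores, weights)
```

In this toy case the shared gene g2 keeps weight 1 while the set-specific gene g1 gets weight 2, so set A's score is driven more by its specific member than an unweighted mean would allow.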
Avirulence Genes in Cereal Powdery Mildews: The Gene-for-Gene Hypothesis 2.0.

PubMed

Bourras, Salim; McNally, Kaitlin E; Müller, Marion C; Wicker, Thomas; Keller, Beat

2016-01-01

The gene-for-gene hypothesis states that for each gene controlling resistance in the host, there is a corresponding, specific gene controlling avirulence in the pathogen. Allelic series of the cereal mildew resistance genes Pm3 and Mla provide an excellent system for genetic and molecular analysis of resistance specificity. Despite this opportunity for molecular research, avirulence genes in mildews remain underexplored. Earlier work in barley powdery mildew (B.g. hordei) has shown that the reaction to some Mla resistance alleles is controlled by multiple genes. Similarly, several genes are involved in the specific interaction of wheat mildew (B.g. tritici) with the Pm3 allelic series. We found that two mildew genes control avirulence on Pm3f: one gene is involved in recognition by the resistance protein as demonstrated by functional studies in wheat and the heterologous host Nicotiana benthamiana. A second gene is a suppressor, and resistance is only observed in mildew genotypes combining the inactive suppressor and the recognized Avr. We propose that such suppressor/avirulence gene combinations provide the basis of specificity in mildews. Depending on the particular gene combinations in a mildew race, different genes will be genetically identified as the "avirulence" gene. Additionally, the observation of two LINE retrotransposon-encoded avirulence genes in B.g. hordei further suggests that the control of avirulence in mildew is more complex than a canonical gene-for-gene interaction. To fully understand the mildew-cereal interactions, more knowledge on avirulence determinants is needed and we propose ways how this can be achieved based on recent advances in the field.
Differential replication dynamics for large and small Vibrio chromosomes affect gene dosage, expression and location

PubMed Central

Dryselius, Rikard; Izutsu, Kaori; Honda, Takeshi; Iida, Tetsuya

2008-01-01

Background Replication of bacterial chromosomes increases copy numbers of genes located near origins of replication relative to genes located near termini. Such differential gene dosage depends on replication rate, doubling time and chromosome size. Although little explored, differential gene dosage may influence both gene expression and location. For vibrios, a diverse family of fast growing gammaproteobacteria, gene dosage may be particularly important as they harbor two chromosomes of different size. Results Here we examined replication dynamics and gene dosage effects for the separate chromosomes of three Vibrio species. We also investigated locations for specific gene types within the genome. The results showed consistently larger gene dosage differences for the large chromosome which also initiated replication long before the small. Accordingly, large chromosome gene expression levels were generally higher and showed an influence from gene dosage. This was reflected by a higher abundance of growth essential and growth contributing genes of which many locate near the origin of replication. In contrast, small chromosome gene expression levels were low and appeared independent of gene dosage. Also, species specific genes are highly abundant and an over-representation of genes involved in transcription could explain its gene dosage independent expression. Conclusion Here we establish a link between replication dynamics and differential gene dosage on one hand and gene expression levels and the location of specific gene types on the other. For vibrios, this relationship appears connected to a polarisation of genetic content between its chromosomes, which may both contribute to and be enhanced by an improved adaptive capacity. PMID:19032792
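The stated dependence of differential gene dosage on replication rate, doubling time and chromosome size can be illustrated with the standard steady-state relation for exponentially growing bacteria, in which the origin-to-terminus copy ratio is 2^(C/τ), with C the time needed to replicate the chromosome and τ the doubling time. This relation is a textbook assumption brought in for illustration and is not taken from the abstract; the chromosome sizes, fork rate and doubling time below are hypothetical, and both chromosomes are assumed to replicate bidirectionally from a single origin at the same fork speed.

```python
def replication_time(chrom_size_bp, fork_rate_bp_per_min):
    """Minutes to replicate a circular chromosome with two forks (half each)."""
    return (chrom_size_bp / 2) / fork_rate_bp_per_min

def relative_dosage(x, c_minutes, doubling_minutes):
    """Copy number of a gene relative to the terminus.
    x is the fractional distance from origin (0.0) to terminus (1.0)."""
    return 2 ** (c_minutes * (1 - x) / doubling_minutes)

# Hypothetical parameters: a 3.0 Mb and a 1.0 Mb chromosome, same fork speed.
FORK_RATE = 50_000        # bp per minute per fork (assumed)
DOUBLING = 20.0           # minutes (assumed)

c_large = replication_time(3_000_000, FORK_RATE)   # 30 minutes
c_small = replication_time(1_000_000, FORK_RATE)   # 10 minutes

ori_ter_large = relative_dosage(0.0, c_large, DOUBLING)
ori_ter_small = relative_dosage(0.0, c_small, DOUBLING)
```

Under these assumptions the larger chromosome shows the steeper origin-to-terminus gradient, which agrees in direction with the larger gene dosage differences reported for the large chromosome.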
Avirulence Genes in Cereal Powdery Mildews: The Gene-for-Gene Hypothesis 2.0

PubMed Central

Bourras, Salim; McNally, Kaitlin E.; Müller, Marion C.; Wicker, Thomas; Keller, Beat

2016-01-01

The gene-for-gene hypothesis states that for each gene controlling resistance in the host, there is a corresponding, specific gene controlling avirulence in the pathogen. Allelic series of the cereal mildew resistance genes Pm3 and Mla provide an excellent system for genetic and molecular analysis of resistance specificity. Despite this opportunity for molecular research, avirulence genes in mildews remain underexplored. Earlier work in barley powdery mildew (B.g. hordei) has shown that the reaction to some Mla resistance alleles is controlled by multiple genes. Similarly, several genes are involved in the specific interaction of wheat mildew (B.g. tritici) with the Pm3 allelic series. We found that two mildew genes control avirulence on Pm3f: one gene is involved in recognition by the resistance protein as demonstrated by functional studies in wheat and the heterologous host Nicotiana benthamiana. A second gene is a suppressor, and resistance is only observed in mildew genotypes combining the inactive suppressor and the recognized Avr. We propose that such suppressor/avirulence gene combinations provide the basis of specificity in mildews. Depending on the particular gene combinations in a mildew race, different genes will be genetically identified as the “avirulence” gene. Additionally, the observation of two LINE retrotransposon-encoded avirulence genes in B.g. hordei further suggests that the control of avirulence in mildew is more complex than a canonical gene-for-gene interaction. To fully understand the mildew–cereal interactions, more knowledge on avirulence determinants is needed and we propose ways how this can be achieved based on recent advances in the field. PMID:26973683
Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms

PubMed Central

Li, Zhen; Van de Peer, Yves; De Smet, Riet

2016-01-01

Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of “gene duplicability” is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes. PMID:26744215
Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms.

PubMed

Li, Zhen; Defoort, Jonas; Tasdighian, Setareh; Maere, Steven; Van de Peer, Yves; De Smet, Riet

2016-02-01

Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes.
Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.

PubMed

Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas

2017-01-21

We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses in various tissues, respectively. We tested differential gene expression by an empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also showed that 12.8% and 59.0% of pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% of pairwise studies had independent expression association for genes, but no studies were observed to differ significantly in expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes in different tissues.
Evolution of homeobox genes.

PubMed

Holland, Peter W H

2013-01-01

Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest.
Application of community phylogenetic approaches to understand gene expression: differential exploration of venom gene space in predatory marine gastropods.

PubMed

Chang, Dan; Duda, Thomas F

2014-06-05

Predatory marine gastropods of the genus Conus exhibit substantial variation in venom composition both within and among species. Apart from mechanisms associated with extensive turnover of gene families and rapid evolution of genes that encode venom components ('conotoxins'), the evolution of distinct conotoxin expression patterns is an additional source of variation that may drive interspecific differences in the utilization of species' 'venom gene space'. To determine the evolution of expression patterns of venom genes of Conus species, we evaluated the expression of A-superfamily conotoxin genes of a set of closely related Conus species by comparing recovered transcripts of A-superfamily genes that were previously identified from the genomes of these species. We modified community phylogenetics approaches to incorporate phylogenetic history and disparity of genes and their expression profiles to determine patterns of venom gene space utilization. Less than half of the A-superfamily gene repertoire of these species is expressed, and only a few orthologous genes are coexpressed among species. Species exhibit substantially distinct expression strategies, with some expressing sets of closely related loci ('under-dispersed' expression of available genes) while others express sets of more disparate genes ('over-dispersed' expression). In addition, expressed genes show higher dN/dS values than either unexpressed or ancestral genes; this implies that expression exposes genes to selection and facilitates rapid evolution of these genes. Few recent lineage-specific gene duplicates are expressed simultaneously, suggesting that expression divergence among redundant gene copies may be established shortly after gene duplication. Our study demonstrates that venom gene space is explored differentially by Conus species, a process that effectively permits the independent and rapid evolution of venoms in these species.
Intron-loss evolution of hatching enzyme genes in Teleostei

PubMed Central

2010-01-01

Background Hatching enzyme, belonging to the astacin metallo-protease family, digests egg envelope at embryo hatching. Orthologous genes of the enzyme are found in all vertebrate genomes. Recently, we found that exon-intron structures of the genes were conserved among tetrapods, while the genes of teleosts frequently lost their introns. Occurrence of such intron losses in teleostean hatching enzyme genes is an uncommon evolutionary event, as most eukaryotic genes are generally known to be interrupted by introns and the intron insertion sites are conserved from species to species. Here, we report on extensive studies of the exon-intron structures of teleostean hatching enzyme genes for insight into how and why introns were lost during evolution. Results We investigated the evolutionary pathway of intron-losses in hatching enzyme genes of 27 species of Teleostei. Hatching enzyme genes of basal teleosts are of only one type, which conserves the 9-exon-8-intron structure of an assumed ancestor. On the other hand, otocephalans and euteleosts possess two types of hatching enzyme genes, suggesting a gene duplication event in the common ancestor of otocephalans and euteleosts. The duplicated genes were classified into two clades, clades I and II, based on phylogenetic analysis. In otocephalans and euteleosts, clade I genes developed a phylogeny-specific structure, such as an 8-exon-7-intron, 5-exon-4-intron, 4-exon-3-intron or intron-less structure. In contrast to the clade I genes, the structures of clade II genes were relatively stable in their configuration, and were similar to that of the ancestral genes. Expression analyses revealed that hatching enzyme genes were high-expression genes, when compared to housekeeping genes. When expression levels were compared between clade I and II genes, clade I genes tended to be expressed more highly than clade II genes.
Conclusions Hatching enzyme genes evolved to lose their introns, and the intron-loss events occurred at the specific points of teleostean phylogeny. We propose that the high-expression hatching enzyme genes frequently lost their introns during the evolution of teleosts, while the low-expression genes maintained the exon-intron structure of the ancestral gene. PMID:20796321
Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species

PubMed Central

Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj

2015-01-01

The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows a high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed that more than 85% were expressed genes with corresponding EST matches in the databases. This study placed special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and the rest of the genes have maintained their identity. Syntenic relationships across the three genomes indicated that more orthology is shared between the indica and japonica genomes than with the brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and in understanding the molecular mechanism of disease resistance and their evolution in rice and related species. PMID:25902056
Network-based differential gene expression analysis suggests cell cycle related genes regulated by E2F1 underlie the molecular difference between smoker and non-smoker lung adenocarcinoma

PubMed Central

2013-01-01

Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma.
Conclusions In this paper, we develop nDGE to prioritize deregulated genes and group them into gene modules by simultaneously considering gene expression level changes and gene-gene co-regulations. When applied to both simulated and empirical data, nDGE outperforms the traditional DGE method. More specifically, when applied to smoker and non-smoker lung cancer sets, nDGE results illustrate the molecular differences between smoker and non-smoker lung cancer. PMID:24341432
Initial description of primate-specific cystine-knot Prometheus genes and differential gene expansions of D-dopachrome tautomerase genes

PubMed Central

Premzl, Marko

2015-01-01

Using eutherian comparative genomic analysis protocol and public genomic sequence data sets, the present work attempted to update and revise two gene data sets. The most comprehensive third party annotation gene data sets of eutherian adenohypophysis cystine-knot genes (128 complete coding sequences), and d-dopachrome tautomerases and macrophage migration inhibitory factor genes (30 complete coding sequences) were annotated. For example, the present study first described primate-specific cystine-knot Prometheus genes, as well as differential gene expansions of D-dopachrome tautomerase genes. Furthermore, new frameworks of future experiments of two eutherian gene data sets were proposed. PMID:25941635

Gene function prediction with gene interaction networks: a context graph kernel approach.

PubMed

Li, Xin; Chen, Hsinchun; Li, Jiexun; Zhang, Zhu

2010-01-01

Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels.
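The idea of a gene's context graph can be illustrated with a small sketch. The abstract does not spell out how the context graph is bounded, so the code below assumes it is the sub-network of interactions among genes within a fixed number of hops of the focal gene. The function name `context_graph`, the `radius` parameter and the toy gene names are all hypothetical, and the kernel itself is not reproduced here.

```python
from collections import deque

def context_graph(edges, focal, radius=2):
    """Return the interactions among genes within `radius` hops of `focal`.

    Assumption: "context" means a bounded-hop neighborhood; the published
    method may define it differently.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Breadth-first search that stops expanding at the chosen radius.
    dist = {focal: 0}
    queue = deque([focal])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)

    # Keep only edges whose two endpoints are both inside the neighborhood.
    return {(a, b) for a, b in edges if a in dist and b in dist}

# Toy interaction network (hypothetical gene names).
toy_edges = [("focal", "g1"), ("g1", "g2"), ("g2", "g3"), ("focal", "g4")]
print(sorted(context_graph(toy_edges, "focal", radius=2)))
```

With `radius=2` the indirect interaction `g1`-`g2` is retained while `g2`-`g3` is dropped, which is the kind of indirect information a plain linkage assumption would ignore.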
Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture

PubMed Central

2010-01-01

Background The biological dimensions of genes are manifold. These include genomic properties, (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties on the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes. Results Deviations from the stereotypical gene architecture were classified as the following gene constellations: Overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions. Chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. If this scheme is applied the stereotypical gene emerges as a rare occurrence (7.5%), slightly varied schemes yielded between ~1%-50%. Moreover, when following our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1 and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. 
Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and evolutionary rate Ka/Ks estimates, but these biases did not overwhelm the biologically meaningful observation of high evolutionary rates of male reproductive genes. Conclusion Given the rarity of the solitary stereotypical gene, and the abundance of gene constellations that deviate from it, the presence of gene constellations, while once thought to be exceptional in large Eukaryote genomes, might have broader relevance to the understanding and study of the genome. However, according to our definition, while gene constellations can be significant correlates of functional properties of genes, they generally are weak correlates of the evolution of genes. Thus, the need for their consideration would depend on the context of studies. PMID:20497561
Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.

PubMed

Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun

2017-12-21

Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
Finally, we found biological evidence showing that gene pairs which are not molecularly interacting but are dynamically influential can be considered as novel gene-gene relationships. Taken together, construction and analysis of the GDI network can be a useful approach to identify novel gene-gene relationships in terms of the dynamical influence.
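A rough sketch of the knockout comparison described above, under stated assumptions: the network is updated synchronously, a knockout pins the source gene to False, and influence is taken as the fraction of time steps at which the target gene's state differs between the wild-type and knockout trajectories from one initial state. The exact GDI definition (for example, how initial states are sampled) is not given in the abstract, and the names `simulate`, `influence` and the three-gene toy network are hypothetical.

```python
def simulate(rules, state, steps, knockout=None):
    """Synchronously update a Boolean network and return its state trajectory.

    A knocked-out gene is pinned to False at every step (an assumption).
    """
    state = dict(state)
    if knockout is not None:
        state[knockout] = False
    trajectory = [dict(state)]
    for _ in range(steps):
        nxt = {gene: rule(state) for gene, rule in rules.items()}
        if knockout is not None:
            nxt[knockout] = False
        state = nxt
        trajectory.append(dict(state))
    return trajectory

def influence(rules, state, source, target, steps=5):
    """Fraction of time steps where `target` differs after knocking out `source`."""
    wild = simulate(rules, state, steps)
    mutant = simulate(rules, state, steps, knockout=source)
    differing = sum(w[target] != m[target] for w, m in zip(wild, mutant))
    return differing / len(wild)

# Toy chain A -> B -> C: A and C never interact directly.
toy_rules = {
    "A": lambda s: s["A"],  # A sustains itself
    "B": lambda s: s["A"],  # B is activated by A
    "C": lambda s: s["B"],  # C is activated by B
}
toy_init = {"A": True, "B": True, "C": True}

print(influence(toy_rules, toy_init, "A", "C"))  # A influences C indirectly
print(influence(toy_rules, toy_init, "C", "A"))  # C has no effect on A
```

In this toy case the pair (A, C) is dynamically influencing but molecularly non-interacting, the kind of relation the abstract reports as making the GDI network denser than the GMI network.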
GeneMesh: a web-based microarray analysis tool for relating differentially expressed genes to MeSH terms.

PubMed

Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott

2010-04-01

An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. 
The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
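The Z score normalization used for heat maps can be sketched as follows. This is a generic per-gene normalization, not GeneMesh's code; whether the tool uses the population or the sample standard deviation is not stated, so the population form is an assumption, and `zscore_rows` is a hypothetical name.

```python
from statistics import mean, pstdev

def zscore_rows(matrix):
    """Z score-normalize each row (one gene across samples).

    Assumption: population standard deviation; rows with no variation
    are mapped to zeros to avoid division by zero.
    """
    normalized = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)
        normalized.append([0.0 if sd == 0 else (x - mu) / sd for x in row])
    return normalized

# Hypothetical expression intensities: two genes, three samples.
toy_matrix = [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
for row in zscore_rows(toy_matrix):
    print([round(v, 3) for v in row])
```

Normalizing per gene puts genes with very different absolute intensities on a common color scale, which is what makes a heat map of a gene cluster readable.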
Identification of Human HK Genes and Gene Expression Regulation Study in Cancer from Transcriptomics Data Analysis

PubMed Central

Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun

2013-01-01

The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets coming from RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in gene expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (an HK gene that is specific to the cancer group but not the normal group) are expressed at higher and more variable levels in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich genes, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand which cell type-specific patterns of gene expression differ among different cell types, and particularly for cancer. PMID:23382867
FunGene: the functional gene pipeline and repository.

PubMed

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Cytokine-related genes and oxidation-related genes detected in preeclamptic placentas.

PubMed

Lee, Gui Se Ra; Joe, Yoon Seong; Kim, Sa Jin; Shin, Jong Chul

2010-10-01

To investigate cytokine- and oxidation-related genes for preeclampsia using DNA microarray analysis. Placentas were collected from 13 normal pregnancies and 13 patients with preeclampsia. Gene expression was studied using DNA microarray. Among significantly expressed genes, we focused on genes associated with cytokines and oxidation, and the results were confirmed using quantitative real time-polymerase chain reaction (QRT-PCR). 415 genes out of 30,940 genes were altered by ≥2-fold in the microarray analysis. 121 up-regulated genes and 294 down-regulated genes were found in preeclamptic placenta. Six cytokine-related genes and 5 oxidation-related genes were found from among the 121 up-regulated genes. The cytokine-related genes studied included oncostatin M (OSM), fms-related tyrosine kinase (FLT1) and vascular endothelial growth factor A (VEGFA), and the oxidation-related genes studied included spermine oxidase (SMOX), cytochrome P450, family 26, subfamily A, polypeptide 1 (CYP26A1), and lactate dehydrogenase A (LDHA). These six genes were also significantly higher in placentas from patients with preeclampsia than in those from women with normal pregnancies. The placental tissue of patients with preeclampsia showed significantly higher mRNA expression of these six genes than the normal group, using QRT-PCR. DNA microarray analysis is an effective method for simultaneously detecting functionally associated genes of preeclampsia. The cytokine-related genes such as OSM, FLT1 and VEGFA, and the oxidation-related genes such as LDHA, CYP26A1 and SMOX might prove to be the starting point in the elucidation of the pathogenesis of preeclampsia.
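The 2-fold cutoff mentioned above can be illustrated with a minimal sketch. It assumes one mean expression value per gene and group on a linear (not log) scale; the function name `split_by_fold_change` and the gene names and values are hypothetical and are not data from the study.

```python
def split_by_fold_change(normal, disease, threshold=2.0):
    """Split genes into up- and down-regulated lists by fold change.

    Assumption: `normal` and `disease` map gene -> positive mean
    expression on a linear scale.
    """
    up, down = [], []
    for gene, baseline in normal.items():
        ratio = disease[gene] / baseline
        if ratio >= threshold:
            up.append(gene)
        elif ratio <= 1.0 / threshold:
            down.append(gene)
    return up, down

# Hypothetical mean intensities.
toy_normal = {"geneA": 100.0, "geneB": 400.0, "geneC": 250.0}
toy_disease = {"geneA": 260.0, "geneB": 150.0, "geneC": 300.0}

up_genes, down_genes = split_by_fold_change(toy_normal, toy_disease)
print("up:", up_genes, "down:", down_genes)
```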
Using RNA-seq data to select reference genes for normalizing gene expression in apple roots.

PubMed

Zhou, Zhe; Cong, Peihua; Tian, Yi; Zhu, Yanmin

2017-01-01

Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes were evaluated for their potential use as reliable reference genes. These genes were selected based on their low variance of gene expression in apple root tissues from a recent RNA-seq data set, and a few previously reported apple reference genes for other tissue types. Four methods, Delta Ct, geNorm, NormFinder and BestKeeper, were used to evaluate their stability in apple root tissues of various genotypes and under different experimental conditions. A small panel of stably expressed genes, MDP0000095375, MDP0000147424, MDP0000233640, MDP0000326399 and MDP0000173025 were recommended for normalizing quantitative gene expression data in apple roots under various abiotic or biotic stresses. When the most stable and least stable reference genes were used for data normalization, significant differences were observed on the expression patterns of two target genes, MdLecRLK5 (MDP0000228426, a gene encoding a lectin receptor like kinase) and MdMAPK3 (MDP0000187103, a gene encoding a mitogen-activated protein kinase). Our data also indicated that for those carefully validated reference genes, a single reference gene is sufficient for reliable normalization of the quantitative gene expression. Depending on the experimental conditions, the most suitable reference genes can be specific to the sample of interest for more reliable RT-qPCR data normalization.
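Of the four stability methods named above, the comparative Delta Ct approach is the simplest to sketch. The version below follows the commonly described form of that method: for each candidate gene, take the standard deviation of its Ct difference against every other candidate across samples, then average those standard deviations, with lower values meaning more stable expression. It is an illustration only; the gene labels and Ct values are hypothetical and `delta_ct_stability` is not a function from any of the named tools.

```python
from statistics import mean, stdev

def delta_ct_stability(ct):
    """Mean standard deviation of pairwise Ct differences for each gene.

    `ct` maps gene -> list of Ct values, one per sample, in the same
    sample order for every gene. Lower scores suggest higher stability.
    """
    scores = {}
    for gene, values in ct.items():
        pairwise_sds = []
        for other, other_values in ct.items():
            if other == gene:
                continue
            diffs = [a - b for a, b in zip(values, other_values)]
            pairwise_sds.append(stdev(diffs))
        scores[gene] = mean(pairwise_sds)
    return scores

# Hypothetical Ct values for three candidate genes over four samples.
toy_ct = {
    "ref1": [20.0, 21.0, 20.0, 21.0],
    "ref2": [22.0, 23.0, 22.0, 23.0],   # tracks ref1 exactly
    "noisy": [20.0, 25.0, 18.0, 27.0],  # varies independently
}

for gene, score in sorted(delta_ct_stability(toy_ct).items(), key=lambda kv: kv[1]):
    print(gene, round(score, 3))
```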
Using RNA-seq data to select reference genes for normalizing gene expression in apple roots

PubMed Central

Zhou, Zhe; Cong, Peihua; Tian, Yi

2017-01-01

Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes were evaluated for their potential use as reliable reference genes. These genes were selected based on their low variance of gene expression in apple root tissues from a recent RNA-seq data set, and a few previously reported apple reference genes for other tissue types. Four methods, Delta Ct, geNorm, NormFinder and BestKeeper, were used to evaluate their stability in apple root tissues of various genotypes and under different experimental conditions. A small panel of stably expressed genes, MDP0000095375, MDP0000147424, MDP0000233640, MDP0000326399 and MDP0000173025 were recommended for normalizing quantitative gene expression data in apple roots under various abiotic or biotic stresses. When the most stable and least stable reference genes were used for data normalization, significant differences were observed on the expression patterns of two target genes, MdLecRLK5 (MDP0000228426, a gene encoding a lectin receptor like kinase) and MdMAPK3 (MDP0000187103, a gene encoding a mitogen-activated protein kinase). Our data also indicated that for those carefully validated reference genes, a single reference gene is sufficient for reliable normalization of the quantitative gene expression. Depending on the experimental conditions, the most suitable reference genes can be specific to the sample of interest for more reliable RT-qPCR data normalization. PMID:28934340
Bioinformatic prediction of leader genes in human periodontitis.

PubMed

Covani, Ugo; Marconcini, Simone; Giacomelli, Luca; Sivozhelevov, Victor; Barone, Antonio; Nicolini, Claudio

2008-10-01

Genes involved in different biologic processes form complex interaction networks. However, only a few have a high number of interactions with the other genes in the network. In previous bioinformatics and experimental studies concerning the T lymphocyte cell cycle, these genes were identified and termed "leader genes." In this work, genes involved in human periodontitis were tentatively identified and ranked according to their number of interactions to obtain a preliminary, broader view of molecular mechanisms of periodontitis and plan targeted experimentation. Genes were identified with interrelated queries of several databases. The interactions among these genes were mapped and given a significance score. The weighted number of links (weighted sum of scores for every interaction in which the given gene is involved) was calculated for each gene. Genes were clustered according to this parameter. The genes in the highest cluster were termed leader genes. Sixty-one genes involved or potentially involved in periodontitis were identified. Only five were identified as leader genes, whereas 12 others were ranked in an immediately lower cluster. For 10 of 17 genes there is evidence of involvement in periodontitis; seven new genes that are potentially involved in this disease were identified. The involvement in periodontitis has been completely established for only two leader genes. We applied a validated bioinformatics algorithm to increase our knowledge of molecular mechanisms of periodontitis. Even with the limitations of this ab initio analysis, this theoretical study can suggest ad hoc experimentation targeted on significant genes and, therefore, simpler than mass-scale molecular genomics. Moreover, the identification of leader genes might suggest new potential risk factors and therapeutic targets.
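The weighted number of links defined above (the sum of significance scores over every interaction a gene takes part in) can be computed in a few lines. The sketch covers only that scoring step, not the database queries or the clustering that assigns leader status; the gene names, scores and function name are hypothetical.

```python
from collections import defaultdict

def weighted_links(interactions):
    """Sum interaction scores per gene.

    `interactions` is an iterable of (gene_a, gene_b, score) tuples;
    each score counts toward both genes in the pair.
    """
    totals = defaultdict(float)
    for gene_a, gene_b, score in interactions:
        totals[gene_a] += score
        totals[gene_b] += score
    return dict(totals)

# Hypothetical scored interactions.
toy_interactions = [
    ("g1", "g2", 0.9),
    ("g1", "g3", 0.8),
    ("g1", "g4", 0.7),
    ("g2", "g3", 0.4),
]

ranking = sorted(weighted_links(toy_interactions).items(), key=lambda kv: -kv[1])
for gene, total in ranking:
    print(gene, round(total, 2))
```

Ranking by this total is only a stand-in for the clustering step; in the study, genes in the highest cluster of this parameter were the ones termed leader genes.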
A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

PubMed

Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

2015-01-01

Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.
A Hybrid Approach of Gene Sets and Single Genes for the Prediction of Survival Risks with Gene Expression Data

PubMed Central

Seok, Junhee; Davis, Ronald W.; Xiao, Wenzhong

2015-01-01

Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn’t been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge. PMID:25933378
[Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

PubMed

Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

2012-07-01

In order to screen out important genes from large gene microarray data sets after nerve injury, we combined the gene ontology (GO) method and computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of these screened-out genes. Data mining and gene ontology analysis of gene chip data GSE26350 was carried out with MATLAB software. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' different GO terms and positions on the score map of principal components. Function interferences were employed to disrupt the normal binding of Cd44 and one of its ligands, chondroitin sulfate C (CSC), to observe neurite extension. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity and protein binding. Cd44 is one of six effector protein genes, and attracted our attention because of its functional diversity. After adding different reagents to the medium to interfere with the normal binding of CSC and Cd44, varying degrees of relief of CSC's inhibition of neurite extension were observed. CSC can inhibit neurite extension through binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.
Clique-based data mining for related genes in a biomedical database.

PubMed

Matsunaga, Tsutomu; Yonemori, Chikara; Tomita, Etsuji; Muramatsu, Masaaki

2009-07-01

Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph. We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
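Enumerating cliques in a relational graph can be sketched with the standard Bron-Kerbosch algorithm with pivoting. This is a textbook formulation offered for illustration; the abstract does not say which enumeration algorithm or implementation the authors used, and the page names in the toy graph are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph.

    `adj` maps each node to the set of its neighbors (no self-loops).
    Uses Bron-Kerbosch with pivoting.
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        # Choose the pivot with the most neighbors among the candidates.
        pivot = max(candidates | excluded, key=lambda v: len(adj[v] & candidates))
        for v in list(candidates - adj[pivot]):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques

# Toy hyperlink graph: three mutually linked pages plus one extra link.
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

for clique in sorted(sorted(c) for c in maximal_cliques(toy_graph)):
    print(clique)
```

Each maximal clique plays the role of a candidate gene module: a set of pages that all link to one another.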
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

PubMed

Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

2012-08-08

Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
Effect of the absolute statistic on gene-sampling gene-set analysis methods.

PubMed

Nam, Dougu

2017-06-01

Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
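A gene-sampling test with an optional absolute gene statistic might look like the sketch below. It assumes the set score is the mean of the per-gene statistics and that significance comes from comparing against randomly sampled gene sets of the same size; the paper's actual statistics may differ. The function name, parameters and toy data are hypothetical, and the example only shows how the absolute statistic captures bidirectional change, not the false-positive behavior studied in the paper.

```python
import random

def gene_sampling_pvalue(stats, gene_set, n_perm=2000, absolute=True, seed=0):
    """One-sided gene-sampling p-value for a gene set.

    `stats` maps gene -> per-gene statistic (e.g. a t-like score).
    With `absolute=True`, |statistic| is averaged instead of the signed value.
    """
    rng = random.Random(seed)
    score = (lambda g: abs(stats[g])) if absolute else (lambda g: stats[g])
    genes = list(stats)
    size = len(gene_set)
    observed = sum(score(g) for g in gene_set) / size

    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, size)  # random gene set of equal size
        if sum(score(g) for g in sample) / size >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: 100 background genes with small scores, plus a six-gene set
# whose members change strongly in both directions.
toy_stats = {f"bg{i}": ((i % 7) - 3) * 0.1 for i in range(100)}
toy_set = [f"s{i}" for i in range(6)]
for i, gene in enumerate(toy_set):
    toy_stats[gene] = 3.0 if i % 2 == 0 else -3.0

p_abs = gene_sampling_pvalue(toy_stats, toy_set, absolute=True)
p_signed = gene_sampling_pvalue(toy_stats, toy_set, absolute=False)
print("absolute:", p_abs, "signed:", p_signed)
```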
GeneEd—A Genetics Educational Resource | NIH MedlinePlus the Magazine

MedlinePlus

... Feature: Genetics 101 GeneEd: A Genetics Educational Resource Past Issues / Summer 2013 Table of ... GeneEd website as part of her lessons on genetics. A recently developed educational website about genetics: GeneEd. ...
Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

PubMed

Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

2015-01-01

In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
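The random walk with restart mentioned above can be sketched as follows. This is a minimal version on a small undirected network, assuming that "integration" simply means taking the union of the edges of two networks; the restart probability, the node names, and the helper names are hypothetical and are not taken from the paper.

```python
def build_adjacency(*edge_lists):
    """Union of undirected edge lists -> dict node -> set of neighbours."""
    adjacency = {}
    for edges in edge_lists:
        for a, b in edges:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol.

    W is the degree-normalised adjacency matrix, p0 is uniform over the seeds.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its probability.
            inflow = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy networks: two "PPI" edges and one "co-regulation" edge.
ppi_edges = [("A", "B"), ("B", "C")]
coregulation_edges = [("C", "D")]
network = build_adjacency(ppi_edges, coregulation_edges)
scores = random_walk_with_restart(network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes are then ranked by their steady-state score; node D is reachable from the seed only because the two edge lists were merged.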
Status of therapeutic gene transfer to treat cardiovascular disease in dogs and cats.

PubMed

Sleeper, Meg; Bish, Lawrence T; Haskins, Mark; Ponder, Katherine P; Sweeney, H Lee

2011-06-01

Gene therapy is a procedure that transfers one or more genes into an individual's cells to treat a disease; the transferred gene is designed to produce a protein or functional RNA (the gene product). Although most current gene therapy clinical trials focus on cancer and inherited diseases, multiple studies have evaluated the efficacy of gene therapy to abrogate various forms of heart disease. Indeed, human clinical trials are currently underway. One goal of gene transfer may be to express a functional gene when the endogenous gene is inactive. Alternatively, complex diseases such as end stage heart failure are characterized by a number of abnormalities at the cellular level, many of which can be targeted using gene delivery to alter myocardial protein levels. This review will discuss issues related to gene vector systems, gene delivery strategies and two cardiovascular diseases in dogs successfully treated with therapeutic gene delivery. Copyright © 2011 Elsevier B.V. All rights reserved.
[Molecular genetics of functional articulation disorder in children].

PubMed

Zhao, Yun-Jing; Ma, Hong-Wei

2012-04-01

Genetic factors are an important cause of functional articulation disorder in children. This article reviews some genes and chromosome regions associated with a genetic susceptibility to functional articulation disorders. The forkhead box P2 (FOXP2) gene on chromosome 7 is introduced in detail, including its structure, expression and function. The relationship between the FOXP2 gene and developmental apraxia of speech is discussed. As a transcription factor, FOXP2 regulates the expression of many genes. CNTNAP2, an important target gene of FOXP2, is a key gene influencing language development. Functional articulation disorder may develop into dyslexia; therefore, some candidate regions and genes related to dyslexia, such as 3p12-13, 15q11-21, 6p22 and 1p34-36, are also introduced. The ROBO1 gene in 3p12.3, the ZNF280D gene, the TCF12 gene, the EKN1 gene in 15q21, and the KIAA0319 gene in 6p22 have been studied as candidate genes for functional articulation disorder.

Translation-coupling systems

DOEpatents

Pfleger, Brian; Mendez-Perez, Daniel

2013-11-05

Disclosed are systems and methods for coupling translation of a target gene to a detectable response gene. A version of the invention includes a translation-coupling cassette. The translation-coupling cassette includes a target gene, a response gene, a response-gene translation control element, and a secondary structure-forming sequence that reversibly forms a secondary structure masking the response-gene translation control element. Masking of the response-gene translation control element inhibits translation of the response gene. Full translation of the target gene results in unfolding of the secondary structure and consequent translation of the response gene. Translation of the target gene is determined by detecting presence of the response-gene protein product. The invention further includes RNA transcripts of the translation-coupling cassettes, vectors comprising the translation-coupling cassettes, hosts comprising the translation-coupling cassettes, methods of using the translation-coupling cassettes, and gene products produced with the translation-coupling cassettes.
Translation-coupling systems

DOEpatents

Pfleger, Brian; Mendez-Perez, Daniel

2015-05-19

Disclosed are systems and methods for coupling translation of a target gene to a detectable response gene. A version of the invention includes a translation-coupling cassette. The translation-coupling cassette includes a target gene, a response gene, a response-gene translation control element, and a secondary structure-forming sequence that reversibly forms a secondary structure masking the response-gene translation control element. Masking of the response-gene translation control element inhibits translation of the response gene. Full translation of the target gene results in unfolding of the secondary structure and consequent translation of the response gene. Translation of the target gene is determined by detecting presence of the response-gene protein product. The invention further includes RNA transcripts of the translation-coupling cassettes, vectors comprising the translation-coupling cassettes, hosts comprising the translation-coupling cassettes, methods of using the translation-coupling cassettes, and gene products produced with the translation-coupling cassettes.
Identification of essential genes and synthetic lethal gene combinations in Escherichia coli K-12.

PubMed

Mori, Hirotada; Baba, Tomoya; Yokoyama, Katsushi; Takeuchi, Rikiya; Nomura, Wataru; Makishi, Kazuichi; Otsuka, Yuta; Dose, Hitomi; Wanner, Barry L

2015-01-01

Here we describe the systematic identification of single genes and gene pairs whose knockout causes lethality in Escherichia coli K-12. During construction of a precise single-gene knockout library of E. coli K-12, we identified 328 essential gene candidates for growth in complex (LB) medium. Upon establishment of the Keio single-gene deletion library, we undertook the development of the ASKA single-gene deletion library carrying a different antibiotic resistance. In addition, we developed tools for identification of synthetic lethal gene combinations by systematic construction of double-gene knockout mutants. We introduce these methods herein.
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery

PubMed Central

Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

2009-01-01

Background DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. Results GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. Conclusion GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network. PMID:19728865
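To make the nonparametric, rank-based pattern matching mentioned above more concrete, here is a simplified Kolmogorov-Smirnov-style score of the kind used for signature matching against a ranked expression profile. It is a sketch written from the general description only: the function names and gene labels are hypothetical, the significance calculation is omitted, and it should not be read as GEM-TREND's actual implementation.

```python
def ks_score(ranked_genes, tags):
    """Signed KS-like enrichment of `tags` in a list ranked from most
    up-regulated (first) to most down-regulated (last)."""
    n = len(ranked_genes)
    position = {g: i + 1 for i, g in enumerate(ranked_genes)}
    v = sorted(position[g] for g in tags if g in position)
    t = len(v)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v[j] / n for j in range(t))
    b = max(v[j] / n - j / t for j in range(t))
    # Positive when the tags sit near the top, negative near the bottom.
    return a if a > b else -b


def connectivity_score(ranked_genes, up_tags, down_tags):
    """Combine the up- and down-tag scores; zero if they point the same way."""
    ks_up = ks_score(ranked_genes, up_tags)
    ks_down = ks_score(ranked_genes, down_tags)
    if (ks_up > 0) == (ks_down > 0):
        return 0.0
    return ks_up - ks_down


# Hypothetical profile of ten genes and a query signature.
profile = [f"g{i}" for i in range(1, 11)]
up_tags, down_tags = {"g1", "g2"}, {"g9", "g10"}

same_direction = connectivity_score(profile, up_tags, down_tags)
opposite_direction = connectivity_score(profile[::-1], up_tags, down_tags)
```

A profile that matches the signature scores positive, and the reversed profile scores negative by the same magnitude.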
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.

PubMed

Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

2009-09-03

DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.
Genes and Gene Therapy

MedlinePlus

... a child can have a genetic disorder. Gene therapy is an experimental technique that uses genes to ... prevent disease. The most common form of gene therapy involves inserting a normal gene to replace an ...
Cloning and Characterizing Genes Involved in Monoterpene Induced Mammary Tumor Regression

DTIC Science & Technology

1998-05-01

Monoterpene-induced/repressed genes were identified in regressing rat mammary carcinomas treated with dietary limonene using a newly developed method...termed subtractive display. The subtractive display screen identified 42 monoterpene-induced genes comprising 9 known genes and 33 unidentified genes...as well as 58 monoterpene-repressed genes comprising 1 known gene and 57 unidentified genes. Several of the identified differentially expressed
Efficient Exploration of the Space of Reconciled Gene Trees

PubMed Central

Szöllősi, Gergely J.; Rosikiewicz, Wojciech; Boussau, Bastien; Tannier, Eric; Daubin, Vincent

2013-01-01

Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with, respectively, 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family.
The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. [amalgamation; gene tree reconciliation; gene tree reconstruction; lateral gene transfer; phylogeny.] PMID:23925510
Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

PubMed Central

Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

2006-01-01

Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651
Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

PubMed

Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

2016-10-13

In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. 
These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.
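The four-step survey described in this abstract can be sketched roughly as below. The use of the coefficient of variation as the stability measure, the threshold values, and the data layout are assumptions chosen for illustration; the abstract does not specify them.

```python
from statistics import mean, pstdev


def cv(values):
    """Coefficient of variation; infinite when the mean is not positive."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def rank_reference_candidates(experiments, min_mean=10.0,
                              max_cv=0.2, max_cross_cv=0.3):
    """experiments: dict experiment -> dict gene -> list of expression values."""
    stable_counts, experiment_means = {}, {}
    for samples_by_gene in experiments.values():
        for gene, values in samples_by_gene.items():
            experiment_means.setdefault(gene, []).append(mean(values))
            stable_counts.setdefault(gene, 0)
            # Steps 1 and 2: basal level and stability within one experiment.
            if mean(values) >= min_mean and cv(values) <= max_cv:
                stable_counts[gene] += 1
    # Step 3: stability of the per-experiment means across experiments.
    candidates = [g for g, count in stable_counts.items()
                  if count > 0 and cv(experiment_means[g]) <= max_cross_cv]
    # Step 4: rank by the number of experiments with stable expression.
    return sorted(candidates, key=lambda g: (-stable_counts[g], g))


# Hypothetical expression values for three genes in two experiments.
experiments = {
    "exp1": {"geneA": [100, 105, 98], "geneB": [100, 300, 20], "geneC": [2, 2, 2]},
    "exp2": {"geneA": [102, 99, 101], "geneB": [90, 95, 92], "geneC": [3, 2, 3]},
}
ranked_candidates = rank_reference_candidates(experiments)
```

Here geneA is stable in both experiments and ranks first, geneB is stable in only one, and geneC is dropped because its expression level is too low.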
Phylogenetic relatedness determined between antibiotic resistance and 16S rRNA genes in actinobacteria.

PubMed

Sagova-Mareckova, Marketa; Ulanova, Dana; Sanderova, Petra; Omelka, Marek; Kamenik, Zdenek; Olsovska, Jana; Kopecky, Jan

2015-04-01

Distribution and evolutionary history of resistance genes in environmental actinobacteria provide information on intensity of antibiosis and evolution of specific secondary metabolic pathways at a given site. To this day, actinobacteria producing biologically active compounds were isolated mostly from soil but only a limited range of soil environments were commonly sampled. Consequently, soil remains an unexplored environment in search for novel producers and related evolutionary questions. Ninety actinobacteria strains isolated at contrasting soil sites were characterized phylogenetically by 16S rRNA gene, for presence of erm and ABC transporter resistance genes and antibiotic production. An analogous analysis was performed in silico with 246 and 31 strains from Integrated Microbial Genomes (JGI_IMG) database selected by the presence of ABC transporter genes and erm genes, respectively. In the isolates, distances of erm gene sequences were significantly correlated to phylogenetic distances based on 16S rRNA genes, while ABC transporter gene distances were not. The phylogenetic distance of isolates was significantly correlated to soil pH and organic matter content of isolation sites. In the analysis of JGI_IMG datasets the correlation between phylogeny of resistance genes and the strain phylogeny based on 16S rRNA genes or five housekeeping genes was observed for both the erm genes and ABC transporter genes in both actinobacteria and streptomycetes. However, in the analysis of sequences from genomes where both resistance genes occurred together the correlation was observed for both ABC transporter and erm genes in actinobacteria but in streptomycetes only in the erm gene. The type of erm resistance gene sequences was influenced by linkage to 16S rRNA gene sequences and site characteristics. The phylogeny of ABC transporter gene was correlated to 16S rRNA genes mainly above the genus level. 
The results support the concept of new specific secondary metabolite scaffolds occurring more likely in taxonomically distant producers but suggest that the antibiotic selection of gene pools is also influenced by site conditions.
Evaluation of Reference Genes for Normalization of Gene Expression Using Quantitative RT-PCR under Aluminum, Cadmium, and Heat Stresses in Soybean.

PubMed

Gao, Mengmeng; Liu, Yaping; Ma, Xiao; Shuai, Qin; Gai, Junyi; Li, Yan

2017-01-01

Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is widely used to analyze the relative gene expression level; however, the accuracy of qRT-PCR is greatly affected by the stability of reference genes, which is tissue- and environment-dependent. Therefore, choosing the most stable reference gene in a specific tissue and environment is critical to interpret gene expression patterns. Aluminum (Al), cadmium (Cd), and heat stresses are three important abiotic factors limiting soybean (Glycine max) production in southern China. To identify suitable reference genes for normalizing the expression levels of target genes by qRT-PCR in the soybean response to Al, Cd and heat stresses, we studied the expression stability of ten commonly used housekeeping genes in soybean roots and leaves under these three abiotic stresses, using five approaches: BestKeeper, Delta Ct, geNorm, NormFinder and RefFinder. We found that TUA4 is the most stable reference gene in soybean root tips under Al stress. Under Cd stress, Fbox and UKN2 are the most stable reference genes in roots and leaves, respectively, while 60S is the most suitable reference gene when analyzing both roots and leaves together. For heat stress, TUA4 and UKN2 are the most stable housekeeping genes in roots and leaves, respectively, and UKN2 is the best reference gene for analysis of roots and leaves together. To validate the reference genes, we quantified the relative expression levels of six target genes that were involved in the soybean response to Al, Cd or heat stresses, respectively. The expression patterns of these target genes differed between using the most and least stable reference genes, suggesting that the selection of a suitable reference gene is critical for gene expression studies.
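A small worked example shows why the choice of reference gene matters. It uses the common 2^(-ΔΔCt) calculation, which assumes roughly 100% amplification efficiency; the abstract does not say which quantification formula the authors used, and all Ct values below are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) calculation."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target gene amplifies two cycles earlier under stress.
target_treated, target_control = 22.0, 24.0

# A stable reference gene has the same Ct in both conditions.
fold_stable_ref = relative_expression(target_treated, 18.0, target_control, 18.0)

# An unstable reference gene that is itself induced by the stress
# (Ct drops from 18 to 16) hides the change in the target gene.
fold_unstable_ref = relative_expression(target_treated, 16.0, target_control, 18.0)
```

With the stable reference the target appears four-fold induced, whereas the unstable reference yields a fold change of one, which is the kind of discrepancy the validation step in the abstract is meant to expose.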
Prevalent Role of Gene Features in Determining Evolutionary Fates of Whole-Genome Duplication Duplicated Genes in Flowering Plants

PubMed Central

Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi

2013-01-01

The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs. PMID:23396833
Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

PubMed Central

2012-01-01

Background Previous studies on tumor classification based on gene expression profiles suggest that gene selection plays a key role in improving the classification performance. Moreover, finding important tumor-related genes with the highest accuracy is a very important task because these genes might serve as tumor biomarkers, which is of great benefit to not only tumor molecular diagnosis but also drug development. Results This paper proposes a novel gene selection method with rich biomedical meaning based on Heuristic Breadth-first Search Algorithm (HBSA) to find as many optimal gene subsets as possible. Due to the curse of dimensionality, this type of method could suffer from over-fitting and selection bias problems. To address these potential problems, an HBSA-based ensemble classifier is constructed using a majority voting strategy from individual classifiers constructed by the selected gene subsets, and a novel HBSA-based gene ranking method is designed to find important tumor-related genes by measuring the significance of genes using their occurrence frequencies in the selected gene subsets. The experimental results on nine tumor datasets including three pairs of cross-platform datasets indicate that the proposed method can not only obtain better generalization performance but also find many important tumor-related genes. Conclusions It is found that the frequencies of the selected genes follow a power-law distribution, indicating that only a few top-ranked genes can be used as potential diagnosis biomarkers. Moreover, the top-ranked genes leading to very high prediction accuracy are closely related to specific tumor subtypes and are even hub genes. Compared with other related methods, the proposed method can achieve higher prediction accuracy with fewer genes. Moreover, the results are further justified by analyzing the top-ranked genes in the context of individual gene function, biological pathway, and protein-protein interaction network. PMID:22830977
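The two post-processing ideas named in this abstract, ranking genes by how often they occur in the selected subsets and combining per-subset classifiers by majority vote, can be sketched in a few lines. The heuristic breadth-first search itself is not reproduced here; the gene subsets, labels, and function names are hypothetical.

```python
from collections import Counter


def rank_genes_by_frequency(gene_subsets):
    """Rank genes by the number of selected subsets they occur in."""
    counts = Counter(gene for subset in gene_subsets for gene in set(subset))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical output of a subset search: three small gene subsets.
selected_subsets = [
    ["geneA", "geneB"],
    ["geneA", "geneC"],
    ["geneA", "geneB", "geneD"],
]
gene_ranking = rank_genes_by_frequency(selected_subsets)

# Hypothetical labels predicted for one sample by three subset classifiers.
ensemble_label = majority_vote(["tumor", "normal", "tumor"])
```

In this toy input geneA occurs in every subset and tops the ranking, and the ensemble labels the sample "tumor" by two votes to one.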
Frequent loss of lineages and deficient duplications accounted for low copy number of disease resistance genes in Cucurbitaceae

PubMed Central

2013-01-01

Background The sequenced genomes of cucumber, melon and watermelon have relatively few R-genes, with 70, 75 and 55 copies only, respectively. The mechanism for the low copy number of R-genes in Cucurbitaceae genomes remains unknown. Results Manual annotation of R-genes in the sequenced genomes of Cucurbitaceae species showed that approximately half of them are pseudogenes. Comparative analysis of R-genes showed frequent loss of R-gene loci in different Cucurbitaceae species. Phylogenetic analysis, data mining and PCR cloning using degenerate primers indicated that Cucurbitaceae has a limited number of R-gene lineages (subfamilies). Comparison between R-genes from Cucurbitaceae and those from poplar and soybean suggested frequent loss of R-gene lineages in Cucurbitaceae. Furthermore, the average number of R-genes per lineage in Cucurbitaceae species is approximately 1/3 that in soybean or poplar. Therefore, both loss of lineages and deficient duplications in extant lineages accounted for the low copy number of R-genes in Cucurbitaceae. No extensive chimeras of R-genes were found in any of the sequenced Cucurbitaceae genomes. Nevertheless, one lineage of R-genes from Trichosanthes kirilowii, a wild Cucurbitaceae species, exhibits chimeric structures caused by gene conversions, and may contain a large number of distinct R-genes in natural populations. Conclusions Cucurbitaceae species have a limited number of R-gene lineages and each genome harbors relatively few R-genes. The scarcity of R-genes in Cucurbitaceae species was due to frequent loss of R-gene lineages and infrequent duplications in extant lineages. The evolutionary mechanisms for the large variation in copy number of R-genes among plant species are discussed. PMID:23682795
Cooperative Adaptive Responses in Gene Regulatory Networks with Many Degrees of Freedom

PubMed Central

Inoue, Masayo; Kaneko, Kunihiko

2013-01-01

Cells generally adapt to environmental changes by first exhibiting an immediate response and then gradually returning to their original state to achieve homeostasis. Although simple network motifs consisting of a few genes have been shown to exhibit such adaptive dynamics, they do not reflect the complexity of real cells, where the expression of a large number of genes activates or represses other genes, permitting adaptive behaviors. Here, we investigated the responses of gene regulatory networks containing many genes that have undergone numerical evolution to achieve high fitness due to the adaptive response of only a single target gene; this single target gene responds to changes in external inputs and later returns to basal levels. Despite setting a single target, most genes showed adaptive responses after evolution. Such adaptive dynamics were not due to common motifs within a few genes; even without such motifs, almost all genes showed adaptation, albeit sometimes partial adaptation, in the sense that expression levels did not always return to original levels. The genes split into two groups: genes in the first group exhibited an initial increase in expression and then returned to basal levels, while genes in the second group exhibited the opposite changes in expression. From this model, genes in the first group received positive input from other genes within the first group, but negative input from genes in the second group, and vice versa. Thus, the adaptation dynamics of genes from both groups were consolidated. This cooperative adaptive behavior was commonly observed if the number of genes involved was larger than the order of ten. These results have implications in the collective responses of gene expression networks in microarray measurements of yeast Saccharomyces cerevisiae and the significance to the biological homeostasis of systems with many components. PMID:23592959
Systematic Analysis and Comparison of Nucleotide-Binding Site Disease Resistance Genes in a Diploid Cotton Gossypium raimondii

PubMed Central

Wei, Hengling; Li, Wei; Sun, Xiwei; Zhu, Shuijin; Zhu, Jun

2013-01-01

Plant disease resistance genes are a key component of defending plants from a range of pathogens. The majority of these resistance genes belong to the super-family that harbors a Nucleotide-binding site (NBS). A number of studies have focused on NBS-encoding genes in disease resistant breeding programs for diverse plants. However, little information has been reported with an emphasis on systematic analysis and comparison of NBS-encoding genes in cotton. To fill this knowledge gap, in this study, we identified and investigated the NBS-encoding resistance genes in cotton using the whole genome sequence information of Gossypium raimondii. In total, 355 NBS-encoding resistance genes were identified. Analyses of the conserved motifs and structural diversity showed that the two most distinct features of these genes are the high proportion of non-regular NBS genes and the high diversity of N-termini domains. Analyses of the physical locations and duplications of NBS-encoding genes showed that gene duplication of disease resistance genes could play an important role in cotton by leading to an increase in the functional diversity of the cotton NBS-encoding genes. Analyses of phylogenetic comparisons indicated that, in cotton, the NBS-encoding genes with TIR domain not only have their own evolution pattern different from those of genes without TIR domain, but also have their own species-specific pattern that differs from those of TIR genes in other plants. Analyses of the correlation between disease resistance QTL and NBS-encoding resistance genes showed that more than half of the disease resistance QTL could be associated with the NBS-encoding genes in cotton, which agrees with previous studies establishing that more than half of plant resistance genes are NBS-encoding genes. PMID:23936305
Functional Analysis of Mating Type Genes and Transcriptome Analysis during Fruiting Body Development of Botrytis cinerea

PubMed Central

2018-01-01

Botrytis cinerea is a plant-pathogenic fungus producing apothecia as sexual fruiting bodies. To study the function of mating type (MAT) genes, single-gene deletion mutants were generated in both genes of the MAT1-1 locus and both genes of the MAT1-2 locus. Deletion mutants in two MAT genes were entirely sterile, while mutants in the other two MAT genes were able to develop stipes but never formed an apothecial disk. Little was known about the reprogramming of gene expression during apothecium development. We analyzed transcriptomes of sclerotia, three stages of apothecium development (primordia, stipes, and apothecial disks), and ascospores by RNA sequencing. Ten secondary metabolite gene clusters were upregulated at the onset of sexual development and downregulated in ascospores released from apothecia. Notably, more than 3,900 genes were differentially expressed in ascospores compared to mature apothecial disks. Among the genes that were upregulated in ascospores were numerous genes encoding virulence factors, which reveals that ascospores are transcriptionally primed for infection prior to their arrival on a host plant. Strikingly, the massive transcriptional changes at the initiation and completion of the sexual cycle often affected clusters of genes, rather than randomly dispersed genes. Thirty-five clusters of genes were jointly upregulated during the onset of sexual reproduction, while 99 clusters of genes (comprising >900 genes) were jointly downregulated in ascospores. These transcriptional changes coincided with changes in expression of genes encoding enzymes participating in chromatin organization, hinting at the occurrence of massive epigenetic regulation of gene expression during sexual reproduction. PMID:29440571
Transcriptional Coupling of Neighboring Genes and Gene Expression Noise: Evidence that Gene Orientation and Noncoding Transcripts Are Modulators of Noise

PubMed Central

Wang, Guang-Zhong; Lercher, Martin J.; Hurst, Laurence D.

2011-01-01

How is noise in gene expression modulated? Do mechanisms of noise control impact genome organization? In yeast, the expression of one gene can affect that of a very close neighbor. As the effect is highly regionalized, we hypothesize that genes in different orientations will have differing degrees of coupled expression and, in turn, different noise levels. Divergently organized gene pairs, in particular those with bidirectional promoters, have close promoters, maximizing the likelihood that expression of one gene affects the neighbor. With more distant promoters, the same is less likely to hold for gene pairs in nondivergent orientation. Stochastic models suggest that coupled chromatin dynamics will typically result in low abundance-corrected noise (ACN). We thus hypothesize that transcription of noncoding RNA (ncRNA) from a bidirectional promoter is a noise-reduction, expression-priming mechanism. The hypothesis correctly predicts that protein-coding genes with a bidirectional promoter, including those with a ncRNA partner, have lower ACN than other genes and divergent gene pairs uniquely have correlated ACN. Moreover, as predicted, ACN increases with the distance between promoters. The model also correctly predicts ncRNA transcripts to be often divergently transcribed from genes that a priori would be under selection for low noise (essential genes, protein complex genes) and that the latter genes should commonly reside in divergent orientation. Likewise, that genes with bidirectional promoters are rare subtelomerically, cluster together, and are enriched in essential gene clusters is expected and observed. We conclude that gene orientation and transcription of ncRNAs are candidate modulators of noise. PMID:21402863
GSEH: A Novel Approach to Select Prostate Cancer-Associated Genes Using Gene Expression Heterogeneity.

PubMed

Kim, Hyunjin; Choi, Sang-Min; Park, Sanghyun

2018-01-01

When a gene shows varying levels of expression among normal people but similar levels in disease patients or shows similar levels of expression among normal people but different levels in disease patients, we can assume that the gene is associated with the disease. By utilizing this gene expression heterogeneity, we can obtain additional information that abets discovery of disease-associated genes. In this study, we used collaborative filtering to calculate the degree of gene expression heterogeneity between classes and then scored the genes on the basis of the degree of gene expression heterogeneity to find "differentially predicted" genes. Through the proposed method, we discovered more prostate cancer-associated genes than 10 comparable methods. The genes prioritized by the proposed method are potentially significant to biological processes of a disease and can provide insight into them.

Gene length as a biological timer to establish temporal transcriptional regulation

PubMed Central

Kirkconnell, Killeen S.; Magnuson, Brian; Paulsen, Michelle T.; Lu, Brian; Bedi, Karan; Ljungman, Mats

2017-01-01

Transcriptional timing is inherently influenced by gene length, thus providing a mechanism for temporal regulation of gene expression. While gene size has been shown to be important for the expression timing of specific genes during early development, whether it plays a role in the timing of other global gene expression programs has not been extensively explored. Here, we investigate the role of gene length during the early transcriptional response of human fibroblasts to serum stimulation. Using the nascent sequencing techniques Bru-seq and BruUV-seq, we identified immediate genome-wide transcriptional changes following serum stimulation that were linked to rapid activation of enhancer elements. We identified 873 significantly induced and 209 significantly repressed genes. Variations in gene size allowed for a large group of genes to be simultaneously activated but produce full-length RNAs at different times. The median length of the group of serum-induced genes was significantly larger than the median length of all expressed genes, housekeeping genes, and serum-repressed genes. These gene length relationships were also observed in corresponding mouse orthologs, suggesting that relative gene size is evolutionarily conserved. The sizes of transcription factor and microRNA genes immediately induced after serum stimulation varied dramatically, setting up a cascade mechanism for temporal expression arising from a single activation event. The retention and expansion of large intronic sequences during evolution have likely played important roles in fine-tuning the temporal expression of target genes in various cellular response programs. PMID:28055303
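The timing argument in this abstract can be illustrated with simple arithmetic: if RNA polymerase elongates at a roughly constant rate, the delay before a full-length transcript appears grows linearly with gene length. The following is a minimal sketch under stated assumptions. The elongation rate of 1.5 kb/min and the three gene lengths are illustrative values chosen here, not figures from the study.

```python
def minutes_to_full_length(gene_length_bp, elongation_rate_bp_per_min=1500.0):
    """Time for one polymerase to traverse a gene at a constant rate.

    The default rate (1.5 kb/min) is an assumed, illustrative value.
    """
    if gene_length_bp < 0:
        raise ValueError("gene length must be non-negative")
    if elongation_rate_bp_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return gene_length_bp / elongation_rate_bp_per_min


# Hypothetical genes activated at the same moment (lengths in base pairs).
gene_lengths = {"short_gene": 3_000, "medium_gene": 60_000, "long_gene": 600_000}

# Simultaneous activation, staggered completion: the longer the gene,
# the later its first full-length RNA.
completion_minutes = {
    name: minutes_to_full_length(length) for name, length in gene_lengths.items()
}

for name, minutes in sorted(completion_minutes.items(), key=lambda kv: kv[1]):
    print(f"{name}: first full-length RNA after about {minutes:.0f} min")
```

Under these assumptions a single activation event yields products spread over minutes to hours, which is the cascade effect the abstract describes.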
Evolution of Prdm Genes in Animals: Insights from Comparative Genomics

PubMed Central

Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre

2016-01-01

Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes. PMID:26560352
Identification and function analysis of contrary genes in Dupuytren's contracture.

PubMed

Ji, Xianglu; Tian, Feng; Tian, Lijie

2015-07-01

The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples, derived from fibroblasts and six healthy control samples, derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.
Reranking candidate gene models with cross-species comparison for improved gene prediction

PubMed Central

Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S

2008-01-01

Background: Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results: We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion: Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
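The reranking step described in this abstract can be sketched as follows: each of the K candidate models for a locus keeps its original gene-finder score and gains a cross-species evidence score, and the candidates are re-sorted on the combination. The sketch is a minimal illustration, not the authors' scoring rule. The additive combination, the `similarity_weight` parameter, the `Candidate` type and all numeric values are assumptions made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate gene model for a locus (hypothetical structure)."""
    name: str
    finder_score: float         # score from the original gene finder
    ortholog_similarity: float  # similarity to a putative ortholog, 0..1


def rerank(candidates, similarity_weight=1.0):
    """Sort candidates by finder score plus weighted ortholog similarity.

    A simple additive rule is assumed here for illustration only.
    """
    return sorted(
        candidates,
        key=lambda c: c.finder_score + similarity_weight * c.ortholog_similarity,
        reverse=True,
    )


# Toy K-best list: model_1 has the best finder score, but model_2
# matches the putative ortholog far better.
k_best = [
    Candidate("model_1", finder_score=0.90, ortholog_similarity=0.20),
    Candidate("model_2", finder_score=0.85, ortholog_similarity=0.95),
    Candidate("model_3", finder_score=0.60, ortholog_similarity=0.30),
]

reranked = rerank(k_best)
print([c.name for c in reranked])
```

Setting `similarity_weight=0` recovers the original gene-finder ranking, so the comparative evidence only ever acts as a post-processing layer on top of a ranked candidate list.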
Mouse Vk gene classification by nucleic acid sequence similarity.

PubMed

Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

1989-01-01

Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
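Grouping genes into families whose members share more than 80% nucleic acid sequence similarity can be illustrated with a small threshold-clustering sketch. The abstract does not state the clustering procedure, so single-linkage grouping is an assumption here, as are the pre-aligned toy sequences and the gene names `Vk_a` to `Vk_d`.

```python
def identity(seq_a, seq_b):
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def families(sequences, threshold=0.8):
    """Single-linkage grouping: link two genes if identity > threshold."""
    names = list(sequences)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if identity(sequences[first], sequences[second]) > threshold:
                parent[find(first)] = find(second)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical aligned fragments, ten nucleotides each.
toy_genes = {
    "Vk_a": "ACGTACGTAC",
    "Vk_b": "ACGTACGTAA",
    "Vk_c": "TTTTGGGGCC",
    "Vk_d": "TTTTGGGGCA",
}

print(families(toy_genes))
```

Single linkage can chain distinct groups together when members of different families are very similar over large portions, which is the kind of ambiguity the abstract reports for several Vk families.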
From Interaction to Co-Association —A Fisher r-To-z Transformation-Based Simple Statistic for Real World Genome-Wide Association Study

PubMed Central

Yuan, Zhongshang; Liu, Hong; Zhang, Xiaoshuai; Li, Fangyu; Zhao, Jinghua; Zhang, Furen; Xue, Fuzhong

2013-01-01

Currently, the genetic variants identified by genome-wide association studies (GWAS) generally account for only a small proportion of the total heritability of complex diseases. One crucial reason is the underutilization of gene-gene joint effects commonly encountered in GWAS, which include their main effects and co-association. However, gene-gene co-association is often loosely placed within the framework of gene-gene interaction. From the causal graph perspective, we elucidate in detail the concept and rationale of gene-gene co-association as well as its relationship with traditional gene-gene interaction, and propose two Fisher r-to-z transformation-based simple statistics to detect it. Three series of simulations further highlight that gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under the nearly independent condition but also to the correlation between the two genes. The proposed statistics are more powerful than logistic regression under various situations, are not affected by linkage disequilibrium, and have an acceptable false-positive rate as long as a reasonable GWAS data analysis roadmap is strictly followed. Furthermore, an application to gene pathway analysis associated with leprosy confirms in practice that our proposed gene-gene co-association concept and the corresponding statistics are strongly in line with reality. PMID:23923021
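The abstract does not give the two proposed statistics, so the sketch below shows only the standard textbook use of the Fisher r-to-z transformation on which such statistics are built: testing whether the correlation between two variants differs between cases and controls. The function names and the example correlations and sample sizes are assumptions made here, and the paper's actual statistics may differ.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transformation: z = atanh(r)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def correlation_difference_z(r_cases, n_cases, r_controls, n_controls):
    """Z statistic for the difference between two independent correlations.

    Each transformed correlation has approximate variance 1 / (n - 3).
    """
    if n_cases <= 3 or n_controls <= 3:
        raise ValueError("each group needs more than 3 observations")
    standard_error = math.sqrt(1.0 / (n_cases - 3) + 1.0 / (n_controls - 3))
    return (fisher_z(r_cases) - fisher_z(r_controls)) / standard_error


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical example: genotype correlation between two SNPs is 0.5
# in 200 cases and 0.2 in 200 controls.
z_stat = correlation_difference_z(0.5, 200, 0.2, 200)
print(f"z = {z_stat:.2f}, p = {two_sided_p(z_stat):.4g}")
```

A large absolute z indicates that the pairwise correlation depends on disease status, which is the kind of signal a co-association test looks for.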
Convergent evolution of Y chromosome gene content in flies.

PubMed

Mahajan, Shivani; Bachtrog, Doris

2017-10-04

Sex-chromosomes have formed repeatedly across Diptera from ordinary autosomes, and X-chromosomes mostly conserve their ancestral genes. Y-chromosomes are characterized by abundant gene-loss and an accumulation of repetitive DNA, yet the nature of the gene repertoire of fly Y-chromosomes is largely unknown. Here we trace gene-content evolution of Y-chromosomes across 22 Diptera species, using a subtraction pipeline that infers Y genes from male and female genome, and transcriptome data. Few genes remain on old Y-chromosomes, but the number of inferred Y-genes varies substantially between species. Young Y-chromosomes still show clear evidence of their autosomal origins, but most genes on old Y-chromosomes are not simply remnants of genes originally present on the proto-sex-chromosome that escaped degeneration, but instead were recruited secondarily from autosomes. Despite almost no overlap in Y-linked gene content in different species with independently formed sex-chromosomes, we find that Y-linked genes have evolved convergent gene functions associated with testis expression. Thus, male-specific selection appears as a dominant force shaping gene-content evolution of Y-chromosomes across fly species. While X-chromosome gene content tends to be conserved, Y-chromosome evolution is dynamic and difficult to reconstruct. Here, Mahajan and Bachtrog use a subtraction pipeline to identify Y-linked genes in 22 Diptera species, revealing patterns of Y-chromosome gene-content evolution.
Gene transfer and gene mapping in mammalian cells in culture.

PubMed

Shows, T B; Sakaguchi, A Y

1980-01-01

The ability to transfer mammalian genes parasexually has opened new possibilities for gene mapping and fine structure mapping and offers great potential for contributing to several aspects of mammalian biology, including gene expression and genetic engineering. The DNA transferred has ranged from whole genomes to single genes and smaller segments of DNA. The transfer of whole genomes by cell fusion forms cell hybrids, which has promoted the extensive mapping of human and mouse genes. Transfer, by cell fusion, of rearranged chromosomes has contributed significantly to determining close linkage and the assignment of genes to specific chromosomal regions. Transfer of single chromosomes has been achieved utilizing microcells fused to recipient cells. Metaphase chromosomes have been isolated and used to transfer single-to-multigenic DNA segments. DNA-mediated gene transfer, simulating bacterial transformation, has achieved transfer of single-copy genes. By utilizing DNA cleaved with restriction endonucleases, gene transfer is being employed as a bioassay for the purification of genes. Gene mapping and the fate of transferred genes can be examined now at the molecular level using sequence-specific probes. Recently, single genes have been cloned into eucaryotic and procaryotic vectors for transfer into mammalian cells. Moreover, recombinant libraries in which entire mammalian genomes are represented collectively are a rich new source of transferable genes. Methodology for transferring mammalian genetic information and applications for mapping mammalian genes are presented, and prospects for the future are discussed.
Genome-Wide Comparative Gene Family Classification

PubMed Central

Frech, Christian; Chen, Nansheng

2010-01-01

Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221
A 3,000-loci transcription map of chromosome 3B unravels the structural and functional features of gene islands in hexaploid wheat.

PubMed

Rustenholz, Camille; Choulet, Frédéric; Laugier, Christel; Safár, Jan; Simková, Hana; Dolezel, Jaroslav; Magni, Federica; Scalabrin, Simone; Cattonaro, Federica; Vautrin, Sonia; Bellec, Arnaud; Bergès, Hélène; Feuillet, Catherine; Paux, Etienne

2011-12-01

To improve our understanding of the organization and regulation of the wheat (Triticum aestivum) gene space, we established a transcription map of a wheat chromosome (3B) by hybridizing a newly developed wheat expression microarray with bacterial artificial chromosome pools from a new version of the 3B physical map as well as with cDNA probes derived from 15 RNA samples. Mapping data for almost 3,000 genes showed that the gene space spans the whole chromosome 3B with a 2-fold increase of gene density toward the telomeres due to an increase in the number of genes in islands. Comparative analyses with rice (Oryza sativa) and Brachypodium distachyon revealed that these gene islands are composed mainly of genes likely originating from interchromosomal gene duplications. Gene Ontology and expression profile analyses for the 3,000 genes located along the chromosome revealed that the gene islands are enriched significantly in genes sharing the same function or expression profile, thereby suggesting that genes in islands acquired shared regulation during evolution. Only a small fraction of these clusters of cofunctional and coexpressed genes was conserved with rice and B. distachyon, indicating a recent origin. Finally, genes with the same expression profiles in remote islands (coregulation islands) were identified suggesting long-distance regulation of gene expression along the chromosomes in wheat.
Familial aggregation analysis of gene expressions

PubMed Central

Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K

2007-01-01

Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
With Reference to Reference Genes: A Systematic Review of Endogenous Controls in Gene Expression Studies.

PubMed

Chapman, Joanne R; Waldenström, Jonas

2015-01-01

The choice of reference genes that are stably expressed amongst treatment groups is a crucial step in real-time quantitative PCR gene expression studies. Recent guidelines have specified that a minimum of two validated reference genes should be used for normalisation. However, a quantitative review of the literature showed that the average number of reference genes used across all studies was 1.2. Thus, the vast majority of studies continue to use a single gene, with β-actin (ACTB) and/or glyceraldehyde 3-phosphate dehydrogenase (GAPDH) being commonly selected in studies of vertebrate gene expression. Few studies (15%) tested a panel of potential reference genes for stability of expression before using them to normalise data. Amongst studies specifically testing reference gene stability, few found ACTB or GAPDH to be optimal, whereby these genes were significantly less likely to be chosen when larger panels of potential reference genes were screened. Fewer reference genes were tested for stability in non-model organisms, presumably owing to a dearth of available primers in less well characterised species. Furthermore, the experimental conditions under which real-time quantitative PCR analyses were conducted had a large influence on the choice of reference genes, whereby different studies of rat brain tissue showed different reference genes to be the most stable. These results highlight the importance of validating the choice of normalising reference genes before conducting gene expression studies.
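Normalising against several reference genes is commonly done by dividing the target gene's relative quantity by the geometric mean of the reference genes' relative quantities. The sketch below assumes relative quantity equals efficiency raised to minus Ct, with perfect doubling per cycle by default. The Ct values are made up for illustration, and the two-gene minimum mirrors the guideline mentioned in the abstract.

```python
from statistics import mean


def normalized_expression(ct_target, ct_references, efficiency=2.0):
    """Target expression relative to the geometric mean of reference genes.

    With quantity = efficiency ** (-Ct), the geometric mean of the reference
    quantities equals efficiency ** (-mean(reference Cts)), so the ratio
    simplifies to efficiency ** (mean(reference Cts) - Ct_target).
    """
    if len(ct_references) < 2:
        raise ValueError("use at least two validated reference genes")
    if efficiency <= 1.0:
        raise ValueError("amplification efficiency must exceed 1")
    return efficiency ** (mean(ct_references) - ct_target)


# Hypothetical Ct values: one target gene, two reference genes per sample.
control = normalized_expression(ct_target=25.0, ct_references=[18.0, 20.0])
treated = normalized_expression(ct_target=23.0, ct_references=[18.0, 20.0])

fold_change = treated / control
print(f"fold change (treated vs control): {fold_change:.1f}")
```

Averaging over several references dampens the effect of any single unstable gene, but it does not replace testing candidate reference genes for stability under the experimental conditions in question.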
Multiple independent insertions of 5S rRNA genes in the spliced-leader gene family of trypanosome species.

PubMed

Beauparlant, Marc A; Drouin, Guy

2014-02-01

Analyses of the 5S rRNA genes found in the spliced-leader (SL) gene repeat units of numerous trypanosome species suggest that such linkages were not inherited from a common ancestor, but were the result of independent 5S rRNA gene insertions. In trypanosomes, 5S rRNA genes are found either in the tandemly repeated units coding for SL genes or in independent tandemly repeated units. Given that trypanosome species where 5S rRNA genes are within the tandemly repeated units coding for SL genes are phylogenetically related, one might hypothesize that this arrangement is the result of an ancestral insertion of 5S rRNA genes into the tandemly repeated SL gene family of trypanosomes. Here, we use the types of 5S rRNA genes found associated with SL genes, the flanking regions of the inserted 5S rRNA genes and the position of these insertions to show that most of the 5S rRNA genes found within SL gene repeat units of trypanosome species were not acquired from a common ancestor but are the results of independent insertions. These multiple 5S rRNA gene insertion events in trypanosomes are likely the result of frequent founder events in different hosts and/or geographical locations in species having short generation times.
Lineage-specific expansion of IFIT gene family: an insight into coevolution with IFN gene family.

PubMed

Liu, Ying; Zhang, Yi-Bing; Liu, Ting-Kai; Gui, Jian-Fang

2013-01-01

In mammals, IFIT (Interferon [IFN]-induced proteins with Tetratricopeptide Repeat [TPR] motifs) family genes are involved in many cellular and viral processes, which are tightly related to the mammalian IFN response. However, little is known about non-mammalian IFIT genes. In the present study, IFIT genes are identified in the genome databases from the jawed vertebrates including the cartilaginous elephant shark but not from non-vertebrates such as lancelet, sea squirt and acorn worm, suggesting that the IFIT gene family originated from a vertebrate ancestor about 450 million years ago. IFIT family genes show conserved gene structure and gene arrangements. Phylogenetic analyses reveal that this gene family has expanded through lineage-specific and species-specific gene duplication. Interestingly, the IFN gene family seems to share a common ancestor and a similar evolutionary mechanism; the functional link of IFIT genes to the IFN response has been present since the origin of both gene families, as evidenced by the finding that zebrafish IFIT genes are upregulated by fish IFNs, poly(I:C) and two transcription factors IRF3/IRF7, likely via the IFN-stimulated response elements (ISRE) within the promoters of vertebrate IFIT family genes. These coevolutionary features create a functional association between the two gene families to fulfill a common biological process, which was likely selected by viral infection during the evolution of vertebrates. Our results are helpful for understanding the evolution of the vertebrate IFN system.
Genome-wide identification, phylogeny and expression analyses of SCARECROW-LIKE(SCL) genes in millet (Setaria italica).

PubMed

Liu, Hongyun; Qin, Jiajia; Fan, Hui; Cheng, Jinjin; Li, Lin; Liu, Zheng

2017-07-01

As a member of the GRAS gene family, SCARECROW-LIKE (SCL) genes encode transcriptional regulators that are involved in plant information transmission and signal transduction. In this study, 44 SCL genes including two SCARECROW genes in millet were identified to be distributed on eight chromosomes, except chromosome 6. All the millet genes contain motifs 6-8, indicating that these motifs are conserved during evolution. SCL genes of millet were divided into eight groups based on the phylogenetic relationship and classification of Arabidopsis SCL genes. Several putative millet orthologous genes in Arabidopsis, maize and rice were identified. High throughput RNA sequencing revealed that the expression of millet SCL genes in root, stem, leaf, spica, and along the leaf gradient varied greatly. Analyses combining the gene expression patterns, gene structures, motif compositions, promoter cis-element identification, alternative splicing of transcripts and phylogenetic relationship of SCL genes indicate that these genes may play diverse functions. Functionally characterized SCL genes in maize, rice and Arabidopsis would provide us some clues for future characterization of their homologues in millet. To the best of our knowledge, this is the first study of millet SCL genes at the genome-wide level. Our work provides a useful platform for functional analysis of SCL genes in millet, a model crop for C4 photosynthesis and bioenergy studies.
Gene expression studies of reference genes for quantitative real-time PCR: an overview in insects.

PubMed

Shakeel, Muhammad; Rodriguez, Alicia; Tahir, Urfa Bin; Jin, Fengliang

2018-02-01

Whenever gene expression is being examined, it is essential that a normalization process is carried out to eliminate non-biological variations. The use of reference genes, such as glyceraldehyde-3-phosphate dehydrogenase, actin, and ribosomal protein genes, is the usual method of choice for normalizing gene expression. Although reference genes are used to normalize target gene expression, a major problem is that the stability of these genes differs among tissues, developmental stages, species, and responses to abiotic factors. Therefore, the use and validation of multiple reference genes are required. This review discusses the reasons why RT-qPCR has become the preferred method for validating gene expression profiles, the use of specific and non-specific dyes, and the importance of primers and probes for qPCR, as well as several statistical algorithms developed to help validate potential reference genes. The conflicts arising from the use of classical reference genes in gene normalization, and their replacement with novel references, are also discussed by citing the high and low stability of classical and novel reference genes under various biotic and abiotic experimental conditions and with the various methods applied for reference gene amplification.
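The normalization against multiple reference genes described above can be illustrated with a minimal sketch. It uses the common 2^-ddCt calculation and assumes roughly 100% amplification efficiency for every assay; the function name and the Ct values are hypothetical and are not taken from the review.

```python
from statistics import mean

def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold change of a target gene by the 2^-ddCt method.

    ct_refs / ct_refs_cal are the Ct values of several reference genes in
    the sample and in the calibrator. Averaging Ct values corresponds to a
    geometric mean of reference quantities when efficiencies are equal
    (assumed here, not measured).
    """
    d_ct_sample = ct_target - mean(ct_refs)          # normalize the sample
    d_ct_cal = ct_target_cal - mean(ct_refs_cal)     # normalize the calibrator
    return 2.0 ** -(d_ct_sample - d_ct_cal)

# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the calibrator while the references stay unchanged.
fold = relative_expression(22.0, [18.0, 20.0], 24.0, [18.0, 20.0])
print(fold)  # 4.0
```

If one reference gene shifts between conditions, the averaged reference level shifts with it, which is why the stability of each candidate has to be validated before it is used this way.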
Genes with stable DNA methylation levels show higher evolutionary conservation than genes with fluctuant DNA methylation levels.

PubMed

Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai

2015-11-24

Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of the human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of orthologous genes, the evolutionary rate dN/dS and protein sequence identity. We found that the SM genes had greater percentages of orthologous genes, lower dN/dS, and higher protein sequence identities in all 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density and the degree of linkage disequilibrium (LD) in HapMap populations and 1000 Genomes Project populations. We observed that the SM genes had lower SNP densities and higher degrees of LD in all 11 HapMap populations and 13 1000 Genomes Project populations. These results mean that the SM genes had more stable chromosome genetic structures and were more conserved than the FM genes.
Comprehensive analysis of the flowering genes in Chinese cabbage and examination of evolutionary pattern of CO-like genes in plant kingdom

NASA Astrophysics Data System (ADS)

Song, Xiaoming; Duan, Weike; Huang, Zhinan; Liu, Gaofeng; Wu, Peng; Liu, Tongkun; Li, Ying; Hou, Xilin

2015-09-01

In plants, flowering is the most important transition from vegetative to reproductive growth. The flowering patterns of monocots and eudicots are distinctly different, but few studies have described the evolutionary patterns of the flowering genes in them. In this study, we analysed the evolutionary pattern, duplication and expression level of these genes. The main results were as follows: (i) characterization of flowering genes in monocots and eudicots, including the identification of family-specific, orthologous and collinear genes; (ii) full characterization of CONSTANS-like genes in Brassica rapa (BraCOL genes), the key flowering genes; (iii) exploration of the evolution of COL genes in plant kingdom and construction of the evolutionary pattern of COL genes; (iv) comparative analysis of CO and FT genes between Brassicaceae and Grass, which identified several family-specific amino acids, and revealed that CO and FT protein structures were similar in B. rapa and Arabidopsis but different in rice; and (v) expression analysis of photoperiod pathway-related genes in B. rapa under different photoperiod treatments by RT-qPCR. This analysis will provide resources for understanding the flowering mechanisms and evolutionary pattern of COL genes. In addition, this genome-wide comparative study of COL genes may also provide clues for evolution of other flowering genes.
Gene finding in metatranscriptomic sequences.

PubMed

Ismail, Wazim Mohammed; Ye, Yuzhen; Tang, Haixu

2014-01-01

Metatranscriptomic sequencing is a highly sensitive bioassay of functional activity in a microbial community, providing complementary information to the metagenomic sequencing of the community. The acquisition of the metatranscriptomic sequences will enable us to refine the annotations of the metagenomes, and to study the gene activities and their regulation in complex microbial communities and their dynamics. In this paper, we present TransGeneScan, a software tool for finding genes in assembled transcripts from metatranscriptomic sequences. By incorporating several features of metatranscriptomic sequencing, including strand-specificity, short intergenic regions, and putative antisense transcripts, into a Hidden Markov Model, TransGeneScan can predict a sense transcript containing one or multiple genes (in an operon) or an antisense transcript. We tested TransGeneScan on a mock metatranscriptomic data set containing three known bacterial genomes. The results showed that TransGeneScan performs better than metagenomic gene finders (MetaGeneMark and FragGeneScan) at predicting protein-coding genes in assembled transcripts, and achieves comparable or even higher accuracy than gene finders for microbial genomes (Glimmer and GeneMark). These results imply that, with the assistance of metatranscriptomic sequencing, we can obtain a broad and precise picture of the genes (and their functions) in a microbial community. TransGeneScan is available as open-source software on SourceForge at https://sourceforge.net/projects/transgenescan/.
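To make the Hidden Markov Model idea concrete, here is a toy Viterbi decoder with two states. It is only a sketch and not the TransGeneScan model: the states, the transition and emission probabilities, and the use of GC content as the sole signal are all assumptions chosen for illustration, whereas a real gene finder models codons, strands and intergenic regions.

```python
import math

# Hypothetical two-state model: all probabilities below are invented.
STATES = ("intergenic", "gene")
START = {"intergenic": 0.5, "gene": 0.5}
TRANS = {"intergenic": {"intergenic": 0.9, "gene": 0.1},
         "gene": {"intergenic": 0.1, "gene": 0.9}}
EMIT = {"intergenic": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
        "gene": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4}}

def viterbi(seq):
    """Return the most probable state path for a non-empty DNA string."""
    # Log-probabilities avoid underflow on long sequences.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            # Best previous state from which to enter state s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][base]))
            pointer[s] = prev
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return path[::-1]

path = viterbi("AAAAGGGGGGAAAA")
print("".join("G" if s == "gene" else "i" for s in path))  # iiiiGGGGGGiiii
```

The decoder switches state only when a run of bases outweighs the cost of two transitions, which is the same trade-off a richer model makes when it decides where a gene starts and ends.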
Towards β-globin gene-targeting with integrase-defective lentiviral vectors.

PubMed

Inanlou, Davoud Nouri; Yakhchali, Bagher; Khanahmad, Hossein; Gardaneh, Mossa; Movassagh, Hesam; Cohan, Reza Ahangari; Ardestani, Mehdi Shafiee; Mahdian, Reza; Zeinali, Sirous

2010-11-01

We have developed an integrase-defective lentiviral (LV) vector in combination with a gene-targeting approach for gene therapy of β-thalassemia. The β-globin gene-targeting construct has two homologous stems including sequence upstream and downstream of the β-globin gene, a β-globin gene positioned between hygromycin and neomycin resistant genes and a herpes simplex virus type 1 thymidine kinase (HSVtk) suicide gene. Utilization of integrase-defective LV as a vector for the β-globin gene increased the number of selected clones relative to non-viral methods. This method represents an important step toward the ultimate goal of a clinical gene therapy for β-thalassemia.

Genome-wide network of regulatory genes for construction of a chordate embryo.

PubMed

Shoguchi, Eiichi; Hamaguchi, Makoto; Satoh, Nori

2008-04-15

Animal development is controlled by gene regulation networks that are composed of sequence-specific transcription factors (TF) and cell signaling molecules (ST). Although housekeeping genes have been reported to show clustering in the animal genomes, whether the genes comprising a given regulatory network are physically clustered on a chromosome is uncertain. We examined this question in the present study. Ascidians are the closest living relatives of vertebrates, and their tadpole-type larva represents the basic body plan of chordates. The Ciona intestinalis genome contains 390 core TF genes and 119 major ST genes. Previous gene disruption assays led to the formulation of a basic chordate embryonic blueprint, based on over 3000 genetic interactions among 79 zygotic regulatory genes. Here, we mapped the regulatory genes, including all 79 regulatory genes, on the 14 pairs of Ciona chromosomes by fluorescent in situ hybridization (FISH). Chromosomal localization of upstream and downstream regulatory genes demonstrates that the components of coherent developmental gene networks are evenly distributed over the 14 chromosomes. Thus, this study provides the first comprehensive evidence that the physical clustering of regulatory genes, or their target genes, is not relevant for the genome-wide control of gene expression during development.
Sexy gene conversions: locating gene conversions on the X-chromosome.

PubMed

Lawson, Mark J; Zhang, Liqing

2009-08-01

Gene conversion can have a profound impact on both the short- and long-term evolution of genes and genomes. Here, we examined the gene families that are located on the X-chromosomes of human (Homo sapiens), chimpanzee (Pan troglodytes), mouse (Mus musculus) and rat (Rattus norvegicus) for evidence of gene conversion. We identified seven gene families (WD repeat protein family, Ferritin Heavy Chain family, RAS-related Protein RAB-40 family, Diphosphoinositol polyphosphate phosphohydrolase family, Transcription Elongation Factor A family, LDOC1-related family, Zinc Finger Protein ZIC, and GLI family) that show evidence of gene conversion. Through phylogenetic analyses and synteny evidence, we show that gene conversion has played an important role in the evolution of these gene families and that gene conversion has occurred independently in both primates and rodents. Comparing the results with those of two gene conversion prediction programs (GENECONV and Partimatrix), we found that both GENECONV and Partimatrix have very high false negative rates (i.e. failed to predict gene conversions), which leads to many undetected gene conversions. The combination of phylogenetic analyses with physical synteny evidence exhibits high resolution in the detection of gene conversions.
Regulatory systems for hypoxia-inducible gene expression in ischemic heart disease gene therapy.

PubMed

Kim, Hyun Ah; Rhim, Taiyoun; Lee, Minhyung

2011-07-18

Ischemic heart diseases are caused by narrowed coronary arteries that decrease the blood supply to the myocardium. In the ischemic myocardium, hypoxia-responsive genes are up-regulated by hypoxia-inducible factor-1 (HIF-1). Gene therapy for ischemic heart diseases uses genes encoding angiogenic growth factors and anti-apoptotic proteins as therapeutic genes. These genes increase blood supply into the myocardium by angiogenesis and protect cardiomyocytes from cell death. However, non-specific expression of these genes in normal tissues may be harmful, since growth factors and anti-apoptotic proteins may induce tumor growth. Therefore, tight gene regulation is required to limit gene expression to ischemic tissues and to avoid unwanted side effects. For this purpose, various strategies have been developed for ischemia-specific gene expression. Transcriptional, post-transcriptional, and post-translational regulatory strategies have been developed and evaluated in animal models of ischemic heart disease. The regulatory systems can limit therapeutic gene expression to ischemic tissues and increase the efficiency of gene therapy. In this review, recent progress in ischemia-specific gene expression systems is presented, and their applications to ischemic heart diseases are discussed.
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.

PubMed

Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

2015-01-01

Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
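The benefit of independent filtering can be sketched in a few lines. The example below does not compute the SGSF filter statistic (the p-value of association between a gene set and the sample principal components); it takes hypothetical filter p-values as given and only shows how removing gene sets before a Benjamini-Hochberg correction changes the outcome. The function names, thresholds and p-values are assumptions for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    return sorted(order[:cutoff])

def filter_then_test(filter_p, test_p, filter_threshold=0.1, alpha=0.05):
    """Keep gene sets whose filter p-value passes, then correct only those."""
    kept = [i for i, p in enumerate(filter_p) if p <= filter_threshold]
    rejected_local = benjamini_hochberg([test_p[i] for i in kept], alpha)
    return [kept[j] for j in rejected_local]

# Hypothetical p-values for eight gene sets.
test_p = [0.001, 0.02, 0.03, 0.5, 0.6, 0.7, 0.8, 0.9]
filter_p = [0.01, 0.05, 0.04, 0.5, 0.9, 0.3, 0.7, 0.6]

print(benjamini_hochberg(test_p))          # [0]
print(filter_then_test(filter_p, test_p))  # [0, 1, 2]
```

With all eight gene sets tested, only one survives correction; after filtering to three, all three do. This gain is legitimate only if the filter statistic is independent of the test statistic under the null hypothesis, which is the property the abstract claims for SGSF.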
The early stages of duplicate gene evolution

PubMed Central

Moore, Richard C.; Purugganan, Michael D.

2003-01-01

Gene duplications are one of the primary driving forces in the evolution of genomes and genetic systems. Gene duplicates account for 8–20% of the genes in eukaryotic genomes, and the rates of gene duplication are estimated at between 0.2% and 2% per gene per million years. Duplicate genes are believed to be a major mechanism for the establishment of new gene functions and the generation of evolutionary novelty, yet very little is known about the early stages of the evolution of duplicated gene pairs. It is unclear, for example, to what extent selection, rather than neutral genetic drift, drives the fixation and early evolution of duplicate loci. Analysis of recently duplicated genes in the Arabidopsis thaliana genome reveals significantly reduced species-wide levels of nucleotide polymorphisms in the progenitor and/or duplicate gene copies, suggesting that selective sweeps accompany the initial stages of the evolution of these duplicated gene pairs. Our results support recent theoretical work that indicates that fates of duplicate gene pairs may be determined in the initial phases of duplicate gene evolution and that positive selection plays a prominent role in the evolutionary dynamics of the very early histories of duplicate nuclear genes. PMID:14671323
Single-nucleotide polymorphism-gene intermixed networking reveals co-linkers connected to multiple gene expression phenotypes

PubMed Central

Gong, Bin-Sheng; Zhang, Qing-Pu; Zhang, Guang-Mei; Zhang, Shao-Jun; Zhang, Wei; Lv, Hong-Chao; Zhang, Fan; Lv, Sa-Li; Li, Chuan-Xing; Rao, Shao-Qi; Li, Xia

2007-01-01

Gene expression profiles and single-nucleotide polymorphism (SNP) profiles are modern data for genetic analysis. It is possible to use the two types of information to analyze the relationships among genes by genetical genomics approaches. In this study, gene expression profiles were used as expression traits, and relationships among genes that were co-linked to one or more common SNPs were identified by integrating the two types of information. After the gene-SNP relationships were established using Haseman-Elston sib-pair regression, the co-expression among the co-linked genes was examined further. The results showed that co-expression among the co-linked genes was significantly higher if the number of connections between the genes and the SNP(s) was more than six. The genes were then interconnected via one or more SNP co-linkers to construct a gene-SNP intermixed network. Genes sharing more SNPs tended to have a stronger correlation. Finally, a gene-gene network was constructed with the intensity of each relationship (the number of SNP co-linkers shared) as the edge weight. PMID:18466544
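The final step, a gene-gene network weighted by the number of shared SNP co-linkers, can be sketched as follows. The gene and SNP names are hypothetical, and the linkage map is taken as given rather than derived by Haseman-Elston regression.

```python
from itertools import combinations

def gene_gene_network(gene_to_snps):
    """Weight each gene pair by the number of SNPs both genes are linked to."""
    weights = {}
    for g1, g2 in combinations(sorted(gene_to_snps), 2):
        shared = gene_to_snps[g1] & gene_to_snps[g2]
        if shared:  # pairs with no common SNP get no edge
            weights[(g1, g2)] = len(shared)
    return weights

# Hypothetical gene-SNP linkage results.
linkage = {
    "geneA": {"snp1", "snp2"},
    "geneB": {"snp2", "snp3"},
    "geneC": {"snp1", "snp2"},
    "geneD": {"snp9"},
}
network = gene_gene_network(linkage)
print(network)
# {('geneA', 'geneB'): 1, ('geneA', 'geneC'): 2, ('geneB', 'geneC'): 1}
```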
Adjusting for background mutation frequency biases improves the identification of cancer driver genes.

PubMed

Evans, Perry; Avey, Stefan; Kong, Yong; Krauthammer, Michael

2013-09-01

A common goal of tumor sequencing projects is finding genes whose mutations are selected for during tumor development. This is accomplished by choosing genes that have more non-synonymous mutations than expected from an estimated background mutation frequency. While this background frequency is unknown, it can be estimated using both the observed synonymous mutation frequency and the non-synonymous to synonymous mutation ratio. The synonymous mutation frequency can be determined across all genes or in a gene-specific manner. This choice introduces an interesting trade-off. A gene-specific frequency adjusts for an underlying mutation bias, but is difficult to estimate given missing synonymous mutation counts. Using a genome-wide synonymous frequency is more robust, but is less suited for adjusting biases. Studying four evaluation criteria for identifying genes with high non-synonymous mutation burden (reflecting preferential selection of expressed genes, genes with mutations in conserved bases, genes with many protein interactions, and genes that show loss of heterozygosity), we find that the gene-specific synonymous frequency is superior in the gene expression and protein interaction tests. In conclusion, the use of the gene-specific synonymous mutation frequency is well suited for assessing a gene's non-synonymous mutation burden.
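A rough sketch of the burden calculation described above: the expected number of non-synonymous mutations in a gene is derived from a synonymous mutation frequency and the non-synonymous to synonymous ratio, and the observed count is compared against it. The Poisson tail test, the function names and every number below are assumptions made for illustration; the abstract does not specify the statistical test used.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def nonsyn_burden(observed_nonsyn, syn_freq, ns_to_s_ratio, gene_length, n_samples):
    """Return (expected non-synonymous count, upper-tail p-value).

    syn_freq is a synonymous mutation frequency per base per tumor; it can
    be the genome-wide estimate or a gene-specific one.
    """
    expected = syn_freq * ns_to_s_ratio * gene_length * n_samples
    return expected, poisson_sf(observed_nonsyn, expected)

# Hypothetical gene: 2000 coding bases, 100 tumors, 5 non-synonymous hits.
expected, p_value = nonsyn_burden(5, syn_freq=1e-6, ns_to_s_ratio=3.0,
                                  gene_length=2000, n_samples=100)
print(round(expected, 3), p_value < 0.001)  # 0.6 True
```

Passing a gene-specific `syn_freq` instead of the genome-wide value is where the trade-off in the abstract appears: it corrects for local mutation bias, but for genes with few observed synonymous mutations the estimate itself becomes noisy.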
Validation of miRNA genes suitable as reference genes in qPCR analyses of miRNA gene expression in Atlantic salmon (Salmo salar).

PubMed

Johansen, Ilona; Andreassen, Rune

2014-12-23

MicroRNAs (miRNAs) are an abundant class of endogenous small RNA molecules that downregulate gene expression at the post-transcriptional level. They play important roles by regulating genes that control multiple biological processes, and in recent years there has been an increased interest in studying miRNA genes and miRNA gene expression. The most common method applied to study the expression of single genes is quantitative PCR (qPCR). However, before the expression of mature miRNAs can be studied, robust qPCR methods (miRNA-qPCR) must be developed. This includes identification and validation of suitable reference genes. We are particularly interested in Atlantic salmon (Salmo salar). This is an economically important aquaculture species, but no reference genes dedicated for use in miRNA-qPCR methods have been validated for this species. Our aim was, therefore, to identify suitable reference genes for miRNA-qPCR methods in Salmo salar. We used a systematic approach in which we utilized similar studies in other species, some biological criteria, results from deep sequencing of small RNAs and, finally, experimental validation of candidate reference genes by qPCR to identify the most suitable reference genes. Ssa-miR-25-3p was identified as the most suitable single reference gene. The best combination of two reference genes was ssa-miR-25-3p and ssa-miR-455-5p. These two genes were constitutively and stably expressed across many different tissues. Furthermore, infectious salmon anaemia did not seem to affect their expression levels. These genes were amplified with high specificity and good efficiency, and the qPCR assays showed good linearity when applying a simple SYBR Green miRNA-PCR method using miRNA gene-specific forward primers. We have identified suitable reference genes for miRNA-qPCR in Atlantic salmon. These results will greatly facilitate further studies on miRNA genes in this species. The reference genes identified are conserved genes that are identical in their mature sequence in many aquaculture species. Therefore, they may also be suitable as reference genes in other teleosts. Finally, the systematic approach used in our study successfully identified suitable reference genes, suggesting that this may be a useful strategy to apply in similar validation studies in other aquaculture species.
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants

PubMed Central

Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

2014-01-01

Background and Aims: MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. Methods: The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Key Results: Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11–14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14–16 Type II MADS-box genes. Conclusions: The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. This study is the first that provides a comprehensive overview of MADS-box genes in conifers and thus will provide a framework for future work on MADS-box genes in seed plants. PMID:24854168
Candidate qRT-PCR reference genes for barley that demonstrate better stability than traditional housekeeping genes

USDA-ARS's Scientific Manuscript database

Gene transcript expression analysis is a useful tool for correlating gene activity with plant phenotype. For these studies, an appropriate reference gene is necessary to quantify the expression of target genes. Classic housekeeping genes have often been used for this purpose, but may not be consis...
Characteristics of functional enrichment and gene expression level of human putative transcriptional target genes.

PubMed

Osato, Naoki

2018-01-19

Transcriptional target genes show functional enrichment. However, how many functional enrichments transcriptional target genes include, and how significant they are, is still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichments by calculating its ratio to the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed a significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characteristics of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict novel mechanisms and factors, such as DNA binding proteins and DNA sequences, of enhancer-promoter interactions.
Reconstructing directed gene regulatory network by only gene expression data.

PubMed

Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng

2016-08-18

Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in gene regulatory networks by locating eQTL single nucleotide polymorphisms, or by observing the gene expression changes when knocking out or knocking down the candidate transcription factors; regrettably, these additional data are usually unavailable, especially for samples derived from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source on target by evaluating the magnitude of the changes in expression dependencies between the target gene and the others when conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators, which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both types of important regulators by averaging the influence functions of a candidate regulator on the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory networks. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.
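For orientation, the plain, undirected data processing inequality (DPI) that CBDN is said to extend can be sketched as below. This is not CBDN and does not infer edge direction: it only removes the weakest edge in every fully connected gene triplet, on the reasoning that an indirect relationship cannot carry more information than either direct step. The gene names and dependency scores are hypothetical.

```python
from itertools import combinations

def dpi_prune(dependency, tolerance=0.0):
    """Drop the weakest edge of each fully connected triplet.

    dependency maps frozenset({gene_a, gene_b}) to a dependency score such
    as mutual information (the scores are supplied, not estimated here).
    """
    genes = sorted({g for edge in dependency for g in edge})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(e in dependency for e in edges):
            continue  # DPI only applies to closed triangles
        weakest = min(edges, key=lambda e: dependency[e])
        others = [dependency[e] for e in edges if e != weakest]
        if dependency[weakest] < min(others) * (1 - tolerance):
            to_remove.add(weakest)
    return {e: v for e, v in dependency.items() if e not in to_remove}

# Hypothetical chain TF1 -> TF2 -> G with a weaker, transitive TF1-G score.
scores = {
    frozenset(("TF1", "TF2")): 0.9,
    frozenset(("TF2", "G")): 0.8,
    frozenset(("TF1", "G")): 0.3,
}
pruned = dpi_prune(scores)
print(sorted(tuple(sorted(e)) for e in pruned))  # [('G', 'TF2'), ('TF1', 'TF2')]
```

The pruned network still says nothing about which gene regulates which, and that is the gap the abstract says CBDN addresses by conditioning on the source gene.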
Inference of cancer-specific gene regulatory networks using soft computing rules.

PubMed

Wang, Xiaosheng; Gotoh, Osamu

2010-03-24

Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Prioritization of Disease Susceptibility Genes Using LSM/SVD.

PubMed

Gong, Lejun; Yang, Ronggen; Yan, Qin; Sun, Xiao

2013-12-01

Understanding the role of genetics in diseases is one of the most important tasks in the postgenome era. It is generally too expensive and time consuming to perform experimental validation for all candidate genes related to a disease. Computational methods play important roles in prioritizing these candidates. Herein, we propose an approach to prioritize disease genes using latent semantic mapping based on singular value decomposition. Our hypothesis is that functionally similar genes are likely to cause similar diseases. New disease susceptibility genes are therefore predicted by measuring the functional similarity between known disease susceptibility genes and unknown genes. Taking autism as an example, analysis of the top ten prioritized genes suggests that they might be autism susceptibility genes, which also indicates that our approach could discover new disease susceptibility genes. This novel approach to disease gene prioritization could discover new disease susceptibility genes and latent disease-gene relations. The prioritized results could also support the interpretive diversity and experimental views as computational evidence for disease researchers.
Gene doping.

PubMed

Azzazy, Hassan M E

2010-01-01

Gene doping abuses the legitimate approach of gene therapy. While gene therapy aims to correct genetic disorders by introducing a foreign gene to replace an existing faulty one or by manipulating existing gene(s) to achieve a therapeutic benefit, gene doping employs the same concepts to bestow performance advantages on athletes over their competitors. Recent developments in genetic engineering have contributed significantly to the progress of gene therapy research and currently numerous clinical trials are underway. Some athletes and their staff are probably watching this progress closely. Any gene that plays a role in muscle development, oxygen delivery to tissues, neuromuscular coordination, or even pain control is considered a candidate for gene dopers. Unfortunately, detecting gene doping is technically very difficult because the transgenic proteins expressed by the introduced genes are similar to their endogenous counterparts. Researchers today are racing the clock because assuring the continued integrity of sports competition depends on their ability to develop effective detection strategies in preparation for the 2012 Olympics, which may mark the appearance of genetically modified athletes.
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

PubMed

Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

2016-02-27

In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

PubMed

Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

2012-07-15

Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or by neighboring positions in the immediate vicinity of one another on the chromosome. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
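The comparison of neighboring-gene correlations against randomly selected gene pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression values are invented, the function names (`pearson`, `neighbor_correlations`, `random_pair_correlations`) are hypothetical, and the FDR-based significance step is omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def neighbor_correlations(profiles):
    """Correlate each gene with the next gene in chromosomal order."""
    return [pearson(profiles[i], profiles[i + 1])
            for i in range(len(profiles) - 1)]


def random_pair_correlations(profiles, n_pairs, seed=0):
    """Background distribution: correlations of randomly drawn gene pairs."""
    rng = random.Random(seed)
    result = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(profiles)), 2)
        result.append(pearson(profiles[i], profiles[j]))
    return result


# Hypothetical toy data: rows are genes in chromosomal order, columns are samples.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.1, 3.9, 6.2, 8.0],  # rises with the previous gene (co-expressed)
    [4.0, 1.0, 3.0, 2.0],
    [0.5, 2.5, 1.0, 3.0],
]
neighbors = neighbor_correlations(profiles)
background = random_pair_correlations(profiles, n_pairs=20)
```

In a real analysis the neighbor correlations would be compared with the background distribution across many datasets before any cluster is called significant.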
Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

PubMed

Zhou, Xionghui; Liu, Juan

2014-01-01

Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or used it only to select the differentially expressed genes as inputs for constructing the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify gene dependency pairs using conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for phenotypic change.
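Conditional mutual information, the quantity named in the abstract above, can be estimated from discretized data roughly as follows. This is a sketch under assumptions: the abstract does not give the exact statistic or discretization, so the example simply computes I(A; phenotype | B) from counts, and the toy vectors and the function name are hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Estimate I(X; Y | Z) in bits from three aligned discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ];
        # the 1/n factors cancel inside the ratio.
        ratio = (c_z[zi] * n_xyz) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        cmi += (n_xyz / n) * math.log2(ratio)
    return cmi


# Hypothetical binarized data for eight samples.
gene_a = [0, 0, 1, 1, 0, 0, 1, 1]
gene_b = [0, 0, 0, 0, 1, 1, 1, 1]
# The phenotype follows gene A only in samples where gene B is "on".
phenotype = [0, 0, 0, 0, 0, 0, 1, 1]

cmi = conditional_mutual_information(gene_a, phenotype, gene_b)
```

In this toy case gene A is informative about the phenotype only when gene B is active, which is the kind of dependency a pair (A,B) is meant to capture.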
Olaparib In Metastatic Breast Cancer

ClinicalTrials.gov

2018-03-27

Metastatic Breast Cancer; Invasive Breast Cancer; Somatic Mutation Breast Cancer (BRCA1); Somatic Mutation Breast Cancer (BRCA2); CHEK2 Gene Mutation; ATM Gene Mutation; PALB2 Gene Mutation; RAD51 Gene Mutation; BRIP1 Gene Mutation; NBN Gene Mutation
Discovery and validation of a glioblastoma co-expressed gene module

PubMed Central

Dunwoodie, Leland J.; Poehlman, William L.; Ficklin, Stephen P.; Feltus, Frank Alexander

2018-01-01

Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network. PMID:29541392

Discovery and validation of a glioblastoma co-expressed gene module.

PubMed

Dunwoodie, Leland J; Poehlman, William L; Ficklin, Stephen P; Feltus, Frank Alexander

2018-02-16

Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network.
Homology-dependent Gene Silencing in Paramecium

PubMed Central

Ruiz, Françoise; Vayssié, Laurence; Klotz, Catherine; Sperling, Linda; Madeddu, Luisa

1998-01-01

Microinjection at high copy number of plasmids containing only the coding region of a gene into the Paramecium somatic macronucleus led to a marked reduction in the expression of the corresponding endogenous gene(s). The silencing effect, which is stably maintained throughout vegetative growth, has been observed for all Paramecium genes examined so far: a single-copy gene (ND7), as well as members of multigene families (centrin genes and trichocyst matrix protein genes) in which all closely related paralogous genes appeared to be affected. This phenomenon may be related to posttranscriptional gene silencing in transgenic plants and quelling in Neurospora and allows the efficient creation of specific mutant phenotypes thus providing a potentially powerful tool to study gene function in Paramecium. For the two multigene families that encode proteins that coassemble to build up complex subcellular structures the analysis presented herein provides the first experimental evidence that the members of these gene families are not functionally redundant. PMID:9529389
Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

PubMed

Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

2009-02-01

Abscisic acid (ABA), the popular plant stress hormone, plays a key role in the regulation of a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors that bind to cis-regulatory elements present in their promoters. We discovered that the motif CGMCACGTGB, which contains the ABA Responsive Element (ABRE) core (ACGT), is over-represented among the promoters of ABA-responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein-coding genes potentially regulated by the ABA-dependent molecular genetic network. RT-PCR analysis of 45 arbitrarily chosen genes from the predicted 402 confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of the ABA-responsive genes showed enrichment of signal transduction and stress-related genes among diverse functional categories.
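A degenerate motif such as CGMCACGTGB can be searched for by expanding its IUPAC codes (M = A or C; B = C, G or T) into a regular expression. The sketch below is illustrative only: the promoter fragment is invented, the function names are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes needed for this motif.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "[AC]", "B": "[CGT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_motif(promoter, motif="CGMCACGTGB"):
    """Return (start, matched sequence) for every forward-strand occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(promoter.upper())]


# Hypothetical promoter fragment, invented for illustration.
promoter = "ttgaCGCCACGTGTccaaCGACACGTGAgg"
hits = find_motif(promoter)
```

The second candidate site in the toy sequence ends in A, which the code B excludes, so only the first site is reported.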
Gene-for-gene disease resistance: bridging insect pest and pathogen defense.

PubMed

Kaloshian, Isgouhi

2004-12-01

Active plant defense, also known as gene-for-gene resistance, is triggered when a plant resistance (R) gene recognizes the intrusion of a specific insect pest or pathogen. Activation of plant defense includes an array of physiological and transcriptional reprogramming. During the past decade, a large number of plant R genes that confer resistance to a diverse group of pathogens have been cloned from a number of plant species. Based on predicted protein structures, these genes are classified into a small number of groups, indicating that structurally related R genes recognize phylogenetically distinct pathogens. An extreme example is the tomato Mi-1 gene, which confers resistance to potato aphid (Macrosiphum euphorbiae), whitefly (Bemisia tabaci), and root-knot nematodes (Meloidogyne spp.). While Mi-1 remains the only cloned insect R gene, there is evidence that a gene-for-gene type of plant defense against piercing-sucking insects exists in a number of plant species.
Hox genes and study of Hox genes in crustacean

NASA Astrophysics Data System (ADS)

Hou, Lin; Chen, Zhijuan; Xu, Mingyu; Lin, Shengguo; Wang, Lu

2004-12-01

Homeobox genes have been discovered in many species. These genes are known to play a major role in specifying regional identity along the anterior-posterior axis of animals from a wide range of phyla. The products of the homeotic genes are a set of evolutionarily conserved transcription factors that control elaborate developmental processes and specify cell fates in metazoans. Crustaceans, which present a variety of body plans not encountered in any other class or phylum of the Metazoa, have been shown to possess a single set of homologous Hox genes, like insects. The ancestral crustacean Hox gene complex comprised ten genes: eight homologous to the homeotic Hox genes and two related to nonhomeotic genes present within the insect Hox complexes. Crustaceans in particular exhibit an abundant diversity of segment specialization and tagmosis. This morphological diversity relates to the Hox genes. In the crustacean body plan, different Hox genes control different segments and tagmosis.
New Gene Evolution: Little Did We Know

PubMed Central

Long, Manyuan; VanKuren, Nicholas W.; Chen, Sidi; Vibranovski, Maria D.

2014-01-01

Genes are perpetually added to and deleted from genomes during evolution. Thus, it is important to understand how new genes are formed and evolve as critical components of the genetic systems determining the biological diversity of life. Two decades of effort have shed light on the process of new gene origination, and have contributed to an emerging comprehensive picture of how new genes are added to genomes, ranging from the mechanisms that generate new gene structures to the presence of new genes in different organisms to the rates and patterns of new gene origination and the roles of new genes in phenotypic evolution. We review each of these aspects of new gene evolution, summarizing the main evidence for the origination and importance of new genes in evolution. We highlight findings showing that new genes rapidly change existing genetic systems that govern various molecular, cellular and phenotypic functions. PMID:24050177
Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

PubMed

Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

2017-08-01

This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using the Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
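Module significance as an average of gene significances can be illustrated with a short sketch. It assumes, following common WGCNA usage rather than anything stated in the abstract, that gene significance is the absolute Pearson correlation between a gene's expression and disease status; the gene names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_significance(expression, trait):
    """Assumed definition: |correlation| between expression and the trait."""
    return abs(pearson(expression, trait))


def module_significance(module_expression, trait):
    """Average gene significance over all genes assigned to one module."""
    scores = [gene_significance(e, trait) for e in module_expression.values()]
    return sum(scores) / len(scores)


# Hypothetical data: 1 = RA sample, 0 = healthy control.
status = [1, 1, 0, 0]
module = {
    "gene_a": [5.0, 5.0, 1.0, 1.0],  # tracks disease status exactly
    "gene_b": [1.0, 3.0, 1.0, 3.0],  # unrelated to disease status
}
ms = module_significance(module, status)
```

Here one gene correlates perfectly with status and the other not at all, so the module significance is their average.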
Gene copy number evolution during tetraploid cotton radiation.

PubMed

Rong, J; Feltus, F A; Liu, L; Lin, L; Paterson, A H

2010-11-01

After polyploid formation, retention or loss of duplicated genes is not random. Genes with some functional domains are convergently restored to 'singleton' state after many independent genome duplications, and have been referred to as 'duplication-resistant' (DR) genes. To further explore the timeframe for their restoration to the singleton state, 27 cotton homologs of genes found to be 'DR' in Arabidopsis were selected based on diagnostic Pfam domains. Their copy numbers were studied using Southern hybridization and sequence analysis in five tetraploid species and their ancestral A and D genome diploids. DR genes had significantly lower copy number than gene families hybridizing to randomly selected cotton ESTs. Three DR genes showed complete loss of D genome-derived homoeologs in some or all tetraploid species. Prior analysis has shown gene loss in polyploid cotton to be rare, and herein only one randomly selected gene showed loss of a homoeolog in only one of the five tetraploid species (Gossypium mustelinum). BAC sequencing confirmed two cases of gene loss in tetraploid cotton. Divergence among 5' sequences of DR genes amplified from G. arboreum, G. raimondii, and Gossypioides kirkii was correlated with gene copy number. These results show that genes containing Pfam domains associated with duplication resistance in Arabidopsis have also been preferentially restored to low copy number after a more recent polyploidization event in cotton. In tetraploid cotton, genes from the progenitor D genome seem to experience more gene copy number divergence than genes from the A genome. Together with D subgenome-biased alterations in gene expression, gene loss may contribute to the relatively larger portion of quantitative trait variation attributable to D than A subgenome chromosomes of tetraploid cotton.
Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.

PubMed

Xu, Lijing; Furlotte, Nicholas; Lin, Yunyue; Heinrich, Kevin; Berry, Michael W; George, Ebenezer O; Homayouni, Ramin

2011-04-14

High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.
Novel gene sets improve set-level classification of prokaryotic gene expression data.

PubMed

Holec, Matěj; Kuželka, Ondřej; Železný, Filip

2015-10-28

Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets, defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
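The conversion from gene-level to set-level features can be sketched as below. The abstract does not say how expression values are aggregated within a set, so averaging is an assumption made here for illustration; the gene identifiers, set names and the function name are hypothetical.

```python
from statistics import mean


def to_set_level(sample, gene_sets):
    """Map one sample's gene-level expression to one feature per gene set.

    Each set-level feature is the mean expression of the set's member genes
    that were measured in the sample (an assumed aggregation rule).
    """
    features = {}
    for name, genes in gene_sets.items():
        values = [sample[g] for g in genes if g in sample]
        if values:  # skip sets with no measured members
            features[name] = mean(values)
    return features


# Hypothetical gene-level profile of one sample.
sample = {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 3.0}
# Hypothetical gene sets, e.g. genes sharing a regulator.
gene_sets = {
    "regulon_A": ["g1", "g2"],
    "regulon_B": ["g3", "g4", "g5"],  # g5 was not measured
}
set_features = to_set_level(sample, gene_sets)
```

The resulting dictionary has one entry per gene set and would replace the longer gene-level vector as classifier input.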
A new type of gene-disruption cassette with a rescue gene for Pichia pastoris.

PubMed

Shibui, Tatsuro; Hara, Hiroyoshi

2017-09-01

Pichia pastoris has been used for the production of many recombinant proteins, and many useful mutant strains have been created. However, the efficiency of mutant isolation by gene-targeting is usually low and the procedure is difficult for those inexperienced in yeast genetics. In order to overcome these issues, we developed a new gene-disruption system with a rescue gene using an inducible Cre/mutant-loxP system. With only short homology regions, the gene-disruption cassette of the system replaces its target-gene locus containing a mutation with a compensatory rescue gene. As the cassette contains the AOX1 promoter-driven Cre gene, when targeted strains are grown on media containing methanol, the DNA fragment, i.e., the marker, rescue and Cre genes, between the mutant-loxP sequences in the cassette is excised, leaving only the remaining mutant-loxP sequence in the genome, and consequently a target gene-disrupted mutant can be isolated. The system was initially validated on ADE2 gene disruption, where the disruption can easily be detected by color-change of the colonies. Then, the system was applied to knocking out the URA3 and OCH1 genes, reported to be difficult to accomplish by conventional gene-targeting methods. All three gene-disruption cassettes with their rescue genes replaced their target genes, and the Cre/mutant-loxP system worked well to successfully isolate their knock-out mutants. This study identified a new gene-disruption system that could be used to effectively and strategically knock out genes of interest, especially those whose deletion is detrimental to growth, without using special strains, e.g., deficient in nonhomologous end-joining, in P. pastoris. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:1201-1208, 2017.
Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

PubMed

Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

2017-07-12

The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the identified upregulated genes here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. Like in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes. Copyright © 2017 the authors.
Gene therapy for ocular diseases meditated by ultrasound and microbubbles (Review)

PubMed Central

WAN, CAIFENG; LI, FENGHUA; LI, HONGLI

2015-01-01

The eye is an ideal target organ for gene therapy as it is easily accessible and immune-privileged. With the increasing insight into the underlying molecular mechanisms of ocular diseases, gene therapy has been proposed as an effective approach. Successful gene therapy depends on efficient gene transfer to targeted cells to provide stable and prolonged gene expression with minimal toxicity. At present, the main hindrance regarding the clinical application of gene therapy is not the lack of an ideal gene, but rather the lack of a safe and efficient method to selectively deliver genes to target cells and tissues. Ultrasound-targeted microbubble destruction (UTMD), with the advantages of high safety, repetitive applicability and tissue targeting, has become a potential strategy for gene and drug delivery. When gene-loaded microbubbles are injected, UTMD is able to enhance the transport of the gene to the targeted cells. High-amplitude oscillations of microbubbles act as cavitation nuclei which can effectively focus ultrasound energy, produce oscillations and disruptions that increase the permeability of the cell membrane and create transient pores in the cell membrane. Thereby, the efficiency of gene therapy can be significantly improved. The UTMD-mediated gene delivery system has been widely used in pre-clinical studies to enhance gene expression in a site-specific manner in a variety of organs. With reasonable application, the effects of sonoporation can be spatially and temporally controlled to improve localized tissue deposition of gene complexes for ocular gene therapy applications. In addition, appropriately powered, focused ultrasound combined with microbubbles can induce a reversible disruption of the blood-retinal barrier with no significant side effects. The present review discusses the current status of gene therapy of ocular diseases as well as studies on gene therapy of ocular diseases mediated by UTMD. PMID:26151686
Recommended nomenclature for five mammalian carboxylesterase gene families: human, mouse, and rat genes and proteins.

PubMed

Holmes, Roger S; Wright, Matthew W; Laulederkind, Stanley J F; Cox, Laura A; Hosokawa, Masakiyo; Imai, Teruko; Ishibashi, Shun; Lehner, Richard; Miyazaki, Masao; Perkins, Everett J; Potter, Phillip M; Redinbo, Matthew R; Robert, Jacques; Satoh, Tetsuo; Yamashita, Tetsuro; Yan, Bingfan; Yokoi, Tsuyoshi; Zechner, Rudolf; Maltais, Lois J

2010-10-01

Mammalian carboxylesterase (CES or Ces) genes encode enzymes that participate in xenobiotic, drug, and lipid metabolism in the body and are members of at least five gene families. Tandem duplications have added more genes for some families, particularly for mouse and rat genomes, which has caused confusion in naming rodent Ces genes. This article describes a new nomenclature system for human, mouse, and rat carboxylesterase genes that identifies homolog gene families and allocates a unique name for each gene. The guidelines of human, mouse, and rat gene nomenclature committees were followed and "CES" (human) and "Ces" (mouse and rat) root symbols were used followed by the family number (e.g., human CES1). Where multiple genes were identified for a family or where a clash occurred with an existing gene name, a letter was added (e.g., human CES4A; mouse and rat Ces1a) that reflected gene relatedness among rodent species (e.g., mouse and rat Ces1a). Pseudogenes were named by adding "P" and a number to the human gene name (e.g., human CES1P1) or by using a new letter followed by ps for mouse and rat Ces pseudogenes (e.g., Ces2d-ps). Gene transcript isoforms were named by adding the GenBank accession ID to the gene symbol (e.g., human CES1_AB119995 or mouse Ces1e_BC019208). This nomenclature improves our understanding of human, mouse, and rat CES/Ces gene families and facilitates research into the structure, function, and evolution of these gene families. It also serves as a model for naming CES genes from other mammalian species.
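The naming rules summarized above can be expressed as two regular expressions, one for human symbols and one for mouse and rat symbols. This is a sketch built only from the examples in the abstract: the accession pattern (capital letters followed by digits) is a loose assumption, and `parse_ces_symbol` is a hypothetical helper, not part of any nomenclature tool.

```python
import re

# Human: CES + family number, optional capital letter, optional P<number>
# for pseudogenes, optional _<GenBank accession> for transcript isoforms.
HUMAN = re.compile(
    r"^CES(?P<family>\d+)(?P<letter>[A-Z])?(?P<pseudo>P\d+)?"
    r"(?:_(?P<accession>[A-Z]+\d+))?$"
)
# Mouse/rat: Ces + family number, optional lowercase letter, optional -ps
# for pseudogenes, optional _<GenBank accession> for transcript isoforms.
RODENT = re.compile(
    r"^Ces(?P<family>\d+)(?P<letter>[a-z])?(?P<pseudo>-ps)?"
    r"(?:_(?P<accession>[A-Z]+\d+))?$"
)


def parse_ces_symbol(symbol):
    """Split a CES/Ces symbol into its parts, or return None if it does not fit."""
    for group, pattern in (("human", HUMAN), ("mouse/rat", RODENT)):
        m = pattern.match(symbol)
        if m:
            return {
                "group": group,
                "family": int(m.group("family")),
                "letter": m.group("letter"),
                "pseudogene": m.group("pseudo") is not None,
                "accession": m.group("accession"),
            }
    return None


# Symbols taken from the examples quoted in the abstract.
examples = ["CES1", "CES4A", "Ces1a", "CES1P1", "Ces2d-ps",
            "CES1_AB119995", "Ces1e_BC019208"]
parsed = {s: parse_ces_symbol(s) for s in examples}
```

All seven example symbols parse, while a symbol with the wrong capitalization, such as `ces1`, is rejected.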
Microarray analysis of gene expression alteration in human middle ear epithelial cells induced by micro particle.

PubMed

Song, Jae-Jun; Kwon, Jee Young; Park, Moo Kyun; Seo, Young Rok

2013-10-01

The primary aim of this study is to reveal the effect of particulate matter (PM) on the human middle ear epithelial cell (HMEEC). The HMEEC was treated with PM (300 μg/ml) for 24 h. Total RNA was extracted and used for microarray analysis. Molecular pathways among differentially expressed genes were further analyzed by using Pathway Studio 9.0 software. For selected genes, the changes in gene expression were confirmed by real-time PCR. A total of 611 genes were regulated by PM. Among them, 366 genes were up-regulated, whereas 245 genes were down-regulated. Up-regulated genes were mainly involved in cellular processes, including reactive oxygen species generation, cell proliferation, apoptosis, cell differentiation, inflammatory response and immune response. Down-regulated genes affected several cellular processes, including cell differentiation, cell cycle, proliferation, apoptosis and cell migration. A total of 21 genes were discovered as crucial components in potential signaling networks containing 2-fold up-regulated genes. Four genes, VEGFA, IL1B, CSF2 and HMOX1, were revealed as key mediator genes among the up-regulated genes. A total of 25 genes were revealed as key modulators in the signaling pathway associated with 2-fold down-regulated genes. Four genes, IGF1R, TIMP1, IL6 and FN1, were identified as the main modulator genes. We identified the differentially expressed genes in PM-treated HMEEC, whose expression profile may provide a useful clue for the understanding of environmental pathophysiology of otitis media. Our work indicates that air pollution, like PM, plays an important role in the pathogenesis of otitis media. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Identification of key candidate genes and pathways in hepatitis B virus-associated acute liver failure by bioinformatical analysis

PubMed Central

Lin, Huapeng; Zhang, Qian; Li, Xiaocheng; Wu, Yushen; Liu, Ye; Hu, Yingchun

2018-01-01

Hepatitis B virus-associated acute liver failure (HBV-ALF) is a rare but life-threatening syndrome that carries high morbidity and mortality. Our study aimed to explore the possible molecular mechanisms of HBV-ALF by means of bioinformatics analysis. In this study, gene expression microarray datasets of HBV-ALF were collected from Gene Expression Omnibus, and differentially expressed genes (DEGs) were identified with the limma package in R. After functional enrichment analysis, we constructed the protein–protein interaction (PPI) network with the Search Tool for the Retrieval of Interacting Genes online database and the weighted gene coexpression network with the WGCNA package in R. Subsequently, we picked out the hub genes among the DEGs. A total of 423 DEGs, with 198 upregulated genes and 225 downregulated genes, were identified between HBV-ALF and normal samples. The upregulated genes were mainly enriched in immune response, and the downregulated genes were mainly enriched in complement and coagulation cascades. Orosomucoid 1 (ORM1), orosomucoid 2 (ORM2), plasminogen (PLG), and aldehyde oxidase 1 (AOX1) were picked out as the hub genes with a high degree in both the PPI network and the weighted gene coexpression network. The weighted gene coexpression network analysis found that 3 of the 5 modules in which upregulated genes were enriched were closely related to the immune system. The downregulated genes were enriched in only one module, and the genes in this module were mainly enriched in the complement and coagulation cascades pathway. In conclusion, 4 genes (ORM1, ORM2, PLG, and AOX1), together with immune response and the complement and coagulation cascades pathway, may take part in the pathogenesis of HBV-ALF, and these candidate genes and pathways could be therapeutic targets for HBV-ALF. PMID:29384847
FARO server: Meta-analysis of gene expression by matching gene expression signatures to a compendium of public gene expression data.

PubMed

Manijak, Mieszko P; Nielsen, Henrik B

2011-06-11

Although systematic analysis of gene annotation is a powerful tool for interpreting gene expression data, it is sometimes blurred by incomplete gene annotation, missing expression response of key genes and secondary gene expression responses. These shortcomings may be partially circumvented by instead matching gene expression signatures to signatures of other experiments. To facilitate this we present the Functional Association Response by Overlap (FARO) server, which matches input signatures to a compendium of 242 gene expression signatures extracted from more than 1700 Arabidopsis microarray experiments. We thereby present a publicly available tool for robust characterization of Arabidopsis gene expression experiments which can point to similar experimental factors in other experiments. The server is available at http://www.cbs.dtu.dk/services/faro/.
Tumor suppressor genes are larger than apoptosis-effector genes and have more regions of active chromatin: Connection to a stochastic paradigm for sequential gene expression programs.

PubMed

Garcia, Marlene; Mauro, James A; Ramsamooj, Michael; Blanck, George

2015-08-03

Apoptosis- and proliferation-effector genes are substantially regulated by the same transactivators, with E2F-1 and Oct-1 being notable examples. The larger proliferation-effector genes have more binding sites for the transactivators that regulate both sets of genes, and proliferation-effector genes have more regions of active chromatin, i.e., DNase I hypersensitive and histone 3, lysine-4 trimethylation sites. Thus, the size differences between the 2 classes of genes suggest a transcriptional regulation paradigm whereby the accumulation of transcription factors that regulate both sets of genes, merely as an aspect of stochastic behavior, accumulate first on the larger proliferation-effector gene "traps," and then accumulate on the apoptosis effector genes, thereby effecting sequential activation of the 2 different gene sets. As IRF-1 and p53 levels increase, tumor suppressor proteins are first activated, followed by the activation of apoptosis-effector genes, for example during S-phase pausing for DNA repair. Tumor suppressor genes are larger than apoptosis-effector genes and have more IRF-1 and p53 binding sites, thereby likewise suggesting a paradigm for transcription sequencing based on stochastic interactions of transcription factors with different gene classes. In this report, using the ENCODE database, we determined that tumor suppressor genes have a greater number of open chromatin regions and histone 3 lysine-4 trimethylation sites, consistent with the idea that a larger gene size can facilitate earlier transcriptional activation via the inclusion of more transactivator binding sites.
Patenting human genes: Chinese academic articles' portrayal of gene patents.

PubMed

Du, Li

2018-04-24

The patenting of human genes has been the subject of debate for decades. While China has gradually come to play an important role in the global genomics-based testing and treatment market, little is known about Chinese scholars' perspectives on patent protection for human genes. A content analysis of academic literature was conducted to identify Chinese scholars' concerns regarding gene patents, including benefits and risks of patenting human genes, attitudes that researchers hold towards gene patenting, and any legal and policy recommendations offered for the gene patent regime in China. 57.2% of articles were written by law professors, but scholars from health sciences, liberal arts, and ethics also participated in discussions on gene patent issues. While discussions of benefits and risks were relatively balanced in the articles, 63.5% of the articles favored gene patenting in general and, of the articles (n = 41) that explored gene patents in the Chinese context, 90.2% supported patent protections for human genes in China. The patentability of human genes was discussed in 33 articles, and 75.8% of these articles reached the conclusion that human genes are patentable. Chinese scholars view the patent regime as an important legal tool to protect the interests of inventors and inventions as well as the genetic resources of China. As such, many scholars support a gene patent system in China. These attitudes towards gene patents remain unchanged following the court ruling in the Myriad case in 2013, but arguments have been raised about the scope of gene patents, in particular that the increasing numbers of gene patents may negatively impact public health in China.
Optimal Reference Genes for Gene Expression Normalization in Trichomonas vaginalis.

PubMed

dos Santos, Odelta; de Vargas Rigo, Graziela; Frasson, Amanda Piccoli; Macedo, Alexandre José; Tasca, Tiana

2015-01-01

Trichomonas vaginalis is the etiologic agent of trichomonosis, the most common non-viral sexually transmitted disease worldwide. This infection is associated with several health consequences, including cervical and prostate cancers and HIV acquisition. Gene expression analysis has been facilitated because of available genome sequences and large-scale transcriptomes in T. vaginalis, particularly using quantitative real-time polymerase chain reaction (qRT-PCR), one of the most used methods for molecular studies. Reference genes for normalization are crucial to ensure the accuracy of this method. However, to the best of our knowledge, a systematic validation of reference genes has not been performed for T. vaginalis. In this study, the transcripts of nine candidate reference genes were quantified using qRT-PCR under different cultivation conditions, and the stability of these genes was compared using the geNorm and NormFinder algorithms. The most stable reference genes were α-tubulin, actin and DNATopII, and, conversely, the widely used T. vaginalis reference genes GAPDH and β-tubulin were less stable. The PFOR gene was used to validate the reliability of the use of these candidate reference genes. As expected, the PFOR gene was upregulated when the trophozoites were cultivated with ferrous ammonium sulfate when the DNATopII, α-tubulin and actin genes were used as normalizing gene. By contrast, the PFOR gene was downregulated when the GAPDH gene was used as an internal control, leading to misinterpretation of the data. These results provide an important starting point for reference gene selection and gene expression analysis with qRT-PCR studies of T. vaginalis.

Validation of reference genes for RT-qPCR studies of gene expression in banana fruit under different experimental conditions.

PubMed

Chen, Lei; Zhong, Hai-ying; Kuang, Jian-fei; Li, Jian-guo; Lu, Wang-jin; Chen, Jian-ye

2011-08-01

Reverse transcription quantitative real-time PCR (RT-qPCR) is a sensitive technique for quantifying gene expression, but its success depends on the stability of the reference gene(s) used for data normalization. Only a few studies on validation of reference genes have been conducted in fruit trees and none in banana yet. In the present work, 20 candidate reference genes were selected, and their expression stability in 144 banana samples were evaluated and analyzed using two algorithms, geNorm and NormFinder. The samples consisted of eight sample sets collected under different experimental conditions, including various tissues, developmental stages, postharvest ripening, stresses (chilling, high temperature, and pathogen), and hormone treatments. Our results showed that different suitable reference gene(s) or combination of reference genes for normalization should be selected depending on the experimental conditions. The RPS2 and UBQ2 genes were validated as the most suitable reference genes across all tested samples. More importantly, our data further showed that the widely used reference genes, ACT and GAPDH, were not the most suitable reference genes in many banana sample sets. In addition, the expression of MaEBF1, a gene of interest that plays an important role in regulating fruit ripening, under different experimental conditions was used to further confirm the validated reference genes. Taken together, our results provide guidelines for reference gene(s) selection under different experimental conditions and a foundation for more accurate and widespread use of RT-qPCR in banana.
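As a rough illustration of the kind of stability ranking geNorm performs, the sketch below computes a per-gene stability measure M as the mean, over all other candidate genes, of the standard deviation of the log2 expression ratios across samples. It assumes the inputs are relative quantities (not raw Ct values) and uses the sample standard deviation; the gene names and numbers are invented for the example and are not data from this study.

```python
import math
import statistics

def genorm_m(expr):
    """Return a stability value M per gene (lower means more stable).

    expr maps a gene name to a list of relative quantities, one per sample,
    with all lists in the same sample order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values

if __name__ == "__main__":
    # Invented quantities: ref_a and ref_b co-vary perfectly, ref_c does not.
    demo = {
        "ref_a": [1.0, 2.0, 4.0],
        "ref_b": [2.0, 4.0, 8.0],
        "ref_c": [1.0, 8.0, 1.0],
    }
    for gene, m in sorted(genorm_m(demo).items(), key=lambda item: item[1]):
        print(gene, round(m, 3))
```

The full geNorm procedure also removes the least stable gene stepwise and recomputes M; that loop is omitted here to keep the sketch short.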
Optimal Reference Genes for Gene Expression Normalization in Trichomonas vaginalis

PubMed Central

dos Santos, Odelta; de Vargas Rigo, Graziela; Frasson, Amanda Piccoli; Macedo, Alexandre José; Tasca, Tiana

2015-01-01

Trichomonas vaginalis is the etiologic agent of trichomonosis, the most common non-viral sexually transmitted disease worldwide. This infection is associated with several health consequences, including cervical and prostate cancers and HIV acquisition. Gene expression analysis has been facilitated because of available genome sequences and large-scale transcriptomes in T. vaginalis, particularly using quantitative real-time polymerase chain reaction (qRT-PCR), one of the most used methods for molecular studies. Reference genes for normalization are crucial to ensure the accuracy of this method. However, to the best of our knowledge, a systematic validation of reference genes has not been performed for T. vaginalis. In this study, the transcripts of nine candidate reference genes were quantified using qRT-PCR under different cultivation conditions, and the stability of these genes was compared using the geNorm and NormFinder algorithms. The most stable reference genes were α-tubulin, actin and DNATopII, and, conversely, the widely used T. vaginalis reference genes GAPDH and β-tubulin were less stable. The PFOR gene was used to validate the reliability of the use of these candidate reference genes. As expected, the PFOR gene was upregulated when the trophozoites were cultivated with ferrous ammonium sulfate when the DNATopII, α-tubulin and actin genes were used as normalizing gene. By contrast, the PFOR gene was downregulated when the GAPDH gene was used as an internal control, leading to misinterpretation of the data. These results provide an important starting point for reference gene selection and gene expression analysis with qRT-PCR studies of T. vaginalis. PMID:26393928
Distinct Trajectories of Massive Recent Gene Gains and Losses in Populations of a Microbial Eukaryotic Pathogen

PubMed Central

Hartmann, Fanny E.; Croll, Daniel

2017-01-01

Differences in gene content are a significant source of variability within species and have an impact on phenotypic traits. However, little is known about the mechanisms responsible for the most recent gene gains and losses. We screened the genomes of 123 worldwide isolates of the major pathogen of wheat Zymoseptoria tritici for robust evidence of gene copy number variation. Based on orthology relationships in three closely related fungi, we identified 599 gene gains and 1,024 gene losses that have not yet reached fixation within the focal species. Our analyses of gene gains and losses segregating in populations showed that gene copy number variation arose preferentially in subtelomeres and in proximity to transposable elements. Recently lost genes were enriched in virulence factors and secondary metabolite gene clusters. In contrast, recently gained genes encoded mostly secreted proteins lacking a conserved domain. We analyzed the frequency spectrum at loci segregating a gene presence–absence polymorphism in four worldwide populations. Recent gene losses showed a significant excess in low-frequency variants compared with genome-wide single nucleotide polymorphisms, which is indicative of strong negative selection against gene losses. Recent gene gains were either under weak negative selection or neutral. We found evidence for strong divergent selection among populations at individual loci segregating a gene presence–absence polymorphism. Hence, gene gains and losses likely contributed to local adaptation. Our study shows that microbial eukaryotes harbor extensive copy number variation within populations and that functional differences among recently gained and lost genes led to distinct evolutionary trajectories. PMID:28981698
Concerted and nonconcerted evolution of the Hsp70 gene superfamily in two sibling species of nematodes.

PubMed

Nikolaidis, Nikolas; Nei, Masatoshi

2004-03-01

We have identified the Hsp70 gene superfamily of the nematode Caenorhabditis briggsae and investigated the evolution of these genes in comparison with Hsp70 genes from C. elegans, Drosophila, and yeast. The Hsp70 genes are classified into three monophyletic groups according to their subcellular localization, namely, cytoplasm (CYT), endoplasmic reticulum (ER), and mitochondria (MT). The Hsp110 genes can be classified into the polyphyletic CYT group and the monophyletic ER group. The different Hsp70 and Hsp110 groups appeared to evolve following the model of divergent evolution. This model can also explain the evolution of the ER and MT genes. On the other hand, the CYT genes are divided into heat-inducible and constitutively expressed genes. The constitutively expressed genes have evolved more or less following the birth-and-death process, and the rates of gene birth and gene death are different between the two nematode species. By contrast, some heat-inducible genes show an intraspecies phylogenetic clustering. This suggests that they are subject to sequence homogenization resulting from gene conversion-like events. In addition, the heat-inducible genes show high levels of sequence conservation in both intra-species and inter-species comparisons, and in most cases, amino acid sequence similarity is higher than nucleotide sequence similarity. This indicates that purifying selection also plays an important role in maintaining high sequence similarity among paralogous Hsp70 genes. Therefore, we suggest that the CYT heat-inducible genes have been subjected to a combination of purifying selection, birth-and-death process, and gene conversion-like events.
Gene and enhancer trap tagging of vascular-expressed genes in poplar trees

Treesearch

Andrew Groover; Joseph R. Fontana; Gayle Dupper; Caiping Ma; Robert Martienssen; Steven Strauss; Richard Meilan

2004-01-01

We report a gene discovery system for poplar trees based on gene and enhancer traps. Gene and enhancer trap vectors carrying the β-glucuronidase (GUS) reporter gene were inserted into the poplar genome via Agrobacterium tumefaciens transformation, where they reveal the expression pattern of genes at or near the insertion sites. Because GUS...
Neighboring Genes Show Correlated Evolution in Gene Expression.

PubMed

Ghanbarian, Avazeh T; Hurst, Laurence D

2015-07-01

When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

PubMed Central

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-01

Background: Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results: We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion: These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

PubMed

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-12

Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.
Gene expression patterns during somatic embryo development and germination in maize Hi II callus cultures.

PubMed

Che, Ping; Love, Tanzy M; Frame, Bronwyn R; Wang, Kan; Carriquiry, Alicia L; Howell, Stephen H

2006-09-01

Gene expression patterns were profiled during somatic embryogenesis in a regeneration-proficient maize hybrid line, Hi II, in an effort to identify genes that might be used as developmental markers or targets to optimize regeneration steps for recovering maize plants from tissue culture. Gene expression profiles were generated from embryogenic calli induced to undergo embryo maturation and germination. Over 1,000 genes in the 12,060 element arrays showed significant time variation during somatic embryo development. A substantial number of genes were downregulated during embryo maturation, largely histone and ribosomal protein genes, which may result from a slowdown in cell proliferation and growth during embryo maturation. The expression of these genes dramatically recovered at germination. Other genes up-regulated during embryo maturation included genes encoding hydrolytic enzymes (nucleases, glucosidases and proteases) and a few storage genes (an alpha-zein and caleosin), which are good candidates for developmental marker genes. Germination is accompanied by the up-regulation of a number of stress response and membrane transporter genes, and, as expected, greening is associated with the up-regulation of many genes encoding photosynthetic and chloroplast components. Thus, some, but not all genes typically associated with zygotic embryogenesis are significantly up or down-regulated during somatic embryogenesis in Hi II maize line regeneration. Although many genes varied in expression throughout somatic embryo development in this study, no statistically significant gene expression changes were detected between total embryogenic callus and callus enriched for transition stage somatic embryos.
Comprehensive analysis of pathway or functionally related gene expression in the National Cancer Institute's anticancer screen.

PubMed

Huang, Ruili; Wallqvist, Anders; Covell, David G

2006-03-01

We have analyzed the level of gene coregulation, using gene expression patterns measured across the National Cancer Institute's 60 tumor cell panels (NCI(60)), in the context of predefined pathways or functional categories annotated by KEGG (Kyoto Encyclopedia of Genes and Genomes), BioCarta, and GO (Gene Ontology). Statistical methods were used to evaluate the level of gene expression coherence (coordinated expression) by comparing intra- and interpathway gene-gene correlations. Our results show that gene expression in pathways, or groups of functionally related genes, has a significantly higher level of coherence than that of a randomly selected set of genes. Transcriptional-level gene regulation appears to be on a "need to be" basis, such that pathways comprising genes encoding closely interacting proteins and pathways responsible for vital cellular processes or processes that are related to growth or proliferation, specifically in cancer cells, such as those engaged in genetic information processing, cell cycle, energy metabolism, and nucleotide metabolism, tend to be more modular (lower degree of gene sharing) and to have genes significantly more coherently expressed than most signaling and regular metabolic pathways. Hierarchical clustering of pathways based on their differential gene expression in the NCI(60) further revealed interesting interpathway communications or interactions indicative of a higher level of pathway regulation. The knowledge of the nature of gene expression regulation and biological pathways can be applied to understanding the mechanism by which small drug molecules interfere with biological systems.
Genes from scratch--the evolutionary fate of de novo genes.

PubMed

Schlötterer, Christian

2015-04-01

Although considered an extremely unlikely event, many genes emerge from previously noncoding genomic regions. This review covers the entire life cycle of such de novo genes. Two competing hypotheses about the process of de novo gene birth are discussed as well as the high death rate of de novo genes. Despite the high death rate, some de novo genes are retained and remain functional, even in distantly related species, through their integration into gene networks. Further studies combining gene expression with ribosome profiling in multiple populations across different species will be instrumental for an improved understanding of the evolutionary processes operating on de novo genes. Copyright © 2015 The Author. Published by Elsevier Ltd.. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Pfleger, Brian F.; Youngquist, Tyler J.

Recombinant cells and methods for improved yield of fatty alcohols. The recombinant cells harbor a recombinant thioesterase gene, a recombinant acyl-CoA synthetase gene, and a recombinant acyl-CoA reductase gene. In addition, a gene product from one or more of an acyl-CoA dehydrogenase gene, an enoyl-CoA hydratase gene, a 3-hydroxyacyl-CoA dehydrogenase gene, and a 3-ketoacyl-CoA thiolase gene in the recombinant cells is functionally deleted. Culturing the recombinant cells produces fatty alcohols at high yields.
Identification and validation of suitable reference genes for RT-qPCR analysis in mouse testis development.

PubMed

Gong, Zu-Kang; Wang, Shuang-Jie; Huang, Yong-Qi; Zhao, Rui-Qiang; Zhu, Qi-Fang; Lin, Wen-Zhen

2014-12-01

RT-qPCR is a commonly used method for evaluating gene expression; however, its accuracy and reliability are dependent upon the choice of appropriate reference gene(s), and there is limited information available on suitable reference gene(s) that can be used in mouse testis at different stages. In this study, using the RT-qPCR method, we investigated the expression variations of six reference genes representing different functional classes (Actb, Gapdh, Ppia, Tbp, Rps29, Hprt1) in mouse testis during embryonic and postnatal development. The expression stabilities of putative reference genes were evaluated using five algorithms: geNorm, NormFinder, Bestkeeper, the comparative delta C(t) method and the integrated tool RefFinder. Analysis of the results showed that Ppia, Gapdh and Actb were the most stable genes and that the geometric mean of Ppia, Gapdh and Actb constitutes an appropriate normalization factor for gene expression studies. The mRNA expression of AT1 as a test gene of interest varied depending upon which of the reference gene(s) was used as an internal control(s). This study suggests that Ppia, Gapdh and Actb are suitable reference genes among the six genes used for RT-qPCR normalization and provides crucial information for transcriptional analyses in future studies of gene expression in the developing mouse testis.
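The idea of using the geometric mean of several reference genes as a normalization factor can be sketched as follows. This is a minimal illustration, assuming the inputs are already relative quantities for one sample; the function names and the numbers in the usage example are hypothetical and do not come from the study.

```python
import math

def normalization_factor(ref_quantities):
    """Geometric mean of the reference-gene relative quantities for one sample."""
    if not ref_quantities:
        raise ValueError("at least one reference gene is required")
    log_sum = sum(math.log(q) for q in ref_quantities)
    return math.exp(log_sum / len(ref_quantities))

def normalize(target_quantity, ref_quantities):
    """Divide the target gene quantity by the sample's normalization factor."""
    return target_quantity / normalization_factor(ref_quantities)

if __name__ == "__main__":
    # Invented quantities standing in for three reference genes in one sample.
    refs = [1.0, 2.0, 4.0]
    print(normalization_factor(refs))  # geometric mean of 1, 2 and 4
    print(normalize(6.0, refs))        # target quantity scaled by that factor
```

A geometric rather than arithmetic mean is the usual choice here because it dampens the influence of a single reference gene with an outlying abundance.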
NDRC: A Disease-Causing Genes Prioritized Method Based on Network Diffusion and Rank Concordance.

PubMed

Fang, Minghong; Hu, Xiaohua; Wang, Yan; Zhao, Junmin; Shen, Xianjun; He, Tingting

2015-07-01

Disease-causing gene prioritization is very important for understanding disease mechanisms and for biomedical applications such as drug design. Previous studies have shown that promising candidate genes are mostly ranked according to their relatedness to known disease genes or closely related disease genes. Therefore, a dangling gene (isolated gene) with no edges in the network cannot be effectively prioritized. These approaches tend to prioritize genes that are highly connected in the PPI network, while performing poorly when applied to loosely connected disease genes. To address these problems, we propose a new disease-causing gene prioritization method based on network diffusion and rank concordance (NDRC). The method is evaluated by leave-one-out cross validation on 1931 diseases in which at least one gene is known to be involved, and it is able to rank the true causal gene first in 849 of all 2542 cases. The experimental results suggest that NDRC significantly outperforms other existing methods such as RWR, VAVIEN, DADA and PRINCE in identifying loosely connected disease genes and successfully ranks dangling genes as potential candidate disease genes. Furthermore, we apply the NDRC method to study three representative diseases: Meckel syndrome 1, Protein C deficiency and Peroxisome biogenesis disorder 1A (Zellweger). Our study has also found that certain complex disease-causing genes can be divided into several modules that are closely associated with different disease phenotypes.
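To make the dangling-gene problem concrete, here is a minimal sketch of random walk with restart (RWR), one of the baseline methods named in the abstract. It is not the NDRC method itself. The toy network, the gene labels and the restart probability of 0.3 are assumptions chosen only for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, iterations=100):
    """Score every node by its steady-state visit probability from the seed set.

    adj maps each node to a list of its neighbours (undirected, so each edge
    appears in both lists); seeds is a set of known disease genes.
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        # Every step, a fraction `restart` of the walker jumps back to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        for node in nodes:
            neighbours = adj[node]
            if not neighbours:
                continue  # a dangling node has no edge to pass probability along
            share = (1.0 - restart) * scores[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        scores = updated
    return scores

if __name__ == "__main__":
    # Toy network: A-B-C form a chain, D is a dangling gene with no edges.
    toy = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
    result = random_walk_with_restart(toy, seeds={"A"})
    for gene in sorted(result, key=result.get, reverse=True):
        print(gene, round(result[gene], 4))
```

In this toy run the isolated node D can never receive any score from the seed, which is the limitation of purely connectivity-based ranking that the abstract describes.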
Characterization of a novel gene at the Gaucher disease locus spanning the region between the glucocerebrosidase (GC) pseudogene and thrombospondin (TSP)3

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ginns, E.I.; Winfield, S.; Sidransky, E.

1994-09-01

The human GC locus on chromosome 1q21 encompasses a 7 kb functional gene encoding the enzyme deficient in Gaucher disease, and a highly homologous sequence 16 kb downstream that has the properties of a pseudogene. A novel gene, gene X, spanning the 6 kb region between the pseudogene and TSP3 has been identified and characterized in the mouse, and appears to be critical for normal embryonic development. As in the mouse, the human gene X is located 5′ to the TSP3 gene and two genes are transcribed divergently from a bidirectional promoter; the direction of transcription of gene X and GC is convergent. However, in the human, gene X and GC are separated by gene X and GC pseudogenes that are the consequence of a gene duplication. The gene X pseudogene lacks the first exon and part of the second exon of the functional gene and may not be transcribed. Northern blot analyses indicate that gene X is transcribed in both normal individuals and in patients with Gaucher disease, but the function of this gene is still unknown. The possibility that mutations in gene X could account for some of the diversity of symptoms encountered in individuals with the more atypical presentations of Gaucher disease is under investigation.
Pyviko: an automated Python tool to design gene knockouts in complex viruses with overlapping genes.

PubMed

Taylor, Louis J; Strebel, Klaus

2017-01-07

Gene knockouts are a common tool used to study gene function in various organisms. However, designing gene knockouts is complicated in viruses, which frequently contain sequences that code for multiple overlapping genes. Designing mutants that can be traced by the creation of new or elimination of existing restriction sites further compounds the difficulty in experimental design of knockouts of overlapping genes. While software is available to rapidly identify restriction sites in a given nucleotide sequence, no existing software addresses experimental design of mutations involving multiple overlapping amino acid sequences in generating gene knockouts. Pyviko performed well on a test set of over 240,000 gene pairs collected from viral genomes deposited in the National Center for Biotechnology Information Nucleotide database, identifying a point mutation which added a premature stop codon within the first 20 codons of the target gene in 93.2% of all tested gene-overprinted gene pairs. This shows that Pyviko can be used successfully in a wide variety of contexts to facilitate the molecular cloning and study of viral overprinted genes. Pyviko is an extensible and intuitive Python tool for designing knockouts of overlapping genes. Freely available as both a Python package and a web-based interface ( http://louiejtaylor.github.io/pyViKO/ ), Pyviko simplifies the experimental design of gene knockouts in complex viruses with overlapping genes.
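The design problem described above can be sketched in a few lines: find a point mutation that creates a premature stop codon in the target gene while leaving the overlapping reading frame's protein unchanged. This is a simplified illustration and not Pyviko's API; the function names, the toy sequence and the assumption that both genes lie on the same strand in fixed frames are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate seq starting at offset `frame`, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[seq[j:j + 3]] for j in range(frame, len(seq) - 2, 3))


def find_knockouts(seq, target_frame, overlap_frame, max_codons=20):
    """List (position, old_base, new_base, codon_index) point mutations that add a
    stop codon to the target frame without changing the overlapping frame's protein."""
    hits = []
    reference = translate(seq, overlap_frame)
    last = min(len(seq), target_frame + 3 * max_codons)
    for i in range(target_frame, last):
        k = (i - target_frame) // 3          # index of the target codon containing i
        start = target_frame + 3 * k
        codon = seq[start:start + 3]
        if len(codon) < 3 or CODON_TABLE[codon] == "*":
            continue
        for base in "ACGT":
            if base == seq[i]:
                continue
            mutant = seq[:i] + base + seq[i + 1:]
            if CODON_TABLE[mutant[start:start + 3]] != "*":
                continue
            if translate(mutant, overlap_frame) == reference:
                hits.append((i, seq[i], base, k))
    return hits


# Hypothetical toy sequence: target gene in frame 0, overlapping gene in frame 1.
toy = "ATGCAAGGTACCGAT"
candidates = find_knockouts(toy, target_frame=0, overlap_frame=1)
```

In the toy sequence, changing position 3 from C to T turns the target codon CAA into the stop codon TAA, while the overlapping codon TGC becomes TGT, which still encodes cysteine.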
Gene family size conservation is a good indicator of evolutionary rates.

PubMed

Chen, Feng-Chi; Chen, Chiuan-Jung; Li, Wen-Hsiung; Chuang, Trees-Juen

2010-08-01

The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
Rapid evolution of avirulence genes in rice blast fungus Magnaporthe oryzae

PubMed Central

2014-01-01

Background: Rice blast fungus Magnaporthe oryzae is one of the most devastating pathogens in rice. Avirulence genes in this fungus share a gene-for-gene relationship with the resistance genes in its host rice. Although numerous studies have shown that rice blast R-genes are extremely diverse and evolve rapidly in their host populations, little is known about the evolutionary patterns of the Avr-genes in the pathogens. Results: Here, six well-characterized Avr-genes and seven randomly selected non-Avr control genes were used to investigate the genetic variations in 62 rice blast strains from different parts of China. Frequent presence/absence polymorphisms, high levels of nucleotide variation (~10-fold higher than non-Avr genes), high non-synonymous to synonymous substitution ratios, and frequent shared non-synonymous substitution were observed in the Avr-genes of these diversified blast strains. In addition, most Avr-genes are closely associated with diverse repeated sequences, which may partially explain the frequent presence/absence polymorphisms in Avr-genes. Conclusion: The frequent deletion and gain of Avr-genes and rapid non-synonymous variations might be the primary mechanisms underlying rapid adaptive evolution of pathogens toward virulence to their host plants, and these features can be used as the indicators for identifying additional Avr-genes. The high number of nucleotide polymorphisms among Avr-gene alleles could also be used to distinguish genetic groups among different strains. PMID:24725999
Neighboring Genes Show Correlated Evolution in Gene Expression

PubMed Central

Ghanbarian, Avazeh T.; Hurst, Laurence D.

2015-01-01

When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543
Evolving ideas about genetics underlying insect virulence to plant resistance in rice-brown planthopper interactions.

PubMed

Kobayashi, Tetsuya

2016-01-01

Many plant-parasite interactions that include major plant resistance genes have subsequently been shown to exhibit features of gene-for-gene interactions between plant Resistance genes and parasite Avirulence genes. The brown planthopper (BPH) Nilaparvata lugens is an important pest of rice (Oryza sativa). Historically, major Resistance genes have played an important role in agriculture. As is common in gene-for-gene interactions, evolution of BPH virulence compromises the effectiveness of singly-deployed resistance genes. It is therefore surprising that laboratory studies of BPH have supported the conclusion that virulence is conferred by changes in many genes rather than a change in a single gene, as is proposed by the gene-for-gene model. Here we review the behaviour, physiology and genetics of the BPH in the context of host plant resistance. A problem for genetic understanding has been the use of various insect populations that differ in frequencies of virulent genotypes. We show that the previously proposed polygenic inheritance of BPH virulence can be explained by the heterogeneity of parental populations. Genetic mapping of Avirulence genes indicates that virulence is a monogenic trait. These evolving concepts, which have brought the gene-for-gene model back into the picture, are accelerating our understanding of rice-BPH interactions at the molecular level. Copyright © 2015 Elsevier Ltd. All rights reserved.

LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

PubMed

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within the animal kingdom in two different lineages: vertebrates and arthropods. The database is built on a web-based Linux-Apache-MySQL-PHP framework and offers an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
Inflammatory Pathway Genes Associated with Inter-Individual Variability in the Trajectories of Morning and Evening Fatigue in Patients Receiving Chemotherapy

PubMed Central

Wright, Fay; Hammer, Marilyn; Paul, Steven M.; Aouizerat, Bradley E.; Kober, Kord M.; Conley, Yvette P.; Cooper, Bruce A.; Dunn, Laura B.; Levine, Jon D.; Melkus, Gail DEramo; Miaskowski, Christine

2017-01-01

Fatigue, a highly prevalent and distressing symptom during chemotherapy (CTX), demonstrates diurnal and interindividual variability in severity. Little is known about the associations between variations in genes involved in inflammatory processes and morning and evening fatigue severity during CTX. The purposes of this study, in a sample of oncology patients (N=543) with breast, gastrointestinal (GI), gynecological (GYN), or lung cancer who received two cycles of CTX, were to determine whether variations in genes involved in inflammatory processes were associated with inter-individual variability in initial levels as well as in the trajectories of morning and evening fatigue. Patients completed the Lee Fatigue Scale to determine morning and evening fatigue severity a total of six times over two cycles of CTX. Using a whole exome array, 309 single nucleotide polymorphisms (SNPs) among the 64 candidate genes that passed all quality control filters were evaluated using hierarchical linear modeling (HLM). Based on the results of the HLM analyses, the final SNPs were evaluated for their potential impact on protein function using two bioinformatics tools. The following inflammatory pathways were represented: chemokines (3 genes); cytokines (12 genes); inflammasome (11 genes); Janus kinase/signal transducers and activators of transcription (JAK/STAT, 10 genes); mitogen-activated protein kinase/jun amino-terminal kinases (MAPK/JNK, 3 genes); nuclear factor-kappa beta (NFkB, 18 genes); and NFkB and MAP/JNK (7 genes). After controlling for self-reported and genomic estimates of race and ethnicity, polymorphisms in six genes from the cytokine (2 genes), inflammasome (2 genes), and NFkB (2 genes) pathways were associated with both morning and evening fatigue. Polymorphisms in six genes from the inflammasome (1 gene), JAK/STAT (1 gene), and NFkB (4 genes) pathways were associated with only morning fatigue. Polymorphisms in three genes from the inflammasome (2 genes) and the NFkB (1 gene) pathways were associated with only evening fatigue. Taken together, these findings add to the growing body of evidence that suggests that morning and evening fatigue are distinct symptoms. PMID:28110208
Cloning of novel rice blast resistance genes from two rapidly evolving NBS-LRR gene families in rice.

PubMed

Guo, Changjiang; Sun, Xiaoguang; Chen, Xiao; Yang, Sihai; Li, Jing; Wang, Long; Zhang, Xiaohui

2016-01-01

Most rice blast resistance genes (R-genes) encode proteins with nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains. Our previous study has shown that more rice blast R-genes can be cloned in rapidly evolving NBS-LRR gene families. In the present study, two rapidly evolving R-gene families in rice were selected for cloning a subset of genes from their paralogs in three resistant rice lines. A total of eight functional blast R-genes were identified among nine NBS-LRR genes, and some of these showed resistance to three or more blast strains. Evolutionary analysis indicated that high nucleotide diversity of coding regions served as important parameters in the determination of gene resistance. We also observed that amino-acid variants (nonsynonymous mutations, insertions, or deletions) in essential motifs of the NBS domain contribute to the blast resistance capacity of NBS-LRR genes. These results suggested that the NBS regions might also play an important role in resistance specificity determination. On the other hand, different splicing patterns of introns were commonly observed in R-genes. The results of the present study contribute to improving the effectiveness of R-gene identification by using evolutionary analysis method and acquisition of novel blast resistance genes.
Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali

2011-01-01

Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database, and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.
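The two-round expansion described above (bait genes, then their co-expression neighbors, then the neighbors of that enlarged set) can be sketched as a simple set operation. The gene labels and the toy co-expression map below are hypothetical, and the sketch assumes the network is available as a plain neighbor dictionary rather than through any particular database interface.

```python
def expand_with_neighbors(gene_set, coexpression):
    """Return the input genes plus every direct co-expression neighbor."""
    expanded = set(gene_set)
    for gene in gene_set:
        expanded |= coexpression.get(gene, set())
    return expanded


# Hypothetical toy co-expression network (gene -> set of co-expressed genes).
coexpression = {
    "bait1": {"n1", "n2"},
    "bait2": {"n2"},
    "n1": {"bait1", "n3"},
    "n2": {"bait1", "bait2"},
    "n3": {"n1", "n4"},
    "n4": {"n3"},
}

baits = {"bait1", "bait2"}
round1 = expand_with_neighbors(baits, coexpression)   # baits plus direct neighbors
round2 = expand_with_neighbors(round1, coexpression)  # re-query with the enlarged set
```

Stopping after two rounds keeps the candidate list anchored to the literature-supported baits; each further round would pull in genes that are progressively more distant from them.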
Interplay of bistable kinetics of gene expression during cellular growth

NASA Astrophysics Data System (ADS)

Zhdanov, Vladimir P.

2009-02-01

In cells, the bistable kinetics of gene expression can be observed on the level of (i) one gene with positive feedback between protein and mRNA production, (ii) two genes with negative mutual feedback between protein and mRNA production, or (iii) in more complex cases. We analyse the interplay of two genes of type (ii) governed by a gene of type (i) during cellular growth. In particular, using kinetic Monte Carlo simulations, we show that in the case where gene 1, operating in the bistable regime, regulates mutually inhibiting genes 2 and 3, also operating in the bistable regime, the latter genes may eventually be trapped either to the state with high transcriptional activity of gene 2 and low activity of gene 3 or to the state with high transcriptional activity of gene 3 and low activity of gene 2. The probability to get to one of these states depends on the values of the model parameters. If genes 2 and 3 are kinetically equivalent, the probability is equal to 0.5. Thus, our model illustrates how different intracellular states can be chosen at random with predetermined probabilities. This type of kinetics of gene expression may be behind complex processes occurring in cells, e.g., behind the choice of the fate by stem cells.
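The random choice between two states can be illustrated with a small kinetic Monte Carlo (Gillespie) simulation of two mutually inhibiting genes. This is a deliberately reduced sketch, not the model of the paper: it tracks only two protein counts, omits mRNA, the governing gene and cell growth, and every rate constant below is an assumed value.

```python
import random


def simulate_toggle(t_end=50.0, k=30.0, K=10.0, n=4, d=1.0, seed=None):
    """Gillespie simulation of proteins A and B that repress each other's synthesis.

    Returns the protein counts (a, b) at time t_end, starting from (0, 0).
    """
    rng = random.Random(seed)
    t, a, b = 0.0, 0, 0
    while True:
        rates = [
            k / (1.0 + (b / K) ** n),  # synthesis of A, repressed by B
            k / (1.0 + (a / K) ** n),  # synthesis of B, repressed by A
            d * a,                     # degradation of A
            d * b,                     # degradation of B
        ]
        total = sum(rates)
        t += rng.expovariate(total)    # waiting time to the next reaction
        if t > t_end:
            return a, b
        pick = rng.random() * total    # choose a reaction in proportion to its rate
        if pick < rates[0]:
            a += 1
        elif pick < rates[0] + rates[1]:
            b += 1
        elif pick < rates[0] + rates[1] + rates[2]:
            a -= 1
        else:
            b -= 1


def fraction_a_wins(runs=100, **kwargs):
    """Fraction of independent runs that end with more A than B."""
    wins = 0
    for seed in range(runs):
        a, b = simulate_toggle(seed=seed, **kwargs)
        wins += a > b
    return wins / runs
```

With identical parameters for both genes, the two outcomes are equivalent by symmetry, so the fraction returned by `fraction_a_wins` should scatter around 0.5, in line with the statement above for kinetically equivalent genes.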
Gene therapy in pancreatic cancer

PubMed Central

Liu, Si-Xue; Xia, Zhong-Sheng; Zhong, Ying-Qiang

2014-01-01

Pancreatic cancer (PC) is a highly lethal disease and notoriously difficult to treat. Only a small proportion of PC patients are eligible for surgical resection, whilst conventional chemoradiotherapy has only a modest effect with substantial toxicity. Gene therapy has become a widely investigated new therapeutic approach for PC. This article reviews the basic rationale, gene delivery methods, therapeutic targets and developments of laboratory research and clinical trials in gene therapy of PC by searching the literature published in English using the PubMed database and analyzing clinical trials registered on the Gene Therapy Clinical Trials Worldwide website (http://www.wiley.co.uk/genmed/clinical). Viral vectors are the main gene delivery tools in gene therapy of cancer, and oncolytic viruses in particular show promise because of their tumor-targeting property. Efficient therapeutic targets for gene therapy include the tumor suppressor gene p53, the mutant oncogene K-ras, the anti-angiogenesis gene VEGFR, the suicide genes HSV-TK, cytosine deaminase and cytochrome p450, multiple cytokine genes and so on. Combining different targets, or combining gene therapy with traditional chemoradiotherapy, may be a more effective approach to improve the efficacy of cancer gene therapy. Cancer gene therapy is not yet applied in clinical practice, but basic and clinical studies have demonstrated its safety and clinical benefits. Gene therapy will be a new and promising field for the treatment of PC. PMID:25309069
Exact Algorithms for Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.

PubMed

Kordi, Misagh; Bansal, Mukul S

2017-06-01

Duplication-Transfer-Loss (DTL) reconciliation is a powerful method for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation seeks to reconcile gene trees with species trees by postulating speciation, duplication, transfer, and loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. In practice, however, gene trees are often non-binary due to uncertainty in the gene tree topologies, and DTL reconciliation with non-binary gene trees is known to be NP-hard. In this paper, we present the first exact algorithms for DTL reconciliation with non-binary gene trees. Specifically, we (i) show that the DTL reconciliation problem for non-binary gene trees is fixed-parameter tractable in the maximum degree of the gene tree, (ii) present an exponential-time, but in-practice efficient, algorithm to track and enumerate all optimal binary resolutions of a non-binary input gene tree, and (iii) apply our algorithms to a large empirical data set of over 4700 gene trees from 100 species to study the impact of gene tree uncertainty on DTL-reconciliation and to demonstrate the applicability and utility of our algorithms. The new techniques and algorithms introduced in this paper will help biologists avoid incorrect evolutionary inferences caused by gene tree uncertainty.
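To see why the maximum degree of the gene tree is the natural parameter, it helps to count how many binary resolutions a single unresolved node admits. The sketch below uses the standard count of rooted binary trees on k labeled leaves, (2k-3)!!; it counts every possible resolution rather than only the optimal ones the paper enumerates, and the function names are hypothetical.

```python
def resolutions_of_polytomy(k):
    """Number of rooted binary trees that resolve a node with k children: (2k-3)!!."""
    if k < 2:
        raise ValueError("an internal node needs at least two children")
    count = 1
    for odd in range(3, 2 * k - 2, 2):  # multiply 3 * 5 * ... * (2k - 3)
        count *= odd
    return count


def resolutions_of_tree(child_counts):
    """Total binary resolutions of a tree, given the child count of each internal node."""
    total = 1
    for k in child_counts:
        total *= resolutions_of_polytomy(k)  # nodes are resolved independently
    return total


for k in range(2, 9):
    print(k, resolutions_of_polytomy(k))
```

The count grows super-exponentially with the degree of the node but only multiplicatively across nodes, so bounding the maximum degree bounds the work needed at each node.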
Silencing of six susceptibility genes results in potato late blight resistance.

PubMed

Sun, Kaile; Wolters, Anne-Marie A; Vossen, Jack H; Rouwet, Maarten E; Loonen, Annelies E H M; Jacobsen, Evert; Visser, Richard G F; Bai, Yuling

2016-10-01

Phytophthora infestans, the causal agent of late blight, is a major threat to commercial potato production worldwide. Significant costs are required for crop protection to secure yield. Many dominant genes for resistance (R-genes) to potato late blight have been identified, and some of these R-genes have been applied in potato breeding. However, the P. infestans population rapidly accumulates new virulent strains that render R-genes ineffective. Here we introduce a new class of resistance which is based on the loss-of-function of a susceptibility gene (S-gene) encoding a product exploited by pathogens during infection and colonization. Impaired S-genes primarily result in recessive resistance traits in contrast to recognition-based resistance that is governed by dominant R-genes. In Arabidopsis thaliana, many S-genes have been detected in screens of mutant populations. In the present study, we selected 11 A. thaliana S-genes and silenced orthologous genes in the potato cultivar Desiree, which is highly susceptible to late blight. The silencing of five genes resulted in complete resistance to the P. infestans isolate Pic99189, and the silencing of a sixth S-gene resulted in reduced susceptibility. The application of S-genes to potato breeding for resistance to late blight is further discussed.
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights

PubMed Central

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-01

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. PMID:26750448
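For reference, the baseline that LEGO is compared against, a one-sided Fisher's exact test on the overlap between a gene list and a gene set, reduces to a hypergeometric tail probability. The sketch below shows only that baseline and none of LEGO's network-based weighting; the function name and the example counts are hypothetical.

```python
from math import comb


def ora_p_value(universe, gene_set_size, list_size, overlap):
    """One-sided over-representation p-value: P(X >= overlap) for a hypergeometric X.

    universe: number of genes in the background
    gene_set_size: genes in the pathway or gene set
    list_size: genes in the query list
    overlap: query genes that fall in the gene set
    """
    upper = min(gene_set_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(gene_set_size, x) * comb(universe - gene_set_size, list_size - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene pathway,
# a 200-gene query list, and 8 genes shared between the two.
p = ora_p_value(20000, 100, 200, 8)
```

Every gene in the set contributes equally to this count, which is the limitation the abstract points out: the test sees a pathway only as an unweighted list of genes.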
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

PubMed

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-11

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
The drug target genes show higher evolutionary conservation than non-target genes.

PubMed

Lv, Wenhua; Xu, Yongdeng; Guo, Yiying; Yu, Ziqi; Feng, Guanglong; Liu, Panpan; Luan, Meiwei; Zhu, Hongjie; Liu, Guiyou; Zhang, Mingming; Lv, Hongchao; Duan, Lian; Shang, Zhenwei; Li, Jin; Jiang, Yongshuai; Zhang, Ruijie

2016-01-26

Although evidence indicates that drug target genes share some common evolutionary features, few studies have analyzed the evolutionary features of drug targets as a whole. Therefore, we conducted an analysis to investigate the evolutionary characteristics of drug target genes. We compared the evolutionary conservation of human drug target genes and non-target genes by combining evolutionary features with network topological properties in the human protein-protein interaction network. The evolutionary rate, conservation score and percentage of orthologous genes across 21 species were included in our study. In addition, four topological features, namely the average shortest path length, betweenness centrality, clustering coefficient and degree, were considered for comparison. We obtained four results. Compared with non-drug target genes, 1) drug target genes had lower evolutionary rates; 2) drug target genes had higher conservation scores; 3) drug target genes had higher percentages of orthologous genes; and 4) drug target genes had a tighter network structure, including higher degrees, betweenness centrality and clustering coefficients and lower average shortest path lengths. These results demonstrate that drug target genes are more evolutionarily conserved than non-drug target genes. We hope that our study will provide valuable information for other researchers who are interested in the evolutionary conservation of drug targets.
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio).

PubMed

Liu, Xiang; Li, Shangqi; Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Xu, Peng

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp.
Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.

PubMed

Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis

2014-12-01

Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out a comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data, with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides and lacks a homolog in other methylobacteria, but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy-related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroquinoline quinone synthesis, which is essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the model methylotroph ME-AM1 genome. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Validation of housekeeping genes as an internal control for gene expression studies in Giardia lamblia using quantitative real-time PCR.

PubMed

Marcial-Quino, Jaime; Fierro, Francisco; De la Mora-De la Mora, Ignacio; Enríquez-Flores, Sergio; Gómez-Manzo, Saúl; Vanoye-Carlo, America; Garcia-Torres, Itzhel; Sierra-Palacios, Edgar; Reyes-Vivas, Horacio

2016-04-25

The analysis of transcript levels of specific genes is important for understanding transcriptional regulation and for the characterization of gene function. Real-time quantitative reverse transcriptase PCR (RT-qPCR) has become a powerful tool to quantify gene expression. The objective of this study was to identify reliable housekeeping genes in Giardia lamblia. Twelve genes were selected for this purpose, and their expression was analyzed in the wild type WB strain and in two strains with resistance to nitazoxanide (NTZ) and metronidazole (MTZ), respectively. RefFinder software analysis showed that the expression of the genes is different in the three strains. The integrated data from the four analyses showed that the NADH oxidase (NADH) and aldolase (ALD) genes were the most steadily expressed genes, whereas the glyceraldehyde-3-phosphate dehydrogenase gene was the most unstable. Additionally, the relative expression of seven genes was quantified in the NTZ- and MTZ-resistant strains by RT-qPCR, using the aldolase gene as the internal control, and the results showed a consistent differential pattern of expression in both strains. The housekeeping genes found in this work will facilitate the analysis of mRNA expression levels of other genes of interest in G. lamblia. Copyright © 2016 Elsevier B.V. All rights reserved.
Identifying potential maternal genes of Bombyx mori using digital gene expression profiling

PubMed Central

Xu, Pingzhen

2018-01-01

Maternal genes present in mature oocytes play a crucial role in the early development of silkworm. Although maternal genes have been widely studied in many other species, there has been limited research in Bombyx mori. High-throughput next generation sequencing provides a practical method for gene discovery on a genome-wide level. Herein, a transcriptome study was used to identify maternal-related genes from silkworm eggs. Unfertilized eggs from five different stages of early development were used to track changes in gene expression. The expressed genes showed different patterns over time. Seventy-six maternal genes were annotated according to homology analysis with Drosophila melanogaster. More than half of the differentially expressed maternal genes fell into four expression patterns, while the expression patterns showed a downward trend over time. The functional annotation of these maternal genes was mainly related to transcription factor activity, growth factor activity, nucleic acid binding, RNA binding, ATP binding, and ion binding. Additionally, twenty-two gene clusters including maternal genes were identified from 18 scaffolds. Altogether, we plotted a profile for the maternal genes of Bombyx mori using a digital gene expression profiling method. This will provide the basis for maternal-specific signature research and improve the understanding of the early development of silkworm. PMID:29462160
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio)

PubMed Central

Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A.

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp. PMID:27058731
Identification of gene expression profiles and key genes in subchondral bone of osteoarthritis using weighted gene coexpression network analysis.

PubMed

Guo, Sheng-Min; Wang, Jian-Xiong; Li, Jin; Xu, Fang-Yuan; Wei, Quan; Wang, Hai-Ming; Huang, Hou-Qiang; Zheng, Si-Lin; Xie, Yu-Jie; Zhang, Chi

2018-06-15

Osteoarthritis (OA) significantly influences the quality of life of people around the world. It is urgent to find an effective way to understand the genetic etiology of OA. We used weighted gene coexpression network analysis (WGCNA) to explore the key genes involved in the subchondral bone pathological process of OA. Fifty gene expression profiles of GSE51588 were downloaded from the Gene Expression Omnibus database. The OA-associated genes and gene ontologies were acquired from JuniorDoc. Weighted gene coexpression network analysis was used to find disease-related networks based on 21756 gene expression correlation coefficients; hub-genes with the highest connectivity in each module were selected, and the correlation between module eigengene and clinical traits was calculated. The genes in the traits-related gene coexpression modules were subject to functional annotation and pathway enrichment analysis using ClusterProfiler. A total of 73 gene modules were identified, of which 12 modules were found with high connectivity with clinical traits. Five modules were found with enriched OA-associated genes. Moreover, 310 OA-associated genes were found, and 34 of them were among hub-genes in each module. Consequently, enrichment results indicated some key metabolic pathways, such as extracellular matrix (ECM)-receptor interaction (hsa04512), focal adhesion (hsa04510), the phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway (PI3K-AKT) (hsa04151), transforming growth factor beta pathway, and Wnt pathway. We intended to identify some core genes, collagen (COL)6A3, COL6A1, ITGA11, BAMBI, and HCK, which could influence downstream signaling pathways once they were activated. In this study, we identified important genes within key coexpression modules, which associate with a pathological process of subchondral bone in OA. Functional analysis results could provide important information to understand the mechanism of OA. © 2018 Wiley Periodicals, Inc.
Ancestral and more recently acquired syntenic relationships of MADS-box genes uncovered by the Physcomitrella patens pseudochromosomal genome assembly.

PubMed

Barker, Elizabeth I; Ashton, Neil W

2016-03-01

The Physcomitrella pseudochromosomal genome assembly revealed previously invisible synteny enabling realisation of the full potential of shared synteny as a tool for probing evolution of this plant's MADS-box gene family. Assembly of the sequenced genome of Physcomitrella patens into 27 mega-scaffolds (pseudochromosomes) has confirmed the major predictions of our earlier model of expansion of the MADS-box gene family in the Physcomitrella lineage. Additionally, microsynteny has been conserved in the immediate vicinity of some recent duplicates of MADS-box genes. However, comparison of non-syntenic MIKC MADS-box genes and neighbouring genes indicates that chromosomal rearrangements and/or sequence degeneration have destroyed shared synteny over longer distances (macrosynteny) around MADS-box genes despite subsets comprising two or three MIKC genes having remained syntenic. In contrast, half of the type I MADS-box genes have been transposed creating new syntenic relations with MIKC genes. This implies that conservation of ancient ancestral synteny of MIKC genes and of more recently acquired synteny of type I and MIKC genes may be selectively advantageous. Our revised model predicts the birth rate of MIKC genes in Physcomitrella is higher than that of type I genes. However, this difference is attributable to an early tandem duplication and an early segmental duplication of MIKC genes prior to the two polyploidisations that account for most of the expansion of the MADS-box gene family in Physcomitrella. Furthermore, this early segmental duplication spawned two chromosomal lineages: one with a MIKC (C) gene, belonging to the PPM2 clade, in close proximity to one or a pair of MIKC* genes and another with a MIKC (C) gene, belonging to the PpMADS-S clade, characterised by greater separation from syntenic MIKC* genes. Our model has evolutionary implications for the Physcomitrella karyotype.
DynGO: a tool for visualizing and mining of Gene Ontology and its associations

PubMed Central

Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H

2005-01-01

Background A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The results are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion We have presented a standalone package DynGO that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval.
The complete documentation and software are freely available for download from the website. PMID:16091147
Gene set analysis of purine and pyrimidine antimetabolites cancer therapies.

PubMed

Fridley, Brooke L; Batzler, Anthony; Li, Liang; Li, Fang; Matimba, Alice; Jenkins, Gregory D; Ji, Yuan; Wang, Liewei; Weinshilboum, Richard M

2011-11-01

Responses to therapies, either with regard to toxicities or efficacy, are expected to involve complex relationships of gene products within the same molecular pathway or functional gene set. Therefore, pathways or gene sets, as opposed to single genes, may better reflect the true underlying biology and may be more appropriate units for analysis of pharmacogenomic studies. Application of such methods to pharmacogenomic studies may enable the detection of more subtle effects of multiple genes in the same pathway that may be missed by assessing each gene individually. A gene set analysis of 3821 gene sets is presented assessing the association between basal messenger RNA expression and drug cytotoxicity using ethnically defined human lymphoblastoid cell lines for two classes of drugs: pyrimidines [gemcitabine (dFdC) and arabinoside] and purines [6-thioguanine and 6-mercaptopurine]. The gene set nucleoside-diphosphatase activity was found to be significantly associated with both dFdC and arabinoside, whereas gene set γ-aminobutyric acid catabolic process was associated with dFdC and 6-thioguanine. These gene sets were significantly associated with the phenotype even after adjusting for multiple testing. In addition, five associated gene sets were found in common between the pyrimidines and two gene sets for the purines (3',5'-cyclic-AMP phosphodiesterase activity and γ-aminobutyric acid catabolic process) with a P value of less than 0.0001. Functional validation was attempted with four genes each in gene sets for thiopurine and pyrimidine antimetabolites. All four genes selected from the pyrimidine gene sets (PSME3, CANT1, ENTPD6, ADRM1) were validated, but only one (PDE4D) was validated for the thiopurine gene sets. In summary, results from the gene set analysis of pyrimidine and purine therapies, used often in the treatment of various cancers, provide novel insight into the relationship between genomic variation and drug response.

Horizontal gene transfer of microbial cellulases into nematode genomes is associated with functional assimilation and gene turnover

PubMed Central

2011-01-01

Background Natural acquisition of novel genes from other organisms by horizontal or lateral gene transfer is well established for microorganisms. There is now growing evidence that horizontal gene transfer also plays important roles in the evolution of eukaryotes. Genome-sequencing and EST projects of plant and animal associated nematodes such as Brugia, Meloidogyne, Bursaphelenchus and Pristionchus indicate horizontal gene transfer as a key adaptation towards parasitism and pathogenicity. However, little is known about the functional activity and evolutionary longevity of genes acquired by horizontal gene transfer and the mechanisms favoring such processes. Results We examine the transfer of cellulase genes to the free-living and beetle-associated nematode Pristionchus pacificus, for which detailed phylogenetic knowledge is available, to address predictions by evolutionary theory for successful gene transfer. We used transcriptomics in seven Pristionchus species and three other related diplogastrid nematodes with a well-defined phylogenetic framework to study the evolution of ancestral cellulase genes acquired by horizontal gene transfer. We performed intra-species, inter-species and inter-genic analysis by comparing the transcriptomes of these ten species and tested for cellulase activity in each species. Species with cellulase genes in their transcriptome always exhibited cellulase activity indicating functional integration into the host's genome and biology. The phylogenetic profile of cellulase genes was congruent with the species phylogeny demonstrating gene longevity. Cellulase genes show notable turnover with elevated birth and death rates. Comparison by sequencing of three selected cellulase genes in 24 natural isolates of Pristionchus pacificus suggests these high evolutionary dynamics to be associated with copy number variations and positive selection. 
Conclusion We could demonstrate functional integration of acquired cellulase genes into the nematode's biology as predicted by theory. Thus, functional assimilation, remarkable gene turnover and selection might represent key features of horizontal gene transfer events in nematodes. PMID:21232122
Horizontal gene transfer of microbial cellulases into nematode genomes is associated with functional assimilation and gene turnover.

PubMed

Mayer, Werner E; Schuster, Lisa N; Bartelmes, Gabi; Dieterich, Christoph; Sommer, Ralf J

2011-01-13

Natural acquisition of novel genes from other organisms by horizontal or lateral gene transfer is well established for microorganisms. There is now growing evidence that horizontal gene transfer also plays important roles in the evolution of eukaryotes. Genome-sequencing and EST projects of plant and animal associated nematodes such as Brugia, Meloidogyne, Bursaphelenchus and Pristionchus indicate horizontal gene transfer as a key adaptation towards parasitism and pathogenicity. However, little is known about the functional activity and evolutionary longevity of genes acquired by horizontal gene transfer and the mechanisms favoring such processes. We examine the transfer of cellulase genes to the free-living and beetle-associated nematode Pristionchus pacificus, for which detailed phylogenetic knowledge is available, to address predictions by evolutionary theory for successful gene transfer. We used transcriptomics in seven Pristionchus species and three other related diplogastrid nematodes with a well-defined phylogenetic framework to study the evolution of ancestral cellulase genes acquired by horizontal gene transfer. We performed intra-species, inter-species and inter-genic analysis by comparing the transcriptomes of these ten species and tested for cellulase activity in each species. Species with cellulase genes in their transcriptome always exhibited cellulase activity indicating functional integration into the host's genome and biology. The phylogenetic profile of cellulase genes was congruent with the species phylogeny demonstrating gene longevity. Cellulase genes show notable turnover with elevated birth and death rates. Comparison by sequencing of three selected cellulase genes in 24 natural isolates of Pristionchus pacificus suggests these high evolutionary dynamics to be associated with copy number variations and positive selection. 
We could demonstrate functional integration of acquired cellulase genes into the nematode's biology as predicted by theory. Thus, functional assimilation, remarkable gene turnover and selection might represent key features of horizontal gene transfer events in nematodes.
Evaluation and Validation of Housekeeping Genes as Reference for Gene Expression Studies in Pigeonpea (Cajanus cajan) Under Drought Stress Conditions

PubMed Central

Sinha, Pallavi; Singh, Vikas K.; Suryanarayana, V.; Krishnamurthy, L.; Saxena, Rachit K.; Varshney, Rajeev K.

2015-01-01

Gene expression analysis using quantitative real-time PCR (qRT-PCR) is a very sensitive technique and its sensitivity depends on the stable performance of reference gene(s) used in the study. A number of housekeeping genes have been used in various expression studies in many crops; however, their expression was found to be inconsistent under different stress conditions. As a result, species-specific housekeeping genes have been recommended for different expression studies in several crop species. However, such specific housekeeping genes have not been reported in the case of pigeonpea (Cajanus cajan) despite the fact that genome sequence has become available for the crop. To identify the stable housekeeping genes in pigeonpea for expression analysis under drought stress conditions, the relative expression variations of 10 commonly used housekeeping genes (EF1α, UBQ10, GAPDH, 18SrRNA, 25SrRNA, TUB6, ACT1, IF4α, UBC and HSP90) were studied on root, stem and leaf tissues of Asha (ICPL 87119). Three statistical algorithms (geNorm, NormFinder and BestKeeper) were used to define the stability of candidate genes. geNorm analysis identified IF4α and TUB6 as the most stable housekeeping genes; however, NormFinder analysis determined IF4α and HSP90 as the most stable housekeeping genes under drought stress conditions. Subsequently, validation of the identified candidate genes was undertaken in qRT-PCR based gene expression analysis of the uspA gene, which plays an important role under drought stress conditions in pigeonpea. The relative quantification of the uspA gene varied according to the internal controls (stable and least stable genes), thus highlighting the importance of the choice of as well as validation of internal controls in such experiments. The identified stable and validated housekeeping genes will facilitate gene expression studies in pigeonpea especially under drought stress conditions. PMID:25849964
Evaluation and validation of housekeeping genes as reference for gene expression studies in pigeonpea (Cajanus cajan) under drought stress conditions.

PubMed

Sinha, Pallavi; Singh, Vikas K; Suryanarayana, V; Krishnamurthy, L; Saxena, Rachit K; Varshney, Rajeev K

2015-01-01

Gene expression analysis using quantitative real-time PCR (qRT-PCR) is a very sensitive technique and its sensitivity depends on the stable performance of reference gene(s) used in the study. A number of housekeeping genes have been used in various expression studies in many crops; however, their expression was found to be inconsistent under different stress conditions. As a result, species-specific housekeeping genes have been recommended for different expression studies in several crop species. However, such specific housekeeping genes have not been reported in the case of pigeonpea (Cajanus cajan) despite the fact that genome sequence has become available for the crop. To identify the stable housekeeping genes in pigeonpea for expression analysis under drought stress conditions, the relative expression variations of 10 commonly used housekeeping genes (EF1α, UBQ10, GAPDH, 18SrRNA, 25SrRNA, TUB6, ACT1, IF4α, UBC and HSP90) were studied on root, stem and leaf tissues of Asha (ICPL 87119). Three statistical algorithms (geNorm, NormFinder and BestKeeper) were used to define the stability of candidate genes. geNorm analysis identified IF4α and TUB6 as the most stable housekeeping genes; however, NormFinder analysis determined IF4α and HSP90 as the most stable housekeeping genes under drought stress conditions. Subsequently, validation of the identified candidate genes was undertaken in qRT-PCR based gene expression analysis of the uspA gene, which plays an important role under drought stress conditions in pigeonpea. The relative quantification of the uspA gene varied according to the internal controls (stable and least stable genes), thus highlighting the importance of the choice of as well as validation of internal controls in such experiments. The identified stable and validated housekeeping genes will facilitate gene expression studies in pigeonpea especially under drought stress conditions.
Detection of biomarkers for Hepatocellular Carcinoma using a hybrid univariate gene selection methods

PubMed Central

2012-01-01

Background Discovering new biomarkers has a great role in improving early diagnosis of Hepatocellular carcinoma (HCC). The experimental determination of biomarkers needs a lot of time and money. This motivates this work to use in-silico prediction of biomarkers to reduce the number of experiments required for detecting new ones. This is achieved by extracting the most representative genes in microarrays of HCC. Results In this work, we provide a method for extracting the differentially expressed genes, specifically the up-regulated ones, that can be considered candidate biomarkers in high throughput microarrays of HCC. We examine the power of several gene selection methods (such as Pearson’s correlation coefficient, Cosine coefficient, Euclidean distance, Mutual information and Entropy with different estimators) in selecting informative genes. A biological interpretation of the highly ranked genes is done using KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, ENTREZ and DAVID (Database for Annotation, Visualization, and Integrated Discovery) databases. The top ten genes selected using Pearson’s correlation coefficient and Cosine coefficient contained six genes that have been implicated in cancer (often multiple cancers) genesis in previous studies. Fewer genes were obtained by the other methods (4 genes using Mutual information, 3 genes using Euclidean distance and only one gene using Entropy). A better result was obtained by the utilization of a hybrid approach based on intersecting the highly ranked genes in the output of all investigated methods. This hybrid combination yielded seven genes (2 genes for HCC and 5 genes in different types of cancer) in the top ten genes of the list of intersected genes. Conclusions To strengthen the effectiveness of the univariate selection methods, we propose a hybrid approach by intersecting several of these methods in a cascaded manner. 
This approach surpasses all of univariate selection methods when used individually according to biological interpretation and the examination of gene expression signal profiles. PMID:22867264
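The cascaded intersection described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`rank_genes`, `hybrid_select`), the toy expression values, and the choice to score each gene against a binary normal/tumor label vector are all assumptions made for illustration, and only three of the listed measures are shown.

```python
import math

def pearson(x, y):
    """Pearson correlation between an expression profile and the class labels."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cosine(x, y):
    """Cosine coefficient between an expression profile and the class labels."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0

def neg_euclidean(x, y):
    """Negated Euclidean distance, so that larger scores mean a closer match."""
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rank_genes(expr, labels, score, top_k):
    """Return the top_k gene names under one univariate score."""
    ranked = sorted(expr, key=lambda g: score(expr[g], labels), reverse=True)
    return ranked[:top_k]

def hybrid_select(expr, labels, scores, top_k):
    """Intersect the top_k lists of every scoring method, one after another."""
    selected = None
    for score in scores:
        top = set(rank_genes(expr, labels, score, top_k))
        selected = top if selected is None else selected & top
    return sorted(selected)

# Toy data (invented): 3 normal samples (label 0) followed by 3 tumor samples (label 1).
LABELS = [0, 0, 0, 1, 1, 1]
EXPR = {
    "G1": [0.1, 0.2, 0.1, 0.9, 1.0, 0.8],  # up-regulated in tumor
    "G2": [0.0, 0.1, 0.0, 1.0, 0.9, 1.0],  # up-regulated in tumor
    "G3": [0.9, 1.0, 0.8, 0.1, 0.2, 0.1],  # down-regulated in tumor
    "G4": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # flat
}

CANDIDATES = hybrid_select(EXPR, LABELS, [pearson, cosine, neg_euclidean], top_k=2)
```

Intersecting the lists keeps only genes that every measure ranks highly, which is why the hybrid list can be shorter than any single method's top-k list.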
Transcriptome sequencing of Eucalyptus camaldulensis seedlings subjected to water stress reveals functional single nucleotide polymorphisms and genes under selection

PubMed Central

2012-01-01

Background Water stress limits plant survival and production in many parts of the world. Identification of genes and alleles responding to water stress conditions is important in breeding plants better adapted to drought. Currently there are no studies examining the transcriptome wide gene and allelic expression patterns under water stress conditions. We used RNA sequencing (RNA-seq) to identify the candidate genes and alleles and to explore the evolutionary signatures of selection. Results We studied the effect of water stress on gene expression in Eucalyptus camaldulensis seedlings derived from three natural populations. We used reference-guided transcriptome mapping to study gene expression. Several genes showed differential expression between control and stress conditions. Gene ontology (GO) enrichment tests revealed up-regulation of 140 stress-related gene categories and down-regulation of 35 metabolic and cell wall organisation gene categories. More than 190,000 single nucleotide polymorphisms (SNPs) were detected and 2737 of these showed differential allelic expression. Allelic expression of 52% of these variants was correlated with differential gene expression. Signatures of selection patterns were studied by estimating the proportion of nonsynonymous to synonymous substitution rates (Ka/Ks). The average Ka/Ks ratio among the 13,719 genes was 0.39 indicating that most of the genes are under purifying selection. Among the positively selected genes (Ka/Ks > 1.5) apoptosis and cell death categories were enriched. Of the 287 positively selected genes, ninety genes showed differential expression and 27 SNPs from 17 positively selected genes showed differential allelic expression between treatments. Conclusions Correlation of allelic expression of several SNPs with total gene expression indicates that these variants may be the cis-acting variants or in linkage disequilibrium with such variants. 
Enrichment of apoptosis and cell death gene categories among the positively selected genes reveals the past selection pressures experienced by the populations used in this study. PMID:22853646
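As a rough illustration of how the Ka/Ks thresholds in this abstract partition genes, the sketch below labels a gene from its nonsynonymous (Ka) and synonymous (Ks) substitution rates. The function name `classify_selection` is hypothetical; the 1.5 cutoff for positive selection comes from the abstract, while treating ratios below 1 as purifying selection and leaving the range from 1 to 1.5 unclassified are assumptions of this sketch rather than the study's stated procedure.

```python
def classify_selection(ka, ks, positive_cutoff=1.5):
    """Label a gene by its Ka/Ks ratio.

    ka: nonsynonymous substitution rate
    ks: synonymous substitution rate
    positive_cutoff: ratio above which the gene is called positively selected
                     (1.5 is the cutoff quoted in the abstract)
    """
    if ks == 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if ratio > positive_cutoff:
        return "positive"
    if ratio < 1:
        # Assumption: ratios below 1 are read as purifying selection.
        return "purifying"
    # Assumption: ratios between 1 and the cutoff are left unclassified.
    return "unclassified"
```

For example, a gene with Ka = 0.039 and Ks = 0.1 has a ratio close to the reported average of 0.39 and would be labelled as under purifying selection.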
Genotype-phenotype relationships in human red/green color-vision defects: Molecular and psychophysical studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deeb, S.S.; Motulsky, A.G.; Lindsey, D.T.

1992-10-01

The relationship between the molecular structure of the X-linked red and green visual pigment genes and color-vision phenotype as ascertained by anomaloscopy was studied in 64 color-defective males. The great majority of red-green defects were associated with either the deletion of the green-pigment gene or the formation of 5′ red-green hybrid genes or 5′ green-red hybrid genes. A rapid PCR-based method allowed detection of hybrid genes, including those undetectable by Southern blot analysis, as well as more precise localization of the fusion points in hybrid genes. Protan color-vision defects appeared always associated with 5′ red-green hybrid genes. Carriers of single red-green hybrid genes with fusion in introns 1-4 were protanopes. However, carriers of hybrid genes with red-green fusions in introns 2, 3, or 4 in the presence of additional normal green genes manifested as either protanopes or protanomalous trichromats, with the majority being protanomalous. Deutan defects were associated with green-pigment gene deletions, with 5′ green-red hybrid genes, or, rarely, with 5′ green-red-green hybrid genes. Complete green-pigment gene deletions or green-red fusions in intron 1 were usually associated with deuteranopia, although the authors unexpectedly found three carriers of a single red-pigment gene without any green-pigment genes to be deuteranomalous trichromats. All but one of the other deuteranomalous subjects had green-red hybrid genes with intron 1, 2, 3, or 4 fusions, as well as several normal green-pigment genes. The one exception had a grossly normal gene array, presumably with a more subtle mutation. Amino acid differences in exon 5 largely determine whether a hybrid gene will be more redlike or more greenlike in phenotype. Various discrepancies as to severity (dichromacy or trichromacy) remain unexplained but may arise because of variability of expression, postreceptoral variation, or both.
GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees pipeline.

PubMed

Thanki, Anil S; Soranzo, Nicola; Haerty, Wilfried; Davey, Robert P

2018-03-01

Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of syntenic regions evolution at the family level. Unfortunately, none of them provide information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within the Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.
Alteration of the gene expression profile of T-cell receptor αβ-modified T-cells with diffuse large B-cell lymphoma specificity.

PubMed

Zha, Xianfeng; Yin, Qingsong; Tan, Huo; Wang, Chunyan; Chen, Shaohua; Yang, Lijian; Li, Bo; Wu, Xiuli; Li, Yangqiu

2013-05-01

Antigen-specific, T-cell receptor (TCR)-modified cytotoxic T lymphocytes (CTLs) that target tumors are an attractive strategy for specific adoptive immunotherapy. Little is known about whether there are any alterations in the gene expression profile after TCR gene transduction in T cells. We constructed TCR gene-redirected CTLs with specificity for diffuse large B-cell lymphoma (DLBCL)-associated antigens to elucidate the gene expression profiles of TCR gene-redirected T-cells, and we further analyzed the gene expression profile pattern of these redirected T-cells by Affymetrix microarrays. The resulting data were analyzed using Bioconductor software; a two-fold cut-off for expression change was applied together with anti-correlation of the profile ratios to render the microarray analysis set. The fold change of all genes was calculated by comparing the three TCR gene-modified T-cells and a negative control counterpart. The gene pathways were analyzed using Bioconductor and Kyoto Encyclopedia of Genes and Genomes. Identical genes whose fold change was greater than or equal to 2.0 in all three TCR gene-redirected T-cell groups in comparison with the negative control were identified as the differentially expressed genes. The differentially expressed genes comprised 33 up-regulated genes and 1 down-regulated gene, including JUNB, FOS, TNF, IFN-γ, DUSP2, IL-1B, CXCL1, CXCL2, CXCL9, CCL2, CCL4, and CCL8. These genes are mainly involved in the TCR signaling, mitogen-activated protein kinase signaling, and cytokine-cytokine receptor interaction pathways. In conclusion, we characterized the gene expression profile of DLBCL-specific TCR gene-redirected T-cells. The changes corresponded to an up-regulation in the differentiation and proliferation of the T-cells. These data may help to explain some of the characteristics of the redirected T-cells.
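The selection rule in this abstract (a gene counts as differentially expressed only if its fold change against the negative control reaches 2.0 in all three TCR-redirected groups) can be sketched in a few lines of Python. This is only an illustration of the rule, not the Bioconductor workflow the authors ran; the expression values, the placeholder gene `GENE_X`, and the function name `consistent_changes` are all hypothetical.

```python
# Hypothetical linear-scale expression values; NOT data from the study.
control = {"JUNB": 100.0, "FOS": 80.0, "TNF": 50.0, "ACTB": 1000.0, "GENE_X": 200.0}
tcr_groups = [
    {"JUNB": 250.0, "FOS": 200.0, "TNF": 90.0, "ACTB": 1050.0, "GENE_X": 90.0},
    {"JUNB": 300.0, "FOS": 170.0, "TNF": 120.0, "ACTB": 980.0, "GENE_X": 80.0},
    {"JUNB": 220.0, "FOS": 190.0, "TNF": 110.0, "ACTB": 1010.0, "GENE_X": 95.0},
]


def consistent_changes(control, groups, cutoff=2.0):
    """Return (up, down) gene lists that pass the cutoff in every group."""
    up, down = [], []
    for gene, baseline in control.items():
        ratios = [group[gene] / baseline for group in groups]
        if all(r >= cutoff for r in ratios):
            up.append(gene)          # at least `cutoff`-fold higher in all groups
        elif all(r <= 1.0 / cutoff for r in ratios):
            down.append(gene)        # at least `cutoff`-fold lower in all groups
    return sorted(up), sorted(down)


up, down = consistent_changes(control, tcr_groups)
print("up-regulated:", up)
print("down-regulated:", down)
```

In this toy input TNF is excluded because one of its three ratios (1.8) falls below the cut-off, which is the effect of requiring agreement across all groups.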
The WRKY Transcription Factor Genes in Lotus japonicus.

PubMed

Song, Hui; Wang, Pengfei; Nan, Zhibiao; Wang, Xingjun

2014-01-01

WRKY transcription factor genes play critical roles in plant growth and development, as well as stress responses. WRKY genes have been examined in various higher plants, but they have not been characterized in Lotus japonicus. The recent release of the L. japonicus whole genome sequence provides an opportunity for a genome-wide analysis of WRKY genes in this species. In this study, we identified 61 WRKY genes in the L. japonicus genome. Based on the WRKY protein structure, L. japonicus WRKY (LjWRKY) genes can be classified into three groups (I-III). Investigations of gene copy number and gene clusters indicate that only one gene duplication event occurred on chromosome 4 and no clustered genes were detected on chromosomes 3 or 6. Researchers previously believed that group II and III WRKY domains were derived from the C-terminal WRKY domain of group I. Our results suggest that some WRKY genes in group II originated from the N-terminal domain of group I WRKY genes. Additional evidence to support this hypothesis was obtained by Medicago truncatula WRKY (MtWRKY) protein motif analysis. We found that LjWRKY and MtWRKY group III genes are under purifying selection, suggesting that WRKY genes will become increasingly structured and functionally conserved.
Differential retention of metabolic genes following whole-genome duplication.

PubMed

Gout, Jean-François; Duret, Laurent; Kahn, Daniel

2009-05-01

Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.
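The "concave relationship" invoked at the end of this abstract can be made concrete with a toy model. The sketch below assumes a simple hyperbolic saturation curve, flux = j_max * E / (k + E); that functional form, the parameter values, and the function names are illustrative assumptions, not taken from the paper. Under this assumption, doubling the enzyme level yields a smaller relative flux gain the higher the starting level already is.

```python
def flux(enzyme, j_max=1.0, k=1.0):
    """Assumed hyperbolic (concave) flux response to enzyme level."""
    return j_max * enzyme / (k + enzyme)


def doubling_gain(enzyme, j_max=1.0, k=1.0):
    """Relative flux increase obtained by doubling the enzyme level."""
    return flux(2 * enzyme, j_max, k) / flux(enzyme, j_max, k)


# Compare the benefit of a duplication at low versus high starting dosage.
for level in (1.0, 4.0):
    print(f"enzyme level {level}: flux x{doubling_gain(level):.3f} after doubling")
```

With these placeholder parameters the gain shrinks from about 1.33-fold to about 1.11-fold, which matches the direction of the argument that duplicates add less when copy number is already high.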
Identification of Homeotic Target Genes in Drosophila Melanogaster Including Nervy, a Proto-Oncogene Homologue

PubMed Central

Feinstein, P. G.; Kornfeld, K.; Hogness, D. S.; Mann, R. S.

1995-01-01

In Drosophila, the specific morphological characteristics of each segment are determined by the homeotic genes that regulate the expression of downstream target genes. We used a subtractive hybridization procedure to isolate activated target genes of the homeotic gene Ultrabithorax (Ubx). In addition, we constructed a set of mutant genotypes that measures the regulatory contribution of individual homeotic genes to a complex target gene expression pattern. Using these mutants, we demonstrate that homeotic genes can regulate target gene expression at the start of gastrulation, suggesting a previously unknown role for the homeotic genes at this early stage. We also show that, in abdominal segments, the levels of expression for two target genes increase in response to high levels of Ubx, demonstrating that the normal down-regulation of Ubx in these segments is functional. Finally, the DNA sequence of cDNAs for one of these genes predicts a protein that is similar to a human proto-oncogene involved in acute myeloid leukemias. These results illustrate potentially general rules about the homeotic control of target gene expression and suggest that subtractive hybridization can be used to isolate interesting homeotic target genes. PMID:7498738
The 'warrior gene' and the Māori people: the responsibility of the geneticists.

PubMed

Perbal, Laurence

2013-09-01

The 'gene of' is a teleosemantic expression that conveys a simplistic and linear relationship between a gene and a phenotype. Throughout the 20th century, geneticists studied these genes of traits. The studies were often polemical when they concerned human traits: the 'crime gene', 'poverty gene', 'IQ gene', 'gay gene' or 'gene of alcoholism'. Quite recently, a controversy occurred in 2006 in New Zealand that started with the claim that a 'warrior gene' exists in the Māori community. This claim came from a geneticist working on the MAOA gene. This article is interested in the responsibility of that researcher regarding the origin of the controversy. Several errors were made: overestimation of results, abusive use of the 'gene of' kind of expression, poor communication with the media and a lack of scientific culture. The issues of the debate were not taken into account sufficiently, either from the political, social, ethical or even the genetic points of view. After more than 100 years of debates around 'genes of' all kinds (here, the 'warrior gene'), geneticists may not hide themselves behind the media when a controversy occurs. Responsibilities have to be assumed. © 2012 John Wiley & Sons Ltd.
Degrees of separation as a statistical tool for evaluating candidate genes.

PubMed

Nelson, Ronald M; Pettersson, Mats E

2014-12-01

Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, it can cause misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes in currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
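The core quantity here, the average degree of separation between gene pairs, is the mean shortest-path length in the network. The sketch below computes it with a breadth-first search on a toy undirected graph. It does not use or reproduce the CandidateBacon package; the network, the node names, and both function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the edge count or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def average_dos(graph):
    """Mean shortest-path length over all connected pairs of nodes."""
    lengths = [shortest_path_length(graph, a, b)
               for a, b in combinations(sorted(graph), 2)]
    reachable = [d for d in lengths if d is not None]
    return sum(reachable) / len(reachable) if reachable else None


# Toy undirected network (adjacency sets); gene names are placeholders.
network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

baseline = average_dos(network)
candidate = shortest_path_length(network, "A", "B")
print(f"network average DoS: {baseline:.3f}; candidate pair A-B: {candidate}")
```

A candidate pair is only informative if its separation is compared against a baseline like this one, which is the point the abstract makes about network-specific artefacts.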
Coexpression network based on natural variation in human gene expression reveals gene interactions and functions

PubMed Central

Nayak, Renuka R.; Kearns, Michael; Spielman, Richard S.; Cheung, Vivian G.

2009-01-01

Genes interact in networks to orchestrate cellular processes. Analysis of these networks provides insights into gene interactions and functions. Here, we took advantage of normal variation in human gene expression to infer gene networks, which we constructed using correlations in expression levels of more than 8.5 million gene pairs in immortalized B cells from three independent samples. The resulting networks allowed us to identify biological processes and gene functions. Among the biological pathways, we found processes such as translation and glycolysis that co-occur in the same subnetworks. We predicted the functions of poorly characterized genes, including CHCHD2 and TMEM111, and provided experimental evidence that TMEM111 is part of the endoplasmic reticulum-associated secretory pathway. We also found that IFIH1, a susceptibility gene of type 1 diabetes, interacts with YES1, which plays a role in glucose transport. Furthermore, genes that predispose to the same diseases are clustered nonrandomly in the coexpression network, suggesting that networks can provide candidate genes that influence disease susceptibility. Therefore, our analysis of gene coexpression networks offers information on the role of human genes in normal and disease processes. PMID:19797678
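Building a network from expression correlations, as described above, amounts to computing a correlation for every gene pair and keeping the pairs that exceed a threshold. The following sketch uses Pearson correlation on made-up expression vectors; the 0.9 threshold, the gene labels, and the function names are assumptions for illustration and are not the authors' parameters.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(expr, threshold=0.9):
    """Keep gene pairs whose absolute correlation reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical expression levels across five individuals.
expression = {
    "G1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "G2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "G3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
edges = coexpression_edges(expression)
print(edges)
```

Because n genes give n(n-1)/2 pairs, the pair count grows quadratically, which is why a study of this kind ends up evaluating millions of correlations.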
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants.

PubMed

Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

2014-11-01

MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11-14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14-16 Type II MADS-box genes. The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. 
This study is the first that provides a comprehensive overview of MADS-box genes in conifers and thus will provide a framework for future work on MADS-box genes in seed plants. © The Author 2014. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Gene-for-genes interactions between cotton R genes and Xanthomonas campestris pv. malvacearum avr genes.

PubMed

De Feyter, R; Yang, Y; Gabriel, D W

1993-01-01

Six plasmid-borne avirulence (avr) genes were previously cloned from strain XcmH of the cotton pathogen, Xanthomonas campestris pv. malvacearum. We have now localized all six avr genes on the cloned fragments by subcloning and Tn5-gusA insertional mutagenesis. None of these avr genes appeared to exhibit exclusively gene-for-gene patterns of interactions with cotton R genes, and avrB4 was demonstrated to confer avr gene-for-R genes (plural) avirulence to X. c. pv. malvacearum on congenic cotton lines carrying either of two different resistance loci, B1 or B4. Furthermore, the B1 locus appeared to confer R gene-for-avr genes resistance to cotton against isogenic X. c. pv. malvacearum strains carrying any one of three avr genes: avrB4, avrb6, or avrB102. Restriction enzyme, Southern blot hybridization, and DNA sequence analyses showed that the XcmH avr genes are all highly similar to each other, to avrBs3 and avrBsP from the pepper pathogen X. c. pv. vesicatoria, and to the host-specific virulence gene pthA from the citrus pathogen X. citri. The XcmH avr genes differed primarily in the multiplicity of a tandemly repeated 102-base pair motif within the central portions of the genes, repeated from 14 to 23 times in members of this gene family. The complete nucleotide sequence of avrb6 revealed that it is 97% identical in DNA sequence to avrB4, avrBs3, avrBsP, and pthA and that 62-bp inverted terminal repeats mark the boundaries of homology between avrb6 and all members of this Xanthomonas virulence/avirulence gene family sequenced to date. The terminal 38 bp of both inverted repeats are highly similar to the 38-bp consensus terminal sequence of the Tn3 family of transposons. Up to 11 members of the avr gene family appear to be present in North American strains of X. c. pv. malvacearum, including XcmH. 
The high level of homology observed among these avr genes and their presence in multiple copies may explain the gene-for-genes interactions and also the observed high frequencies (10^-3 to 10^-4 per locus) of X. c. pv. malvacearum race change mutations. Five spontaneous race change mutants of XcmH suffered avr locus deletions, strongly indicating intergenic recombination as the primary mechanism for generating new races in X. c. pv. malvacearum.
Microarray-based gene expression profiling to elucidate effectiveness of fermented Codonopsis lanceolata in mice.

PubMed

Choi, Woon Yong; Kim, Ji Seon; Park, Sung Jin; Ma, Choong Je; Lee, Hyeon Yong

2014-04-08

In this study, the effect of Codonopsis lanceolata fermented by lactic acid on controlling gene expression levels related to obesity was observed in an oligonucleotide chip microarray. Among 8170 genes, 393 genes were upregulated and 760 genes were downregulated on feeding the fermented C. lanceolata (FCL). Another 374 genes were upregulated and 527 genes downregulated without feeding the sample. The genes were not affected by the FCL sample. It was interesting that among those genes, Cytochrome P450, Dmbt1, LOC76487, and thyroid hormones, etc., were mostly up- or downregulated. These genes are more related to lipid synthesis. We could conclude that the FCL possibly controlled the gene expression levels related to lipid synthesis, which resulted in reducing obesity. However, more detailed protein expression experiments should be carried out.
Genes Downregulated in Endometriosis Are Located Near the Known Imprinting Genes

PubMed Central

Higashiura, Yumi; Koike, Natsuki; Akasaka, Juria; Uekuri, Chiharu; Iwai, Kana; Niiro, Emiko; Morioka, Sachiko; Yamada, Yuki

2014-01-01

There is now accumulating evidence that endometriosis is a disease associated with an epigenetic disorder. Genomic imprinting is an epigenetic phenomenon known to regulate DNA methylation of either maternal or paternal alleles. We hypothesize that hypermethylated endometriosis-associated genes may be enriched at imprinted gene loci. We sought to determine whether downregulated genes associated with endometriosis susceptibility are associated with chromosomal location of the known paternally and maternally expressed imprinting genes. Gene information has been gathered from National Center for Biotechnology Information database geneimprint.com. Several researchers have identified specific loci with strong DNA methylation in eutopic endometrium and ectopic lesion with endometriosis. Of the 29 hypermethylated genes in endometriosis, 19 genes were located near 45 known imprinted foci. There may be an association of the genomic location between genes specifically downregulated in endometriosis and epigenetically imprinted genes. PMID:24615936
Construction of a Bacterial Cell that Contains Only the Set of Essential Genes Necessary to Impart Life

DTIC Science & Technology

2014-05-16

native uncharacterized genes for characterized genes from Bacillus subtilis, that is presented in a constitutive expression module. If the B... subtilis gene containing M. mycoides mutant is viable then the function of the conserved hypothetical gene is the same as the input B. subtilis gene...Characterized genes from B. subtilis were swapped with similar, but not so similar as to be clearly the same, essential genes from M. mycoides. The B. subtilis

Gene for ataxia-telangiectasia complementation group D (ATDC)

DOEpatents

Murnane, John P.; Painter, Robert B.; Kapp, Leon N.; Yu, Loh-Chung

1995-03-07

Disclosed herein is a new gene, an AT gene for complementation group D, the ATDC gene and fragments thereof. Nucleic acid probes for said gene are provided as well as proteins encoded by said gene, cDNA therefrom, preferably a 3 kilobase (kb) cDNA, and recombinant nucleic acid molecules for expression of said proteins. Further disclosed are methods to detect mutations in said gene, preferably methods employing the polymerase chain reaction (PCR). Also disclosed are methods to detect AT genes from other AT complementation groups.
What is a gene? From molecules to metaphysics.

PubMed

Rolston, Holmes

2006-01-01

Mendelian genes have become molecular genes, with increasing puzzlement about locating them, due to increasing complexity in genomic webworks. Genome science finds modular and conserved units of inheritance, identified as homologous genes. Such genes are cybernetic, transmitting information over generations; this too requires multi-leveled analysis, from DNA transcription to development and reproduction of the whole organism. Genes are conserved; genes are also dynamic and creative in evolutionary speciation, most remarkably producing humans capable of wondering about what genes are.
Analysis of bHLH coding genes using gene co-expression network approach.

PubMed

Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

2016-07-01

Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were considered to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. Seed genes were found to vary across different time periods of stress. These analyzed seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
Problems associated with gene transfer and opportunities for microgravity environments

NASA Astrophysics Data System (ADS)

Tennessen, Daniel J.

1997-01-01

The method of crop improvement by gene transfer is becoming increasingly routine with transgenic foods and ornamental crops now being marketed to consumers. However, biological processes of plants, and the physical barriers of current protocols continue to limit the application of gene transfer in many commercial crops. The goal of this paper is to outline the current limitations of gene transfer and to hypothesize possible opportunities for use of microgravity to overcome such limitations. The limitations detailed in this paper include host-range specificity of Agrobacterium mediated transformation, probability of gene insertion, position effects of the inserted genes, gene copy number, stability of foreign gene expression in host plants, and regeneration of recalcitrant plant species. Microgravity offers an opportunity for gene transfer where cell growth kinetics, DNA synthesis, and genetic recombination rates can be altered. Such biological conditions may enhance the ability for recombination of reporter genes and other genes of interest to agriculture. Proposed studies would be useful for understanding instability of foreign gene expression and may lead to stable transformed plants. Other aspects of gene transfer in microgravity are discussed.
A Luciferase Reporter Gene System for High-Throughput Screening of γ-Globin Gene Activators.

PubMed

Xie, Wensheng; Silvers, Robert; Ouellette, Michael; Wu, Zining; Lu, Quinn; Li, Hu; Gallagher, Kathleen; Johnson, Kathy; Montoute, Monica

2016-01-01

Luciferase reporter gene assays have long been used for drug discovery due to their high sensitivity and robust signal. A dual reporter gene system contains a gene of interest and a control gene to monitor non-specific effects on gene expression. In our dual luciferase reporter gene system, a synthetic promoter of γ-globin gene was constructed immediately upstream of the firefly luciferase gene, followed downstream by a synthetic β-globin gene promoter in front of the Renilla luciferase gene. A stable cell line with the dual reporter gene was cloned and used for all assay development and HTS work. Due to the low activity of the control Renilla luciferase, only the firefly luciferase activity was further optimized for HTS. Several critical factors, such as cell density, serum concentration, and miniaturization, were optimized using tool compounds to achieve maximum robustness and sensitivity. Using the optimized reporter assay, the HTS campaign was successfully completed and approximately 1000 hits were identified. In this chapter, we also describe strategies to triage hits that non-specifically interfere with firefly luciferase.
New genes from old: asymmetric divergence of gene duplicates and the evolution of development.

PubMed

Holland, Peter W H; Marlétaz, Ferdinand; Maeso, Ignacio; Dunwell, Thomas L; Paps, Jordi

2017-02-05

Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).
Bacterial evolution through the selective loss of beneficial Genes. Trade-offs in expression involving two loci.

PubMed Central

Zinser, Erik R; Schneider, Dominique; Blot, Michel; Kolter, Roberto

2003-01-01

The loss of preexisting genes or gene activities during evolution is a major mechanism of ecological specialization. Evolutionary processes that can account for gene loss or inactivation have so far been restricted to one of two mechanisms: direct selection for the loss of gene activities that are disadvantageous under the conditions of selection (i.e., antagonistic pleiotropy) and selection-independent genetic drift of neutral (or nearly neutral) mutations (i.e., mutation accumulation). In this study we demonstrate with an evolved strain of Escherichia coli that a third, distinct mechanism exists by which gene activities can be lost. This selection-dependent mechanism involves the expropriation of one gene's upstream regulatory element by a second gene via a homologous recombination event. Resulting from this genetic exchange is the activation of the second gene and a concomitant inactivation of the first gene. This gene-for-gene expression tradeoff provides a net fitness gain, even if the forfeited activity of the first gene can play a positive role in fitness under the conditions of selection. PMID:12930738
Bacterial reference genes for gene expression studies by RT-qPCR: survey and analysis.

PubMed

Rocha, Danilo J P; Santos, Carolina S; Pacheco, Luis G C

2015-09-01

The appropriate choice of reference genes is essential for accurate normalization of gene expression data obtained by the method of reverse transcription quantitative real-time PCR (RT-qPCR). In 2009, a guideline called the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) highlighted the importance of the selection and validation of more than one suitable reference gene for obtaining reliable RT-qPCR results. Herein, we searched the recent literature in order to identify the bacterial reference genes that have been most commonly validated in gene expression studies by RT-qPCR (in the first 5 years following publication of the MIQE guidelines). Through a combination of different search parameters with the text mining tool MedlineRanker, we identified 145 unique bacterial genes that were recently tested as candidate reference genes. Of these, 45 genes were experimentally validated and, in most of the cases, their expression stabilities were verified using the software tools geNorm and NormFinder. It is noteworthy that only 10 of these reference genes had been validated in two or more of the studies evaluated. An enrichment analysis using Gene Ontology classifications demonstrated that genes belonging to the functional categories of DNA Replication (GO: 0006260) and Transcription (GO: 0006351) rendered a proportionally higher number of validated reference genes. Three genes in the former functional class were also among the top five most stable genes identified through an analysis of gene expression data obtained from the Pathosystems Resource Integration Center. These results may provide a guideline for the initial selection of candidate reference genes for RT-qPCR studies in several different bacterial species.
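The expression-stability check mentioned above can be illustrated with the pairwise-variation idea behind geNorm's M value: for each candidate gene, take the standard deviation of its log2 expression ratio against every other candidate across samples, then average those values. A lower M suggests more stable expression. The sketch below is a simplified reading of that measure, not the geNorm software itself; the relative quantities and the gene labels `refA`, `refB`, `refC` are hypothetical.

```python
from math import log2
from statistics import stdev


def stability_m(expr):
    """Average pairwise variation per gene (lower suggests more stable)."""
    genes = sorted(expr)
    m_values = {}
    for gene_j in genes:
        variations = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # Log2 ratio of the two genes in each sample, then its spread.
            log_ratios = [log2(a / b) for a, b in zip(expr[gene_j], expr[gene_k])]
            variations.append(stdev(log_ratios))
        m_values[gene_j] = sum(variations) / len(variations)
    return m_values


# Hypothetical relative quantities for three candidates in four samples.
quantities = {
    "refA": [1.0, 2.0, 1.0, 2.0],
    "refB": [2.0, 4.0, 2.0, 4.0],   # constant ratio to refA
    "refC": [1.0, 1.0, 4.0, 4.0],   # drifts relative to the other two
}
m = stability_m(quantities)
for gene, value in sorted(m.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {value:.3f}")
```

In this toy data refA and refB track each other exactly and receive the same low M, while refC ranks last; with real data, several reference genes would normally be validated together, in line with the MIQE recommendation cited in the abstract.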
Recombinant Rp1 genes confer necrotic or nonspecific resistance phenotypes.

PubMed

Smith, Shavannor M; Steinau, Martin; Trick, Harold N; Hulbert, Scot H

2010-06-01

Genes at the Rp1 rust resistance locus of maize confer race-specific resistance to the common rust fungus Puccinia sorghi. Three variant genes with nonspecific effects (HRp1 -Kr1N, -D*21 and -MD*19) were found to be generated by intragenic crossing over within the LRR region. The LRR region of most NBS-LRR encoding genes is quite variable and codes for one of the regions in resistance gene proteins that controls specificity. Sequence comparisons demonstrated that the Rp1-Kr1N recombinant gene was identical to the N-terminus of the rp1-kp2 gene and C-terminus of another gene from its HRp1-K grandparent. The Rp1-D*21 recombinant gene consists of the N-terminus of the rp1-dp2 gene and C-terminus of the Rp1-D gene from the parental haplotype. Similarly, a recombinant gene from the Rp1-MD*19 haplotype has the N-terminus of an rp1 gene from the HRp1-M parent and C-terminus of the rp1-D19 gene from the HRp1-D parent. The recombinant Rp1 -Kr1N, -D*21 and -MD*19 genes activated defense responses in the absence of their AVR proteins triggering HR (hypersensitive response) in the absence of the pathogen. The results indicate that the frequent intragenic recombination events that occur in the Rp1 gene cluster not only recombine the genes into novel haplotypes, but also create genes with nonspecific effects. Some of these may contribute to nonspecific quantitative resistance but others have severe consequences for the fitness of the plant.
Regulatory and evolutionary signatures of sex-biased genes on both the X chromosome and the autosomes.

PubMed

Shen, Jiangshan J; Wang, Ting-You; Yang, Wanling

2017-11-02

Sex is an important but understudied factor in the genetics of human diseases. Analyses using a combination of gene expression data, ENCODE data, and evolutionary data of sex-biased gene expression in human tissues can give insight into the regulatory and evolutionary forces acting on sex-biased genes. In this study, we analyzed the differentially expressed genes between males and females. On the X chromosome, we used a novel method and investigated the status of genes that escape X-chromosome inactivation (escape genes), taking into account the clonality of lymphoblastoid cell lines (LCLs). To investigate the regulation of sex-biased differentially expressed genes (sDEG), we conducted pathway and transcription factor enrichment analyses on the sDEGs, as well as analyses on the genomic distribution of sDEGs. Evolutionary analyses were also conducted on both sDEGs and escape genes. Genome-wide, we characterized differential gene expression between sexes in 462 RNA-seq samples and identified 587 sex-biased genes, or 3.2% of the genes surveyed. On the X chromosome, sDEGs were distributed in evolutionary strata in a similar pattern as escape genes. We found a trend of negative correlation between the gene expression breadth and nonsynonymous over synonymous mutation (dN/dS) ratios, showing a possible pleiotropic constraint on evolution of genes. Genome-wide, nine transcription factors were found enriched in binding to the regions surrounding the transcription start sites of female-biased genes. Many pathways and protein domains were enriched in sex-biased genes, some of which hint at sex-biased physiological processes. These findings lend insight into the regulatory and evolutionary forces shaping sex-biased gene expression and their involvement in the physiological and pathological processes in human health and diseases.
Digital gene expression analysis of the zebra finch genome

PubMed Central

2010-01-01

Background: In order to understand patterns of adaptation and molecular evolution it is important to quantify both variation in gene expression and nucleotide sequence divergence. Gene expression profiling in non-model organisms has recently been facilitated by the advent of massively parallel sequencing technology. Here we investigate tissue specific gene expression patterns in the zebra finch (Taeniopygia guttata) with special emphasis on the genes of the major histocompatibility complex (MHC). Results: Almost 2 million 454-sequencing reads from cDNA of six different tissues were assembled and analysed. A total of 11,793 zebra finch transcripts were represented in this EST data, indicating a transcriptome coverage of about 65%. There was a positive correlation between the tissue specificity of gene expression and non-synonymous to synonymous nucleotide substitution ratio of genes, suggesting that genes with a specialised function are evolving at a higher rate (or with less constraint) than genes with a more general function. In line with this, there was also a negative correlation between overall expression levels and expression specificity of contigs. We found evidence for expression of 10 different genes related to the MHC. MHC genes showed relatively tissue specific expression levels and were in general primarily expressed in spleen. Several MHC genes, including MHC class I also showed expression in brain. Furthermore, for all genes with highest levels of expression in spleen there was an overrepresentation of several gene ontology terms related to immune function. Conclusions: Our study highlights the usefulness of next-generation sequence data for quantifying gene expression in the genome as a whole as well as in specific candidate genes. Overall, the data show predicted patterns of gene expression profiles and molecular evolution in the zebra finch genome. Expression of MHC genes in particular, corresponds well with expression patterns in other vertebrates. PMID:20359325
Brain region-specific gene expression changes after chronic intermittent ethanol exposure and early withdrawal in C57BL/6J mice

PubMed Central

Melendez, Roberto I.; McGinty, Jacqueline F.; Kalivas, Peter W.; Becker, Howard C.

2014-01-01

Neuroadaptations that participate in the ontogeny of alcohol dependence are likely a result of altered gene expression in various brain regions. The present study investigated brain region-specific changes in the pattern and magnitude of gene expression immediately following chronic intermittent ethanol (CIE) exposure and 8 hours following final ethanol exposure [i.e. early withdrawal (EWD)]. High-density oligonucleotide microarrays (Affymetrix 430A 2.0, Affymetrix, Santa Clara, CA, USA) and bioinformatics analysis were used to characterize gene expression and function in the prefrontal cortex (PFC), hippocampus (HPC) and nucleus accumbens (NAc) of C57BL/6J mice (Jackson Laboratories, Bar Harbor, ME, USA). Gene expression levels were determined using gene chip robust multi-array average followed by statistical analysis of microarrays and validated by quantitative real-time reverse transcription polymerase chain reaction and Western blot analysis. Results indicated that immediately following CIE exposure, changes in gene expression were strikingly greater in the PFC (284 genes) compared with the HPC (16 genes) and NAc (32 genes). Bioinformatics analysis revealed that most of the transcriptionally responsive genes in the PFC were involved in Ras/MAPK signaling, notch signaling or ubiquitination. In contrast, during EWD, changes in gene expression were greatest in the HPC (139 genes) compared with the PFC (four genes) and NAc (eight genes). The most transcriptionally responsive genes in the HPC were involved in mRNA processing or actin dynamics. Of the few genes detected in the NAc, the most representatives were involved in circadian rhythms. Overall, these findings indicate that brain region-specific and time-dependent neuroadaptive alterations in gene expression play an integral role in the development of alcohol dependence and withdrawal. PMID:21812870
Identification and Characterization of the MADS-Box Genes and Their Contribution to Flower Organ in Carnation (Dianthus caryophyllus L.)

PubMed Central

Zhang, Xiaoni; Wang, Qijian; Yang, Shaozong; Lin, Shengnan; Bao, Manzhu; Wu, Quanshu; Wang, Caiyun; Fu, Xiaopeng

2018-01-01

Dianthus is a large genus containing many species with high ornamental economic value. Extensive breeding strategies permitted an exploration of an improvement in the quality of cultivated carnation, particularly in flowers. However, little is known on the molecular mechanisms of flower development in carnation. Here, we report the identification and description of MADS-box genes in carnation (DcaMADS) with a focus on those involved in flower development and organ identity determination. In this study, 39 MADS-box genes were identified from the carnation genome and transcriptome by the phylogenetic analysis. These genes were categorized into four subgroups (30 MIKCc, two MIKC*, two Mα, and five Mγ). The MADS-box domain, gene structure, and conserved motif compositions of the carnation MADS genes were analysed. Meanwhile, the expression of DcaMADS genes were significantly different in stems, leaves, and flower buds. Further studies were carried out for exploring the expression of DcaMADS genes in individual flower organs, and some crucial DcaMADS genes correlated with their putative function were validated. Finally, a new expression pattern of DcaMADS genes in flower organs of carnation was provided: sepal (three class E genes and two class A genes), petal (two class B genes, two class E genes, and one SHORT VEGETATIVE PHASE (SVP)), stamen (two class B genes, two class E genes, and two class C), styles (two class E genes and two class C), and ovary (two class E genes, two class C, one AGAMOUS-LIKE 6 (AGL6), one SEEDSTICK (STK), one B sister, one SVP, and one Mα). This result proposes a model in floral organ identity of carnation and it may be helpful to further explore the molecular mechanism of flower organ identity in carnation. PMID:29617274
Identification and Characterization of the MADS-Box Genes and Their Contribution to Flower Organ in Carnation (Dianthus caryophyllus L.).

PubMed

Zhang, Xiaoni; Wang, Qijian; Yang, Shaozong; Lin, Shengnan; Bao, Manzhu; Bendahmane, Mohammed; Wu, Quanshu; Wang, Caiyun; Fu, Xiaopeng

2018-04-04

Dianthus is a large genus containing many species with high ornamental economic value. Extensive breeding strategies permitted an exploration of an improvement in the quality of cultivated carnation, particularly in flowers. However, little is known on the molecular mechanisms of flower development in carnation. Here, we report the identification and description of MADS-box genes in carnation (DcaMADS) with a focus on those involved in flower development and organ identity determination. In this study, 39 MADS-box genes were identified from the carnation genome and transcriptome by the phylogenetic analysis. These genes were categorized into four subgroups (30 MIKCc, two MIKC*, two Mα, and five Mγ). The MADS-box domain, gene structure, and conserved motif compositions of the carnation MADS genes were analysed. Meanwhile, the expression of DcaMADS genes were significantly different in stems, leaves, and flower buds. Further studies were carried out for exploring the expression of DcaMADS genes in individual flower organs, and some crucial DcaMADS genes correlated with their putative function were validated. Finally, a new expression pattern of DcaMADS genes in flower organs of carnation was provided: sepal (three class E genes and two class A genes), petal (two class B genes, two class E genes, and one SHORT VEGETATIVE PHASE (SVP)), stamen (two class B genes, two class E genes, and two class C), styles (two class E genes and two class C), and ovary (two class E genes, two class C, one AGAMOUS-LIKE 6 (AGL6), one SEEDSTICK (STK), one B sister, one SVP, and one Mα). This result proposes a model in floral organ identity of carnation and it may be helpful to further explore the molecular mechanism of flower organ identity in carnation.
Selection of reference genes for gene expression studies related to intramuscular fat deposition in Capra hircus skeletal muscle.

PubMed

Zhu, Wuzheng; Lin, Yaqiu; Liao, Honghai; Wang, Yong

2015-01-01

The identification of suitable reference genes is critical for obtaining reliable results from gene expression studies using quantitative real-time PCR (qPCR) because the expression of reference genes may vary considerably under different experimental conditions. In most cases, however, commonly used reference genes are employed in data normalization without proper validation, which may lead to incorrect data interpretation. Here, we aim to select a set of optimal reference genes for the accurate normalization of gene expression associated with intramuscular fat (IMF) deposition during development. In the present study, eight reference genes (PPIB, HMBS, RPLP0, B2M, YWHAZ, 18S, GAPDH and ACTB) were evaluated by three different algorithms (geNorm, NormFinder and BestKeeper) in two types of muscle tissues (longissimus dorsi muscle and biceps femoris muscle) across different developmental stages. All three algorithms gave similar results. PPIB and HMBS were identified as the most stable reference genes, while the commonly used reference genes 18S and GAPDH were the most variably expressed, with expression varying dramatically across different developmental stages. Furthermore, to reveal the crucial role of appropriate reference genes in obtaining a reliable result, analysis of PPARG expression was performed by normalization to the most and the least stable reference genes. The relative expression levels of PPARG normalized to the most stable reference genes greatly differed from those normalized to the least stable one. Therefore, evaluation of reference genes must be performed for a given experimental condition before the reference genes are used. PPIB and HMBS are the optimal reference genes for analysis of gene expression associated with IMF deposition in skeletal muscle during development.
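The stability ranking mentioned in the abstract above can be illustrated with a minimal sketch of the geNorm stability measure M, assuming its usual definition: for each candidate gene, the average, over all other candidates, of the standard deviation across samples of the log2 expression ratio. The function name `genorm_m` and the expression values are hypothetical and made up for illustration; they are not data from the study, and the published geNorm tool adds a stepwise exclusion procedure that is not shown here.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene.

    expr maps a gene name to a list of relative expression quantities,
    one per sample (all values must be positive). Lower M = more stable.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # Variation of that ratio across samples
            pairwise_sds.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sds)
    return m_values


# Made-up quantities for illustration only (four samples per gene)
toy = {
    "PPIB": [1.0, 1.1, 0.9, 1.0],
    "HMBS": [2.0, 2.2, 1.8, 2.0],
    "GAPDH": [1.0, 3.0, 0.5, 6.0],
}
m_values = genorm_m(toy)
ranking = sorted(m_values, key=m_values.get)  # most stable first
print(ranking)
```

In this toy input the first two genes vary in proportion to each other while the third does not, so the third receives the highest M and is ranked last.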
Gene set analysis using variance component tests.

PubMed

Huang, Yen-Tsung; Lin, Xihong

2013-06-28

Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
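The permutation step mentioned in the abstract above can be sketched generically. The following is not TEGS: it uses a deliberately simple statistic (the sum over genes of the squared difference in group means) that ignores correlation among genes, which is exactly the feature the abstract says TEGS models through a working covariance. Function names and data are hypothetical and made up for illustration.

```python
import random


def set_statistic(samples, labels):
    """Sum over genes of the squared difference in group means."""
    n_genes = len(samples[0])
    exposed = [s for s, y in zip(samples, labels) if y == 1]
    control = [s for s, y in zip(samples, labels) if y == 0]
    total = 0.0
    for g in range(n_genes):
        mean_exposed = sum(s[g] for s in exposed) / len(exposed)
        mean_control = sum(s[g] for s in control) / len(control)
        total += (mean_exposed - mean_control) ** 2
    return total


def permutation_p_value(samples, labels, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling the exposure labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = set_statistic(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(samples, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)


# Made-up data: 8 samples x 3 genes; the first two genes are shifted
# in the exposed group, the third is not.
toy_samples = [
    [5.1, 3.0, 7.2], [5.3, 3.2, 7.0], [5.0, 2.9, 7.1], [5.2, 3.1, 7.3],
    [6.4, 4.1, 7.1], [6.6, 4.3, 7.2], [6.5, 4.0, 7.0], [6.3, 4.2, 7.3],
]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = permutation_p_value(toy_samples, toy_labels)
print(p_value)
```

Shuffling labels rather than expression values keeps each sample's gene-gene dependence intact under the null, which is why permutation is a common choice for gene set tests.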
Distinct Trajectories of Massive Recent Gene Gains and Losses in Populations of a Microbial Eukaryotic Pathogen.

PubMed

Hartmann, Fanny E; Croll, Daniel

2017-11-01

Differences in gene content are a significant source of variability within species and have an impact on phenotypic traits. However, little is known about the mechanisms responsible for the most recent gene gains and losses. We screened the genomes of 123 worldwide isolates of the major pathogen of wheat Zymoseptoria tritici for robust evidence of gene copy number variation. Based on orthology relationships in three closely related fungi, we identified 599 gene gains and 1,024 gene losses that have not yet reached fixation within the focal species. Our analyses of gene gains and losses segregating in populations showed that gene copy number variation arose preferentially in subtelomeres and in proximity to transposable elements. Recently lost genes were enriched in virulence factors and secondary metabolite gene clusters. In contrast, recently gained genes encoded mostly secreted protein lacking a conserved domain. We analyzed the frequency spectrum at loci segregating a gene presence-absence polymorphism in four worldwide populations. Recent gene losses showed a significant excess in low-frequency variants compared with genome-wide single nucleotide polymorphism, which is indicative of strong negative selection against gene losses. Recent gene gains were either under weak negative selection or neutral. We found evidence for strong divergent selection among populations at individual loci segregating a gene presence-absence polymorphism. Hence, gene gains and losses likely contributed to local adaptation. Our study shows that microbial eukaryotes harbor extensive copy number variation within populations and that functional differences among recently gained and lost genes led to distinct evolutionary trajectories. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Gene expression profiles in paraffin-embedded core biopsy tissue predict response to chemotherapy in women with locally advanced breast cancer.

PubMed

Gianni, Luca; Zambetti, Milvia; Clark, Kim; Baker, Joffre; Cronin, Maureen; Wu, Jenny; Mariani, Gabriella; Rodriguez, Jaime; Carcangiu, Marialuisa; Watson, Drew; Valagussa, Pinuccia; Rouzier, Roman; Symmans, W Fraser; Ross, Jeffrey S; Hortobagyi, Gabriel N; Pusztai, Lajos; Shak, Steven

2005-10-10

We sought to identify gene expression markers that predict the likelihood of chemotherapy response. We also tested whether chemotherapy response is correlated with the 21-gene Recurrence Score assay that quantifies recurrence risk. Patients with locally advanced breast cancer received neoadjuvant paclitaxel and doxorubicin. RNA was extracted from the pretreatment formalin-fixed paraffin-embedded core biopsies. The expression of 384 genes was quantified using reverse transcriptase polymerase chain reaction and correlated with pathologic complete response (pCR). The performance of genes predicting for pCR was tested in patients from an independent neoadjuvant study where gene expression was obtained using DNA microarrays. Of 89 assessable patients (mean age, 49.9 years; mean tumor size, 6.4 cm), 11 (12%) had a pCR. Eighty-six genes correlated with pCR (unadjusted P < .05); pCR was more likely with higher expression of proliferation-related genes and immune-related genes, and with lower expression of estrogen receptor (ER) -related genes. In 82 independent patients treated with neoadjuvant paclitaxel and doxorubicin, DNA microarray data were available for 79 of the 86 genes. In univariate analysis, 24 genes correlated with pCR with P < .05 (false discovery, four genes) and 32 genes showed correlation with P < .1 (false discovery, eight genes). The Recurrence Score was positively associated with the likelihood of pCR (P = .005), suggesting that the patients who are at greatest recurrence risk are more likely to have chemotherapy benefit. Quantitative expression of ER-related genes, proliferation genes, and immune-related genes are strong predictors of pCR in women with locally advanced breast cancer receiving neoadjuvant anthracyclines and paclitaxel.
Reference genes for gene expression studies in wheat flag leaves grown under different farming conditions

PubMed Central

2011-01-01

Background: Internal control genes with highly uniform expression throughout the experimental conditions are required for accurate gene expression analysis, as no universal reference gene exists. In this study, the expression stability of 24 candidate genes from Triticum aestivum cv. Cubus flag leaves grown under organic and conventional farming systems was evaluated in two locations in order to select suitable genes that can be used for normalization of real-time quantitative reverse-transcription PCR (RT-qPCR) reactions. The genes were selected from among the most commonly used reference genes as well as genes encoding proteins involved in several metabolic pathways. Findings: Individual genes displayed different expression rates across all samples assayed. Using geNorm, a set of three potential reference genes was found suitable for normalization of RT-qPCR reactions in winter wheat flag leaves cv. Cubus: TaFNRII (ferredoxin-NADP(H) oxidoreductase; AJ457980.1), ACT2 (actin 2; TC234027), and rrn26 (a putative homologue to RNA 26S gene; AL827977.1). In addition to these three genes, which were also top-ranked by NormFinder, two further genes, CYP18-2 (Cyclophilin A, AY456122.1) and TaWIN1 (14-3-3 like protein, AB042193), were most consistently stably expressed. Furthermore, we showed that TaFNRII, ACT2, and CYP18-2 are suitable for gene expression normalization in two other winter wheat varieties (Tommi and Centenaire) grown under three treatments (organic, conventional and no nitrogen) and in a different environment from the one tested with cv. Cubus. Conclusions: This study provides a new set of reference genes which should improve the accuracy of gene expression analyses when using wheat flag leaves, such as those related to the improvement of nitrogen use efficiency for cereal production. PMID:21951810
On the presence and role of human gene-body DNA methylation

PubMed Central

Jjingo, Daudi; Conley, Andrew B.; Yi, Soojin V.; Lunyak, Victoria V.; Jordan, I. King

2012-01-01

DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes. PMID:22577155

Phylogenetic Analysis of the Incidence of lux Gene Horizontal Transfer in Vibrionaceae

PubMed Central

Urbanczyk, Henryk; Ast, Jennifer C.; Kaeding, Allison J.; Oliver, James D.; Dunlap, Paul V.

2008-01-01

Horizontal gene transfer (HGT) is thought to occur frequently in bacteria in nature and to play an important role in bacterial evolution, contributing to the formation of new species. To gain insight into the frequency of HGT in Vibrionaceae and its possible impact on speciation, we assessed the incidence of interspecies transfer of the lux genes (luxCDABEG), which encode proteins involved in luminescence, a distinctive phenotype. Three hundred three luminous strains, most of which were recently isolated from nature and which represent 11 Aliivibrio, Photobacterium, and Vibrio species, were screened for incongruence of phylogenies based on a representative housekeeping gene (gyrB or pyrH) and a representative lux gene (luxA). Strains exhibiting incongruence were then subjected to detailed phylogenetic analysis of horizontal transfer by using multiple housekeeping genes (gyrB, recA, and pyrH) and multiple lux genes (luxCDABEG). In nearly all cases, housekeeping gene and lux gene phylogenies were congruent, and there was no instance in which the lux genes of one luminous species had replaced the lux genes of another luminous species. Therefore, the lux genes are predominantly vertically inherited in Vibrionaceae. The few exceptions to this pattern of congruence were as follows: (i) the lux genes of the only known luminous strain of Vibrio vulnificus, VVL1 (ATCC 43382), were evolutionarily closely related to the lux genes of Vibrio harveyi; (ii) the lux genes of two luminous strains of Vibrio chagasii, 21N-12 and SB-52, were closely related to those of V. harveyi and Vibrio splendidus, respectively; (iii) the lux genes of a luminous strain of Photobacterium damselae, BT-6, were closely related to the lux genes of the lux-rib2 operon of Photobacterium leiognathi; and (iv) a strain of the luminous bacterium Photobacterium mandapamensis was found to be merodiploid for the lux genes, and the second set of lux genes was closely related to the lux genes of the lux-rib2 operon of P. leiognathi. In none of these cases of apparent HGT, however, did acquisition of the lux genes correlate with phylogenetic divergence of the recipient strain from other members of its species. The results indicate that horizontal transfer of the lux genes in nature is rare and that horizontal acquisition of the lux genes apparently has not contributed to speciation in recipient taxa. PMID:18359809
Identification and comprehensive evaluation of reference genes for RT-qPCR analysis of host gene-expression in Brassica juncea-aphid interaction using microarray data.

PubMed

Ram, Chet; Koramutla, Murali Krishna; Bhattacharya, Ramcharan

2017-07-01

Brassica juncea is a chief oil yielding crop in many parts of the world including India. With advancement of molecular techniques, RT-qPCR based study of gene-expression has become an integral part of experimentations in crop breeding. In RT-qPCR, use of appropriate reference gene(s) is pivotal. The virtue of the reference genes, being constant in expression throughout the experimental treatments, needs to be validated case by case. Appropriate reference gene(s) for normalization of gene-expression data in B. juncea during the biotic stress of aphid infestation is not known. In the present investigation, 11 reference genes identified from microarray database of Arabidopsis-aphid interaction at a cut off FDR ≤0.1, along with two known reference genes of B. juncea, were analyzed for their expression stability upon aphid infestation. These included 6 frequently used and 5 newly identified reference genes. Ranking orders of the reference genes in terms of expression stability were calculated using advanced statistical approaches such as geNorm, NormFinder, delta Ct and BestKeeper. The analysis suggested CAC, TUA and DUF179 as the most suitable reference genes. Further, normalization of the gene-expression data of STP4 and PR1 by the most and the least stable reference gene, respectively has demonstrated importance and applicability of the recommended reference genes in aphid infested samples of B. juncea. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Positive Selection Linked with Generation of Novel Mammalian Dentition Patterns.

PubMed

Machado, João Paulo; Philip, Siby; Maldonado, Emanuel; O'Brien, Stephen J; Johnson, Warren E; Antunes, Agostinho

2016-09-11

A diverse group of genes are involved in the tooth development of mammals. Several studies, focused mainly on mice and rats, have provided a detailed depiction of the processes coordinating tooth formation and shape. Here we surveyed 236 tooth-associated genes in 39 mammalian genomes and tested for signatures of selection to assess patterns of molecular adaptation in genes regulating mammalian dentition. Of the 236 genes, 31 (∼13.1%) showed strong signatures of positive selection that may be responsible for the phenotypic diversity observed in mammalian dentition. Mammalian-specific tooth-associated genes had accelerated mutation rates compared with older genes found across all vertebrates. More recently evolved genes had fewer interactions (either genetic or physical), were associated with fewer Gene Ontology terms and had faster evolutionary rates compared with older genes. The introns of these positively selected genes also exhibited accelerated evolutionary rates, which may reflect additional adaptive pressure in the intronic regions that are associated with regulatory processes that influence tooth-gene networks. The positively selected genes were mainly involved in processes like mineralization and structural organization of tooth specific tissues such as enamel and dentin. Of the 236 analyzed genes, 12 mammalian-specific genes (younger genes) provided insights on diversification of mammalian teeth as they have higher evolutionary rates and exhibit different expression profiles compared with older genes. Our results suggest that the evolution and development of mammalian dentition occurred in part through positive selection acting on genes that previously had other functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Both a PKS and a PPTase are involved in melanin biosynthesis and regulation of Aureobasidium melanogenum XJ5-1 isolated from the Taklimakan desert.

PubMed

Jiang, Hong; Liu, Guang-Lei; Chi, Zhe; Wang, Jian-Ming; Zhang, Ly-Ly; Chi, Zhen-Ming

2017-02-20

A PKS1 gene responsible for the melanin biosynthesis and a NPG1 gene in Aureobasidium melanogenum XJ5-1 were cloned and characterized. An ORF of the PKS1 gene encoding a protein with 2165 amino acids contained 6495bp while an ORF of the NPG1 gene encoding a protein with 340 amino acids had 1076bp. After analysis of their promoters, it was found that expression of both the PKS1 gene and the NPG1 gene was repressed by nitrogen sources and glucose, respectively. The PKS deduced from the cloned gene consisted of one ketosynthase, one acyl transferase, two acyl carrier proteins, one thioesterase and one cyclase while the PPTase belonged to the family Sfp-type. After disruption of the PKS1 gene and the NPG1 gene, expression of the PKS1 gene and the NPG1 gene and the melanin biosynthesis in the disruptants K5 and DP107 disappeared and expression of the PKS1 gene in the disruptant DP107 was also negatively influenced. However, after the NPG1 gene was complemented in the disruptant DP107, the melanin biosynthesis in the complementary strain BP17 was restored and expression of the PKS1 gene and the NPG1 gene was greatly enhanced, suggesting that the PKS was indeed activated and regulated by the PPTase and expression of the PKS1 gene and the NPG1 gene had a coordinate regulation. Copyright © 2016 Elsevier B.V. All rights reserved.
Extraordinary diversity of visual opsin genes in dragonflies

PubMed Central

Futahashi, Ryo; Kawahara-Miki, Ryouka; Kinoshita, Michiyo; Yoshitake, Kazutoshi; Yajima, Shunsuke; Arikawa, Kentaro; Fukatsu, Takema

2015-01-01

Dragonflies are colorful and large-eyed animals strongly dependent on color vision. Here we report an extraordinary large number of opsin genes in dragonflies and their characteristic spatiotemporal expression patterns. Exhaustive transcriptomic and genomic surveys of three dragonflies of the family Libellulidae consistently identified 20 opsin genes, consisting of 4 nonvisual opsin genes and 16 visual opsin genes of 1 UV, 5 short-wavelength (SW), and 10 long-wavelength (LW) type. Comprehensive transcriptomic survey of the other dragonflies representing an additional 10 families also identified as many as 15–33 opsin genes. Molecular phylogenetic analysis revealed dynamic multiplications and losses of the opsin genes in the course of evolution. In contrast to many SW and LW genes expressed in adults, only one SW gene and several LW genes were expressed in larvae, reflecting less visual dependence and LW-skewed light conditions for their lifestyle under water. In this context, notably, the sand-burrowing or pit-dwelling species tended to lack SW gene expression in larvae. In adult visual organs: (i) many SW genes and a few LW genes were expressed in the dorsal region of compound eyes, presumably for processing SW-skewed light from the sky; (ii) a few SW genes and many LW genes were expressed in the ventral region of compound eyes, probably for perceiving terrestrial objects; and (iii) expression of a specific LW gene was associated with ocelli. Our findings suggest that the stage- and region-specific expressions of the diverse opsin genes underlie the behavior, ecology, and adaptation of dragonflies. PMID:25713365
Gene Selection and Cancer Classification: A Rough Sets Based Approach

NASA Astrophysics Data System (ADS)

Sun, Lijun; Miao, Duoqian; Zhang, Hongyun

Identification of informative gene subsets responsible for discerning between available samples of gene expression data is an important task in bioinformatics. Reducts, from rough sets theory, corresponding to a minimal set of essential genes for discerning samples, are an efficient tool for gene selection. Due to the computational complexity of the existing reduct algorithms, feature ranking is usually used as a first step to narrow down the gene space, and the top-ranked genes are selected. In this paper, we define a novel criterion for scoring genes, based on the expression level difference between classes and the gene's contribution to classification, and present an algorithm for generating all possible reducts from informative genes. The algorithm takes the whole attribute set into account and finds short reducts with a significant reduction in computational complexity. An exploration of this approach on benchmark gene expression data sets demonstrates that it is successful in selecting highly discriminative genes and that the classification accuracy is impressive.
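The notion of a reduct used in the abstract above can be made concrete with a minimal brute-force sketch, assuming discretized expression levels and the standard rough-set definition: a reduct is a minimal set of attributes (genes) on which every pair of samples from different classes still differs. This is not the algorithm proposed in the paper, which is described as far more efficient; the function names and the toy table are hypothetical.

```python
from itertools import combinations


def discerns(rows, labels, attrs):
    """True if every pair of samples with different labels differs
    on at least one attribute in attrs."""
    for (r1, y1), (r2, y2) in combinations(zip(rows, labels), 2):
        if y1 != y2 and all(r1[a] == r2[a] for a in attrs):
            return False
    return True


def all_reducts(rows, labels):
    """Enumerate all minimal discerning attribute subsets by increasing size."""
    n_attrs = len(rows[0])
    reducts = []
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            # A superset of a known reduct cannot be minimal
            if any(set(r) <= set(attrs) for r in reducts):
                continue
            if discerns(rows, labels, attrs):
                reducts.append(attrs)
    return reducts


# Made-up discretized table: 4 samples x 3 genes (0 = low, 1 = high)
toy_rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
toy_labels = ["tumor", "tumor", "normal", "normal"]
reducts = all_reducts(toy_rows, toy_labels)
print(reducts)  # attribute indices of each reduct
```

The exhaustive search is exponential in the number of genes, which illustrates why the abstract describes feature ranking as a usual first step before reduct generation.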
Two fundamentally different classes of microbial genes.

PubMed

Wolf, Yuri I; Makarova, Kira S; Lobkovsky, Alexander E; Koonin, Eugene V

2016-11-07

The evolution of bacterial and archaeal genomes is highly dynamic and involves extensive horizontal gene transfer and gene loss [1-4]. Furthermore, many microbial species appear to have open pangenomes, where each newly sequenced genome contains more than 10% ORFans, that is, genes without detectable homologues in other species [5,6]. Here, we report a quantitative analysis of microbial genome evolution by fitting the parameters of a simple, steady-state evolutionary model to the comparative genomic data on the gene content and gene order similarity between archaeal genomes. The results reveal two sharply distinct classes of microbial genes, one of which is characterized by effectively instantaneous gene replacement, and the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of the size of the prokaryotic genomic universe, which appears to consist of at least a billion distinct genes. Furthermore, the same distribution of constraints is shown to govern the evolution of gene complement and gene order, without the need to invoke long-range conservation or the selfish operon concept [7].
ORGANIZATION OF THE nif GENES OF THE NONHETEROCYSTOUS CYANOBACTERIUM TRICHODESMIUM SP. IMS101.

PubMed

Dominic, Benny; Zani, Sabino; Chen, Yi-Bu; Mellon, Mark T; Zehr, Jonathan P

2000-08-26

An approximately 16-kb fragment of the Trichodesmium sp. IMS101 (a nonheterocystous filamentous cyanobacterium) "conventional" nif gene cluster was cloned and sequenced. The gene organization of the Trichodesmium and Anabaena variabilis vegetative (nif2) nitrogenase gene clusters spanning the region from nifB to nifW is similar except for the absence of two open reading frames (ORF3 and ORF1) in Trichodesmium. The Trichodesmium nifEN genes encode a fused NifEN polypeptide that does not appear to be processed into individual NifE and NifN polypeptides. Fused nifEN genes were previously found in the A. variabilis nif2 genes, but we have found that fused nifEN genes are widespread in the nonheterocystous cyanobacteria. Although the gene organization of the nonheterocystous filamentous Trichodesmium nif gene cluster is very similar to that of the A. variabilis vegetative nif2 gene cluster, phylogenetic analysis of nif sequences does not support close relatedness of Trichodesmium and A. variabilis vegetative (nif2) nitrogenase genes.
Validation of reference genes for quantifying changes in gene expression in virus-infected tobacco.

PubMed

Baek, Eseul; Yoon, Ju-Yeon; Palukaitis, Peter

2017-10-01

To facilitate quantification of gene expression changes in virus-infected tobacco plants, eight housekeeping genes were evaluated for their stability of expression during infection by one of three systemically-infecting viruses (cucumber mosaic virus, potato virus X, potato virus Y) or a hypersensitive-response-inducing virus (tobacco mosaic virus; TMV) limited to the inoculated leaf. Five reference-gene validation programs were used to establish the order of the most stable genes for the systemically-infecting viruses as ribosomal protein L25 > β-Tubulin > Actin, and the least stable genes Ubiquitin-conjugating enzyme (UCE) < PP2A < GAPDH. For local infection by TMV, the most stable genes were EF1α > Cysteine protease > Actin, and the least stable genes were GAPDH < PP2A < UCE. Using two of the most stable and the two least stable validated reference genes, three defense responsive genes were examined to compare their relative changes in gene expression caused by each virus. Copyright © 2017 Elsevier Inc. All rights reserved.
Methylation of miRNA genes and oncogenesis.

PubMed

Loginov, V I; Rykov, S V; Fridman, M V; Braga, E A

2015-02-01

Interaction between microRNA (miRNA) and messenger RNA of target genes at the posttranscriptional level provides fine-tuned dynamic regulation of cell signaling pathways. Each miRNA can be involved in regulating hundreds of protein-coding genes, and, conversely, a number of different miRNAs usually target a structural gene. Epigenetic gene inactivation associated with methylation of promoter CpG-islands is common to both protein-coding genes and miRNA genes. Here, data on functions of miRNAs in development of tumor-cell phenotype are reviewed. Genomic organization of promoter CpG-islands of the miRNA genes located in inter- and intragenic areas is discussed. The literature and our own results on frequency of CpG-island methylation in miRNA genes from tumors are summarized, and data regarding a link between such modification and changed activity of miRNA genes and, consequently, protein-coding target genes are presented. Moreover, the impact of miRNA gene methylation on key oncogenetic processes as well as affected signaling pathways is discussed.
Simple F Test Reveals Gene-Gene Interactions in Case-Control Studies

PubMed Central

Chen, Guanjie; Yuan, Ao; Zhou, Jie; Bentley, Amy R.; Adeyemo, Adebowale; Rotimi, Charles N.

2012-01-01

Missing heritability is still a challenge for Genome Wide Association Studies (GWAS). Gene-gene interactions may partially explain this residual genetic influence and contribute broadly to complex disease. To analyze the gene-gene interactions in case-control studies of complex disease, we propose a simple, non-parametric method that utilizes the F-statistic. This approach consists of three steps. First, we examine the joint distribution of a pair of SNPs in cases and controls separately. Second, an F-test is used to evaluate the ratio of dependence in cases to that of controls. Finally, results are adjusted for multiple tests. This method was used to evaluate gene-gene interactions that are associated with risk of Type 2 Diabetes among African Americans in the Howard University Family Study. We identified 18 gene-gene interactions (P < 0.0001). Compared with the commonly used logistic regression method, we demonstrate that the F-ratio test is an efficient approach to measuring gene-gene interactions, especially for studies with limited sample size. PMID:22837643
Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.

PubMed

Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui

2016-06-01

In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed in the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on the domain, these genes are divided into two groups and distributed in all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.
The construction of cDNA library and the screening of related antigen of ascitic tumor cells of ovarian cancer.

PubMed

Hou, Q; Chen, K; Shan, Z

2015-01-01

The aim was to construct a cDNA library from the ascitic tumor cells of ovarian cancer, which can be used to screen for related antigens for the early diagnosis of ovarian cancer and for therapeutic targets of immune treatment. Ascitic tumor cells from patients with ovarian serous cystadenocarcinoma (four cases), ovarian mucinous cystadenocarcinoma (two cases), and ovarian endometrial carcinoma (two cases) were used to construct the cDNA library. To screen for ovarian cancer antigen genes, evaluate the enzyme, and analyze nucleotide sequences, serological analysis of recombinant tumor cDNA expression libraries (SEREX) and suppression subtractive hybridization (SSH) were used. Recombinant expression-based serological mini-arrays (SMARTA) were used to detect the ovarian cancer antigens and the corresponding autoantibody reactions in the sera of 105 ovarian cancer patients and 105 normal women. After two rounds of serologic screening and glycosides sequencing analysis, 59 candidate ovarian cancer antigen gene fragments were finally identified, corresponding to 50 genes. They were divided into six categories: (1) genes homologous to known ovarian cancer genes, such as BARD1; (2) homologous genes associated with other tumors, such as TM4SF1; (3) genes expressed in specific tissues, such as ILF3 and FXR1; (4) genes identical to protein genes of special function, such as TIZ and C1D; (5) homologous genes sharing a source with embryonic genes, such as PKHD1; (6) the remaining unknown genes without homologous sequences in the gene pool, such as OV-189.
SEREX technology combined with the SSH method is an effective research strategy that can filter tumor antigens with high specificity; serum autoantibodies against the recombinant antigens of the TM4SF1, C1D, TIZ, BARD1, FXR1, and OV-189 genes can be regarded as biomarkers for diagnosing ovarian cancer. Combining the detection of multiple antigens can improve diagnostic efficiency.
Transcript Profile of Flowering Regulatory Genes in VcFT-Overexpressing Blueberry Plants

PubMed Central

Walworth, Aaron E.; Chai, Benli; Song, Guo-qing

2016-01-01

In order to identify genetic components in flowering pathways of highbush blueberry (Vaccinium corymbosum L.), a transcriptome reference composed of 254,396 transcripts and 179,853 gene contigs was developed by assembly of 72.7 million reads using Trinity. Using this transcriptome reference and a query of flowering pathway genes of herbaceous plants, we identified potential flowering pathway genes/transcripts of blueberry. Transcriptome analysis of flowering pathway genes was then conducted on leaf tissue samples of transgenic blueberry cv. Aurora (‘VcFT-Aurora’), which overexpresses a blueberry FLOWERING LOCUS T-like gene (VcFT). Sixty-one blueberry transcripts of 40 genes showed high similarities to 33 known flowering-related genes of herbaceous plants, of which 17 down-regulated and 16 up-regulated genes were identified in ‘VcFT-Aurora’. All down-regulated genes encoded transcription factors/enzymes upstream in the signaling pathway containing VcFT. A blueberry CONSTANS-LIKE 5-like (VcCOL5) gene was down-regulated and associated with five other differentially expressed (DE) genes in the photoperiod-mediated flowering pathway. Three down-regulated genes, i.e., a MADS-AFFECTING FLOWERING 2-like gene (VcMAF2), a MADS-AFFECTING FLOWERING 5-like gene (VcMAF5), and a VERNALIZATION1-like gene (VcVRN1), may function as integrators in place of FLOWERING LOCUS C (FLC) in the vernalization pathway. Because no CONSTAN1-like or FLOWERING LOCUS C-like genes were found in blueberry, VcCOL5 and VcMAF2/VcMAF5 or VRN1 might be the major integrator(s) in the photoperiod- and vernalization-mediated flowering pathway, respectively. The major down-stream genes of VcFT, i.e., SUPPRESSOR of Overexpression of Constans 1-like (VcSOC1), LEAFY-like (VcLFY), APETALA1-like (VcAP1), CAULIFLOWER 1-like (VcCAL1), and FRUITFULL-like (VcFUL) genes were present and showed high similarity to their orthologues in herbaceous plants. 
Moreover, overexpression of VcFT promoted expression of all of these VcFT downstream genes. These results suggest that VcFT’s down-stream genes appear conserved in blueberry. PMID:27271296
Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

PubMed Central

2012-01-01

Background Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Results Over 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. 
Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since multiple TLRs were found in the generic fever network, it is reasonable to hypothesize that vaccine-TLR interactions may play an important role in inducing fever response, which deserves a further investigation. Conclusions This study demonstrated that ontology-based literature mining is a powerful method for analyzing gene interaction networks and generating new scientific hypotheses. PMID:23256563
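The abstract above prioritizes genes by network centrality but does not say which metrics were used. As one illustrative possibility, the sketch below computes degree centrality (the fraction of other nodes a gene is directly linked to) from a list of gene-gene interactions. The function name and the placeholder gene labels are assumptions for illustration and do not come from the study.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected interaction list.

    edges is an iterable of (gene_a, gene_b) pairs. Duplicate pairs are
    counted once, and self-interactions are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-loop says nothing about connectivity to others
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def most_central(edges, top_k=1):
    """Return the top_k nodes by degree centrality (ties broken by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda g: (-scores[g], g))[:top_k]
```

Other centrality measures (betweenness, closeness, eigenvector) would rank genes differently; degree is used here only because it is the simplest to state.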
Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

PubMed

Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

2007-09-21

Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. 
Our method of combining classifiers can improve the accuracy of such predictions - in this case, predictions of genes involved in stress response in plants - and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
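The weighted-voting idea in this abstract can be sketched as follows. The abstract does not specify how the weights are chosen or how scores are thresholded, so this sketch assumes each base classifier outputs a score in [0, 1] per gene and that weights are supplied by the caller (for example, derived from cross-validated accuracy). All names and the 0.5 threshold are hypothetical.

```python
def weighted_vote(predictions, weights, threshold=0.5):
    """Combine per-classifier scores into one weighted score per gene.

    predictions: list of score lists, one list per classifier, all the same length.
    weights: one non-negative weight per classifier.
    Returns (combined_scores, calls), where calls[i] is True when the
    weighted average for gene i reaches the threshold.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_items = len(predictions[0])
    combined = []
    for i in range(n_items):
        # Weighted average of the classifiers' scores for gene i.
        score = sum(w * p[i] for w, p in zip(weights, predictions)) / total
        combined.append(score)
    return combined, [s >= threshold for s in combined]
```

With three classifiers weighted 0.5, 0.3 and 0.2, a gene scored 0.9, 0.6 and 0.1 receives a combined score of 0.65, so the more trusted classifiers dominate the call.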
Transcriptome analysis of adipose tissues from two fat-tailed sheep breeds reveals key genes involved in fat deposition.

PubMed

Li, Baojun; Qiao, Liying; An, Lixia; Wang, Weiwei; Liu, Jianhua; Ren, Youshe; Pan, Yangyang; Jing, Jiongjie; Liu, Wenzhong

2018-05-08

The level of fat deposition in carcass is a crucial factor influencing meat quality. Guangling Large-Tailed (GLT) and Small-Tailed Han (STH) sheep are important local Chinese fat-tailed breeds that show distinct patterns of fat depots. To gain a better understanding of fat deposition, transcriptome profiles were determined by RNA-sequencing of perirenal, subcutaneous, and tail fat tissues from both the sheep breeds. The common highly expressed genes (co-genes) in all the six tissues, and the genes that were differentially expressed (DE genes) between these two breeds in the corresponding tissues were analyzed. Approximately 47 million clean reads were obtained for each sample, and a total of 17,267 genes were annotated. Of the 47 highly expressed co-genes, FABP4, ADIPOQ, FABP5, and CD36 were the four most highly transcribed genes among all the known genes related to adipose deposition. FHC, FHC-pseudogene, and ZC3H10 were also highly expressed genes and could, thus, have roles in fat deposition. A total of 2091, 4233, and 4131 DE genes were identified in the perirenal, subcutaneous, and tail fat tissues between the GLT and STH breeds, respectively. Gene Ontology (GO) analysis showed that some DE genes were associated with adipose metabolism. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that PPAR signaling pathway and ECM-receptor interaction were specifically enriched. Four genes, namely LOC101102230, PLTP, C1QTNF7, and OLR1 were up-regulated and two genes, SCD and UCP-1, were down-regulated in all the tested tissues of STH. Among the genes involved in ECM-receptor interaction, the genes encoding collagens, laminins, and integrins were quite different depending on the depots or the breeds. In STH, genes such as LAMB3, RELN, TNXB, and ITGA8, were identified to be up regulated and LAMB4 was observed to be down regulated. 
This study unravels the complex transcriptome profiles in sheep fat tissues, highlighting the candidate genes involved in fat deposition. Further studies are needed to investigate the roles of the candidate genes in fat deposition and in determining the meat quality of sheep.
Stratification of clear cell renal cell carcinoma (ccRCC) genomes by gene-directed copy number alteration (CNA) analysis

PubMed Central

Thiesen, H.-J.; Steinbeck, F.; Maruschke, M.; Koczan, D.; Ziems, B.; Hakenberg, O. W.

2017-01-01

Tumorigenic processes are understood to be driven by epi-/genetic and genomic alterations from single point mutations to chromosomal alterations such as insertions and deletions of nucleotides up to gains and losses of large chromosomal fragments including products of chromosomal rearrangements e.g. fusion genes and proteins. Overall comparisons of copy number alterations (CNAs) presented in 48 clear cell renal cell carcinoma (ccRCC) genomes resulted in ratios of gene losses versus gene gains between 26 ccRCC Fuhrman malignancy grades G1 (ratio 1.25) and 20 G3 (ratio 0.58). Gene losses and gains of 15762 CNA genes were mapped to 795 chromosomal cytoband loci including 280 KEGG pathways. CNAs were classified according to their contribution to Fuhrman tumour gradings G1 and G3. Gene gains and losses turned out to be highly structured processes in ccRCC genomes enabling the subclassification and stratification of ccRCC tumours in a genome-wide manner. CNAs of ccRCC seem to start with common tumour related gene losses flanked by CNAs specifying Fuhrman grade G1 losses and CNA gains favouring grade G3 tumours. The appearance of recurrent CNA signatures implies the presence of causal mechanisms most likely implicated in the pathogenesis and disease-outcome of ccRCC tumours distinguishing lower from higher malignant tumours. The diagnostic quality of initial 201 genes (108 genes supporting G1 and 93 genes G3 phenotypes) has been successfully validated on published Swiss data (GSE19949) leading to a restricted CNA gene set of 171 CNA genes of which 85 genes favour Fuhrman grade G1 and 86 genes Fuhrman grade G3. Regarding these gene sets overall survival decreased with the number of G3 related gene losses plus G3 related gene gains. CNA gene sets presented define an entry to a gene-directed and pathway-related functional understanding of ongoing copy number alterations within and between individual ccRCC tumours leading to CNA genes of prognostic and predictive value. 
PMID:28486536
Towards an informative mutant phenotype for every bacterial gene

DOE PAGES

Deutschbauer, Adam; Price, Morgan N.; Wetmore, Kelly M.; ...

2014-08-11

Mutant phenotypes provide strong clues to the functions of the underlying genes and could allow annotation of the millions of sequenced yet uncharacterized bacterial genes. However, it is not known how many genes have a phenotype under laboratory conditions, how many phenotypes are biologically interpretable for predicting gene function, and what experimental conditions are optimal to maximize the number of genes with a phenotype. To address these issues, we measured the mutant fitness of 1,586 genes of the ethanol-producing bacterium Zymomonas mobilis ZM4 across 492 diverse experiments and found statistically significant phenotypes for 89% of all assayed genes. Thus, in Z. mobilis, most genes have a functional consequence under laboratory conditions. We demonstrate that 41% of Z. mobilis genes have both a strong phenotype and a similar fitness pattern (cofitness) to another gene, and are therefore good candidates for functional annotation using mutant fitness. Among 502 poorly characterized Z. mobilis genes, we identified a significant cofitness relationship for 174. For 57 of these genes without a specific functional annotation, we found additional evidence to support the biological significance of these gene-gene associations, and in 33 instances, we were able to predict specific physiological or biochemical roles for the poorly characterized genes. Last, we identified a set of 79 diverse mutant fitness experiments in Z. mobilis that are nearly as biologically informative as the entire set of 492 experiments. Therefore, our work provides a blueprint for the functional annotation of diverse bacteria using mutant fitness.
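The notion of cofitness (two genes sharing a similar fitness pattern across experiments) can be made concrete with a small sketch. The abstract does not define the similarity measure, so the code below assumes it is the Pearson correlation between two genes' fitness vectors; the function names and the toy fitness values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def top_cofit_partner(fitness, gene):
    """Return (partner, r) for the gene whose fitness profile best matches `gene`.

    fitness maps gene name -> list of fitness values, one per experiment.
    """
    others = [g for g in fitness if g != gene]
    best = max(others, key=lambda g: pearson(fitness[gene], fitness[g]))
    return best, pearson(fitness[gene], fitness[best])
```

A poorly characterized gene whose best partner has a known function and a high correlation would, under this assumption, be a candidate for annotation by association.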
Genomics of local adaptation with gene flow.

PubMed

Tigano, Anna; Friesen, Vicki L

2016-05-01

Gene flow is a fundamental evolutionary force in adaptation that is especially important to understand as humans are rapidly changing both the natural environment and natural levels of gene flow. Theory proposes a multifaceted role for gene flow in adaptation, but it focuses mainly on the disruptive effect that gene flow has on adaptation when selection is not strong enough to prevent the loss of locally adapted alleles. The role of gene flow in adaptation is now better understood due to the recent development of both genomic models of adaptive evolution and genomic techniques, which both point to the importance of genetic architecture in the origin and maintenance of adaptation with gene flow. In this review, we discuss three main topics on the genomics of adaptation with gene flow. First, we investigate selection on migration and gene flow. Second, we discuss the three potential sources of adaptive variation in relation to the role of gene flow in the origin of adaptation. Third, we explain how local adaptation is maintained despite gene flow: we provide a synthesis of recent genomic models of adaptation, discuss the genomic mechanisms and review empirical studies on the genomics of adaptation with gene flow. Despite predictions on the disruptive effect of gene flow in adaptation, an increasing number of studies show that gene flow can promote adaptation, that local adaptations can be maintained despite high gene flow, and that genetic architecture plays a fundamental role in the origin and maintenance of local adaptation with gene flow. © 2016 John Wiley & Sons Ltd.
Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

PubMed

Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

2014-01-01

Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.
GARNET--gene set analysis with exploration of annotation relations.

PubMed

Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu

2011-02-15

Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules--gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
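The kappa statistic that this abstract uses to relate pairs of annotation gene sets can be sketched as below. This is a minimal illustration of Cohen's kappa applied to gene membership over a shared gene universe; it is not GARNET's code, and the function name, argument names, and the handling of the degenerate case are assumptions.

```python
def gene_set_kappa(set_a, set_b, universe):
    """Cohen's kappa for agreement between two gene sets over a gene universe.

    Each gene in the universe is classified as in/out of each set; kappa
    compares the observed agreement with the agreement expected by chance.
    """
    universe = set(universe)
    n = len(universe)
    if n == 0:
        raise ValueError("universe must not be empty")
    a = set(set_a) & universe
    b = set(set_b) & universe
    both = len(a & b)
    neither = n - len(a | b)
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / n ** 2
    if expected == 1.0:
        # Both sets are empty or both cover the universe: kappa is undefined.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)
```

Values near 1 indicate two annotation terms covering nearly the same genes, values near 0 indicate chance-level overlap, and negative values indicate sets that avoid each other.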
Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

PubMed Central

Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

2014-01-01

Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417
Systematic analysis of microarray datasets to identify Parkinson's disease‑associated pathways and genes.

PubMed

Feng, Yinling; Wang, Xuefeng

2017-03-01

In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.
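The abstract mentions a hypergeometric distribution-based test for co-pathway relationships but does not state how it was parameterized. As a hedged illustration, the sketch below computes the upper-tail probability of seeing at least `k` shared genes between two pathways, assuming a background of `N` genes, one pathway of size `K`, and the other of size `n`. The function name and this framing are assumptions for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background genes, K: genes in pathway A,
    n: genes in pathway B, k: observed shared genes.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    lo = max(k, 0)
    hi = min(K, n)
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no extra bounds check is needed.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return favourable / total


# Example: 5 shared genes between two 5-gene pathways in a 10-gene background.
print(hypergeom_upper_tail(5, 10, 5, 5))
```

A small tail probability suggests the two pathways share more genes than expected by chance. In practice such p-values would also need correction for multiple testing across pathway pairs.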
Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm.

PubMed

Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder; Murphy, Denis J

2018-01-01

Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops.
Evolution of the duplicated intracellular lipid-binding protein genes of teleost fishes.

PubMed

Venkatachalam, Ananda B; Parmar, Manoj B; Wright, Jonathan M

2017-08-01

Increasing organismal complexity during the evolution of life has been attributed to the duplication of genes and entire genomes. More recently, theoretical models have been proposed that postulate the fate of duplicated genes, among them the duplication-degeneration-complementation (DDC) model. In the DDC model, the common fate of a duplicated gene is loss from the genome owing to nonfunctionalization. Duplicated genes are retained in the genome either by subfunctionalization, where the functions of the ancestral gene are sub-divided between the sister duplicate genes, or by neofunctionalization, where one of the duplicate genes acquires a new function. Both processes occur either by loss or gain of regulatory elements in the promoters of duplicated genes. Here, we review the genomic organization, evolution, and transcriptional regulation of the multigene family of intracellular lipid-binding protein (iLBP) genes from teleost fishes. Teleost fishes possess many copies of iLBP genes owing to a whole genome duplication (WGD) early in the teleost fish radiation. Moreover, the retention of duplicated iLBP genes is substantially higher than the retention of all other genes duplicated in the teleost genome. The fatty acid-binding protein genes, a subfamily of the iLBP multigene family in zebrafish, are differentially regulated by peroxisome proliferator-activated receptor (PPAR) isoforms, which may account for the retention of iLBP genes in the zebrafish genome by the process of subfunctionalization of cis-acting regulatory elements in iLBP gene promoters.
Validating internal controls for quantitative plant gene expression studies.

PubMed

Brunner, Amy M; Yakovlev, Igor A; Strauss, Steven H

2004-08-18

Real-time reverse transcription PCR (RT-PCR) has greatly improved the ease and sensitivity of quantitative gene expression studies. However, accurate measurement of gene expression with this method relies on the choice of a valid reference for data normalization. Studies rarely verify that gene expression levels for reference genes are adequately consistent among the samples used, nor compare alternative genes to assess which are most reliable for the experimental conditions analyzed. Using real-time RT-PCR to study the expression of 10 poplar (genus Populus) housekeeping genes, we demonstrate a simple method for determining the degree of stability of gene expression over a set of experimental conditions. Based on a traditional method for analyzing the stability of varieties in plant breeding, it defines measures of gene expression stability from analysis of variance (ANOVA) and linear regression. We found that the potential internal control genes differed widely in their expression stability over the different tissues, developmental stages and environmental conditions studied. Our results support that quantitative comparisons of candidate reference genes are an important part of real-time RT-PCR studies that seek to precisely evaluate variation in gene expression. The method we demonstrated facilitates statistical and graphical evaluation of gene expression stability. Selection of the best reference gene for a given set of experimental conditions should enable detection of biologically significant changes in gene expression that are too small to be revealed by less precise methods, or when highly variable reference genes are unknowingly used in real-time RT-PCR experiments.
Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates

PubMed Central

Kikuta, Hiroshi; Laplante, Mary; Navratilova, Pavla; Komisarczuk, Anna Z.; Engström, Pär G.; Fredman, David; Akalin, Altuna; Caccamo, Mario; Sealy, Ian; Howe, Kerstin; Ghislain, Julien; Pezeron, Guillaume; Mourrain, Philippe; Ellingsen, Staale; Oates, Andrew C.; Thisse, Christine; Thisse, Bernard; Foucher, Isabelle; Adolf, Birgit; Geling, Andrea; Lenhard, Boris; Becker, Thomas S.

2007-01-01

We report evidence for a mechanism for the maintenance of long-range conserved synteny across vertebrate genomes. We found the largest mammal-teleost conserved chromosomal segments to be spanned by highly conserved noncoding elements (HCNEs), their developmental regulatory target genes, and phylogenetically and functionally unrelated “bystander” genes. Bystander genes are not specifically under the control of the regulatory elements that drive the target genes and are expressed in patterns that are different from those of the target genes. Reporter insertions distal to zebrafish developmental regulatory genes pax6.1/2, rx3, id1, and fgf8 and miRNA genes mirn9-1 and mirn9-5 recapitulate the expression patterns of these genes even if located inside or beyond bystander genes, suggesting that the regulatory domain of a developmental regulatory gene can extend into and beyond adjacent transcriptional units. We termed these chromosomal segments genomic regulatory blocks (GRBs). After whole genome duplication in teleosts, GRBs, including HCNEs and target genes, were often maintained in both copies, while bystander genes were typically lost from one GRB, strongly suggesting that evolutionary pressure acts to keep the single-copy GRBs of higher vertebrates intact. We show that loss of bystander genes and other mutational events suffered by duplicated GRBs in teleost genomes permits target gene identification and HCNE/target gene assignment. These findings explain the absence of evolutionary breakpoints from large vertebrate chromosomal segments and will aid in the recognition of position effect mutations within human GRBs. PMID:17387144
Selection of reliable reference genes for quantitative real-time PCR gene expression analysis in Jute (Corchorus capsularis) under stress treatments

PubMed Central

Niu, Xiaoping; Qi, Jianmin; Zhang, Gaoyang; Xu, Jiantang; Tao, Aifen; Fang, Pingping; Su, Jianguang

2015-01-01

To accurately measure gene expression using quantitative reverse transcription PCR (qRT-PCR), reliable reference gene(s) are required for data normalization. Corchorus capsularis, an annual herbaceous fiber crop with predominant biodegradability and renewability, has not been investigated for the stability of reference genes with qRT-PCR. In this study, 11 candidate reference genes were selected and their expression levels were assessed using qRT-PCR. To account for the influence of experimental approach and tissue type, 22 different jute samples were selected from abiotic and biotic stress conditions as well as three different tissue types. The stability of the candidate reference genes was evaluated using geNorm, NormFinder, and BestKeeper programs, and the comprehensive rankings of gene stability were generated by aggregate analysis. For the biotic stress and NaCl stress subsets, ACT7 and RAN were suitable as stable reference genes for gene expression normalization. For the PEG stress subset, UBC, and DnaJ were sufficient for accurate normalization. For the tissues subset, four reference genes TUBβ, UBI, EF1α, and RAN were sufficient for accurate normalization. The selected genes were further validated by comparing expression profiles of WRKY15 in various samples, and two stable reference genes were recommended for accurate normalization of qRT-PCR data. Our results provide researchers with appropriate reference genes for qRT-PCR in C. capsularis, and will facilitate gene expression study under these conditions. PMID:26528312
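For readers unfamiliar with how programs such as geNorm rank candidate reference genes, the sketch below computes a geNorm-style stability measure M: for each gene, the mean over all other candidates of the standard deviation, across samples, of the log2 expression ratio. It assumes linear-scale relative quantities as input; the function name and the toy data are illustrative and are not taken from the study.

```python
from math import log2
from statistics import mean, stdev


def genorm_m(expr):
    """Return {gene: M} where lower M means more stable expression.

    expr: dict mapping gene name -> list of relative quantities
    (linear scale), with samples in the same order for every gene.
    """
    genes = list(expr)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log-ratio of gene j to gene k in each sample.
            ratios = [log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Toy data (hypothetical): A and B vary together, C does not.
toy = {"A": [1, 2, 4], "B": [2, 4, 8], "C": [1, 1, 8]}
print(genorm_m(toy))
```

Genes whose ratios to the other candidates stay constant across samples get a low M. The full geNorm procedure also removes the least stable gene stepwise and recomputes M, which this sketch omits.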
Identification and Evaluation of Reliable Reference Genes for Quantitative Real-Time PCR Analysis in Tea Plant (Camellia sinensis (L.) O. Kuntze)

PubMed Central

Hao, Xinyuan; Horvath, David P.; Chao, Wun S.; Yang, Yajun; Wang, Xinchao; Xiao, Bin

2014-01-01

Reliable reference selection for the accurate quantification of gene expression under various experimental conditions is a crucial step in qRT-PCR normalization. To date, only a few housekeeping genes have been identified and used as reference genes in tea plant. The validity of those reference genes is not clear since their expression stabilities have not been rigorously examined. To identify more appropriate reference genes for qRT-PCR studies on tea plant, we examined the expression stability of 11 candidate reference genes from three different sources: the orthologs of Arabidopsis traditional reference genes and stably expressed genes identified from whole-genome GeneChip studies, together with three housekeeping genes commonly used in tea plant research. We evaluated the transcript levels of these genes in 94 experimental samples. The expression stabilities of these 11 genes were ranked using four different computation programs including geNorm, Normfinder, BestKeeper, and the comparative ∆CT method. Results showed that the three commonly used housekeeping genes of CsTUBULIN1, CsACINT1 and Cs18S rRNA1 together with CsUBQ1 were the most unstable genes in all sample ranking order. However, CsPTB1, CsEF1, CsSAND1, CsCLATHRIN1 and CsUBC1 were the top five appropriate reference genes for qRT-PCR analysis in complex experimental conditions. PMID:25474086
Comparative genomic and transcriptomic analysis of selected fatty acid biosynthesis genes and CNL disease resistance genes in oil palm

PubMed Central

Rosli, Rozana; Amiruddin, Nadzirah; Ab Halim, Mohd Amin; Chan, Pek-Lan; Chan, Kuang-Lim; Azizi, Norazah; Morris, Priscilla E.; Leslie Low, Eng-Ti; Ong-Abdullah, Meilina; Sambanthamurthi, Ravigadevi; Singh, Rajinder

2018-01-01

Comparative genomics and transcriptomic analyses were performed on two agronomically important groups of genes from oil palm versus other major crop species and the model organism, Arabidopsis thaliana. The first analysis was of two gene families with key roles in regulation of oil quality and in particular the accumulation of oleic acid, namely stearoyl ACP desaturases (SAD) and acyl-acyl carrier protein (ACP) thioesterases (FAT). In both cases, these were found to be large gene families with complex expression profiles across a wide range of tissue types and developmental stages. The detailed classification of the oil palm SAD and FAT genes has enabled the updating of the latest version of the oil palm gene model. The second analysis focused on disease resistance (R) genes in order to elucidate possible candidates for breeding of pathogen tolerance/resistance. Ortholog analysis showed that 141 out of the 210 putative oil palm R genes had homologs in banana and rice. These genes formed 37 clusters with 634 orthologous genes. Classification of the 141 oil palm R genes showed that the genes belong to the Kinase (7), CNL (95), MLO-like (8), RLK (3) and Others (28) categories. The CNL R genes formed eight clusters. Expression data for selected R genes also identified potential candidates for breeding of disease resistance traits. Furthermore, these findings can provide information about the species evolution as well as the identification of agronomically important genes in oil palm and other major crops. PMID:29672525
Gene therapy in animal models of autosomal dominant retinitis pigmentosa

PubMed Central

Rossmiller, Brian; Mao, Haoyu

2012-01-01

Gene therapy for dominantly inherited genetic disease is more difficult than gene-based therapy for recessive disorders, which can be treated with gene supplementation. Treatment of dominant disease may require gene supplementation partnered with suppression of the expression of the mutant gene either at the DNA level, by gene repair, or at the RNA level by RNA interference or transcriptional repression. In this review, we examine some of the gene delivery approaches used to treat animal models of autosomal dominant retinitis pigmentosa, focusing on those models associated with mutations in the gene for rhodopsin. We conclude that combinatorial approaches have the greatest promise for success. PMID:23077406
Monoallelic expression of the human FOXP2 speech gene

PubMed Central

Adegbola, Abidemi A.; Cox, Gerald F.; Bradshaw, Elizabeth M.; Hafler, David A.; Gimelbrant, Alexander; Chess, Andrew

2015-01-01

The recent descriptions of widespread random monoallelic expression (RMAE) of genes distributed throughout the autosomal genome indicate that there are more genes subject to RMAE on autosomes than the number of genes on the X chromosome where X-inactivation dictates RMAE of X-linked genes. Several of the autosomal genes that undergo RMAE have independently been implicated in human Mendelian disorders. Thus, parsing the relationship between allele-specific expression of these genes and disease is of interest. Mutations in the human forkhead box P2 gene, FOXP2, cause developmental verbal dyspraxia with profound speech and language deficits. Here, we show that the human FOXP2 gene undergoes RMAE. Studying an individual with developmental verbal dyspraxia, we identify a deletion 3 Mb away from the FOXP2 gene, which impacts FOXP2 gene expression in cis. Together these data suggest the intriguing possibility that RMAE impacts the haploinsufficiency phenotypes observed for FOXP2 mutations. PMID:25422445
Monoallelic expression of the human FOXP2 speech gene.

PubMed

Adegbola, Abidemi A; Cox, Gerald F; Bradshaw, Elizabeth M; Hafler, David A; Gimelbrant, Alexander; Chess, Andrew

2015-06-02

The recent descriptions of widespread random monoallelic expression (RMAE) of genes distributed throughout the autosomal genome indicate that there are more genes subject to RMAE on autosomes than the number of genes on the X chromosome where X-inactivation dictates RMAE of X-linked genes. Several of the autosomal genes that undergo RMAE have independently been implicated in human Mendelian disorders. Thus, parsing the relationship between allele-specific expression of these genes and disease is of interest. Mutations in the human forkhead box P2 gene, FOXP2, cause developmental verbal dyspraxia with profound speech and language deficits. Here, we show that the human FOXP2 gene undergoes RMAE. Studying an individual with developmental verbal dyspraxia, we identify a deletion 3 Mb away from the FOXP2 gene, which impacts FOXP2 gene expression in cis. Together these data suggest the intriguing possibility that RMAE impacts the haploinsufficiency phenotypes observed for FOXP2 mutations.
Reference genes for measuring mRNA expression.

PubMed

Dundas, Jitesh; Ling, Maurice

2012-12-01

The aim of this review is to find answers to some of the questions surrounding reference genes and their reliability for quantitative experiments. Reference genes are assumed to be expressed at a constant level over a range of conditions, such as temperature. These genes, such as GAPDH and beta-actin, are used extensively for gene expression studies using techniques like quantitative PCR. There have been several studies carried out on identifying reference genes. However, a lot of evidence indicates issues with the general suitability of these genes. Recent studies have shown that different factors, including the environment and methods, play an important role in changing the expression levels of the reference genes. Thus, we conclude that there is no reference gene that can be deemed suitable for all experimental conditions. In addition, we believe that every experiment will require the scientific evaluation and selection of the best candidate gene for use as a reference gene to obtain reliable scientific results.
Microarray-based cancer prediction using soft computing approach.

PubMed

Wang, Xiaosheng; Gotoh, Osamu

2009-05-26

One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes to construct accurate prediction models from thousands or tens of thousands of genes. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply the simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective and robust. Moreover, our models are interpretable because they are based on decision rules. Our results demonstrate that very simple models may perform well on molecular cancer prediction and that important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.
Non-functional genes repaired at the RNA level.

PubMed

Burger, Gertraud

2016-01-01

Genomes and genes continuously evolve. Gene sequences undergo substitutions, deletions or nucleotide insertions; mobile genetic elements invade genomes and interleave in genes; chromosomes break, even within genes, and pieces reseal in reshuffled order. To maintain functional gene products and assure an organism's survival, two principal strategies are used: either repair of the gene itself or of its product. I will introduce common types of gene aberrations and how gene function is restored secondarily, and then focus on systematically fragmented genes found in a poorly studied protist group, the diplonemids. Expression of their broken genes involves restitching of pieces at the RNA level, and substantial RNA editing, to compensate for point mutations. I will conclude with thoughts on how such a grotesquely unorthodox system may have evolved, and why this group of organisms has persisted and thrived for tens of millions of years. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

PubMed

Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

2013-01-01

Detecting biclusters from expression data is useful, since biclusters are coexpressed genes under only part of all given experimental conditions. We present a software called SiBIC, which from a given expression dataset, first exhaustively enumerates biclusters, which are then merged into rather independent biclusters, which finally are used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.

Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

PubMed

Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

2014-01-01

Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
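The abstract relies on Dynamic Time Warping (DTW) to compare expression time series but does not spell out the computation. The sketch below shows the classic dynamic-programming form of DTW with an absolute-difference local cost. It illustrates plain DTW only: how the authors combine it with KNN imputation and Gene Ontology into their DTW-GO distance is not described in the abstract and is not reproduced here.

```python
def dtw_distance(x, y):
    """Classic DTW distance between two numeric sequences.

    Uses |x_i - y_j| as the local cost and allows match,
    insertion and deletion steps. Returns inf if either
    sequence is empty.
    """
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] = best cumulative cost aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in x only
                cost[i][j - 1],      # advance in y only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Two profiles with the same shape but a stretched middle time point.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because DTW allows one time point to align with several, two genes whose profiles have the same shape but are shifted or stretched in time can still receive a small distance, which is why it suits time-series expression data.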
[High gene conversion frequency between genes encoding 2-deoxyglucose-6-phosphate phosphatase in 3 Saccharomyces species].

PubMed

Piscopo, Sara-Pier; Drouin, Guy

2014-05-01

Gene conversions are nonreciprocal sequence exchanges between genes. They are relatively common in Saccharomyces cerevisiae, but few studies have investigated the evolutionary fate of gene conversions or their functional impacts. Here, we analyze the evolution and impact of gene conversions between the two genes encoding 2-deoxyglucose-6-phosphate phosphatase in S. cerevisiae, Saccharomyces paradoxus and Saccharomyces mikatae. Our results demonstrate that the last half of these genes is subject to gene conversions among these three species. The greater similarity and the greater percentage of GC nucleotides in the converted regions, as well as the absence of long regions of adjacent common converted sites, suggest that these gene conversions are frequent and occur independently in all three species. The high frequency of these conversions probably results from the fact that they have little impact on the protein sequences encoded by these genes.
A PLSPM-Based Test Statistic for Detecting Gene-Gene Co-Association in Genome-Wide Association Study with Case-Control Design

PubMed Central

Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

2013-01-01

For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to effects due not only to the traditional interaction under a nearly independent condition but also to the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic performs better than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods. PMID:23620809
GenePRIMP: A Gene Prediction Improvement Pipeline For Prokaryotic Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kyrpides, Nikos C.; Ivanova, Natalia N.; Pati, Amrita

2010-07-08

GenePRIMP (Gene Prediction Improvement Pipeline, http://geneprimp.jgi-psf.org) is a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missing genes, and split genes. We show that manual curation of gene models using the anomaly reports generated by GenePRIMP improves their quality and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome sequencing and annotation technologies. Keywords in context: Gene model, Quality Control, Translation start sites, Automatic correction. Hardware requirements: PC, MAC; Operating System: UNIX/LINUX; Compiler/Version: Perl 5.8.5 or higher; Special requirements: NCBI Blast and nr installation; File Types: Source Code, Executable module(s), Sample problem input data; installation instructions other; programmer documentation. Location/transmission: http://geneprimp.jgi-psf.org/gp.tar.gz
Examining the process of de novo gene birth: an educational primer on "integration of new genes into cellular networks, and their structural maturation".

PubMed

Frietze, Seth; Leatherman, Judith

2014-03-01

New genes that arise from modification of the noncoding portion of a genome rather than being duplicated from parent genes are called de novo genes. These genes, identified by their brief evolution and lack of parent genes, provide an opportunity to study the timeframe in which emerging genes integrate into cellular networks, and how the characteristics of these genes change as they mature into bona fide genes. An article by G. Abrusán provides an opportunity to introduce students to fundamental concepts in evolutionary and comparative genetics and to provide a technical background by which to discuss systems biology approaches when studying the evolutionary process of gene birth. Basic background needed to understand the Abrusán study and details on comparative genomic concepts tailored for a classroom discussion are provided, including discussion questions and a supplemental exercise on navigating a genome database.
A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design.

PubMed

Zhang, Xiaoshuai; Yang, Xiaowei; Yuan, Zhongshang; Liu, Yanxun; Li, Fangyu; Peng, Bin; Zhu, Dianwen; Zhao, Jinghua; Xue, Fuzhong

2013-01-01

For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to effects due not only to the traditional interaction under a nearly independent condition but also to the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic performs better than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods.
Regulation of gene expression in plasmid ColE1: delayed expression of the kil gene.

PubMed Central

Zhang, S P; Yan, L F; Zubay, G

1988-01-01

cea, imm, and kil are a cluster of three functionally related genes of the plasmid ColE1. The cea and kil genes are in the same inducible operon, with transcription being initiated from a promoter adjacent to the cea gene. The imm gene is located between the cea and kil genes, but it is transcribed in the opposite direction. Complementary interaction between the imm mRNA and the anti-imm sequences in the middle of the cea-kil transcript causes a pronounced delay in expression of the kil gene when the cea-kil operon is induced. A segment in the overlapping region between the cea and imm genes causes delayed expression of the kil gene in the absence of imm gene transcription. This delay effect increases the yields of colicin synthesized in induced cells. PMID:3142845
Gene expression variability in human hepatic drug metabolizing enzymes and transporters.

PubMed

Yang, Lun; Price, Elvin T; Chang, Ching-Wei; Li, Yan; Huang, Ying; Guo, Li-Wu; Guo, Yongli; Kaput, Jim; Shi, Leming; Ning, Baitang

2013-01-01

Interindividual variability in the expression of drug-metabolizing enzymes and transporters (DMETs) in human liver may contribute to interindividual differences in drug efficacy and adverse reactions. Published studies that analyzed variability in the expression of DMET genes were limited by sample sizes and the number of genes profiled. We systematically analyzed the expression of 374 DMETs from a microarray data set consisting of gene expression profiles derived from 427 human liver samples. The standard deviation of interindividual expression for DMET genes was much higher than that for non-DMET genes. The 20 DMET genes with the largest variability in the expression provided examples of the interindividual variation. Gene expression data were also analyzed using network analysis methods, which delineates the similarities of biological functionalities and regulation mechanisms for these highly variable DMET genes. Expression variability of human hepatic DMET genes may affect drug-gene interactions and disease susceptibility, with concomitant clinical implications.
5. OVERHEAD VIEW OF GENE CAMP LOOKING SOUTH. GENE PUMP ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

5. OVERHEAD VIEW OF GENE CAMP LOOKING SOUTH. GENE PUMP PLANT IS AT CENTER WITH ADMINISTRATIVE COMPLEX IN FOREGROUND AND RESIDENTIAL AREA BEYOND PLANT. - Gene Pump Plant, South of Gene Wash Reservoir, 2 miles west of Whitsett Pump Plant, Parker Dam, San Bernardino County, CA
Rational confederation of genes and diseases: NGS interpretation via GeneCards, MalaCards and VarElect.

PubMed

Rappaport, Noa; Fishilevich, Simon; Nudel, Ron; Twik, Michal; Belinky, Frida; Plaschkes, Inbar; Stein, Tsippi Iny; Cohen, Dana; Oz-Levi, Danit; Safran, Marilyn; Lancet, Doron

2017-08-18

A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient's disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness: aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status (high-quality gene-disease pairs coming from manually curated trustworthy sources), which includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards' diverse gene-to-gene relationships, including SuperPaths (integrated biological pathways from 12 information sources). We are currently adding an important information layer in the form of "disease SuperPaths", generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease-disease relationships.
The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources. MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.
The genome of Paenibacillus sabinae T27 provides insight into evolution, organization and functional elucidation of nif and nif-like genes.

PubMed

Li, Xinxin; Deng, Zhiping; Liu, Zhanzhi; Yan, Yongliang; Wang, Tianshu; Xie, Jianbo; Lin, Min; Cheng, Qi; Chen, Sanfeng

2014-08-27

Most biological nitrogen fixation is catalyzed by the molybdenum nitrogenase. This enzyme is a complex which contains the MoFe protein encoded by nifDK and the Fe protein encoded by nifH. In addition to nifHDK, nifHDK-like genes were found in some Archaea and Firmicutes, but their function is unclear. We sequenced the genome of Paenibacillus sabinae T27. A total of 4,793 open reading frames were predicted from its 5.27 Mb genome. The genome of P. sabinae T27 contains fifteen nitrogen fixation (nif) genes, including three nifH, one nifD, one nifK, four nifB, two nifE, two nifN, one nifX and one nifV. Of the 15 nif genes, eight nif genes (nifB, nifH, nifD, nifK, nifE, nifN, nifX and nifV) and two non-nif genes (orf1 and hesA) form a complete nif gene cluster. In addition to the nif genes, there are nitrogenase-like genes, including two nifH-like genes and five pairs of nifDK-like genes. IS elements on the flanking regions of nif and nif-like genes imply that these genes might have been obtained by horizontal gene transfer. Phylogenies of the concatenated 8 nif gene (nifB, nifH, nifD, nifK, nifE, nifN, nifX and nifV) products suggest that P. sabinae T27 is closely related to Frankia. RT-PCR analysis showed that the complete nif gene cluster is organized as an operon. We demonstrated that the complete nif gene cluster under the control of a σ70-dependent promoter enabled Escherichia coli JM109 to fix nitrogen. Also, here for the first time we demonstrated that, unlike nif genes, the transcription of nifHDK-like genes was not regulated by ammonium and oxygen, and that a nifH-like or nifD-like gene could not restore the nitrogenase activity of Klebsiella pneumoniae nifH- and nifD- mutant strains, respectively, suggesting that nifHDK-like genes were not involved in nitrogen fixation. Our data and analysis reveal the contents and distribution of nif and nif-like genes and contribute to the study of the evolutionary history of nitrogen fixation in Paenibacillus.
For the first time we demonstrated that the transcription of nifHDK-like genes was not regulated by ammonium and oxygen and that nifHDK-like genes were not involved in nitrogen fixation.
Genotype-phenotype relationships in human red/green color-vision defects: molecular and psychophysical studies.

PubMed Central

Deeb, S S; Lindsey, D T; Hibiya, Y; Sanocki, E; Winderickx, J; Teller, D Y; Motulsky, A G

1992-01-01

The relationship between the molecular structure of the X-linked red and green visual pigment genes and color-vision phenotype as ascertained by anomaloscopy was studied in 64 color-defective males. The great majority of red-green defects were associated with either the deletion of the green-pigment gene or the formation of 5' red-green hybrid genes or 5' green-red hybrid genes. A rapid PCR-based method allowed detection of hybrid genes, including those undetectable by Southern blot analysis, as well as more precise localization of the fusion points in hybrid genes. Protan color-vision defects appeared always associated with 5' red-green hybrid genes. Carriers of single red-green hybrid genes with fusion in introns 1-4 were protanopes. However, carriers of hybrid genes with red-green fusions in introns 2, 3, or 4 in the presence of additional normal green genes manifested as either protanopes or protanomalous trichromats, with the majority being protanomalous. Deutan defects were associated with green-pigment gene deletions, with 5' green-red hybrid genes, or, rarely, with 5' green-red-green hybrid genes. Complete green-pigment gene deletions or green-red fusions in intron 1 were usually associated with deuteranopia, although we unexpectedly found three carriers of a single red-pigment gene without any green-pigment genes to be deuteranomalous trichromats. All but one of the other deuteranomalous subjects had green-red hybrid genes with intron 1, 2, 3, or 4 fusions, as well as several normal green-pigment genes. The one exception had a grossly normal gene array, presumably with a more subtle mutation. Amino acid differences in exon 5 largely determine whether a hybrid gene will be more redlike or more greenlike in phenotype. Various discrepancies as to severity (dichromacy or trichromacy) remain unexplained but may arise because of variability of expression, postreceptoral variation, or both. 
When phenotypic color-vision defects exist, the kind of defect (protan or deutan) can be predicted by molecular analysis. Red-green hybrid genes are probably always associated with protan color-vision defects, while the presence of green-red hybrid genes may not always manifest phenotypically with color-vision defects. Four subjects who were found to have 5' green-red hybrid genes in addition to normal red- and green-pigment genes had normal color vision as determined by anomaloscopy. These were discovered among a group of 129 Caucasian males who had been recruited as volunteers for a vision study.(ABSTRACT TRUNCATED AT 400 WORDS) PMID:1415215
Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies

PubMed Central

2015-01-01

Background Most models of genome evolution concern either genetic sequences, gene content or gene order. They sometimes integrate two of the three levels, but rarely all three. Probabilistic models of gene order evolution usually have to assume constant gene content or adopt a presence/absence coding of gene neighborhoods which is blind to complex events modifying gene content. Results We propose a probabilistic evolutionary model for gene neighborhoods, allowing genes to be inserted, duplicated or lost. It uses reconciled phylogenies, which integrate sequence and gene content evolution. We are then able to optimize parameters such as phylogeny branch lengths, or probabilistic laws depicting the diversity of susceptibility of syntenic regions to rearrangements. We reconstruct a structure for ancestral genomes by optimizing a likelihood, keeping track of all evolutionary events at the level of gene content and gene synteny. Ancestral syntenies are associated with a probability of presence. We implemented the model with the restriction that at most one gene duplication separates two gene speciations in reconciled gene trees. We reconstruct ancestral syntenies on a set of 12 Drosophila genomes, and compare the evolutionary rates along the branches and along the sites. We compare with a parsimony method and find a significant number of results not supported by the posterior probability. The model is implemented in the Bio++ library. It thus benefits from and enriches the classical models and methods for molecular evolution. PMID:26452018
Gene selection for tumor classification using neighborhood rough sets and entropy measures.

PubMed

Chen, Yumin; Zhang, Zunjun; Zheng, Jianzhong; Ma, Ying; Xue, Yu

2017-03-01

With the development of bioinformatics, tumor classification from gene expression data has become an important technology for cancer diagnosis. Since gene expression data often contain thousands of genes and a small number of samples, gene selection becomes a key step for tumor classification. Attribute reduction of rough sets has been successfully applied to gene selection, as it is data-driven and requires no additional information. However, the traditional rough set method deals with discrete data only. Gene expression data containing real-valued or noisy data are usually discretized in a preprocessing step, which may result in poor classification accuracy. In this paper, we propose a novel gene selection method based on the neighborhood rough set model, which can deal with real-valued data whilst maintaining the original gene classification information. Moreover, this paper introduces an entropy measure under the framework of neighborhood rough sets for tackling the uncertainty and noise of gene expression data. Using this measure can lead to the discovery of compact gene subsets. Finally, a gene selection algorithm is designed based on neighborhood granules and the entropy measure. Experiments on two gene expression data sets show that the proposed gene selection is an effective method for improving the accuracy of tumor classification. Copyright © 2017 Elsevier Inc. All rights reserved.
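The neighborhood idea in the abstract above can be illustrated with a minimal sketch. It computes the standard neighborhood dependency degree (the fraction of samples whose delta-neighborhood contains a single class), which is a common building block of neighborhood rough set methods; it is not the authors' algorithm and omits their entropy measure. The function name, the Euclidean distance, the `delta` radius, and the toy data are all assumptions made for illustration.

```python
def neighborhood_dependency(samples, labels, genes, delta):
    """Fraction of samples whose delta-neighborhood is pure in class label.

    samples: list of expression vectors (one list of floats per sample)
    labels:  class label per sample
    genes:   indices of the genes (columns) used to measure distance
    delta:   neighborhood radius (hypothetical tuning parameter)
    """
    def distance(a, b):
        # Euclidean distance restricted to the selected genes.
        return sum((a[g] - b[g]) ** 2 for g in genes) ** 0.5

    consistent = 0
    for i, x in enumerate(samples):
        # The neighborhood granule of sample i (it always contains i itself).
        neighborhood = [j for j, y in enumerate(samples) if distance(x, y) <= delta]
        # Sample i is in the positive region if all neighbors share its label.
        if all(labels[j] == labels[i] for j in neighborhood):
            consistent += 1
    return consistent / len(samples)


# Toy data (hypothetical): 4 samples, 2 genes, 2 classes.
toy_samples = [[0.1, 0.5], [0.2, 0.95], [0.8, 0.6], [0.9, 0.4]]
toy_labels = [0, 0, 1, 1]

# Gene 0 separates the classes; gene 1 mixes them.
dep_gene0 = neighborhood_dependency(toy_samples, toy_labels, [0], delta=0.3)
dep_gene1 = neighborhood_dependency(toy_samples, toy_labels, [1], delta=0.3)
```

A greedy selector would repeatedly add the gene that raises this dependency the most; real-valued data are used directly, with no discretization step.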
Microarray analysis of potential genes in the pathogenesis of recurrent oral ulcer.

PubMed

Han, Jingying; He, Zhiwei; Li, Kun; Hou, Lu

2015-01-01

Recurrent oral ulcer seriously threatens patients' daily life and health. This study investigated potential genes and pathways that participate in the pathogenesis of recurrent oral ulcer by high-throughput bioinformatic analysis. RT-PCR and Western blot were applied to further verify the effect of the screened interleukins. Recurrent oral ulcer related genes were collected from websites and papers, and further identified from Human Genome 280 6.0 microarray data. The pathways of recurrent oral ulcer related genes were obtained through chip hybridization. RT-PCR was applied to test four recurrent oral ulcer related genes to verify the microarray data. Data transformation, scatter plot, clustering analysis, and expression pattern analysis were used to analyze recurrent oral ulcer related gene expression changes. The recurrent oral ulcer gene microarray was successfully established. The microarray showed that 551 genes were involved in recurrent oral ulcer activity and 196 genes were recurrent oral ulcer related genes. Of them, 76 genes were up-regulated, 62 genes were down-regulated, and 58 genes were up-/down-regulated. Total expression level was up-regulated 752 times (60%) and down-regulated 485 times (40%). IL-2 plays an important role in the occurrence, development and recurrence of recurrent oral ulcer at the mRNA and protein levels. Gene microarray can be used to analyze potential genes and pathways in recurrent oral ulcer. IL-2 may be involved in the pathogenesis of recurrent oral ulcer.
snpGeneSets: An R Package for Genome-Wide Study Annotation

PubMed Central

Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

2016-01-01

Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
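Gene set enrichment of the kind mentioned in the abstract above is often tested with a hypergeometric tail probability. The following is a minimal sketch of that generic test, not the snpGeneSets API (which is an R package and is not reproduced here); the function name and parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe_size, set_size, hits_size).

    universe_size: number of genes that could have been selected
    set_size:      number of universe genes that belong to the gene set
    hits_size:     number of genes selected by the study
    overlap:       number of selected genes that fall in the gene set
    """
    total = comb(universe_size, hits_size)
    upper = min(set_size, hits_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 genes in total, a 5-gene set, 5 selected genes,
# all 5 of which fall in the set.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
```

A small p-value suggests the selected genes overlap the set more than chance alone would predict; in practice the values would be corrected for testing many gene sets.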
Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

PubMed

Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

2015-06-01

Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
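The clustering step described above (k-means on the weighted number of links, with the highest cluster taken as leader genes) can be sketched in one dimension as follows. This is an illustrative sketch only: the gene names and scores are placeholders, the number of clusters and the evenly spaced initial centroids are assumptions, and the study's actual settings are not given in the abstract.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain 1-D k-means (k >= 2) with deterministic, evenly spaced initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def leader_genes(scores, k=3):
    """Return the genes assigned to the cluster with the highest centroid."""
    genes = list(scores)
    labels, centroids = kmeans_1d([scores[g] for g in genes], k)
    top = max(range(k), key=lambda c: centroids[c])
    return sorted(g for g, lab in zip(genes, labels) if lab == top)


# Placeholder scores standing in for weighted numbers of links.
example_scores = {
    "GENE_A": 9.5, "GENE_B": 9.0,
    "GENE_C": 4.0, "GENE_D": 3.5,
    "GENE_E": 0.5, "GENE_F": 1.0,
}
leaders = leader_genes(example_scores, k=3)
```

With these placeholder scores the two highest-scoring genes form the top cluster; the result depends on the choice of k, which this sketch fixes arbitrarily.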
Extensive Gains and Losses of Olfactory Receptor Genes in Mammalian Evolution

PubMed Central

Niimura, Yoshihito; Nei, Masatoshi

2007-01-01

Odor perception in mammals is mediated by a large multigene family of olfactory receptor (OR) genes. The number of OR genes varies extensively among different species of mammals, and most species have a substantial number of pseudogenes. To gain some insight into the evolutionary dynamics of mammalian OR genes, we identified the entire set of OR genes in platypuses, opossums, cows, dogs, rats, and macaques and studied the evolutionary change of the genes together with those of humans and mice. We found that platypuses and primates have <400 functional OR genes while the other species have 800–1,200 functional OR genes. We then estimated the numbers of gains and losses of OR genes for each branch of the phylogenetic tree of mammals. This analysis showed that (i) gene expansion occurred in the placental lineage each time after it diverged from monotremes and from marsupials and (ii) hundreds of gains and losses of OR genes have occurred in an order-specific manner, making the gene repertoires highly variable among different orders. It appears that the number of OR genes is determined primarily by the functional requirement for each species, but once the number reaches the required level, it fluctuates by random duplication and deletion of genes. This fluctuation seems to have been aided by the stochastic nature of OR gene expression. PMID:17684554
Microarray Analysis of Gene Expression Alteration in Human Middle Ear Epithelial Cells Induced by Asian Sand Dust.

PubMed

Go, Yoon Young; Park, Moo Kyun; Kwon, Jee Young; Seo, Young Rok; Chae, Sung-Won; Song, Jae-Jun

2015-12-01

The primary aim of this study was to evaluate the gene expression profile of Asian sand dust (ASD)-treated human middle ear epithelial cells (HMEECs) using microarray analysis. HMEECs were treated with ASD (400 µg/mL) and total RNA was extracted for microarray analysis. Molecular pathways among differentially expressed genes were further analyzed. For selected genes, the changes in gene expression were confirmed by real-time polymerase chain reaction. A total of 1,274 genes were differentially expressed by ASD. Among them, 1,138 genes were up-regulated 2-fold, whereas 136 genes were down-regulated 2-fold. Up-regulated genes were mainly involved in cellular processes, including apoptosis, cell differentiation, and cell proliferation. Down-regulated genes affected cellular processes, including apoptosis, cell cycle, cell differentiation, and cell proliferation. Ten genes (ADM, CCL5, EDN1, EGR1, FOS, GHRL, JUN, SOCS3, TNF, and TNFSF10) were identified as main modulators among the up-regulated genes. A total of 11 genes (CSF3, DKK1, FOSL1, FST, TERT, MMP13, PTHLH, SPRY2, TGFBR2, THBS1, and TIMP1) acted as main components of the pathways associated with the 2-fold down-regulated genes. We identified the differentially expressed genes in ASD-treated HMEECs. Our work indicates that air pollutants such as ASD may play an important role in the pathogenesis of otitis media.
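The 2-fold criterion mentioned above amounts to a simple ratio test per gene. A minimal sketch follows, assuming positive, already normalised intensities for control and treated samples; the function name, the threshold handling, and the example values are hypothetical, and the study's actual normalisation and statistical filtering are not described in the abstract.

```python
def classify_fold_change(control, treated, threshold=2.0):
    """Label each gene 'up', 'down' or 'unchanged' by treated/control ratio.

    control, treated: dicts mapping gene name -> positive expression intensity
    threshold:        fold change required to call a gene differentially expressed
    """
    calls = {}
    for gene, baseline in control.items():
        ratio = treated[gene] / baseline
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= 1.0 / threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Placeholder intensities, not data from the study.
control_example = {"gene1": 100.0, "gene2": 100.0, "gene3": 100.0}
treated_example = {"gene1": 250.0, "gene2": 40.0, "gene3": 120.0}
calls_example = classify_fold_change(control_example, treated_example)
```

Here gene1 (ratio 2.5) is called up, gene2 (ratio 0.4) down, and gene3 (ratio 1.2) unchanged.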
The repertoire of bitter taste receptor genes in canids.

PubMed

Shang, Shuai; Wu, Xiaoyang; Chen, Jun; Zhang, Huanxin; Zhong, Huaming; Wei, Qinguo; Yan, Jiakuo; Li, Haotian; Liu, Guangshuai; Sha, Weilai; Zhang, Honghai

2017-07-01

Bitter taste receptors (Tas2rs) play important roles in mammalian defense mechanisms by helping animals detect and avoid toxins in food. Although Tas2r genes have been widely studied in several mammals, minimal research has been performed in canids. To analyze the genetic basis of Tas2r genes in canids, we first identified Tas2r genes in the wolf, maned wolf, red fox, corsac fox, Tibetan fox, fennec fox, dhole and African hunting dog. A total of 183 Tas2r genes, consisting of 118 intact genes, 6 partial genes and 59 pseudogenes, were detected. Differences in the pseudogenes were observed among nine canid species. For example, Tas2r4 was a pseudogene in the dog but might play a functional role in other canid species. The Tas2r42 and Tas2r10 genes were pseudogenes in the maned wolf and dhole, respectively, and the Tas2r5 and Tas2r34 genes were pseudogenes in the African hunting dog; however, these genes were intact genes in other canid species. The differences in Tas2r pseudogenes among canids might suggest that the loss of intact Tas2r genes in canid species is species-dependent. We further compared the 183 Tas2r genes identified in this study with Tas2r genes from ten additional carnivorous species to evaluate the potential influence of diet on the evolution of the Tas2r gene repertoire. Phylogenetic analysis revealed that most of the Tas2r genes from the 18 species intermingled across the tree, suggesting that Tas2r genes are conserved among carnivores. Within canids, we found that some Tas2r genes corresponded to the traditional taxonomic groupings, while some did not. PIC analysis showed that the number of Tas2r genes in carnivores exhibited no positive correlation with diet composition, which might be due to the limited number of carnivores included in our study.

Selection of novel reference genes for use in the human central nervous system: a BrainNet Europe Study.

PubMed

Durrenberger, Pascal F; Fernando, Francisca S; Magliozzi, Roberta; Kashefi, Samira N; Bonnert, Timothy P; Ferrer, Isidro; Seilhean, Danielle; Nait-Oumesmar, Brahim; Schmitt, Andrea; Gebicke-Haerter, Peter J; Falkai, Peter; Grünblatt, Edna; Palkovits, Miklos; Parchi, Piero; Capellari, Sabina; Arzberger, Thomas; Kretzschmar, Hans; Roncaroli, Federico; Dexter, David T; Reynolds, Richard

2012-12-01

The use of an appropriate reference gene to ensure accurate normalisation is crucial for the correct quantification of gene expression using qPCR assays and RNA arrays. The main criterion for a gene to qualify as a reference gene is a stable expression across various cell types and experimental settings. Several reference genes are commonly in use but more and more evidence reveals variations in their expression due to the presence of on-going neuropathological disease processes, raising doubts concerning their use. We conducted an analysis of genome-wide changes of gene expression in the human central nervous system (CNS) covering several neurological disorders and regions, including the spinal cord, and were able to identify a number of novel stable reference genes. We tested the stability of expression of eight novel (ATP5E, AARS, GAPVD1, CSNK2B, XPNPEP1, OSBP, NAT5 and DCTN2) and four more commonly used (BECN1, GAPDH, QARS and TUBB) reference genes in a smaller cohort using RT-qPCR. The most stable genes out of the 12 reference genes were tested as normaliser to validate increased levels of a target gene in CNS disease. We found that in human post-mortem tissue the novel reference genes, XPNPEP1 and AARS, were efficient in replicating microarray target gene expression levels and that XPNPEP1 was more efficient as a normaliser than BECN1, which has been shown to change in expression as a consequence of neuronal cell loss. We provide herein one more suitable novel reference gene, XPNPEP1, with no current neuroinflammatory or neurodegenerative associations that can be used for quantitative gene expression studies with human CNS post-mortem tissue and also suggest a list of potential other candidates. These data also emphasise the importance of organ/tissue-specific stably expressed genes as reference genes for RNA studies.
Speciation genes in plants

PubMed Central

Rieseberg, Loren H.; Blackman, Benjamin K.

2010-01-01

Background Analyses of speciation genes – genes that contribute to the cessation of gene flow between populations – can offer clues regarding the ecological settings, evolutionary forces and molecular mechanisms that drive the divergence of populations and species. This review discusses the identities and attributes of genes that contribute to reproductive isolation (RI) in plants, compares them with animal speciation genes and investigates what these genes can tell us about speciation. Scope Forty-one candidate speciation genes were identified in the plant literature. Of these, seven contributed to pre-pollination RI, one to post-pollination, prezygotic RI, eight to hybrid inviability, and 25 to hybrid sterility. Genes, gene families and genetic pathways that were frequently found to underlie the evolution of RI in different plant groups include the anthocyanin pathway and its regulators (pollinator isolation), S RNase-SI genes (unilateral incompatibility), disease resistance genes (hybrid necrosis), chimeric mitochondrial genes (cytoplasmic male sterility), and pentatricopeptide repeat family genes (cytoplasmic male sterility). Conclusions The most surprising conclusion from this review is that identities of genes underlying both prezygotic and postzygotic RI are often predictable in a broad sense from the phenotype of the reproductive barrier. Regulatory changes (both cis and trans) dominate the evolution of pre-pollination RI in plants, whereas a mix of regulatory mutations and changes in protein-coding genes underlie intrinsic postzygotic barriers. Also, loss-of-function mutations and copy number variation frequently contribute to RI. Although direct evidence of positive selection on speciation genes is surprisingly scarce in plants, analyses of gene family evolution, along with theoretical considerations, imply an important role for diversifying selection and genetic conflict in the evolution of RI. 
Unlike in animals, however, most candidate speciation genes in plants exhibit intraspecific polymorphism, consistent with an important role for stochastic forces and/or balancing selection in development of RI in plants. PMID:20576737
The Rice B-Box Zinc Finger Gene Family: Genomic Identification, Characterization, Expression Profiling and Diurnal Analysis

PubMed Central

Huang, Jianyan; Zhao, Xiaobo; Weng, Xiaoyu; Wang, Lei; Xie, Weibo

2012-01-01

Background The B-box (BBX) -containing proteins are a class of zinc finger proteins that contain one or two B-box domains and play important roles in plant growth and development. The Arabidopsis BBX gene family has recently been re-identified and renamed. However, there has not been a genome-wide survey of the rice BBX (OsBBX) gene family until now. Methodology/Principal Findings In this study, we identified 30 rice BBX genes through a comprehensive bioinformatics analysis. Each gene was assigned a uniform nomenclature. We described the chromosome localizations, gene structures, protein domains, phylogenetic relationship, whole life-cycle expression profile and diurnal expression patterns of the OsBBX family members. Based on the phylogeny and domain constitution, the OsBBX gene family was classified into five subfamilies. The gene duplication analysis revealed that only chromosomal segmental duplication contributed to the expansion of the OsBBX gene family. The expression profile of the OsBBX genes was analyzed by Affymetrix GeneChip microarrays throughout the entire life-cycle of rice cultivar Zhenshan 97 (ZS97). In addition, microarray analysis was performed to obtain the expression patterns of these genes under light/dark conditions and after three phytohormone treatments. This analysis revealed that the expression patterns of the OsBBX genes could be classified into eight groups. Eight genes were regulated under the light/dark treatments, and eleven genes showed differential expression under at least one phytohormone treatment. Moreover, we verified the diurnal expression of the OsBBX genes using the data obtained from the Diurnal Project and qPCR analysis, and the results indicated that many of these genes had a diurnal expression pattern. Conclusions/Significance The combination of the genome-wide identification and the expression and diurnal analysis of the OsBBX gene family should facilitate additional functional studies of the OsBBX genes. PMID:23118960
Characterizing gene sets using discriminative random walks with restart on heterogeneous biological networks.

PubMed

Blatti, Charles; Sinha, Saurabh

2016-07-15

Analysis of co-expressed gene sets typically involves testing for enrichment of different annotations or 'properties' such as biological processes, pathways, transcription factor binding sites, etc., one property at a time. This common approach ignores any known relationships among the properties or the genes themselves. It is believed that known biological relationships among genes and their many properties may be exploited to more accurately reveal commonalities of a gene set. Previous work has sought to achieve this by building biological networks that combine multiple types of gene-gene or gene-property relationships, and performing network analysis to identify other genes and properties most relevant to a given gene set. Most existing network-based approaches for recognizing genes or annotations relevant to a given gene set collapse information about different properties to simplify (homogenize) the networks. We present a network-based method for ranking genes or properties related to a given gene set. Such related genes or properties are identified from among the nodes of a large, heterogeneous network of biological information. Our method involves a random walk with restarts, performed on an initial network with multiple node and edge types that preserve more of the original, specific property information than current methods that operate on homogeneous networks. In this first stage of our algorithm, we find the properties that are the most relevant to the given gene set and extract a subnetwork of the original network, comprising only these relevant properties. We then re-rank genes by their similarity to the given gene set, based on a second random walk with restarts, performed on the above subnetwork. We demonstrate the effectiveness of this algorithm for ranking genes related to Drosophila embryonic development and aggressive responses in the brains of social animals. DRaWR was implemented as an R package available at veda.cs.illinois.edu/DRaWR. 
Contact: blatti@illinois.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
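The building block named in the abstract above, a random walk with restarts, can be illustrated with a minimal Python sketch. This is not the DRaWR R package and it covers only a single walk, not the two-stage procedure. The toy network, the node names, the edge weights and the restart probability of 0.3 are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * r + (1 - restart) * W^T p until p stops changing.

    adjacency: dict node -> dict neighbor -> edge weight (hypothetical toy data)
    seeds:     set of query nodes that the walker jumps back to on restart
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the query gene set.
    r = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(r)
    for _ in range(max_iter):
        nxt = {n: restart * r[n] for n in nodes}
        for u in nodes:
            nbrs = adjacency[u]
            if not nbrs:
                # A node with no edges hands its probability back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * r[n]
                continue
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # Move along each edge in proportion to its weight.
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy heterogeneous network: gene nodes (g*) linked to property nodes.
network = {
    "g1": {"pathwayA": 1.0},
    "g2": {"pathwayA": 1.0},
    "g3": {"pathwayA": 1.0, "tfB": 1.0},
    "g4": {"tfB": 1.0},
    "pathwayA": {"g1": 1.0, "g2": 1.0, "g3": 1.0},
    "tfB": {"g3": 1.0, "g4": 1.0},
}
scores = random_walk_with_restart(network, seeds={"g1", "g2"})
ranked_genes = sorted((n for n in scores if n.startswith("g")),
                      key=scores.get, reverse=True)
```

In this toy network g3 shares a property directly with the query genes, so it is ranked above g4, which is reachable only through a second property.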
Unique mechanisms of sheng yu decoction (shèng yù tang) on ischemic stroke mice revealed by an integrated neurofunctional and transcriptome analysis.

PubMed

Hou, Yu-Chang; Lu, Chung-Kuang; Wang, Yea-Hwey; Chern, Chang-Ming; Liou, Kuo-Tong; Wang, Hsei-Wei; Shen, Yuh-Chiang

2013-10-01

Sheng Yu Decoction (Shèng Yù Tang; SYD) is a popular traditional Chinese medicine (TCM) remedy used clinically in treating cardiovascular and brain-related dysfunction; yet, its neuroprotective mechanisms are still unclear. Here, mice were subjected to an acute ischemic stroke to examine the efficacy and mechanisms of action of SYD by an integrated neurofunctional and transcriptome analysis. More than 80% of the mice died within 2 days after ischemic stroke with vehicle treatment. Treatments with SYD (1.0 g/kg, twice daily, orally or p.o.) and recombinant thrombolytic tissue plasminogen activator (rt-PA; 10 mg/kg, once daily, intravenously or i.v.) both significantly extended the lifespan as compared to that of the vehicle-treated stroke group. SYD successfully restored brain function, ameliorated cerebral infarction and oxidative stress, and significantly improved neurological deficits in mice with stroke. A genome-wide transcriptome analysis of brains from stroke mice showed that 162 of 2081 ischemia-induced probe sets were significantly influenced by SYD. Mining the functional modules and genetic networks of these 162 genes revealed that SYD significantly upregulated neuroprotective genes in the Wnt receptor signaling pathway (3 genes) and regulation of cell communication (7 genes), and downregulated destructive genes involved in response to stress (13 genes) and in the induction of inflammation (5 genes), cytokine production (4 genes), angiogenesis (3 genes), vasculature (6 genes) and blood vessel (5 genes) development, wound healing (7 genes), defense response (7 genes), chemotaxis (4 genes), immune response (7 genes), antigen processing and presenting (3 genes), and leukocyte-mediated cytotoxicity (2 genes).
Our results suggest that SYD could protect mice against ischemic stroke primarily through significantly downregulating the damaging genes involved in stress, inflammation, angiogenesis, blood vessel formation, immune responses, and wound healing, as well as upregulating the genes mediating neurogenesis and cell communication, which makes SYD beneficial for treating ischemic stroke.
Nucleotide Sequences of Genes Coding for Fimbrial Proteins in a Cryptic Genospecies of Haemophilus spp. Isolated from Neonatal and Genital Tract Infections

PubMed Central

Gousset, Nathalie; Rosenau, Agnes; Sizaret, Pierre-Yves; Quentin, Roland

1999-01-01

Nineteen isolates belonging to a cryptic genospecies of Haemophilus (referred to here as genital strains) isolated from genital tract infections (6 strains) and from neonatal infections (13 strains) were studied for fimbrial genes. Sixteen strains exhibit peritrichous fimbriae observed by electron microscopy. By PCR with primers corresponding to the extreme ends of the Haemophilus influenzae type b (Hib) hifA and hifD genes and Southern blotting, a hifA-like gene (named ghfA) and a hifD-like gene (named ghfD) were identified in 6 of the 19 strains. Five of these six strains were from the genital tracts of adults, and one was from a neonate. For each gene, the nucleotide sequence was identical for the six strains. A hifE-like gene (named ghfE) was amplified from only one of the 19 genital strains of Haemophilus, but the ghfE probe gave a signal in Southern hybridization with the five other strains positive for ghfA and ghfD. Therefore, these strains may carry a ghfE-like gene. The Hib fimbrial gene cluster is located between the purE and pepN genes as previously described. For the 13 genital Haemophilus strains that lack fimbrial genes, this region corresponds to a noncoding sequence. Another major fimbrial gene designated the fimbrin gene was previously identified in a nontypeable H. influenzae strain. A fimbrin-like gene was identified for all of our 19 genital strains. This gene is similar to the ompP5 gene of many Haemophilus strains. Therefore, other, unidentified genes may explain the piliation observed in electron microscopy on genital Haemophilus strains which do not possess LKP-like fimbrial genes. Fimbrial genes were significantly associated with strains isolated from the genital tract. They may confer on the strain the ability to survive in the genital tract. PMID:9864189
Lung-MAP: AZD4547 as Second-Line Therapy in Treating FGFR Positive Patients With Recurrent Stage IV Squamous Cell Lung Cancer

ClinicalTrials.gov

2017-12-13

FGFR1 Gene Amplification; FGFR1 Gene Mutation; FGFR2 Gene Amplification; FGFR2 Gene Mutation; FGFR3 Gene Amplification; FGFR3 Gene Mutation; Recurrent Squamous Cell Lung Carcinoma; Stage IV Squamous Cell Lung Carcinoma AJCC v7
40 CFR 174.513 - Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the...

Code of Federal Regulations, 2012 CFR

2012-07-01

... 40 Protection of Environment 25 2012-07-01 2012-07-01 false Potato Leaf Roll Virus Resistance Gene... Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the requirement of a tolerance. An... protectant Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene) in or on all food...
40 CFR 174.513 - Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the...

Code of Federal Regulations, 2014 CFR

2014-07-01

... 40 Protection of Environment 24 2014-07-01 2014-07-01 false Potato Leaf Roll Virus Resistance Gene... Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the requirement of a tolerance. An... protectant Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene) in or on all food...
40 CFR 174.513 - Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the...

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 24 2011-07-01 2011-07-01 false Potato Leaf Roll Virus Resistance Gene... Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the requirement of a tolerance. An... protectant Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene) in or on all food...
40 CFR 174.513 - Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the...

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 23 2010-07-01 2010-07-01 false Potato Leaf Roll Virus Resistance Gene... Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the requirement of a tolerance. An... protectant Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene) in or on all food...
40 CFR 174.513 - Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the...

Code of Federal Regulations, 2013 CFR

2013-07-01

... 40 Protection of Environment 25 2013-07-01 2013-07-01 false Potato Leaf Roll Virus Resistance Gene... Virus Resistance Gene (also known as orf1/orf2 gene); exemption from the requirement of a tolerance. An... protectant Potato Leaf Roll Virus Resistance Gene (also known as orf1/orf2 gene) in or on all food...
Construct and Compare Gene Coexpression Networks with DAPfinder and DAPview.

PubMed

Skinner, Jeff; Kotliarov, Yuri; Varma, Sudhir; Mine, Karina L; Yambartsev, Anatoly; Simon, Richard; Huyen, Yentram; Morgun, Andrey

2011-07-14

DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks and identify significant differences in pairwise gene-gene coexpression between two phenotypes. Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma experiments and microarray simulations demonstrate the utility of these tools. DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.
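One way to test whether a gene pair is differentially associated between two phenotypes is sketched below in Python. The abstract says only that several association metrics and tests are offered. The choice here of Pearson correlation with a Fisher z-transform comparison is an assumption, not a description of the plug-ins' internals, and the expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_correlation(x1, y1, x2, y2):
    """Compare the correlation of one gene pair between two phenotypes.

    Returns (r1, r2, z, p), where p is a two-sided p-value from the
    Fisher z-transform test for two independent correlations.
    Requires at least 4 samples per phenotype and |r| < 1.
    """
    r1, r2 = pearson(x1, y1), pearson(x2, y2)
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transform
    se = math.sqrt(1.0 / (len(x1) - 3) + 1.0 / (len(x2) - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal tail
    return r1, r2, z, p


# Invented expression values for one gene pair in two phenotypes.
gene_a = [1, 2, 3, 4, 5, 6, 7, 8]
gene_b_pheno1 = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.2, 7.9]   # co-expressed
gene_b_pheno2 = [7.8, 7.1, 6.2, 4.9, 4.1, 3.0, 2.2, 0.9]   # anti-correlated
r1, r2, z, p = differential_correlation(gene_a, gene_b_pheno1,
                                        gene_a, gene_b_pheno2)
```

A real analysis would repeat this over all gene pairs and then apply a multiple comparison adjustment, as the abstract notes.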
Gene for ataxia-telangiectasia complementation group D (ATDC)

DOEpatents

Murnane, J.P.; Painter, R.B.; Kapp, L.N.; Yu, L.C.

1995-03-07

Disclosed herein is a new gene, an AT gene for complementation group D, the ATDC gene and fragments thereof. Nucleic acid probes for the gene are provided as well as proteins encoded by the gene, cDNA therefrom, preferably a 3 kilobase (kb) cDNA, and recombinant nucleic acid molecules for expression of the proteins. Further disclosed are methods to detect mutations in the gene, preferably methods employing the polymerase chain reaction (PCR). Also disclosed are methods to detect AT genes from other AT complementation groups. 30 figs.
A genome-wide resource of cell cycle and cell shape genes of fission yeast

PubMed Central

Hayles, Jacqueline; Wood, Valerie; Jeffery, Linda; Hoe, Kwang-Lae; Kim, Dong-Uk; Park, Han-Oh; Salas-Pino, Silvia; Heichinger, Christian; Nurse, Paul

2013-01-01

To identify near complete sets of genes required for the cell cycle and cell shape, we have visually screened a genome-wide gene deletion library of 4843 fission yeast deletion mutants (95.7% of total protein encoding genes) for their effects on these processes. A total of 513 genes have been identified as being required for cell cycle progression, 276 of which have not been previously described as cell cycle genes. Deletions of a further 333 genes lead to specific alterations in cell shape and another 524 genes result in generally misshapen cells. Here, we provide the first eukaryotic resource of gene deletions, which describes a near genome-wide set of genes required for the cell cycle and cell shape. PMID:23697806
A study of structural properties of gene network graphs for mathematical modeling of integrated mosaic gene networks.

PubMed

Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

2017-04-01

Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of how complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a method for modeling mosaic gene networks that integrates models of gene subnetworks by linear control functionals. Automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of the graphs of the generated mosaic gene regulatory networks revealed that, among the factors analyzed in the study, the most important for building accurate integrated mathematical models is data on the expression of genes corresponding to vertices with high centrality.
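The notion of a high-centrality vertex can be made concrete with a short Python sketch. The abstract does not state which centrality measures were used, so degree centrality is an assumption here, and the small regulatory graph is hypothetical.

```python
def degree_centrality(edges):
    """Degree centrality of each vertex: degree divided by (n - 1)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    return {node: len(nb) / (n - 1) for node, nb in neighbors.items()}


# Hypothetical gene network: tf1 regulates three genes, g3 regulates g4.
edges = [("tf1", "g1"), ("tf1", "g2"), ("tf1", "g3"), ("g3", "g4")]
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
```

Under this measure the hub tf1 is the vertex whose expression data would be prioritized when building a model.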
An intronic microRNA silences genes that are functionally antagonistic to its host gene.

PubMed

Barik, Sailen

2008-09-01

MicroRNAs (miRNAs) are short noncoding RNAs that down-regulate gene expression by silencing specific target mRNAs. While many miRNAs are transcribed from their own genes, nearly half map within introns of 'host' genes, the significance of which remains unclear. We report that transcriptional activation of apoptosis-associated tyrosine kinase (AATK), essential for neuronal differentiation, also generates miR-338 from an AATK gene intron that silences a family of mRNAs whose protein products are negative regulators of neuronal differentiation. We conclude that an intronic miRNA, transcribed together with the host gene mRNA, may serve the interest of its host gene by silencing a cohort of genes that are functionally antagonistic to the host gene itself.
Endovascular Gene Delivery from a Stent Platform: Gene-Eluting Stents

PubMed Central

Fishbein, Ilia; Chorny, Michael; Adamo, Richard F; Forbes, Scott P; Corrales, Ricardo A; Alferiev, Ivan S; Levy, Robert J

2015-01-01

A synergistic impact of research in the fields of post-angioplasty restenosis, drug-eluting stents and vascular gene therapy over the past 15 years has shaped the concept of gene-eluting stents. Gene-eluting stents hold promise of overcoming some biological and technical problems inherent to drug-eluting stent technology. As the field of gene-eluting stents matures it becomes evident that all three main design modules of a gene-eluting stent: a therapeutic transgene, a vector and a delivery system are equally important for accomplishing sustained inhibition of neointimal formation in arteries treated with gene delivery stents. This review summarizes prior work on stent-based gene delivery and discusses the main optimization strategies required to move the field of gene-eluting stents to clinical translation. PMID:26225356
The WRKY Transcription Factor Genes in Lotus japonicus

PubMed Central

Wang, Pengfei; Wang, Xingjun

2014-01-01

WRKY transcription factor genes play critical roles in plant growth and development, as well as stress responses. WRKY genes have been examined in various higher plants, but they have not been characterized in Lotus japonicus. The recent release of the L. japonicus whole genome sequence provides an opportunity for a genome wide analysis of WRKY genes in this species. In this study, we identified 61 WRKY genes in the L. japonicus genome. Based on the WRKY protein structure, L. japonicus WRKY (LjWRKY) genes can be classified into three groups (I–III). Investigations of gene copy number and gene clusters indicate that only one gene duplication event occurred on chromosome 4 and no clustered genes were detected on chromosomes 3 or 6. Researchers previously believed that group II and III WRKY domains were derived from the C-terminal WRKY domain of group I. Our results suggest that some WRKY genes in group II originated from the N-terminal domain of group I WRKY genes. Additional evidence to support this hypothesis was obtained by Medicago truncatula WRKY (MtWRKY) protein motif analysis. We found that LjWRKY and MtWRKY group III genes are under purifying selection, suggesting that WRKY genes will become increasingly structured and functionally conserved. PMID:24745006
[Gene deletion and functional analysis of the heptyl glycosyltransferase (waaF) gene in Vibrio parahaemolyticus O-antigen cluster].

PubMed

Zhao, Feng; Meng, Songsong; Zhou, Deqing

2016-02-04

To construct a heptyl glycosyltransferase II gene (waaF) deletion mutant of Vibrio parahaemolyticus and explore the function of the waaF gene in this species. The waaF gene deletion mutant was constructed by chitin-based transformation using clinical isolates, and its growth rate, morphology and serotype were then determined. Complementation strains carrying waaF genes from different sources (O3, O5 and O10) were constructed by conjugative transfer from E. coli S17λpir into Vibrio parahaemolyticus, and the function of the waaF gene was further verified by serotyping. The waaF gene deletion mutant strain was successfully constructed and grew normally. The growth rate and morphology of the mutant were similar to those of the wild-type (WT) strains, but the mutant did not agglutinate with O antisera. Strains complemented with the waaF gene from the O3 and O5 sources agglutinated with O antisera, but the strain complemented with the waaF gene from the O10 source did not. The waaF gene is related to O-antigen synthesis and is a key gene of the O-antigen synthesis pathway in Vibrio parahaemolyticus. The waaF genes from different sources do not have the same function.

Role of Apoptosis in the Development of Uterine Leiomyoma: Analysis of Expression Patterns of Bcl-2 and Bax in Human Leiomyoma Tissue With Clinical Correlations.

PubMed

Csatlós, Éva; Máté, Szabolcs; Laky, Marcella; Rigó, János; Joó, József Gábor

2015-07-01

To describe gene expression patterns of the apoptotic regulatory genes Bcl-2 and Bax in human uterine leiomyoma tissue. To investigate the relationship between alterations of gene expression patterns and several relevant clinical parameters. We obtained samples from 101 cases undergoing surgery for uterine leiomyoma for gene expression analysis of the Bcl-2 and Bax genes. Gene expression was quantified using the RT-PCR technique. In the leiomyoma group, the Bcl-2 gene was significantly overexpressed compared with the control group, although there was no such difference in the gene expression of Bax. Gene activity of Bcl-2 positively correlated with the tumor number in individual uterine leiomyoma cases. Although there was no significant correlation between the length of the cumulative lactation period before the development of uterine leiomyoma and Bcl-2 gene expression in the leiomyoma tissue, we observed a trend for a shorter cumulative lactation period to be associated with overexpression of the Bcl-2 gene. Overexpression of the antiapoptotic Bcl-2 gene appeared to be a factor in the development of uterine leiomyoma, whereas gene activity of the proapoptotic Bax gene did not seem to play a role in the process.
Multiple conversion between the genes encoding bacterial class-I release factors

PubMed Central

Ishikawa, Sohta A.; Kamikawa, Ryoma; Inagaki, Yuji

2015-01-01

Bacteria require two class-I release factors, RF1 and RF2, that recognize stop codons and promote peptide release from the ribosome. RF1 and RF2 were most likely established through gene duplication followed by alteration of their stop codon specificities in the common ancestor of extant bacteria. Under this scenario, the two RF gene families are expected to have taken independent evolutionary trajectories after the ancestral gene duplication event. However, we here report two independent cases of conversion between RF1 and RF2 genes (RF1-RF2 gene conversion), which were rigorously examined by procedures incorporating the maximum-likelihood phylogenetic method. In both cases, RF1-RF2 gene conversion was predicted to have occurred in the region encoding nearly the entire domain 3, whose functions are common to the RF paralogues. Nevertheless, the ‘direction’ of gene conversion appeared to be opposite in the two cases: from the RF2 gene to the RF1 gene in one case, and from the RF1 gene to the RF2 gene in the other. The two cases of RF1-RF2 gene conversion prompt us to propose two novel aspects in the evolution of bacterial class-I release factors: (i) domain 3 is interchangeable between RF paralogues, and (ii) RF1-RF2 gene conversion has occurred frequently in bacterial genome evolution. PMID:26257102
An evidence-based knowledgebase of metastasis suppressors to identify key pathways relevant to cancer metastasis

PubMed Central

Zhao, Min; Li, Zhe; Qu, Hong

2015-01-01

Metastasis suppressor genes (MS genes) are genes that play important roles in inhibiting the process of cancer metastasis without preventing growth of the primary tumor. Identification of these genes and understanding their functions are critical for investigation of cancer metastasis. Recent studies on cancer metastasis have identified many new susceptibility MS genes. However, a comprehensive illustration of the diverse cellular processes regulated by metastasis suppressors during the metastasis cascade is lacking. Thus, the relationship between MS genes and cancer risk is still unclear. To unveil the cellular complexity of MS genes, we have constructed MSGene (http://MSGene.bioinfo-minzhao.org/), the first literature-based gene resource for exploring human MS genes. In total, we manually curated 194 experimentally verified MS genes and mapped them to 1448 homologous genes from 17 model species. Follow-up functional analyses associated the 194 human MS genes with epithelium/tissue morphogenesis and epithelial cell proliferation. In addition, pathway analysis highlights the prominent role of MS genes in activation of platelets and the coagulation system in the tumor metastatic cascade. Moreover, the global mutation pattern of MS genes across multiple cancers may reveal common cancer metastasis mechanisms. All these results illustrate the importance of MSGene to our understanding of cell development and cancer metastasis. PMID:26486520
Selection of reference genes for tissue/organ samples on day 3 fifth-instar larvae in silkworm, Bombyx mori.

PubMed

Wang, Genhong; Chen, Yanfei; Zhang, Xiaoying; Bai, Bingchuan; Yan, Hao; Qin, Daoyuan; Xia, Qingyou

2018-06-01

The silkworm, Bombyx mori, is one of the world's most economically important insects. Surveying variations in gene expression among multiple tissue/organ samples will provide clues for gene function assignments and will be helpful for identifying genes related to economic traits or specific cellular processes. To ensure their accuracy, commonly used gene expression quantification methods require a set of stable reference genes for data normalization. In this study, 24 candidate reference genes were assessed in 10 tissue/organ samples of day 3 fifth-instar B. mori larvae using geNorm and NormFinder. The results revealed that the combination of BGIBMGA003186 and BGIBMGA008209 was the optimum choice for normalizing the expression data of the B. mori tissue/organ samples. The most stable gene, BGIBMGA003186, is recommended if just one reference gene is used. Moreover, the commonly used reference gene encoding cytoplasmic actin was the least appropriate reference gene for the samples investigated. The reliability of the selected reference genes was further confirmed by evaluating the expression profiles of two cathepsin genes. Our results may be useful for future studies involving the quantification of relative gene expression levels of different tissue/organ samples in B. mori. © 2018 Wiley Periodicals, Inc.
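The kind of stability ranking that geNorm produces can be sketched in Python as follows. For each candidate, the sketch averages the standard deviation of its log2 expression ratio against every other candidate, and a lower value means a more stable gene. This is a simplified reading of the geNorm stability measure M, not the geNorm software itself. The gene names and relative expression values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def genorm_m(expression):
    """Stability measure M for each candidate reference gene.

    expression: dict gene -> list of relative expression values,
                one per sample, in the same sample order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in each sample.
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[j], expression[k])]
            # A constant ratio across samples gives a standard deviation of 0.
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Hypothetical relative expression in four tissue samples.
candidates = {
    "refA": [100.0, 200.0, 150.0, 120.0],
    "refB": [50.0, 100.0, 75.0, 60.0],     # tracks refA at a fixed 1:2 ratio
    "unstable": [10.0, 80.0, 20.0, 300.0],
}
m = genorm_m(candidates)
ranking = sorted(m, key=m.get)   # most stable candidate first
```

In this toy input refA and refB keep a constant ratio across samples, so they receive the lowest M values and the erratic candidate is ranked last.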
Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization.

PubMed

Jung, Sang-Kyu; McDonald, Karen

2011-08-16

Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net.
Gene Therapy in the Cornea: 2005-present

PubMed Central

Mohan, Rajiv R.; Tovey, Jonathan C.K.; Sharma, Ajay; Tandon, Ashish

2011-01-01

Successful restoration of vision in human patients with gene therapy affirmed its promise to cure ocular diseases and disorders. The efficacy of gene therapy is contingent upon the vector and the mode of therapeutic DNA introduction into targeted cells/tissues. The cornea is an ideal tissue for gene therapy due to its ease of access and relative immune-privilege. Considerable progress has been made in the field of corneal gene therapy in the last 5 years. Several new gene transfer vectors, techniques and approaches have evolved. Although corneal gene therapy is still in its early stages of development, the potential of gene-based interventions to treat corneal abnormalities has begun to surface. Identification of next generation viral and nanoparticle vectors, characterization of delivered gene levels, localization, and duration in the cornea, and significant success in controlling corneal disorders, particularly fibrosis and angiogenesis, in experimental animal disease models, with no major side effects, have propelled gene therapy a step closer towards establishing gene-based therapies for corneal blindness. Recently, researchers have assessed the delivery of therapeutic genes for corneal diseases and disorders due to trauma, infections, chemical, mechanical, and surgical injury, and/or abnormal wound healing. This review provides an update on the developments in gene therapy for corneal diseases and discusses the barriers that hinder its utilization for delivering genes in the cornea. PMID:21967960
Differential expression profiles and pathways of genes in sugarcane leaf at elongation stage in response to drought stress

PubMed Central

Li, Changning; Nong, Qian; Solanki, Manoj Kumar; Liang, Qiang; Xie, Jinlan; Liu, Xiaoyan; Li, Yijie; Wang, Weizan; Yang, Litao; Li, Yangrui

2016-01-01

Water stress causes considerable yield losses in sugarcane. To investigate differentially expressed genes under water stress, a pot experiment was performed with the sugarcane variety GT21 at three water-deficit levels (mild, moderate, and severe) during the elongation stage and gene expression was analyzed using microarray technology. Physiological parameters of sugarcane showed significant alterations in response to drought stress. Based on the expression profile of 15,593 sugarcane genes, 1,501 (9.6%) genes were differentially expressed under different water-level treatments; 821 genes were upregulated and 680 genes were downregulated. A gene similarity analysis showed that approximately 62.6% of the differentially expressed genes shared homology with functional proteins. In a Gene Ontology (GO) analysis, 901 differentially expressed genes were assigned to 36 GO categories. Moreover, 325 differentially expressed genes were classified into 101 pathway categories involved in various processes, such as the biosynthesis of secondary metabolites, ribosomes, carbon metabolism, etc. In addition, some unannotated genes were detected; these may provide a basis for studies of water-deficit tolerance. The reliability of the observed expression patterns was confirmed by RT-PCR. The results of this study may help identify useful genes for improving drought tolerance in sugarcane. PMID:27170459
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence

PubMed Central

Nepal, Madhav P; Benson, Benjamin V

2015-01-01

Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease-resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification, model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence was assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for future soybean crop improvement. PMID:25922568
Selecting and validating reference genes for quantitative real-time PCR in Plutella xylostella (L.).

PubMed

You, Yanchun; Xie, Miao; Vasseur, Liette; You, Minsheng

2018-05-01

Gene expression analysis provides important clues regarding gene functions, and quantitative real-time PCR (qRT-PCR) is a widely used method in gene expression studies. Reference genes are essential for normalizing and accurately assessing gene expression. In the present study, 16 candidate reference genes (ACTB, CyPA, EF1-α, GAPDH, HSP90, NDPk, RPL13a, RPL18, RPL19, RPL32, RPL4, RPL8, RPS13, RPS4, α-TUB, and β-TUB) from Plutella xylostella were selected to evaluate gene expression stability across different experimental conditions using five statistical algorithms (geNorm, NormFinder, Delta Ct, BestKeeper, and RefFinder). The results suggest that different reference genes or combinations of reference genes are suitable for normalization in gene expression studies of P. xylostella according to the different developmental stages, strains, tissues, and insecticide treatments. Based on the given experimental sets, the most stable reference genes were RPS4 across different developmental stages, RPL8 across different strains and tissues, and EF1-α across different insecticide treatments. A comprehensive and systematic assessment of potential reference genes for gene expression normalization is essential for post-genomic functional research in P. xylostella, a notorious pest with worldwide distribution and a high capacity to adapt and develop resistance to insecticides.
Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization

PubMed Central

2011-01-01

Background: Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results: The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion: Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net. PMID:21846353
The pkI gene encoding pyruvate kinase I links to the luxZ gene which enhances bioluminescence of the lux operon from Photobacterium leiognathi.

PubMed

Lin, J W; Lu, H C; Chen, H Y; Weng, S F

1997-10-09

The partial 3'-end nucleotide sequence of the pkI gene (GenBank accession No. AF019143) from Photobacterium leiognathi ATCC 25521 has been determined, and the sequence of the encoded pyruvate kinase I has been deduced. Pyruvate kinase I is the key enzyme of glycolysis that converts phosphoenolpyruvate to pyruvate. Alignment and comparison of the pyruvate kinase I proteins from P. leiognathi, E. coli and Salmonella typhimurium show that they are homologous. The nucleotide sequence reveals that the pkI gene is linked to the luxZ gene, which enhances bioluminescence of the lux operon from P. leiognathi. The gene order of the pkI and luxZ genes is -pkI-ter-->-R&R"-luxZ-ter"-->, where ter is the transcriptional terminator for the pkI and related genes, R&R" is the regulatory region, and ter" is the transcriptional terminator for the luxZ gene. This arrangement clearly shows that the pkI gene and the luxZ gene belong to two separate operons. Functional analysis confirms that the potential hairpin loop omega T is the transcriptional terminator for the pkI and related genes. It follows that the pkI and related genes are simply linked to the luxZ gene in the P. leiognathi genome.
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence.

PubMed

Nepal, Madhav P; Benson, Benjamin V

2015-01-01

Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification, model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence was assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which has implications for future soybean crop improvement.
Identification and validation of reference genes for qRT-PCR studies of the obligate aphid pathogenic fungus Pandora neoaphidis during different developmental stages.

PubMed

Zhang, Shutao; Chen, Chun; Xie, Tingna; Ye, Sudan

2017-01-01

The selection of stable reference genes is a critical step for the accurate quantification of gene expression. To identify and validate the reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of 13 classical candidate reference genes was evaluated by quantitative real-time reverse transcriptase polymerase chain reaction (qPCR) at four developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae). Four statistical algorithms (geNorm, NormFinder, BestKeeper and the Delta Ct method) were used to rank putative reference genes according to their expression stability and to indicate the best reference gene or combination of reference genes for accurate normalization. The comprehensive ranking revealed that ACT1 and 18S were the most stably expressed genes throughout the developmental stages. To further validate the suitability of the reference genes identified in this study, the expression of the cell division control protein 25 (CDC25) and Chitinase 1 (CHI1) genes was used to confirm the validated candidate reference genes. Our study presents the first systematic study of reference gene selection for P. neoaphidis and provides guidelines for obtaining more accurate qPCR results in future developmental efforts.
Discovering Implicit Entity Relation with the Gene-Citation-Gene Network

PubMed Central

Song, Min; Han, Nam-Gi; Kim, Yong-Hwan; Ding, Ying; Chambers, Tamy

2013-01-01

In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner. PMID:24358368
The gsdf gene locus harbors evolutionary conserved and clustered genes preferentially expressed in fish previtellogenic oocytes.

PubMed

Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques

2011-02-01

The gonadal soma-derived factor (GSDF) belongs to the transforming growth factor-β superfamily and is conserved in teleostean fish species. Gsdf is specifically expressed in the gonads, and gene expression is restricted to the granulosa and Sertoli cells in trout and medaka. The gsdf gene expression is correlated to early testis differentiation in medaka and was shown to stimulate primordial germ cell and spermatogonia proliferation in trout. In the present study, we show that the gsdf gene localizes to a syntenic chromosomal fragment conserved among vertebrates although no gsdf-related gene is detected on the corresponding genomic region in tetrapods. We demonstrate using quantitative RT-PCR that most of the genes localized in the synteny are specifically expressed in medaka gonads. Gsdf is the only gene of the synteny with a much higher expression in the testis compared to the ovary. In contrast, gene expression pattern analysis of the gsdf surrounding genes (nup54, aff1, klhl8, sdad1, and ptpn13) indicates that these genes are preferentially expressed in the female gonads. The tissue distribution of these genes is highly similar in medaka and zebrafish, two teleostean species that have diverged more than 110 million years ago. The cellular localization of these genes was determined in medaka gonads using the whole-mount in situ hybridization technique. We confirm that gsdf gene expression is restricted to Sertoli and granulosa cells in contact with the premeiotic and meiotic cells. The nup54 gene is expressed in spermatocytes and previtellogenic oocytes. Transcripts corresponding to the ovary-specific genes (aff1, klhl8, and sdad1) are detected only in previtellogenic oocytes. No expression was detected in the gonocytes in 10 dpf embryos. In conclusion, we show that the gsdf gene localizes to a syntenic chromosomal fragment harboring evolutionary conserved genes in vertebrates. 
These genes are preferentially expressed in previtellogenic oocytes, and thus, they display a different cellular localization compared to that of the gsdf gene, indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1). Copyright © 2010 Elsevier B.V. All rights reserved.
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data.

PubMed

Hettne, Kristina M; Boorsma, André; van Dartel, Dorien A M; Goeman, Jelle J; de Jong, Esther; Piersma, Aldert H; Stierum, Rob H; Kleinjans, Jos C; Kors, Jan A

2013-01-29

Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 
21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

PubMed Central

2013-01-01

Background Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 
21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect. PMID:23356878
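The significance criterion in the abstract above (false discovery rate-corrected p-values < 0.05) can be illustrated with a short sketch. The abstract does not name the correction procedure, so the Benjamini-Hochberg step-up method is used here purely as an assumption; the function name and the p-values are hypothetical and are not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, smallest p-value first
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone: each one is capped by those ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per gene set
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

With these made-up inputs only the first gene set stays below the 0.05 cutoff after correction, even though three of the raw p-values are below 0.05.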
Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

PubMed

Hur, Junguk; Özgür, Arzucan; He, Yongqun

2017-03-14

Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. 
Our centrality analysis of these gene interaction networks identified top ranked E. coli genes and 6 INO interaction types (e.g., regulation and gene expression). Vaccine-related E. coli gene-gene interaction network was constructed using ontology-based literature mining strategy, which identified important E. coli vaccine genes and their interactions with other genes through specific interaction types.
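Two of the centrality metrics mentioned in the abstract above (degree and closeness) can be sketched in a few lines. This is only an illustration on an unweighted, undirected toy graph; the node labels g1 to g4 are placeholders, not genes or interactions from the study, and the study's own software and normalization choices are not described in the abstract.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the summed shortest-path distances."""
    result = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Toy star-shaped network: g1 is linked to every other node
toy = {
    "g1": {"g2", "g3", "g4"},
    "g2": {"g1"},
    "g3": {"g1"},
    "g4": {"g1"},
}
deg = degree_centrality(toy)
clo = closeness_centrality(toy)
```

In this toy network the hub g1 scores highest on both measures, which is the kind of signal a centrality analysis uses to rank nodes.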
A novel approach for human whole transcriptome analysis based on absolute gene expression of microarray data.

PubMed

Bikel, Shirley; Jacobo-Albavera, Leonor; Sánchez-Muñoz, Fausto; Cornejo-Granados, Fernanda; Canizales-Quinteros, Samuel; Soberón, Xavier; Sotelo-Mundo, Rogerio R; Del Río-Navarro, Blanca E; Mendoza-Vargas, Alfredo; Sánchez, Filiberto; Ochoa-Leyva, Adrian

2017-01-01

In spite of the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. There are over 767,000 RNA microarrays from human samples in public repositories, which are an invaluable resource for biomedical research and personalized medicine. Absolute gene expression analysis allows the transcriptome profiling of all expressed genes under a specific biological condition without the need for a reference sample. However, background fluorescence makes it challenging to determine absolute gene expression in microarrays. Given that the Y chromosome is absent in female subjects, we used it as the basis of a new approach for absolute gene expression analysis in which the fluorescence of the Y chromosome genes of female subjects was used as the background fluorescence for all the probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold, allowing the differentiation between expressed and non-expressed genes in microarrays. We extracted RNA from leukocyte samples of 16 children (nine males and seven females, ages 6-10 years). An Affymetrix GeneChip Human Gene 1.0 ST Array was run for each sample and the fluorescence of 124 genes of the Y chromosome was used to calculate the absolute gene expression threshold. After that, several expressed and non-expressed genes according to our absolute gene expression threshold were compared against the expression obtained using real-time quantitative polymerase chain reaction (RT-qPCR). From the 124 genes of the Y chromosome, three genes (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression level by RT-qPCR. Then, we selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched. 
Interestingly, these pathways were related to the typical functions of leukocytes, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied this method to obtain the absolute gene expression threshold in already published microarray data of liver cells, where the top 5% expressed genes showed an enrichment of typical KEGG pathways for liver cells. Our results suggest that the three selected genes of the Y chromosome can be used to calculate an absolute gene expression threshold, allowing transcriptome profiling of microarray data without the need for an additional reference experiment. Our approach, based on establishing a threshold for absolute gene expression analysis, offers a new way to analyze thousands of microarrays from public databases. This allows the study of different human diseases without the need for additional samples for relative expression experiments.
Quantifying the major mechanisms of recent gene duplications in the human and mouse genomes: a novel strategy to estimate gene duplication rates

PubMed Central

Pan, Deng; Zhang, Liqing

2007-01-01

Background The rate of gene duplication is an important parameter in the study of evolution, but the influence of gene conversion and technical problems have confounded previous attempts to provide a satisfying estimate. We propose a new strategy to estimate the rate that involves separate quantification of the rates of two different mechanisms of gene duplication and subsequent combination of the two rates, based on their respective contributions to the overall gene duplication rate. Results Previous estimates of gene duplication rates are based on small gene families. Therefore, to assess the applicability of this to families of all sizes, we looked at both two-copy gene families and the entire genome. We studied unequal crossover and retrotransposition, and found that these mechanisms of gene duplication are largely independent and account for a substantial amount of duplicated genes. Unequal crossover contributed more to duplications in the entire genome than retrotransposition did, but this contribution was significantly less in two-copy gene families, and duplicated genes arising from this mechanism are more likely to be retained. Combining rates of duplication using the two mechanisms, we estimated the overall rates to be from approximately 0.515 to 1.49 × 10⁻³ per gene per million years in human, and from approximately 1.23 to 4.23 × 10⁻³ in mouse. The rates estimated from two-copy gene families are always lower than those from the entire genome, and so it is not appropriate to use small families to estimate the rate for the entire genome. Conclusion We present a novel strategy for estimating gene duplication rates. Our results show that different mechanisms contribute differently to the evolution of small and large gene families. PMID:17683522

Constructing an integrated gene similarity network for the identification of disease genes.

PubMed

Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

2017-09-20

Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We propose a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN, and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
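The random walk with restart step named in the abstract above can be sketched on a single weighted network. This is a simplified illustration only: RWRB itself runs on a phenotype-gene bilayer network, which is not reproduced here, and the function name, restart probability, node labels, and edge weights below are all assumptions chosen for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adj maps each node to a dict {neighbor: weight}; every neighbor must
    also appear as a key. seeds is a non-empty set of starting nodes.
    """
    if not seeds:
        raise ValueError("at least one seed node is required")
    nodes = sorted(adj)
    # Initial distribution: probability mass spread evenly over the seeds
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a neighbor in proportion to edge weight
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                continue  # isolated node: its mass is not propagated
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy chain g1 - g2 - g3 - g4 with unit weights; g1 is the only seed
network = {
    "g1": {"g2": 1.0},
    "g2": {"g1": 1.0, "g3": 1.0},
    "g3": {"g2": 1.0, "g4": 1.0},
    "g4": {"g3": 1.0},
}
scores = random_walk_with_restart(network, seeds={"g1"})
```

Nodes closer to the seed receive higher scores, which is how candidate genes would be prioritized relative to known disease genes in this kind of scheme.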
Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased.

PubMed

Xi, Zhenxiang; Liu, Liang; Davis, Charles C

2015-11-01

The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.
Reference Genes for Accurate Transcript Normalization in Citrus Genotypes under Different Experimental Conditions

PubMed Central

Mafra, Valéria; Kubo, Karen S.; Alves-Ferreira, Marcio; Ribeiro-Alves, Marcelo; Stuart, Rodrigo M.; Boava, Leonardo P.; Rodrigues, Carolina M.; Machado, Marcos A.

2012-01-01

Real-time reverse transcription PCR (RT-qPCR) has emerged as an accurate and widely used technique for expression profiling of selected genes. However, obtaining reliable measurements depends on the selection of appropriate reference genes for gene expression normalization. The aim of this work was to assess the expression stability of 15 candidate genes to determine which set of reference genes is best suited for transcript normalization in citrus in different tissues and organs and leaves challenged with five pathogens (Alternaria alternata, Phytophthora parasitica, Xylella fastidiosa and Candidatus Liberibacter asiaticus). We tested traditional genes used for transcript normalization in citrus and orthologs of Arabidopsis thaliana genes described as superior reference genes based on transcriptome data. geNorm and NormFinder algorithms were used to find the best reference genes to normalize all samples and conditions tested. Additionally, each biotic stress was individually analyzed by geNorm. In general, FBOX (encoding a member of the F-box family) and GAPC2 (GAPDH) was the most stable candidate gene set assessed under the different conditions and subsets tested, while CYP (cyclophilin), TUB (tubulin) and CtP (cathepsin) were the least stably expressed genes found. Validation of the best suitable reference genes for normalizing the expression level of the WRKY70 transcription factor in leaves infected with Candidatus Liberibacter asiaticus showed that arbitrary use of reference genes without previous testing could lead to misinterpretation of data. Our results revealed FBOX, SAND (a SAND family protein), GAPC2 and UPL7 (ubiquitin protein ligase 7) to be superior reference genes, and we recommend their use in studies of gene expression in citrus species and relatives. This work constitutes the first systematic analysis for the selection of superior reference genes for transcript normalization in different citrus organs and under biotic stress. PMID:22347455
Characterization of a rabbit germ-line VH gene that is a candidate donor for VH gene conversion in mutant Alicia rabbits.

PubMed

Chen, H T; Alexander, C B; Mage, R G

1995-06-15

Normal rabbits preferentially rearrange the 3'-most VH gene, VH1, to encode Igs with VHa allotypes, which constitute the majority of rabbit serum Igs. A gene conversion-like mechanism is employed to diversify the primary Ab repertoire. In mutant Alicia rabbits that derived from a rabbit with VHa2 allotype, the VH1 gene was deleted. Our previous studies showed that the first functional gene (VH4) or VH4-like genes were rearranged in 2- to 8-wk-old homozygous Alicia. The VH1a2-like sequences that were found in splenic mRNA from 6-wk and older Alicia rabbits still had some residues that were typical of VH4. The appearances of sequences resembling that of VH1a2 may have been caused by gene conversions that altered the sequences of the rearranged VH or there may have been rearrangement of upstream VH1a2-like genes later in development. To investigate this further, we constructed a cosmid library and isolated a VH1a2-like gene, VH12-1-6, with a sequence almost identical to VH1a2. This gene had a deleted base in the heptamer of its recombination signal sequence. However, even if this defect diminished or eliminated its ability to rearrange, the a2-like gene could have acted as a donor for gene-conversion-like alteration of rearranged VH genes. Sequence comparisons suggested that this gene or a gene like it could have acted as a donor for gene conversion in mutant Alicia and in normal rabbits.
Global identification and expression analysis of stress-responsive genes of the Argonaute family in apple.

PubMed

Xu, Ruirui; Liu, Caiyun; Li, Ning; Zhang, Shizhong

2016-12-01

Argonaute (AGO) proteins, which are found in yeast, animals, and plants, are the core molecules of the RNA-induced silencing complex. These proteins play important roles in plant growth, development, and responses to biotic stresses. The complete analysis and classification of the AGO gene family have been recently reported in different plants. Nevertheless, systematic analysis and expression profiling of these genes have not been performed in apple (Malus domestica). Approximately 15 AGO genes were identified in the apple genome. The phylogenetic tree, chromosome location, conserved protein motifs, gene structure, and expression of the AGO gene family in apple were analyzed for gene prediction. All AGO genes were phylogenetically clustered into four groups (i.e., AGO1, AGO4, MEL1/AGO5, and ZIPPY/AGO7) with the AGO genes of Arabidopsis. These groups of the AGO gene family were statistically analyzed and compared among 31 plant species. The predicted apple AGO genes are distributed across nine chromosomes at different densities and include three segment duplications. Expression studies indicated that 15 AGO genes exhibit different expression patterns in at least one of the tissues tested. Additionally, analysis of gene expression levels indicated that the genes are mostly involved in responses to NaCl, PEG, heat, and low-temperature stresses. Hence, several candidate AGO genes are involved in different aspects of physiological and developmental processes and may play an important role in abiotic stress responses in apple. To the best of our knowledge, this study is the first to report a comprehensive analysis of the apple AGO gene family. Our results provide useful information to understand the classification and putative functions of these proteins, especially for gene members that may play important roles in abiotic stress responses in M. hupehensis.
Organization of genes for transcription and translation in the rif region of the Escherichia coli chromosome. [uv radiation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yamamoto, M.; Nomura, M.

1979-01-01

The lambda rif^d18 transducing phage is known to carry several genes for components of transcriptional and translational machineries; these genes are clustered in the rif region at 88 min on the Escherichia coli genetic map. They include a set of genes for rRNA's (rrnB), a gene for spacer tRNA, tRNA_2^Glu (tgtB), one of the two genes for EF-Tu (tufB), genes for four ribosomal proteins (rplK, A, J, and L), genes for the β and β' subunits of RNA polymerase (rpoB and rpoC), and genes for three tRNA's (tyrU, gluT, and thrT). An additional tRNA gene (subsequently identified as thrU by Landy and his co-workers) and a gene for a protein (protein U) with unknown functions were found to be carried by lambda rif^d18. We analyzed the organization of these genes by using various deletion and hybrid phages derived from lambda rif^d18 and lambda rif^d12, a phage related to lambda rif^d18. The expression of various genes was examined in uv-irradiated cells infected with these transducing phages. Two main conclusions were obtained. First, the four tRNA genes are not cotranscribed with the genes in rrnB, even though these tRNA genes are located close to the distal end of rrnB. Second, the four ribosomal protein genes are organized into two separate transcriptional units; rplK and A are in one unit and rplJ and L are in the second unit.
Identification and validation of reference genes for quantitative real-time PCR normalization and its applications in lycium.

PubMed

Zeng, Shaohua; Liu, Yongliang; Wu, Min; Liu, Xiaomin; Shen, Xiaofei; Liu, Chunzhao; Wang, Ying

2014-01-01

Lycium barbarum and L. ruthenicum are extensively used as traditional Chinese medicinal plants. Next generation sequencing technology provides a powerful tool for analyzing transcriptomic profiles of gene expression in non-model species. Such gene expression can then be confirmed with quantitative real-time polymerase chain reaction (qRT-PCR). Therefore, use of systematically identified suitable reference genes is a prerequisite for obtaining reliable gene expression data. Here, we calculated the expression stability of 18 candidate reference genes across samples from different tissues and grown under salt stress using geNorm and NormFinder procedures. The geNorm-determined rank of reference genes was similar to that defined by NormFinder, with some differences. Both procedures confirmed that the single most stable reference gene was ACTIN1 for L. barbarum fruits, H2B1 for L. barbarum roots, and EF1α for L. ruthenicum fruits. PGK3, H2B2, and PGK3 were identified as the best stable reference genes for salt-treated L. ruthenicum leaves, roots, and stems, respectively. H2B1 and GAPDH1+PGK1 for L. ruthenicum and SAMDC2+H2B1 for L. barbarum were the best single and/or combined reference genes across all samples. Finally, expression of the salt-responsive gene NAC, the fruit ripening candidate gene LrPG, and anthocyanin genes was investigated to confirm the validity of the selected reference genes. Suitable reference genes identified in this study provide a foundation for accurately assessing gene expression and for better understanding novel gene function to elucidate molecular mechanisms behind particular biological/physiological processes in Lycium.
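The stability ranking described in this abstract can be illustrated with a minimal sketch of the geNorm stability measure M, assuming the standard definition: for each candidate gene, M is the mean, over all other candidates, of the standard deviation of the log2 expression ratio across samples (lower M means more stable). The gene names and expression values below are hypothetical toy data, not data from the study, and the function name `genorm_m_values` is introduced here only for illustration.

```python
import math
import statistics


def genorm_m_values(expression):
    """Return the geNorm-style stability measure M for each gene.

    expression: dict mapping gene name -> list of relative expression
    quantities (one per sample, all strictly positive).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[j], expression[k])
            ]
            # pairwise variation: sample standard deviation of the ratios
            pairwise_sd.append(statistics.stdev(log_ratios))
        # M is the average pairwise variation against all other candidates
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical toy data: REF_A and REF_B keep a constant ratio across
# samples, while REF_C fluctuates relative to both.
toy = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],
    "REF_C": [1.0, 4.0, 2.0, 16.0],
}
m = genorm_m_values(toy)
ranking = sorted(m, key=m.get)  # most stable (lowest M) first
```

In this toy set the least stable candidate, REF_C, ends up last in `ranking`. The full geNorm procedure additionally removes the worst gene stepwise and recomputes M, which this sketch omits.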
A BAC-bacterial recombination method to generate physically linked multiple gene reporter DNA constructs.

PubMed

Maye, Peter; Stover, Mary Louise; Liu, Yaling; Rowe, David W; Gong, Shiaochin; Lichtler, Alexander C

2009-03-13

Reporter gene mice are valuable animal models for biological research providing a gene expression readout that can contribute to cellular characterization within the context of a developmental process. With the advancement of bacterial recombination techniques to engineer reporter gene constructs from BAC genomic clones and the generation of optically distinguishable fluorescent protein reporter genes, there is an unprecedented capability to engineer more informative transgenic reporter mouse models relative to what has been traditionally available. We demonstrate here our first effort on the development of a three stage bacterial recombination strategy to physically link multiple genes together with their respective fluorescent protein (FP) reporters in one DNA fragment. This strategy uses bacterial recombination techniques to: (1) subclone genes of interest into BAC linking vectors, (2) insert desired reporter genes into respective genes and (3) link different gene-reporters together. As proof of concept, we have generated a single DNA fragment containing the genes Trap, Dmp1, and Ibsp driving the expression of ECFP, mCherry, and Topaz FP reporter genes, respectively. Using this DNA construct, we have successfully generated transgenic reporter mice that retain two to three gene readouts. The three stage methodology to link multiple genes with their respective fluorescent protein reporter works with reasonable efficiency. Moreover, gene linkage allows for their common chromosomal integration into a single locus. However, the testing of this multi-reporter DNA construct by transgenesis does suggest that the linkage of two different genes together, despite their large size, can still create a positional effect. We believe that gene choice, genomic DNA fragment size and the presence of endogenous insulator elements are critical variables.
Successful transient expression of Cas9 and single guide RNA genes in Chlamydomonas reinhardtii.

PubMed

Jiang, Wenzhi; Brueggeman, Andrew J; Horken, Kempton M; Plucinak, Thomas M; Weeks, Donald P

2014-11-01

The clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 system has become a powerful and precise tool for targeted gene modification (e.g., gene knockout and gene replacement) in numerous eukaryotic organisms. Initial attempts to apply this technology to a model, the single-cell alga, Chlamydomonas reinhardtii, failed to yield cells containing edited genes. To determine if the Cas9 and single guide RNA (sgRNA) genes were functional in C. reinhardtii, we tested the ability of a codon-optimized Cas9 gene along with one of four different sgRNAs to cause targeted gene disruption during a 24-h period immediately following transformation. All three exogenously supplied gene targets as well as the endogenous FKB12 (rapamycin sensitivity) gene of C. reinhardtii displayed distinct Cas9/sgRNA-mediated target site modifications as determined by DNA sequencing of cloned PCR amplicons of the target site region. Success in transient expression of Cas9 and sgRNA genes contrasted with the recovery of only a single rapamycin-resistant colony bearing an appropriately modified FKB12 target site in 16 independent transformation experiments involving >10^9 cells. Failure to recover transformants with intact or expressed Cas9 genes following transformation with the Cas9 gene alone (or even with a gene encoding a Cas9 lacking nuclease activity) provided strong suggestive evidence for Cas9 toxicity when Cas9 is produced constitutively in C. reinhardtii. The present results provide compelling evidence that Cas9 and sgRNA genes function properly in C. reinhardtii to cause targeted gene modifications and point to the need for a focus on development of methods to properly stem Cas9 production and/or activity following gene editing. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Rapid Evolution of Ovarian-Biased Genes in the Yellow Fever Mosquito (Aedes aegypti).

PubMed

Whittle, Carrie A; Extavour, Cassandra G

2017-08-01

Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition. Copyright © 2017 by the Genetics Society of America.
Predicting response to primary chemotherapy: gene expression profiling of paraffin-embedded core biopsy tissue.

PubMed

Mina, Lida; Soule, Sharon E; Badve, Sunil; Baehner, Fredrick L; Baker, Joffre; Cronin, Maureen; Watson, Drew; Liu, Mei-Lan; Sledge, George W; Shak, Steve; Miller, Kathy D

2007-06-01

Primary chemotherapy provides an ideal opportunity to correlate gene expression with response to treatment. We used paraffin-embedded core biopsies from a completed phase II trial to identify genes that correlate with response to primary chemotherapy. Patients with newly diagnosed stage II or III breast cancer were treated with sequential doxorubicin 75 mg/m2 every 2 weeks x 3 and docetaxel 40 mg/m2 weekly x 6; treatment order was randomly assigned. Pretreatment core biopsy samples were interrogated for genes that might correlate with pathologic complete response (pCR). In addition to the individual genes, the correlation of the Oncotype DX Recurrence Score with pCR was examined. Of 70 patients enrolled in the parent trial, core biopsy samples with sufficient RNA for gene analyses were available from 45 patients; 9 (20%) had inflammatory breast cancer (IBC). Six (14%) patients achieved a pCR. Twenty-two of the 274 candidate genes assessed correlated with pCR (p < 0.05). Genes correlating with pCR could be grouped into three large clusters: angiogenesis-related genes, proliferation-related genes, and invasion-related genes. Expression of estrogen receptor (ER)-related genes and Recurrence Score did not correlate with pCR. In an exploratory analysis we compared gene expression in IBC to non-inflammatory breast cancer; twenty-four (9%) of the genes were differentially expressed (p < 0.05), 5 were upregulated and 19 were downregulated in IBC. Gene expression analysis on core biopsy samples is feasible and identifies candidate genes that correlate with pCR to primary chemotherapy. Gene expression in IBC differs significantly from non-inflammatory breast cancer.
Predicting Protein Function by Genomic Context: Quantitative Evaluation and Qualitative Inferences

PubMed Central

Huynen, Martijn; Snel, Berend; Lathe, Warren; Bork, Peer

2000-01-01

Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene-order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes. PMID:10958638
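As a rough illustration of the Type III context described above (co-occurrence of genes across genomes, i.e. phylogenetic profiles), the following sketch scores gene pairs by how similar their presence/absence profiles are. The Jaccard index and the 0.75 threshold are assumptions chosen for simplicity, not the scoring used by the authors, and the gene and genome names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genomes (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_occurring_pairs(profiles, threshold=0.75):
    """Return (gene_x, gene_y, score) for pairs with similar profiles.

    profiles: dict mapping gene name -> set of genomes in which a
    homolog of that gene is present (its phylogenetic profile).
    """
    pairs = []
    for x, y in combinations(sorted(profiles), 2):
        score = jaccard(profiles[x], profiles[y])
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs


# Hypothetical profiles: geneA and geneB occur in exactly the same
# genomes, geneC in a largely different set.
profiles = {
    "geneA": {"g1", "g2", "g3", "g5"},
    "geneB": {"g1", "g2", "g3", "g5"},
    "geneC": {"g2", "g4"},
}
pairs = co_occurring_pairs(profiles)
```

Here only the geneA/geneB pair passes the threshold, which is the kind of signal a phylogenetic-profile method treats as a hint of functional interaction.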
A Limited Role for Gene Duplications in the Evolution of Platypus Venom

PubMed Central

Wong, Emily S. W.; Papenfuss, Anthony T.; Whittington, Camilla M.; Warren, Wesley C.; Belov, Katherine

2012-01-01

Gene duplication followed by adaptive selection is believed to be the primary driver of venom evolution. However, to date, no studies have evaluated the importance of gene duplications for venom evolution using a genomic approach. The availability of a sequenced genome and a venom gland transcriptome for the enigmatic platypus provides a unique opportunity to explore the role that gene duplication plays in venom evolution. Here, we identify gene duplication events and correlate them with expressed transcripts in an in-season venom gland. Gene duplicates (1,508) were identified. These duplicated pairs (421), including genes that have undergone multiple rounds of gene duplications, were expressed in the venom gland. The majority of these genes are involved in metabolism and protein synthesis not toxin functions. Twelve secretory genes including serine proteases, metalloproteinases, and protease inhibitors likely to produce symptoms of envenomation such as vasodilation and pain were detected. Only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication. Platypus venom C-type natriuretic peptides and nerve growth factor do not possess lineage-specific gene duplicates. Extensive duplications, believed to increase the potency of toxic content and promote toxin diversification, were not found. This is the first study to take a genome-wide approach in order to examine the impact of gene duplication on venom evolution. Our findings support the idea that adaptive selection acts on gene duplicates to drive the independent evolution and functional diversification of similar venom genes in venomous species. However, gene duplications alone do not explain the “venome” of the platypus. Other mechanisms, such as alternative splicing and mutation, may be important in venom innovation. PMID:21816864
Transposable elements contribute to activation of maize genes in response to abiotic stress.

PubMed

Makarevitch, Irina; Waters, Amanda J; West, Patrick T; Stitzer, Michelle; Hirsch, Candice N; Ross-Ibarra, Jeffrey; Springer, Nathan M

2015-01-01

Transposable elements (TEs) account for a large portion of the genome in many eukaryotic species. Despite their reputation as "junk" DNA or genomic parasites deleterious for the host, TEs have complex interactions with host genes and the potential to contribute to regulatory variation in gene expression. It has been hypothesized that TEs and genes they insert near may be transcriptionally activated in response to stress conditions. The maize genome, with many different types of TEs interspersed with genes, provides an ideal system to study the genome-wide influence of TEs on gene regulation. To analyze the magnitude of the TE effect on gene expression response to environmental changes, we profiled gene and TE transcript levels in maize seedlings exposed to a number of abiotic stresses. Many genes exhibit up- or down-regulation in response to these stress conditions. The analysis of TE families inserted within upstream regions of up-regulated genes revealed that between four and nine different TE families are associated with up-regulated gene expression in each of these stress conditions, affecting up to 20% of the genes up-regulated in response to abiotic stress, and as many as 33% of genes that are only expressed in response to stress. Expression of many of these same TE families also responds to the same stress conditions. The analysis of the stress-induced transcripts and proximity of the transposon to the gene suggests that these TEs may provide local enhancer activities that stimulate stress-responsive gene expression. Our data on allelic variation for insertions of several of these TEs show strong correlation between the presence of TE insertions and stress-responsive up-regulation of gene expression. Our findings suggest that TEs provide an important source of allelic regulatory variation in gene response to abiotic stress in maize.
Genome-wide analysis and identification of stress-responsive genes of the NAM-ATAF1,2-CUC2 transcription factor family in apple.

PubMed

Su, Hongyan; Zhang, Shizhong; Yuan, Xiaowei; Chen, Changtian; Wang, Xiao-Fei; Hao, Yu-Jin

2013-10-01

NAC (NAM, ATAF1,2, and CUC2) proteins constitute one of the largest families of plant-specific transcription factors. To date, little is known about the NAC genes in the apple (Malus domestica). In this study, a total of 180 NAC genes were identified in the apple genome and were phylogenetically clustered into six groups (I-VI) with the NAC genes from Arabidopsis and rice. The predicted apple NAC genes were distributed across all of 17 chromosomes at various densities. Additionally, the gene structure and motif compositions of the apple NAC genes were analyzed. Moreover, the expression of 29 selected apple NAC genes was analyzed in different tissues and under different abiotic stress conditions. All of the selected genes, with the exception of four genes, were expressed in at least one of the tissues tested, which indicates that the NAC genes are involved in various aspects of the physiological and developmental processes of the apple. Encouragingly, 17 of the selected genes were found to respond to one or more of the abiotic stress treatments, and these 17 genes included not only the expected 7 genes that were clustered with the well-known stress-related marker genes in group IV but also 10 genes located in other subgroups, none of which contains members that have been reported to be stress-related. To the best of our knowledge, this report describes the first genome-wide analysis of the apple NAC gene family, and the results should provide valuable information for understanding the classification and putative functions of this family. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Abundance and Distribution of Dimethylsulfoniopropionate Degradation Genes and the Corresponding Bacterial Community Structure at Dimethyl Sulfide Hot Spots in the Tropical and Subtropical Pacific Ocean

PubMed Central

Suzuki, Shotaro; Omori, Yuko; Wong, Shu-Kuan; Ijichi, Minoru; Kaneko, Ryo; Kameyama, Sohiko; Tanimoto, Hiroshi; Hamasaki, Koji

2015-01-01

Dimethylsulfoniopropionate (DMSP) is mainly produced by marine phytoplankton but is released into the microbial food web and degraded by marine bacteria to dimethyl sulfide (DMS) and other products. To reveal the abundance and distribution of bacterial DMSP degradation genes and the corresponding bacterial communities in relation to DMS and DMSP concentrations in seawater, we collected surface seawater samples from DMS hot spot sites during a cruise across the Pacific Ocean. We analyzed the genes encoding DMSP lyase (dddP) and DMSP demethylase (dmdA), which are responsible for the transformation of DMSP to DMS and DMSP assimilation, respectively. The averaged abundance (±standard deviation) of these DMSP degradation genes relative to that of the 16S rRNA genes was 33% ± 12%. The abundances of these genes showed large spatial variations. dddP genes showed more variation in abundances than dmdA genes. Multidimensional analysis based on the abundances of DMSP degradation genes and environmental factors revealed that the distribution pattern of these genes was influenced by chlorophyll a concentrations and temperatures. dddP genes, dmdA subclade C/2 genes, and dmdA subclade D genes exhibited significant correlations with the marine Roseobacter clade, SAR11 subgroup Ib, and SAR11 subgroup Ia, respectively. SAR11 subgroups Ia and Ib, which possessed dmdA genes, were suggested to be the main potential DMSP consumers. The Roseobacter clade members possessing dddP genes in oligotrophic subtropical regions were possible DMS producers. These results suggest that DMSP degradation genes are abundant and widely distributed in the surface seawater and that the marine bacteria possessing these genes influence the degradation of DMSP and regulate the emissions of DMS in subtropical gyres of the Pacific Ocean. PMID:25862229
Leader genes in osteogenesis: a theoretical study.

PubMed

Orlando, Bruno; Giacomelli, Luca; Ricci, Massimiliano; Barone, Antonio; Covani, Ugo

2013-01-01

Little is still known about the molecular mechanisms involved in the process of osteogenesis. In this paper, the leader genes approach, a new bioinformatics method which has already been experimentally validated, is adopted in order to identify the genes involved in human osteogenesis. Interactions among genes are then calculated and genes are ranked according to their relative importance in this process. In total, 167 genes were identified as being involved in osteogenesis. Genes were divided into 4 groups, according to their main function in the osteogenic processes: skeletal development; cell adhesion and proliferation; ossification; and calcium ion binding. Seven genes were consistently identified as leader genes (i.e. the genes with the greatest importance in osteogenesis), while 14 were found to have slightly less importance (class B genes). It was interesting to notice that the larger part of leader and class B genes belonged to the cell adhesion and proliferation or to the ossification sub-groups. This finding suggested that these two particular sub-processes could play a more important role in osteogenesis. Moreover, among the 7 leader genes, it is interesting to notice that RUNX2, BMP2, SPARC, PTH play a direct role in bone formation, while the 3 other leader genes (VEGF, IL6, FGF2) seem to be more connected with an angiogenetic process. Twenty-nine genes have no known interactions (orphan genes). From these results, it may be possible to plan an ad hoc experimentation, for instance by microarray analyses, focused on leader, class B and orphan genes, with the aim to shed new light on the molecular mechanisms underlying osteogenesis. Copyright © 2012 Elsevier Ltd. All rights reserved.
Evolutionary origin and functional divergence of totipotent cell homeobox genes in eutherian mammals.

PubMed

Maeso, Ignacio; Dunwell, Thomas L; Wyatt, Chris D R; Marlétaz, Ferdinand; Vető, Borbála; Bernal, Juan A; Quah, Shan; Irimia, Manuel; Holland, Peter W H

2016-06-13

A central goal of evolutionary biology is to link genomic change to phenotypic evolution. The origin of new transcription factors is a special case of genomic evolution since it brings opportunities for novel regulatory interactions and potentially the emergence of new biological properties. We demonstrate that a group of four homeobox gene families (Argfx, Leutx, Dprx, Tprx), plus a gene newly described here (Pargfx), arose by tandem gene duplication from the retinal-expressed Crx gene, followed by asymmetric sequence evolution. We show these genes arose as part of repeated gene gain and loss events on a dynamic chromosomal region in the stem lineage of placental mammals, on the forerunner of human chromosome 19. The human orthologues of these genes are expressed specifically in early embryo totipotent cells, peaking from 8-cell to morula, prior to cell fate restrictions; cow orthologues have similar expression. To examine biological roles, we used ectopic gene expression in cultured human cells followed by high-throughput RNA-seq and uncovered extensive transcriptional remodelling driven by three of the genes. Comparison to transcriptional profiles of early human embryos suggest roles in activating and repressing a set of developmentally-important genes that spike at 8-cell to morula, rather than a general role in genome activation. We conclude that a dynamic chromosome region spawned a set of evolutionarily new homeobox genes, the ETCHbox genes, specifically in eutherian mammals. After these genes diverged from the parental Crx gene, we argue they were recruited for roles in the preimplantation embryo including activation of genes at the 8-cell stage and repression after morula. We propose these new homeobox gene roles permitted fine-tuning of cell fate decisions necessary for specification and function of embryonic and extra-embryonic tissues utilised in mammalian development and pregnancy.
A novel MADS-box gene subfamily with a sister-group relationship to class B floral homeotic genes.

PubMed

Becker, A; Kaufmann, K; Freialdenhoven, A; Vincent, C; Li, M-A; Saedler, H; Theissen, G

2002-02-01

Class B floral homeotic genes specify the identity of petals and stamens during the development of angiosperm flowers. Recently, putative orthologs of these genes have been identified in different gymnosperms. Together, these genes constitute a clade, termed B genes. Here we report that diverse seed plants also contain members of a hitherto unknown sister clade of the B genes, termed B(sister) (B(s)) genes. We have isolated members of the B(s) clade from the gymnosperm Gnetum gnemon, the monocotyledonous angiosperm Zea mays and the eudicots Arabidopsis thaliana and Antirrhinum majus. In addition, MADS-box genes from the basal angiosperm Asarum europaeum and the eudicot Petunia hybrida were identified as B(s) genes. Comprehensive expression studies revealed that B(s) genes are mainly transcribed in female reproductive organs (ovules and carpel walls). This is in clear contrast to the B genes, which are predominantly expressed in male reproductive organs (and in angiosperm petals). Our data suggest that the B(s) genes played an important role during the evolution of the reproductive structures in seed plants. The establishment of distinct B and B(s) gene lineages after duplication of an ancestral gene may have accompanied the evolution of male microsporophylls and female megasporophylls 400-300 million years ago. During flower evolution, expression of B(s) genes diversified, but the focus of expression remained in female reproductive organs. Our findings imply that a clade of highly conserved close relatives of class B floral homeotic genes has been completely overlooked until recently and awaits further evaluation of its developmental and evolutionary importance. Electronic supplementary material to this paper can be obtained by using the Springer Link server located at http://dx.doi.org/10.1007/s00438-001-0615-8.

Identification of learning and memory genes in canine; promoter investigation and determining the selective pressure.

PubMed

Seifi Moroudi, Reihane; Masoudi, Ali Akbar; Vaez Torshizi, Rasoul; Zandi, Mohammad

2014-12-01

One of the important behaviors of dogs is trainability, which is affected by learning and memory genes. Such genes have not yet been identified in dogs. In the current research, these genes were found in animal models by mining biological data and the scientific literature. The proteins encoded by these genes in dogs and humans were obtained from the UniProt database. Because not all homologous proteins perform similar functions, these proteins were compared in terms of protein families, domains, biological processes, molecular functions, and cellular location of metabolic pathways using the InterPro, KEGG, QuickGO and PSORT databases. The results showed that some of these proteins have the same function in the rat or mouse, dog, and human. It is anticipated that the proteins encoded by these genes may be involved in learning and memory in dogs. Then, the expression pattern of the identified genes was investigated in the dog hippocampus using the existing information in GEO Profiles. The results showed that the BDNF, TAC1 and CCK genes are expressed in the dog hippocampus; therefore, these genes could be strong candidates associated with learning and memory in dogs. Subsequently, due to the importance of promoter regions in gene function, this region was investigated in the above genes. Analysis of the promoters indicated that the HNF-4 site of the BDNF gene and the transcription start site of the CCK gene are exposed to methylation. Phylogenetic analysis of the protein sequences of these genes showed high similarity in each of these three genes among the studied species. The dN/dS ratio for the BDNF, TAC1 and CCK genes indicates purifying selection during the evolution of the genes.
MicroRNA-integrated and network-embedded gene selection with diffusion distance.

PubMed

Huang, Di; Zhou, Xiaobo; Lyon, Christopher J; Hsueh, Willa A; Wong, Stephen T C

2010-10-29

Gene network information has been used to improve gene selection in microarray-based studies by selecting marker genes based both on their expression and the coordinate expression of genes within their gene network under a given condition. Here we propose a new network-embedded gene selection model. In this model, we first address the limitations of microarray data. Microarray data, although widely used for gene selection, measure only mRNA abundance, which does not always reflect the ultimate gene phenotype, since it does not account for post-transcriptional effects. To overcome this important (critical in certain cases) limitation, which is ignored in almost all existing studies, we design a new strategy to integrate microarray data with information on microRNA, the major post-transcriptional regulatory factor. We also handle the challenges posed by gene collaboration mechanisms. To incorporate the biological facts that genes without direct interactions may work closely due to signal transduction and that two genes may be functionally connected through multiple paths, we adopt the concept of diffusion distance. This concept permits us to simulate biological signal propagation and therefore to estimate the collaboration probability for all gene pairs, directly or indirectly connected, according to the multiple paths connecting them. We demonstrate, using type 2 diabetes (DM2) as an example, that the proposed strategies can enhance the identification of functional gene partners, which is the key issue in a network-embedded gene selection model. More importantly, we show that our gene selection model outperforms related ones. Genes selected by our model 1) have improved classification capability; 2) agree with biological evidence of DM2-association; and 3) are involved in many well-known DM2-associated pathways.
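The idea of diffusion distance mentioned in this abstract can be sketched on a toy network. The sketch below assumes the common random-walk formulation: with P the row-normalized transition matrix of the network and pi its stationary distribution, the squared distance between nodes i and j after t steps is the sum over k of (P^t[i][k] - P^t[j][k])^2 / pi[k]. This is not necessarily the exact formulation or weighting used by the authors, and the five-node adjacency matrix is hypothetical.

```python
import math


def transition_matrix(adj):
    """Row-normalize an adjacency matrix into random-walk probabilities."""
    return [[w / sum(row) for w in row] for row in adj]


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, t):
    """Raise a square matrix to the integer power t (t >= 1)."""
    result = m
    for _ in range(t - 1):
        result = mat_mul(result, m)
    return result


def diffusion_distance(adj, i, j, t=2):
    """Diffusion distance between nodes i and j after t random-walk steps.

    Assumes an undirected network in which every node has at least one edge.
    """
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    pi = [d / total for d in degrees]  # stationary distribution
    pt = mat_pow(transition_matrix(adj), t)
    return math.sqrt(
        sum((pt[i][k] - pt[j][k]) ** 2 / pi[k] for k in range(len(adj)))
    )


# Hypothetical network: genes 0 and 1 share no direct edge but both
# interact with genes 2 and 3; gene 4 interacts only with gene 3.
adj = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
d_shared = diffusion_distance(adj, 0, 1)   # indirectly connected via two paths
d_distant = diffusion_distance(adj, 0, 4)  # weaker, single-path connection
```

Genes 0 and 1 come out as very close even though they have no direct interaction, because every path out of one mirrors a path out of the other; this is the behavior the abstract appeals to when it speaks of genes connected through multiple paths.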
Genome-wide identification, phylogenetic classification, and exon-intron structure characterisation of the tubulin and actin genes in flax (Linum usitatissimum).

PubMed

Pydiura, Nikolay; Pirko, Yaroslav; Galinousky, Dmitry; Postovoitova, Anastasiia; Yemets, Alla; Kilchevsky, Aleksandr; Blume, Yaroslav

2018-06-08

Flax (Linum usitatissimum L.) is a valuable food and fiber crop cultivated for its quality fiber and seed oil. α-, β-, γ-tubulins and actins are the main structural proteins of the cytoskeleton. α- and γ-tubulin and actin genes have not been characterized yet in the flax genome. In this study, we have identified 6 α-tubulin genes, 13 β-tubulin genes, 2 γ-tubulin genes, and 15 actin genes in the flax genome and analysed the phylogenetic relationships between flax and A. thaliana tubulin and actin genes. Six α-tubulin genes are represented by 3 paralogous pairs; among 13 β-tubulin genes 7 different isotypes can be distinguished, 6 of which are encoded by two paralogous genes each. γ-tubulin is represented by a paralogous pair of genes, one of which may not be functional. Fifteen actin genes represent 7 paralogous pairs - 7 actin isotypes and a sequentially duplicated copy of one of the genes of one of the isotypes. Exon-intron structure analysis has shown intron length polymorphism within the β-tubulin genes and intron number variation among the α-tubulin genes: 3 or 4 introns are found in two or four genes, respectively. Intron positioning occurs at conservative sites, as observed in numerous other plant species. Flax actin genes show both intron length polymorphisms and variation in the number of introns, which may be 2 or 3. These data will be useful to support further studies on the specificity, functioning, regulation and evolution of the flax cytoskeleton proteins.
Mutated-leptin gene transfer induces increases in body weight by electroporation and hydrodynamics-based gene delivery in mice.

PubMed

Xiang, Lan; Murai, Atsushi; Muramatsu, Tatsuo

2005-12-01

To investigate whether in vivo gene transfer causes leptin-antagonistic effects on food intake, animal body weight and fat tissue weight, the R128Q mutated-leptin gene, an R to Q substitution at position 128 of mouse leptin, was transferred into mouse liver and leg muscle by electroporation and hydrodynamics-based gene delivery. Mutated-leptin gene transfer by electroporation caused significant increases in body weight from day 5 onward (5.4% increase relative to control; p<0.05). Hydrodynamics-based gene delivery of the mutated-leptin gene also caused an increase in body weight (3.0% increase relative to control; p<0.05). Mutated-leptin gene transfer by electroporation significantly increased the tissue weight of epididymal white fat and neuropeptide Y mRNA expression in the hypothalamus compared with those of the control group 3 weeks after gene transfer (p<0.05). These results suggest that mutated-leptin gene transfer successfully produced leptin-antagonistic effects by modulating the central regulator of energy homeostasis. Also, the extent of the leptin-antagonistic effects was much greater with electroporation than with hydrodynamics-based gene delivery, at least with a single gene transfer.
On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.

PubMed

Kordi, Misagh; Bansal, Mukul S

2017-01-01

Duplication-Transfer-Loss (DTL) reconciliation has emerged as a powerful technique for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation takes as input a gene family phylogeny and the corresponding species phylogeny, and reconciles the two by postulating speciation, gene duplication, horizontal gene transfer, and gene loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. However, gene trees are frequently non-binary. With such non-binary gene trees, the reconciliation problem seeks to find a binary resolution of the gene tree that minimizes the reconciliation cost. Given the prevalence of non-binary gene trees, many efficient algorithms have been developed for this problem in the context of the simpler Duplication-Loss (DL) reconciliation model. Yet, no efficient algorithms exist for DTL reconciliation with non-binary gene trees and the complexity of the problem remains unknown. In this work, we resolve this open question by showing that the problem is, in fact, NP-hard. Our reduction applies to both the dated and undated formulations of DTL reconciliation. By resolving this long-standing open problem, this work will spur the development of both exact and heuristic algorithms for this important problem.
Transcriptome-wide selection of a reliable set of reference genes for gene expression studies in potato cyst nematodes (Globodera spp.).

PubMed

Sabeh, Michael; Duceppe, Marc-Olivier; St-Arnaud, Marc; Mimee, Benjamin

2018-01-01

Relative gene expression analyses by qRT-PCR (quantitative reverse transcription PCR) require an internal control to normalize the expression data of genes of interest and eliminate the unwanted variation introduced by sample preparation. A perfect reference gene should have a constant expression level under all the experimental conditions. However, the same few housekeeping genes selected from the literature or successfully used in previous unrelated experiments are often routinely used in new conditions without proper validation of their stability across treatments. The advent of RNA-Seq and the availability of public datasets for numerous organisms are opening the way to finding better reference genes for expression studies. Globodera rostochiensis is a plant-parasitic nematode that is particularly yield-limiting for potato. The aim of our study was to identify a reliable set of reference genes to study G. rostochiensis gene expression. Gene expression levels from an RNA-Seq database were used to identify putative reference genes and were validated with qRT-PCR analysis. Three genes, GR, PMP-3, and aaRS, were found to be very stable within the experimental conditions of this study and are proposed as reference genes for future work.
Revealing Alzheimer's disease genes spectrum in the whole-genome by machine learning.

PubMed

Huang, Xiaoyan; Liu, Hankui; Li, Xinming; Guan, Liping; Li, Jiankang; Tellier, Laurent Christian Asker M; Yang, Huanming; Wang, Jian; Zhang, Jianguo

2018-01-10

Alzheimer's disease (AD) is an important, progressive neurodegenerative disease, with a complex genetic architecture. A key goal of biomedical research is to seek out disease risk genes, and to elucidate the function of these risk genes in the development of disease. For this purpose, expanding the AD-associated gene set is necessary. In past research, prediction methods for AD-related genes have been limited in their exploration of the target genome regions. We here present a genome-wide method for AD candidate gene prediction. We present a machine learning approach (SVM), based upon integrating gene expression data with human brain-specific gene network data, to discover the full spectrum of AD genes across the whole genome. We classified AD candidate genes with an accuracy and an area under the receiver operating characteristic (ROC) curve of 84.56% and 94%, respectively. Our approach provides a supplement for the spectrum of AD-associated genes extracted from more than 20,000 genes on a genome-wide scale. In this study, we have elucidated the whole-genome spectrum of AD, using a machine learning approach. Through this method, we expect the candidate gene catalogue to provide a more comprehensive annotation of AD for researchers.
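The two figures reported in this abstract, accuracy and area under the ROC curve, measure different things, and a small sketch may make the distinction concrete. The code below is not the authors' SVM pipeline; it only shows how the two metrics could be computed from classifier outputs. The labels, scores and the 0.5 decision threshold are invented for illustration.

```python
# Minimal sketch: accuracy versus ROC AUC for a binary gene classifier.
# Assumption: labels are 1 (disease gene) or 0 (non-disease gene) and the
# classifier returns a real-valued score per gene. All data here is made up.

def accuracy(labels, predictions):
    """Fraction of genes whose predicted class equals the true class."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical cut-off
    print(accuracy(labels, predictions))  # depends on the chosen threshold
    print(roc_auc(labels, scores))        # threshold-free ranking quality
```

Accuracy depends on where the decision threshold is placed, while the AUC summarises ranking quality across all thresholds, which is why the two numbers can differ for the same classifier.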
High level of microsynteny and purifying selection affect the evolution of WRKY family in Gramineae.

PubMed

Jin, Jing; Kong, Jingjing; Qiu, Jianle; Zhu, Huasheng; Peng, Yuancheng; Jiang, Haiyang

2016-01-01

The WRKY gene family, which encodes proteins in the regulation processes of diverse developmental stages, is one of the largest families of transcription factors in higher plants. In this study, by searching for interspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found 35 chromosomal segments of subgroup I genes of WRKY family (WRKY I) in four Gramineae species (Brachypodium, rice, sorghum, and maize) formed eight orthologous groups. After a stepwise gene-by-gene reciprocal comparison of all the protein sequences in the WRKY I gene flanking areas, highly conserved regions of microsynteny were found in the four Gramineae species. Most gene pairs showed conserved orientation within syntenic genome regions. Furthermore, tandem duplication events played the leading role in gene expansion. Eventually, environmental selection pressure analysis indicated strong purifying selection for the WRKY I genes in Gramineae, which may have been followed by gene loss and rearrangement. The results presented in this study provide basic information of Gramineae WRKY I genes and form the foundation for future functional studies of these genes. High level of microsynteny in the four grass species provides further evidence that a large-scale genome duplication event predated speciation.
Identifying osteosarcoma metastasis associated genes by weighted gene co-expression network analysis (WGCNA).

PubMed

Tian, Honglai; Guan, Donghui; Li, Jianmin

2018-06-01

Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS occurrence usually correlates with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. Based on the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis, and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to investigate the most OS metastasis-correlated module. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes from the OS metastasis and OS non-metastasis groups. Based on these selected genes, WGCNA further identified 142 genes included in the most OS metastasis-correlated module. Gene Ontology functional and KEGG pathway enrichment analyses showed that significantly OS metastasis-associated genes were involved in pathways correlated with insulin-like growth factor binding. Our research identified several potential molecules participating in the metastasis process and factors acting as biomarkers. With this study, we could better explore the mechanism of OS metastasis and further discover more therapy targets.
Using the gene ontology for microarray data mining: a comparison of methods and application to age effects in human prefrontal cortex.

PubMed

Pavlidis, Paul; Qin, Jie; Arango, Victoria; Mann, John J; Sibille, Etienne

2004-06-01

One of the challenges in the analysis of gene expression data is placing the results in the context of other data available about genes and their relationships to each other. Here, we approach this problem in the study of gene expression changes associated with age in two areas of the human prefrontal cortex, comparing two computational methods. The first method, "overrepresentation analysis" (ORA), is based on statistically evaluating the fraction of genes in a particular gene ontology class found among the set of genes showing age-related changes in expression. The second method, "functional class scoring" (FCS), examines the statistical distribution of individual gene scores among all genes in the gene ontology class and does not involve an initial gene selection step. We find that FCS yields more consistent results than ORA, and the results of ORA depended strongly on the gene selection threshold. Our findings highlight the utility of functional class scoring for the analysis of complex expression data sets and emphasize the advantage of considering all available genomic information rather than sets of genes that pass a predetermined "threshold of significance."
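The overrepresentation analysis (ORA) described above is commonly implemented as a one-sided hypergeometric test, and the sketch below assumes that form; the abstract does not state which test statistic the authors used. The gene counts in the example are invented. Functional class scoring (FCS) is not shown.

```python
# Minimal sketch of overrepresentation analysis (ORA) for one gene ontology class.
# Assumption: a one-sided hypergeometric test; all counts below are hypothetical.
from math import comb

def ora_p_value(n_genes, n_in_class, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random.

    n_genes     -- genes on the array
    n_in_class  -- genes annotated to the ontology class
    n_selected  -- genes passing the selection threshold
    n_overlap   -- selected genes that belong to the class
    """
    upper = min(n_in_class, n_selected)
    favourable = sum(
        comb(n_in_class, k) * comb(n_genes - n_in_class, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_genes, n_selected)

if __name__ == "__main__":
    # Same class, two selection thresholds: the result shifts with the cut-off.
    print(ora_p_value(n_genes=1000, n_in_class=40, n_selected=50, n_overlap=6))
    print(ora_p_value(n_genes=1000, n_in_class=40, n_selected=200, n_overlap=10))
```

Because `n_selected` and `n_overlap` both change when the selection threshold moves, the p-value for a class can swing considerably, which is consistent with the threshold sensitivity the abstract reports for ORA.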
Evidence for a large expansion and subfunctionalisation of globin genes in sea anemones.

PubMed

Smith, Hayden L; Pavasovic, Ana; Surm, Joachim M; Phillips, Matthew J; Prentis, Peter J

2018-06-27

The globin gene superfamily has been well-characterised in vertebrates, however, there has been limited research in early-diverging lineages, such as phylum Cnidaria. This study aimed to identify globin genes in multiple cnidarian lineages, and use bioinformatic approaches to characterise the evolution, structure and expression of these genes. Phylogenetic analyses and in silico protein predictions showed that all cnidarians have undergone an expansion of globin genes, which likely have a hexacoordinate protein structure. Our protein modelling has also revealed the possibility of a single pentacoordinate globin lineage in anthozoan species. Some cnidarian globin genes displayed tissue and development specific expression with very few orthologous genes similarly expressed across species. Our phylogenetic analyses also revealed that eumetazoan globin genes form a polyphyletic relationship with vertebrate globin genes. Overall, our analyses suggest that a Ngb-like and GbX-like gene were most likely present in the globin gene repertoire for the last common ancestor of eumetazoans. The identification of a large-scale expansion and subfunctionalisation of globin genes in actiniarians provides an excellent starting point to further our understanding of the evolution and function of the globin gene superfamily in early-diverging lineages.
Gene Delivery in Neuro-Oncology.

PubMed

Dixit, Karan; Kumthekar, Priya

2017-09-02

Glioblastoma multiforme (GBM) is the most common primary malignant brain tumor in adults, with a dismal prognosis despite aggressive multimodal management; thus, novel treatments are urgently needed. Gene therapy is a versatile treatment strategy being investigated in multiple cancers including GBM. In gene therapy, a variety of vectors or "carriers" are used to deliver genes designed for different anti-tumoral effects. Gene delivery vehicles and approaches to treatment will be addressed in this review. The most commonly studied vectors are viral based; however, driven by advances in biomedical engineering, mesenchymal and neural stem cells, as well as multiple different types of nanoparticles, have been developed to improve tumor tropism and also increase gene transfer into tumor cells. Different genes have been studied including suicide genes, which convert a non-toxic prodrug into a cytotoxic drug; immunomodulatory genes, which stimulate the immune system; and tumor suppressor genes, which repair the defect that allows cells to divide unchecked. Gene therapy may be a promising treatment strategy in neuro-oncology as it is versatile and flexible due to the ability to tailor vectors and genes for specific therapeutic activity. Pre-clinical studies and clinical trials have demonstrated feasibility and safety of gene therapy; however, further studies are required to determine efficacy.
How controlled release technology can aid gene delivery.

PubMed

Jo, Jun-Ichiro; Tabata, Yasuhiko

2015-01-01

Many types of gene delivery systems have been developed to enhance the level of gene expression. Controlled release technology is a feasible gene delivery system which enables genes to extend the expression duration by maintaining and releasing them at the injection site in a controlled manner. This technology can reduce the adverse effects by the bolus dose administration and avoid the repeated administration. Biodegradable biomaterials are useful as materials for the controlled release-based gene delivery technology and various biodegradable biomaterials have been developed. Controlled release-based gene delivery plays a critical role in a conventional gene therapy and genetic engineering. In the gene therapy, the therapeutic gene is released from biodegradable biomaterial matrices around the tissue to be treated. On the other hand, the intracellular controlled release of gene from the sub-micro-sized matrices is required for genetic engineering. Genetic engineering is feasible for cell transplantation as well as research of stem cells biology and medicine. DNA hydrogel containing a sequence of therapeutic gene and the exosome including the individual specific nucleic acids may become candidates for controlled release carriers. Technologies to deliver genes to cell aggregates will play an important role in the promotion of regenerative research and therapy.
Organization of the capsule biosynthesis gene locus of the oral streptococcus Streptococcus anginosus.

PubMed

Tsunashima, Hiroyuki; Miyake, Katsuhide; Motono, Makoto; Iijima, Shinji

2012-03-01

The capsular polysaccharide (CPS) of the important oral streptococcus Streptococcus anginosus, which causes endocarditis, and the genes for its synthesis have not been clarified. In this study, we investigated the gene locus required for CPS synthesis in S. anginosus. Southern hybridization using the cpsE gene of the well-characterized bacterium S. agalactiae revealed that there is a similar gene in the genome of S. anginosus. By using the colony hybridization technique and inverse PCR, we isolated the CPS synthesis (cps) genes of S. anginosus. This gene cluster consisted of genes containing typical regulatory genes, cpsA-D, and glycosyltransferase genes coding for glucose, rhamnose, N-acetylgalactosamine, and galactofuranose transferases. Furthermore, we confirmed that the cps locus is required for CPS synthesis using a mutant strain with a defective cpsE gene. The cps cluster was found to be located downstream of the nrdG gene, which encodes ribonucleoside triphosphate reductase activator, as is the case in other oral streptococci such as S. gordonii and S. sanguinis. However, the location of the gene cluster was different from those of S. pneumoniae and S. agalactiae. Copyright © 2011 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies

PubMed Central

Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay

2004-01-01

Background: Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results: We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion: GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175
DLGP: A database for lineage-conserved and lineage-specific gene pairs in animal and plant genomes.

PubMed

Wang, Dapeng

2016-01-15

The conservation of gene organization in the genome with lineage-specificity is an invaluable resource to decipher their potential functionality with diverse selective constraints, especially in higher animals and plants. Gene pairs appear to be the minimal structure for such kind of gene clusters that tend to reside in their preferred locations, representing the distinctive genomic characteristics in single species or a given lineage. Despite gene families having been investigated in a widespread manner, the definition of gene pair families in various taxa still lacks adequate attention. To address this issue, we report DLGP (http://lcgbase.big.ac.cn/DLGP/) that stores the pre-calculated lineage-based gene pairs in currently available 134 animal and plant genomes and inspect them under the same analytical framework, bringing out a set of innovational features. First, the taxonomy or lineage has been classified into four levels such as Kingdom, Phylum, Class and Order. It adopts all-to-all comparison strategy to identify the possible conserved gene pairs in all species for each gene pair in certain species and reckon those that are conserved in over a significant proportion of species in a given lineage (e.g. Primates, Diptera or Poales) as the lineage-conserved gene pairs. Furthermore, it predicts the lineage-specific gene pairs by retaining the above-mentioned lineage-conserved gene pairs that are not conserved in any other lineages. Second, it carries out pairwise comparison for the gene pairs between two compared species and creates the table including all the conserved gene pairs and the image elucidating the conservation degree of gene pairs in chromosomal level. Third, it supplies gene order browser to extend gene pairs to gene clusters, allowing users to view the evolution dynamics in the gene context in an intuitive manner. 
This database will be able to facilitate the particular comparison between animals and plants, between vertebrates and arthropods, and between monocots and eudicots, accounting for the significant contribution of gene pairs to speciation and diversification in specific lineages. Copyright © 2015 Elsevier Inc. All rights reserved.
Construction of diagnosis system and gene regulatory networks based on microarray analysis.

PubMed

Hong, Chun-Fu; Chen, Ying-Chen; Chen, Wei-Chun; Tu, Keng-Chang; Tsai, Meng-Hsiun; Chan, Yung-Kuan; Yu, Shyr Shen

2018-05-01

A microarray analysis generally contains expression data of thousands of genes, but most of them are irrelevant to the disease of interest, which complicates the analysis of genes related to specific diseases. Therefore, filtering out a few essential genes as well as their regulatory networks is critical, and a disease can be easily diagnosed just depending on the expression profiles of a few critical genes. In this study, a target gene screening (TGS) system, which is a microarray-based information system that integrates F-statistics, pattern recognition matching, a two-layer K-means classifier, a Parameter Detection Genetic Algorithm (PDGA), a genetic-based gene selector (GBG selector) and the association rule, was developed to screen out a small subset of genes that can discriminate malignant stages of cancers. During the first stage, F-statistics, pattern recognition matching, and a two-layer K-means classifier were applied in the system to filter out the 20 critical genes most relevant to ovarian cancer from 9600 genes, and the PDGA was used to decide the fittest values of the parameters for these critical genes. Among the 20 critical genes, 15 are associated with cancer progression. In the second stage, we further employed a GBG selector and the association rule to screen out seven target gene sets, each with only four to six genes, and each of which can precisely identify the malignancy stage of ovarian cancer based on their expression profiles. We further deduced the gene regulatory networks of the 20 critical genes by applying the Pearson correlation coefficient to evaluate the correlation between the expression of each gene at the same stages and at different stages. Correlations between gene pairs were calculated, and then, three regulatory networks were deduced. These correlations were further confirmed by the Ingenuity pathway analysis. 
The prognostic significances of the genes identified via regulatory networks were examined using online tools, and most represented biomarker candidates. In summary, our proposed system provides a new strategy to identify critical genes or biomarkers, as well as their regulatory networks, from microarray data. Copyright © 2018. Published by Elsevier Inc.
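The network-deduction step in this abstract rests on the Pearson correlation coefficient between gene expression profiles. The sketch below shows that calculation and a simple way of keeping strongly correlated pairs as candidate network edges. The gene names, expression values and the 0.9 cut-off are hypothetical; the abstract does not report the threshold the authors applied.

```python
# Minimal sketch: Pearson correlation between expression profiles, then a
# thresholded list of gene pairs as candidate network edges.
# Assumption: gene names, values and the 0.9 threshold are illustrative only.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlated_pairs(profiles, threshold=0.9):
    """Return gene pairs whose absolute correlation reaches the threshold."""
    genes = sorted(profiles)
    return [
        (g, h)
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
        if abs(pearson(profiles[g], profiles[h])) >= threshold
    ]

if __name__ == "__main__":
    profiles = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.5],
        "geneC": [4.0, 1.0, 3.0, 2.0],
    }
    print(correlated_pairs(profiles))
```

Correlation alone gives undirected edges, so a regulatory direction still has to come from other evidence, such as the pathway analysis mentioned in the abstract.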
Transcriptional responses in thyroid tissues from rats treated with a tumorigenic and a non-tumorigenic triazole conazole fungicide.

PubMed

Hester, Susan D; Nesnow, Stephen

2008-03-15

Conazoles are azole-containing fungicides that are used in agriculture and medicine. Conazoles can induce follicular cell adenomas of the thyroid in rats after chronic bioassay. The goal of this study was to identify pathways and networks of genes that were associated with thyroid tumorigenesis through transcriptional analyses. To this end, we compared transcriptional profiles from tissues of rats treated with a tumorigenic and a non-tumorigenic conazole. Triadimefon, a rat thyroid tumorigen, and myclobutanil, which was not tumorigenic in rats after a 2-year bioassay, were administered in the feed to male Wistar/Han rats for 30 or 90 days similar to the treatment conditions previously used in their chronic bioassays. Thyroid gene expression was determined using high density Affymetrix GeneChips (Rat 230_2). Gene expression was analyzed by the Gene Set Expression Analyses method which clearly separated the tumorigenic treatments (tumorigenic response group (TRG)) from the non-tumorigenic treatments (non-tumorigenic response group (NRG)). Core genes from these gene sets were mapped to canonical, metabolic, and GeneGo processes and these processes compared across group and treatment time. Extensive analyses were performed on the 30-day gene sets as they represented the major perturbations. Gene sets in the 30-day TRG group had over representation of fatty acid metabolism, oxidation, and degradation processes (including PPARgamma and CYP involvement), and of cell proliferation responses. Core genes from these gene sets were combined into networks and found to possess signaling interactions. In addition, the core genes in each gene set were compared with genes known to be associated with human thyroid cancer. Among the genes that appeared in both rat and human data sets were: Acaca, Asns, Cebpg, Crem, Ddit3, Gja1, Grn, Jun, Junb, and Vegf. These genes were major contributors in the previously developed network from triadimefon-treated rat thyroids. 
It is postulated that triadimefon induces oxidative response genes and activates the nuclear receptor, Ppargamma, initiating transcription of gene products and signaling to a series of genes involved in cell proliferation.
Bacterial avirulence genes.

PubMed

Leach, J E; White, F F

1996-01-01

Although more than 30 bacterial avirulence genes have been cloned and characterized, the function of the gene products in the elicitation of resistance is unknown in all cases but one. The product of avrD from Pseudomonas syringae pv. glycinea likely functions indirectly to elicit resistance in soybean, that is, evidence suggests the gene product is an enzyme involved in elicitor production. In most if not all cases, bacterial avirulence gene function is dependent on interactions with the hypersensitive response and pathogenicity (hrp) genes. Many hrp genes are similar to genes involved in delivery of pathogenicity factors in mammalian bacterial pathogens. Thus, analogies between mammalian and plant pathogens may provide needed clues to elucidate how virulence gene products control induction of resistance.
FUNCTIONAL NANOPARTICLES FOR MOLECULAR IMAGING GUIDED GENE DELIVERY

PubMed Central

Liu, Gang; Swierczewska, Magdalena; Lee, Seulki; Chen, Xiaoyuan

2010-01-01

Gene therapy has great potential to bring tremendous changes in the treatment of various diseases and disorders. However, one of the impediments to successful gene therapy is the inefficient delivery of genes to target tissues and the inability to monitor delivery of genes and therapeutic responses at the targeted site. The emergence of molecular imaging strategies has been pivotal in optimizing gene therapy, since it allows us to evaluate the effectiveness of gene delivery noninvasively and spatiotemporally. Due to the unique physicochemical properties of nanomaterials, numerous functional nanoparticles show promise in accomplishing gene delivery with the necessary feature of visualizing the delivery. In this review, recent developments of nanoparticles for molecular imaging guided gene delivery are summarized. PMID:22473061

A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer.

PubMed

Yang, Mary Qu; Li, Dan; Yang, William; Zhang, Yifan; Liu, Jun; Tong, Weida

2017-01-01

Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatic mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatic mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways.
Accelerated recruitment of new brain development genes into the human genome.

PubMed

Zhang, Yong E; Landback, Patrick; Vibranovski, Maria D; Long, Manyuan

2011-10-01

How the human brain evolved has attracted tremendous interest for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before the primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of the human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain.
Differentially expressed genes in the silk gland of silkworm (Bombyx mori) treated with TiO2 NPs.

PubMed

Xue, Bin; Li, Fanchi; Hu, Jingsheng; Tian, Jianghai; Li, Jinxin; Cheng, Xiaoyu; Hu, Jiahuan; Li, Bing

2017-05-05

The silk gland is a silkworm organ where silk proteins are synthesized and secreted. Dietary supplementation with TiO2 nanoparticles (NPs) promotes silk protein synthesis in silkworms. In this study, digital gene expression (DGE) tag profiling was used to analyze the gene expression profile of the posterior silk gland of silkworms that were fed with TiO2 NPs. In total, 5,702,823 and 6,150,719 clean tags, 55,096 and 74,715 distinct tags were detected in TiO2 NP-treated and control groups, respectively. Compared with the control, TiO2 NP-treated silkworms showed 306 differentially expressed genes, including 137 upregulated genes and 169 downregulated genes. Of these differentially expressed genes, 106 genes were related to silk protein synthesis, among which 97 genes were upregulated and 9 genes were downregulated. Pathway mapping using the Kyoto Encyclopedia of Genes and Genomes (KEGG) showed that 20 pathways were significantly enriched in TiO2 NP-treated silkworms, and the metabolic pathway-related genes were the most significantly enriched. The DGE results were verified by qRT-PCR analysis of eight differentially expressed genes. The DGE and qRT-PCR results were consistent for all three upregulated genes and three of the five downregulated genes, but the expression trends of the remaining two genes were different between qRT-PCR and DGE analysis. This study enhances our understanding of the mechanism of TiO2 NP-promoted silk protein synthesis. Copyright © 2017 Elsevier B.V. All rights reserved.
Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model.

PubMed

Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; Liu, Jun S; Zhong, Wenxuan; Ma, Ping

2016-08-26

Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data.
Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes.

PubMed

Lomsadze, Alexandre; Gemayel, Karl; Tang, Shiyuyun; Borodovsky, Mark

2018-05-17

In a conventional view of the prokaryotic genome organization, promoters precede operons and ribosome binding sites (RBSs) with Shine-Dalgarno consensus precede genes. However, recent experimental research suggesting a more diverse view motivated us to develop an algorithm with improved gene-finding accuracy. We describe GeneMarkS-2, an ab initio algorithm that uses a model derived by self-training for finding species-specific (native) genes, along with an array of precomputed "heuristic" models designed to identify harder-to-detect genes (likely horizontally transferred). Importantly, we designed GeneMarkS-2 to identify several types of distinct sequence patterns (signals) involved in gene expression control, among them the patterns characteristic for leaderless transcription as well as noncanonical RBS patterns. To assess the accuracy of GeneMarkS-2, we used genes validated by COG (Clusters of Orthologous Groups) annotation, proteomics experiments, and N-terminal protein sequencing. We observed that GeneMarkS-2 performed better on average in all accuracy measures when compared with the current state-of-the-art gene prediction tools. Furthermore, the screening of ∼5000 representative prokaryotic genomes made by GeneMarkS-2 predicted frequent leaderless transcription in both archaea and bacteria. We also observed that the RBS sites in some species with leadered transcription did not necessarily exhibit the Shine-Dalgarno consensus. The modeling of different types of sequence motifs regulating gene expression prompted a division of prokaryotic genomes into five categories with distinct sequence patterns around the gene starts. © 2018 Lomsadze et al.; Published by Cold Spring Harbor Laboratory Press.
Rapidly evolving R genes in diverse grass species confer resistance to rice blast disease

PubMed Central

Yang, Sihai; Li, Jing; Zhang, Xiaohui; Zhang, Qijun; Huang, Ju; Chen, Jian-Qun; Hartl, Daniel L.; Tian, Dacheng

2013-01-01

We show that the genomes of maize, sorghum, and brachypodium contain genes that, when transformed into rice, confer resistance to rice blast disease. The genes are resistance genes (R genes) that encode proteins with nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains (NBS–LRR proteins). By using criteria associated with rapid molecular evolution, we identified three rapidly evolving R-gene families in these species as well as in rice, and transformed a randomly chosen subset of these genes into rice strains known to be sensitive to rice blast disease caused by the fungus Magnaporthe oryzae. The transformed strains were then tested for sensitivity or resistance to 12 diverse strains of M. oryzae. A total of 15 functional blast R genes were identified among 60 NBS–LRR genes cloned from maize, sorghum, and brachypodium; and 13 blast R genes were obtained from 20 NBS–LRR paralogs in rice. These results show that abundant blast R genes occur not only within species but also among species, and that the R genes in the same rapidly evolving gene family can exhibit an effector response that confers resistance to rapidly evolving fungal pathogens. Neither conventional evolutionary conservation nor conventional evolutionary convergence supplies a satisfactory explanation of our findings. We suggest a unique mechanism termed “constrained divergence,” in which R genes and pathogen effectors can follow only limited evolutionary pathways to increase fitness. Our results open avenues for R-gene identification that will help to elucidate R-gene vs. effector mechanisms and may yield new sources of durable pathogen resistance. PMID:24145399
Validation of reference genes for quantitative gene expression analysis in experimental epilepsy.

PubMed

Sadangi, Chinmaya; Rosenow, Felix; Norwood, Braxton A

2017-12-01

To grasp the molecular mechanisms and pathophysiology underlying epilepsy development (epileptogenesis) and epilepsy itself, it is important to understand the gene expression changes that occur during these phases. Quantitative real-time polymerase chain reaction (qPCR) is a technique that rapidly and accurately determines gene expression changes. It is crucial, however, that stable reference genes are selected for each experimental condition to ensure that accurate values are obtained for genes of interest. If reference genes are unstably expressed, this can lead to inaccurate data and erroneous conclusions. To date, epilepsy studies have used mostly single, nonvalidated reference genes. This is the first study to systematically evaluate reference genes in male Sprague-Dawley rat models of epilepsy. We assessed 15 potential reference genes in hippocampal tissue obtained from 2 different models during epileptogenesis, 1 model during chronic epilepsy, and a model of noninjurious seizures. Reference gene ranking varied between models and also differed between epileptogenesis and chronic epilepsy time points. There was also some variance between the four mathematical models used to rank reference genes. Notably, we found novel reference genes to be more stably expressed than those most often used in experimental epilepsy studies. The consequence of these findings is that reference genes suitable for one epilepsy model may not be appropriate for others and that reference genes can change over time. It is, therefore, critically important to validate potential reference genes before using them as normalizing factors in expression analysis in order to ensure accurate, valid results. © 2017 Wiley Periodicals, Inc.
De Novo ORFs in Drosophila Are Important to Organismal Fitness and Evolved Rapidly from Previously Non-coding Sequences

PubMed Central

Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.

2013-01-01

How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629
Genome-wide identification, characterisation and expression analysis of the MADS-box gene family in Prunus mume.

PubMed

Xu, Zongda; Zhang, Qixiang; Sun, Lidan; Du, Dongliang; Cheng, Tangren; Pan, Huitang; Yang, Weiru; Wang, Jia

2014-10-01

MADS-box genes encode transcription factors that play crucial roles in plant development, especially in flower and fruit development. To gain insight into this gene family in Prunus mume, an important ornamental and fruit plant in East Asia, and to elucidate their roles in flower organ determination and fruit development, we performed a genome-wide identification, characterisation and expression analysis of MADS-box genes in this Rosaceae tree. In this study, 80 MADS-box genes were identified in P. mume and categorised into MIKC, Mα, Mβ, Mγ and Mδ groups based on gene structures and phylogenetic relationships. The MIKC group could be further classified into 12 subfamilies. The FLC subfamily was absent in P. mume and the six tandemly arranged DAM genes might experience a species-specific evolution process in P. mume. The MADS-box gene family might experience an evolution process from MIKC genes to Mδ genes to Mα, Mβ and Mγ genes. The expression analysis suggests that P. mume MADS-box genes have diverse functions in P. mume development and the functions of duplicated genes diverged after the duplication events. In addition to its involvement in the development of female gametophytes, type I genes also play roles in male gametophytes development. In conclusion, this study adds to our understanding of the roles that the MADS-box genes played in flower and fruit development and lays a foundation for selecting candidate genes for functional studies in P. mume and other species. Furthermore, this study also provides a basis to study the evolution of the MADS-box family.
The Inference of Gene Trees with Species Trees

PubMed Central

Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

2015-01-01

This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970
Identification and Characterization of Switchgrass Histone H3 and CENH3 Genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miao, Jiamin; Frazier, Taylor; Huang, Linkai

Switchgrass is one of the most promising energy crops and only recently has been employed for biofuel production. The draft genome of switchgrass was recently released; however, relatively few switchgrass genes have been functionally characterized. CENH3, the major histone protein found in centromeres, along with canonical H3 and other histones, plays an important role in maintaining genome stability and integrity. Despite their importance, the histone H3 genes of switchgrass have remained largely uninvestigated. In this study, we identified 17 putative switchgrass histone H3 genes in silico. Of these genes, 15 showed strong homology to histone H3 genes including six H3.1 genes, three H3.3 genes, four H3.3-like genes and two H3.1-like genes. The remaining two genes were found to be homologous to CENH3. RNA-seq data derived from lowland cultivar Alamo and upland cultivar Dacotah allowed us to identify SNPs in the histone H3 genes and compare their differential gene expression. Interestingly, we also found that overexpression of switchgrass histone H3 and CENH3 genes in N. benthamiana could trigger cell death of the transformed plant cells. Localization and deletion analyses of the histone H3 and CENH3 genes revealed that nuclear localization of the N-terminal tail is essential and sufficient for triggering the cell death phenotype. Lastly, our results deliver insight into the mechanisms underlying the histone-triggered cell death phenotype and provide a foundation for further studying the variations of the histone H3 and CENH3 genes in switchgrass.
Identification and Characterization of Switchgrass Histone H3 and CENH3 Genes

DOE PAGES

Miao, Jiamin; Frazier, Taylor; Huang, Linkai; ...

2016-07-12

Switchgrass is one of the most promising energy crops and only recently has been employed for biofuel production. The draft genome of switchgrass was recently released; however, relatively few switchgrass genes have been functionally characterized. CENH3, the major histone protein found in centromeres, along with canonical H3 and other histones, plays an important role in maintaining genome stability and integrity. Despite their importance, the histone H3 genes of switchgrass have remained largely uninvestigated. In this study, we identified 17 putative switchgrass histone H3 genes in silico. Of these genes, 15 showed strong homology to histone H3 genes including six H3.1 genes, three H3.3 genes, four H3.3-like genes and two H3.1-like genes. The remaining two genes were found to be homologous to CENH3. RNA-seq data derived from lowland cultivar Alamo and upland cultivar Dacotah allowed us to identify SNPs in the histone H3 genes and compare their differential gene expression. Interestingly, we also found that overexpression of switchgrass histone H3 and CENH3 genes in N. benthamiana could trigger cell death of the transformed plant cells. Localization and deletion analyses of the histone H3 and CENH3 genes revealed that nuclear localization of the N-terminal tail is essential and sufficient for triggering the cell death phenotype. Lastly, our results deliver insight into the mechanisms underlying the histone-triggered cell death phenotype and provide a foundation for further studying the variations of the histone H3 and CENH3 genes in switchgrass.
Genome-Wide Characterization of bHLH Genes in Grape and Analysis of their Potential Relevance to Abiotic Stress Tolerance and Secondary Metabolite Biosynthesis

PubMed Central

Wang, Pengfei; Su, Ling; Gao, Huanhuan; Jiang, Xilong; Wu, Xinying; Li, Yi; Zhang, Qianqian; Wang, Yongmei; Ren, Fengshan

2018-01-01

Basic helix-loop-helix (bHLH) transcription factors are involved in many abiotic stress responses as well as flavonol and anthocyanin biosynthesis. In grapes (Vitis vinifera L.), flavonols including anthocyanins and condensed tannins are most abundant in the skins of the berries. Flavonols are important phytochemicals for viticulture and enology, but grape bHLH genes have rarely been examined. We identified 94 grape bHLH genes in a genome-wide analysis and performed Nr and GO function analyses for these genes. Phylogenetic analyses placed the genes into 15 clades, with some remaining orphans. 41 duplicate gene pairs were found in the grape bHLH gene family, and all of these duplicate gene pairs underwent purifying selection. Nine triplicate gene groups were found in the grape bHLH gene family and all of these triplicate gene groups underwent purifying selection. Twenty-two grape bHLH genes could be induced by PEG treatment and 17 grape bHLH genes could be induced by cold stress treatment including a homologous form of MYC2, VvbHLH007. Based on the GO or Nr function annotations, we found three other genes that are potentially related to anthocyanin or flavonol biosynthesis: VvbHLH003, VvbHLH007, and VvbHLH010. We also performed a cis-acting regulatory element analysis on some genes involved in flavonoid or anthocyanin biosynthesis and our results showed that most of these gene promoters contained G-box or E-box elements that could be recognized by bHLH family members. PMID:29449854
RNAi screening of developmental toolkit genes: a search for novel wing genes in the red flour beetle, Tribolium castaneum.

PubMed

Linz, David M; Tomoyasu, Yoshinori

2015-01-01

The amazing array of diversity among insect wings offers a powerful opportunity to study the mechanisms guiding morphological evolution. Studies in Drosophila (the fruit fly) have identified dozens of genes important for wing development. These genes are often called candidate genes, serving as an ideal starting point to study wing development in other insects. However, we also need to explore beyond the candidate genes to gain a more comprehensive view of insect wing evolution. As a first step away from the traditional candidate genes, we utilized Tribolium (the red flour beetle) as a model and assessed the potential involvement of a group of developmental toolkit genes (embryonic patterning genes) in beetle wing development. We hypothesized that the highly pleiotropic nature of these developmental genes would increase the likelihood of finding novel wing genes in Tribolium. Through the RNA interference screening, we found that Tc-cactus has a less characterized (but potentially evolutionarily conserved) role in wing development. We also found that the odd-skipped family genes are essential for the formation of the thoracic pleural plates, including the recently discovered wing serial homologs in Tribolium. In addition, we obtained several novel insights into the function of these developmental genes, such as the involvement of mille-pattes and Tc-odd-paired in metamorphosis. Despite these findings, no gene we examined was found to have novel wing-related roles unique in Tribolium. These results suggest a relatively conserved nature of developmental toolkit genes and highlight the limited degree to which these genes are co-opted during insect wing evolution.
Validating internal controls for quantitative plant gene expression studies

PubMed Central

Brunner, Amy M; Yakovlev, Igor A; Strauss, Steven H

2004-01-01

Background: Real-time reverse transcription PCR (RT-PCR) has greatly improved the ease and sensitivity of quantitative gene expression studies. However, accurate measurement of gene expression with this method relies on the choice of a valid reference for data normalization. Studies rarely verify that gene expression levels for reference genes are adequately consistent among the samples used, nor compare alternative genes to assess which are most reliable for the experimental conditions analyzed. Results: Using real-time RT-PCR to study the expression of 10 poplar (genus Populus) housekeeping genes, we demonstrate a simple method for determining the degree of stability of gene expression over a set of experimental conditions. Based on a traditional method for analyzing the stability of varieties in plant breeding, it defines measures of gene expression stability from analysis of variance (ANOVA) and linear regression. We found that the potential internal control genes differed widely in their expression stability over the different tissues, developmental stages and environmental conditions studied. Conclusion: Our results support that quantitative comparisons of candidate reference genes are an important part of real-time RT-PCR studies that seek to precisely evaluate variation in gene expression. The method we demonstrated facilitates statistical and graphical evaluation of gene expression stability. Selection of the best reference gene for a given set of experimental conditions should enable detection of biologically significant changes in gene expression that are too small to be revealed by less precise methods, or when highly variable reference genes are unknowingly used in real-time RT-PCR experiments. PMID:15317655
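As a rough illustration of the regression idea this abstract describes, the Python sketch below regresses each candidate gene's expression on a per-sample index (the mean of all candidate genes in that sample) and reports the slope and the residual mean square. It is a generic joint-regression sketch in the spirit of plant-breeding stability analysis, not the authors' published procedure. The function name, the data layout and the toy values are all assumptions made for this example.

```python
from statistics import mean


def stability_by_regression(expr):
    """Regress each gene on a per-sample index (hypothetical helper).

    expr maps a gene name to a list of expression values, one per sample,
    with every gene measured in the same samples in the same order.
    Returns, per gene, the regression slope and the residual mean square.
    """
    genes = list(expr)
    n_samples = len(expr[genes[0]])
    if n_samples < 3:
        raise ValueError("need at least 3 samples for a residual mean square")

    # Per-sample index: mean expression of all candidate genes in that sample.
    index = [mean(expr[g][s] for g in genes) for s in range(n_samples)]
    index_mean = mean(index)
    sxx = sum((x - index_mean) ** 2 for x in index)
    if sxx == 0:
        raise ValueError("sample index does not vary; regression is undefined")

    results = {}
    for g in genes:
        y = expr[g]
        y_mean = mean(y)
        sxy = sum((x - index_mean) * (v - y_mean) for x, v in zip(index, y))
        slope = sxy / sxx
        intercept = y_mean - slope * index_mean
        residual_ss = sum(
            (v - (intercept + slope * x)) ** 2 for x, v in zip(index, y)
        )
        results[g] = {
            "slope": slope,
            "residual_ms": residual_ss / (n_samples - 2),
        }
    return results


# Toy values (assumed): geneB is flat across samples, geneA drifts upward.
toy = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [20.0, 20.0, 20.0, 20.0],
}
for gene, stats in stability_by_regression(toy).items():
    print(gene, stats)
```

In this toy case the flat gene gets a slope near zero and the drifting gene a slope near two. How slope and residual scatter are combined into a final stability ranking is left open here, since the abstract does not specify it.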
The gene expression profile of resistant and susceptible Bombyx mori strains reveals cypovirus-associated variations in host gene transcript levels.

PubMed

Guo, Rui; Wang, Simei; Xue, Renyu; Cao, Guangli; Hu, Xiaolong; Huang, Moli; Zhang, Yangqi; Lu, Yahong; Zhu, Liyuan; Chen, Fei; Liang, Zi; Kuang, Sulan; Gong, Chengliang

2015-06-01

High-throughput paired-end RNA sequencing (RNA-Seq) was performed to investigate the gene expression profile of a susceptible Bombyx mori strain, Lan5, and a resistant B. mori strain, Ou17, which were both orally infected with B. mori cypovirus (BmCPV) in the midgut. There were 330 and 218 up-regulated genes, while there were 147 and 260 down-regulated genes in the Lan5 and Ou17 strains, respectively. Gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment for differentially expressed genes (DEGs) were carried out. Moreover, gene interaction network (STRING) analyses were performed to analyze the relationships among the shared DEGs. Some of these genes were related and formed a large network, in which the genes for B. mori cuticular protein RR-2 motif 123 (BmCPR123) and the gene for B. mori DNA replication licensing factor Mcm2-like (BmMCM2) were key genes among the common up-regulated DEGs, whereas the gene for B. mori heat shock protein 20.1 (Bmhsp20.1) was the central gene among the shared down-regulated DEGs between Lan5 vs Lan5-CPV and Ou17 vs Ou17-CPV. These findings established a comprehensive database of genes that are differentially expressed in response to BmCPV infection between silkworm strains that differed in resistance to BmCPV and implied that these DEGs might be involved in B. mori immune responses against BmCPV infection.
Metabolic Adaptation to Nutrients Involves Coregulation of Gene Expression by the RNA Helicase Dbp2 and the Cyc8 Corepressor in Saccharomyces cerevisiae.

PubMed

Wang, Siwen; Xing, Zheng; Pascuzzi, Pete E; Tran, Elizabeth J

2017-07-05

Cells fine-tune their metabolic programs according to nutrient availability in order to maintain homeostasis. This is achieved largely through integrating signaling pathways and the gene expression program, allowing cells to adapt to nutritional change. Dbp2, a member of the DEAD-box RNA helicase family in Saccharomyces cerevisiae, has been proposed to integrate gene expression with cellular metabolism. Prior work from our laboratory has reported the necessity of DBP2 in proper gene expression, particularly for genes involved in glucose-dependent regulation. Here, by comparing differentially expressed genes in dbp2∆ to those of 700 other deletion strains from other studies, we find that CYC8 and TUP1, which form a complex and inhibit transcription of numerous genes, corepress a common set of genes with DBP2. Gene ontology (GO) annotations reveal that these corepressed genes are related to cellular metabolism, including respiration, gluconeogenesis, and alternative carbon-source utilization genes. Consistent with a direct role in metabolic gene regulation, loss of either DBP2 or CYC8 results in increased cellular respiration rates. Furthermore, we find that corepressed genes have a propensity to be associated with overlapping long noncoding RNAs and that upregulation of these genes in the absence of DBP2 correlates with decreased binding of Cyc8 to these gene promoters. Taken together, this suggests that Dbp2 integrates nutrient availability with energy homeostasis by maintaining repression of glucose-repressed, Cyc8-targeted genes across the genome. Copyright © 2017 Wang et al.
Sexually Dimorphic Gene Expression Associated with Growth and Reproduction of Tongue Sole (Cynoglossus semilaevis) Revealed by Brain Transcriptome Analysis.

PubMed

Wang, Pingping; Zheng, Min; Liu, Jian; Liu, Yongzhuang; Lu, Jianguo; Sun, Xiaowen

2016-08-26

In this study, we performed a comprehensive analysis of the transcriptome of one- and two-year-old male and female brains of Cynoglossus semilaevis by high-throughput Illumina sequencing. A total of 77,066 transcripts, corresponding to 21,475 unigenes, were obtained with an N50 value of 4349 bp. Of these unigenes, 33 genes were found to be significantly differentially expressed and potentially associated with growth, of which 18 genes were down-regulated and 12 genes were up-regulated in two-year-old males. Most of these genes showed no significant differences in expression among one-year-old males and females and two-year-old females. A similar analysis was conducted to look for genes associated with reproduction. Twenty-five genes were identified, of which five were down-regulated and 20 were up-regulated in two-year-old males; again, most of the genes showed no significant expression differences among the other three groups. The results of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the up-regulated genes differed significantly between two-year-old males and females. Genes highly expressed in males were enriched in genetic information processing, while genes highly expressed in females were mainly enriched in organismal systems. Our work identified a set of sex-biased genes potentially associated with growth and reproduction that might be candidate factors affecting sexual dimorphism of tongue sole, laying the foundation for understanding the complex process of sex determination in this economically valuable species.
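The N50 reported in this abstract is a standard assembly summary statistic: the length L such that sequences (here, transcripts) of length L or longer together contain at least half of all assembled bases. A minimal Python sketch follows; the function name is hypothetical and the toy lengths are unrelated to the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total is 1500 bases; 500 + 400 = 900 covers half of it.
print(n50([100, 200, 300, 400, 500]))  # prints 400
```

Because N50 weights long sequences heavily, it is usually read alongside the total number of sequences, as the abstract does by also giving transcript and unigene counts.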
Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

PubMed

Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

2017-08-30

To better understand the molecular mechanisms and gene expression characteristics associated with the development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR agreed well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to the metabolic process, catalytic activity and binding categories were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.
Novel candidate genes important for asthma and hypertension comorbidity revealed from associative gene networks.

PubMed

Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A

2018-02-13

Hypertension and bronchial asthma are major issues for human health. As of 2014, approximately one billion adults, or ~22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of effective therapies for these diseases is complicated by their comorbidity, which is often a major problem in their diagnosis and treatment. Hence, in this study a bioinformatics methodology for analysing the comorbidity of these two diseases was developed. The search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension included 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases.
Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in biological processes related to the functioning of central nervous system. The application of methods of reconstruction and analysis of gene networks is a productive tool for studying the molecular mechanisms of comorbid conditions. The method put forth to rank genes by their importance to the comorbid condition of asthma and hypertension was employed that resulted in prediction of 10 genes, playing the key role in the development of the comorbid condition. The results can be utilised to plan experiments for identification of novel candidate genes along with searching for novel pharmacological targets.

Recent progresses in gene delivery-based bone tissue engineering.

PubMed

Lu, Chia-Hsin; Chang, Yu-Han; Lin, Shih-Yeh; Li, Kuei-Chang; Hu, Yu-Chen

2013-12-01

Gene therapy has converged with bone engineering over the past decade, by which a variety of therapeutic genes have been delivered to stimulate bone repair. These genes can be administered via in vivo or ex vivo approach using either viral or nonviral vectors. This article reviews the fundamental aspects and recent progresses in the gene therapy-based bone engineering, with emphasis on the new genes, viral vectors and gene delivery approaches. © 2013.
G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

USDA-ARS's Scientific Manuscript database

In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...
Using RNA-Seq data to select reference genes for normalizing gene expression in apple roots

USDA-ARS's Scientific Manuscript database

Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes was evaluated for t...
Prioritizing Genes Related to Nicotine Addiction Via a Multi-source-Based Approach.

PubMed

Liu, Xinhua; Liu, Meng; Li, Xia; Zhang, Lihua; Fan, Rui; Wang, Ju

2015-08-01

Nicotine has a broad impact on both the central and peripheral nervous systems. Over the past decades, an increasing number of genes potentially involved in nicotine addiction have been identified by different technical approaches. However, the molecular mechanisms underlying nicotine addiction remain largely unknown. In this situation, prioritizing the candidate genes for further investigation is becoming increasingly important. In this study, we present a multi-source-based gene prioritization approach for nicotine addiction that utilizes the vast amounts of information generated by nicotine addiction studies during the past years. In this approach, we first collected and curated genes from studies in four categories, i.e., genetic association analysis, genetic linkage analysis, high-throughput gene/protein expression analysis, and literature search of single gene/protein-based studies. Based on these resources, the genes were scored and a weight value was determined for each category. Finally, the genes were ranked by their combined scores, and 220 genes were selected as the prioritized nicotine addiction-related genes. Evaluation suggested the prioritized genes were promising targets for further analysis and replication study.
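The scoring-and-ranking step described above can be illustrated with a minimal Python sketch. It assumes a simple weighted sum of per-category scores; the abstract does not state the actual scoring formula or weight values, so the weights, gene names, and the function `rank_genes` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical category weights; the abstract does not give the actual values.
CATEGORY_WEIGHTS = {
    "association": 1.0,
    "linkage": 0.5,
    "expression": 0.25,
    "literature": 0.75,
}


def rank_genes(evidence, weights=CATEGORY_WEIGHTS):
    """Rank genes by the weighted sum of their per-category scores.

    evidence: iterable of (gene, category, score) tuples.
    Returns a list of (gene, combined_score), highest score first;
    ties are broken alphabetically by gene name.
    """
    combined = defaultdict(float)
    for gene, category, score in evidence:
        combined[gene] += weights[category] * score
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Toy evidence records (illustrative only, not data from the study).
toy_evidence = [
    ("GENE_A", "association", 2.0),
    ("GENE_A", "literature", 1.0),
    ("GENE_B", "linkage", 1.0),
    ("GENE_B", "expression", 2.0),
    ("GENE_C", "association", 1.0),
]

ranking = rank_genes(toy_evidence)
for gene, score in ranking:
    print(gene, score)
```

A cutoff on the ranked list (the study kept 220 genes) would then select the prioritized set; how that cutoff was chosen is not stated in the abstract.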
DOE Office of Scientific and Technical Information (OSTI.GOV)

Otani, Sanae; Department of Pediatrics, Graduate School of Medicine, Osaka City University, Osaka; Ayata, Minoru, E-mail: maverick@med.osaka-cu.ac.jp

Measles virus (MV) is the causative agent of measles and its neurological complications, subacute sclerosing panencephalitis (SSPE) and measles inclusion body encephalitis (MIBE). Biased hypermutation in the M gene is a characteristic feature of SSPE and MIBE. To determine whether the M gene is the preferred target of hypermutation, an additional transcriptional unit containing a humanized Renilla reniformis green fluorescent protein (hrGFP) gene was introduced into the IC323 MV genome, and nude mice were inoculated intracerebrally with the virus. Biased hypermutation occurred in the M gene and also in the hrGFP gene when it was inserted between the leader and the N gene, but not between the H and L gene. These results indicate that biased hypermutation is usually found in a gene whose function is not essential for viral proliferation in the brain and that the location of a gene in the MV genome can affect its mutational frequency. - Highlights: • Wild-type MV can cause persistent infections in nude mice. • Biased hypermutation occurred in the M gene. • Biased hypermutation occurred in an inessential gene inserted between the leader and the N gene.
[Genetic instability of probiotic characteristics in the Bifidobacterium longum subsp. longum B379M strain during cultivation and maintenance].

PubMed

Averina, O V; Nezametdinova, V Z; Alekseeva, M G; Danilenko, V N

2012-11-01

The stability of inheriting several genes in the Russian commercial strain Bifidobacterium longum subsp. longum B379M during cultivation and maintenance under laboratory conditions has been studied. The examined genes code for probiotic characteristics, such as utilization of several sugars (lacA2 gene, encoding beta-galactosidase; ara gene, encoding arabinosidase; and galA gene, encoding arabinogalactan endo-beta-galactosidase); synthesis of bacteriocins (lans gene, encoding lanthionine synthetase); and mobile gene tet(W), conferring resistance to the antibiotic tetracycline. The other gene families studied include the genes responsible for signal transduction and adaptation to stress conditions in the majority of bacteria (serine/threonine protein kinases and the toxin-antitoxin systems of MazEF and RelBE types) and transcription regulators (genes encoding WhiB family proteins). Genomic DNA was analyzed by PCR using specially selected primers. A loss of the genes galA and tet(W) has been shown. It is proposed to expand the requirements on probiotic strains, namely, to control retention of the key probiotic genes using molecular biological methods.
The limitations of simple gene set enrichment analysis assuming gene independence.

PubMed

Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P

2016-02-01

Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. (2009) as a serious contender. The argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, due to the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. © The Author(s) 2012.
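The variance-inflation argument can be made concrete with a small Python sketch. It assumes the textbook case of n gene scores with unit variance and one common pairwise correlation rho, for which the variance of their sum is n(1 + (n - 1)rho); this equicorrelation model and the function names are illustrative assumptions, not the analysis performed in the paper.

```python
import math


def variance_inflation_factor(n_genes, mean_correlation):
    """Factor by which correlation inflates the variance of a set-level sum.

    Assumes unit-variance gene scores sharing one pairwise correlation.
    """
    return 1.0 + (n_genes - 1) * mean_correlation


def naive_and_adjusted_z(scores, mean_correlation):
    """Return (naive, adjusted) set-level z-scores.

    The naive score treats genes as independent; the adjusted score
    divides by the square root of the variance inflation factor.
    """
    n = len(scores)
    naive = sum(scores) / math.sqrt(n)
    vif = variance_inflation_factor(n, mean_correlation)
    return naive, naive / math.sqrt(vif)


# Even a small average correlation matters for a moderately large gene set.
print(variance_inflation_factor(101, 0.05))
print(naive_and_adjusted_z([1.0, 1.0, 1.0, 1.0], 0.5))
```

Under these assumptions, a set of 101 genes with an average correlation of only 0.05 has a set-level variance about six times larger than the independence assumption predicts, so a naive test would overstate significance.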
The chromosomal organization of horizontal gene transfer in bacteria.

PubMed

Oliveira, Pedro H; Touchon, Marie; Cury, Jean; Rocha, Eduardo P C

2017-10-10

Bacterial adaptation is accelerated by the acquisition of novel traits through horizontal gene transfer, but the integration of these genes affects genome organization. We found that transferred genes are concentrated in only ~1% of the chromosomal regions (hotspots) in 80 bacterial species. This concentration increases with genome size and with the rate of transfer. Hotspots diversify by rapid gene turnover; their chromosomal distribution depends on local contexts (neighboring core genes), and content in mobile genetic elements. Hotspots concentrate most changes in gene repertoires, reduce the trade-off between genome diversification and organization, and should be treasure troves of strain-specific adaptive genes. Most mobile genetic elements and antibiotic resistance genes are in hotspots, but many hotspots lack recognizable mobile genetic elements and exhibit frequent homologous recombination at flanking core genes. Overrepresentation of hotspots with fewer mobile genetic elements in naturally transformable bacteria suggests that homologous recombination and horizontal gene transfer are tightly linked in genome evolution. Horizontal gene transfer (HGT) is an important mechanism for genome evolution and adaptation in bacteria. Here, Oliveira and colleagues find HGT hotspots comprising ~1% of the chromosomal regions in 80 bacterial species.
Estimation of gene induction enables a relevance-based ranking of gene sets.

PubMed

Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens

2009-07-01

In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and place the results of microarray experiments into the biological context. An R-package is available.
Operon Formation is Driven by Co-Regulation and Not by Horizontal Gene Transfer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Price, Morgan N.; Huang, Katherine H.; Arkin, Adam P.

Although operons are often subject to horizontal gene transfer (HGT), non-HGT genes are particularly likely to be in operons. To resolve this apparent discrepancy and to determine whether HGT is involved in operon formation, we examined the evolutionary history of the genes and operons in Escherichia coli K12. We show that genes that have homologs in distantly related bacteria but not in close relatives of E. coli (indicating HGT) form new operons at about the same rates as native genes. Furthermore, genes in new operons are no more likely than other genes to have phylogenetic trees that are inconsistent with the species tree. In contrast, essential genes and ubiquitous genes without paralogs (genes believed to undergo HGT rarely) often form new operons. We conclude that HGT is not associated with operon formation, but instead promotes the prevalence of pre-existing operons. To explain operon formation, we propose that new operons reduce the amount of regulatory information required to specify optimal expression patterns. Consistent with this hypothesis, operons have greater amounts of conserved regulatory sequences than do individually transcribed genes.
Identification of reference genes in human myelomonocytic cells for gene expression studies in altered gravity.

PubMed

Thiel, Cora S; Hauschild, Swantje; Tauber, Svantje; Paulsen, Katrin; Raig, Christiane; Raem, Arnold; Biskup, Josefine; Gutewort, Annett; Hürlimann, Eva; Unverdorben, Felix; Buttron, Isabell; Lauber, Beatrice; Philpot, Claudia; Lier, Hartwin; Engelmann, Frank; Layer, Liliana E; Ullrich, Oliver

2015-01-01

Gene expression studies are indispensable for investigation and elucidation of molecular mechanisms. For the process of normalization, reference genes ("housekeeping genes") are essential to verify gene expression analysis. Thus, it is assumed that these reference genes demonstrate similar expression levels over all experimental conditions. However, common recommendations about reference genes were established during 1 g conditions and therefore their applicability in studies with altered gravity has not been demonstrated yet. The microarray technology is frequently used to generate expression profiles under defined conditions and to determine the relative difference in expression levels between two or more different states. In our study, we searched for potential reference genes with stable expression during different gravitational conditions (microgravity, normogravity, and hypergravity) which are additionally not altered in different hardware systems. We were able to identify eight genes (ALB, B4GALT6, GAPDH, HMBS, YWHAZ, ABCA5, ABCA9, and ABCC1) which demonstrated no altered gene expression levels in all tested conditions and therefore represent good candidates for the standardization of gene expression studies in altered gravity.
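One simple way to screen for stably expressed genes across conditions is sketched below in Python. It uses the coefficient of variation as the stability measure, which is an assumption made for illustration: the abstract does not say which stability criterion the authors applied, and the gene names, values, and 5% cutoff are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def stable_genes(expression, max_cv=0.05):
    """Return genes whose expression varies little across all conditions.

    expression: dict mapping gene name to a list of expression values,
    one per condition (for example microgravity, 1 g, hypergravity).
    """
    return sorted(
        gene
        for gene, values in expression.items()
        if coefficient_of_variation(values) <= max_cv
    )


# Toy expression values (illustrative only, not data from the study).
toy_expression = {
    "GENE_X": [100.0, 102.0, 98.0],   # nearly constant across conditions
    "GENE_Y": [100.0, 150.0, 50.0],   # strongly condition-dependent
}

print(stable_genes(toy_expression))
```

In practice a candidate would also need to be checked across hardware systems, as the abstract notes, before being used for normalization.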
Characterization of the human RAB38 and RAB7 genes: exclusion of new major pathological loci for Japanese OCA.

PubMed

Suzuki, Tamio; Miyamura, Yoshinori; Inagaki, Katsuhiko; Tomita, Yasushi

2003-08-01

Oculocutaneous albinisms (OCAs) are due to various gene mutations that cause a disruption of melanogenesis in the melanocyte. Four different genes associated with human OCA have been reported; however, not all OCA patients can be classified according to these four genes. We have sought to find a new major locus for Japanese OCA. Recently two genes, RAB38 and RAB7, were reported to play an important role in melanogenesis in the melanocyte, suggesting that these two genes could be good candidates for new OCA loci. Our aim was to determine the structures of the human RAB38 and RAB7 genes and to examine whether the two genes are new major loci for Japanese OCA. We screened for mutations in these genes in 25 Japanese OCA patients who lacked mutations in the OCA1 and OCA2 genes, using the SSCP/heteroduplex method. We determined the structures and genomic organizations of both genes in order to design the primers for the SSCP/heteroduplex method. We then screened for mutations, but no mutation was detected. Neither of the genes is a new major locus for Japanese OCA.
Molecular phylogeny and evolution of alcohol dehydrogenase (Adh) genes in legumes

PubMed Central

Fukuda, Tatsuya; Yokoyama, Jun; Nakamura, Toru; Song, In-Ja; Ito, Takuro; Ochiai, Toshinori; Kanno, Akira; Kameya, Toshiaki; Maki, Masayuki

2005-01-01

Background Nuclear genes determine the vast range of phenotypes that are responsible for the adaptive abilities of organisms in nature. Nevertheless, the evolutionary processes that generate the structures and functions of nuclear genes are only now becoming understood. The aim of our study is to isolate the alcohol dehydrogenase (Adh) genes in two distantly related legumes, and use these sequences to examine the molecular evolutionary history of this nuclear gene. Results We isolated the expressed Adh genes from two species of legumes, Sophora flavescens Ait. and Wisteria floribunda DC., by an RT-PCR based approach and found a new Adh locus in addition to homologues of the Adh genes found previously in legumes. To examine the evolution of these genes, we compared the species and gene trees and found that gene duplication of the Adh loci in the legumes occurred as an ancient event. Conclusion This is the first report revealing that some legume species have at least two Adh gene loci belonging to separate clades. Phylogenetic analyses suggest that these genes resulted from relatively ancient duplication events. PMID:15836788
Down-Regulation of Gene Expression by RNA-Induced Gene Silencing

NASA Astrophysics Data System (ADS)

Travella, Silvia; Keller, Beat

Down-regulation of endogenous genes via post-transcriptional gene silencing (PTGS) is a key to the characterization of gene function in plants. Many RNA-based silencing mechanisms such as post-transcriptional gene silencing, co-suppression, quelling, and RNA interference (RNAi) have been discovered among species of different kingdoms (plants, fungi, and animals). One of the most interesting discoveries was RNAi, a sequence-specific gene-silencing mechanism initiated by the introduction of double-stranded RNA (dsRNA), homologous in sequence to the silenced gene, which triggers degradation of mRNA. Infection of plants with modified viruses can also induce RNA silencing and is referred to as virus-induced gene silencing (VIGS). In contrast to insertional mutagenesis, these emerging new reverse genetic approaches represent a powerful tool for exploring gene function and for manipulating gene expression experimentally in cereal species such as barley and wheat. We examined how RNAi and VIGS have been used to assess gene function in barley and wheat, including molecular mechanisms involved in the process and available methodological elements, such as vectors, inoculation procedures, and analysis of silenced phenotypes.
Modularity of Plant Metabolic Gene Clusters: A Trio of Linked Genes That Are Collectively Required for Acylation of Triterpenes in Oat

PubMed Central

Mugford, Sam T.; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J.; Lomonossoff, George P.; Osbourn, Anne

2013-01-01

Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069
Biased Gene Fractionation and Dominant Gene Expression among the Subgenomes of Brassica rapa

PubMed Central

Cheng, Feng; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Lin, Ke; Bonnema, Guusje; Wang, Xiaowu

2012-01-01

Polyploidization, both ancient and recent, is frequent among plants. A "two-step theory" was proposed to explain the meso-triplication of the Brassica "A" genome: Brassica rapa. By accurately partitioning of this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant against gene fractionation. By re-sequencing two B. rapa accessions: a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had less non-synonymous or frameshift mutations than genes in MFs; however mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that "two-step" genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa. PMID:22567157
Biased gene fractionation and dominant gene expression among the subgenomes of Brassica rapa.

PubMed

Cheng, Feng; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Lin, Ke; Bonnema, Guusje; Wang, Xiaowu

2012-01-01

Polyploidization, both ancient and recent, is frequent among plants. A "two-step theory" was proposed to explain the meso-triplication of the Brassica "A" genome: Brassica rapa. By accurately partitioning of this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant against gene fractionation. By re-sequencing two B. rapa accessions: a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had less non-synonymous or frameshift mutations than genes in MFs; however mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that "two-step" genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa.
OGEE v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines.

PubMed

Chen, Wei-Hua; Lu, Guanting; Chen, Xiao; Zhao, Xing-Ming; Bork, Peer

2017-01-04

OGEE is an Online GEne Essentiality database. To enhance our understanding of the essentiality of genes, in OGEE we collected experimentally tested essential and non-essential genes, as well as associated gene properties known to contribute to gene essentiality. We focus on large-scale experiments, and complement our data with text-mining results. We organized tested genes into data sets according to their sources, and tagged those with variable essentiality statuses across data sets as conditionally essential genes, intending to highlight the complex interplay between gene functions and environments/experimental perturbations. Developments since the last public release include increased numbers of species and gene essentiality data sets, inclusion of non-coding essential sequences and genes with intermediate essentiality statuses. In addition, we included 16 essentiality data sets from cancer cell lines, corresponding to 9 human cancers; with OGEE, users can easily explore the shared and differentially essential genes within and between cancer types. These genes, especially those derived from cell lines that are similar to tumor samples, could reveal the oncogenic drivers, paralogous gene expression pattern and chromosomal structure of the corresponding cancer types, and can be further screened to identify targets for cancer therapy and/or new drug development. OGEE is freely available at http://ogee.medgenius.info. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
LGscore: A method to identify disease-related genes using biological literature and Google data.

PubMed

Kim, Jeongwoo; Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun

2015-04-01

Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods. Copyright © 2015 Elsevier Inc. All rights reserved.
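The edge-weighting idea can be sketched in Python as follows. This is a loose illustration, not the LGscore algorithm itself: it assumes that each edge weight is the sum of two Z-scores (co-occurrence frequency and search-hit count, each standardized across all edges) and that a gene's score is the sum of the weights of its edges. Those choices, the function names, and the toy numbers are all hypothetical.

```python
import statistics
from collections import defaultdict


def z_scores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def gene_scores(edges):
    """Score genes from weighted co-occurrence edges.

    edges: list of (gene_a, gene_b, cooccurrence_count, search_hits).
    Each edge weight is the sum of the two Z-scored values; each gene's
    score is the sum of the weights of the edges it participates in.
    """
    freq_z = z_scores([edge[2] for edge in edges])
    hits_z = z_scores([edge[3] for edge in edges])
    scores = defaultdict(float)
    for (gene_a, gene_b, _, _), fz, hz in zip(edges, freq_z, hits_z):
        weight = fz + hz
        scores[gene_a] += weight
        scores[gene_b] += weight
    return dict(scores)


# Toy edges (illustrative only, not data from the study).
toy_edges = [
    ("G1", "G2", 10, 1000),
    ("G1", "G3", 6, 600),
    ("G2", "G3", 2, 200),
]

scores = gene_scores(toy_edges)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy network G1 ranks first because it sits on the two most strongly supported edges, which matches the stated assumption that well-connected genes with high frequency and search counts are the likeliest disease candidates.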
A powerful score-based test statistic for detecting gene-gene co-association.

PubMed

Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

2016-01-29

The genetic variants identified by genome-wide association studies (GWAS) can only account for a small proportion of the total heritability of complex disease. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one of the possible explanations for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and the specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

Utility and Limitations of Using Gene Expression Data to Identify Functional Associations

PubMed Central

Peng, Cheng; Shiu, Shin-Han

2016-01-01

Gene co-expression has been widely used to hypothesize gene function through guilt-by-association. However, it is not clear to what degree co-expression is informative, whether it can be applied to genes involved in different biological processes, and how the type of dataset impacts inferences about gene functions. Here our goal is to assess the utility and limitations of using co-expression as a criterion to recover functional associations between genes. Using Arabidopsis thaliana as an example and determining the percentage of gene pairs in a metabolic pathway with significant expression correlation, we found that many genes in the same pathway do not have similar transcript profiles and that the choice of dataset, annotation quality, gene function, expression similarity measure, and clustering approach significantly impacts the ability to recover functional associations between genes. Some datasets are more informative in capturing coordinated expression profiles and larger data sets are not always better. In addition, to recover the maximum number of known pathways and identify candidate genes with similar functions, it is important to explore rather exhaustively multiple dataset combinations, similarity measures, clustering algorithms and parameters. Finally, we validated the biological relevance of co-expression cluster memberships with an independent phenomics dataset and found that genes that consistently cluster with leucine degradation genes tend to have similar leucine levels in mutants. This study provides a framework for obtaining gene functional associations by maximizing the information that can be obtained from gene expression datasets. PMID:27935950
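The pathway-level measure mentioned above, the percentage of gene pairs with correlated expression, can be sketched in Python. As a simplifying assumption the sketch replaces a formal significance test with a fixed cutoff on the Pearson correlation coefficient; the cutoff, the gene names, and the function names are hypothetical and not taken from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pair_fraction(profiles, pathway_genes, r_threshold=0.8):
    """Fraction of gene pairs in a pathway whose profiles correlate.

    profiles: dict mapping gene name to a list of expression values.
    pathway_genes: names of the genes annotated to one pathway.
    """
    pairs = list(combinations(sorted(pathway_genes), 2))
    hits = sum(
        1 for a, b in pairs if pearson(profiles[a], profiles[b]) >= r_threshold
    )
    return hits / len(pairs)


# Toy profiles (illustrative only): g1 and g2 rise together, g3 falls.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}

fraction = coexpressed_pair_fraction(toy_profiles, ["g1", "g2", "g3"])
print(fraction)
```

Only one of the three pairs passes the cutoff here, which mirrors the abstract's finding that membership in the same pathway does not guarantee similar transcript profiles.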
Skin transcriptome profiles associated with coat color in sheep

PubMed Central

2013-01-01

Background Previous molecular genetic studies of physiology and pigmentation of sheep skin have focused primarily on a limited number of genes and proteins. To identify additional genes that may play important roles in coat color regulation, Illumina sequencing technology was used to catalog global gene expression profiles in skin of sheep with white versus black coat color. Results There were 90,006 and 74,533 unigenes assembled from the reads obtained from white and black sheep skin, respectively. Genes encoding for the ribosomal proteins and keratin associated proteins were most highly expressed. A total of 2,235 known genes were differentially expressed in black versus white sheep skin, with 479 genes up-regulated and 1,756 genes down-regulated. A total of 845 novel genes were differentially expressed in black versus white sheep skin, consisting of 107 genes which were up-regulated (including 2 highly expressed genes exclusively expressed in black sheep skin) and 738 genes that were down-regulated. There was also a total of 49 known coat color genes expressed in sheep skin, from which 13 genes showed higher expression in black sheep skin. Many of these up-regulated genes, such as DCT, MATP, TYR and TYRP1, are members of the components of melanosomes and their precursor ontology category. Conclusion The white and black sheep skin transcriptome profiles obtained provide a valuable resource for future research to understand the network of gene expression controlling skin physiology and melanogenesis in sheep. PMID:23758853
Gene doping in sports.

PubMed

Unal, Mehmet; Ozer Unal, Durisehvar

2004-01-01

Gene or cell doping is defined by the World Anti-Doping Agency (WADA) as "the non-therapeutic use of genes, genetic elements and/or cells that have the capacity to enhance athletic performance". New research in genetics and genomics will be used not only to diagnose and treat disease, but also to attempt to enhance human performance. In recent years, gene therapy has shown progress and positive results that have highlighted the potential misuse of this technology and the debate of 'gene doping'. Gene therapies developed for the treatment of diseases such as anaemia (the gene for erythropoietin), muscular dystrophy (the gene for insulin-like growth factor-1) and peripheral vascular diseases (the gene for vascular endothelial growth factor) are potential doping methods. With progress in gene technology, many other genes with this potential will be discovered. For this reason, it is important to develop timely legal regulations and to research the field of gene doping in order to develop methods of detection. To protect the health of athletes and to ensure equal competitive conditions, the International Olympic Committee, WADA and International Sports Federations have accepted performance-enhancing substances and methods as being doping, and have forbidden them. Nevertheless, the desire to win causes athletes to misuse these drugs and methods. This paper reviews the current status of gene doping and candidate performance enhancement genes, and also the use of gene therapy in sports medicine and ethics of genetic enhancement. Copyright 2004 Adis Data Information BV
DOE Office of Scientific and Technical Information (OSTI.GOV)

Deutschbauer, Adam; Price, Morgan N.; Wetmore, Kelly M.

Mutant phenotypes provide strong clues to the functions of the underlying genes and could allow annotation of the millions of sequenced yet uncharacterized bacterial genes. However, it is not known how many genes have a phenotype under laboratory conditions, how many phenotypes are biologically interpretable for predicting gene function, and what experimental conditions are optimal to maximize the number of genes with a phenotype. To address these issues, we measured the mutant fitness of 1,586 genes of the ethanol-producing bacterium Zymomonas mobilis ZM4 across 492 diverse experiments and found statistically significant phenotypes for 89% of all assayed genes. Thus, in Z. mobilis, most genes have a functional consequence under laboratory conditions. We demonstrate that 41% of Z. mobilis genes have both a strong phenotype and a similar fitness pattern (cofitness) to another gene, and are therefore good candidates for functional annotation using mutant fitness. Among 502 poorly characterized Z. mobilis genes, we identified a significant cofitness relationship for 174. For 57 of these genes without a specific functional annotation, we found additional evidence to support the biological significance of these gene-gene associations, and in 33 instances, we were able to predict specific physiological or biochemical roles for the poorly characterized genes. Last, we identified a set of 79 diverse mutant fitness experiments in Z. mobilis that are nearly as biologically informative as the entire set of 492 experiments. Therefore, our work provides a blueprint for the functional annotation of diverse bacteria using mutant fitness.
Replicon-dependent differentiation of symbiosis-related genes in Sinorhizobium strains nodulating Glycine max.

PubMed

Guo, Hui Juan; Wang, En Tao; Zhang, Xing Xing; Li, Qin Qin; Zhang, Yan Ming; Tian, Chang Fu; Chen, Wen Xin

2014-02-01

In order to investigate the genetic differentiation of Sinorhizobium strains nodulating Glycine max and related microevolutionary mechanisms, three housekeeping genes (SMc00019, truA, and thrA) and 16 symbiosis-related genes on the chromosome (7 genes), pSymA (6 genes), and pSymB (3 genes) were analyzed. Five distinct species were identified among the test strains by calculating the average nucleotide identity (ANI) of SMc00019-truA-thrA: Sinorhizobium fredii, Sinorhizobium sojae, Sinorhizobium sp. I, Sinorhizobium sp. II, and Sinorhizobium sp. III. These species assignments were also supported by population genetics and phylogenetic analyses of housekeeping genes and symbiosis-related genes on the chromosome and pSymB. Different levels of genetic differentiation were observed among these species or different replicons. S. sojae was the most divergent from the other test species and was characterized by its low intraspecies diversity and limited geographic distribution. Intergenic recombination dominated the evolution of 19 genes from different replicons. Intraspecies recombination happened frequently in housekeeping genes and symbiosis-related genes on the chromosome and pSymB, whereas pSymA genes showed a clear pattern of lateral-transfer events between different species. Moreover, pSymA genes were characterized by a lower level of polymorphism and recombination than those on the chromosome and pSymB. Taken together, genes from different replicons of rhizobia might be involved in the establishment of symbiosis with legumes, but these symbiosis-related genes might have evolved differently according to their corresponding replicons.
Complete nucleotide sequence of the freshwater unicellular cyanobacterium Synechococcus elongatus PCC 6301 chromosome: gene content and organization.

PubMed

Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru

2007-01-01

The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
Unifying measures of gene function and evolution.

PubMed

Wolf, Yuri I; Carmel, Liran; Koonin, Eugene V

2006-06-22

Recent genome analyses revealed intriguing correlations between variables characterizing the functioning of a gene, such as expression level (EL), connectivity of genetic and protein-protein interaction networks, and knockout effect, and variables describing gene evolution, such as sequence evolution rate (ER) and propensity for gene loss. Typically, variables within each of these classes are positively correlated, e.g. products of highly expressed genes also have a propensity to be involved in many protein-protein interactions, whereas variables between classes are negatively correlated, e.g. highly expressed genes, on average, evolve slower than weakly expressed genes. Here, we describe principal component (PC) analysis of seven genome-related variables and propose biological interpretations for the first three PCs. The first PC reflects a gene's 'importance', or the 'status' of a gene in the genomic community, with positive contributions from knockout lethality, EL, number of protein-protein interaction partners and the number of paralogues, and negative contributions from sequence ER and gene loss propensity. The next two PCs define a plane that seems to reflect the functional and evolutionary plasticity of a gene. Specifically, PC2 can be interpreted as a gene's 'adaptability' whereby genes with high adaptability readily duplicate, have many genetic interaction partners and tend to be non-essential. PC3 also might reflect the role of a gene in organismal adaptation albeit with a negative rather than a positive contribution of genetic interactions; we provisionally designate this PC 'reactivity'. The interpretation of PC2 and PC3 as measures of a gene's plasticity is compatible with the observation that genes with high values of these PCs tend to be expressed in a condition- or tissue-specific manner. 
Functional classes of genes substantially vary in status, adaptability and reactivity, with the highest status characteristic of the translation system and cytoskeletal proteins, highest adaptability seen in cellular processes and signalling genes, and top reactivity characteristic of metabolic enzymes.
Partitioning of functional gene expression data using principal points.

PubMed

Kim, Jaehee; Kim, Haseong

2017-10-12

DNA microarrays offer motivation and hope for the simultaneous study of variations in multiple genes. Gene expression is a temporal process that allows variations in expression levels with a characterized gene function over a period of time. Temporal gene expression curves can be treated as functional data since they are considered as independent realizations of a stochastic process. This process requires appropriate models to identify patterns of gene functions. The partitioning of the functional data can find homogeneous subgroups of entities for the massive genes within the inherent biological networks. Therefore, it can be a useful technique for the analysis of time-course gene expression data. We propose a new self-consistent partitioning method of functional coefficients for individual expression profiles based on the orthonormal basis system. A principal points based functional partitioning method is proposed for time-course gene expression data. The method explores the relationship between genes using Legendre coefficients as principal points to extract the features of gene functions. Our proposed method provides high connectivity in connectedness after clustering for simulated data and finds significant subsets of genes with the increased connectivity. Our approach has the comparative advantages that fewer coefficients are used from the functional data and that the principal points used for partitioning are self-consistent. As real data applications, we are able to find partitioned genes through the gene expressions found in budding yeast data and Escherichia coli data. The proposed method benefits from the use of principal points, dimension reduction, and the choice of an orthogonal basis system, and provides appropriately connected genes in the resulting subsets. We illustrate our method by applying it to sets of cell-cycle-regulated time-course yeast genes and E. coli genes.
The proposed method is able to identify highly connected genes and to explore the complex dynamics of biological systems in functional genomics.
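The feature-extraction step mentioned in this abstract, representing each time-course profile by a few Legendre coefficients, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the rescaling of time to [-1, 1], and the trapezoid-rule projection are choices made here, not the authors' implementation, and the partitioning by principal points that would follow is not shown.

```python
def legendre_values(x, degree):
    """Return [P_0(x), ..., P_degree(x)] via the three-term recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[:degree + 1]


def legendre_coefficients(times, levels, degree=3):
    """Approximate Legendre coefficients of one expression curve.

    Assumption: time points are sorted and are rescaled to [-1, 1];
    c_n = (2n + 1) / 2 * integral of f(x) * P_n(x), estimated with the
    trapezoid rule over the observed time points.
    """
    t0, t1 = times[0], times[-1]
    xs = [2 * (t - t0) / (t1 - t0) - 1 for t in times]
    basis = [legendre_values(x, degree) for x in xs]
    coeffs = []
    for n in range(degree + 1):
        integrand = [y * b[n] for y, b in zip(levels, basis)]
        integral = sum(
            (xs[i + 1] - xs[i]) * (integrand[i] + integrand[i + 1]) / 2
            for i in range(len(xs) - 1)
        )
        coeffs.append((2 * n + 1) / 2 * integral)
    return coeffs


# Hypothetical curve: a straight line in rescaled time, so only the
# degree-1 coefficient should be noticeably different from zero.
times = list(range(201))
levels = [2 * t / 200 - 1 for t in times]
coeffs = legendre_coefficients(times, levels, degree=3)
```

Each gene's curve is thereby reduced to a short coefficient vector, which is the kind of low-dimensional representation on which a partitioning step could then operate.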
Identification of Suitable Reference Genes for Gene Expression Normalization in qRT-PCR Analysis in Watermelon

PubMed Central

Gao, Lingyun; Zhao, Shuang; Jiang, Wei; Huang, Yuan; Bie, Zhilong

2014-01-01

Watermelon is one of the major Cucurbitaceae crops and the recent availability of genome sequence greatly facilitates fundamental research on it. Quantitative real-time reverse transcriptase PCR (qRT–PCR) is the preferred method for gene expression analyses, and using validated reference genes for normalization is crucial to ensure the accuracy of this method. However, a systematic validation of reference genes has not been conducted on watermelon. In this study, transcripts of 15 candidate reference genes were quantified in watermelon using qRT–PCR, and the stability of these genes was compared using geNorm and NormFinder. geNorm identified ClTUA and ClACT, ClEF1α and ClACT, and ClCAC and ClTUA as the best pairs of reference genes in watermelon organs and tissues under normal growth conditions, abiotic stress, and biotic stress, respectively. NormFinder identified ClYLS8, ClUBCP, and ClCAC as the best single reference genes under the above experimental conditions, respectively. ClYLS8 and ClPP2A were identified as the best reference genes across all samples. Two to nine reference genes were required for more reliable normalization depending on the experimental conditions. The widely used watermelon reference gene 18SrRNA was less stable than the other reference genes under the experimental conditions. Catalase family genes were identified in watermelon genome, and used to validate the reliability of the identified reference genes. ClCAT1 and ClCAT2 were induced and upregulated in the first 24 h, whereas ClCAT3 was downregulated in the leaves under low temperature stress. However, the expression levels of these genes were significantly overestimated and misinterpreted when 18SrRNA was used as a reference gene. These results provide a good starting point for reference gene selection in qRT–PCR analyses involving watermelon. PMID:24587403
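For readers unfamiliar with how geNorm ranks candidate reference genes, the sketch below computes its stability measure M: for each gene, the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples. Lower M means more stable. The gene names and quantities are invented for illustration and are not data from this study; the stepwise exclusion of the least stable gene that geNorm also performs is omitted.

```python
import math
import statistics


def genorm_m(expression):
    """Compute a geNorm-style stability value M for each candidate gene.

    expression: dict mapping gene name -> list of relative quantities
    (one positive value per sample, same sample order for every gene).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in each sample
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            pairwise_sds.append(statistics.stdev(log_ratios))
        m_values[j] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical quantities: refA and refB vary in lockstep, refC does not.
toy = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],
    "refC": [1.0, 4.0, 2.0, 16.0],
}
m = genorm_m(toy)
least_stable = max(m, key=m.get)
```

In this toy input the ratio refA/refB is constant across samples, so those two genes receive the lowest M, while refC is flagged as least stable.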
Designer TAL effectors induce disease susceptibility and resistance to Xanthomonas oryzae pv. oryzae in rice.

PubMed

Li, Ting; Huang, Sheng; Zhou, Junhui; Yang, Bing

2013-05-01

TAL (transcription activator-like) effectors from Xanthomonas bacteria activate the cognate host genes, leading to disease susceptibility or resistance dependent on the genetic context of host target genes. The modular nature and DNA recognition code of TAL effectors enable custom-engineering of designer TAL effectors (dTALE) for gene activation. However, the feasibility of dTALEs as transcription activators for gene functional analysis has not been demonstrated. Here, we report the use of dTALEs, as expressed and delivered by the pathogenic Xanthomonas oryzae pv. oryzae (Xoo), in revealing the new function of two previously identified disease-related genes and the potential of one developmental gene for disease susceptibility in rice/Xoo interactions. The dTALE gene dTALE-xa27, designed to target the susceptible allele of the resistance gene Xa27, elicited a resistant reaction in the otherwise susceptible rice cultivar IR24. Four dTALE genes were made to induce the four annotated Xa27 homologous genes in rice cultivar Nipponbare, but none of the four induced Xa27-like genes conferred resistance to the dTALE-containing Xoo strains. A dTALE gene was also generated to activate the recessive resistance gene xa13, an allele of the disease-susceptibility gene Os8N3 (also named Xa13 or OsSWEET11, a member of sucrose efflux transporter SWEET gene family). The induction of xa13 by the dTALE rendered the resistant rice IRBB13 (xa13/xa13) susceptible to Xoo. Finally, OsSWEET12, an as-yet uncharacterized SWEET gene with no corresponding naturally occurring TAL effector identified, conferred susceptibility to the Xoo strains expressing the corresponding dTALE genes. Our results demonstrate that dTALEs can be delivered through the bacterial secretion system to activate genes of interest for functional analysis in plants.
The effect of mutation on Rhodococcus equi virulence plasmid gene expression and mouse virulence.

PubMed

Ren, Jun; Prescott, John F

2004-11-15

An 81 kb virulence plasmid containing a pathogenicity island (PI) plays a crucial role in the pathogenesis of Rhodococcus equi pneumonia in foals but its specific function in virulence and regulation of plasmid-encoded virulence genes is unclear. Using a LacZ selection marker developed for R. equi in this study, in combination with an apramycin resistance gene, an efficient two-stage homologous recombination targeted gene mutation procedure was used to mutate three virulence plasmid genes, a LysR regulatory gene homologue (ORF4), a ResD-like two-component response regulator homologue (ORF8), and a gene (ORF10) of unknown function that is highly expressed by R. equi inside macrophages, as well as the chromosomal gene operon, phoPR. Virulence testing by liver clearance after intravenous injection in mice showed that the ORF4 and ORF8 mutants were fully attenuated, that the phoPR mutant was hypervirulent, and that virulence of the ORF10 mutant remained unchanged. A virulence plasmid DNA microarray was used to compare the plasmid gene expression profile of each of the four gene-targeted mutants against the parental R. equi strain. Changes were limited to PI genes and gene induction was observed for all mutants, suggesting that expression of virulence plasmid genes is dominated by a negative regulatory network. The finding of attenuation of ORF4 and ORF8 mutants despite enhanced transcription of vapA suggests that factors other than VapA are important for full expression of virulence. ORF1, a putative Lsr antigen gene, was strongly and similarly induced in all mutants, implying a common regulatory pathway affecting this gene for all four mutated genes. ORF8 is apparently the centre of this common pathway. Two distinct highly correlated gene induction patterns were observed, that of the ORF4 and ORF8 mutants, and that of the ORF10 and phoPR mutants. The gene induction pattern distinguishing these two groups paralleled their virulence in mice.
Mapping Gene Associations in Human Mitochondria using Clinical Disease Phenotypes

PubMed Central

Scharfe, Curt; Lu, Henry Horng-Shing; Neuenburg, Jutta K.; Allen, Edward A.; Li, Guan-Cheng; Klopstock, Thomas; Cowan, Tina M.; Enns, Gregory M.; Davis, Ronald W.

2009-01-01

Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes. PMID:19390613
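The abstract describes a quantitative value for gene pairs based on shared phenotypic features but does not give the formula. As a minimal sketch of one possible way to compute such a value, the example below uses the Jaccard index (shared features divided by all features of the two genes). The Jaccard index, the gene labels, and the feature sets are assumptions made for illustration, not the authors' measure or data.

```python
def phenotype_similarity(features_a, features_b):
    """Jaccard similarity of two sets of phenotypic feature labels.

    Returns a value in [0, 1]; 0.0 is returned when both sets are empty.
    """
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical annotations keyed by placeholder gene labels.
annotations = {
    "geneX": {"optic atrophy", "ataxia", "lactic acidosis"},
    "geneY": {"optic atrophy", "ataxia", "hearing loss"},
    "geneZ": {"cardiomyopathy"},
}

sim_xy = phenotype_similarity(annotations["geneX"], annotations["geneY"])
sim_xz = phenotype_similarity(annotations["geneX"], annotations["geneZ"])
```

Pairs with a high score share many annotated features; in a network analysis such scores could serve as edge weights between disease genes.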
Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

PubMed Central

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466
Spermatogenesis Drives Rapid Gene Creation and Masculinization of the X Chromosome in Stalk-Eyed Flies (Diopsidae).

PubMed

Baker, Richard H; Narechania, Apurva; DeSalle, Rob; Johns, Philip M; Reinhardt, Josephine A; Wilkinson, Gerald S

2016-03-26

Throughout their evolutionary history, genomes acquire new genetic material that facilitates phenotypic innovation and diversification. Developmental processes associated with reproduction are particularly likely to involve novel genes. Abundant gene creation impacts the evolution of chromosomal gene content and general regulatory mechanisms such as dosage compensation. Numerous studies in model organisms have found complex and, at times, contradictory relationships among these genomic attributes, highlighting the need to examine these patterns in other systems characterized by abundant sexual selection. Therefore, we examined the association among novel gene creation, tissue-specific gene expression, and chromosomal gene content within stalk-eyed flies. Flies in this family are characterized by strong sexual selection and the presence of a newly evolved X chromosome. We generated RNA-seq transcriptome data from the testes for three species within the family and from seven additional tissues in the highly dimorphic species, Teleopsis dalmanni. Analysis of dipteran gene orthology reveals dramatic testes-specific gene creation in stalk-eyed flies, involving numerous gene families that are highly conserved in other insect groups. Identification of X-linked genes for the three species indicates that the X chromosome arose prior to the diversification of the family. The most striking feature of this X chromosome is that it is highly masculinized, containing nearly twice as many testes-specific genes as expected based on its size. All the major processes that may drive differential sex chromosome gene content (creation of genes with male-specific expression, development of male-specific expression from pre-existing genes, and movement of genes with male-specific expression) are elevated on the X chromosome of T. dalmanni. This masculinization occurs despite evidence that testes-expressed genes do not achieve the same levels of gene expression on the X chromosome as they do on the autosomes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer.

PubMed

Yin, Rui; Zhao, Mingzhu; Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi; Zhang, Meiping

2017-01-01

Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukin-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes having originated before ginseng originated. In spite of their belonging to a family, the PgNBS genes have functionally dramatically differentiated and been categorized into numerous functional categories. The expressions of the across tissues, different aged roots and the roots of different genotypes. However, they are coordinating in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution and functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng particularly, and an NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species.
Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer

PubMed Central

Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi

2017-01-01

Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukin-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes having originated before ginseng originated. In spite of their belonging to a family, the PgNBS genes have functionally dramatically differentiated and been categorized into numerous functional categories. The expressions of the across tissues, different aged roots and the roots of different genotypes. However, they are coordinating in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution and functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng particularly, and an NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species. PMID:28727829
Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

PubMed

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.
Properties of genes essential for mouse development

PubMed Central

Kabir, Mitra; Barradas, Ana; Tzotzos, George T.; Hentges, Kathryn E.

2017-01-01

Essential genes are those that are critical for life. In the specific case of the mouse, they are the set of genes whose deletion means that a mouse is unable to survive after birth. As such, they are the key minimal set of genes needed for all the steps of development to produce an organism capable of life ex utero. We explored a wide range of sequence and functional features to characterise essential (lethal) and non-essential (viable) genes in mice. Experimental data curated manually identified 1301 essential genes and 3451 viable genes. Very many sequence features show highly significant differences between essential and viable mouse genes. Essential genes generally encode complex proteins, with multiple domains and many introns. These genes tend to be: long, highly expressed, old and evolutionarily conserved. These genes tend to encode ligases, transferases, phosphorylated proteins, intracellular proteins, nuclear proteins, and hubs in protein-protein interaction networks. They are involved with regulating protein-protein interactions, gene expression and metabolic processes, cell morphogenesis, cell division, cell proliferation, DNA replication, cell differentiation, DNA repair and transcription, cell differentiation and embryonic development. Viable genes tend to encode: membrane proteins or secreted proteins, and are associated with functions such as cellular communication, apoptosis, behaviour and immune response, as well as housekeeping and tissue specific functions. Viable genes are linked to transport, ion channels, signal transduction, calcium binding and lipid binding, consistent with their location in membranes and involvement with cell-cell communication. From the analysis of the composite features of essential and viable genes, we conclude that essential genes tend to be required for intracellular functions, and viable genes tend to be involved with extracellular functions and cell-cell communication. 
Knowledge of the features that are over-represented in essential genes allows for a deeper understanding of the functions and processes implemented during mammalian development. PMID:28562614
Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis.

PubMed

Brown, William M

2015-12-01

Epigenetics is the study of processes--beyond DNA sequence alteration--producing heritable characteristics. For example, DNA methylation modifies gene expression without altering the nucleotide sequence. A well-studied DNA methylation-based phenomenon is genomic imprinting (ie, genotype-independent parent-of-origin effects). We aimed to elucidate: (1) the effect of exercise on DNA methylation and (2) the role of imprinted genes in skeletal muscle gene networks (ie, gene group functional profiling analyses). Gene ontology (ie, gene product elucidation)/meta-analysis. 26 skeletal muscle and 86 imprinted genes were subjected to g:Profiler ontology analysis. Meta-analysis assessed exercise-associated DNA methylation change. g:Profiler found four muscle gene networks with imprinted loci. Meta-analysis identified 16 articles (387 genes/1580 individuals) associated with exercise. Age, method, sample size, sex and tissue variation could elevate effect size bias. Only skeletal muscle gene networks including imprinted genes were reported. Exercise-associated effect sizes were calculated by gene. Age, method, sample size, sex and tissue variation were moderators. Six imprinted loci (RB1, MEG3, UBE3A, PLAGL1, SGCE, INS) were important for muscle gene networks, while meta-analysis uncovered five exercise-associated imprinted loci (KCNQ1, MEG3, GRB10, L3MBTL1, PLAGL1). DNA methylation decreased with exercise (60% of loci). Exercise-associated DNA methylation change was stronger among older people (ie, age accounted for 30% of the variation). Among older people, genes exhibiting DNA methylation decreases were part of a microRNA-regulated gene network functioning to suppress cancer. Imprinted genes were identified in skeletal muscle gene networks and exercise-associated DNA methylation change. Exercise-associated DNA methylation modification could rewind the 'epigenetic clock' as we age. CRD42014009800. Published by the BMJ Publishing Group Limited. 
Systematic study of association of four GABAergic genes: glutamic acid decarboxylase 1 gene, glutamic acid decarboxylase 2 gene, GABA(B) receptor 1 gene and GABA(A) receptor subunit beta2 gene, with schizophrenia using a universal DNA microarray.

PubMed

Zhao, Xu; Qin, Shengying; Shi, Yongyong; Zhang, Aiping; Zhang, Jing; Bian, Li; Wan, Chunling; Feng, Guoyin; Gu, Niufan; Zhang, Guangqi; He, Guang; He, Lin

2007-07-01

Several studies have suggested the dysfunction of the GABAergic system as a risk factor in the pathogenesis of schizophrenia. In the present study, case-control association analysis was conducted in four GABAergic genes: two glutamic acid decarboxylase genes (GAD1 and GAD2), a GABA(A) receptor subunit beta2 gene (GABRB2) and a GABA(B) receptor 1 gene (GABBR1). Using a universal DNA microarray procedure we genotyped a total of 20 SNPs on the above four genes in a study involving 292 patients and 286 controls of Chinese descent. Statistically significant differences were observed in the allelic frequencies of the rs187269C/T polymorphism in the GABRB2 gene (P=0.0450, χ²=12.40, OR=1.65) and the -292A/C polymorphism in the GAD1 gene (P=0.0450, χ²=14.64, OR=1.77). In addition, using an electrophoretic mobility shift assay (EMSA), we discovered differences in the U251 nuclear protein binding to oligonucleotides representing the -292 SNP on the GAD1 gene, which suggests that the -292C allele has reduced transcription factor binding efficiency compared with the -292A allele. Using the multifactor-dimensionality reduction method (MDR), we found that the interactions among the rs187269C/T polymorphism in the GABRB2 gene, the -243A/G polymorphism in the GAD2 gene and the 27379C/T and 661C/T polymorphisms in the GAD1 gene revealed a significant association with schizophrenia (P<0.001). These findings suggest that the GABRB2 and GAD1 genes alone and the combined effects of the polymorphisms in the four GABAergic system genes may confer susceptibility to the development of schizophrenia in the Chinese population.
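
The allelic association statistics reported in this abstract (a chi-square value and an odds ratio per SNP) can be illustrated with a short Python sketch. It assumes a plain 2x2 table of allele counts and an uncorrected Pearson chi-square test with one degree of freedom; the abstract does not say which test variant or multiple-testing correction the authors used, and the counts in the usage lines are invented for illustration, not taken from the study.

```python
import math


def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df, no continuity correction), its p-value,
    and the odds ratio for a 2x2 table of allele counts.

    case_a / case_b: counts of allele A / allele B among case chromosomes
    ctrl_a / ctrl_b: counts of allele A / allele B among control chromosomes
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case = case_a + case_b
    row_ctrl = ctrl_a + ctrl_b
    col_a = case_a + ctrl_a
    col_b = case_b + ctrl_b

    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins)
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Odds of carrying allele A in cases relative to controls
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts (two alleles per genotyped individual)
chi2, p_value, odds_ratio = allelic_association(200, 384, 150, 422)
print(f"chi2={chi2:.2f}  p={p_value:.4f}  OR={odds_ratio:.2f}")
```

An odds ratio above 1 in this layout means allele A is more frequent among cases than among controls.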

Co-regulation of the atrial natriuretic factor and cardiac myosin light chain-2 genes during alpha-adrenergic stimulation of neonatal rat ventricular cells. Identification of cis sequences within an embryonic and a constitutive contractile protein gene which mediate inducible expression.

PubMed

Knowlton, K U; Baracchini, E; Ross, R S; Harris, A N; Henderson, S A; Evans, S M; Glembotski, C C; Chien, K R

1991-04-25

To study the mechanisms which mediate the transcriptional activation of cardiac genes during alpha adrenergic stimulation, the present study examined the regulated expression of three cardiac genes, a ventricular embryonic gene (atrial natriuretic factor, ANF), a constitutively expressed contractile protein gene (cardiac MLC-2), and a cardiac sodium channel gene. alpha 1-Adrenergic stimulation activates the expression and release of ANF from neonatal ventricular cells. As assessed by RNase protection analyses, treatment with alpha-adrenergic agonists increases the steady-state levels of ANF mRNA by greater than 15-fold. However, a rat cardiac sodium channel gene mRNA is not induced, indicating that alpha-adrenergic stimulation does not lead to an increase in the expression of all cardiac genes. Studies employing a series of rat ANF luciferase and rat MLC-2 luciferase fusion genes identify 315- and 92-base pair cis regulatory sequences within an embryonic gene (ANF) and a constitutively expressed contractile protein gene (MLC-2), respectively, which mediate alpha-adrenergic-inducible gene expression. Transfection of various ANF luciferase reporters into neonatal rat ventricular cells demonstrated that upstream sequences which mediate tissue-specific expression (-3003 to -638) can be segregated from those responsible for inducibility. The lack of inducibility of a cardiac Na+ channel gene, and the segregation of ANF gene sequences which mediate cardiac specific from those which mediate inducible expression, provides further insight into the relationship between muscle-specific and inducible expression during cardiac myocyte hypertrophy. Based on these results, a testable model is proposed for the induction of embryonic cardiac genes and constitutively expressed contractile protein genes and the noninducibility of a subset of cardiac genes during alpha-adrenergic stimulation of neonatal rat ventricular cells.
Combining Shigella Tn-seq data with gold-standard E. coli gene deletion data suggests rare transitions between essential and non-essential gene functionality.

PubMed

Freed, Nikki E; Bumann, Dirk; Silander, Olin K

2016-09-06

Gene essentiality - whether or not a gene is necessary for cell growth - is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. Here we present the results of a Tn-seq experiment designed to detect essential protein coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. Superficial analysis of this data suggested that 481 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 revealed that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, and in converse to this comparison, we found no genes that were essential in E. coli and non-essential in Shigella, implying that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. The data presented here highlight the small number of protein coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.
Identification of suitable reference genes for gene expression normalization in qRT-PCR analysis in watermelon.

PubMed

Kong, Qiusheng; Yuan, Jingxian; Gao, Lingyun; Zhao, Shuang; Jiang, Wei; Huang, Yuan; Bie, Zhilong

2014-01-01

Watermelon is one of the major Cucurbitaceae crops, and the recent availability of its genome sequence greatly facilitates fundamental research on it. Quantitative real-time reverse transcriptase PCR (qRT-PCR) is the preferred method for gene expression analyses, and using validated reference genes for normalization is crucial to ensure the accuracy of this method. However, a systematic validation of reference genes has not been conducted on watermelon. In this study, transcripts of 15 candidate reference genes were quantified in watermelon using qRT-PCR, and the stability of these genes was compared using geNorm and NormFinder. geNorm identified ClTUA and ClACT, ClEF1α and ClACT, and ClCAC and ClTUA as the best pairs of reference genes in watermelon organs and tissues under normal growth conditions, abiotic stress, and biotic stress, respectively. NormFinder identified ClYLS8, ClUBCP, and ClCAC as the best single reference genes under the above experimental conditions, respectively. ClYLS8 and ClPP2A were identified as the best reference genes across all samples. Two to nine reference genes were required for more reliable normalization depending on the experimental conditions. The widely used watermelon reference gene 18SrRNA was less stable than the other reference genes under the experimental conditions. Catalase family genes were identified in the watermelon genome, and used to validate the reliability of the identified reference genes. ClCAT1 and ClCAT2 were induced and upregulated in the first 24 h, whereas ClCAT3 was downregulated in the leaves under low temperature stress. However, the expression levels of these genes were significantly overestimated and misinterpreted when 18SrRNA was used as a reference gene. These results provide a good starting point for reference gene selection in qRT-PCR analyses involving watermelon.
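
To make the stability ranking concrete, here is a minimal Python sketch of the geNorm stability measure M as it is commonly described: for each candidate gene, the average over all other candidates of the standard deviation of their log2 expression ratio across samples. This is an illustration of the idea, not the geNorm software itself, and the gene names and quantities below are hypothetical.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Compute a geNorm-style stability value M for each candidate gene.

    expr maps gene name -> list of relative expression quantities
    (linear scale, positive), with samples in the same order for every gene.
    A lower M means the gene's ratio to the other candidates varies less.
    """
    genes = list(expr)
    m_values = {}
    for gene_j in genes:
        pairwise_variation = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # log2 ratio of the two genes in every sample
            log_ratios = [
                math.log2(a / b) for a, b in zip(expr[gene_j], expr[gene_k])
            ]
            # Pairwise variation = standard deviation of those log ratios
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_variation)
    return m_values


# Hypothetical relative quantities for three candidates in three samples
example = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],   # always twice refA, so the pair is stable
    "refC": [1.0, 4.0, 1.0],   # fluctuates relative to both others
}
for gene, m in sorted(genorm_m(example).items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

In this toy input refA and refB get the same, lower M, while refC ranks as the least stable candidate.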
Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae

PubMed Central

Teste, Marie-Ange; Duquenne, Manon; François, Jean M; Parrou, Jean-Luc

2009-01-01

Background: Real-time RT-PCR is the recommended method for quantitative gene expression analysis. A compulsory step is the selection of good reference genes for normalization. A few genes often referred to as HouseKeeping Genes (HSK), such as ACT1, RDN18 or PDA1 are among the most commonly used, as their expression is assumed to remain unchanged over a wide range of conditions. Since this assumption is very unlikely, a geometric averaging of multiple, carefully selected internal control genes is now strongly recommended for normalization to avoid this problem of expression variation of single reference genes. The aim of this work was to search for a set of reference genes for reliable gene expression analysis in Saccharomyces cerevisiae.
Results: From public microarray datasets, we selected potential reference genes whose expression remained apparently invariable during long-term growth on glucose. Using the algorithm geNorm, ALG9, TAF10, TFC1 and UBC6 turned out to be genes whose expression remained stable, independent of the growth conditions and the strain backgrounds tested in this study. We then showed that the geometric averaging of any subset of three genes among the six most stable genes resulted in very similar normalized data, which contrasted with inconsistent results among various biological samples when the normalization was performed with ACT1. Normalization with multiple selected genes was therefore applied to transcriptional analysis of genes involved in glycogen metabolism. We determined an induction ratio of 100-fold for GPH1 and 20-fold for GSY2 between the exponential phase and the diauxic shift on glucose. There was no induction of these two genes at this transition phase on galactose, although in both cases, the kinetics of glycogen accumulation was similar. In contrast, SGA1 expression was independent of the carbon source and increased by 3-fold in stationary phase.
Conclusion: In this work, we provided a set of genes that are suitable reference genes for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states. In contrast, we invalidated and discourage the use of ACT1 as well as other commonly used reference genes (PDA1, TDH3, RDN18, etc) as internal controls for quantitative gene expression analysis in yeast. PMID:19874630
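
The geometric averaging of several reference genes mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes that expression has already been converted to relative quantities on a linear scale; the function names, gene labels and numbers are illustrative assumptions, not values from the study.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize(target, references):
    """Divide a target gene's quantity in each sample by a per-sample
    normalization factor: the geometric mean of the reference genes.

    target:     list of relative quantities, one per sample
    references: dict of reference gene name -> list of relative quantities
                (same sample order as target)
    """
    factors = [
        geometric_mean([quantities[i] for quantities in references.values()])
        for i in range(len(target))
    ]
    return [t / f for t, f in zip(target, factors)]


# Hypothetical quantities for two samples and three reference genes
references = {
    "ref1": [1.0, 4.0],
    "ref2": [1.0, 1.0],
    "ref3": [1.0, 2.0],
}
print(normalize([10.0, 40.0], references))
```

Using the geometric rather than the arithmetic mean keeps a single strongly varying reference gene from dominating the normalization factor.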
Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae.

PubMed

Teste, Marie-Ange; Duquenne, Manon; François, Jean M; Parrou, Jean-Luc

2009-10-30

Real-time RT-PCR is the recommended method for quantitative gene expression analysis. A compulsory step is the selection of good reference genes for normalization. A few genes often referred to as HouseKeeping Genes (HSK), such as ACT1, RDN18 or PDA1 are among the most commonly used, as their expression is assumed to remain unchanged over a wide range of conditions. Since this assumption is very unlikely, a geometric averaging of multiple, carefully selected internal control genes is now strongly recommended for normalization to avoid this problem of expression variation of single reference genes. The aim of this work was to search for a set of reference genes for reliable gene expression analysis in Saccharomyces cerevisiae. From public microarray datasets, we selected potential reference genes whose expression remained apparently invariable during long-term growth on glucose. Using the algorithm geNorm, ALG9, TAF10, TFC1 and UBC6 turned out to be genes whose expression remained stable, independent of the growth conditions and the strain backgrounds tested in this study. We then showed that the geometric averaging of any subset of three genes among the six most stable genes resulted in very similar normalized data, which contrasted with inconsistent results among various biological samples when the normalization was performed with ACT1. Normalization with multiple selected genes was therefore applied to transcriptional analysis of genes involved in glycogen metabolism. We determined an induction ratio of 100-fold for GPH1 and 20-fold for GSY2 between the exponential phase and the diauxic shift on glucose. There was no induction of these two genes at this transition phase on galactose, although in both cases, the kinetics of glycogen accumulation was similar. In contrast, SGA1 expression was independent of the carbon source and increased by 3-fold in stationary phase. 
In this work, we provided a set of genes that are suitable reference genes for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states. In contrast, we invalidated and discourage the use of ACT1 as well as other commonly used reference genes (PDA1, TDH3, RDN18, etc) as internal controls for quantitative gene expression analysis in yeast.
Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data.

PubMed

Sanzol, Javier

2010-05-14

Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. 
Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.
Widespread of horizontal gene transfer in the human genome.

PubMed

Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

2017-04-04

A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated that horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.
Differential expression of the eight genes of the petunia ribulose bisphosphate carboxylase small subunit multi-gene family

PubMed Central

Dean, Caroline; Elzen, Peter van den; Tamaki, Stanley; Dunsmuir, Pamela; Bedbrook, John

1985-01-01

Of the eight nuclear genes in the plant multi-gene family which encodes the small subunit (rbcS) of Petunia (Mitchell) ribulose bisphosphate carboxylase, one rbcS gene accounts for 47% of the total rbcS gene expression in petunia leaf tissue. Expression of each of five other rbcS genes is detected at levels between 2 and 23% of the total rbcS expression in leaf tissue, while expression of the remaining two rbcS genes is not detected. There is considerable variation (500-fold) in the levels of total rbcS mRNA in six organs of petunia (leaves, sepals, petals, stems, roots and stigmas/anthers). One gene, SSU301, showed the highest levels of steady-state mRNA in each of the organs examined. We discuss the differences in the steady-state mRNA levels of the individual rbcS genes in relation to their gene structure, nucleotide sequence and genomic linkage. PMID:16453647
Identification of reference genes for RT-qPCR analysis in peach genotypes with contrasting chilling requirements.

PubMed

Marini, N; Bevilacqua, C B; Büttow, M V; Raseira, M C B; Bonow, S

2017-05-25

Selecting and validating reference genes are the first steps in studying gene expression by reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR). The present study aimed to evaluate the stability of five reference genes for the purpose of normalization when studying gene expression in various cultivars of Prunus persica with different chilling requirements. Flower bud tissues of nine peach genotypes from Embrapa's peach breeding program with different chilling requirements were used, and five candidate reference genes based on the RT-qPCR that were useful for studying the relative quantitative gene expression and stability were evaluated using geNorm, NormFinder, and bestKeeper software packages. The results indicated that among the genes tested, the most stable genes to be used as reference genes are Act and UBQ10. This study is the first survey of the stability of reference genes in peaches under chilling stress and provides guidelines for more accurate RT-qPCR results.
Loss of Sfpq Causes Long-Gene Transcriptopathy in the Brain.

PubMed

Takeuchi, Akihide; Iida, Kei; Tsubota, Toshiaki; Hosokawa, Motoyasu; Denawa, Masatsugu; Brown, J B; Ninomiya, Kensuke; Ito, Mikako; Kimura, Hiroshi; Abe, Takaya; Kiyonari, Hiroshi; Ohno, Kinji; Hagiwara, Masatoshi

2018-05-01

Genes specifically expressed in neurons contain members with extended long introns. Longer genes present a problem with respect to fulfilment of gene length transcription, and evidence suggests that dysregulation of long genes is a mechanism underlying neurodegenerative and psychiatric disorders. Here, we report the discovery that RNA-binding protein Sfpq is a critical factor for maintaining transcriptional elongation of long genes. We demonstrate that Sfpq co-transcriptionally binds to long introns and is required for sustaining long-gene transcription by RNA polymerase II through mediating the interaction of cyclin-dependent kinase 9 with the elongation complex. Phenotypically, Sfpq disruption caused neuronal apoptosis in developing mouse brains. Expression analysis of Sfpq-regulated genes revealed specific downregulation of developmentally essential neuronal genes longer than 100 kb in Sfpq-disrupted brains; those genes are enriched in associations with neurodegenerative and psychiatric diseases. The identified molecular machinery yields directions for targeted investigations of the association between long-gene transcriptopathy and neuronal diseases. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
DNMT3B modulates the expression of cancer-related genes and downregulates the expression of the gene VAV3 via methylation

PubMed Central

Peralta-Arrieta, Irlanda; Hernández-Sotelo, Daniel; Castro-Coronel, Yaneth; Leyva-Vázquez, Marco Antonio; Illades-Aguiar, Berenice

2017-01-01

Altered promoter DNA methylation is one of the most important epigenetic abnormalities in human cancer. DNMT3B, a de novo methyltransferase, is clearly related to abnormal methylation of tumour suppressor genes and DNA repair genes, and its overexpression contributes to oncogenic processes and tumorigenesis in vivo. The purpose of this study was to assess the effect of the overexpression of DNMT3B in HaCaT cells on global gene expression and on the methylation of selected genes, in order to identify genes that can be targets of DNMT3B. We found that the overexpression of DNMT3B in HaCaT cells modulated the expression of genes related to cancer, downregulated the expression of 151 genes with CpG islands and downregulated the expression of the VAV3 gene via methylation of its promoter. These results highlight the importance of DNMT3B in gene expression and human cancer. PMID:28123849
An unsupervised learning approach to find ovarian cancer genes through integration of biological data

PubMed Central

2015-01-01

Cancer is a disease characterized largely by the accumulation of out-of-control somatic mutations during the lifetime of a patient. Distinguishing driver mutations from passenger mutations has posed a challenge in modern cancer research. With the advanced development of microarray experiments and clinical studies, a large number of candidate cancer genes have been extracted, and distinguishing the informative genes among them is essential. We propose to find the informative genes for cancer by using mutation data from ovarian cancers in our framework. In our model we utilized the patient gene mutation profile, gene expression data and the gene-gene interaction network to construct a graphical representation of genes and patients. Markov processes for mutation and patients are triggered separately. After this process, cancer genes are prioritized automatically by examining their scores at their stationary distributions in the eigenvector. Extensive experiments demonstrate that the integration of heterogeneous sources of information is essential in finding important cancer genes. PMID:26328548
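
The idea of ranking genes by their score in the stationary distribution of a Markov process can be sketched with a random walk on a small gene network. The Python sketch below is not the authors' model: it assumes a single undirected network, adds a uniform restart term (the `damping` parameter) so that the walk is guaranteed to converge, and uses a made-up four-gene adjacency matrix.

```python
def stationary_distribution(adjacency, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for the stationary distribution of a random walk
    with uniform restart on a weighted graph.

    adjacency[i][j] is the edge weight between node i and node j.
    damping is the probability of following an edge instead of restarting
    (an assumption of this sketch, not a parameter from the abstract).
    """
    n = len(adjacency)
    # Row-normalise weights into transition probabilities i -> j;
    # a node without edges jumps uniformly to any node.
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([w / total if total else 1.0 / n for w in row])

    scores = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical network: gene 0 interacts with genes 1, 2 and 3
network = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
scores = stationary_distribution(network)
ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
print(ranking[0], [round(s, 4) for s in scores])
```

The converged vector is the leading eigenvector of the restart-adjusted transition matrix, so the hub gene receives the highest score in this toy network.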
Genome-Wide Analysis of NBS-LRR Genes in Sorghum Genome Revealed Several Events Contributing to NBS-LRR Gene Evolution in Grass Species

PubMed Central

Yang, Xiping; Wang, Jianping

2016-01-01

The nucleotide-binding site (NBS)–leucine-rich repeat (LRR) gene family is crucially important for offering resistance to pathogens. To explore evolutionary conservation and variability of NBS-LRR genes across grass species, we identified 88, 107, 24, and 44 full-length NBS-LRR genes in sorghum, rice, maize, and Brachypodium, respectively. A comprehensive analysis was performed on classification, genome organization, evolution, expression, and regulation of these NBS-LRR genes using sorghum as a representative of grass species. In general, the full-length NBS-LRR genes are highly clustered and duplicated in the sorghum genome, mainly due to local duplications. NBS-LRR genes have basal expression levels and are highly likely to be targeted by miRNA. The number of NBS-LRR genes in the four grass species is positively correlated with the gene clustering rate. The results provided a valuable genomic resource and insights for functional and evolutionary studies of NBS-LRR genes in grass species. PMID:26792976
Fast gene ontology based clustering for microarray experiments.

PubMed

Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

2008-11-21

Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
GSCALite: A Web Server for Gene Set Cancer Analysis.

PubMed

Liu, Chun-Jie; Hu, Fei-Fei; Xia, Mengxuan; Han, Leng; Zhang, Qiong; Guo, An-Yuan

2018-05-22

The availability of cancer genomic data makes it possible to analyze genes related to cancer. Cancer is usually the result of a set of genes and the signal of a single gene could be covered by background noise. Here, we present a web server named Gene Set Cancer Analysis (GSCALite) to analyze a set of genes in cancers with the following functional modules. (i) Differential expression in tumor vs normal, and the survival analysis; (ii) Genomic variations and their survival analysis; (iii) Gene expression associated cancer pathway activity; (iv) miRNA regulatory network for genes; (v) Drug sensitivity for genes; (vi) Normal tissue expression and eQTL for genes. GSCALite is a user-friendly web server for dynamic analysis and visualization of gene sets in cancer and drug sensitivity correlation, which will be of broad utility to cancer researchers. GSCALite is available on http://bioinfo.life.hust.edu.cn/web/GSCALite/. Contact: guoay@hust.edu.cn or zhangqiong@hust.edu.cn. Supplementary data are available at Bioinformatics online.
Primetime for Learning Genes.

PubMed

Keifer, Joyce

2017-02-11

Learning genes in mature neurons are uniquely suited to respond rapidly to specific environmental stimuli. Expression of individual learning genes, therefore, requires regulatory mechanisms that have the flexibility to respond with transcriptional activation or repression to select appropriate physiological and behavioral responses. Among the mechanisms that equip genes to respond adaptively are bivalent domains. These are specific histone modifications localized to gene promoters that are characteristic of both gene activation and repression, and have been studied primarily for developmental genes in embryonic stem cells. In this review, studies of the epigenetic regulation of learning genes in neurons, particularly the brain-derived neurotrophic factor gene (BDNF), by methylation/demethylation and chromatin modifications in the context of learning and memory will be highlighted. Because of the unique function of learning genes in the mature brain, it is proposed that bivalent domains are a characteristic feature of the chromatin landscape surrounding their promoters. This allows them to be "poised" for rapid response to activate or repress gene expression depending on environmental stimuli.
Large clusters of co-expressed genes in the Drosophila genome.

PubMed

Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

2002-12-12

Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
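The claim that clusters of three or more genes occur more often than expected by chance can be illustrated with a small shuffle test. The sketch below is not the authors' procedure: it assumes a single chromosome represented as an ordered list of booleans (True marks a co-expressed gene), and the function names `count_clusters` and `permutation_p_value` are hypothetical.

```python
import random

def count_clusters(flags, min_size=3):
    """Count runs of at least `min_size` consecutive True values."""
    clusters = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the end of the list
        clusters += 1
    return clusters

def permutation_p_value(flags, n_perm=1000, min_size=3, seed=0):
    """Compare the observed cluster count with counts after shuffling gene order.

    Assumption for illustration: shuffling the flags along one chromosome is an
    adequate null model; a real analysis would need a more careful null.
    """
    rng = random.Random(seed)
    observed = count_clusters(flags, min_size)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled, min_size) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A small p-value from `permutation_p_value` would indicate that the observed number of clusters is rarely matched when gene positions are randomized.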
Discovering potential driver genes through an integrated model of somatic mutation profiles and gene functional information.

PubMed

Xi, Jianing; Wang, Minghui; Li, Ao

2017-09-26

The accumulating availability of next-generation sequencing data offers an opportunity to pinpoint driver genes that are causally implicated in oncogenesis through computational models. Despite previous efforts made regarding this challenging problem, there is still room for improvement in the driver gene identification accuracy. In this paper, we propose a novel integrated approach called IntDriver for prioritizing driver genes. Based on a matrix factorization framework, IntDriver can effectively incorporate functional information from both the interaction network and Gene Ontology similarity, and detect driver genes mutated in different sets of patients at the same time. When evaluated against known benchmarking driver genes, the top-ranked genes in our results show highly significant enrichment for the known genes. Meanwhile, IntDriver also detects some known driver genes that are not found by the other competing approaches. When measured by precision, recall and F1 score, the performance of our approach is comparable to or better than that of the competing approaches.
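For readers unfamiliar with the evaluation metrics named in this abstract, the following minimal sketch shows how precision, recall and F1 score can be computed for a predicted gene list against a benchmark set. It does not reproduce IntDriver or its benchmark; the function name `precision_recall_f1` and the gene lists in the usage note are illustrative assumptions.

```python
def precision_recall_f1(predicted, benchmark):
    """Return (precision, recall, F1) for predicted genes versus benchmark genes."""
    predicted = set(predicted)
    benchmark = set(benchmark)
    true_positives = len(predicted & benchmark)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(benchmark) if benchmark else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With a hypothetical prediction of four genes, three of which appear in a hypothetical benchmark of five, precision is 3/4, recall is 3/5, and F1 is their harmonic mean (about 0.667).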
Relationships between Gene Structure and Genome Instability in Flowering Plants.

PubMed

Bennetzen, Jeffrey L; Wang, Xuewen

2018-03-05

Flowering plant (angiosperm) genomes are exceptional in their variability with respect to genome size, ploidy, chromosome number, gene content, and gene arrangement. Gene movement, although observed in some of the earliest plant genome comparisons, has been relatively underinvestigated. We present herein a description of several interesting properties of plant gene and genome structure that are pertinent to the successful movement of a gene to a new location. These considerations lead us to propose a model that can explain the frequent success of plant gene mobility, namely that Small Insulated Genes Move Around (SIGMAR). The SIGMAR model is then compared with known processes for gene mobilization, and predictions of the SIGMAR model are formulated to encourage future experimentation. The overall results indicate that the frequent gene movement in angiosperm genomes is partly an outcome of the unusual properties of angiosperm genes, especially their small size and insulation from epigenetic silencing. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

Evolution of the YABBY gene family in seed plants.

PubMed

Finet, Cédric; Floyd, Sandra K; Conway, Stephanie J; Zhong, Bojian; Scutt, Charles P; Bowman, John L

2016-01-01

Members of the YABBY gene family of transcription factors in angiosperms have been shown to be involved in the initiation of outgrowth of the lamina, the maintenance of polarity, and establishment of the leaf margin. Although most of the dorsal-ventral polarity genes in seed plants have homologs in non-spermatophyte lineages, the presence of YABBY genes is restricted to seed plants. To gain insight into the origin and diversification of this gene family, we reconstructed the evolutionary history of YABBY gene lineages in seed plants. Our findings suggest that either one or two YABBY genes were present in the last common ancestor of extant seed plants. We also examined the expression of YABBY genes in the gymnosperms Ephedra distachya (Gnetales), Ginkgo biloba (Ginkgoales), and Pseudotsuga menziesii (Coniferales). Our data indicate that some YABBY genes are expressed in a polar (abaxial) manner in leaves and female cones in gymnosperms. We propose that YABBY genes already acted as polarity genes in the last common ancestor of extant seed plants. © 2016 Wiley Periodicals, Inc.
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cameron, R A; Rowen, L; Nesbitt, R

2005-10-11

The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew

2005-05-10

The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Biophysical Constraints Arising from Compositional Context in Synthetic Gene Networks.

PubMed

Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M

2017-07-26

Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.
Transcriptome meta-analysis reveals common differential and global gene expression profiles in cystic fibrosis and other respiratory disorders and identifies CFTR regulators.

PubMed

Clarke, Luka A; Botelho, Hugo M; Sousa, Lisete; Falcao, Andre O; Amaral, Margarida D

2015-11-01

A meta-analysis of 13 independent microarray data sets was performed and gene expression profiles from cystic fibrosis (CF), similar disorders (COPD: chronic obstructive pulmonary disease, IPF: idiopathic pulmonary fibrosis, asthma), environmental conditions (smoking, epithelial injury), related cellular processes (epithelial differentiation/regeneration), and non-respiratory "control" conditions (schizophrenia, dieting), were compared. Similarity among differentially expressed (DE) gene lists was assessed using a permutation test, and a clustergram was constructed, identifying common gene markers. Global gene expression values were standardized using a novel approach, revealing that similarities between independent data sets run deeper than shared DE genes. Correlation of gene expression values identified putative gene regulators of the CF transmembrane conductance regulator (CFTR) gene, of potential therapeutic significance. Our study provides a novel perspective on CF epithelial gene expression in the context of other lung disorders and conditions, and highlights the contribution of differentiation/EMT and injury to gene signatures of respiratory disease. Copyright © 2015 Elsevier Inc. All rights reserved.
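One common way to assess similarity between two differentially expressed gene lists with a permutation test is to compare their observed overlap with the overlap of random gene sets of the same sizes. The sketch below illustrates that general idea only; the statistic actually used in the study is not stated in the abstract, and the function name `overlap_permutation_test` and its parameters are assumptions.

```python
import random

def overlap_permutation_test(list_a, list_b, universe, n_perm=2000, seed=0):
    """Estimate how often random gene sets overlap as much as the observed lists.

    Assumes both lists are drawn from `universe` (all genes measured) and that
    uniform random sampling from it is a reasonable null model.
    """
    rng = random.Random(seed)
    a, b = set(list_a), set(list_b)
    genes = sorted(universe)  # random.sample needs a sequence
    observed = len(a & b)
    hits = 0
    for _ in range(n_perm):
        random_a = set(rng.sample(genes, len(a)))
        random_b = set(rng.sample(genes, len(b)))
        if len(random_a & random_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Pairwise p-values or overlap scores of this kind could then feed a clustering step, although how the published clustergram was built is not described here.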
Biological interpretation of genome-wide association studies using predicted gene functions.

PubMed

Pers, Tune H; Karjalainen, Juha M; Chan, Yingleong; Westra, Harm-Jan; Wood, Andrew R; Yang, Jian; Lui, Julian C; Vedantam, Sailaja; Gustafsson, Stefan; Esko, Tonu; Frayling, Tim; Speliotes, Elizabeth K; Boehnke, Michael; Raychaudhuri, Soumya; Fehrmann, Rudolf S N; Hirschhorn, Joel N; Franke, Lude

2015-01-19

The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.
Id-1 gene and gene products as therapeutic targets for treatment of breast cancer and other types of carcinoma

DOEpatents

Desprez, Pierre-Yves; Campisi, Judith

2014-08-19

A method for treatment of breast cancer and other types of cancer. The method comprises targeting and modulating Id-1 gene expression, if any, for the Id-1 gene, or gene products in breast or other epithelial cancers in a patient by delivering products that modulate Id-1 gene expression. When expressed, Id-1 gene is a prognostic indicator that cancer cells are invasive and metastatic.
A curated catalog of canine and equine keratin genes

PubMed Central

Pujar, Shashikant; McGarvey, Kelly M.; Welle, Monika; Galichet, Arnaud; Müller, Eliane J.; Pruitt, Kim D.; Leeb, Tosso

2017-01-01

Keratins represent a large protein family with essential structural and functional roles in epithelial cells of skin, hair follicles, and other organs. During evolution the genes encoding keratins have undergone multiple rounds of duplication and humans have two clusters with a total of 55 functional keratin genes in their genomes. Due to the high similarity between different keratin paralogs and species-specific differences in gene content, the currently available keratin gene annotation in species with draft genome assemblies such as dog and horse is still imperfect. We compared the National Center for Biotechnology Information (NCBI) (dog annotation release 103, horse annotation release 101) and Ensembl (release 87) gene predictions for the canine and equine keratin gene clusters to RNA-seq data that were generated from adult skin of five dogs and two horses and from adult hair follicle tissue of one dog. Taking into consideration the knowledge on the conserved exon/intron structure of keratin genes, we annotated 61 putatively functional keratin genes in both the dog and horse, respectively. Subsequently, curators in the RefSeq group at NCBI reviewed their annotation of keratin genes in the dog and horse genomes (Annotation Release 104 and Annotation Release 102, respectively) and updated annotation and gene nomenclature of several keratin genes. The updates are now available in the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene). PMID:28846680
[Stability analysis of reference gene based on real-time PCR in Artemisia annua under cadmium treatment].

PubMed

Zhou, Liang-Yun; Mo, Ge; Wang, Sheng; Tang, Jin-Fu; Yue, Hong; Huang, Lu-Qi; Shao, Ai-Juan; Guo, Lan-Ping

2014-03-01

In this study, Actin, 18S rRNA, PAL, GAPDH and CPR of Artemisia annua were selected as candidate reference genes, and gene-specific primers for real-time PCR were designed. geNorm, NormFinder, BestKeeper, Delta CT and RefFinder were then used to evaluate their expression stability in the leaves of A. annua treated with different concentrations of Cd, with the aim of finding a reliable reference gene for gene-expression analysis. The results showed significant differences among the candidate reference genes under different treatments, and the order of expression stability was Actin > 18S rRNA > PAL > GAPDH > CPR. These results suggest that Actin, 18S rRNA and PAL can be used as ideal reference genes for gene expression analysis in A. annua, with multiple internal control genes adopted for results calibration. In addition, differences in the expression stability of the candidate reference genes were observed in the leaves of A. annua under the same concentrations of Cd, which suggests that screening of candidate reference genes is needed even under the same treatment. To the best of our knowledge, this study is the first to provide ideal reference genes under Cd treatment in the leaves of A. annua, and it offers a reference for gene expression analysis of A. annua under other conditions.
GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts

PubMed Central

Aubourg, Sébastien; Brunaud, Véronique; Bruyère, Clémence; Cock, Mark; Cooke, Richard; Cottet, Annick; Couloux, Arnaud; Déhais, Patrice; Deléage, Gilbert; Duclert, Aymeric; Echeverria, Manuel; Eschbach, Aimée; Falconet, Denis; Filippi, Ghislain; Gaspin, Christine; Geourjon, Christophe; Grienenberger, Jean-Michel; Houlné, Guy; Jamet, Elisabeth; Lechauve, Frédéric; Leleu, Olivier; Leroy, Philippe; Mache, Régis; Meyer, Christian; Nedjari, Hafed; Negrutiu, Ioan; Orsini, Valérie; Peyretaillade, Eric; Pommier, Cyril; Raes, Jeroen; Risler, Jean-Loup; Rivière, Stéphane; Rombauts, Stéphane; Rouzé, Pierre; Schneider, Michel; Schwob, Philippe; Small, Ian; Soumayet-Kampetenga, Ghislain; Stankovski, Darko; Toffano, Claire; Tognolli, Michael; Caboche, Michel; Lecharny, Alain

2005-01-01

Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot. PMID:15608279
Motif-independent prediction of a secondary metabolism gene cluster using comparative genomics: application to sequenced genomes of Aspergillus and ten other filamentous fungal species.

PubMed

Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki

2014-08-01

Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
An 8-gene qRT-PCR-based gene expression score that has prognostic value in early breast cancer

PubMed Central

2010-01-01

Background: Gene expression profiling may improve prognostic accuracy in patients with early breast cancer. Our objective was to demonstrate that it is possible to develop a simple molecular signature to predict distant relapse. Methods: We included 153 patients with stage I-II hormonal receptor-positive breast cancer. RNA was isolated from formalin-fixed paraffin-embedded samples and qRT-PCR amplification of 83 genes was performed with gene expression assays. The genes we analyzed were those included in the 70-Gene Signature, the Recurrence Score and the Two-Gene Index. The association among gene expression, clinical variables and distant metastasis-free survival was analyzed using Cox regression models. Results: An 8-gene prognostic score was defined. Distant metastasis-free survival at 5 years was 97% for patients defined as low-risk by the prognostic score versus 60% for patients defined as high-risk. The 8-gene score remained a significant factor in multivariate analysis and its performance was similar to that of two validated gene profiles: the 70-Gene Signature and the Recurrence Score. The validity of the signature was verified in independent cohorts obtained from the GEO database. Conclusions: This study identifies a simple gene expression score that complements histopathological prognostic factors in breast cancer, and can be determined in paraffin-embedded samples. PMID:20584321
Low load for disruptive mutations in autism genes and their biased transmission

PubMed Central

Iossifov, Ivan; Levy, Dan; Allen, Jeremy; Ye, Kenny; Ronemus, Michael; Lee, Yoon-ha; Yamrom, Boris; Wigler, Michael

2015-01-01

We previously computed that genes with de novo (DN) likely gene-disruptive (LGD) mutations in children with autism spectrum disorders (ASD) have high vulnerability: disruptive mutations in many of these genes, the vulnerable autism genes, will have a high likelihood of resulting in ASD. Because individuals with ASD have lower fecundity, such mutations in autism genes would be under strong negative selection pressure. An immediate prediction is that these genes will have a lower LGD load than typical genes in the human gene pool. We confirm this hypothesis in an explicit test by measuring the load of disruptive mutations in whole-exome sequence databases from two cohorts. We use information about mutational load to show that affected individuals with lower and higher intelligence quotients (IQ) can be distinguished by the mutational load in their respective gene targets, as well as to help prioritize gene targets by their likelihood of being autism genes. Moreover, we demonstrate that transmission of rare disruptions in genes with a lower LGD load occurs more often to affected offspring; we show transmission originates most often from the mother, and transmission of such variants is seen more often in offspring with lower IQ. A surprising proportion of transmission of these rare events comes from genes expressed in the embryonic brain that show sharply reduced expression shortly after birth. PMID:26401017
An internal regulatory element controls troponin I gene expression.

PubMed Central

Yutzey, K E; Kline, R L; Konieczny, S F

1989-01-01

During skeletal myogenesis, approximately 20 contractile proteins and related gene products temporally accumulate as the cells fuse to form multinucleated muscle fibers. In most instances, the contractile protein genes are regulated transcriptionally, which suggests that a common molecular mechanism may coordinate the expression of this diverse and evolutionarily unrelated gene set. Recent studies have examined the muscle-specific cis-acting elements associated with numerous contractile protein genes. All of the identified regulatory elements are positioned in the 5'-flanking regions, usually within 1,500 base pairs of the transcription start site. Surprisingly, a DNA consensus sequence that is common to each contractile protein gene has not been identified. In contrast to the results of these earlier studies, we have found that the 5'-flanking region of the quail troponin I (TnI) gene is not sufficient to permit the normal myofiber transcriptional activation of the gene. Instead, the TnI gene utilizes a unique internal regulatory element that is responsible for the correct myofiber-specific expression pattern associated with the TnI gene. This is the first example in which a contractile protein gene has been shown to rely primarily on an internal regulatory element to elicit transcriptional activation during myogenesis. The diversity of regulatory elements associated with the contractile protein genes suggests that the temporal expression of the genes may involve individual cis-trans regulatory components specific for each gene. PMID:2725509
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress

PubMed Central

Chowdhary, Surabhi; Kainth, Amoldeep S.

2017-01-01

Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5′-3′ gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene “crumpling”). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. PMID:28970326
Genome-wide characterization of the Pectate Lyase-like (PLL) genes in Brassica rapa.

PubMed

Jiang, Jingjing; Yao, Lina; Miao, Ying; Cao, Jiashu

2013-11-01

Pectate lyases (PL) depolymerize demethylated pectin (pectate, EC 4.2.2.2) by catalyzing the eliminative cleavage of α-1,4-glycosidic linked galacturonan. Pectate Lyase-like (PLL) genes are one of the largest and most complex families in plants. However, studies on the phylogeny, gene structure, and expression of PLL genes are limited. To understand the potential functions of PLL genes in plants, we characterized their intron-exon structure, phylogenetic relationships, and protein structures, and measured their expression patterns in various tissues, specifically the reproductive tissues in Brassica rapa. Sequence alignments revealed two characteristic motifs in PLL genes. The chromosome location analysis indicated that 18 of the 46 PLL genes were located in the least fractionated sub-genome (LF) of B. rapa, while 16 were located in the medium fractionated sub-genome (MF1) and 12 in the more fractionated sub-genome (MF2). Quantitative RT-PCR analysis showed that BrPLL genes were expressed in various tissues, with most of them being expressed in flowers. Detailed qRT-PCR analysis identified 11 pollen specific PLL genes and several other genes with unique spatial expression patterns. In addition, some duplicated genes showed similar expression patterns. The phylogenetic analysis identified three PLL gene subfamilies in plants, among which subfamily II might have evolved from gene neofunctionalization or subfunctionalization. Therefore, this study opens the possibility for exploring the roles of PLL genes during plant development.
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress.

PubMed

Chowdhary, Surabhi; Kainth, Amoldeep S; Gross, David S

2017-12-15

Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5'-3' gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene "crumpling"). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. Copyright © 2017 American Society for Microbiology.
Functional Interactions of Major Rice Blast Resistance Genes Pi-ta with Pi-b and Minor Blast Resistance QTLs

USDA-ARS's Scientific Manuscript database

Major blast resistance (R) genes confer resistance in a gene-for-gene manner. However, little information is available on interactions between R genes. In this study, interactions between two rice blast R genes, Pi-ta and Pi-b, and other minor blast resistance quantitative trait locus (QTLs) were in...
Cloning and sequencing of an alkaline protease gene from Bacillus lentus and amplification of the gene on the B. lentus chromosome by an improved technique.

PubMed

Jørgensen, P L; Tangney, M; Pedersen, P E; Hastrup, S; Diderichsen, B; Jørgensen, S T

2000-02-01

A gene encoding an alkaline protease was cloned from an alkalophilic bacillus, and its nucleotide sequence was determined. The cloned gene was used to increase the copy number of the protease gene on the chromosome by an improved gene amplification technique.
An epigenetic state associated with areas of gene duplication

PubMed Central

Gimelbrant, Alexander A.; Chess, Andrew

2006-01-01

Asynchronous DNA replication is an epigenetically determined feature found in all cases of monoallelic expression, including genomic imprinting, X-inactivation, and random monoallelic expression of autosomal genes such as immunoglobulins and olfactory receptor genes. Most genes of the latter class were identified in experiments focused on genes functioning in the chemosensory and immune systems. We performed an unbiased survey of asynchronous replication in the mouse genome, excluding known asynchronously replicated genes. Fully 10% (eight of 80) of the genes tested exhibited asynchronous replication. A common feature of the newly identified asynchronously replicated areas is their proximity to areas of tandem gene duplication. Testing of other clustered areas supported the idea that such regions are enriched with asynchronously replicated genes. PMID:16687731

Converting cancer genes into killer genes.

PubMed Central

Da Costa, L T; Jen, J; He, T C; Chan, T A; Kinzler, K W; Vogelstein, B

1996-01-01

Over the past decade, it has become clear that tumorigenesis is driven by alterations in genes that control cell growth or cell death. Theoretically, the proteins encoded by these genes provide excellent targets for new therapeutic agents. Here, we describe a gene therapy approach to specifically kill tumor cells expressing such oncoproteins. In outline, the target oncoprotein binds to exogenously introduced gene products, resulting in transcriptional activation of a toxic gene. As an example, we show that this approach can be used to specifically kill cells overexpressing a mutant p53 gene in cell culture. The strategy may be generally applicable to neoplastic diseases in which the underlying patterns of genetic alterations or abnormal gene expression are known. PMID:8633039
[Construction and expression of the targeting super-antigen EGF-SEA fusion gene].

PubMed

Xie, Yang; Peng, Shaoping; Liao, Zhiying; Liu, Jiafeng; Liu, Xuemei; Chen, Weifeng

2014-05-01

To construct an expression vector for the SEA-EGF fusion gene, the SEA gene and the EGF gene segment were cloned separately by PCR and RT-PCR, and the two genes were joined by bridge PCR. The fusion gene EGF-SEA was inserted into the expression vector PET-44, and secretion of the fusion protein SEA-EGF was induced by the antileptic. The gene fragment encoding the EGF and SEA mature peptides was successfully cloned. The fusion gene EGF-SEA was successfully constructed and inserted into the expression vector. The new recombinant expression vector for the fusion gene EGF-SEA is specific for head and neck cancer and lays the foundation for further study of immune therapy targeting head and neck tumors with the fusion protein SEA-EGF.
Gene doping: gene delivery for olympic victory

PubMed Central

Gould, David

2013-01-01

With one recently recommended gene therapy in Europe and a number of other gene therapy treatments now proving effective in clinical trials, it is feasible that the same technologies will soon be adopted in the world of sport by unscrupulous athletes and their trainers in so-called ‘gene doping’. In this article an overview of the successful gene therapy clinical trials is provided and the potential targets for gene doping are highlighted. Whether a doping gene product is secreted from the engineered cells or is retained locally to, or inside, the engineered cells will, to some extent, determine the likelihood of detection. It is clear that effective gene delivery technologies now exist, and it is important that detection and prevention plans are in place. PMID:23082866
Defining suitable reference genes for RT-qPCR analysis on human sertoli cells after 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) exposure.

PubMed

Ribeiro, Mariana Antunes; dos Reis, Mariana Bisarro; de Moraes, Leonardo Nazário; Briton-Jones, Christine; Rainho, Cláudia Aparecida; Scarano, Wellerson Rodrigo

2014-11-01

Quantitative real-time RT-PCR (qPCR) has proven to be a valuable molecular technique to quantify gene expression. There are few studies in the literature that describe suitable reference genes to normalize gene expression data. Studies of transcriptionally disruptive toxins, like tetrachlorodibenzo-p-dioxin (TCDD), require careful consideration of reference genes. The present study was designed to validate potential reference genes in human Sertoli cells after exposure to TCDD. Thirty-two candidate reference genes were analyzed to determine their applicability. The geNorm and NormFinder software packages were used to estimate the expression stability of the 32 genes and to identify the most suitable genes for qPCR data normalization.
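The stability ranking mentioned in this abstract can be illustrated with a minimal sketch of the geNorm stability measure M: for each candidate gene, M is the average, over all other candidates, of the standard deviation of the log2 expression ratios across samples (lower M means more stable). This is not the geNorm program itself and does not reflect the authors' data; the gene names REF_A, REF_B and REF_C and all values are hypothetical.

```python
import math
import statistics


def genorm_m(expression):
    """Return the geNorm-style stability value M for every gene.

    expression: dict mapping gene name -> list of relative expression
    quantities (linear scale, one value per sample, same sample order).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Pairwise variation: SD of the log2 ratio of the two genes across samples.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            pairwise_sd.append(statistics.stdev(log_ratios))
        # M is the mean pairwise variation against all other candidates.
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical toy data: REF_A and REF_B keep a constant ratio, REF_C does not.
toy = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],
    "REF_C": [1.0, 8.0, 2.0, 4.0],
}
m = genorm_m(toy)
ranking = sorted(m, key=m.get)  # most stable (lowest M) first
print(ranking, m)
```

In the toy data the two co-varying genes obtain the same, lower M, while the erratic gene is ranked last, which is the behaviour a reference-gene screen relies on.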
A novel approach for human whole transcriptome analysis based on absolute gene expression of microarray data

PubMed Central

Bikel, Shirley; Jacobo-Albavera, Leonor; Sánchez-Muñoz, Fausto; Cornejo-Granados, Fernanda; Canizales-Quinteros, Samuel; Soberón, Xavier; Sotelo-Mundo, Rogerio R.; del Río-Navarro, Blanca E.; Mendoza-Vargas, Alfredo; Sánchez, Filiberto

2017-01-01

Background In spite of the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. There are over 767,000 RNA microarrays from human samples in public repositories, which are an invaluable resource for biomedical research and personalized medicine. The absolute gene expression analysis allows the transcriptome profiling of all expressed genes under a specific biological condition without the need of a reference sample. However, the background fluorescence represents a challenge to determine the absolute gene expression in microarrays. Given that the Y chromosome is absent in female subjects, we used it as a new approach for absolute gene expression analysis in which the fluorescence of the Y chromosome genes of female subjects was used as the background fluorescence for all the probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold, allowing the differentiation between expressed and non-expressed genes in microarrays. Methods We extracted RNA from leukocyte samples of 16 children (nine males and seven females, ages 6–10 years). An Affymetrix Gene Chip Human Gene 1.0 ST Array was run for each sample and the fluorescence of 124 genes of the Y chromosome was used to calculate the absolute gene expression threshold. After that, several expressed and non-expressed genes according to our absolute gene expression threshold were compared against the expression obtained using real-time quantitative polymerase chain reaction (RT-qPCR). Results Of the 124 genes of the Y chromosome, three genes (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression level by RT-qPCR.
Then, we selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched. Interestingly, these pathways were related to the typical functions of leukocyte cells, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied this method to obtain the absolute gene expression threshold in already published microarray data of liver cells, where the top 5% expressed genes showed an enrichment of typical KEGG pathways for liver cells. Our results suggest that the three selected genes of the Y chromosome can be used to calculate an absolute gene expression threshold, allowing a transcriptome profiling of microarray data without the need of an additional reference experiment. Discussion Our approach based on the establishment of a threshold for absolute gene expression analysis will allow a new way to analyze thousands of microarrays from public databases. This allows the study of different human diseases without the need of having additional samples for relative expression experiments. PMID:29230367
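The idea of using Y-chromosome probe fluorescence in female samples as a background estimate can be sketched as follows. The abstract does not state how the threshold is computed from those intensities, so the rule used here (mean plus two standard deviations of the female intensities) is purely an assumption for illustration, as are the function names, the gene names GENE_X and GENE_Z, and all numeric values.

```python
import statistics


def absolute_expression_threshold(female_y_intensities, n_sd=2.0):
    """Estimate a background threshold from Y-gene probes in female samples.

    female_y_intensities: dict mapping Y-chromosome gene -> list of log2
    intensities measured in female samples, where the gene is absent.
    ASSUMPTION: threshold = mean + n_sd * SD of the pooled values; the
    source abstract does not specify the actual formula.
    """
    pooled = [v for values in female_y_intensities.values() for v in values]
    return statistics.mean(pooled) + n_sd * statistics.stdev(pooled)


def call_expressed(gene_intensities, threshold):
    """Label a gene as expressed when its mean intensity exceeds the threshold."""
    return {
        gene: statistics.mean(values) > threshold
        for gene, values in gene_intensities.items()
    }


# Hypothetical log2 intensities of two Y genes in three female samples.
background = {
    "DDX3Y": [3.0, 3.2, 2.8],
    "EIF1AY": [3.1, 2.9, 3.0],
}
threshold = absolute_expression_threshold(background)

# Hypothetical genes to classify against the background threshold.
calls = call_expressed(
    {"GENE_X": [8.1, 7.9, 8.0], "GENE_Z": [3.0, 3.1, 2.9]},
    threshold,
)
print(round(threshold, 3), calls)
```

The appeal of this design is that the background estimate comes from the same arrays being analyzed, so no separate reference experiment is required.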
Identification of Common Differentially Expressed Genes in Urinary Bladder Cancer

PubMed Central

Zaravinos, Apostolos; Lambrou, George I.; Boulalas, Ioannis; Delakas, Dimitris; Spandidos, Demetrios A.

2011-01-01

Background Current diagnosis and treatment of urinary bladder cancer (BC) has shown great progress with the utilization of microarrays. Purpose Our goal was to identify common differentially expressed (DE) genes among clinically relevant subclasses of BC using microarrays. Methodology/Principal Findings BC samples and controls, both experimental and publicly available datasets, were analyzed by whole genome microarrays. We grouped the samples according to their histology and defined the DE genes in each sample individually, as well as in each tumor group. A dual analysis strategy was followed. First, experimental samples were analyzed and conclusions were formulated; and second, experimental sets were combined with publicly available microarray datasets and were further analyzed in search of common DE genes. The experimental dataset identified 831 genes that were DE in all tumor samples, simultaneously. Moreover, 33 genes were up-regulated and 85 genes were down-regulated in all 10 BC samples compared to the 5 normal tissues, simultaneously. Hierarchical clustering partitioned tumor groups in accordance to their histology. K-means clustering of all genes and all samples, as well as clustering of tumor groups, presented 49 clusters. K-means clustering of common DE genes in all samples revealed 24 clusters. Genes manifested various differential patterns of expression, based on PCA. YY1 and NFκB were among the most common transcription factors that regulated the expression of the identified DE genes. Chromosome 1 contained 32 DE genes, followed by chromosomes 2 and 11, which contained 25 and 23 DE genes, respectively. Chromosome 21 had the least number of DE genes. 
GO analysis revealed the prevalence of transport and binding genes in the common down-regulated DE genes; the prevalence of RNA metabolism and processing genes in the up-regulated DE genes; as well as the prevalence of genes responsible for cell communication and signal transduction in the DE genes that were down-regulated in T1-Grade III tumors and up-regulated in T2/T3-Grade III tumors. Combination of samples from all microarray platforms revealed 17 common DE genes, (BMP4, CRYGD, DBH, GJB1, KRT83, MPZ, NHLH1, TACR3, ACTC1, MFAP4, SPARCL1, TAGLN, TPM2, CDC20, LHCGR, TM9SF1 and HCCS) 4 of which participate in numerous pathways. Conclusions/Significance The identification of the common DE genes among BC samples of different histology can provide further insight into the discovery of new putative markers. PMID:21483740
Genome-Wide Analyses of the NAC Transcription Factor Gene Family in Pepper (Capsicum annuum L.): Chromosome Location, Phylogeny, Structure, Expression Patterns, Cis-Elements in the Promoter, and Interaction Network

PubMed Central

Diao, Weiping; Snyder, John C.; Liu, Jinbing; Pan, Baogui; Guo, Guangjun; Ge, Wei; Dawood, Mohammad Hasan Salman Ali

2018-01-01

The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or less introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns. 
Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with different types of stress. Our results also showed that CaNAC36 plays an important role in the interaction network, interacting with 48 genes. Most of these genes are in the mitogen-activated protein kinase (MAPK) family. Taken together, our results provide a platform for further studies to identify the biological functions of CaNAC genes. PMID:29596349
Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

PubMed

Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

2017-11-25

Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to the carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Notably, in most cases such co-localization does not originate from local duplication events.
Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns. This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).
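To make the notions of cassette and singleton in this abstract concrete, the following minimal sketch groups genes of interest into cassettes by their order along a chromosome. The adjacency rule (two genes belong to the same cassette when their positions in gene order differ by at most `max_gap`) is an assumption chosen for illustration; the abstract does not give the authors' actual criterion, and the positions are made up.

```python
def find_cassettes(positions, max_gap=1):
    """Group gene-order positions into runs of neighbouring genes.

    positions: gene-order indices (0, 1, 2, ...) of the genes of interest
    on one chromosome.
    ASSUMPTION: genes whose indices differ by at most max_gap are treated
    as co-localized; this cutoff is hypothetical.
    """
    groups = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            groups.append(current)
            current = []
        current.append(pos)
    if current:
        groups.append(current)
    return groups


# Hypothetical positions of carbohydrate metabolism genes on one chromosome.
groups = find_cassettes([3, 4, 5, 10, 20, 21])
cassettes = [g for g in groups if len(g) >= 2]   # two or more co-localized genes
singletons = [g[0] for g in groups if len(g) == 1]
colocalized_fraction = sum(len(g) for g in cassettes) / sum(len(g) for g in groups)
print(cassettes, singletons, colocalized_fraction)
```

A fraction computed this way is the kind of summary the abstract reports per genome set or taxon, although the real analysis may define neighbourhood differently.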
Structural and transcriptional analysis of plant genes encoding the bifunctional lysine ketoglutarate reductase saccharopine dehydrogenase enzyme.

PubMed

Anderson, Olin D; Coleman-Derr, Devin; Gu, Yong Q; Heath, Sekou

2010-06-16

Among the dietary essential amino acids, the most severely limiting in the cereals is lysine. Since cereals make up half of the human diet, lysine limitation has quality/nutritional consequences. The breakdown of lysine is controlled mainly by the catabolic bifunctional enzyme lysine ketoglutarate reductase - saccharopine dehydrogenase (LKR/SDH). The LKR/SDH gene has been reported to produce transcripts for the bifunctional enzyme and separate monofunctional transcripts. In addition to lysine metabolism, this gene has been implicated in a number of metabolic and developmental pathways, which along with its production of multiple transcript types and complex exon/intron structure suggest an important node in plant metabolism. Understanding more about the LKR/SDH gene is thus interesting both from an applied standpoint and for basic plant metabolism. The current report describes a wheat genomic fragment containing an LKR/SDH gene and adjacent genes. The wheat LKR/SDH genomic segment was found to originate from the A-genome of wheat, and EST analysis indicates all three LKR/SDH genes in hexaploid wheat are transcriptionally active. A comparison of a set of plant LKR/SDH genes suggests regions of greater sequence conservation likely related to critical enzymatic functions and metabolic controls. Although most plants contain only a single LKR/SDH gene per genome, poplar contains at least two functional bifunctional genes in addition to a monofunctional LKR gene. Analysis of ESTs finds evidence for monofunctional LKR transcripts in switchgrass, and monofunctional SDH transcripts in wheat, Brachypodium, and poplar. The analysis of a wheat LKR/SDH gene and comparative structural and functional analyses among available plant genes provides new information on this important gene.
Both the structure of the LKR/SDH gene and the immediately adjacent genes show lineage-specific differences between monocots and dicots, and findings suggest variation in activity of LKR/SDH genes among plants. Although most plant genomes seem to contain a single conserved LKR/SDH gene per genome, poplar possesses multiple contiguous genes. A preponderance of SDH transcripts suggests the LKR region may be more rate-limiting. Only switchgrass has EST evidence for LKR monofunctional transcripts. Evidence for monofunctional SDH transcripts shows a novel intron in wheat, Brachypodium, and poplar.
Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana.

PubMed

Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

2014-01-03

Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after the split from A. thaliana. Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. But for TNL type, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences.
Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results, together with expression pattern analysis of NBS-encoding orthologous genes, provide a useful resource for functional characterization of these genes and genetic improvement of relevant crops.
Ferritin gene organization: differences between plants and animals suggest possible kingdom-specific selective constraints.

PubMed

Proudhon, D; Wei, J; Briat, J; Theil, E C

1996-03-01

Ferritin, a protein widespread in nature, concentrates iron approximately 10^11- to 10^12-fold above the solubility within a spherical shell of 24 subunits; it derives in plants and animals from a common ancestor (based on sequence) but displays a cytoplasmic location in animals compared to the plastid in contemporary plants. Ferritin gene regulation in plants and animals is altered by development, hormones, and excess iron; iron signals target DNA in plants but mRNA in animals. Evolution has thus conserved the two end points of ferritin gene expression, the physiological signals and the protein structure, while allowing some divergence of the genetic mechanisms. Comparison of ferritin gene organization in plants and animals, made possible by the cloning of a dicot (soybean) ferritin gene presented here and the recent cloning of two monocot (maize) ferritin genes, shows evolutionary divergence in ferritin gene organization between plants and animals but conservation among plants or among animals; divergence in the genetic mechanism for iron regulation is reflected by the absence in all three plant genes of the IRE, a highly conserved, noncoding sequence in vertebrate animal ferritin mRNA. In plant ferritin genes, the number of introns (n = 7) is higher than in animals (n = 3). Second, no intron positions are conserved when ferritin genes of plants and animals are compared, although all ferritin gene introns are in the coding region; within kingdoms, the intron positions in ferritin genes are conserved. Finally, secondary protein structure has no apparent relationship to intron/exon boundaries in plant ferritin genes, whereas in animal ferritin genes the correspondence is high.
The structural differences in introns/exons among phylogenetically related ferritin coding sequences and the high conservation of the gene structure within plant or animal kingdoms suggest that kingdom-specific functional constraints may exist to maintain a particular intron/exon pattern within ferritin genes. In the case of plants, where ferritin gene intron placement is unrelated to triplet codons or protein structure, and where ferritin is targeted to the plastid, the selection pressure on gene organization may relate to RNA function and plastid/nuclear signaling.
Repeated evolution of chimeric fusion genes in the β-globin gene family of laurasiatherian mammals.

PubMed

Gaudry, Michael J; Storz, Jay F; Butts, Gary Tyler; Campbell, Kevin L; Hoffmann, Federico G

2014-05-09

The evolutionary fate of chimeric fusion genes may be strongly influenced by their recombinational mode of origin and the nature of functional divergence between the parental genes. In the β-globin gene family of placental mammals, the two postnatally expressed δ- and β-globin genes (HBD and HBB, respectively) have a propensity for recombinational exchange via gene conversion and unequal crossing-over. In the latter case, there are good reasons to expect differences in retention rates for the reciprocal HBB/HBD and HBD/HBB fusion genes due to thalassemia pathologies associated with the HBD/HBB "Lepore" deletion mutant in humans. Here, we report a comparative genomic analysis of the mammalian β-globin gene cluster, which revealed that chimeric HBB/HBD fusion genes originated independently in four separate lineages of laurasiatherian mammals: Eulipotyphlans (shrews, moles, and hedgehogs), carnivores, microchiropteran bats, and cetaceans. In cases where an independently derived "anti-Lepore" duplication mutant has become fixed, the parental HBD and/or HBB genes have typically been inactivated or deleted, so that the newly created HBB/HBD fusion gene is primarily responsible for synthesizing the β-type subunits of adult and fetal hemoglobin (Hb). Contrary to conventional wisdom that the HBD gene is a vestigial relict that is typically inactivated or expressed at negligible levels, we show that HBD-like genes often encode a substantial fraction (20-100%) of β-chain Hbs in laurasiatherian taxa. Our results indicate that the ascendancy or resuscitation of genes with HBD-like coding sequence requires the secondary acquisition of HBB-like promoter sequence via unequal crossing-over or interparalog gene conversion. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Identification and Validation of Reference Genes and Their Impact on Normalized Gene Expression Studies across Cultivated and Wild Cicer Species

PubMed Central

Reddy, Palakolanu Sudhakar; Sri Cindhuri, Katamreddy; Sivaji Ganesh, Adusumalli; Sharma, Kiran Kumar

2016-01-01

Quantitative Real-Time PCR (qPCR) is a preferred and reliable method for accurate quantification of gene expression to understand precise gene functions. A total of 25 candidate reference genes including traditional and new generation reference genes were selected and evaluated in a diverse set of chickpea samples. The samples used in this study included nine chickpea genotypes (Cicer spp.) comprising cultivated and wild species, six abiotic stress treatments (drought, salinity, high vapor pressure deficit, abscisic acid, cold and heat shock), and five diverse tissues (leaf, root, flower, seedlings and seed). The geNorm, NormFinder and RefFinder algorithms used to identify stably expressed genes in four sample sets revealed stable expression of UCP and G6PD genes across genotypes, while TIP41 and CAC were highly stable under abiotic stress conditions. While PP2A and ABCT genes were ranked as best for different tissues, ABCT, UCP and CAC were most stable across all samples. This study demonstrated the usefulness of new generation reference genes for more accurate qPCR-based gene expression quantification in cultivated as well as wild chickpea species. Validation of the best reference genes was carried out by studying their impact on normalization of aquaporin genes PIP1;4 and TIP3;1, in three contrasting chickpea genotypes under high vapor pressure deficit (VPD) treatment. The chickpea TIP3;1 gene was significantly upregulated under high VPD conditions, with higher relative expression in the drought-susceptible genotype, confirming the suitability of the selected reference genes for expression analysis. This is the first comprehensive study on the stability of the new generation reference genes for qPCR studies in chickpea across species, different tissues and abiotic stresses. PMID:26863232
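The effect of reference genes on normalized expression can be sketched with the widely used 2^(-ΔΔCt) calculation, here normalizing a target gene against the mean Ct of several reference genes (equivalent to the geometric mean of their quantities when amplification efficiency is assumed to be 100%). This is an illustrative sketch, not the authors' pipeline; the Ct values below are invented.

```python
import statistics


def relative_expression(target_ct_treated, ref_cts_treated,
                        target_ct_control, ref_cts_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    ref_cts_*: Ct values of the chosen reference genes in the same sample.
    ASSUMPTION: 100% amplification efficiency for every gene, so that one
    cycle corresponds to a two-fold difference in template.
    """
    # dCt: target Ct relative to the average reference Ct in each condition.
    d_ct_treated = target_ct_treated - statistics.mean(ref_cts_treated)
    d_ct_control = target_ct_control - statistics.mean(ref_cts_control)
    # ddCt: difference between conditions; fold change is 2^(-ddCt).
    return 2.0 ** -(d_ct_treated - d_ct_control)


# Hypothetical Ct values: one target gene and three reference genes.
fold_change = relative_expression(
    target_ct_treated=22, ref_cts_treated=[20, 21, 19],
    target_ct_control=25, ref_cts_control=[20, 21, 19],
)
print(fold_change)  # 8.0: the target crosses threshold three cycles earlier after treatment
```

If an unstable reference gene shifted its own Ct under treatment, the same target data would yield a different fold change, which is why stability ranking precedes normalization.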
Identification and Validation of Reference Genes and Their Impact on Normalized Gene Expression Studies across Cultivated and Wild Cicer Species.

PubMed

Reddy, Dumbala Srinivas; Bhatnagar-Mathur, Pooja; Reddy, Palakolanu Sudhakar; Sri Cindhuri, Katamreddy; Sivaji Ganesh, Adusumalli; Sharma, Kiran Kumar

2016-01-01

Quantitative Real-Time PCR (qPCR) is a preferred and reliable method for accurate quantification of gene expression to understand precise gene functions. A total of 25 candidate reference genes including traditional and new generation reference genes were selected and evaluated in a diverse set of chickpea samples. The samples used in this study included nine chickpea genotypes (Cicer spp.) comprising cultivated and wild species, six abiotic stress treatments (drought, salinity, high vapor pressure deficit, abscisic acid, cold and heat shock), and five diverse tissues (leaf, root, flower, seedlings and seed). The geNorm, NormFinder and RefFinder algorithms used to identify stably expressed genes in four sample sets revealed stable expression of UCP and G6PD genes across genotypes, while TIP41 and CAC were highly stable under abiotic stress conditions. While PP2A and ABCT genes were ranked as best for different tissues, ABCT, UCP and CAC were most stable across all samples. This study demonstrated the usefulness of new generation reference genes for more accurate qPCR-based gene expression quantification in cultivated as well as wild chickpea species. Validation of the best reference genes was carried out by studying their impact on normalization of aquaporin genes PIP1;4 and TIP3;1, in three contrasting chickpea genotypes under high vapor pressure deficit (VPD) treatment. The chickpea TIP3;1 gene was significantly upregulated under high VPD conditions, with higher relative expression in the drought-susceptible genotype, confirming the suitability of the selected reference genes for expression analysis. This is the first comprehensive study on the stability of the new generation reference genes for qPCR studies in chickpea across species, different tissues and abiotic stresses.
Gene Presence-Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure.

PubMed

Hartmann, Fanny E; Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

2018-04-01

Gene presence-absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence-absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence-absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence-absence polymorphism in the two species. Genes displaying presence-absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence-absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence-absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies.
B lymphocyte selection and age-related changes in VH gene usage in mutant Alicia rabbits.

PubMed

Zhu, X; Boonthum, A; Zhai, S K; Knight, K L

1999-09-15

Young Alicia rabbits use VHa-negative genes, VHx and VHy, in most VDJ genes, and their serum Ig is VHa negative. However, as Alicia rabbits age, VHa2 allotype Ig is produced at high levels. We investigated which VH gene segments are used in the VDJ genes of a2 Ig-secreting hybridomas and of a2 Ig+ B cells from adult Alicia rabbits. We found that 21 of the 25 VDJ genes used the a2-encoding genes, VH4 or VH7; the other four VDJ genes used four unknown VH gene segments. Because VH4 and VH7 are rarely found in VDJ genes of normal or young Alicia rabbits, we investigated the timing of rearrangement of these genes in Alicia rabbits. During fetal development, VH4 was used in 60-80% of nonproductively rearranged VDJ genes, and VHx and VHy together were used in 10-26%. These data indicate that during B lymphopoiesis VH4 is preferentially rearranged. However, the percentage of productive VHx- and VHy-utilizing VDJ genes increased from 38% at day 21 of gestation to 89% at birth (gestation day 31), whereas the percentage of VH4-utilizing VDJ genes remained at 15%. These data suggest that during fetal development, either VH4-utilizing B-lineage cells are selectively eliminated, or B cells with VHx- and VHy-utilizing VDJ genes are selectively expanded, or both. The accumulation of peripheral VH4-utilizing a2 B cells with age indicates that these B cells might be selectively expanded in the periphery. We discuss the possible selection mechanisms that regulate VH gene segment usage in rabbit B cells during lymphopoiesis and in the periphery.
Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon.

PubMed

Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi

2017-01-01

We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factors encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Selection and validation of reference genes for gene expression analysis in apomictic and sexual Cenchrus ciliaris

PubMed Central

2013-01-01

Background: Apomixis is a naturally occurring asexual mode of seed reproduction resulting in offspring genetically identical to the maternal plant. Identifying differential gene expression patterns between apomictic and sexual plants is valuable to help deconstruct the trait. Quantitative RT-PCR (qRT-PCR) is a popular method for analyzing gene expression. Normalizing gene expression data using proper reference genes which show stable expression under investigated conditions is critical in qRT-PCR analysis. We used qRT-PCR to validate expression and stability of six potential reference genes (EF1alpha, EIF4A, UBCE, GAPDH, ACT2 and TUBA) in vegetative and reproductive tissues of B-2S and B-12-9 accessions of C. ciliaris. Findings: Among tissue types evaluated, EF1alpha showed the highest level of expression while TUBA showed the lowest. When all tissue types were evaluated and compared between genotypes, EIF4A was the most stable reference gene. Gene expression stability for specific ovary stages of B-2S and B-12-9 was also determined. Except for TUBA, all other tested reference genes could be used for any stage-specific ovary tissue normalization, irrespective of the mode of reproduction. Conclusion: Our gene expression stability assay using six reference genes, in sexual and apomictic accessions of C. ciliaris, suggests that EIF4A is the most stable gene across all tissue types analyzed. All other tested reference genes, with the exception of TUBA, could be used for gene expression comparison studies between sexual and apomictic ovaries over multiple developmental stages. This reference gene validation data in C. ciliaris will serve as an important base for future apomixis-related transcriptome data validation. PMID:24083672
Gene-gene interactions and gene polymorphisms of VEGFA and EG-VEGF gene systems in recurrent pregnancy loss.

PubMed

Su, Mei-Tsz; Lin, Sheng-Hsiang; Chen, Yi-Chi; Kuo, Pao-Lin

2014-06-01

Both vascular endothelial growth factor A (VEGFA) and endocrine gland-derived vascular endothelial growth factor (EG-VEGF) systems play major roles in angiogenesis. A body of evidence suggests VEGFs regulate critical processes during pregnancy and have been associated with recurrent pregnancy loss (RPL). However, little information is available regarding the interaction of these two major angiogenesis-related systems in early human pregnancy. This study was conducted to investigate the association of gene polymorphisms and gene-gene interaction among genes in VEGFA and EG-VEGF systems and idiopathic RPL. A total of 98 women with history of idiopathic RPL and 142 controls were included, and 5 functional SNPs selected from VEGFA, KDR, EG-VEGF (PROK1), PROKR1 and PROKR2 were genotyped. We used multifactor dimensionality reduction (MDR) analysis to choose a best model and evaluate gene-gene interactions. Ingenuity pathways analysis (IPA) was introduced to explore possible complex interactions. Two receptor gene polymorphisms [KDR (Q472H) and PROKR2 (V331M)] were significantly associated with idiopathic RPL (P<0.01). The MDR test revealed that the KDR (Q472H) polymorphism was the locus best associated with RPL (P=0.02). IPA revealed EG-VEGF and VEGFA systems shared several canonical signaling pathways that may contribute to gene-gene interactions, including the Akt, IL-8, EGFR, MAPK, SRC, VHL, HIF-1A and STAT3 signaling pathways. Two receptor gene polymorphisms [KDR (Q472H) and PROKR2 (V331M)] were significantly associated with idiopathic RPL. EG-VEGF and VEGFA systems shared several canonical signaling pathways that may contribute to gene-gene interactions, including the Akt, IL-8, EGFR, MAPK, SRC, VHL, HIF-1A and STAT3.
Defining the Role of Essential Genes in Human Disease

PubMed Central

Robertson, David L.; Hentges, Kathryn E.

2011-01-01

A greater understanding of the causes of human disease can come from identifying characteristics that are specific to disease genes. However, a full understanding of the contribution of essential genes to human disease is lacking, due to the premise that these genes tend to cause developmental abnormalities rather than adult disease. We tested the hypothesis that human orthologs of mouse essential genes are associated with a variety of human diseases, rather than only those related to miscarriage and birth defects. We segregated human disease genes according to whether the knockout phenotype of their mouse ortholog was lethal or viable, defining those with orthologs producing lethal knockouts as essential disease genes. We show that the human orthologs of mouse essential genes are associated with a wide spectrum of diseases affecting diverse physiological systems. Notably, human disease genes with essential mouse orthologs are over-represented among disease genes associated with cancer, suggesting links between adult cellular abnormalities and developmental functions. The proteins encoded by essential genes are highly connected in protein-protein interaction networks, which we find correlates with an over-representation of nuclear proteins amongst essential disease genes. Disease genes associated with essential orthologs also are more likely than those with non-essential orthologs to contribute to disease through an autosomal dominant inheritance pattern, suggesting that these diseases may actually result from semi-dominant mutant alleles. Overall, we have described attributes found in disease genes according to the essentiality status of their mouse orthologs. These findings demonstrate that disease genes do occupy highly connected positions in protein-protein interaction networks, and that due to the complexity of disease-associated alleles, essential genes cannot be ignored as candidates for causing diverse human diseases. PMID:22096564
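The over-representation claim in the abstract above (essential-ortholog genes being enriched among cancer-associated disease genes) is the kind of statement usually backed by an enrichment test. The abstract does not name the test the authors used, so the following is only a minimal sketch assuming a one-sided hypergeometric test; the function name and all counts are hypothetical and chosen purely for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: total number of disease genes considered
    K: number of those genes whose mouse ortholog is essential
    n: number of disease genes in the category of interest (e.g. cancer)
    k: number of essential-ortholog genes observed in that category
    """
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical counts, not taken from the study.
p_value = hypergeom_upper_tail(k=90, K=600, n=200, N=2000)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

A small p-value here would indicate that the category contains more essential-ortholog genes than expected by chance given the overall proportion K/N.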

Characterization of basal gene expression trends over a diurnal cycle in Xiphophorus maculatus skin, brain and liver.

PubMed

Lu, Yuan; Reyes, Jose; Walter, Sean; Gonzalez, Trevor; Medrano, Geraldo; Boswell, Mikki; Boswell, William; Savage, Markita; Walter, Ronald

2018-06-01

Evolutionarily conserved diurnal circadian mechanisms maintain oscillating patterns of gene expression based on the day-night cycle. Xiphophorus fish have been used to evaluate transcriptional responses after exposure to various light sources and it was determined that each source incites distinct genetic responses in skin tissue. However, basal expression levels of genes that show oscillating expression patterns in the day-night cycle may affect the outcomes of such experiments, since basal gene expression levels at each point in the circadian path may influence the profile of identified light responsive genes. Lack of knowledge regarding diurnal fluctuations in basal gene expression patterns may confound the understanding of genetic responses to external stimuli (e.g., light) since the dynamic nature of gene expression implies animals subjected to stimuli at different times may be at very different stages within the continuum of genetic homeostasis. We assessed basal gene expression changes over a 24-hour period in 200 select Xiphophorus gene targets known to transcriptionally respond to various types of light exposure. We identified 22 genes in skin, 36 genes in brain and 28 genes in liver that exhibit basal oscillation of expression patterns. These genes, including known circadian regulators, produced the expected expression patterns over a 24-hour cycle when compared to circadian regulatory genes identified in other species, especially human and other vertebrate animal models. Our results suggest the regulatory network governing diurnal oscillating gene expression is similar between Xiphophorus and other vertebrates for the three Xiphophorus organs tested. In addition, we were able to categorize light responsive gene sets in Xiphophorus that do, and do not, exhibit circadian based oscillating expression patterns. Copyright © 2017 Elsevier Inc. All rights reserved.
Regional and temporal differences in gene expression of LH(BETA)T(AG) retinoblastoma tumors.

PubMed

Houston, Samuel K; Pina, Yolanda; Clarke, Jennifer; Koru-Sengul, Tulay; Scott, William K; Nathanson, Lubov; Schefler, Amy C; Murray, Timothy G

2011-07-23

The purpose of this study was to evaluate by microarray the hypothesis that LH(BETA)T(AG) retinoblastoma tumors exhibit regional and temporal variations in gene expression. LH(BETA)T(AG) mice aged 12, 16, and 20 weeks were euthanatized (n = 9). Specimens were taken from five tumor areas (apex, anterior lateral, center, base, and posterior lateral). Samples were hybridized to gene microarrays. The data were preprocessed and analyzed, and genes with a P < 0.01, according to the ANOVA models, and a log(2)-fold change >2.5 were considered to be differentially expressed. Differentially expressed genes were analyzed for overlap with known networks by using pathway analysis tools. There were significant temporal (P < 10(-8)) and regional differences in gene expression for LH(BETA)T(AG) retinoblastoma tumors. At P < 0.01 and log(2)-fold change >2.5, there were significant changes in gene expression of 190 genes apically, 84 genes anterolaterally, 126 genes posteriorly, 56 genes centrally, and 134 genes at the base. Differentially expressed genes overlapped with known networks, with significant involvement in regulation of cellular proliferation and growth, response to oxygen levels and hypoxia, regulation of cellular processes, cellular signaling cascades, and angiogenesis. There are significant temporal and regional variations in the LH(BETA)T(AG) retinoblastoma model. Differentially expressed genes overlap with key pathways that may play pivotal roles in murine retinoblastoma development. These findings suggest the mechanisms involved in tumor growth and progression in murine retinoblastoma tumors and identify pathways for analysis at a functional level, to determine significance in human retinoblastoma. Microarray analysis of LH(BETA)T(AG) retinal tumors showed significant regional and temporal variations in gene expression, including dysregulation of genes involved in hypoxic responses and angiogenesis.
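The selection rule stated in the abstract above (P < 0.01 from the ANOVA models together with a log2 fold change > 2.5) can be sketched as a simple filter. This is an illustrative sketch only: the record type, gene names, and values are hypothetical, and applying the fold-change cutoff to the absolute value (so that down-regulated genes also qualify) is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str                # hypothetical gene identifier
    p_value: float           # P value from the statistical model
    log2_fold_change: float  # log2 of the expression ratio


def is_differentially_expressed(result, p_cutoff=0.01, lfc_cutoff=2.5):
    """Apply both thresholds; abs() is an assumption, see the note above."""
    return result.p_value < p_cutoff and abs(result.log2_fold_change) > lfc_cutoff


# Made-up values for illustration only.
results = [
    GeneResult("geneA", 0.001, 3.1),
    GeneResult("geneB", 0.2, 4.0),      # fails the P cutoff
    GeneResult("geneC", 0.005, -2.8),   # passes only because abs() is used
    GeneResult("geneD", 0.0001, 1.0),   # fails the fold-change cutoff
]
selected = [r.gene for r in results if is_differentially_expressed(r)]
print(selected)
```

Requiring both criteria keeps genes that are statistically supported and also show a large effect, which is why the counts per tumor region in the abstract are comparatively small.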
Genomic Evidence Reveals the Extreme Diversity and Wide Distribution of the Arsenic-Related Genes in Burkholderiales

PubMed Central

Li, Xiangyang; Zhang, Linshuang; Wang, Gejiao

2014-01-01

So far, numerous genes have been found to associate with various strategies to resist and transform the toxic metalloid arsenic (here, we denote these genes as “arsenic-related genes”). However, our knowledge of the distribution, redundancies and organization of these genes in bacteria is still limited. In this study, we analyzed the 188 Burkholderiales genomes and found that 95% of genomes harbored arsenic-related genes, with an average of 6.6 genes per genome. The results indicated: a) compared to a low frequency of distribution for aio (arsenite oxidase) (12 strains), arr (arsenate respiratory reductase) (1 strain) and arsM (arsenite methyltransferase)-like genes (4 strains), the ars (arsenic resistance system)-like genes were identified in 174 strains including 1,051 genes; b) two-thirds of ars-like genes were clustered as ars operons and displayed a high diversity of gene organizations (68 forms), which may suggest the rapid movement and evolution of ars-like genes in bacterial genomes; c) the arsenite efflux system was dominated by the ACR3 form rather than ArsB in Burkholderiales; d) only a few arsM and arrAB genes were found, indicating that neither As(III) biomethylation nor As(V) respiration is the primary mechanism in Burkholderiales members; e) the aio-like gene is mostly flanked by ars-like genes and a phosphate transport system, implying a close functional relatedness between arsenic and phosphorus metabolisms. On average, the number of arsenic-related genes per genome of strains isolated from arsenic-rich environments is more than four times higher than that of strains from other environments. Compared with human, plant and animal pathogens, the environmental strains possess a larger average number of arsenic-related genes, which indicates that habitat is likely a key driver for bacterial arsenic resistance. PMID:24632831
Gene-environment studies: any advantage over environmental studies?

PubMed

Bermejo, Justo Lorenzo; Hemminki, Kari

2007-07-01

Gene-environment studies have been motivated by the likely existence of prevalent low-risk genes that interact with common environmental exposures. The present study assessed the statistical advantage of the simultaneous consideration of genes and environment to investigate the effect of environmental risk factors on disease. In particular, we contemplated the possibility that several genes modulate the environmental effect. Environmental exposures, genotypes and phenotypes were simulated according to a wide range of parameter settings. Different models of gene-gene-environment interaction were considered. For each parameter combination, we estimated the probability of detecting the main environmental effect, the power to identify the gene-environment interaction and the frequency of environmentally affected individuals at which environmental and gene-environment studies show the same statistical power. The proportion of cases in the population attributable to the modeled risk factors was also calculated. Our data indicate that environmental exposures with weak effects may account for a significant proportion of the population prevalence of the disease. A general result was that, if the environmental effect was restricted to rare genotypes, the power to detect the gene-environment interaction was higher than the power to identify the main environmental effect. In other words, when few individuals contribute to the overall environmental effect, individual contributions are large and result in easily identifiable gene-environment interactions. Moreover, when multiple genes interacted with the environment, the statistical benefit of gene-environment studies was limited to those studies that included major contributors to the gene-environment interaction. The advantage of gene-environment over plain environmental studies also depends on the inheritance mode of the involved genes, on the study design and, to some extent, on the disease prevalence.
A Comprehensive Analysis of Nuclear-Encoded Mitochondrial Genes in Schizophrenia.

PubMed

Gonçalves, Vanessa F; Cappi, Carolina; Hagen, Christian M; Sequeira, Adolfo; Vawter, Marquis P; Derkach, Andriy; Zai, Clement C; Hedley, Paula L; Bybjerg-Grauholm, Jonas; Pouget, Jennie G; Cuperfain, Ari B; Sullivan, Patrick F; Christiansen, Michael; Kennedy, James L; Sun, Lei

2018-05-01

The genetic risk factors of schizophrenia (SCZ), a severe psychiatric disorder, are not yet fully understood. Multiple lines of evidence suggest that mitochondrial dysfunction may play a role in SCZ, but comprehensive association studies are lacking. We hypothesized that variants in nuclear-encoded mitochondrial genes influence susceptibility to SCZ. We conducted gene-based and gene-set analyses using summary association results from the Psychiatric Genomics Consortium Schizophrenia Phase 2 (PGC-SCZ2) genome-wide association study comprising 35,476 cases and 46,839 control subjects. We applied the MAGMA method to three sets of nuclear-encoded mitochondrial genes: oxidative phosphorylation genes, other nuclear-encoded mitochondrial genes, and genes involved in nucleus-mitochondria crosstalk. Furthermore, we conducted a replication study using the iPSYCH SCZ sample of 2290 cases and 21,621 control subjects. In the PGC-SCZ2 sample, 1186 mitochondrial genes were analyzed, among which 159 had p values < .05 and 19 remained significant after multiple testing correction. A meta-analysis of 818 genes combining the PGC-SCZ2 and iPSYCH samples resulted in 104 nominally significant and nine significant genes, suggesting a polygenic model for the nuclear-encoded mitochondrial genes. Gene-set analysis, however, did not show significant results. In an in silico protein-protein interaction network analysis, 14 mitochondrial genes interacted directly with 158 SCZ risk genes identified in PGC-SCZ2 (permutation p = .02), and aldosterone signaling in epithelial cells and mitochondrial dysfunction pathways appeared to be overrepresented in this network of mitochondrial and SCZ risk genes. This study provides evidence that specific aspects of mitochondrial function may play a role in SCZ, but we did not observe its broad involvement even using a large sample. Copyright © 2018 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Systematic Analysis of Zn2Cys6 Transcription Factors Required for Development and Pathogenicity by High-Throughput Gene Knockout in the Rice Blast Fungus

PubMed Central

Huang, Pengyun; Lin, Fucheng

2014-01-01

Because of great challenges and workload in deleting genes on a large scale, the functions of most genes in pathogenic fungi are still unclear. In this study, we developed a high-throughput gene knockout system using a novel yeast-Escherichia-Agrobacterium shuttle vector, pKO1B, in the rice blast fungus Magnaporthe oryzae. Using this method, we deleted 104 fungal-specific Zn2Cys6 transcription factor (TF) genes in M. oryzae. We then analyzed the phenotypes of these mutants with regard to growth, asexual and infection-related development, pathogenesis, and 9 abiotic stresses. The resulting data provide new insights into how this rice pathogen of global significance regulates important traits in the infection cycle through Zn2Cys6 TF genes. A large variation in biological functions of Zn2Cys6 TF genes was observed under the conditions tested. Sixty-one of 104 Zn2Cys6 TF genes were found to be required for fungal development. In-depth analysis of TF genes revealed that TF genes involved in pathogenicity frequently tend to function in multiple development stages, and disclosed many highly conserved but unidentified functional TF genes of importance in the fungal kingdom. We further found that the virulence-required TF genes GPF1 and CNF2 have similar regulation mechanisms in the gene expression involved in pathogenicity. These experimental validations clearly demonstrated the value of a high-throughput gene knockout system in understanding the biological functions of genes on a genome scale in fungi, and provided a solid foundation for elucidating the gene expression network that regulates the development and pathogenicity of M. oryzae. PMID:25299517
Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

PubMed Central

He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

2017-01-01

The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test, and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, which were all comprised of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were dominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF. PMID:28713939
nGASP--the nematode genome annotation assessment project.

PubMed

Coghlan, Avril; Fiedler, Tristan J; McKay, Sheldon J; Flicek, Paul; Harris, Todd W; Blasiar, Darin; Stein, Lincoln D

2008-12-19

While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets across 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with unusually many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs posed the greatest difficulty for gene-finders. This experiment establishes a baseline of gene prediction accuracy in Caenorhabditis genomes, and has guided the choice of gene-finders for the annotation of newly sequenced genomes of Caenorhabditis and other nematode species. We have created new gene sets for C. briggsae, C. remanei, C. brenneri, C. japonica, and Brugia malayi using some of the best-performing gene-finders.
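The gene-level sensitivity and specificity figures quoted in the abstract above can be illustrated with a small calculation. This is a sketch under stated assumptions: in the gene-prediction literature "specificity" is commonly computed as the fraction of predicted genes that are correct (what other fields call precision), and a prediction is counted as correct only if it matches a reference gene model exactly. The abstract itself does not spell out either convention, and the gene models below (tuples of exon coordinates) are hypothetical.

```python
def gene_level_accuracy(predicted, reference):
    """Return (sensitivity, specificity) for exact gene-model matches.

    sensitivity = correct predictions / reference genes
    specificity = correct predictions / predicted genes  (assumed convention)
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    sensitivity = correct / len(reference) if reference else 0.0
    specificity = correct / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical gene models: each is a tuple of (exon_start, exon_end) pairs.
reference = [
    ((100, 200), (300, 400)),
    ((1000, 1500),),
    ((2000, 2100), (2200, 2300), (2400, 2500)),
    ((5000, 5200),),
]
predicted = [
    ((100, 200), (300, 400)),                     # exact match
    ((1000, 1500),),                              # exact match
    ((2000, 2100), (2200, 2300), (2400, 2500)),   # exact match
    ((5000, 5150),),                              # wrong exon boundary
    ((7000, 7300),),                              # no reference counterpart
]
sens, spec = gene_level_accuracy(predicted, reference)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under this convention a gene-finder that over-predicts keeps its sensitivity but loses specificity, which is one way to read a result where sensitivity is well above specificity.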
Methylation of microRNA genes regulates gene expression in bisexual flower development in andromonoecious poplar.

PubMed

Song, Yuepeng; Tian, Min; Ci, Dong; Zhang, Deqiang

2015-04-01

Previous studies showed sex-specific DNA methylation and expression of candidate genes in bisexual flowers of andromonoecious poplar, but the regulatory relationship between methylation and microRNAs (miRNAs) remains unclear. To investigate whether the methylation of miRNA genes regulates gene expression in bisexual flower development, the methylome, microRNA, and transcriptome were examined in female and male flowers of andromonoecious poplar. 27 636 methylated coding genes and 113 methylated miRNA genes were identified. In the coding genes, 64.5% of the methylated reads mapped to the gene body region; by contrast, 60.7% of methylated reads in miRNA genes mainly mapped in the 5' and 3' flanking regions. CHH methylation showed the highest methylation levels and CHG showed the lowest methylation levels. Correlation analysis showed a significant, negative, strand-specific correlation of methylation and miRNA gene expression (r=0.79, P <0.05). The methylated miRNA genes included eight long miRNAs (lmiRNAs) of 24 nucleotides and 11 miRNAs related to flower development. miRNA172b might play an important role in the regulation of bisexual flower development-related gene expression in andromonoecious poplar, via modification of methylation. Gynomonoecious, female, and male poplars were used to validate the methylation patterns of the miRNA172b gene, implying that hyper-methylation in andromonoecious and gynomonoecious poplar might function as an important regulator in bisexual flower development. Our data provide a useful resource for the study of flower development in poplar and improve our understanding of the effect of epigenetic regulation on genes other than protein-coding genes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Expression of a DNA Replication Gene Cluster in Bacteriophage T4: Genetic Linkage and the Control of Gene Product Interactions

PubMed Central

Gerald, W. L.; Karam, J. D.

1984-01-01

The results of this study bear on the relationship between genetic linkage and control of interactions between the protein products of different cistrons. In T4 bacteriophage, genes 45 and 44 encode essential components of the phage DNA replication multiprotein complex. T4 gene 45 maps directly upstream of gene 44 relative to the overall direction of reading of this region of the phage chromosome, but it is not known whether these two genes are cotranscribed. It has been shown that a nonsense lesion of T4 gene 45 exerts a cis-dominant inhibitory effect on growth of a missense mutant of gene 44 but not on growth of phage carrying the wild-type gene 44 allele. In previous work, we confirmed these observations on polarity of the gene 45 mutation but detected no polar effects by this lesion on synthesis of either mutant or wild-type gene 44 protein. In the present study, we demonstrate that mRNA for gene 44 protein is separable by gel electrophoresis from gene 45-protein-encoding mRNA. That is, the two proteins are not synthesized from one polycistronic message, and the cis-dominant inhibitory effect of the gene 45 mutation on gene 44 function is probably expressed at a posttranslational stage. We propose that close genetic linkage, whether or not it provides shared transcriptional and translational regulatory signals for certain clusters of functionally related cistrons, may determine the intracellular compartmentalization for synthesis of proteins encoded by these clusters. In prokaryotes, such linkage-dependent compartmentation may minimize the diffusion distances between gene products that are synthesized at low levels and are destined to interact. PMID:6745641
Integrated analysis of gene expression and methylation profiles of 48 candidate genes in breast cancer patients.

PubMed

Li, Zibo; Heng, Jianfu; Yan, Jinhua; Guo, Xinwu; Tang, Lili; Chen, Ming; Peng, Limin; Wu, Yepeng; Wang, Shouman; Xiao, Zhi; Deng, Zhongping; Dai, Lizhong; Wang, Jun

2016-11-01

Gene-specific methylation and expression have shown biological and clinical importance for breast cancer diagnosis and prognosis. Integrated analysis of gene methylation and gene expression may identify genes associated with the biological mechanisms and clinical outcome of breast cancer and aid in clinical management. Using high-throughput microfluidic quantitative PCR, we analyzed the expression profiles of 48 candidate genes in 96 Chinese breast cancer patients and investigated their correlation with gene methylation and associations with breast cancer clinical parameters. Breast cancer-specific gene expression alteration was found in 25 genes with significant expression difference between paired tumor and normal tissues. A total of 9 genes (CCND2, EGFR, GSTP1, PGR, PTGS2, RECK, SOX17, TNFRSF10D, and WIF1) showed significant negative correlation between methylation and gene expression, which were validated in the TCGA database. A total of 23 genes (ACADL, APC, BRCA2, CADM1, CAV1, CCND2, CST6, EGFR, ESR2, GSTP1, ICAM5, NPY, PGR, PTGS2, RECK, RUNX3, SFRP1, SOX17, SYK, TGFBR2, TNFRSF10D, WIF1, and WRN) annotated with potential TFBSs in the promoter regions showed negative correlation between methylation and expression. In logistic regression analysis, 31 of the 48 genes showed improved performance in disease prediction with combination of methylation and expression coefficient. Our results demonstrated the complex correlation and the possible regulatory mechanisms between DNA methylation and gene expression. Integrated analysis of methylation and expression of candidate genes could improve performance in breast cancer prediction. These findings would contribute to molecular characterization and identification of biomarkers for potential clinical applications.
Genome-wide profiling identifies a subset of methamphetamine (METH)-induced genes associated with METH-induced increased H4K5Ac binding in the rat striatum

PubMed Central

2013-01-01

Background: METH is an illicit drug of abuse that influences gene expression in the rat striatum. Histone modifications regulate gene transcription. Methods: We therefore used microarray analysis and genome-scale approaches to examine potential relationships between the effects of METH on gene expression and on DNA binding of histone H4 acetylated at lysine 5 (H4K5Ac) in the rat dorsal striatum of METH-naïve and METH-pretreated rats. Results: Acute and chronic METH administration caused differential changes in striatal gene expression. METH also increased H4K5Ac binding around the transcriptional start sites (TSSs) of genes in the rat striatum. In order to relate gene expression to histone acetylation, we binned genes of similar expression into groups of 100 genes and proceeded to relate gene expression to H4K5Ac binding. We found a positive correlation between gene expression and H4K5Ac binding in the striatum of control rats. Similar correlations were observed in METH-treated rats. Genes that showed acute METH-induced increased expression in saline-pretreated rats also showed METH-induced increased H4K5Ac binding. The acute METH injection caused similar increases in H4K5Ac binding in METH-pretreated rats, without affecting gene expression to the same degree. Finally, genes that showed METH-induced decreased expression exhibited either decreases or no changes in H4K5Ac binding. Conclusion: Acute METH injections caused increased gene expression of genes that showed increased H4K5Ac binding near their transcription start sites. PMID:23937714
Genome-wide profiling identifies a subset of methamphetamine (METH)-induced genes associated with METH-induced increased H4K5Ac binding in the rat striatum.

PubMed

Cadet, Jean Lud; Jayanthi, Subramaniam; McCoy, Michael T; Ladenheim, Bruce; Saint-Preux, Fabienne; Lehrmann, Elin; De, Supriyo; Becker, Kevin G; Brannock, Christie

2013-08-12

METH is an illicit drug of abuse that influences gene expression in the rat striatum. Histone modifications regulate gene transcription. We therefore used microarray analysis and genome-scale approaches to examine potential relationships between the effects of METH on gene expression and on DNA binding of histone H4 acetylated at lysine 5 (H4K5Ac) in the rat dorsal striatum of METH-naïve and METH-pretreated rats. Acute and chronic METH administration caused differential changes in striatal gene expression. METH also increased H4K5Ac binding around the transcriptional start sites (TSSs) of genes in the rat striatum. In order to relate gene expression to histone acetylation, we binned genes of similar expression into groups of 100 genes and proceeded to relate gene expression to H4K5Ac binding. We found a positive correlation between gene expression and H4K5Ac binding in the striatum of control rats. Similar correlations were observed in METH-treated rats. Genes that showed acute METH-induced increased expression in saline-pretreated rats also showed METH-induced increased H4K5Ac binding. The acute METH injection caused similar increases in H4K5Ac binding in METH-pretreated rats, without affecting gene expression to the same degree. Finally, genes that showed METH-induced decreased expression exhibited either decreases or no changes in H4K5Ac binding. Acute METH injections caused increased expression of genes that showed increased H4K5Ac binding near their transcription start sites.
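The binning step described in this abstract (sorting genes by expression, grouping them into bins of 100, and relating per-bin expression to per-bin H4K5Ac binding) can be illustrated with a minimal Python sketch. The data below are synthetic, and the function names `bin_means` and `pearson`, the use of per-bin means, and the choice of Pearson correlation are assumptions made for illustration; the abstract does not state which summary statistic or correlation measure the authors used.

```python
import random
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def bin_means(expression, binding, bin_size=100):
    """Sort genes by expression, split them into consecutive bins of
    `bin_size` genes, and return the per-bin mean expression and binding."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    expr_means, bind_means = [], []
    for start in range(0, len(order), bin_size):
        idx = order[start:start + bin_size]
        expr_means.append(mean(expression[i] for i in idx))
        bind_means.append(mean(binding[i] for i in idx))
    return expr_means, bind_means


# Synthetic, illustrative values only (hypothetical, not data from the study):
# binding is modeled as loosely proportional to expression plus noise.
rng = random.Random(0)
expression = [rng.gauss(8.0, 2.0) for _ in range(1000)]
binding = [0.5 * e + rng.gauss(0.0, 1.0) for e in expression]

expr_means, bind_means = bin_means(expression, binding)
r = pearson(expr_means, bind_means)
print(len(expr_means), "bins; correlation of bin means:", round(r, 3))
```

Averaging within bins suppresses gene-level noise, which is one plausible reason to bin before correlating; it also means the correlation of bin means is typically much higher than the gene-level correlation and should not be read as a per-gene effect size.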
Gene Presence–Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure

PubMed Central

Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

2018-01-01

Gene presence–absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence–absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence–absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence–absence polymorphism in the two species. Genes displaying presence–absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence–absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence–absence polymorphism. Some of the observed gene loss and gain events may, however, be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles of candidate genes in plant and anther-smut fungi interactions, however, need to be experimentally tested in future studies. PMID:29722826
Selection of reference genes for quantitative real-time PCR normalization in Panax ginseng at different stages of growth and in different organs.

PubMed

Liu, Jing; Wang, Qun; Sun, Minying; Zhu, Linlin; Yang, Michael; Zhao, Yu

2014-01-01

Quantitative real-time reverse transcription PCR (qRT-PCR) has become a widely used method for gene expression analysis; however, its data interpretation largely depends on the stability of reference genes. The transcriptomics of Panax ginseng, one of the most popular and traditional ingredients used in Chinese medicines, is increasingly being studied. It is therefore vital to establish a series of reliable reference genes when qRT-PCR is used to assess the gene expression profile of ginseng. In this study, we screened candidate reference genes for ginseng using gene expression data generated by a high-throughput sequencing platform. Based on the statistical tests, 20 reference genes (10 traditional housekeeping genes and 10 novel genes) were selected. These genes were tested for the normalization of expression levels in five growth stages and three distinct plant organs of ginseng by qPCR. These genes were subsequently ranked and compared according to the stability of their expression using the geNorm, NormFinder, and BestKeeper computational programs. Although the best reference genes were found to vary across different samples, CYP and EF-1α were the most stable genes amongst all samples. GAPDH/30S RPS20, CYP/60S RPL13 and CYP/QCR were the optimal pairs of reference genes in the roots, stems, and leaves, respectively. CYP/60S RPL13, CYP/eIF-5A, aTUB/V-ATP, eIF-5A/SAR1, and aTUB/pol IIa were the most stably expressed combinations in each of the five developmental stages. Our study serves as a foundation for developing an accurate method of qRT-PCR and will benefit future studies on gene expression profiles of Panax ginseng.
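As a rough illustration of how reference-gene stability can be ranked, the following Python sketch computes a geNorm-style stability measure M: for each candidate gene, the average over all other candidates of the standard deviation of their log2 expression ratios across samples (lower M means more stable). This is a sketch of the published measure, not the geNorm program itself; the gene names `refA`, `refB`, `varC` and their values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from math import log2
from statistics import mean, stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene.

    expr maps a gene name to its relative expression quantities,
    one value per sample (all lists must have the same length).
    """
    m_values = {}
    for gene_j, values_j in expr.items():
        pairwise_variation = []
        for gene_k, values_k in expr.items():
            if gene_k == gene_j:
                continue
            # Log2 ratio of the two genes in each sample.
            log_ratios = [log2(a / b) for a, b in zip(values_j, values_k)]
            # Pairwise variation: spread of that ratio across samples.
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_variation)
    return m_values


# Hypothetical toy quantities (not data from the study).
# refA and refB keep a constant ratio across samples; varC does not.
toy_expr = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],
    "varC": [1.0, 8.0, 2.0, 16.0],
}

m = genorm_m(toy_expr)
for gene in sorted(m, key=m.get):
    print(gene, round(m[gene], 3))
```

Because the measure depends only on ratios between candidates, two genes that are co-regulated can appear stable together, which is one reason studies such as this one compare several programs rather than relying on a single ranking.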
Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences

PubMed Central

2012-01-01

Background: The first draft assembly and gene prediction of the grapevine genome (8X base coverage) was made available to the scientific community in 2007, and functional annotation was developed on this gene prediction. Since then, additional Sanger sequences were added to the 8X sequence pool and a new version of the genomic sequence with superior base coverage (12X) was produced. Results: In order to annotate the function of the genes predicted in the new assembly more efficiently, it is important to build on as much of the previous work as possible by transferring the 8X annotation of the genome to the 12X version. The 8X and 12X assemblies and gene predictions of the grapevine genome were compared to answer the question, “Can we uniquely map 8X predicted genes to 12X predicted genes?” The results show that while the assemblies and gene structure predictions are too different to make a complete mapping between them, most genes (18,725) showed a one-to-one relationship between 8X predicted genes and the latest version of 12X predicted genes. In addition, reshuffled genomic sequence structures appeared. These highlight regions of the genome where the gene predictions need to be treated with caution. Based on the new grapevine gene functional annotation and in-depth functional categorization, twenty-eight new molecular networks have been created for VitisNet, while the existing networks were updated. Conclusions: The outcomes of this study provide a functional annotation of the 12X genes, an update of VitisNet, the system of the grapevine molecular networks, and a new functional categorization of genes. Data are available at the VitisNet website (http://www.sdstate.edu/ps/research/vitis/pathways.cfm). PMID:22554261
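The notion of a one-to-one relationship between two sets of gene predictions can be made concrete with a short Python sketch. It assumes that candidate (8X gene, 12X gene) matches have already been produced by some sequence comparison, which the abstract does not detail; the function name `one_to_one` and the identifiers below are hypothetical.

```python
from collections import defaultdict


def one_to_one(pairs):
    """Return {old_id: new_id} for genes with a unique match in both directions.

    pairs is an iterable of (old_id, new_id) candidate matches.
    """
    forward = defaultdict(set)   # old gene -> matching new genes
    backward = defaultdict(set)  # new gene -> matching old genes
    for old_id, new_id in pairs:
        forward[old_id].add(new_id)
        backward[new_id].add(old_id)

    unique = {}
    for old_id, new_ids in forward.items():
        if len(new_ids) != 1:
            continue  # old gene matches several new genes (e.g. a split)
        (new_id,) = new_ids
        if len(backward[new_id]) == 1:
            unique[old_id] = new_id  # otherwise several old genes merged
    return unique


# Hypothetical candidate matches (identifiers are made up for illustration).
candidate_pairs = [
    ("g8x_1", "g12x_1"),  # clean one-to-one match
    ("g8x_2", "g12x_2"),  # g8x_2 matches two new genes
    ("g8x_2", "g12x_3"),
    ("g8x_3", "g12x_4"),  # two old genes match the same new gene
    ("g8x_4", "g12x_4"),
]

mapping = one_to_one(candidate_pairs)
print(mapping)
```

Only annotations attached to genes in the returned mapping could be transferred automatically under this scheme; split and merged predictions would need manual review.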
Methylation of microRNA genes regulates gene expression in bisexual flower development in andromonoecious poplar

PubMed Central

Song, Yuepeng; Tian, Min; Ci, Dong; Zhang, Deqiang

2015-01-01

Previous studies showed sex-specific DNA methylation and expression of candidate genes in bisexual flowers of andromonoecious poplar, but the regulatory relationship between methylation and microRNAs (miRNAs) remains unclear. To investigate whether the methylation of miRNA genes regulates gene expression in bisexual flower development, the methylome, microRNA, and transcriptome were examined in female and male flowers of andromonoecious poplar. 27 636 methylated coding genes and 113 methylated miRNA genes were identified. In the coding genes, 64.5% of the methylated reads mapped to the gene body region; by contrast, 60.7% of methylated reads in miRNA genes mainly mapped in the 5′ and 3′ flanking regions. CHH methylation showed the highest methylation levels and CHG showed the lowest methylation levels. Correlation analysis showed a significant, negative, strand-specific correlation of methylation and miRNA gene expression (r=0.79, P <0.05). The methylated miRNA genes included eight long miRNAs (lmiRNAs) of 24 nucleotides and 11 miRNAs related to flower development. miRNA172b might play an important role in the regulation of bisexual flower development-related gene expression in andromonoecious poplar, via modification of methylation. Gynomonoecious, female, and male poplars were used to validate the methylation patterns of the miRNA172b gene, implying that hyper-methylation in andromonoecious and gynomonoecious poplar might function as an important regulator in bisexual flower development. Our data provide a useful resource for the study of flower development in poplar and improve our understanding of the effect of epigenetic regulation on genes other than protein-coding genes. PMID:25617468
Genome-wide survey and characterization of the WRKY gene family in Populus trichocarpa.

PubMed

He, Hongsheng; Dong, Qing; Shao, Yuanhua; Jiang, Haiyang; Zhu, Suwen; Cheng, Beijiu; Xiang, Yan

2012-07-01

WRKY transcription factors participate in diverse physiological and developmental processes in plants. They have highly conserved WRKYGQK amino acid sequences in their N-termini, followed by the novel zinc-finger-like motifs, Cys₂His₂ or Cys₂HisCys. To date, numerous WRKY genes have been identified and characterized in a number of herbaceous species. Survey and characterization of WRKY genes in a ligneous species would facilitate a better understanding of the evolutionary processes and functions of this gene family. In this study, 104 poplar WRKY genes (PtWRKY) were identified in the latest poplar genome sequence. According to their structural features, the predicted members were divided into the previously defined groups I-III, as described in rice. In addition, chromosomal localization of the genes indicated that there might be WRKY gene hot spots in 2.3 Mb regions on chromosome 14. Furthermore, approximately 83% (86 out of 104) of the WRKY genes participated in gene duplication events, including 69% (29 out of 42) of gene pairs that exhibited segmental duplication. Using semi-quantitative RT-PCR, the expression patterns of subgroup III genes were investigated under different stresses [cold, drought, salinity and salicylic acid (SA)]. The data revealed that these genes showed different expression levels in response to various stress conditions. Expression analysis showed that the PtWRKY76 gene was markedly induced by 0.1 mM SA or 25% PEG-6000 treatment. The results presented here provide a basis for cloning genes with specific functions in further studies and applications. This study identified 104 poplar WRKY genes and demonstrated WRKY gene hot spots on chromosome 14. Furthermore, semi-quantitative RT-PCR showed variable stress responses in subgroup III.
Comparative analyses of Xanthomonas and Xylella complete genomes.

PubMed

Moreira, Leandro M; De Souza, Robson F; Digiampietri, Luciano A; Da Silva, Ana C R; Setubal, João C

2005-01-01

Computational analyses of four bacterial genomes of the Xanthomonadaceae family reveal new unique genes that may be involved in adaptation, pathogenicity, and host specificity. The Xanthomonas genus presents 3636 unique genes distributed in 1470 families, while the Xylella genus presents 1026 unique genes distributed in 375 families. Among Xanthomonas-specific genes, we highlight a large number of cell wall degrading enzymes, proteases, and iron receptors, a set of energy metabolism genes, a second copy of the type II secretion system, the type III secretion system, flagella and chemotactic machinery, and the xanthomonadin synthesis gene cluster. Important genes unique to the Xylella genus are an additional copy of a type IV pili gene cluster and the complete machinery of colicin V synthesis and secretion. Intersections of gene sets from both genera reveal a cluster of genes homologous to Salmonella's SPI-7 island in Xanthomonas axonopodis pv citri and Xylella fastidiosa 9a5c, which might be involved in host specificity. Each genome also presents important unique genes, such as an HMS cluster, the kdgT gene, and O-antigen in Xanthomonas axonopodis pv citri; a number of avrBS genes and a distinct O-antigen in Xanthomonas campestris pv campestris; a type I restriction-modification system and a nickase gene in Xylella fastidiosa 9a5c; and a type II restriction-modification system and two genes related to peptidoglycan biosynthesis in Xylella fastidiosa temecula 1. All these differences imply a considerable number of gene gains and losses during the divergence of the four lineages, and are associated with structural genome modifications that may have a direct relation with the mode of transmission, adaptation to specific environments and pathogenicity of each organism.
Discovery of numerous novel small genes in the intergenic regions of the Escherichia coli O157:H7 Sakai genome

PubMed Central

Hücker, Sarah M.; Ardern, Zachary; Goldberg, Tatyana; Schafferhans, Andrea; Bernhofer, Michael; Vestergaard, Gisle; Nelson, Chase W.; Schloter, Michael; Rost, Burkhard; Scherer, Siegfried

2017-01-01

In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined under two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability, expressed by the ribosomal coverage value. This led to the discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set. PMID:28902868
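The length filter mentioned in this abstract (open reading frames potentially encoding a protein of at least 30 amino acids) can be sketched in Python as follows. This is an illustrative simplification, not the authors' pipeline: it scans only the forward strand, accepts only ATG as a start codon (bacteria also use alternative starts such as GTG and TTG), and does not check that the ORF is intergenic. The function name `find_orfs` and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=30):
    """Find ATG-initiated ORFs on the forward strand of a DNA sequence.

    Returns (start, end, frame) tuples with 0-based start, exclusive end
    (stop codon included), for ORFs encoding at least min_aa amino acids.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
            elif codon in STOP_CODONS:
                n_amino_acids = (i - start) // 3  # stop codon not counted
                if n_amino_acids >= min_aa:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical sequence: one ORF of 30 codons followed by one of 6 codons.
long_orf = "ATG" + "GCT" * 29 + "TAA"
short_orf = "ATG" + "GCT" * 5 + "TGA"
dna = "CC" + long_orf + short_orf

print(find_orfs(dna))
```

In a real analysis the candidate ORFs would then be intersected with annotation coordinates to keep only intergenic ones, and both strands would be scanned, since the abstract reports novel genes on both DNA strands.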
