Liu, Gangbiao; Zou, Yangyun; Cheng, Qiqun; Zeng, Yanwu; Gu, Xun; Su, Zhixi
2014-04-01
The age distribution of gene duplication events within the human genome exhibits two waves of duplications along with an ancient component. However, because of functional constraint differences, genes in different functional categories might show dissimilar retention patterns after duplication. It is known that genes in some functional categories are highly duplicated in the early stage of vertebrate evolution. However, the correlations of the age distribution pattern of gene duplication between the different functional categories are still unknown. To investigate this issue, we developed a robust pipeline to date the gene duplication events in the human genome. We successfully estimated about three-quarters of the duplication events within the human genome, along with the age distribution pattern in each Gene Ontology (GO) slim category. We found that some GO slim categories show different distribution patterns when compared to the whole genome. Further hierarchical clustering of the GO slim functional categories enabled grouping into two main clusters. We found that human genes located in the duplicated copy number variant regions, whose duplicate genes have not been fixed in the human population, were mainly enriched in the groups with a high proportion of recently duplicated genes. Moreover, we used a phylogenetic tree-based method to date the age of duplications in three signaling-related gene superfamilies: transcription factors, protein kinases and G-protein coupled receptors. These superfamilies were expressed in different subcellular localizations. They showed a similar age distribution as the signaling-related GO slim categories. We also compared the differences between the age distributions of gene duplications in multiple subcellular localizations. We found that the distribution patterns of the major subcellular localizations were similar to that of the whole genome. 
This study revealed the whole picture of the evolution patterns of gene functional categories in the human genome.
Marko, Nicholas F.; Weil, Robert J.
2012-01-01
Introduction Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
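The central moments analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes the population (biased) definitions of skewness and excess kurtosis; the abstract does not state which estimators the authors used, and the function name `central_moments` is hypothetical.

```python
def central_moments(values):
    """Return (mean, variance, skewness, excess kurtosis) of a sample.

    Assumption: population (biased) moment estimators; the abstract does
    not say which estimators were actually used.
    """
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5          # 0 for a symmetric distribution
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return mean, m2, skewness, excess_kurtosis


# Made-up expression values with one large outlier (illustrative only).
_, _, skew, kurt = central_moments([1.0, 1.0, 1.0, 1.0, 10.0])
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

For normally distributed data both skewness and excess kurtosis are close to zero, so large values of either are the kind of deviation the abstract describes.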
Awazu, Akinori; Tanabe, Takahiro; Kamitani, Mari; Tezuka, Ayumi; Nagano, Atsushi J
2018-05-29
Gene expression levels exhibit stochastic variations among genetically identical organisms under the same environmental conditions. In many recent transcriptome analyses based on RNA sequencing (RNA-seq), variations in gene expression levels among replicates were assumed to follow a negative binomial distribution, although the physiological basis of this assumption remains unclear. In this study, RNA-seq data were obtained from Arabidopsis thaliana under eight conditions (21-27 replicates), and the characteristics of gene-dependent empirical probability density function (ePDF) profiles of gene expression levels were analyzed. For A. thaliana and Saccharomyces cerevisiae, various types of ePDF of gene expression levels were obtained that were classified as Gaussian, power law-like containing a long tail, or intermediate. These ePDF profiles were well fitted with a Gauss-power mixing distribution function derived from a simple model of a stochastic transcriptional network containing a feedback loop. The fitting function suggested that gene expression levels with long-tailed ePDFs would be strongly influenced by feedback regulation. Furthermore, the features of gene expression levels are correlated with their functions, with the levels of essential genes tending to follow a Gaussian-like ePDF while those of genes encoding nucleic acid-binding proteins and transcription factors exhibit long-tailed ePDF.
Shen, Congcong; Shi, Yu; Ni, Yingying; Deng, Ye; Van Nostrand, Joy D.; He, Zhili; Zhou, Jizhong; Chu, Haiyan
2016-01-01
The elevational and latitudinal diversity patterns of microbial taxa have attracted great attention in the past decade. Recently, the distribution of functional attributes has been in the spotlight. Here, we report a study profiling soil microbial communities along an elevation gradient (500–2200 m) on Changbai Mountain. Using a comprehensive functional gene microarray (GeoChip 5.0), we found that microbial functional gene richness exhibited a dramatic increase at the treeline ecotone, but the bacterial taxonomic and phylogenetic diversity based on 16S rRNA gene sequencing did not exhibit such a similar trend. However, the β-diversity (compositional dissimilarity among sites) pattern for both bacterial taxa and functional genes was similar, showing significant elevational distance-decay patterns which presented increased dissimilarity with elevation. The bacterial taxonomic diversity/structure was strongly influenced by soil pH, while the functional gene diversity/structure was significantly correlated with soil dissolved organic carbon (DOC). This finding highlights that soil DOC may be a good predictor in determining the elevational distribution of microbial functional genes. The finding of significant shifts in functional gene diversity at the treeline ecotone could also provide valuable information for predicting the responses of microbial functions to climate change. PMID:27524983
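The elevational distance-decay pattern described in this abstract can be sketched as a regression of pairwise compositional dissimilarity on pairwise elevation difference. This is a hedged illustration only: the abstract does not name the dissimilarity index, so Bray-Curtis is an assumption here, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (assumed metric)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def distance_decay_slope(elevations, abundances):
    """Least-squares slope of pairwise dissimilarity against elevation difference."""
    xs, ys = [], []
    for i, j in combinations(range(len(elevations)), 2):
        xs.append(abs(elevations[i] - elevations[j]))
        ys.append(bray_curtis(abundances[i], abundances[j]))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data: three sites (elevation in m) with made-up abundances of two genes.
slope = distance_decay_slope([500, 1000, 2000], [[10, 0], [5, 5], [0, 10]])
print(slope > 0)  # a positive slope means dissimilarity grows with elevation distance
```

Because pairwise distances are not independent, significance for such a slope is usually assessed with a permutation-based procedure rather than an ordinary regression test.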
Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S
2010-10-07
The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, including development, senescence, and defence, among others. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. To better guide gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Although the gene family is conserved, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion; however, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions, so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms. 
The highly diverse biological function indicated that more research needs to be carried out to dissect the PHB gene function. The conserved gene evolution indicated that the study in the model species can be translated to human and mammalian studies.
Zhou, Tianyu; Yan, Xiping; Wang, Guosong; Liu, Hehe; Gan, Xiang; Zhang, Tao; Wang, Jiwen; Li, Liang
2015-01-01
Peroxisome proliferator-activated receptor (PPAR) gene family members exhibit distinct tissue distribution patterns and differ in function. The purpose of this study is to investigate the evolutionary forces behind the functional diversity of PPAR members and the regulatory differences underlying their gene expression patterns. Sixty-three homologous sequences of PPAR genes from 31 species were collected and analyzed. The results showed that the three distinct types of the PPAR gene family may have emerged from two gene duplication events. The conserved HOLI (ligand binding domain of hormone receptors) and ZnF_C4 (C4 zinc finger in nuclear hormone receptors) domains are essential for maintaining the basic roles of the PPAR gene family, and the variant LCR domains may be responsible for their functional divergence. The positively selected sites in the HOLI domain favor the evolution of PPARs towards diverse functions. The evolutionary variants in the promoter and 3′ UTR regions of PPARs result in different transcription factors and miRNAs being involved in regulating PPAR members, which may eventually affect their expression and tissue distribution. These results indicate that gene duplication events, selection pressure on the HOLI domain, and variants in the promoter and 3′ UTR are essential for PPAR evolution and the acquisition of diverse functions. PMID:25961030
McGuire, Austen B; Rafi, Syed K; Manzardo, Ann M; Butler, Merlin G
2016-05-05
Mammalian chromosomes are comprised of complex chromatin architecture with the specific assembly and configuration of each chromosome influencing gene expression and function in yet undefined ways by varying degrees of heterochromatinization that result in Giemsa (G) negative euchromatic (light) bands and G-positive heterochromatic (dark) bands. We carried out morphometric measurements of high-resolution chromosome ideograms for the first time to characterize the total euchromatic and heterochromatic chromosome band length, distribution and localization of 20,145 known protein-coding genes, 790 recognized autism spectrum disorder (ASD) genes and 365 obesity genes. The individual lengths of G-negative euchromatin and G-positive heterochromatin chromosome bands were measured in millimeters and recorded from scaled and stacked digital images of 850-band high-resolution ideograms supplied by the International Society of Chromosome Nomenclature (ISCN) 2013. Our overall measurements followed established banding patterns based on chromosome size. G-negative euchromatic band regions contained 60% of protein-coding genes while the remaining 40% were distributed across the four heterochromatic dark band sub-types. ASD genes were disproportionately overrepresented in the darker heterochromatic sub-bands, while the obesity gene distribution pattern did not significantly differ from protein-coding genes. Our study supports recent trends implicating genes located in heterochromatin regions playing a role in biological processes including neurodevelopment and function, specifically genes associated with ASD.
Distributed Function Mining for Gene Expression Programming Based on Fast Reduction.
Deng, Song; Yue, Dong; Yang, Le-chan; Fu, Xiong; Feng, Ya-zhou
2016-01-01
For high-dimensional and massive data sets, traditional centralized gene expression programming (GEP) or improved algorithms lead to increased run-time and decreased prediction accuracy. To solve this problem, this paper proposes a new improved algorithm called distributed function mining for gene expression programming based on fast reduction (DFMGEP-FR). In DFMGEP-FR, fast attribution reduction in binary search algorithms (FAR-BSA) is proposed to quickly find the optimal attribution set, and the function consistency replacement algorithm is given to solve integration of the local function model. Thorough comparative experiments for DFMGEP-FR, centralized GEP and the parallel gene expression programming algorithm based on simulated annealing (parallel GEPSA) are included in this paper. For the waveform, mushroom, connect-4 and musk datasets, the comparative results show that the average time-consumption of DFMGEP-FR drops by 89.09%, 88.85%, 85.79% and 93.06%, respectively, in contrast to centralized GEP and by 12.5%, 8.42%, 9.62% and 13.75%, respectively, compared with parallel GEPSA. Six well-studied UCI test data sets demonstrate the efficiency and capability of our proposed DFMGEP-FR algorithm for distributed function mining.
Evolutionary analysis of the jacalin-related lectin family genes in 11 fishes.
Cao, Jun; Lv, Yueqing
2016-09-01
Jacalin-related lectins are carbohydrate-binding proteins that are distributed across a wide variety of organisms and involved in several important biological processes. The evolution of this gene family in fishes is unknown. Here, 47 putative jacalin genes in 11 fish species were identified and divided into 4 groups through phylogenetic analysis. Conserved gene organization and motif distribution existed in each group, suggesting functional conservation. Some fishes have eleven jacalin genes, while others have only one or none in their genomes, suggesting dynamic changes in the number of jacalin genes during the evolution of fishes. Intragenic recombination played a key role in the evolution of jacalin genes. Synteny analyses of jacalin genes in some fishes implied both conserved and dynamic evolutionary characteristics of this gene family and related genome segments. Moreover, a few functional divergence sites were identified within each pair of groups. Divergent expression profiles of the zebrafish jacalin genes were further investigated under different stresses. The results provide a foundation for characterizing the jacalin genes in fishes and will offer insights for additional functional studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Tsiagkas, Giannis; Nikolaou, Christoforos; Almirantis, Yannis
2014-12-01
CpG Islands (CGIs) are compositionally defined short genomic stretches, which have been studied in the human, mouse, chicken and later in several other genomes. Initially, they were assigned the role of transcriptional regulation of protein-coding genes, especially housekeeping genes; more recently, evidence has been found that they are involved in several other functions as well, which might include regulation of the expression of RNA genes and DNA replication. Here, the distributional characteristics of CGIs are investigated in a variety of genomes, both for whole CGI populations and for CGI subsets that lie away from known genes (gene-unrelated or "orphan" CGIs). In both cases, power-law-like linearity in double logarithmic scale is found. An evolutionary model, initially put forward to explain a similar pattern found in gene populations, is implemented. It includes segmental duplication events and eliminations of most of the duplicated CGIs, while a moderate rate of non-duplicated CGI eliminations is also applied in some cases. Simulations reproduce all the main features of the observed inter-CGI chromosomal size distributions. Our results on power-law-like linearity found in orphan CGI populations suggest that the observed distributional pattern is independent of the analogous pattern that protein coding segments were reported to follow. The power-law-like patterns in the genomic distributions of CGIs described herein are found to be compatible with several other features of the composition, abundance or functional role of CGIs reported in the current literature across several genomes, on the basis of the proposed evolutionary model. Copyright © 2014 Elsevier Ltd. All rights reserved.
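"Power-law-like linearity in double logarithmic scale" means that the logarithm of the count falls roughly on a straight line against the logarithm of the size. A minimal sketch of that check follows; it assumes the size distribution has already been binned into (size, count) pairs, which the abstract does not specify, and `loglog_fit` is a hypothetical helper name.

```python
import math


def loglog_fit(sizes, counts):
    """Least-squares line through (log10 size, log10 count).

    Returns (slope, intercept). Pairs with a non-positive size or count
    are skipped because their logarithm is undefined.
    """
    pts = [(math.log10(s), math.log10(c))
           for s, c in zip(sizes, counts) if s > 0 and c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example: counts generated from an exact power law with exponent -1.5.
sizes = [1, 10, 100, 1000]
counts = [1000 * s ** -1.5 for s in sizes]
slope, intercept = loglog_fit(sizes, counts)
print(round(slope, 3), round(intercept, 3))
```

A straight-line fit on log-log axes is only a descriptive check; it does not by itself establish that the data follow a power law.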
Liang, Kai-Chiang; Tseng, Joseph T; Tsai, Shaw-Jenq; Sun, H Sunny
2015-08-01
Repetitive elements constitute more than 50% of the human genome. Recent studies implied that the complexity of living organisms is not just a direct outcome of the number of coding sequences; the repetitive elements, which do not encode proteins, may also play a significant role. Though scattered studies showed that repetitive elements in the regulatory regions of a gene control gene expression, no systematic survey has been done to report the characterization and distribution of various types of these repetitive elements in the human genome. Sequences from 5' and 3' untranslated regions and from upstream and downstream of each gene were downloaded from the Ensembl database. The repetitive elements in the neighborhood of each gene were identified and classified using cross-matching as implemented in RepeatMasker. The annotation and distribution of distinct classes of repetitive elements associated with individual genes were collected to characterize genes in association with different types of repetitive elements using a systems biology program. We identified a total of 1,068,400 repetitive elements, belonging to 37 class families and 1235 subclasses, that are associated with 33,761 genes and 57,365 transcripts. In addition, we found that tandem repeats are preferentially located proximal to the transcription start site (TSS) of genes and that these genes are mainly involved in developmental processes. On the other hand, interspersed repetitive elements tended to accumulate in regions distal to the TSS, and interspersed repeat-containing genes took part in catabolic/metabolic processes. Results from the distribution analysis were collected and used to construct a gene-based repetitive element database (GBRED; http://www.binfo.ncku.edu.tw/GBRED/index.html). A user-friendly web interface was designed to provide information on the repetitive elements associated with any particular gene(s). 
This is the first study focusing on the gene-associated repetitive elements in the human genome. Our data showed distinct genes associated with different kinds of repetitive element and implied such combination may shape the function of these genes. Aside from the conventional view of these elements in genome evolution, results from this study offer a systemic review to facilitate exploitation of these elements in genome function. Copyright © 2015 Elsevier Ltd. All rights reserved.
Analytical workflow profiling gene expression in murine macrophages
Nixon, Scott E.; González-Peña, Dianelys; Lawson, Marcus A.; McCusker, Robert H.; Hernandez, Alvaro G.; O’Connor, Jason C.; Dantzer, Robert; Kelley, Keith W.
2015-01-01
Comprehensive and simultaneous analysis of all genes in a biological sample is a capability of RNA-Seq technology. Analysis of the entire transcriptome benefits from summarization of genes at the functional level. To study a cellular response not previously explored with RNA-Seq, peritoneal macrophages from mice under two conditions (control and immunologically challenged) were analyzed for gene expression differences. Quantification of individual transcripts modeled RNA-Seq read distribution and uncertainty (using a Beta Negative Binomial distribution), and transcripts were then tested for differential expression (False Discovery Rate-adjusted p-value < 0.05). Enrichment analysis of functional categories used the list of differentially expressed genes. A total of 2079 differentially expressed transcripts representing 1884 genes were detected. The 92 enriched categories from Gene Ontology Biological Processes and Molecular Functions and from KEGG pathways were grouped into 6 clusters. Clusters included defense and inflammatory response (Enrichment Score = 11.24) and ribosomal activity (Enrichment Score = 17.89). Our work provides a context to the fine detail of individual gene expression differences in murine peritoneal macrophages during immunological challenge with high throughput RNA-Seq. PMID:25708305
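The False Discovery Rate adjustment referred to in this abstract can be sketched with the Benjamini-Hochberg step-up procedure. This is an assumption made for illustration: the abstract does not say which FDR method was used, and the p-values below are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used only as a common example of an FDR adjustment;
    the abstract does not name the procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up transcript-level p-values.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

A transcript would then be called differentially expressed when its adjusted value falls below the 0.05 threshold quoted in the abstract.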
Familial aggregation analysis of gene expressions
Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K
2007-01-01
Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
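A gene ontology enrichment test of the kind mentioned in this abstract is often computed as a one-sided hypergeometric tail probability. The sketch below assumes that formulation; the abstract does not describe the test actually used, and the category size and overlap in the example are hypothetical numbers.

```python
from math import comb


def enrichment_pvalue(population, category_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `category_size` belong to the category."""
    total = comb(population, selected)
    upper = min(category_size, selected)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 3300 genes analyzed and 1105 familially aggregated genes are from the abstract;
# a category of 120 genes with 60 of them aggregated is a made-up example.
p = enrichment_pvalue(3300, 120, 1105, 60)
print(f"{p:.3g}")
```

In practice one such p-value is computed per category, and the resulting set is then corrected for multiple testing.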
Comparative genomics approaches to understanding and manipulating plant metabolism.
Bradbury, Louis M T; Niehaus, Tom D; Hanson, Andrew D
2013-04-01
Over 3000 genomes, including numerous plant genomes, are now sequenced. However, their annotation remains problematic as illustrated by the many conserved genes with no assigned function, vague annotations such as 'kinase', or even wrong ones. Around 40% of genes of unknown function that are conserved between plants and microbes are probably metabolic enzymes or transporters; finding functions for these genes is a major challenge. Comparative genomics has correctly predicted functions for many such genes by analyzing genomic context, and gene fusions, distributions and co-expression. Comparative genomics complements genetic and biochemical approaches to dissect metabolism, continues to increase in power and decrease in cost, and has a pivotal role in modeling and engineering by helping identify functions for all metabolic genes. Copyright © 2012 Elsevier Ltd. All rights reserved.
BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks.
Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D
2015-06-12
During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.
Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki
2016-05-26
Protein tertiary structure determines the molecular function, interactions, and stability of a protein; therefore, the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three-dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis-related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414
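The general idea of a permutation test on mutation positions in a tertiary structure can be sketched as follows. This is not the authors' method, whose details are not given in the abstract: the test statistic (mean pairwise Euclidean distance), the null model (mutated residues drawn uniformly from all residues with coordinates) and all names are assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of 3D points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def clustering_pvalue(residue_coords, mutated_indices, n_permutations=1000, seed=0):
    """One-sided permutation p-value for spatial clustering of mutations.

    Assumed null model: the same number of distinct residues is drawn
    uniformly at random from all residues with known coordinates.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance([residue_coords[i] for i in mutated_indices])
    k = len(mutated_indices)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(range(len(residue_coords)), k)
        if mean_pairwise_distance([residue_coords[i] for i in sample]) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy structure: 50 residues on a line, with mutations at three adjacent residues.
coords = [(float(i), 0.0, 0.0) for i in range(50)]
print(clustering_pvalue(coords, [0, 1, 2]))
```

A small p-value indicates that the mutated residues sit closer together in space than randomly chosen residues would, which is the kind of skew the abstract reports for 106 genes. Recurrent mutations at the same residue would need separate handling that this sketch omits.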
Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj
2015-01-01
The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease-resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analyses were done for R- and DR-genes across three species of rice, viz. Oryza sativa ssp. indica cv. 93-11, Oryza sativa ssp. japonica and the wild rice species Oryza brachyantha. We used an in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, and 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in the indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows a high degree of conservation of R- and DR-genes in clusters. In silico expression analysis showed that more than 85% of the R-genes and DR-genes were expressed, with corresponding EST matches in the databases. This study gave special emphasis to mechanisms of gene evolution and duplication for R- and DR-genes across species. Analysis of paralogs across rice species indicated duplication of 17% and 4.38% of R-genes and of 29% and 11.63% of DR-genes in indica and Oryza brachyantha, respectively, as compared to 20% and 26% duplication of R-genes and DR-genes, respectively, in japonica. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function, and the rest of the genes maintained their identity. Syntenic relationships across the three genomes indicated that more orthology is shared between the indica and japonica genomes than with the brachyantha genome. Genome-wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and in understanding the molecular mechanisms of disease resistance and their evolution in rice and related species. PMID:25902056
Colón, Maritrini; Hernández, Fabiola; López, Karla; Quezada, Héctor; González, James; López, Geovani; Aranda, Cristina; González, Alicia
2011-01-01
Background Gene duplication is a key evolutionary mechanism providing material for the generation of genes with new or modified functions. The fate of duplicated gene copies has been amply discussed and several models have been put forward to account for duplicate conservation. The specialization model considers that duplication of a bifunctional ancestral gene could result in the preservation of both copies through subfunctionalization, resulting in the distribution of the two ancestral functions between the gene duplicates. Here we investigate whether the presumed bifunctional character displayed by the single branched chain amino acid aminotransferase present in K. lactis has been distributed in the two paralogous genes present in S. cerevisiae, and whether this conservation has impacted S. cerevisiae metabolism. Principal Findings Our results show that the KlBat1 orthologous BCAT is a bifunctional enzyme, which participates in the biosynthesis and catabolism of branched chain amino acids (BCAAs). This dual role has been distributed in S. cerevisiae Bat1 and Bat2 paralogous proteins, supporting the specialization model posed to explain the evolution of gene duplications. BAT1 is highly expressed under biosynthetic conditions, while BAT2 expression is highest under catabolic conditions. Bat1 and Bat2 differential relocalization has favored their physiological function, since biosynthetic precursors are generated in the mitochondria (Bat1), while catabolic substrates are accumulated in the cytosol (Bat2). Under respiratory conditions, in the presence of ammonium and BCAAs the bat1Δ bat2Δ double mutant shows impaired growth, indicating that Bat1 and Bat2 could play redundant roles. In K. lactis wild type growth is independent of BCAA degradation, since a Klbat1Δ mutant grows under this condition.
Conclusions Our study shows that BAT1 and BAT2 differential expression and subcellular relocalization has resulted in the distribution of the biosynthetic and catabolic roles of the ancestral BCAT in two isozymes improving BCAAs metabolism and constituting an adaptation to facultative metabolism. PMID:21267457
Stinchcombe, Adam R; Peskin, Charles S; Tranchina, Daniel
2012-06-01
We present a generalization of a population density approach for modeling and analysis of stochastic gene expression. In the model, the gene of interest fluctuates stochastically between an inactive state, in which transcription cannot occur, and an active state, in which discrete transcription events occur; and the individual mRNA molecules are degraded stochastically in an independent manner. This sort of model in simplest form with exponential dwell times has been used to explain experimental estimates of the discrete distribution of random mRNA copy number. In our generalization, the random dwell times in the inactive and active states, T_{0} and T_{1}, respectively, are independent random variables drawn from any specified distributions. Consequently, the probability per unit time of switching out of a state depends on the time since entering that state. Our method exploits a connection between the fully discrete random process and a related continuous process. We present numerical methods for computing steady-state mRNA distributions and an analytical derivation of the mRNA autocovariance function. We find that empirical estimates of the steady-state mRNA probability mass function from Monte Carlo simulations of laboratory data do not allow one to distinguish between underlying models with exponential and nonexponential dwell times in some relevant parameter regimes. However, in these parameter regimes and where the autocovariance function has negative lobes, the autocovariance function disambiguates the two types of models. Our results strongly suggest that temporal data beyond the autocovariance function is required in general to characterize gene switching.
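The two-state model described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' population density method; it is a plain event-driven simulation written under stated assumptions: the gene starts inactive at time zero, transcription is a Poisson process with a hypothetical rate k_tx while the gene is active, each mRNA decays independently with a hypothetical rate k_deg, and the dwell-time samplers (sample_off, sample_on) are user-supplied functions that must return strictly positive times. All names are illustrative.

```python
import random

def simulate_mrna_count(t_end, sample_off, sample_on, k_tx, k_deg, rng):
    """Return the mRNA copy number at time t_end for one simulated cell.

    sample_off / sample_on draw dwell times T_0 / T_1 (any positive
    distribution); k_tx and k_deg are assumed, illustrative rates.
    """
    t = 0.0
    active = False  # assumption: the gene starts in the inactive state
    count = 0
    while t < t_end:
        dwell = sample_on(rng) if active else sample_off(rng)
        t_switch = min(t + dwell, t_end)
        if active:
            # Transcription is a Poisson process with rate k_tx while active.
            s = t + rng.expovariate(k_tx)
            while s < t_switch:
                # Each transcript gets an independent exponential lifetime;
                # it is counted only if it is still alive at t_end.
                if s + rng.expovariate(k_deg) > t_end:
                    count += 1
                s += rng.expovariate(k_tx)
        t = t_switch
        active = not active
    return count

# Usage: gamma-distributed (nonexponential) dwell times in both states.
rng = random.Random(42)
gamma_dwell = lambda r: r.gammavariate(2.0, 0.5)
samples = [simulate_mrna_count(50.0, gamma_dwell, gamma_dwell, 5.0, 1.0, rng)
           for _ in range(200)]
print(sum(samples) / len(samples))  # empirical mean copy number
```

Swapping gamma_dwell for an exponential sampler with the same mean gives the simplest form of the model mentioned in the abstract, so the two empirical distributions can be compared side by side. A long t_end relative to the dwell and decay times is needed before the samples approximate the steady state.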
Mei, Ting; Fu, Wen-Bo; Li, Bo; He, Zheng-Bo; Chen, Bin
2018-01-01
Chemosensory proteins (CSP) are soluble carrier proteins that may function in odorant reception in insects. CSPs have not been thoroughly studied at whole-genome level, despite the availability of insect genomes. Here, we identified/reidentified 283 CSP genes in the genomes of 22 mosquitoes. All 283 CSP genes possess a highly conserved OS-D domain. We comprehensively analyzed these CSP genes and determined their conserved domains, structure, genomic distribution, phylogeny, and evolutionary patterns. We found an average of seven CSP genes in each of 19 Anopheles genomes, 27 CSP genes in Cx. quinquefasciatus, 43 in Ae. aegypti, and 83 in Ae. albopictus. The Anopheles CSP genes had a simple genomic organization with a relatively consistent gene distribution, while most of the Culicinae CSP genes were distributed in clusters on the scaffolds. Our phylogenetic analysis clustered the CSPs into two major groups: CSP1-8 and CSE1-3. The CSP1-8 groups were all monophyletic with good bootstrap support. The CSE1-3 groups were an expansion of the CSP family of genes specific to the three Culicinae species. The Ka/Ks ratios indicated that the CSP genes had been subject to purifying selection with relatively slow evolution. Our results provide a comprehensive framework for the study of the CSP gene family in these 22 mosquito species, laying a foundation for future work on CSP function in the detection of chemical cues in the surrounding environment. PMID:29304168
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.
Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W
2017-08-01
In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Mutsuddi, Mousumi; Mukherjee, Ashim; Shen, Baohe; Manley, James L; Nambu, John R
2010-01-01
The Drosophila Dichaete gene encodes a member of the Sox family of high mobility group (HMG) domain proteins that have crucial gene regulatory functions in diverse developmental processes. The subcellular localization and transcriptional regulatory activities of Sox proteins can be regulated by several post-translational modifications. To identify genes that functionally interact with Dichaete, we undertook a genetic modifier screen based on a Dichaete gain-of-function phenotype in the adult eye. Mutations in several genes, including decapentaplegic, engrailed and pelle, behaved as dominant modifiers of this eye phenotype. Further analysis of pelle mutants revealed that loss of pelle function results in alterations in the distinctive cytoplasmic distribution of Dichaete protein within the developing oocyte, as well as defects in the elaboration of individual egg chambers. The death domain-containing region of the Pelle protein kinase was found to associate with both Dichaete and mouse Sox2 proteins, and Pelle can phosphorylate Dichaete protein in vitro. Overall, these findings reveal that maternal functions of pelle are essential for proper localization of Dichaete protein in the oocyte and normal egg chamber formation. Dichaete appears to be a novel phosphorylation substrate for Pelle and may function in a Pelle-dependent signaling pathway during oogenesis.
Identification, distribution and molecular evolution of the pacifastin gene family in Metazoa
Breugelmans, Bert; Simonet, Gert; van Hoef, Vincent; Van Soest, Sofie; Broeck, Jozef Vanden
2009-01-01
Background Members of the pacifastin family are serine peptidase inhibitors, most of which are produced as multi domain precursor proteins. Structural and biochemical characteristics of insect pacifastin-like peptides have been studied intensively, but only one inhibitor has been functionally characterised. Recent sequencing projects of metazoan genomes have created an unprecedented opportunity to explore the distribution, evolution and functional diversification of pacifastin genes in the animal kingdom. Results A large scale in silico data mining search led to the identification of 83 pacifastin members with 284 inhibitor domains, distributed over 55 species from three metazoan phyla. In contrast to previous assumptions, members of this family were also found in phyla other than Arthropoda, including the sister phylum Onychophora and the 'primitive', non-bilaterian Placozoa. In Arthropoda, pacifastin members were found to be distributed among insect families of nearly all insect orders and for the first time also among crustacean species other than crayfish and the Chinese mitten crab. Contrary to precursors from Crustacea, the majority of insect pacifastin members contain dibasic cleavage sites, indicative of posttranslational processing into numerous inhibitor peptides. Whereas some insect species have lost the pacifastin gene, others were found to have several (often clustered) paralogous genes. Amino acids corresponding to the reactive site or involved in the folding of the inhibitor domain were analysed as a basis for the biochemical properties. Conclusion The absence of the pacifastin gene in some insect genomes and the extensive gene expansion in other insects are indicative of the rapid (adaptive) evolution of this gene family.
In addition, differential processing mechanisms and a high variability in the reactive site residues and the inner core interactions contribute to a broad functional diversification of inhibitor peptides, indicating wide ranging roles in different physiological processes. Based on the observation of a pacifastin gene in Placozoa, it can be hypothesized that the ancestral pacifastin gene has occurred before the divergence of bilaterian animals. However, considering differences in gene structure between the placozoan and other pacifastin genes and the existence of a 'pacifastin gene gap' between Placozoa and Onychophora/Arthropoda, it cannot be excluded that the pacifastin signature originated twice by convergent evolution. PMID:19435517
Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D
2015-02-01
The DNA binding with One Finger (Dof) protein is a plant specific transcription factor involved in the regulation of a wide range of processes. The analysis of the whole genome sequence of pigeonpea has identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 out of 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family including the gene structure, chromosome location, protein motif, phylogeny, gene duplication and functional divergence has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree showed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with predominance of light responsive and stress responsive elements, indicating that these CcDof genes may be associated with photoperiodic control and with biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the phylogenetic tree constructed. Our study provides useful information for functional characterization of CcDof genes.
Malhotra, Sony; Sowdhamini, Ramanathan
2013-08-01
The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. Performing a comprehensive survey of the sequenced genomes for DNA-binding proteins (DBPs) will help in understanding their distribution and the associated functions in a particular genome. Availability of the fully sequenced genome of Arabidopsis thaliana enables the review of the distribution of DBPs in this model plant genome. We used profiles of both structure and sequence-based DNA-binding families, derived from PDB and PFam databases, to perform the survey. This resulted in 4471 proteins, identified as DNA-binding in the Arabidopsis genome, which are distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes having a role in plant DNA repair was studied in particular and noted for functional mapping. The functions observed to be overrepresented in the plant genome include DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong
2014-10-16
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile functions in multiple aspects of plant growth and development. However, no systematical study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel
2016-01-01
Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423
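The core genome, pan-genome, and accessory (variable) genome named in this abstract are set operations over per-strain gene content. The following is a minimal sketch under stated assumptions: genes have already been grouped into ortholog identifiers, the strain and gene names are invented toy data, and core_and_pan_genome is a hypothetical helper, not the authors' pipeline.

```python
def core_and_pan_genome(strain_genes):
    """Return (core, pan) gene sets from a mapping strain -> iterable of gene IDs.

    core: genes present in every strain; pan: genes present in any strain.
    """
    if not strain_genes:
        raise ValueError("at least one strain is required")
    gene_sets = [set(genes) for genes in strain_genes.values()]
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan

# Toy example with three invented strains.
toy = {
    "strain_A": ["g1", "g2", "g3"],
    "strain_B": ["g1", "g3", "g4"],
    "strain_C": ["g1", "g3"],
}
core, pan = core_and_pan_genome(toy)
accessory = pan - core  # genes present in some strains but not all
print(sorted(core), sorted(pan), sorted(accessory))
```

In the study's terms, the 2,164 core genes correspond to the intersection and the 4,711 genes of the pan-genome to the union across the 40 strains.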
Conservation, Divergence, and Genome-Wide Distribution of PAL and POX A Gene Families in Plants.
Rawal, H C; Singh, N K; Sharma, T R
2013-01-01
Genome-wide identification and phylogenetic and syntenic comparison were performed for the genes responsible for phenylalanine ammonia lyase (PAL) and peroxidase A (POX A) enzymes in nine plant species representing very diverse groups like legumes (Glycine max and Medicago truncatula), fruits (Vitis vinifera), cereals (Sorghum bicolor, Zea mays, and Oryza sativa), trees (Populus trichocarpa), and model dicot (Arabidopsis thaliana) and monocot (Brachypodium distachyon) species. A total of 87 and 1045 genes in PAL and POX A gene families, respectively, have been identified in these species. The phylogenetic and syntenic comparison along with motif distributions shows a high degree of conservation of PAL genes, suggesting that these genes may predate monocot/eudicot divergence. The POX A family genes, present in clusters at the subtelomeric regions of chromosomes, might be evolving and expanding with higher rate than the PAL gene family. Our analysis showed that during the expansion of POX A gene family, many groups and subgroups have evolved, resulting in a high level of functional divergence among monocots and dicots. These results will act as a first step toward the understanding of monocot/eudicot evolution and functional characterization of these gene families in the future. PMID:23671845
Construction of CRISPR Libraries for Functional Screening.
Carstens, Carsten P; Felts, Katherine A; Johns, Sarah E
2018-01-01
Identification of gene function has been aided by the ability to generate targeted gene knockouts or transcriptional repression using the CRISPR/CAS9 system. Using pooled libraries of guide RNA expression vectors that direct CAS9 to a specific genomic site allows identification of genes that are either enriched or depleted in response to a selection scheme, thus linking the affected gene to the chosen phenotype. The quality of the data generated by the screening is dependent on the quality of the guide RNA delivery library with regards to error rates and especially evenness of distribution of the guides. Here, we describe a method for constructing complex plasmid libraries based on pooled designed oligomers with high representation and tight distributions. The procedure allows construction of plasmid libraries of >60,000 members with a 95th/5th percentile ratio of less than 3.5.
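The evenness metric quoted in this abstract, the ratio of the 95th to the 5th percentile of guide representation, can be computed from per-guide read counts. The sketch below is an illustration under stated assumptions: the input is a list of read counts per guide from sequencing the plasmid library, percentile_ratio is a hypothetical helper name, and the percentile interpolation used by the authors is not stated in the source, so the "inclusive" method of Python's statistics.quantiles is an arbitrary choice here.

```python
import statistics

def percentile_ratio(read_counts):
    """Return the 95th/5th percentile ratio of per-guide read counts."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(read_counts, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]
    if p5 <= 0:
        raise ValueError("5th percentile is zero; ratio is undefined")
    return p95 / p5

# Toy data: a perfectly even library and a skewed one.
even_library = [100] * 50
skewed_library = list(range(1, 102))
print(percentile_ratio(even_library))
print(percentile_ratio(skewed_library))
```

A ratio close to 1 indicates a tight distribution; a library meeting the criterion in the abstract would return a value below 3.5.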
Plant uncoupling mitochondrial proteins.
Vercesi, Aníbal Eugênio; Borecký, Jiri; Maia, Ivan de Godoy; Arruda, Paulo; Cuccovia, Iolanda Midea; Chaimovich, Hernan
2006-01-01
Uncoupling proteins (UCPs) are membrane proteins that mediate purine nucleotide-sensitive free fatty acid-activated H+ flux through the inner mitochondrial membrane. After the discovery of UCP in higher plants in 1995, it was acknowledged that these proteins are widely distributed in eukaryotic organisms. The widespread presence of UCPs in eukaryotes implies that these proteins may have functions other than thermogenesis. In this review, we describe the current knowledge of plant UCPs, including their discovery, biochemical properties, distribution, gene family, gene expression profiles, regulation of gene expression, and evolutionary aspects. Expression analyses and functional studies on the plant UCPs under normal and stressful conditions suggest that UCPs regulate energy metabolism in the cellular responses to stress through regulation of the electrochemical proton potential (ΔμH+) and production of reactive oxygen species.
Short, Michael D.; Abell, Guy C. J.; Bodrossy, Levente; van den Akker, Ben
2013-01-01
We report on the first study trialling a newly-developed, functional gene microarray (FGA) for characterising bacterial and archaeal ammonia oxidisers in activated sludge. Mixed liquor (ML) and media biofilm samples from a full-scale integrated fixed-film activated sludge (IFAS) plant were analysed with the FGA to profile the diversity and relative abundance of ammonia-oxidising archaea and bacteria (AOA and AOB respectively). FGA analyses of AOA and AOB communities revealed ubiquitous distribution of AOA across all samples – an important finding for these newly-discovered and poorly characterised organisms. Results also revealed striking differences in the functional ecology of attached versus suspended communities within the IFAS reactor. Quantitative assessment of AOB and AOA functional gene abundance revealed a dominance of AOB in the ML and approximately equal distribution of AOA and AOB in the media-attached biofilm. Subsequent correlations of functional gene abundance data with key water quality parameters suggested an important functional role for media-attached AOB in particular for IFAS reactor nitrification performance and indicate possible functional redundancy in some IFAS ammonia oxidiser communities. Results from this investigation demonstrate the capacity of the FGA to resolve subtle ecological shifts in key microbial communities in nitrifying activated sludge and indicate its value as a tool for better understanding the linkages between the ecology and performance of these engineered systems. PMID:24155925
Genomic islands link secondary metabolism to functional adaptation in marine Actinobacteria
Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel W.; Gontang, Erin A.; McGlinchey, Ryan P.; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric E.; Moore, Bradley S.; Jensen, Paul R.
2009-01-01
Genomic islands have been shown to harbor functional traits that differentiate ecologically distinct populations of environmental bacteria. A comparative analysis of the complete genome sequences of the marine Actinobacteria Salinispora tropica and S. arenicola reveals that 75% of the species-specific genes are located in 21 genomic islands. These islands are enriched in genes associated with secondary metabolite biosynthesis providing evidence that secondary metabolism is linked to functional adaptation. Secondary metabolism accounts for 8.8% and 10.9% of the genes in the S. tropica and S. arenicola genomes, respectively, and represents the major functional category of annotated genes that differentiates the two species. Genomic islands harbor all 25 of the species-specific biosynthetic pathways, the majority of which occur in S. arenicola and may contribute to the cosmopolitan distribution of this species. Genome evolution is dominated by gene duplication and acquisition, which in the case of secondary metabolism provide immediate opportunities for the production of new bioactive products. Evidence that secondary metabolic pathways are exchanged horizontally, coupled with prior evidence for fixation among globally distributed populations, supports a functional role and suggests that the acquisition of natural product biosynthetic gene clusters represents a previously unrecognized force driving bacterial diversification. Species-specific differences observed in CRISPR (clustered regularly interspaced short palindromic repeat) sequences suggest that S. arenicola may possess a higher level of phage immunity, while a highly duplicated family of polymorphic membrane proteins provides evidence of a new mechanism of marine adaptation in Gram-positive bacteria. PMID:19474814
Efficient Credit Assignment through Evaluation Function Decomposition
NASA Technical Reports Server (NTRS)
Agogino, Adrian; Tumer, Kagan; Miikkulainen, Risto
2005-01-01
Evolutionary methods are powerful tools in discovering solutions for difficult continuous tasks. When such a solution is encoded over multiple genes, a genetic algorithm faces the difficult credit assignment problem of evaluating how a single gene in a chromosome contributes to the full solution. Typically a single evaluation function is used for the entire chromosome, implicitly giving each gene in the chromosome the same evaluation. This method is inefficient because a gene will get credit for the contribution of all the other genes as well. Accurately measuring the fitness of individual genes in such a large search space requires many trials. This paper instead proposes turning this single complex search problem into a multi-agent search problem, where each agent has the simpler task of discovering a suitable gene. Gene-specific evaluation functions can then be created that have better theoretical properties than a single evaluation function over all genes. This method is tested on the difficult double-pole balancing problem, showing that agents using gene-specific evaluation functions can create a successful control policy in 20 percent fewer trials than the best existing genetic algorithms. The method is extended to more distributed problems, achieving 95 percent performance gains over traditional methods in the multi-rover domain.
Polyploidization altered gene functions in cotton (Gossypium spp.)
USDA-ARS's Scientific Manuscript database
Cotton fibers are seed trichomes derived from individual cells of the epidermal layer of the seed coat. It has been known for a long time that a large set of genes determine the development of cotton fiber, and more recently it has been determined that these genes are distributed across the At and ...
Genome-wide analysis of promoter architecture in Drosophila melanogaster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hoskins, Roger A.; Landolin, Jane M.; Brown, James B.
2010-10-20
Core promoters are critical regions for gene regulation in higher eukaryotes. However, the boundaries of promoter regions, the relative rates of initiation at the transcription start sites (TSSs) distributed within them, and the functional significance of promoter architecture remain poorly understood. We produced a high-resolution map of promoters active in the Drosophila melanogaster embryo by integrating data from three independent and complementary methods: 21 million cap analysis of gene expression (CAGE) tags, 1.2 million RNA ligase mediated rapid amplification of cDNA ends (RLMRACE) reads, and 50,000 cap-trapped expressed sequence tags (ESTs). We defined 12,454 promoters of 8037 genes. Our analysis indicates that, due to non-promoter-associated RNA background signal, previous studies have likely overestimated the number of promoter-associated CAGE clusters by fivefold. We show that TSS distributions form a complex continuum of shapes, and that promoters active in the embryo and adult have highly similar shapes in 95% of cases. This suggests that these distributions are generally determined by static elements such as local DNA sequence and are not modulated by dynamic signals such as histone modifications. Transcription factor binding motifs are differentially enriched as a function of promoter shape, and peaked promoter shape is correlated with both temporal and spatial regulation of gene expression. Our results contribute to the emerging view that core promoters are functionally diverse and control patterning of gene expression in Drosophila and mammals.
Analysis of the Prefoldin Gene Family in 14 Plant Species
Cao, Jun
2016-01-01
Prefoldin is a hexameric molecular chaperone complex present in all eukaryotes and archaea. The evolution of this gene family in plants is unknown. Here, I identified 140 prefoldin genes in 14 plant species. These prefoldin proteins were divided into nine groups through phylogenetic analysis. Highly conserved gene organization and motif distribution exist in each prefoldin group, implying their functional conservation. I also observed the segmental duplication of maize prefoldin gene family. Moreover, a few functional divergence sites were identified within each group pairs. Functional network analyses identified 78 co-expressed genes, and most of them were involved in carrying, binding and kinase activity. Divergent expression profiles of the maize prefoldin genes were further investigated in different tissues and development periods and under auxin and some abiotic stresses. I also found a few cis-elements responding to abiotic stress and phytohormone in the upstream sequences of the maize prefoldin genes. The results provided a foundation for exploring the characterization of the prefoldin genes in plants and will offer insights for additional functional studies. PMID:27014333
Subramoni, Sujatha; Florez Salcedo, Diana Vanessa; Suarez-Moreno, Zulma R.
2015-01-01
LuxR solo transcriptional regulators contain both an autoinducer binding domain (ABD; N-terminal) and a DNA binding Helix-Turn-Helix domain (HTH; C-terminal), but are not associated with a cognate N-acyl homoserine lactone (AHL) synthase coding gene in the same genome. Although a few LuxR solos have been characterized, their distributions as well as their role in bacterial signal perception and other processes are poorly understood. In this study we have carried out a systematic survey of distribution of all ABD containing LuxR transcriptional regulators (QS domain LuxRs) available in the InterPro database (IPR005143), and identified those lacking a cognate AHL synthase. These LuxR solos were then analyzed regarding their taxonomical distribution, predicted functions of neighboring genes and the presence of complete AHL-QS systems in the genomes that carry them. Our analyses reveal the presence of one or multiple predicted LuxR solos in many proteobacterial genomes carrying QS domain LuxRs, some of them harboring genes for one or more AHL-QS circuits. The presence of LuxR solos in bacteria occupying diverse environments suggests potential ecological functions for these proteins beyond AHL and interkingdom signaling. Based on gene context and the conservation levels of invariant amino acids of ABD, we have classified LuxR solos into functionally meaningful groups or putative orthologs. Surprisingly, putative LuxR solos were also found in a few non-proteobacterial genomes which are not known to carry AHL-QS systems. Multiple predicted LuxR solos in the same genome appeared to have different levels of conservation of invariant amino acid residues of ABD questioning their binding to AHLs. In summary, this study provides a detailed overview of distribution of LuxR solos and their probable roles in bacteria with genome sequence information. PMID:25759807
Genetic resources offer efficient tools for rice functional genomics research.
Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May
2016-05-01
Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.
Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag
2015-01-01
Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient because of the large amount of sequencing required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation: at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729
Gupta, Gagan D.; Howes, Mark T.; Chandran, Ruma; Das, Anupam; Menon, Sindhu; Parton, Robert G.; Sowdhamini, R.; Thattai, Mukund; Mayor, Satyajit
2014-01-01
Single-cell-resolved measurements reveal heterogeneous distributions of clathrin-dependent (CD) and -independent (CLIC/GEEC: CG) endocytic activity in Drosophila cell populations. dsRNA-mediated knockdown of core versus peripheral endocytic machinery induces strong changes in the mean, or subtle changes in the shapes of these distributions, respectively. By quantifying these subtle shape changes for 27 single-cell features which report on endocytic activity and cell morphology, we organize 1072 Drosophila genes into a tree-like hierarchy. We find that tree nodes contain gene sets enriched in functional classes and protein complexes, providing a portrait of core and peripheral control of CD and CG endocytosis. For 470 genes we obtain additional features from separate assays and classify them into early- or late-acting genes of the endocytic pathways. Detailed analyses of specific genes at intermediate levels of the tree suggest that Vacuolar ATPase and lysosomal genes involved in vacuolar biogenesis play an evolutionarily conserved role in CG endocytosis. PMID:24971745
Distribution and diversity of ribosome binding sites in prokaryotic genomes.
Omotajo, Damilola; Tate, Travis; Cho, Hyuk; Choudhary, Madhusudan
2015-08-14
Prokaryotic translation initiation involves the proper docking, anchoring, and accommodation of mRNA to the 30S ribosomal subunit. Three initiation factors (IF1, IF2, and IF3) and some ribosomal proteins mediate the assembly and activation of the translation initiation complex. Although the interaction between the Shine-Dalgarno (SD) sequence and its complementary sequence in the 16S rRNA is important in initiation, some genes lacking an SD ribosome binding site (RBS) are still well expressed. The objective of this study is to examine the pattern of distribution and diversity of RBS in fully sequenced bacterial genomes. The following three hypotheses were tested: SD motifs are prevalent in bacterial genomes; all previously identified SD motifs are uniformly distributed across prokaryotes; and genes with specific cluster of orthologous gene (COG) functions differ in their use of SD motifs. Data for 2,458 bacterial genomes, previously generated by Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm) and currently available at the National Center for Biotechnology Information (NCBI), were analyzed. Of the total genes examined, ~77.0% use an SD RBS, while ~23.0% have no RBS. The majority of the genes with the most common SD motifs are distributed in a manner that is representative of their abundance for each COG functional category, while motifs 13 (5'-GGA-3'/5'-GAG-3'/5'-AGG-3') and 27 (5'-AGGAGG-3') appear to be predominantly used by genes for information storage and processing, and translation and ribosome biogenesis, respectively. These findings suggest that an SD sequence is not obligatory for translation initiation; instead, other signals, such as the RBS spacer, may have an overarching influence on translation of mRNAs. Subsequent analyses of the 5' secondary structure of these mRNAs may provide further insight into the translation initiation mechanism.
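The kind of per-gene classification summarized above (SD RBS present or absent, plus the spacer to the start codon) can be illustrated with a minimal sketch. This is not Prodigal's algorithm or its motif bins: the motif list `SD_MOTIFS`, the 20-nt window, and the function names `find_sd_motif` and `sd_usage_fraction` are all assumptions made for illustration.

```python
# Hypothetical SD-like motifs, ordered longest first. These are illustrative
# and do NOT reproduce Prodigal's numbered motif bins.
SD_MOTIFS = ["AGGAGG", "GGAGG", "AGGAG", "GGAG", "AGGA", "GGA", "GAG", "AGG"]


def find_sd_motif(upstream, max_window=20):
    """Return (motif, spacer) for the longest SD-like motif found, else None.

    `upstream` is the DNA immediately 5' of the start codon, written 5'->3',
    so its last base sits directly before the start codon. `spacer` is the
    number of bases between the end of the motif and the start codon.
    """
    window = upstream.upper()[-max_window:]
    for motif in SD_MOTIFS:  # longest motifs are tried first
        pos = window.rfind(motif)  # occurrence closest to the start codon
        if pos != -1:
            spacer = len(window) - (pos + len(motif))
            return motif, spacer
    return None


def sd_usage_fraction(upstreams):
    """Fraction of upstream regions that contain any SD-like motif."""
    if not upstreams:
        return 0.0
    hits = sum(1 for u in upstreams if find_sd_motif(u) is not None)
    return hits / len(upstreams)
```

Applied to the upstream regions of every predicted gene in a genome, `sd_usage_fraction` would give a rough analogue of the SD-usage percentage reported in the abstract, under the stated assumptions about motifs and window size.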
Gao, Jie; Lan, Ting
2016-01-19
Late embryogenesis abundant (LEA) proteins are a large and highly diverse gene family present in a wide range of plant species. LEAs are proposed to play a role in various stress tolerance responses. Our study represents the first survey of LEA proteins and their encoding genes in a pine that is widely distributed in China (Pinus tabuliformis). Twenty-three LEA genes, belonging to seven groups, were identified in P. tabuliformis. Proteins with repeated motifs are an important feature specific to LEA groups. Ten of the 23 pine LEA genes were selectively expressed in specific tissues, and showed expression divergence within each group. In addition, we selected 13 genes representing each group and introduced these genes into Escherichia coli to assess the protective function of PtaLEA under heat and salt stresses. Compared with control cells, the E. coli cells expressing PtaLEA fusion protein exhibited enhanced salt and heat resistance and viability, indicating the protein may play a protective role in cells under stress conditions. Furthermore, among these tolerance-enhancing genes, a certain extent of functional divergence appeared within a gene group as well as between gene groups, suggesting potential functional diversity of this gene family in conifers.
Boyd, Eric S.; Barkay, Tamar
2012-01-01
Mercuric mercury (Hg[II]) is a highly toxic and mobile element that is likely to have had a pronounced and adverse effect on biology since Earth’s oxygenation ∼2.4 billion years ago due to its high affinity for protein sulfhydryl groups, which upon binding destabilize protein structure and decrease enzyme activity, resulting in a decreased organismal fitness. The central enzyme in the microbial mercury detoxification system is the mercuric reductase (MerA) protein, which catalyzes the reduction of Hg(II) to volatile Hg(0). In addition to MerA, mer operons encode for proteins involved in regulation, Hg binding, and organomercury degradation. Mer-mediated approaches have had broad applications in the bioremediation of mercury-contaminated environments and industrial waste streams. Here, we examine the composition of 272 individual mer operons and quantitatively map the distribution of mer-encoded functions on both taxonomic SSU rRNA gene and MerA phylogenies. The results indicate an origin and early evolution of MerA among thermophilic bacteria and an overall increase in the complexity of mer operons through evolutionary time, suggesting continual gene recruitment and evolution leading to an improved efficiency and functional potential of the Mer detoxification system. Consistent with a positive relationship between the evolutionary history and topology of MerA and SSU rRNA gene phylogenies (Mantel R = 0.81, p < 0.01), the distribution of the majority of mer functions, when mapped on these phylograms, indicates an overall tendency to inherit mer-encoded functions through vertical descent. However, individual mer functions display evidence of a variable degree of vertical inheritance, with several genes exhibiting strong evidence for acquisition via lateral gene transfer and/or gene loss. 
Collectively, these data suggest that (i) mer has evolved from a simple system in geothermal environments to a widely distributed and more complex and efficient detoxification system, and (ii) merA is a suitable biomarker for examining the functional diversity of Hg detoxification and for predicting the composition of mer operons in natural environments. PMID:23087676
Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; ...
2015-03-27
Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
The petunia AGL6 gene has a SEPALLATA-like function in floral patterning.
Rijpkema, Anneke S; Zethof, Jan; Gerats, Tom; Vandenbussche, Michiel
2009-10-01
SEPALLATA (SEP) MADS-box genes are required for the regulation of floral meristem determinacy and the specification of sepals, petals, stamens, carpels and ovules, specifically in angiosperms. The SEP subfamily is closely related to the AGAMOUS LIKE6 (AGL6) and SQUAMOSA (SQUA) subfamilies. So far, of these three groups only AGL6-like genes have been found in extant gymnosperms. AGL6 genes are more similar to SEP than to SQUA genes, both in sequence and in expression pattern. Despite the ancestry and wide distribution of AGL6-like MADS-box genes, not a single loss-of-function mutant exhibiting a clear phenotype has yet been reported; consequently the function of AGL6-like genes has remained elusive. Here, we characterize the Petunia hybrida AGL6 (PhAGL6, formerly called PETUNIA MADS BOX GENE4/pMADS4) gene, and show that it functions redundantly with the SEP genes FLORAL BINDING PROTEIN2 (FBP2) and FBP5 in petal and anther development. Moreover, expression analysis suggests a function for PhAGL6 in ovary and ovule development. The PhAGL6 and FBP2 proteins interact in in vitro experiments overall with the same partners, indicating that the two proteins are biochemically quite similar. It will be interesting to determine the functions of AGL6-like genes of other species, especially those of gymnosperms.
Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Lu, Kun; Xu, Xinfu; Wang, Rui; Li, Jiana; Qu, Cunmin
2017-10-24
The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus. PMID:29064393
Statistical mechanics of scale-free gene expression networks
NASA Astrophysics Data System (ADS)
Gross, Eitan
2012-12-01
The gene co-expression networks of many organisms including bacteria, mice and man exhibit scale-free distribution. This heterogeneous distribution of connections decreases the vulnerability of the network to random attacks and thus may confer the genetic replication machinery an intrinsic resilience to such attacks, triggered by changing environmental conditions that the organism may be subject to during evolution. This resilience to random attacks comes at an energetic cost, however, reflected by the lower entropy of the scale-free distribution compared to the more homogenous, random network. In this study we found that the cell cycle-regulated gene expression pattern of the yeast Saccharomyces cerevisiae obeys a power-law distribution with an exponent α = 2.1 and an entropy of 1.58. The latter is very close to the maximal value of 1.65 obtained from linear optimization of the entropy function under the constraint of a constant cost function, determined by the average degree connectivity
Evolution and Distribution of Teleost myomiRNAs: Functionally Diversified myomiRs in Teleosts.
Siddique, Bhuiyan Sharmin; Kinoshita, Shigeharu; Wongkarangkana, Chaninya; Asakawa, Shuichi; Watabe, Shugo
2016-06-01
Myosin heavy chain (MYH) genes belong to a multigene family, and the regulated expression of each member determines the physiological and contractile muscle properties. Among these, MYH6, MYH7, and MYH14 occupy unique positions in the mammalian MYH gene family because of their specific expression in slow/cardiac muscles and the existence of intronic micro(mi) RNAs. MYH6, MYH7, and MYH14 encode miR-208a, miR-208b, and miR-499, respectively. These MYH encoded miRNAs are designated as myomiRs because of their muscle-specific expression and functions. In mammals, myomiRs and host MYHs form a transcription network involved in muscle fiber-type specification; thus, genomic positions and expression patterns of them are well conserved. However, our previous studies revealed divergent distribution and expression of MYH14/miR-499 among teleosts, suggesting the unique evolution of myomiRs and host MYHs in teleosts. Here, we examined distribution and expression of myomiRs and host MYHs in various teleost species. The major cardiac MYH isoforms in teleosts are an intronless gene, atrial myosin heavy chain (amhc), and ventricular myosin heavy chain (vmhc) gene that encodes an intronic miRNA, miR-736. Phylogenetic analysis revealed that vmhc/miR-736 is a teleost-specific myomiR that differed from tetrapoda MYH6/MYH7/miR-208s. Teleost genomes also contain species-specific orthologs in addition to vmhc and amhc, indicating complex gene duplication and gene loss events during teleost evolution. In medaka and torafugu, miR-499 was highly expressed in slow/cardiac muscles whereas the expression of miR-736 was quite low and not muscle specific. These results suggest functional diversification of myomiRs in teleost with the diversification of host MYHs.
The Cryptochrome/Photolyase Family in aquatic organisms.
Oliveri, Paola; Fortunato, Antonio E; Petrone, Libero; Ishikawa-Fujiwara, Tomoko; Kobayashi, Yuri; Todo, Takeshi; Antonova, Olga; Arboleda, Enrique; Zantke, Juliane; Tessmar-Raible, Kristin; Falciatore, Angela
2014-04-01
The Cryptochrome/Photolyase Family (CPF) represents an ancient group of widely distributed UV-A/blue-light sensitive proteins sharing common structures and chromophores. During the course of evolution, different CPFs acquired distinct functions in DNA repair, light perception and circadian clock regulation. Previous phylogenetic analyses of the CPF have allowed reconstruction of the evolution and distribution of the different CPF super-classes in the tree of life. However, so far only limited information is available from the CPF orthologs in aquatic organisms that evolved in environments harboring great diversity of life forms and showing peculiar light distribution and rhythms. To gain new insights into the evolutionary and functional relationships within the CPF family, we performed a detailed study of CPF members from marine (diatoms, sea urchin and annelid) and freshwater organisms (teleost) that populate diverse habitats and exhibit different life strategies. In particular, we first extended the CPF family phylogeny by including genes from aquatic organisms representative of several branches of the tree of life. Our analysis identifies four major super-classes of CPF proteins and importantly singles out the presence of a plant-like CRY in diatoms and in metazoans. Moreover, we show a dynamic evolution of Cpf genes in eukaryotes with various events of gene duplication coupled to functional diversification and gene loss, which have shaped the complex array of Cpf genes in extant aquatic organisms. Second, we uncover clear rhythmic diurnal expression patterns and light-dependent regulation for the majority of the analyzed Cpf genes in our reference species. Our analyses reconstruct the molecular evolution of the CPF family in eukaryotes and provide a solid foundation for a systematic characterization of novel light activated proteins in aquatic environments. Copyright © 2014. Published by Elsevier B.V.
Sundström, Jens; Engström, Peter
2002-07-01
The Norway spruce MADS-box genes DAL11, DAL12 and DAL13 are phylogenetically related to the angiosperm B-function MADS-box genes: genes that act together with A-function genes in specifying petal identity and with C-function genes in specifying stamen identity to floral organs. In this report we present evidence to suggest that the B-gene function in the specification of identity of the pollen-bearing organs has been conserved between conifers and angiosperms. Expression of DAL11 or DAL12 in transgenic Arabidopsis causes phenotypic changes which partly resemble those caused by ectopic expression of the endogenous B-genes. In similar experiments, flowers of Arabidopsis plants expressing DAL13 showed a different homeotic change in that they formed ectopic anthers in whorls one, two or four. We also demonstrate the capacity of the spruce gene products to form homodimers, and that DAL11 and DAL13 may form heterodimers with each other and with the Arabidopsis B-protein AP3, but not with PI, the second B-gene product in Arabidopsis. In situ hybridization experiments show that the conifer B-like genes are expressed specifically in developing pollen cones, but differ in both temporal and spatial distribution patterns. These results suggest that the B-function in conifers is dual and is separated into a meristem identity and an organ identity function, the latter function possibly being independent of an interaction with the C-function. Thus, even though an ancestral B-function may have acted in combination with C to specify micro- and megasporangia, the B-function has evolved differently in conifers and angiosperms.
Singh, Anuradha; Mantri, Shrikant; Sharma, Monica; Chaudhury, Ashok; Tuli, Rakesh; Roy, Joy
2014-01-16
The cultivated bread wheat (Triticum aestivum L.) possesses unique flour quality, which can be processed into many end-use food products such as bread, pasta, chapatti (unleavened flat bread), biscuit, etc. The present wheat varieties require improvement in processing quality to meet the increasing demand for better quality food products. However, processing quality is very complex and controlled by many genes, which have not been completely explored. To identify the candidate genes whose expression changed due to variation in processing quality and interaction (quality x development), genome-wide transcriptome studies were performed in two sets of diverse Indian wheat varieties differing in chapatti quality. It is also important to understand the temporal and spatial distributions of their expression for designing tissue- and growth-specific functional genomics experiments. Gene-specific two-way ANOVA of the expression of about 55 K transcripts in two diverse sets of Indian wheat varieties for chapatti quality at three seed developmental stages identified 236 differentially expressed probe sets (10-fold). Out of 236, 110 probe sets were identified for chapatti quality. Many key genes related to processing quality, such as glutenin and gliadins, puroindolines, grain softness protein, alpha and beta amylases, and proteases, were identified, and many other candidate genes related to cellular and molecular functions were also identified. The ANOVA revealed that the expression of 56 of 110 probe sets was involved in interaction (quality x development). The majority of the probe sets showed differential expression at an early stage of seed development, i.e. temporal expression. Meta-analysis revealed that the majority of the genes were expressed in one or a few growth stages, indicating spatial distribution of their expression. The differential expression of a few candidate genes such as pre-alpha/beta-gliadin and gamma gliadin was validated by RT-PCR. 
Therefore, this study identified several quality related key genes including many other genes, their interactions (quality x development) and temporal and spatial distributions. The candidate genes identified for processing quality and information on temporal and spatial distributions of their expressions would be useful for designing wheat improvement programs for processing quality either by changing their expression or development of single nucleotide polymorphisms (SNPs) markers.
PMID:24433256
NASA Astrophysics Data System (ADS)
Liland, Kristian Hovde; Snipen, Lars
When a series of Bernoulli trials occur within a fixed time frame or limited space, it is often interesting to assess if the successful outcomes have occurred completely at random, or if they tend to group together. One example, in genetics, is detecting grouping of genes within a genome. Approximations of the distribution of successes are possible, but they become inaccurate for small sample sizes. In this article, we describe the exact distribution of time between random, non-overlapping successes in discrete time of fixed length. A complete description of the probability mass function, the cumulative distribution function, mean, variance and recurrence relation is included. We propose an associated test for the over-representation of short distances and illustrate the methodology through relevant examples. The theory is implemented in an R package including probability mass, cumulative distribution, quantile function, random number generator, simulation functions, and functions for testing.
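The idea of an exact gap distribution can be illustrated with a minimal Python sketch. It assumes a simplified setting that may differ from the one in the article: exactly k successes placed uniformly at random among n discrete positions, in which case the gap D between two consecutive successes has P(D = d) = C(n-d, k-1) / C(n, k) for d = 1, ..., n-k+1. The function names are hypothetical and are not the interface of the authors' R package.

```python
from fractions import Fraction
from math import comb


def gap_pmf(n, k):
    """Exact pmf of the gap D between consecutive successes.

    Assumption (not taken from the article): exactly k successes are
    placed uniformly at random among n discrete positions, so that
    P(D = d) = C(n - d, k - 1) / C(n, k) for d = 1 .. n - k + 1.
    """
    total = comb(n, k)
    return {d: Fraction(comb(n - d, k - 1), total)
            for d in range(1, n - k + 2)}


def short_gap_probability(n, k, d_obs):
    """P(D <= d_obs): how likely a single gap this short is by chance."""
    return sum(p for d, p in gap_pmf(n, k).items() if d <= d_obs)


if __name__ == "__main__":
    pmf = gap_pmf(10, 3)
    print(sum(pmf.values()))                   # total probability
    print(sum(d * p for d, p in pmf.items()))  # mean gap
    print(short_gap_probability(10, 3, 1))     # chance of adjacent successes
```

Exact rational arithmetic is used so that small-sample probabilities are not affected by rounding. This gives the marginal probability for a single gap only; a test for over-representation of short distances across all gaps, as proposed in the article, would need additional machinery.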
Molecular and comparative genetics of mental retardation.
Inlow, Jennifer K; Restifo, Linda L
2004-01-01
Affecting 1-3% of the population, mental retardation (MR) poses significant challenges for clinicians and scientists. Understanding the biology of MR is complicated by the extraordinary heterogeneity of genetic MR disorders. Detailed analyses of >1000 Online Mendelian Inheritance in Man (OMIM) database entries and literature searches through September 2003 revealed 282 molecularly identified MR genes. We estimate that hundreds more MR genes remain to be identified. A novel test, in which we distributed unmapped MR disorders proportionately across the autosomes, failed to eliminate the well-known X-chromosome overrepresentation of MR genes and candidate genes. This evidence argues against ascertainment bias as the main cause of the skewed distribution. On the basis of a synthesis of clinical and laboratory data, we developed a biological functions classification scheme for MR genes. Metabolic pathways, signaling pathways, and transcription are the most common functions, but numerous other aspects of neuronal and glial biology are controlled by MR genes as well. Using protein sequence and domain-organization comparisons, we found a striking conservation of MR genes and genetic pathways across the approximately 700 million years that separate Homo sapiens and Drosophila melanogaster. Eighty-seven percent have one or more fruit fly homologs and 76% have at least one candidate functional ortholog. We propose that D. melanogaster can be used in a systematic manner to study MR and possibly to develop bioassays for therapeutic drug discovery. We selected 42 Drosophila orthologs as most likely to reveal molecular and cellular mechanisms of nervous system development or plasticity relevant to MR. PMID:15020472
Functional and topological characteristics of mammalian regulatory domains
Symmons, Orsolya; Uslu, Veli Vural; Tsujimura, Taro; Ruf, Sandra; Nassari, Sonya; Schwarzer, Wibke; Ettwiller, Laurence; Spitz, François
2014-01-01
Long-range regulatory interactions play an important role in shaping gene-expression programs. However, the genomic features that organize these activities are still poorly characterized. We conducted a large operational analysis to chart the distribution of gene regulatory activities along the mouse genome, using hundreds of insertions of a regulatory sensor. We found that enhancers distribute their activities along broad regions and not in a gene-centric manner, defining large regulatory domains. Remarkably, these domains correlate strongly with the recently described TADs, which partition the genome into distinct self-interacting blocks. Different features, including specific repeats and CTCF-binding sites, correlate with the transition zones separating regulatory domains, and may help to further organize promiscuously distributed regulatory influences within large domains. These findings support a model of genomic organization where TADs confine regulatory activities to specific but large regulatory domains, contributing to the establishment of specific gene expression profiles. PMID:24398455
Yamamoto, Kaneyoshi; Yamanaka, Yuki; Shimada, Tomohiro; Sarkar, Paramita; Yoshida, Myu; Bhardwaj, Neerupma; Watanabe, Hiroki; Taira, Yuki; Chatterji, Dipankar; Ishihama, Akira
2018-01-01
The RNA polymerase (RNAP) of Escherichia coli K-12 is a complex enzyme consisting of the core enzyme with the subunit structure α2ββ′ω and one of the σ subunits with promoter recognition properties. The smallest subunit, omega (the rpoZ gene product), participates in subunit assembly by supporting the folding of the largest subunit, β′, but its functional role remains unsolved except for its involvement in ppGpp binding and stringent response. As an initial approach for elucidation of its functional role, we performed in this study ChIP-chip (chromatin immunoprecipitation with microarray technology) analysis of wild-type and rpoZ-defective mutant strains. The altered distribution of RpoZ-defective RNAP was identified mostly within open reading frames, in particular, of the genes inside prophages. For the genes that exhibited increased or decreased distribution of RpoZ-defective RNAP, the level of transcripts increased or decreased, respectively, as detected by reverse transcription-quantitative PCR (qRT-PCR). In parallel, we analyzed, using genomic SELEX (systematic evolution of ligands by exponential enrichment), the distribution of constitutive promoters that are recognized by RNAP RpoD holoenzyme alone and of general silencer H-NS within prophages. Since all 10 prophages in E. coli K-12 carry only a small number of promoters, the altered occupancy of RpoZ-defective RNAP and of transcripts might represent transcription initiated from as-yet-unidentified host promoters. The genes that exhibited transcription enhanced by RpoZ-defective RNAP are located in the regions of low-level H-NS binding. By using phenotype microarray (PM) assay, alterations of some phenotypes were detected for the rpoZ-deleted mutant, indicating the involvement of RpoZ in regulation of some genes. Possible mechanisms of altered distribution of RNAP inside prophages are discussed.
IMPORTANCE The 91-amino-acid-residue small-subunit omega (the rpoZ gene product) of Escherichia coli RNA polymerase plays a structural role in the formation of RNA polymerase (RNAP) as a chaperone in folding the largest subunit (β', of 1,407 residues in length), but except for binding of the stringent signal ppGpp, little is known of its role in the control of RNAP function. After analysis of genomewide distribution of wild-type and RpoZ-defective RNAP by the ChIP-chip method, we found alteration of the RpoZ-defective RNAP inside open reading frames, in particular, of the genes within prophages. For a set of the genes that exhibited altered occupancy of the RpoZ-defective RNAP, transcription was found to be altered as observed by qRT-PCR assay. All the observations here described indicate the involvement of RpoZ in recognition of some of the prophage genes. This study advances understanding of not only the regulatory role of omega subunit in the functions of RNAP but also the regulatory interplay between prophages and the host E. coli for adjustment of cellular physiology to a variety of environments in nature.
PMID:29468196
Joint scaling laws in functional and evolutionary categories in prokaryotic genomes
Grilli, J.; Bassetti, B.; Maslov, S.; Cosentino Lagomarsino, M.
2012-01-01
We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional ‘recipe’ for genome composition of the type ‘a spoonful of sugar for each egg yolk’. The model jointly reproduces two known empirical laws: the distribution of family sizes and the non-linear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterizing these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster than linearly with genome size are characterized by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes. PMID:21937509
Ayadi, M; Hanana, M; Kharrat, N; Merchaoui, H; Marzoug, R Ben; Lauvergeat, V; Rebaï, A; Mzid, R
2016-10-01
WRKY transcription factors belong to a large family of plant transcriptional regulators whose members have been reported to be involved in a wide range of biological roles including plant development, adaptation to environmental constraints and response to several diseases. However, little information is available about WRKYs in Citrus. The recent release of completely assembled genome sequences of Citrus sinensis and Citrus clementina and the availability of EST sequences from other citrus species allowed us to perform a genome survey for Citrus WRKY proteins. In the present study, we identified 100 WRKY members from C. sinensis (51), C. clementina (48) and Citrus unshiu (1), and analyzed their chromosomal distribution, gene structure, gene duplication, syntenic relations and phylogeny. A phylogenetic tree of the 100 Citrus WRKY sequences with their orthologs from Arabidopsis distinguished seven groups. The CsWRKY genes were distributed across all ten sweet orange chromosomes. A comprehensive approach and an integrative analysis of Citrus WRKY gene expression revealed variable expression profiles across tissues and stress conditions, indicating functional diversification. Thus, candidate Citrus WRKY genes have been proposed as potentially involved in fruit acidification, essential oil biosynthesis and abiotic/biotic stress tolerance. Our results provide essential prerequisites for further WRKY gene cloning and functional analysis with the aim of citrus crop improvement.
Carpenter, Margaret A; Shaw, Martin; Cooper, Rebecca D; Frew, Tonya J; Butler, Ruth C; Murray, Sarah R; Moya, Leire; Coyne, Clarice J; Timmerman-Vaughan, Gail M
2017-08-01
Although starch consists of large macromolecules composed of glucose units linked by α-1,4-glycosidic linkages with α-1,6-glycosidic branchpoints, variation in starch structural and functional properties is found both within and between species. Interest in starch genetics is based on the importance of starch in food and industrial processes, with the potential of genetics to provide novel starches. The starch metabolic pathway is complex but has been characterized in diverse plant species, including pea. To understand how allelic variation in the pea starch metabolic pathway affects starch structure and percent amylose, partial sequences of 25 candidate genes were characterized for polymorphisms using a panel of 92 diverse pea lines. Variation in the percent amylose composition of extracted seed starch and (amylopectin) chain length distribution, one measure of starch structure, were characterized for these lines. Association mapping was undertaken to identify polymorphisms associated with the variation in starch chain length distribution and percent amylose, using a mixed linear model that incorporated population structure and kinship. Associations were found for polymorphisms in seven candidate genes plus Mendel's r locus (which conditions the round versus wrinkled seed phenotype). The genes with associated polymorphisms are involved in the substrate supply, chain elongation and branching stages of the pea carbohydrate and starch metabolic pathways. The association of polymorphisms in carbohydrate and starch metabolic genes with variation in amylopectin chain length distribution and percent amylose may help to guide manipulation of pea seed starch structural and functional properties through plant breeding.
Gene expression links functional networks across cortex and striatum.
Anderson, Kevin M; Krienen, Fenna M; Choi, Eun Young; Reinen, Jenna M; Yeo, B T Thomas; Holmes, Avram J
2018-04-12
The human brain is comprised of a complex web of functional networks that link anatomically distinct regions. However, the biological mechanisms supporting network organization remain elusive, particularly across cortical and subcortical territories with vastly divergent cellular and molecular properties. Here, using human and primate brain transcriptional atlases, we demonstrate that spatial patterns of gene expression show strong correspondence with limbic and somato/motor cortico-striatal functional networks. Network-associated expression is consistent across independent human datasets and evolutionarily conserved in non-human primates. Genes preferentially expressed within the limbic network (encompassing nucleus accumbens, orbital/ventromedial prefrontal cortex, and temporal pole) relate to risk for psychiatric illness, chloride channel complexes, and markers of somatostatin neurons. Somato/motor associated genes are enriched for oligodendrocytes and markers of parvalbumin neurons. These analyses indicate that parallel cortico-striatal processing channels possess dissociable genetic signatures that recapitulate distributed functional networks, and nominate molecular mechanisms supporting cortico-striatal circuitry in health and disease.
General statistics of stochastic process of gene expression in eukaryotic cells.
Kuznetsov, V A; Knott, G D; Bonner, R F
2002-01-01
Thousands of genes are expressed at such very low levels (≤1 copy per cell) that global gene expression analysis of rarer transcripts remains problematic. Ambiguity in identification of rarer transcripts creates considerable uncertainty in fundamental questions such as the total number of genes expressed in an organism and the biological significance of rarer transcripts. Knowing the distribution of the true number of genes expressed at each level and the corresponding gene expression level probability function (GELPF) could help resolve these uncertainties. We found that all observed large-scale gene expression data sets in yeast, mouse, and human cells follow a Pareto-like distribution model skewed by many low-abundance transcripts. A novel stochastic model of the gene expression process predicts the universality of the GELPF both across different cell types within a multicellular organism and across different organisms. This model allows us to predict the frequency distribution of all gene expression levels within a single cell and to estimate the number of expressed genes in a single cell and in a population of cells. A random "basal" transcription mechanism for protein-coding genes in all or almost all eukaryotic cell types is predicted. This fundamental mechanism might enhance the expression of rarely expressed genes and, thus, provide a basic level of phenotypic diversity, adaptability, and random monoallelic expression in cell populations. PMID:12136033
Liu, Qin; Dang, Huijie; Chen, Zhijian; Wu, Junzheng; Chen, Yinhua; Chen, Songbi; Luo, Lijuan
2018-03-26
The sugar transporter (STP) gene family encodes monosaccharide transporters that contain 12 transmembrane domains and belong to the major facilitator superfamily. STP genes play critical roles in monosaccharide distribution and participate in diverse plant metabolic processes. To investigate the potential roles of STPs in cassava (Manihot esculenta) tuber root growth, genome-wide identification and expression and functional analyses of the STP gene family were performed in this study. A total of 20 MeSTP genes (MeSTP1-20) containing the Sugar_tr conserved motifs were identified from the cassava genome, which could be further classified into four distinct groups in the phylogenetic tree. The expression profiles of the MeSTP genes explored using RNA-seq data showed that most of the MeSTP genes exhibited tissue-specific expression, and 15 out of 20 MeSTP genes were mainly expressed in the early storage root of cassava. qRT-PCR analysis further confirmed that most of the MeSTPs displayed higher expression in roots after 30 and 40 days of growth, suggesting that these genes may be involved in the early growth of tuber roots. Although all the MeSTP proteins exhibited plasma membrane localization, variations in monosaccharide transport activity were found through a complementation analysis in a yeast (Saccharomyces cerevisiae) mutant defective in monosaccharide uptake. Among them, MeSTP2, MeSTP15, and MeSTP19 were able to efficiently complement the uptake of five monosaccharides in the yeast mutant, while MeSTP3 and MeSTP16 only grew on medium containing galactose, suggesting that these two MeSTP proteins are transporters specific for galactose. This study provides significant insights into the potential functions of MeSTPs in early tuber root growth, which possibly involves the regulation of monosaccharide distribution.
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
Discover mouse gene coexpression landscapes using dictionary learning and sparse coding.
Li, Yujie; Chen, Hanbo; Jiang, Xi; Li, Xiang; Lv, Jinglei; Peng, Hanchuan; Tsien, Joe Z; Liu, Tianming
2017-12-01
Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will illuminate the higher-order transcriptome organization and offer genetic foundations of functional circuitry. Here, using dictionary learning and sparse coding, we derived coexpression networks from the spatially resolved, anatomically comprehensive in situ hybridization data of the Allen Mouse Brain Atlas dataset. The key idea is that if two genes use the same dictionary to represent their original signals, then their expression must share similar patterns, and they can therefore be considered "coexpressed." For each network, we have simultaneous knowledge of the spatial distributions, the genes in the network, and the extent to which a particular gene conforms to the coexpression pattern. Gene ontologies and comparisons with published gene lists reveal biologically identified coexpression networks, some of which correspond to major cell types, biological pathways, and/or anatomical regions.
Microevolutionary dynamics of a macroevolutionary key innovation in a Lepidopteran herbivore
2010-01-01
Background A molecular population genetics understanding is central to the study of ecological and evolutionary functional genomics. Population genetics identifies genetic variation and its distribution within and among populations, it reveals the demographic history of the populations studied, and can provide indirect insights into historical selection dynamics. Here we use this approach to examine the demographic and selective dynamics acting on a candidate gene involved in plant-insect interactions. Previous work documents the macroevolutionary and historical ecological importance of the nitrile-specifier protein (Nsp), which facilitated the host shift of Pieridae butterflies onto Brassicales host plants ~80 Myr ago. Results Here we assess the microevolutionary dynamics of the Nsp gene by studying the within- and among-population variation at Nsp and reference genes in the butterfly Pieris rapae (Small Cabbage White). Nsp exhibits unexpectedly high amounts of amino acid polymorphism, unequally distributed across the gene. The vast majority of genetic variation exists within populations, with little to no genetic differentiation among four populations on two continents. A comparison of synonymous and nonsynonymous substitutions in 70 randomly chosen genes among P. rapae and its close relative Pieris brassicae (Large Cabbage White) finds Nsp to have a significantly relaxed functional constraint compared to housekeeping genes. We find strong evidence for a recent population expansion and no role for strong purifying or directional selection upon the Nsp gene. Conclusions The microevolutionary dynamics of the Nsp gene in P. rapae are dominated by recent population expansion and variation in functional constraint across the repeated domains of the Nsp gene.
While the high amounts of amino acid diversity suggest there may be significant functional differences among allelic variants segregating within populations, indirect tests of selection could not conclusively identify a signature of historical selection. The importance of using this information for planning future studies of potential performance and fitness consequences of the observed variation is discussed. PMID:20181249
Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H.; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C.
2014-01-01
Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species. PMID:25223767
Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J
2007-06-01
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
Dimond, James L; Roberts, Steven B
2016-04-01
DNA methylation is an epigenetic mark that plays an inadequately understood role in gene regulation, particularly in nonmodel species. Because it can be influenced by the environment, DNA methylation may contribute to the ability of organisms to acclimatize and adapt to environmental change. We evaluated the distribution of gene body methylation in reef-building corals, a group of organisms facing significant environmental threats. Gene body methylation in six species of corals was inferred from in silico transcriptome analysis of CpG O/E, an estimate of germline DNA methylation that is highly correlated with patterns of methylation enrichment. Consistent with what has been documented in most other invertebrates, all corals exhibited bimodal distributions of germline methylation suggestive of distinct fractions of genes with high and low levels of methylation. The hypermethylated fractions were enriched with genes with housekeeping functions, while genes with inducible functions were highly represented in the hypomethylated fractions. High transcript abundance was associated with intermediate levels of methylation. In three of the coral species, we found that genes differentially expressed in response to thermal stress and ocean acidification exhibited significantly lower levels of methylation. These results support a link between gene body hypomethylation and transcriptional plasticity that may point to a role of DNA methylation in the response of corals to environmental change. © 2015 John Wiley & Sons Ltd.
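The CpG O/E statistic mentioned above can be sketched in a few lines of Python. This is a minimal illustration using one common definition (observed CpG count divided by the count expected from the C and G frequencies of the sequence); the exact normalization used in the study is not stated in the abstract and may differ, and the function name is hypothetical. Low values point to historically high germline methylation, because methylated cytosines in CpG context tend to deaminate to thymine over evolutionary time.

```python
def cpg_oe(seq):
    """Observed/expected CpG ratio for one nucleotide sequence.

    Uses the common definition (an assumption, not necessarily the
    study's exact formula):
        CpG O/E = (#CpG * L) / (#C * #G)
    where L is the sequence length. Returns None when the ratio is
    undefined because the sequence has no C or no G.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        return None
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(s) / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences only; real input would be transcript sequences.
    for name, seq in [("toy_a", "CCGG"), ("toy_b", "ACGTCGCG"), ("toy_c", "AATT")]:
        print(name, cpg_oe(seq))
```

Applied to every transcript in a transcriptome, the resulting values could then be inspected as a histogram to look for the bimodal pattern the abstract describes.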
Modeling gene expression measurement error: a quasi-likelihood approach
Strimmer, Korbinian
2003-01-01
Background Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. 
In an example it also improved the power of tests to identify differential expression. PMID:12659637
2012-01-01
Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. 
The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. By combining the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members in the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Aporntewan, Chatchawit; Pin-on, Piyapat; Chaiyaratana, Nachol; Pongpanich, Monnat; Boonyaratanakornkit, Viroj; Mutirangura, Apiwat
2013-10-01
A-repeats are the simplest form of tandem repeats and are found ubiquitously throughout genomes. These mononucleotide repeats have been widely believed to be non-functional 'junk' DNA. However, studies in yeasts suggest that A-repeats play crucial biological functions, and their role in humans remains largely unknown. Here, we showed a non-random pattern of distribution of sense A- and T-repeats within 20 kb around transcription start sites (TSSs) in the human genome. Different distributions of these repeats are observed upstream and downstream of TSSs. Sense A-repeats are enriched upstream, whereas sense T-repeats are enriched downstream of TSSs. This enrichment directly correlates with repeat size. Genes with different functions contain different lengths of repeats. In humans, tissue-specific genes are enriched for short repeats of <10 bp, whereas housekeeping genes are enriched for long repeats of ≥10 bp. We demonstrated that DICER1 and Argonaute proteins are required for the cis-regulatory role of A-repeats. Moreover, in the presence of a synthetic polymer that mimics an A-repeat, protein binding to A-repeats was blocked, resulting in a dramatic change in the expression of genes containing upstream A-repeats. Our findings suggest a length-dependent cis-regulatory function of A-repeats and that Argonaute proteins serve as trans-acting factors, binding to A-repeats.
NEAT: an efficient network enrichment analysis test.
Signorelli, Mirko; Vinciotti, Veronica; Wit, Ernst C
2016-09-05
Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat ).
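To make the role of the hypergeometric null distribution concrete, the sketch below computes a generic upper-tail probability using only the Python standard library. It is not the implementation in the R package neat: the function name `hypergeom_upper_tail` is hypothetical, and how the population size, number of successes, and number of draws map onto network quantities is not spelled out in the abstract, so that mapping is left to the caller.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, successes: int, draws: int) -> float:
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    Generic over-representation p-value; the mapping of these parameters
    to network quantities is an assumption the caller must supply.
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("need 0 <= successes, draws <= population")
    total = comb(population, draws)
    upper = min(successes, draws)
    # Sum the probability mass from k up to the largest attainable value.
    mass = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return mass / total


if __name__ == "__main__":
    # Illustrative numbers only: 12 observed hits in 40 draws from a
    # population of 1000 that contains 100 "success" items.
    print(hypergeom_upper_tail(12, population=1000, successes=100, draws=40))
```

Because the tail probability is available in closed form, no resampling is needed, which is consistent with the speed advantage over resampling-based tests that the abstract reports.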
Dasgupta, Ujjaini; Dixit, Bharat L; Rusch, Melissa; Selleck, Scott; The, Inge
2007-08-01
Heparan sulfate proteoglycans play a vital role in signaling of various growth factors in both Drosophila and vertebrates. In Drosophila, mutations in the tout velu (ttv) gene, a homolog of the mammalian EXT1 tumor suppressor gene, lead to abrogation of glycosaminoglycan (GAG) biosynthesis. This impairs distribution and signaling activities of various morphogens such as Hedgehog (Hh), Wingless (Wg), and Decapentaplegic (Dpp). Mutations in members of the exostosin (EXT) gene family cause hereditary multiple exostosis in humans, leading to bone outgrowths and tumors. In this study, we provide genetic and biochemical evidence that the human EXT1 (hEXT1) gene is conserved through species and can functionally complement the ttv mutation in Drosophila. The hEXT1 gene was able to rescue a ttv null mutant to adulthood and restore GAG biosynthesis.
Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei
2012-07-01
To screen out important genes from large gene microarray data sets obtained after nerve injury, we combined the gene ontology (GO) method with computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of these screened-out genes. Data mining and gene ontology analysis of gene chip data GSE26350 was carried out with MATLAB software. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' different GO terms and positions on the score map of principal components. Functional interference was employed to disrupt the normal binding of Cd44 and one of its ligands, chondroitin sulfate C (CSC), to observe neurite extension. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular transducer activity, receptor activity, protein binding, and other molecular function GO terms. Cd44 is one of six effector protein genes, and attracted our attention because of its functional diversity. After adding different reagents to the medium to interfere with the normal binding of CSC and Cd44, varying degrees of relief of CSC's inhibition of neurite extension were observed. CSC can inhibit neurite extension through binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.
Patterns of linkage disequilibrium and haplotype distribution in disease candidate genes.
Long, Ji-Rong; Zhao, Lan-Juan; Liu, Peng-Yuan; Lu, Yan; Dvornyk, Volodymyr; Shen, Hui; Liu, Yong-Jun; Zhang, Yuan-Yuan; Xiong, Dong-Hai; Xiao, Peng; Deng, Hong-Wen
2004-05-24
The adequacy of association studies for complex diseases depends critically on the existence of linkage disequilibrium (LD) between functional alleles and surrounding SNP markers. We examined the patterns of LD and haplotype distribution in eight candidate genes for osteoporosis and/or obesity using 31 SNPs in 1,873 subjects. These eight genes are apolipoprotein E (APOE), type I collagen alpha1 (COL1A1), estrogen receptor-alpha (ER-alpha), leptin receptor (LEPR), parathyroid hormone (PTH)/PTH-related peptide receptor type 1 (PTHR1), transforming growth factor-beta1 (TGF-beta1), uncoupling protein 3 (UCP3), and vitamin D (1,25-dihydroxyvitamin D3) receptor (VDR). Yin yang haplotypes, two high-frequency haplotypes composed of completely mismatching SNP alleles, were examined. To quantify LD patterns, two common measures of LD, D' and r2, were calculated for the SNPs within the genes. The haplotype distribution varied in the different genes. Yin yang haplotypes were observed only in PTHR1 and UCP3. D' ranged from 0.020 to 1.000 with an average of 0.475, whereas the average r2 was 0.158 (ranging from 0.000 to 0.883). A decay of LD was observed as the intermarker distance increased; however, there was a great difference in the LD characteristics of different genes, and even of different regions within a gene. The differences in haplotype distributions and LD patterns among the genes underscore the importance of characterizing genomic regions of interest prior to association studies.
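The two LD measures named in the abstract above, D' and r2, can be computed for a pair of biallelic SNPs from haplotype and allele frequencies using their standard textbook definitions. The sketch below is illustrative only: the function name `ld_measures` and the example frequencies are hypothetical and are not taken from the study.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: allele B occurs only on A-bearing haplotypes,
    # so D' is maximal even though the allele frequencies differ.
    print(ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2))
```

The example shows why the two measures can diverge as they do in the reported averages: D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r2 reaches 1 only when the allele frequencies at the two loci also match.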
NASA Astrophysics Data System (ADS)
Zhao, Xueqin; Wang, Jun; Tao, SiJie; Ye, Ting; Kong, Xiangdong; Ren, Lei
2016-04-01
The non-viral gene delivery system is an attractive alternative to cancer therapy. The clinical success of non-viral gene delivery is hampered by transfection efficiency and tumor targeting, which can be individually overcome by addition of functional modules such as cell penetration or targeting. Here, we first engineered the multifunctional gelatin/silica (GS) nanovectors with separately controllable modules, including tumor-targeting aptamer AGRO100, membrane-destabilizing peptide HA2, and polyethylene glycol (PEG), and then studied their bio-distribution and in vivo transfection efficiencies by contrast resonance imaging (CRI). The results suggest that the sizes and zeta potentials of multifunctional gelatin/silica nanovectors were 203-217 nm and 2-8 mV, respectively. Functional GS-PEG nanoparticles mainly accumulated in the liver and tumor, with the lowest uptake by the heart and brain. Moreover, the synergistic effects of tumor-targeting aptamer AGRO100 and fusogenic peptide HA2 promoted the efficient cellular internalization in the tumor site. More importantly, the combined use of AGRO100 and PEG enhanced tumor gene expression specificity and effectively reduced toxicity in reticuloendothelial system (RES) organs after intravenous injection. Additionally, low accumulation of GS-PEG was observed in the heart tissues with high gene expression levels, which could provide opportunities for non-invasive gene therapy.
Genetic differences in human circadian clock genes among worldwide populations.
Ciarleglio, Christopher M; Ryckman, Kelli K; Servick, Stein V; Hida, Akiko; Robbins, Sam; Wells, Nancy; Hicks, Jennifer; Larson, Sydney A; Wiedermann, Joshua P; Carver, Krista; Hamilton, Nalo; Kidd, Kenneth K; Kidd, Judith R; Smith, Jeffrey R; Friedlaender, Jonathan; McMahon, Douglas G; Williams, Scott M; Summar, Marshall L; Johnson, Carl Hirschie
2008-08-01
The daily biological clock regulates the timing of sleep and physiological processes that are of fundamental importance to human health, performance, and well-being. Environmental parameters of relevance to biological clocks include (1) daily fluctuations in light intensity and temperature, and (2) seasonal changes in photoperiod (day length) and temperature; these parameters vary dramatically as a function of latitude and locale. In wide-ranging species other than humans, natural selection has genetically optimized adaptiveness along latitudinal clines. Is there evidence for selection of clock gene alleles along latitudinal/photoperiod clines in humans? A number of polymorphisms in the human clock genes Per2, Per3, Clock, and AANAT have been reported as alleles that could be subject to selection. In addition, this investigation discovered several novel polymorphisms in the human Arntl and Arntl2 genes that may have functional impact upon the expression of these clock transcriptional factors. The frequency distribution of these clock gene polymorphisms is reported for diverse populations of African Americans, European Americans, Ghanaians, Han Chinese, and Papua New Guineans (including 5 subpopulations within Papua New Guinea). There are significant differences in the frequency distribution of clock gene alleles among these populations. Population genetic analyses indicate that these differences are likely to arise from genetic drift rather than from natural selection.
Integrative and conjugative elements and their hosts: composition, distribution and organization
Touchon, Marie; Rocha, Eduardo P. C.
2017-01-01
Abstract Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and was widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICE), even though these are more abundant, have been studied in only a few model systems. Comparative genomics of ICE has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species’ pan-genome. We delimited 200 ICEs and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even if the gene repertoires of ICEs can be grouped according to conjugation type. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, most of which are of unknown function. PMID:28911112
Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs
Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana
2015-01-01
Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548
Cyclomodulins in Urosepsis Strains of Escherichia coli
Dubois, Damien; Delmas, Julien; Cady, Anne; Robin, Frédéric; Sivignon, Adeline; Oswald, Eric; Bonnet, Richard
2010-01-01
Determinants of urosepsis in Escherichia coli remain incompletely defined. Cyclomodulins (CMs) are a growing functional family of toxins that hijack the eukaryotic cell cycle. Four cyclomodulin types are currently known in E. coli: cytotoxic necrotizing factors (CNFs), cycle-inhibiting factor (Cif), cytolethal distending toxins (CDTs), and the pks-encoded toxin. In the present study, the distribution of CM-encoding genes and the functionality of these toxins were investigated in 197 E. coli strains isolated from patients with community-acquired urosepsis (n = 146) and from uninfected subjects (n = 51). This distribution was analyzed in relation to the phylogenetic background, clinical origin, and antibiotic resistance of the strains. It emerged from this study that strains harboring the pks island and the cnf1 gene (i) were strongly associated with the B2 phylogroup (P, <0.001), (ii) frequently harbored both toxin-encoding genes in phylogroup B2 (33%), and (iii) were predictive of a urosepsis origin (P, <0.001 to 0.005). However, the prevalences of the pks island among phylogroup B2 strains, in contrast to those of the cnf1 gene, were not significantly different between fecal and urosepsis groups, suggesting that the pks island is more important for the colonization process and the cnf1 gene for virulence. pks- or cnf1-harboring strains were significantly associated with susceptibility to antibiotics (amoxicillin, cotrimoxazole, and quinolones [P, <0.001 to 0.043]). Otherwise, only 6% and 1% of all strains harbored the cdtB and cif genes, respectively, with no particular distribution by phylogenetic background, antimicrobial susceptibility, or clinical origin. PMID:20375237
Liu, Hongyun; Qin, Jiajia; Fan, Hui; Cheng, Jinjin; Li, Lin; Liu, Zheng
2017-07-01
As members of the GRAS gene family, SCARECROW-LIKE (SCL) genes encode transcriptional regulators that are involved in plant information transmission and signal transduction. In this study, 44 SCL genes, including two SCARECROW genes, were identified in millet and found to be distributed on eight chromosomes, with none on chromosome 6. All the millet genes contain motifs 6-8, indicating that these motifs have been conserved during evolution. SCL genes of millet were divided into eight groups based on the phylogenetic relationship and classification of Arabidopsis SCL genes. Several putative orthologs of millet genes were identified in Arabidopsis, maize and rice. High-throughput RNA sequencing revealed that the expression of millet SCL genes in root, stem, leaf, spica, and along the leaf gradient varied greatly. Analyses combining the gene expression patterns, gene structures, motif compositions, promoter cis-element identification, alternative splicing of transcripts and phylogenetic relationships of SCL genes indicate that these genes may have diverse functions. Functionally characterized SCL genes in maize, rice and Arabidopsis provide clues for the future characterization of their homologues in millet. To the best of our knowledge, this is the first study of millet SCL genes at the genome-wide level. Our work provides a useful platform for functional analysis of SCL genes in millet, a model crop for C4 photosynthesis and bioenergy studies.
Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne
2015-02-10
Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Liang, Yuting; Zhao, Huihui; Zhang, Xu; Zhou, Jizhong; Li, Guanghe
2014-07-15
To compare the functional gene structure and diversity of microbial communities in saline-alkali and slightly acidic oil-contaminated sites, 40 soil samples were collected from two typical oil exploration sites in North and South China and analyzed with a comprehensive functional gene array (GeoChip 3.0). The overall microbial pattern was significantly different between the two sites, and a more divergent pattern was observed in slightly acidic soils. Response ratio was calculated to compare the microbial functional genes involved in organic contaminant degradation and carbon, nitrogen, phosphorus, and sulfur cycling. The results indicated a significantly low abundance of most genes involved in organic contaminant degradation and in the cycling of nitrogen and phosphorus in saline-alkali soils. By contrast, most carbon degradation genes and all carbon fixation genes had similar abundance at both sites. Based on the relationship between the environmental variables and microbial functional structure, pH was the major factor influencing the microbial distribution pattern in the two sites. This study demonstrated that microbial functional diversity and heterogeneity in oil-contaminated environments can vary significantly in relation to local environmental conditions. The limitation of nitrogen and phosphorus and the low degradation capacity of organic contaminant should be carefully considered, particularly in most oil-exploration sites with saline-alkali soils. Copyright © 2014 Elsevier B.V. All rights reserved.
Song, Xiaowen; Huang, Fei; Liu, Juanjuan; Li, Chengjun; Gao, Shanshan; Wu, Wei; Zhai, Mengfan; Yu, Xiaojuan; Xiong, Wenfeng; Xie, Jia
2017-01-01
Abstract Cytosine DNA methylation is a vital epigenetic regulator of eukaryotic development. Whether this epigenetic modification occurs in Tribolium castaneum has been controversial, and its distribution pattern and functions have not been established. Here, using bisulphite sequencing (BS-Seq), we confirmed the existence of DNA methylation and described the methylation profiles of the four life stages of T. castaneum. In the T. castaneum genome, both symmetrical CpG and non-CpG methylcytosines were observed. Symmetrical CpG methylation, which was catalysed by DNMT1 and made up a small part of the T. castaneum methylome, was primarily enriched in gene bodies and was positively correlated with gene expression levels. Asymmetrical non-CpG methylation, which was predominant in the methylome, was strongly concentrated in intergenic regions and introns but absent from exons. Gene body methylation was negatively correlated with gene expression levels. The distribution pattern and functions of this type of methylation were similar only to the methylome of Drosophila melanogaster, which further supports the existence of a novel methyltransferase in the two species responsible for this type of methylation. This first life-cycle methylome of T. castaneum reveals a novel and unique methylation pattern, which will contribute to the further understanding of the variety and functions of DNA methylation in eukaryotes. PMID:28449092
ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.
Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie
2014-02-01
Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.
Microbial Functional Gene Diversity Predicts Groundwater Contamination and Ecosystem Functioning.
He, Zhili; Zhang, Ping; Wu, Linwei; Rocha, Andrea M; Tu, Qichao; Shi, Zhou; Wu, Bo; Qin, Yujia; Wang, Jianjun; Yan, Qingyun; Curtis, Daniel; Ning, Daliang; Van Nostrand, Joy D; Wu, Liyou; Yang, Yunfeng; Elias, Dwayne A; Watson, David B; Adams, Michael W W; Fields, Matthew W; Alm, Eric J; Hazen, Terry C; Adams, Paul D; Arkin, Adam P; Zhou, Jizhong
2018-02-20
Contamination from anthropogenic activities has significantly impacted Earth's biosphere. However, knowledge about how environmental contamination affects the biodiversity of groundwater microbiomes and ecosystem functioning remains very limited. Here, we used a comprehensive functional gene array to analyze groundwater microbiomes from 69 wells at the Oak Ridge Field Research Center (Oak Ridge, TN), representing a wide pH range and uranium, nitrate, and other contaminants. We hypothesized that the functional diversity of groundwater microbiomes would decrease as environmental contamination (e.g., uranium or nitrate) increased or at low or high pH, while some specific populations capable of utilizing or resistant to those contaminants would increase, and thus, such key microbial functional genes and/or populations could be used to predict groundwater contamination and ecosystem functioning. Our results indicated that functional richness/diversity decreased as uranium (but not nitrate) increased in groundwater. In addition, about 5.9% of specific key functional populations targeted by a comprehensive functional gene array (GeoChip 5) increased significantly (P < 0.05) as uranium or nitrate increased, and their changes could be used to successfully predict uranium and nitrate contamination and ecosystem functioning. This study indicates great potential for using microbial functional genes to predict environmental contamination and ecosystem functioning. IMPORTANCE Disentangling the relationships between biodiversity and ecosystem functioning is an important but poorly understood topic in ecology. Predicting ecosystem functioning on the basis of biodiversity is even more difficult, particularly with microbial biomarkers. As an exploratory effort, this study used key microbial functional genes as biomarkers to provide predictive understanding of environmental contamination and ecosystem functioning. 
The results indicated that the overall functional gene richness/diversity decreased as uranium increased in groundwater, while specific key microbial guilds increased significantly as uranium or nitrate increased. These key microbial functional genes could be used to successfully predict environmental contamination and ecosystem functioning. This study represents a significant advance in using functional gene markers to predict the spatial distribution of environmental contaminants and ecosystem functioning toward predictive microbial ecology, which is an ultimate goal of microbial ecology. Copyright © 2018 He et al.
Gene-culture coevolution in whales and dolphins.
Whitehead, Hal
2017-07-24
Whales and dolphins (Cetacea) have excellent social learning skills as well as a long and strong mother-calf bond. These features produce stable cultures, and, in some species, sympatric groups with different cultures. There is evidence and speculation that this cultural transmission of behavior has affected gene distributions. Culture seems to have driven killer whales into distinct ecotypes, which may be incipient species or subspecies. There are ecotype-specific signals of selection in functional genes that correspond to cultural foraging behavior and habitat use by the different ecotypes. The five species of whale with matrilineal social systems have remarkably low diversity of mtDNA. Cultural hitchhiking, the transmission of functionally neutral genes in parallel with selective cultural traits, is a plausible hypothesis for this low diversity, especially in sperm whales. In killer whales the ecotype divisions, together with founding bottlenecks, selection, and cultural hitchhiking, likely explain the low mtDNA diversity. Several cetacean species show habitat-specific distributions of mtDNA haplotypes, probably the result of mother-offspring cultural transmission of migration routes or destinations. In bottlenose dolphins, remarkable small-scale differences in haplotype distribution result from maternal cultural transmission of foraging methods, and large-scale redistributions of sperm whale cultural clans in the Pacific have likely changed mitochondrial genetic geography. With the acceleration of genomics new results should come fast, but understanding gene-culture coevolution will be hampered by the measured pace of research on the socio-cultural side of cetacean biology.
Capturing novel mouse genes encoding chromosomal and other nuclear proteins.
Tate, P; Lee, M; Tweedie, S; Skarnes, W C; Bickmore, W A
1998-09-01
The burgeoning wealth of gene sequences contrasts with our ignorance of gene function. One route to assigning function is by determining the sub-cellular location of proteins. We describe the identification of mouse genes encoding proteins that are confined to nuclear compartments by splicing endogenous gene sequences to a promoterless betageo reporter, using a gene trap approach. Mouse ES (embryonic stem) cell lines were identified that express betageo fusions located within sub-nuclear compartments, including chromosomes, the nucleolus and foci containing splicing factors. The sequences of 11 trapped genes were ascertained, and characterisation of endogenous protein distribution in two cases confirmed the validity of the approach. Three novel proteins concentrated within distinct chromosomal domains were identified, one of which appears to be a serine/threonine kinase. The sequence of a gene whose product co-localises with spliceosome components suggests that this protein may be an E3 ubiquitin-protein ligase. The majority of the other genes isolated represent novel genes. This approach is shown to be a powerful tool for identifying genes encoding novel proteins with specific sub-nuclear localisations and exposes our ignorance of the protein composition of the nucleus. Motifs in two of the isolated genes suggest new links between cellular regulatory mechanisms (ubiquitination and phosphorylation) and mRNA splicing and chromosome structure/function.
Xiong, Jinbo; Wu, Liyou; Tu, Shuxin; Van Nostrand, Joy D.; He, Zhili; Zhou, Jizhong; Wang, Gejiao
2010-01-01
To understand how microbial communities and functional genes respond to arsenic contamination in the rhizosphere of Pteris vittata, five soil samples with different arsenic contamination levels were collected from the rhizosphere of P. vittata and nonrhizosphere areas and investigated by Biolog, geochemical, and functional gene microarray (GeoChip 3.0) analyses. Biolog analysis revealed that the uncontaminated soil harbored the greatest diversity of sole-carbon utilization abilities and that arsenic contamination decreased the metabolic diversity, while rhizosphere soils had higher metabolic diversities than did the nonrhizosphere soils. GeoChip 3.0 analysis showed low proportions of overlapping genes across the five soil samples (16.52% to 45.75%). The uncontaminated soil had a higher heterogeneity and more unique genes (48.09%) than did the arsenic-contaminated soils. Arsenic resistance, sulfur reduction, phosphorus utilization, and denitrification genes were remarkably distinct between P. vittata rhizosphere and nonrhizosphere soils, which provides evidence for a strong linkage among the level of arsenic contamination, the rhizosphere, and the functional gene distribution. Canonical correspondence analysis (CCA) revealed that arsenic is the main driver in reducing the soil functional gene diversity; however, organic matter and phosphorus also have significant effects on the soil microbial community structure. The results implied that rhizobacteria play an important role during soil arsenic uptake and hyperaccumulation processes of P. vittata. PMID:20833780
Detecting Genetic Interactions for Quantitative Traits Using m-Spacing Entropy Measure
Yee, Jaeyong; Kwon, Min-Seok; Park, Taesung; Park, Mira
2015-01-01
A number of statistical methods for detecting gene-gene interactions have been developed in genetic association studies with binary traits. However, many phenotype measures are intrinsically quantitative, and categorizing continuous traits may not always be straightforward or meaningful. Association of gene-gene interactions with an observed distribution of such phenotypes needs to be investigated directly without categorization. Information gain based on an entropy measure has previously been successful in identifying genetic associations with binary traits. We extend the usefulness of this information gain by proposing a nonparametric evaluation method of conditional entropy of a quantitative phenotype associated with a given genotype. Hence, the information gain can be obtained for any phenotype distribution. Because no functional form, such as a Gaussian, is assumed for the entire distribution of a trait or a given genotype, this method is expected to be robust enough to be applied to any phenotypic association data. Here, we show its use to successfully identify the main effect, as well as the genetic interactions, associated with a quantitative trait. PMID:26339620
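The idea of an entropy-based information gain for a quantitative trait can be illustrated with a short sketch. It assumes the classic Vasicek form of the m-spacing entropy estimator and a default of m ≈ √n; the abstract does not give the authors' exact estimator or choice of m, so the function names, the default m, and the tie floor below are illustrative assumptions, not the published method.

```python
import math
from collections import defaultdict


def m_spacing_entropy(values, m=None):
    """Vasicek-style m-spacing estimate of differential entropy (nats).

    Assumption: m defaults to round(sqrt(n)); the paper may choose m differently.
    """
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    if m is None:
        m = round(math.sqrt(n))
    m = max(1, min(m, (n - 1) // 2))  # keep the window narrower than the sample
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]          # clamp the window at the lower boundary
        hi = x[min(i + m, n - 1)]      # clamp the window at the upper boundary
        spacing = max(hi - lo, 1e-12)  # floor avoids log(0) when values are tied
        total += math.log(n / (2 * m) * spacing)
    return total / n


def information_gain(phenotype, genotype, m=None):
    """H(Y) minus the genotype-weighted average of H(Y | genotype)."""
    groups = defaultdict(list)
    for y, g in zip(phenotype, genotype):
        groups[g].append(y)
    n = len(phenotype)
    conditional = sum(
        len(ys) / n * m_spacing_entropy(ys, m) for ys in groups.values()
    )
    return m_spacing_entropy(phenotype, m) - conditional


# Hypothetical toy data: the trait shifts by 10 units with genotype (dependent case)
dependent_y = [i * 0.1 for i in range(30)] + [10 + i * 0.1 for i in range(30)]
dependent_g = [0] * 30 + [1] * 30
# Genotype labels alternate along an evenly spaced trait (independent case)
independent_y = [i * 0.1 for i in range(60)]
independent_g = [i % 2 for i in range(60)]

ig_dependent = information_gain(dependent_y, dependent_g)
ig_independent = information_gain(independent_y, independent_g)
```

For a gene-gene interaction, the same function could be called with a combined genotype label (for example a tuple of the two single-locus genotypes), provided each combination has enough observations for the spacing estimate to be meaningful.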
Stojanova, Daniela; Ceci, Michelangelo; Malerba, Donato; Dzeroski, Saso
2013-09-26
Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function.
Our newly developed method for HMC takes into account network information in the learning phase: When used for gene function prediction in the context of PPI networks, the explicit consideration of network autocorrelation increases the predictive performance of the learned models. Overall, we found that this holds for different gene features/ descriptions, functional annotation schemes, and PPI networks: Best results are achieved when the PPI network is dense and contains a large proportion of function-relevant interactions.
[SSR loci information analysis in transcriptome of Andrographis paniculata].
Li, Jun-Ren; Chen, Xiu-Zhen; Tang, Xiao-Ting; He, Rui; Zhan, Ruo-Ting
2018-06-01
To study the SSR loci information and develop molecular markers, a total of 43 683 Unigenes in the transcriptome of Andrographis paniculata were used to explore SSRs. The distribution frequency of SSRs and the basic characteristics of repeat motifs were analyzed using MicroSAtellite software; SSR primers were designed with Primer 3.0 software and then validated by PCR. Moreover, the gene function analysis of SSR Unigenes was obtained by Blast. The results showed that 14 135 SSR loci were found in the transcriptome of A. paniculata, distributed across 9 973 Unigenes with a distribution frequency of 32.36%. Di-nucleotide and tri-nucleotide repeats were the main types, accounting for 75.54% of all SSRs. The repeat motifs AT/AT and CCG/CGG were the predominant di-nucleotide and tri-nucleotide repeat types, respectively. A total of 4 740 pairs of SSR primers with the potential to produce polymorphism were designed for marker development. Ten of 20 randomly picked primer pairs produced fragments of the expected molecular size. The gene functions of Unigenes containing SSRs were mostly related to the basic metabolism of A. paniculata. The SSR markers in the transcriptome of A. paniculata show rich types, strong specificity and high potential for polymorphism, which will benefit candidate gene mining and marker-assisted breeding. Copyright© by the Chinese Pharmaceutical Association.
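The kind of scan that an SSR search performs can be sketched with a regular expression. This is a minimal illustration for perfect di- and tri-nucleotide repeats only; the minimum repeat counts (6 and 5) are assumed values, since the abstract does not report the thresholds used, and `find_ssrs` is a hypothetical helper rather than part of the MicroSAtellite software.

```python
import re

# Assumed minimum repeat counts per motif length; not taken from the abstract.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) exact copies
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            repeats = len(match.group(1)) // unit_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Hypothetical test sequence: (AT)7 starting at index 3 and (CCG)5 at index 22
example = "GGC" + "AT" * 7 + "CCTGA" + "CCG" * 5 + "TTT"
ssr_hits = find_ssrs(example)
```

A fuller pipeline would also merge complementary motifs (for example reporting CCG and CGG as one class, as in the CCG/CGG notation above) and handle compound or imperfect repeats, which this sketch does not attempt.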
Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent
Zhu, Sha; Degnan, James H.
2017-01-01
Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. PMID:27780899
Bright, Lydia J; Gout, Jean-Francois; Lynch, Michael
2017-04-15
New gene functions arise within existing gene families as a result of gene duplication and subsequent diversification. To gain insight into the steps that led to the functional diversification of paralogues, we tracked duplicate retention patterns, expression-level divergence, and subcellular markers of functional diversification in the Rab GTPase gene family in three Paramecium aurelia species. After whole-genome duplication, Rab GTPase duplicates are more highly retained than other genes in the genome but appear to be diverging more rapidly in expression levels, consistent with early steps in functional diversification. However, by localizing specific Rab proteins in Paramecium cells, we found that paralogues from the two most recent whole-genome duplications had virtually identical localization patterns, and that less closely related paralogues showed evidence of both conservation and diversification. The functionally conserved paralogues appear to target to compartments associated with both endocytic and phagocytic recycling functions, confirming evolutionary and functional links between the two pathways in a divergent eukaryotic lineage. Because the functionally diversifying paralogues are still closely related to and derived from a clade of functionally conserved Rab11 genes, we were able to pinpoint three specific amino acid residues that may be driving the change in the localization and thus the function in these proteins. © 2017 Bright et al. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Rousidou, Constantina; Karaiskos, Dionysis; Myti, Despoina; Karanasios, Evangelos; Karas, Panagiotis A; Tourna, Maria; Tzortzakakis, Emmanuel A; Karpouzas, Dimitrios G
2017-01-01
Synthetic carbamates constitute a significant pesticide group with oxamyl being a leading compound in the nematicide market. Oxamyl degradation in soil is mainly microbially mediated. However, the distribution and function of carbamate hydrolase genes (cehA, mcd, cahA) associated with the soil biodegradation of carbamates is not yet clear. We studied oxamyl degradation in 16 soils from a potato monoculture area in Greece where oxamyl is regularly used. Oxamyl showed low persistence (DT50 2.4-26.7 days). q-PCR detected the cehA and mcd genes in 10 and three soils, respectively. The abundance of the cehA gene was positively correlated with pH, while both cehA abundance and pH were negatively correlated with oxamyl DT50. Amongst the carbamates used in the study region, oxamyl stimulated the abundance and expression only of the cehA gene, while carbofuran stimulated the abundance and expression of both genes. The cehA gene was also detected in pristine soils upon repeated treatments with oxamyl and carbofuran and only in soils with pH ≥7.2, where the most rapid degradation of oxamyl was observed. These results have major implications regarding the maintenance of carbamate hydrolase genes in soils, have practical implications regarding the agricultural use of carbamates, and provide insights into the evolution of cehA. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Mathur, Sunil; Sadana, Ajit
2015-12-01
We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, Wilcoxon signed rank test, and significance analysis of microarray (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set. © The Author(s) 2011.
RELATIONSHIP BETWEEN PHYLOGENETIC DISTRIBUTION AND GENOMIC FEATURES IN NEUROSPORA CRASSA
USDA-ARS's Scientific Manuscript database
In the post-genome era, insufficient functional annotation of predicted genes greatly restricts the potential of mining genome data. We demonstrate that an evolutionary approach, which is independent of functional annotation, has great potential as a tool for genome analysis. We chose the genome o...
Microarray-based analysis of survival of soil microbial community during ozonation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jian; Van Nostrand, Joy D.; He, Zhili
A 15 h ozonation was performed on bioremediated soil to remove recalcitrant residual oil. To monitor the survival of indigenous microorganisms in the soil during in-situ chemical oxidation (ISCO), culturing and a functional gene array, GeoChip, were used to examine the functional genes and structure of the microbial community during ozonation (0 h, 2 h, 4 h, 6 h, 10 h and 15 h). Breakthrough ozonation decreased the population of cultivable heterotrophic bacteria by about 3 orders of magnitude. The total functional gene abundance and diversity decreased during ozonation, as the number of functional genes was reduced by 48% after 15 h. However, functional genes were evenly distributed during ozonation as judged by the Shannon-Weaver Evenness index. A sharp decrease in gene number was observed in the first 6 h of ozonation followed by a slower decrease in the next 9 h, which was consistent with microbial populations measured by a culture-based method. Functional genes involved in carbon, nitrogen, phosphorus and sulfur cycling, metal resistance and organic remediation were detected in all samples. Though the pattern of gene categories detected was similar for all time points, hierarchical clustering of all functional genes and of major functional categories showed a time-serial pattern. Bacteria, archaea and fungi decreased by 96.1%, 95.1% and 91.3%, respectively, after 15 h ozonation. Deltaproteobacteria, which were reduced by 94.3%, showed the highest resistance to ozonation while Actinobacteria, reduced by 96.3%, showed the lowest resistance. Microorganisms similar to Rhodothermus, Obesumbacterium, Staphylothermus, Gluconobacter, and Enterococcus were dominant at all time points. Functional genes related to petroleum degradation decreased by 1-2 orders of magnitude. Most of the key functional genes were still detected after ozonation, allowing a rapid recovery of the microbial community after ozonation. While ozone had a large impact on the indigenous soil microorganisms, a fraction of the key functional gene-containing microorganisms survived during ozonation and kept the community functional.
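The observation that gene numbers fell while evenness stayed stable is easier to see with the index written out. The sketch below assumes the usual Pielou form of Shannon evenness, J' = H' / ln(S), computed from non-zero signal intensities; the abstract does not state which variant or normalization was applied to the GeoChip data, and the example values are made up.

```python
import math


def shannon_evenness(abundances):
    """Pielou-style evenness J' = H' / ln(S) over the non-zero abundances."""
    positive = [a for a in abundances if a > 0]
    richness = len(positive)  # S: number of detected genes (or taxa)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two detected items")
    total = sum(positive)
    shannon_h = -sum((a / total) * math.log(a / total) for a in positive)
    return shannon_h / math.log(richness)


# Hypothetical signal intensities, not data from the study
even_community = shannon_evenness([25, 25, 25, 25])
skewed_community = shannon_evenness([97, 1, 1, 1])
before = shannon_evenness([10, 20, 30])
after_uniform_loss = shannon_evenness([5, 10, 15])
```

Because the index depends only on relative proportions, a loss that hits all genes proportionally lowers abundance without changing evenness, which is one way a 48% drop in gene number could coexist with a stable evenness value.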
Sampaio, Dayanna Souza; Almeida, Juliana Rodrigues Barboza; de Jesus, Hugo E; Rosado, Alexandre S; Seldin, Lucy; Jurelevicius, Diogo
2017-11-01
Anaerobic diesel fuel Arctic (DFA) degradation has already been demonstrated in Antarctic soils. However, studies comparing the distribution of anaerobic bacterial groups and of anaerobic hydrocarbon-degrading bacteria in Antarctic soils containing different concentrations of DFA are scarce. In this study, functional genes were used to study the diversity and distribution of anaerobic hydrocarbon-degrading bacteria (bamA, assA, and bssA) and of sulfate-reducing bacteria (SRB-apsR) in highly, intermediate, and non-DFA-contaminated soils collected during the summers of 2009, 2010, and 2011 from King George Island, Antarctica. Signatures of bamA genes were detected in all soils analyzed, whereas bssA and assA were found in only 4 of 10 soils. The concentration of DFA was the main factor influencing the distribution of bamA-containing bacteria and of SRB in the analyzed soils, as shown by PCR-DGGE results. bamA sequences related to genes previously described in Desulfuromonas, Lautropia, Magnetospirillum, Sulfuritalea, Rhodovolum, Rhodomicrobium, Azoarcus, Geobacter, Ramlibacter, and Gemmatimonas genera were dominant in King George Island soils. Although DFA modulated the distribution of bamA-hosting bacteria, DFA concentration was not related to bamA abundance in the soils studied here. This result suggests that King George Island soils show functional redundancy for aromatic hydrocarbon degradation. The results obtained in this study support the hypothesis that specialized anaerobic hydrocarbon-degrading bacteria have been selected by hydrocarbon concentrations present in King George Island soils.
A functional genomic analysis of Arabidopsis thaliana PP2C clade D
USDA-ARS's Scientific Manuscript database
In the reference dicot plant Arabidopsis thaliana, the PP2C family of P-protein phosphatases includes the products of 80 genes that have been separated into 10 multi-protein clades plus six singletons. Clade D includes the products of nine genes distributed among 3 chromosomes (PPD1, At3g12620; PPD2...
Emdin, Connor A; Khera, Amit V; Chaffin, Mark; Klarin, Derek; Natarajan, Pradeep; Aragam, Krishna; Haas, Mary; Bick, Alexander; Zekavat, Seyedeh M; Nomura, Akihiro; Ardissino, Diego; Wilson, James G; Schunkert, Heribert; McPherson, Ruth; Watkins, Hugh; Elosua, Roberto; Bown, Matthew J; Samani, Nilesh J; Baber, Usman; Erdmann, Jeanette; Gupta, Namrata; Danesh, John; Chasman, Daniel; Ridker, Paul; Denny, Joshua; Bastarache, Lisa; Lichtman, Judith H; D'Onofrio, Gail; Mattera, Jennifer; Spertus, John A; Sheu, Wayne H-H; Taylor, Kent D; Psaty, Bruce M; Rich, Stephen S; Post, Wendy; Rotter, Jerome I; Chen, Yii-Der Ida; Krumholz, Harlan; Saleheen, Danish; Gabriel, Stacey; Kathiresan, Sekar
2018-04-24
Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.
Comparative whole genome transcriptome and metabolome analyses of five Klebsiella pneumonia strains.
Lee, Soojin; Kim, Borim; Yang, Jeongmo; Jeong, Daun; Park, Soohyun; Shin, Sang Heum; Kook, Jun Ho; Yang, Kap-Seok; Lee, Jinwon
2015-11-01
The integration of transcriptomics and metabolomics can provide precise information on gene-to-metabolite networks for identifying the function of novel genes. The goal of this study was to identify novel gene functions involved in 2,3-butanediol (2,3-BDO) biosynthesis by a comprehensive analysis of the transcriptome and metabolome of five mutated Klebsiella pneumoniae strains (∆wabG = SGSB100, ∆wabG∆budA = SGSB106, ∆wabG∆budB = SGSB107, ∆wabG∆budC = SGSB108, ∆wabG∆budABC = SGSB109). First, the transcriptomes of all five mutants were analyzed and the genes exhibiting reproducible changes in expression were determined. The transcriptome was well conserved among the five strains, and differences in gene expression occurred mainly in genes coding for 2,3-BDO biosynthesis (budA, budB, and budC) and the genes involved in the degradation of reactive oxygen, biosynthesis and transport of arginine, cysteine biosynthesis, sulfur metabolism, oxidoreductase reaction, and formate dehydrogenase reaction. Second, differences in the metabolome (estimated by carbon distribution, CO2 emission, and redox balance) among the five mutant strains due to gene alteration of the 2,3-BDO operon were detected. The functional genomics approach integrating metabolomics and transcriptomics in K. pneumoniae presented here provides an innovative means of identifying novel gene functions involved in 2,3-BDO biosynthesis metabolism and whole cell metabolism.
Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C
2014-09-14
Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Xu, Aishi; Li, Guang; Yang, Dong; Wu, Songfeng; Ouyang, Hongsheng; Xu, Ping; He, Fuchu
2015-12-04
Although the "missing protein" is a temporary concept in C-HPP, the biological reasons for proteins being "missing" could be an important clue in evolutionary studies. Here we classified missing-protein-encoding genes into two groups, the genes encoding PE2 proteins (with transcript evidence) and the genes encoding PE3/4 proteins (with no transcript evidence). These missing-protein-encoding genes are distributed unevenly among different chromosomes, chromosomal regions, or gene clusters. In terms of evolutionary features, PE3/4 genes tend to be young, located in nonhomologous chromosomal regions, and evolving at higher rates. Interestingly, there is a higher proportion of singletons in PE3/4 genes than in all genes (background) and in OTCSGs (organ, tissue, cell type-specific genes). More importantly, most of the paralogous PE3/4 genes belong to the newly duplicated members of the paralogous gene groups, which mainly contribute to special biological functions, such as "smell perception". These functions are heavily restricted to specific types of cells, tissues, or developmental stages, acting as the new functional requirements that facilitated the emergence of the missing-protein-encoding genes during evolution. In addition, the criteria for proteins with extremely unusual physicochemical properties were first set up based on the properties of PE2 proteins, and the evolutionary characteristics of those proteins were explored. Overall, the evolutionary analyses of missing-protein-encoding genes are expected to be highly instructive for proteomics and functional studies in the future.
HERC1 polymorphisms: population-specific variations in haplotype composition.
Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen
2009-08-01
Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. (c) 2009 John Wiley & Sons, Ltd.
Identification and function analysis of contrary genes in Dupuytren's contracture.
Ji, Xianglu; Tian, Feng; Tian, Lijie
2015-07-01
The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples, derived from fibroblasts and six healthy control samples, derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes was proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.
Signatures of combinatorial regulation in intrinsic biological noise
Warmflash, Aryeh; Dinner, Aaron R.
2008-01-01
Gene expression is controlled by the action of transcription factors that bind to DNA and influence the rate at which a gene is transcribed. The quantitative mapping between the regulator concentrations and the output of the gene is known as the cis-regulatory input function (CRIF). Here, we show how the CRIF shapes the form of the joint probability distribution of molecular copy numbers of the regulators and the product of a gene. Namely, we derive a class of fluctuation-based relations that relate the moments of the distribution to the derivatives of the CRIF. These relations are useful because they enable statistics of naturally arising cell-to-cell variations in molecular copy numbers to substitute for traditional manipulations for probing regulatory mechanisms. We demonstrate that these relations can distinguish super- and subadditive gene regulatory scenarios (molecular analogs of AND and OR logic operations) in simulations that faithfully represent bacterial gene expression. Applications and extensions to other regulatory scenarios are discussed. PMID:18981421
Genomics Review of Holocellulose Deconstruction by Aspergilli
Segato, Fernando; Damásio, André R. L.; de Lucas, Rosymar C.; Squina, Fabio M.
2014-01-01
Biomass is constructed of dense recalcitrant polymeric materials: proteins, lignin, and holocellulose, a fraction constituting fibrous cellulose wrapped in hemicellulose-pectin. Bacteria and fungi are abundant in soil and forest floors, actively recycling biomass mainly by extracting sugars from holocellulose degradation. Here we review the genome-wide contents of seven Aspergillus species and unravel hundreds of gene models encoding holocellulose-degrading enzymes. Numerous apparent gene duplications followed functional evolution, grouping similar genes into smaller coherent functional families according to specialized structural features, domain organization, biochemical activity, and genus genome distribution. Aspergilli contain about 37 cellulase gene models, clustered in two mechanistic categories: 27 hydrolyze and 10 oxidize glycosidic bonds. Within the oxidative enzymes, we found two cellobiose dehydrogenases that produce oxygen radicals utilized by eight lytic polysaccharide monooxygenases that oxidize glycosidic linkages, breaking crystalline cellulose chains and making them accessible to hydrolytic enzymes. Among the hydrolases, six cellobiohydrolases with a tunnel-like structural fold embrace single crystalline cellulose chains and cooperate at nonreducing or reducing end termini, splitting off cellobiose. Five endoglucanases group into four structural families and interact randomly and internally with cellulose through an open cleft catalytic domain, and finally, seven extracellular β-glucosidases cleave cellobiose and related oligomers into glucose. Aspergilli contain, on average, 30 hemicellulase and 7 accessory gene models, distributed among 9 distinct functional categories: the backbone-attacking enzymes xylanase, mannosidase, arabinase, and xyloglucanase, the short-side-chain-removing enzymes xylan α-1,2-glucuronidase, arabinofuranosidase, and xylosidase, and the accessory enzymes acetyl xylan and feruloyl esterases. PMID:25428936
Ren, Chong; Zhang, Zhan; Wang, Yi; Li, Shaohua; Liang, Zhenchang
2016-08-11
Nuclear factor Y (NF-Y) transcription factor is composed of three distinct subunits: NF-YA, NF-YB and NF-YC. Many members of the NF-Y family have been reported to be key regulators in plant development, phytohormone signaling and drought tolerance. However, little is known about the function of the NF-Y family in grape (Vitis vinifera L.). A total of 34 grape NF-Y genes, distributed unevenly on grape (V. vinifera) chromosomes, were identified in this study. Phylogenetic analysis was performed to predict functional similarities between Arabidopsis thaliana and grape NF-Y genes. Comparison of the structures of grape NF-Y genes (VvNF-Ys) revealed their functional conservation and alteration. Furthermore, we investigated the expression profiles of VvNF-Ys in response to various stresses and phytohormone treatments, and in leaves and grape berries with various sugar contents at different developmental stages. The relationship between VvNF-Y transcript levels and sugar content was examined to select candidates for exogenous sugar treatments. Quantitative real-time PCR (qPCR) indicated that many VvNF-Ys responded to different sugar stimuli with variations in transcript abundance. qPCR and publicly available microarray data suggest that VvNF-Ys exhibit distinct expression patterns in different grape organs and developmental stages, and that a number of VvNF-Ys may participate in responses to multiple abiotic and biotic stresses, phytohormone treatments and sugar accumulation or metabolism. In this study, we characterized 34 VvNF-Ys based on their distributions on chromosomes, gene structures, phylogenetic relationship with Arabidopsis NF-Y genes, and their expression patterns. The potential roles of VvNF-Ys in sugar accumulation or metabolism were also investigated. Altogether, the data provide significant insights into VvNF-Ys, and lay foundations for further functional studies of NF-Y genes in grape.
Li, Xiangyang; Zhang, Linshuang; Wang, Gejiao
2014-01-01
So far, numerous genes have been found to be associated with various strategies to resist and transform the toxic metalloid arsenic (here, we denote these genes as “arsenic-related genes”). However, our knowledge of the distribution, redundancies and organization of these genes in bacteria is still limited. In this study, we analyzed 188 Burkholderiales genomes and found that 95% of the genomes harbored arsenic-related genes, with an average of 6.6 genes per genome. The results indicated: a) compared to the low frequency of aio (arsenite oxidase) (12 strains), arr (arsenate respiratory reductase) (1 strain) and arsM (arsenite methyltransferase)-like genes (4 strains), the ars (arsenic resistance system)-like genes were identified in 174 strains and comprised 1,051 genes; b) two-thirds of the ars-like genes were clustered as ars operons and displayed a high diversity of gene organizations (68 forms), which may suggest rapid movement and evolution of ars-like genes in bacterial genomes; c) the arsenite efflux system was dominated by the ACR3 form rather than ArsB in Burkholderiales; d) only a few arsM and arrAB genes were found, indicating that neither As(III) biomethylation nor As(V) respiration is the primary mechanism in Burkholderiales members; e) the aio-like gene is mostly flanked by ars-like genes and a phosphate transport system, implying a close functional relatedness between arsenic and phosphorus metabolism. On average, the number of arsenic-related genes per genome in strains isolated from arsenic-rich environments is more than four times higher than in strains from other environments. Compared with human, plant and animal pathogens, the environmental strains possess a larger average number of arsenic-related genes, which indicates that habitat is likely a key driver of bacterial arsenic resistance. PMID:24632831
Raethong, Nachon; Wong-ekkabut, Jirasak; Laoteng, Kobkul; Vongsangnak, Wanwipa
2016-01-01
Aspergillus oryzae is widely used for the industrial production of enzymes. In A. oryzae metabolism, transporters appear to play crucial roles in controlling the flux of molecules for energy generation, nutrient delivery, and waste elimination in the cell. While the A. oryzae genome sequence is available, transporter annotation remains limited and thus the connectivity of metabolic networks is incomplete. In this study, we developed a metabolic annotation strategy to understand the relationship between sequence, structure, and function for the annotation of A. oryzae metabolic transporters. Sequence-based analysis with manual curation showed that 58 of the 12,096 total genes in the A. oryzae genome encoded metabolic transporters. Under consensus integrative databases, 55 unambiguous metabolic transporter genes were distributed into channels and pores (7 genes), electrochemical potential-driven transporters (33 genes), and primary active transporters (15 genes). To reveal the functional roles of the transporters, a combination of homology modeling and molecular dynamics simulation was implemented to assess the relationships of sequence to structure and structure to function. For the energy metabolism of A. oryzae, the H+-ATPase encoded by the AO090005000842 gene was selected as a representative case study of multilevel linkage annotation. Our developed strategy can be used for enhancing metabolic network reconstruction. PMID:27274991
Sun, Haimeng; Yang, Zhongchen; Wei, Caijie; Wu, Weizhong
2018-04-26
An up-flow vertical flow constructed wetland (AC-VFCW) filled with ceramsite and 5% external carbon source poly(3-hydroxybutyrate-hydroxyvalerate) (PHBV) as substrate was set up for nitrogen removal with micro-aeration. A simultaneous nitrification and denitrification process was observed, with 90.4% NH₄⁺-N and 92.1% TN removal efficiencies. Nitrification and denitrification genes were both preferentially enriched on the surface of PHBV. Nitrogen transformation along the flow direction showed that NH₄⁺-N was oxidized to NO₃⁻-N in the lowermost 10 cm of the substrate and that NO₃⁻-N was gradually degraded over the depth. The amoA gene was more enriched at the -10 and -50 cm layers. The nirS gene was the dominant functional gene at the bottom layer, with an abundance of 2.05 × 10⁷ copies g⁻¹ substrate, while the nosZ gene was predominantly abundant at the middle and top layers, with 7.51 × 10⁶ and 2.64 × 10⁶ copies g⁻¹ substrate, respectively, indicating that a functional division of the dominant nitrogen functional genes forms along the flow direction in the AC-VFCW. Copyright © 2018. Published by Elsevier Ltd.
Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan
2016-04-12
The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.
Structural and functional partitioning of bread wheat chromosome 3B.
Choulet, Frédéric; Alberti, Adriana; Theil, Sébastien; Glover, Natasha; Barbe, Valérie; Daron, Josquin; Pingault, Lise; Sourdille, Pierre; Couloux, Arnaud; Paux, Etienne; Leroy, Philippe; Mangenot, Sophie; Guilhot, Nicolas; Le Gouis, Jacques; Balfourier, Francois; Alaux, Michael; Jamilloux, Véronique; Poulain, Julie; Durand, Céline; Bellec, Arnaud; Gaspin, Christine; Safar, Jan; Dolezel, Jaroslav; Rogers, Jane; Vandepoele, Klaas; Aury, Jean-Marc; Mayer, Klaus; Berges, Hélène; Quesneville, Hadi; Wincker, Patrick; Feuillet, Catherine
2014-07-18
We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits. Copyright © 2014, American Association for the Advancement of Science.
Identification of hub subnetwork based on topological features of genes in breast cancer
ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO
2015-01-01
The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks, which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data and analyzed to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape, and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, and that the hub nodes were mostly distributed in them. The most important hub nodes, with top-ranked centrality, were also similar to the common genes from the intersections of the above three subnetworks, which were viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to the same biological process, which is essential in the function of cell growth and death. This increased the accuracy of identifying gene interactions that take place within the same functional process and is potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
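The degree-based ranking of hub nodes mentioned in this abstract can be illustrated with a minimal Python sketch. The edge list is a hypothetical toy network (the node names G1 to G5 are placeholders, not genes or results from the study), and only normalized degree centrality is shown, whereas the study also considered other centralities that are not reproduced here.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for an undirected network.

    Each node's score is its number of distinct neighbors divided by
    (n - 1), where n is the number of nodes that appear in the edge list.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_hubs(edges, k=1):
    """Return the k nodes with the highest degree centrality (ties by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Hypothetical toy interactions, for illustration only.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
print(top_hubs(toy_edges, k=1))
```

In this toy network G1 has the most neighbors, so it is reported as the single top hub; on a real PPI network the same ranking would be applied to the full edge list exported from the network tool.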
PIGD: a database for intronless genes in the Poaceae.
Yan, Hanwei; Jiang, Cuiping; Li, Xiaoyu; Sheng, Lei; Dong, Qing; Peng, Xiaojian; Li, Qian; Zhao, Yang; Jiang, Haiyang; Cheng, Beijiu
2014-10-01
Intronless genes are a feature of prokaryotes; however, they are widespread and unequally distributed among eukaryotes and represent an important resource to study the evolution of gene architecture. Although many databases on exons and introns exist, there is currently no single cohesive database that collects intronless genes in plants. In this study, we present the Poaceae Intronless Genes Database (PIGD), a user-friendly web interface to explore information on intronless genes from different plants. Five Poaceae species, Sorghum bicolor, Zea mays, Setaria italica, Panicum virgatum and Brachypodium distachyon, are included in the current release of PIGD. Gene annotations and sequence data were collected and integrated from different databases. The primary focus of this study was to provide gene descriptions and gene product records. In addition, functional annotations, subcellular localization prediction and taxonomic distribution are reported. PIGD allows users to readily browse, search and download data. BLAST and comparative analyses are also provided through this online database, which is available at http://pigd.ahau.edu.cn/. PIGD provides a solid platform for the collection, integration and analysis of intronless genes in the Poaceae. As such, this database will be useful for subsequent bio-computational analysis in comparative genomics and evolutionary studies.
Assessment of the reliability of protein-protein interactions and protein function prediction.
Deng, Minghua; Sun, Fengzhu; Chen, Ting
2003-01-01
As more and more high-throughput protein-protein interaction data are collected, the task of estimating the reliability of different data sets becomes increasingly important. In this paper, we present our study of two groups of protein-protein interaction data, the physical interaction data and the protein complex data, and estimate the reliability of these data sets using three different measurements: (1) the distribution of gene expression correlation coefficients, (2) the reliability based on gene expression correlation coefficients, and (3) the accuracy of protein function predictions. We develop a maximum likelihood method to estimate the reliability of protein interaction data sets according to the distribution of correlation coefficients of gene expression profiles of putative interacting protein pairs. The results of the three measurements are consistent with each other. The MIPS protein complex data have the highest mean gene expression correlation coefficients (0.256) and the highest accuracy in predicting protein functions (70% sensitivity and specificity), while Ito's Yeast two-hybrid data have the lowest mean (0.041) and the lowest accuracy (15% sensitivity and specificity). Uetz's data are more reliable than Ito's data in all three measurements, and the TAP protein complex data are more reliable than the HMS-PCI data in all three measurements as well. The complex data sets generally perform better in function predictions than do the physical interaction data sets. Proteins in complexes are shown to be more highly correlated in gene expression. The results confirm that the components of a protein complex can be assigned to functions that the complex carries out within a cell. There are three interaction data sets different from the above two groups: the genetic interaction data, the in-silico data and the syn-express data. Their capability of predicting protein functions generally falls between that of the Y2H data and that of the MIPS protein complex data. The supplementary information is available at the following Web site: http://www-hto.usc.edu/-msms/AssessInteraction/.
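The idea of estimating data-set reliability from the distribution of expression correlation coefficients can be sketched as a two-component mixture fitted by maximum likelihood. The sketch below is an illustration under stated assumptions and not the authors' model: it assumes that correlations of true interacting pairs and of random pairs each follow a Gaussian with known mean and standard deviation (the parameter values are hypothetical), and it treats the reliability as the mixing proportion found by a simple grid search.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def estimate_reliability(correlations, true_params, random_params, grid=200):
    """Grid-search maximum likelihood estimate of the mixing proportion r.

    The assumed model is  p(x) = r * f_true(x) + (1 - r) * f_random(x),
    where r is read as the fraction of reported pairs that truly interact.
    true_params and random_params are (mean, sd) tuples, both assumed known.
    """
    best_r, best_ll = 0.0, -math.inf
    for i in range(grid + 1):
        r = i / grid
        ll = 0.0
        for x in correlations:
            p = r * normal_pdf(x, *true_params) + (1.0 - r) * normal_pdf(x, *random_params)
            ll += math.log(p)
        if ll > best_ll:
            best_r, best_ll = r, ll
    return best_r


# Hypothetical toy input: 30 pairs near the "true" mean, 70 near the "random" mean.
toy_correlations = [0.5] * 30 + [0.0] * 70
print(estimate_reliability(toy_correlations, (0.5, 0.15), (0.0, 0.15)))
```

A grid search is used only to keep the sketch short and dependency-free; an EM iteration or a numerical optimizer would be the more usual choice when the component parameters also have to be estimated.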
Environmental drivers of the distribution of nitrogen functional genes at a watershed scale.
Tsiknia, Myrto; Paranychianakis, Nikolaos V; Varouchakis, Emmanouil A; Nikolaidis, Nikolaos P
2015-06-01
To date, only a few studies have dealt with the biogeography of microbial communities at large spatial scales, despite the importance of such information for understanding and simulating ecosystem functioning. Herein, we describe the biogeographic patterns of microorganisms involved in nitrogen (N)-cycling (diazotrophs, ammonia oxidizers, denitrifiers) as well as the environmental factors shaping these patterns across the Koiliaris Critical Zone Observatory, a typical Mediterranean watershed. Our findings revealed that 40 to 80% of the variance in functional gene abundance could be explained by the environmental variables monitored, with pH, soil texture, total organic carbon and potential nitrification rate being identified as the most important drivers. The spatial autocorrelation of N-functional genes ranged from 0.2 to 6.2 km, and prediction maps generated by cokriging revealed distinct patterns of functional genes. The inclusion of functional genes in statistical modeling substantially improved the proportion of variance explained by the models, a result possibly due to the strong relationships that were identified among microbial groups. Significant relationships were found between functional groups, which were further mediated by land use (natural versus agricultural lands). These relationships, in combination with the environmental variables, allow us to provide insights regarding the ecological preferences of N-functional groups and, among them, the recently identified clade II of nitrous oxide reducers. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
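Spatial autocorrelation ranges such as those reported in this abstract are commonly read from an empirical semivariogram, which is also the usual starting point before kriging or cokriging. The following minimal Python sketch computes one; it is an assumption that this resembles the workflow used in the study, and the coordinates, values, and bin settings are hypothetical toy inputs rather than data from the watershed.

```python
import math


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Return the empirical semivariance for each distance bin.

    For every pair of sites whose separation h falls in bin b, i.e.
    b * bin_width <= h < (b + 1) * bin_width, the squared difference of the
    measured values is accumulated, and
        gamma(b) = sum((z_i - z_j) ** 2) / (2 * number_of_pairs)
    is returned. Bins that contain no pairs are reported as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            b = int(h // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical toy transect: four sites 1 km apart with a linear trend in abundance.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
toy_values = [0.0, 1.0, 2.0, 3.0]
print(empirical_semivariogram(toy_points, toy_values, bin_width=1.0, n_bins=3))
```

The distance at which the semivariance levels off is usually taken as the autocorrelation range; the toy trend above never levels off, so it only demonstrates the calculation.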
Integrative and conjugative elements and their hosts: composition, distribution and organization.
Cury, Jean; Touchon, Marie; Rocha, Eduardo P C
2017-09-06
Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and has been widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICEs), even though they are more abundant, have been studied in only a few model systems. Comparative genomics of ICEs has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species' pan-genome. We delimited 200 ICEs, and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even though the gene repertoires of ICEs can be grouped according to conjugation type. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, most of which are of unknown function. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Zhou, Zhichao; Chen, Jing; Meng, Han; Dvornyk, Volodymyr; Gu, Ji-Dong
2017-02-01
PCR primers targeting genes encoding two proteins of anammox bacteria, hydrazine synthase and cytochrome c biogenesis protein, were designed and tested in this study. Three different ecotypes of samples, namely ocean sediments, coastal wetland sediments, and wastewater treatment plant (WWTP) samples, were used to assess the primer efficiency and the community structures of anammox bacteria retrieved by 16S ribosomal RNA (rRNA) and the functional genes. Abundances of the hzsB gene of anammox bacteria in South China Sea (SCS) samples were significantly correlated with those of the 16S rRNA gene as measured by qPCR. The hzsB and hzsC gene primer pairs hzsB364f-hzsB640r and hzsC745f-hzsC862r, in combination with anammox bacterial 16S rRNA gene primers, are recommended for quantifying anammox bacteria. Congruent with the 16S rRNA gene-based community study, the functional gene hzsB could also delineate the coastal-ocean distribution pattern, and seawater depth was positively associated with the diversity and abundance of anammox bacteria from shallow to deep sea. Both hzsC and ccsA genes could differentiate marine samples between deep and shallow groups of the Scalindua sp. clades. As for WWTP samples, non-Scalindua anammox bacteria reflected by hzsB, hzsC, ccsA, and ccsB gene-based libraries showed a distribution pattern similar to that obtained with the 16S rRNA gene. NH₄⁺ and NH₄⁺/Σ(NO₃⁻ + NO₂⁻) correlated positively with anammox bacteria gene diversity, but organic matter content correlated negatively with anammox bacteria gene diversity in the SCS. Salinity was positively associated with diversity indices of hzsC and ccsB gene-harboring anammox bacteria communities and could potentially differentiate the distribution patterns between shallow- and deep-sea sediment samples. SCS surface sediments harbored a considerably diverse community of Scalindua. A new Mai Po clade representing a coastal estuary wetland anammox bacteria group based on 16S rRNA gene phylogeny is proposed. The existence of anammox bacteria from a wider range of genera in the Mai Po wetland indicates that this unique niche is very complex, and that species of anammox bacteria are niche-specific, with different physiological properties with respect to substrate competition and chemical tolerance.
Diversity and distribution of catechol 2,3-dioxygenase genes in surface sediments of the Bohai Sea.
He, Peiqing; Li, Li; Liu, Jihua; Bai, Yazhi; Fang, Xisheng
2016-05-01
Catechol 2,3-dioxygenase (C23O) is the key enzyme for aerobic aromatic degradation. Based on clone libraries and quantitative real-time polymerase chain reaction, we characterized the diversity and distribution patterns of C23O genes in surface sediments of the Bohai Sea. The results showed that sediments of the Bohai Sea were dominated by genes related to C23O subfamily I.2.A. The samples from the wastewater discharge area (DG) and the aquaculture farm (KL) showed a distinct composition of C23O genes when compared to the samples from Bohai Bay (BH), and total organic carbon was a crucial determinant accounting for the variation in composition. C6BH12-38 and C2BH2-35 displayed the highest gene copies and the highest ratios to the 16S rRNA genes in KL, and they might prefer biologically labile aromatic hydrocarbons from aquaculture inputs. Meanwhile, C7BH3-48 showed the highest gene copies and the highest ratios to the 16S rRNA genes in DG, which could be a selective effect of organic loadings from wastewater discharge. An evident increase in C6BH12-38 and C7BH3-48 gene copies and a reduction in the diversity of C23O genes in DG and KL indicated composition perturbations of C23O genes and a potential loss of functional redundancy. We suggest that ecological habitat and trophic specificity could shape the distribution of C23O genes in the Bohai Sea sediments. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Shen, Danyu; Liu, Tingli; Ye, Wenwu; Liu, Li; Liu, Peihan; Wu, Yuren; Wang, Yuanchao; Dou, Daolong
2013-01-01
Phytophthora and other oomycetes secrete a large number of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling and necrosis inducing proteins (CRN), or Crinkler. Here, we first investigated the evolutionary patterns and mechanisms of CRN effectors in Phytophthora sojae and compared them to two other Phytophthora species. The genes encoding CRN effectors could be divided into 45 orthologous gene groups (OGGs), and most OGGs were unequally distributed in the three species, each of which underwent a large number of gene gains or losses, indicating that the CRN genes expanded after species evolution in Phytophthora and evolved through pathoadaptation. The 134 expanded genes in P. sojae encoded family proteins, including 82 functional genes, and were expressed at higher levels, while the other 68 genes encoding orphan proteins were less expressed and contained 50 pseudogenes. Furthermore, we demonstrated that most expanded genes underwent gene duplication and/or fragment recombination. Three different mechanisms that drove gene duplication or recombination were identified. Finally, the expanded CRN effectors exhibited varying pathogenic functions, including induction of programmed cell death (PCD) and suppression of PCD through PAMP-triggered immunity and/or effector-triggered immunity. Overall, these results suggest that gene duplication and fragment recombination may be two mechanisms that drive the expansion and neofunctionalization of the CRN family in P. sojae, which aids in understanding the roles of CRN effectors within each oomycete pathogen.
Phylogenetic congruence and ecological coherence in terrestrial Thaumarchaeota
Oton, Eduard Vico; Quince, Christopher; Nicol, Graeme W; Prosser, James I; Gubry-Rangin, Cécile
2016-01-01
Thaumarchaeota form a ubiquitously distributed archaeal phylum, comprising both the ammonia-oxidising archaea (AOA) and other archaeal groups in which ammonia oxidation has not been demonstrated (including Group 1.1c and Group 1.3). The ecology of AOA in terrestrial environments has been extensively studied using either a functional gene, encoding ammonia monooxygenase subunit A (amoA) or 16S ribosomal RNA (rRNA) genes, which show phylogenetic coherence with respect to soil pH. To test phylogenetic congruence between these two markers and to determine ecological coherence in all Thaumarchaeota, we performed high-throughput sequencing of 16S rRNA and amoA genes in 46 UK soils presenting 29 available contextual soil characteristics. Adaptation to pH and organic matter content reflected strong ecological coherence at various levels of taxonomic resolution for Thaumarchaeota (AOA and non-AOA), whereas nitrogen, total mineralisable nitrogen and zinc concentration were also important factors associated with AOA thaumarchaeotal community distribution. Other significant associations with environmental factors were also detected for amoA and 16S rRNA genes, reflecting different diversity characteristics between these two markers. Nonetheless, there was significant statistical congruence between the markers at fine phylogenetic resolution, supporting the hypothesis of low horizontal gene transfer between Thaumarchaeota. Group 1.1c Thaumarchaeota were also widely distributed, with two clusters predominating, particularly in environments with higher moisture content and organic matter, whereas a similar ecological pattern was observed for Group 1.3 Thaumarchaeota. The ecological and phylogenetic congruence identified is fundamental to understand better the life strategies, evolutionary history and ecosystem function of the Thaumarchaeota. PMID:26140533
2013-01-01
Background Nucleoside phosphorylases (NPs) have been extensively investigated in human and bacterial systems for their role in metabolic nucleotide salvaging and links to oncogenesis. In plants, NP-like proteins have not been comprehensively studied, likely because there is no evidence of a metabolic function in nucleoside salvage. However, in the forest tree genus Populus, a family of NP-like proteins functions as an important ecophysiological adaptation for inter- and intra-seasonal nitrogen storage and cycling. Results We conducted phylogenetic analyses to determine the distribution and evolution of NP-like proteins in plants. These analyses revealed two major clusters of NP-like proteins in plants. Group I proteins were encoded by genes across a wide range of plant taxa, while proteins encoded by Group II genes were dominated by species belonging to the order Malpighiales and included the Populus Bark Storage Protein (BSP) and WIN4-like proteins. Additionally, we evaluated the NP-like genes in Populus by examining the transcript abundance of the 13 NP-like genes found in the Populus genome in various tissues of plants exposed to long-day (LD) and short-day (SD) photoperiods. We found that all 13 of the Populus NP-like genes belonging to either Group I or II are expressed in various tissues in both LD and SD conditions. Tests of natural selection and expression evolution analysis of the Populus genes suggest that divergence in gene expression may have occurred recently during the evolution of Populus, which supports the adaptive maintenance models. Lastly, in silico analysis of cis-regulatory elements in the promoters of the 13 NP-like genes in Populus revealed common regulatory elements known to be involved in light regulation, stress/pathogenesis and phytohormone responses. Conclusion In Populus, the evolution of the NP-like protein and gene family has been shaped by duplication events and natural selection. Expression data suggest that previously uncharacterized NP-like proteins may function in nutrient sensing and/or signaling. These proteins are members of Group I NP-like proteins, which are widely distributed in many plant taxa. We conclude that NP-like proteins may function in plants, although this function is undefined. PMID:23957885
Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L
2014-04-08
In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
Wang, Jianli; Wu, Zhenying; Shen, Zhongbao; Bai, Zetao; Zhong, Peng; Ma, Lichao; Pan, Duofeng; Zhang, Ruibo; Li, Daoming; Zhang, Hailing; Fu, Chunxiang; Han, Guiqing; Guo, Changhong
2018-01-01
Auxin response factors (ARFs) have been reported to play vital roles during plant growth and development. To reveal specific functions related to vegetative organs in grasses, an in-depth study of the ARF gene family was carried out in switchgrass (Panicum virgatum L.), a warm-season C4 perennial grass that is mostly used as bioenergy and animal feedstock. A total of 47 putative ARF genes (PvARFs) were identified in the switchgrass genome (2n = 4x = 36), 42 of which were anchored to the seven pairs of chromosomes and found to be unevenly distributed. Sixteen PvARFs were predicted to be potential targets of small RNAs (microRNA160 and 167). PvARFs were divided into seven distinct subgroups based on phylogeny, exon/intron arrangement, and conserved motif distribution. Moreover, 15 pairs of PvARFs have different temporal-spatial expression profiles in vegetative organs (2nd, 3rd, and 4th internode and leaves), which implies that different PvARFs have specific functions in switchgrass growth and development. In addition, at least 14 pairs of PvARFs respond to naphthylacetic acid (NAA) treatment, which may help in studying the auxin response in switchgrass. The comprehensive analysis described here will facilitate the future functional analysis of ARF genes in grasses.
Song, Xiaowen; Huang, Fei; Liu, Juanjuan; Li, Chengjun; Gao, Shanshan; Wu, Wei; Zhai, Mengfan; Yu, Xiaojuan; Xiong, Wenfeng; Xie, Jia; Li, Bin
2017-10-01
Cytosine DNA methylation is a vital epigenetic regulator of eukaryotic development. Whether this epigenetic modification occurs in Tribolium castaneum has been controversial, and its distribution pattern and functions have not been established. Here, using bisulphite sequencing (BS-Seq), we confirmed the existence of DNA methylation and described the methylation profiles of the four life stages of T. castaneum. In the T. castaneum genome, both symmetrical CpG and non-CpG methylcytosines were observed. Symmetrical CpG methylation, which was catalysed by DNMT1 and accounted for a small part of the T. castaneum methylome, was primarily enriched in gene bodies and was positively correlated with gene expression levels. Asymmetrical non-CpG methylation, which was predominant in the methylome, was strongly concentrated in intergenic regions and introns but absent from exons. Gene body methylation was negatively correlated with gene expression levels. The distribution pattern and functions of this type of methylation were similar only to the methylome of Drosophila melanogaster, which further supports the existence of a novel methyltransferase in the two species responsible for this type of methylation. This first life-cycle methylome of T. castaneum reveals a novel and unique methylation pattern, which will contribute to the further understanding of the variety and functions of DNA methylation in eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Jędrak, Jakub; Ochab-Marcinek, Anna
2016-09-01
We study a stochastic model of gene expression, in which protein production has a form of random bursts whose size distribution is arbitrary, whereas protein decay is a first-order reaction. We find exact analytical expressions for the time evolution of the cumulant-generating function for the most general case when both the burst size probability distribution and the model parameters depend on time in an arbitrary (e.g., oscillatory) manner, and for arbitrary initial conditions. We show that in the case of periodic external activation and constant protein degradation rate, the response of the gene is analogous to the resistor-capacitor low-pass filter, where slow oscillations of the external driving have a greater effect on gene expression than the fast ones. We also demonstrate that the nth cumulant of the protein number distribution depends on the nth moment of the burst size distribution. We use these results to show that different measures of noise (coefficient of variation, Fano factor, fractional change of variance) may vary in time in a different manner. Therefore, any biological hypothesis of evolutionary optimization based on the nonmonotonic dependence of a chosen measure of noise on time must justify why it assumes that biological evolution quantifies noise in that particular way. Finally, we show that not only for exponentially distributed burst sizes but also for a wider class of burst size distributions (e.g., Dirac delta and gamma) the control of gene expression level by burst frequency modulation gives rise to proportional scaling of variance of the protein number distribution to its mean, whereas the control by amplitude modulation implies proportionality of protein number variance to the mean squared.
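The scaling claims in the abstract above can be checked numerically in the simplest setting. The following is a minimal sketch, not the authors' time-dependent solution: it assumes constant parameters, a stationary state, exponentially distributed burst sizes, and the standard shot-noise (Campbell's theorem) result that the n-th stationary cumulant equals burst_rate * E[B^n] / (n * decay_rate). All function and parameter names are illustrative.

```python
from math import factorial


def exponential_burst_moment(n, mean_burst):
    """n-th raw moment of an exponential burst-size distribution: n! * mean^n."""
    return factorial(n) * mean_burst ** n


def stationary_cumulant(n, burst_rate, decay_rate, mean_burst):
    """n-th stationary cumulant under the shot-noise assumption stated above."""
    return burst_rate * exponential_burst_moment(n, mean_burst) / (n * decay_rate)


def noise_measures(burst_rate, decay_rate, mean_burst):
    """Mean, variance, Fano factor and squared coefficient of variation."""
    mean = stationary_cumulant(1, burst_rate, decay_rate, mean_burst)
    variance = stationary_cumulant(2, burst_rate, decay_rate, mean_burst)
    return {
        "mean": mean,
        "variance": variance,
        "fano": variance / mean,        # constant under frequency modulation
        "cv2": variance / mean ** 2,    # constant under amplitude modulation
    }


if __name__ == "__main__":
    base = noise_measures(burst_rate=2.0, decay_rate=0.5, mean_burst=3.0)
    more_frequent = noise_measures(burst_rate=8.0, decay_rate=0.5, mean_burst=3.0)
    larger_bursts = noise_measures(burst_rate=2.0, decay_rate=0.5, mean_burst=6.0)
    print(base, more_frequent, larger_bursts)
```

Under these assumptions, changing only the burst frequency leaves the Fano factor unchanged (variance proportional to mean), while changing only the mean burst size leaves the squared coefficient of variation unchanged (variance proportional to mean squared).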
Pavlidis, Paul; Qin, Jie; Arango, Victoria; Mann, John J; Sibille, Etienne
2004-06-01
One of the challenges in the analysis of gene expression data is placing the results in the context of other data available about genes and their relationships to each other. Here, we approach this problem in the study of gene expression changes associated with age in two areas of the human prefrontal cortex, comparing two computational methods. The first method, "overrepresentation analysis" (ORA), is based on statistically evaluating the fraction of genes in a particular gene ontology class found among the set of genes showing age-related changes in expression. The second method, "functional class scoring" (FCS), examines the statistical distribution of individual gene scores among all genes in the gene ontology class and does not involve an initial gene selection step. We find that FCS yields more consistent results than ORA, and the results of ORA depended strongly on the gene selection threshold. Our findings highlight the utility of functional class scoring for the analysis of complex expression data sets and emphasize the advantage of considering all available genomic information rather than sets of genes that pass a predetermined "threshold of significance."
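The threshold sensitivity of ORA described above can be illustrated with a small sketch. The abstract does not say which statistic the authors used; the upper-tail hypergeometric test below is one common choice and is an assumption here, as are the toy gene scores and all names.

```python
from math import comb


def ora_p_value(n_genes, n_in_class, n_selected, n_overlap):
    """Upper-tail hypergeometric probability of seeing at least n_overlap
    class members among n_selected genes drawn from n_genes."""
    upper = min(n_in_class, n_selected)
    tail = sum(
        comb(n_in_class, k) * comb(n_genes - n_in_class, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_genes, n_selected)


def ora_for_threshold(gene_scores, gene_class, threshold):
    """Select genes whose score (a p-value) is below threshold, then test
    the class for overrepresentation among the selected genes."""
    selected = {g for g, p in gene_scores.items() if p < threshold}
    overlap = len(selected & gene_class)
    in_class = len(gene_class & set(gene_scores))
    return ora_p_value(len(gene_scores), in_class, len(selected), overlap)


if __name__ == "__main__":
    # Hypothetical per-gene p-values for an age effect (toy data).
    scores = {"g1": 0.001, "g2": 0.004, "g3": 0.03, "g4": 0.2,
              "g5": 0.5, "g6": 0.04, "g7": 0.6, "g8": 0.9}
    gene_class = {"g1", "g2", "g3"}
    for threshold in (0.01, 0.05):
        print(threshold, ora_for_threshold(scores, gene_class, threshold))
```

Running the same class through two selection thresholds gives two different ORA p-values, which is the dependence the abstract reports; a functional class score would instead use all per-gene scores without the selection step.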
German, M S; Moss, L G; Wang, J; Rutter, W J
1992-01-01
The pancreatic beta cell makes several unique gene products, including insulin, islet amyloid polypeptide (IAPP), and beta-cell-specific glucokinase (beta GK). The functions of isolated portions of the insulin, IAPP, and beta GK promoters were studied by using transient expression and DNA binding assays. A short portion (-247 to -197 bp) of the rat insulin I gene, the FF minienhancer, contains three interacting transcriptional regulatory elements. The FF minienhancer binds at least two nuclear complexes with limited tissue distribution. Sequences similar to that of the FF minienhancer are present in the 5' flanking DNA of the human IAPP and rat beta GK genes and also the rat insulin II and mouse insulin I and II genes. Similar minienhancer constructs from the insulin and IAPP genes function as cell-specific transcriptional regulatory elements and compete for binding of the same nuclear factors, while the beta GK construct competes for protein binding but functions poorly as a minienhancer. These observations suggest that the patterns of expression of the beta-cell-specific genes result in part from sharing the same transcriptional regulators. PMID:1549125
Detection of gene expression changes at chromosomal rearrangement breakpoints in evolution
2012-01-01
Background We study the relation between genome rearrangements, breakpoints and gene expression. Genome rearrangement research has been concerned with the creation of breakpoints and their position in the chromosome, but the functional consequences of individual breakpoints remain virtually unknown, and there are no direct genome-wide studies of breakpoints from this point of view. A question arises of what the biological consequences of breakpoint creation are, rather than just their structural aspects. The question is whether proximity to the site of a breakpoint event changes the activity of a gene. Results We investigate this by comparing the distribution of distances to the nearest breakpoint of genes that are differentially expressed with the distribution of the same distances for the entire gene complement. We study this in data on whole blood tissue in human versus macaque, and in cerebral cortex tissue in human versus chimpanzee. We find in both data sets that the distribution of distances to the nearest breakpoint of "changed expression genes" differs little from this distance calculated for the rest of the gene complement. In focusing on the changed expression genes closest to the breakpoints, however, we discover that several of these have previously been implicated in the literature as being connected to the evolutionary divergence of humans from other primates. Conclusions We conjecture that chromosomal rearrangements occasionally interrupt the regulatory configurations of genes close to the breakpoint, leading to changes in expression. PMID:22536904
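The comparison described in the Results above (distances to the nearest breakpoint for changed-expression genes versus the rest of the gene complement) can be sketched as follows. This is a minimal illustration on a single chromosome with made-up coordinates; the two-sample Kolmogorov-Smirnov statistic is an assumption chosen for the sketch, since the abstract does not name the test the authors used, and all names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def nearest_breakpoint_distance(position, sorted_breakpoints):
    """Distance from a gene position to the closest breakpoint.
    sorted_breakpoints must be a non-empty, ascending list."""
    i = bisect_left(sorted_breakpoints, position)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = sorted_breakpoints[max(i - 1, 0):i + 1]
    return min(abs(position - b) for b in candidates)


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    breakpoints = [100_000, 400_000, 900_000]           # toy coordinates
    changed_genes = [120_000, 390_000, 950_000]         # toy gene positions
    other_genes = [50_000, 250_000, 600_000, 700_000]   # toy gene positions
    d_changed = [nearest_breakpoint_distance(g, breakpoints) for g in changed_genes]
    d_other = [nearest_breakpoint_distance(g, breakpoints) for g in other_genes]
    print(d_changed, d_other, ks_statistic(d_changed, d_other))
```

A statistic near 0 indicates that the two distance distributions are similar, which corresponds to the abstract's finding that the distributions differ little; a real analysis would handle multiple chromosomes and attach a significance level.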
Nussbaumer, Thomas; Kugler, Karl G; Schweiger, Wolfgang; Bader, Kai C; Gundlach, Heidrun; Spannagl, Manuel; Poursarebani, Naser; Pfeifer, Matthias; Mayer, Klaus F X
2014-12-10
In recent years, reference genome sequences of several economically and scientifically important cereals and model plants have become available. Despite the agricultural significance of these crops, only a small number of tools exist that allow users to inspect and visualize the genomic position of genes of interest in an interactive manner. We present chromoWIZ, a web tool that allows visualizing the genomic positions of relevant genes and comparing these data between different plant genomes. Genes can be queried using gene identifiers, functional annotations, or sequence homology in four grass species (Triticum aestivum, Hordeum vulgare, Brachypodium distachyon, Oryza sativa). The distribution of the anchored genes is visualized along the chromosomes by using heat maps. Custom gene expression measurements, differential expression information, and gene-to-group mappings can be uploaded and used for further filtering. This tool is mainly designed for breeders and plant researchers who are interested in the location and the distribution of candidate genes as well as in the syntenic relationships between different grass species. chromoWIZ is freely available online at http://mips.helmholtz-muenchen.de/plant/chromoWIZ/index.jsp.
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A; Marks, Jonathan A; Haiser, Henry J; Turnbaugh, Peter J; Balskus, Emily P
2015-04-14
Elucidation of the molecular mechanisms underlying the human gut microbiota's effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. Anaerobic choline utilization is a bacterial metabolic activity that occurs in the human gut and is linked to multiple diseases. 
While bacterial genes responsible for choline fermentation (the cut gene cluster) have been recently identified, there has been no characterization of these genes in human gut isolates and microbial communities. In this work, we use multiple approaches to demonstrate that the pathway encoded by the cut genes is present and functional in a diverse range of human gut bacteria and is also widespread in stool metagenomes. We also developed a PCR-based strategy to detect a key functional gene (cutC) involved in this pathway and applied it to characterize newly isolated choline-utilizing strains. Both our analyses of the cut gene cluster and this molecular tool will aid efforts to further understand the role of choline metabolism in the human gut microbiota and its link to disease. Copyright © 2015 Martínez-del Campo et al.
Integrating mean and variance heterogeneities to identify differentially expressed genes.
Ouyang, Weiwei; An, Qiang; Zhao, Jinying; Qin, Huaizhen
2016-12-06
In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (aka, the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration; and variance heterogeneity induced by condition change may reflect another aspect. Change in condition may alter both mean and some higher-order characteristics of the distributions of expression levels of susceptible genes. In this report, we put forth a conception of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existent mean heterogeneity tests and variance heterogeneity tests. Based on the independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normality and Laplace settings. For moderate samples, the IMVT well controlled type I error rates, and so did existent mean heterogeneity test (i.e., the Welch t test (WT), the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B raised solid evidence of informative variance heterogeneity. 
After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment-wide significant MVDE genes. Our results indicate tremendous potential gain of integrating informative variance heterogeneity after adjusting for global confounders and background data structure. The proposed informative integration test better summarizes the impacts of condition change on expression distributions of susceptible genes than do the existent competitors. Therefore, particular attention should be paid to explicitly exploit the variance heterogeneity induced by condition change in functional genomics analysis.
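Because the abstract states that the mean and variance heterogeneity tests are independent under the null, their p-values can in principle be combined. The sketch below uses Fisher's method for two independent p-values as a stand-in; the abstract does not say that the IMVT is built this way, so this is an assumption for illustration only, and the function name is hypothetical.

```python
from math import log


def fisher_combine_two(p_mean, p_variance):
    """Combine two independent p-values with Fisher's method.

    -2 * ln(p1 * p2) follows a chi-square distribution with 4 degrees of
    freedom under the null, whose upper tail has the closed form
    exp(-x / 2) * (1 + x / 2), giving p1 * p2 * (1 - ln(p1 * p2)).
    """
    for p in (p_mean, p_variance):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    product = p_mean * p_variance
    if product == 0.0:
        return 0.0
    return product * (1.0 - log(product))


if __name__ == "__main__":
    # A gene with modest evidence from each test separately (toy values).
    print(fisher_combine_two(0.05, 0.05))
    # A gene with strong variance evidence but weak mean evidence.
    print(fisher_combine_two(0.6, 0.001))
```

The second call shows why such a combination can gain power when variance heterogeneity is present: a gene that a mean-only test would miss can still reach a small combined p-value.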
Effect of Temperature on Synthetic Positive and Negative Feedback Gene Networks
NASA Astrophysics Data System (ADS)
Charlebois, Daniel A.; Marshall, Sylvia; Balazsi, Gabor
Synthetic biological systems are built and tested under well controlled laboratory conditions. How altering the environment, such as the ambient temperature, affects their function is not well understood. To address this question for synthetic gene networks with positive and negative feedback, we used mathematical modeling coupled with experiments in the budding yeast Saccharomyces cerevisiae. We found that cellular growth rates and gene expression dose responses change significantly at temperatures above and below the physiological optimum for yeast. Gene expression distributions for the negative feedback-based circuit changed from unimodal to bimodal at high temperature, while the bifurcation point of the positive feedback circuit shifted up with temperature. These results demonstrate that synthetic gene network function is context-dependent. Temperature effects should thus be tested and incorporated into their design and validation for real-world applications. NSERC Postdoctoral Fellowship (Grant No. PDF-453977-2014).
Dynamic chromatin changes associated with de novo centromere formation in maize euchromatin.
Su, Handong; Liu, Yalin; Liu, Yong-Xin; Lv, Zhenling; Li, Hongyao; Xie, Shaojun; Gao, Zhi; Pang, Junling; Wang, Xiu-Jie; Lai, Jinsheng; Birchler, James A; Han, Fangpu
2016-12-01
The inheritance and function of centromeres are not strictly dependent on any specific DNA sequence, but involve an epigenetic component in most species. CENH3, a centromere histone H3 variant, is one of the best-described epigenetic factors in centromere identity, but the chromatin features required during centromere formation have not yet been revealed. We previously identified two de novo centromeres on Zea mays (maize) minichromosomes derived from euchromatic sites with high-density gene distributions but low-density transposon distributions. The distribution of gene location and gene expression in these sites indicates that transcriptionally active regions can initiate de novo centromere formation, and CENH3 seeding shows a preference for gene-free regions or regions with no gene expression. The locations of the expressed genes detected were at relatively hypomethylated loci, and the altered gene expression resulted from de novo centromere formation, but not from the additional copy of the minichromosome. The initial overall DNA methylation level of the two de novo regions was at a low level, but increased substantially to that of native centromeres after centromere formation. These results illustrate the dynamic chromatin changes during euchromatin-originated de novo centromere formation, which provides insight into the mechanism of de novo centromere formation and regulation of subsequent consequences. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Zhang, Ning; Yu, Hong; Yu, Hao; Cai, Yueyue; Huang, Linzhou; Xu, Cao; Xiong, Guosheng; Meng, Xiangbing; Wang, Jiyao; Chen, Haofeng; Liu, Guifu; Jing, Yanhui; Yuan, Yundong; Liang, Yan; Li, Shujia; Smith, Steven M; Li, Jiayang; Wang, Yonghong
2018-06-18
Tiller angle in cereals is a key shoot architecture trait that strongly influences grain yield. Studies in rice (Oryza sativa L.) have implicated shoot gravitropism in the regulation of tiller angle. However, the functional link between shoot gravitropism and tiller angle is unknown. Here, we conducted a large-scale transcriptome analysis of rice shoots in response to gravistimulation and identified two new nodes of a shoot gravitropism regulatory gene network that also controls rice tiller angle. We demonstrate that HEAT STRESS TRANSCRIPTION FACTOR 2D (HSFA2D) is an upstream positive regulator of the LAZY1-mediated asymmetric auxin distribution pathway. We also show that two functionally redundant transcription factor genes, WUSCHEL RELATED HOMEOBOX6 (WOX6) and WOX11, are expressed asymmetrically in response to auxin to connect gravitropism responses with the control of rice tiller angle. These findings define upstream and downstream genetic components that link shoot gravitropism, asymmetric auxin distribution, and rice tiller angle. The results highlight the power of the high-temporal-resolution RNA-seq dataset, and its use to explore further genetic components controlling tiller angle. Collectively these approaches will identify genes to improve grain yields by facilitating the optimization of plant architecture. © 2018 American Society of Plant Biologists. All rights reserved.
Yamada, Shigehiro; Hotta, Kohji; Yamamoto, Takamasa S; Ueno, Naoto; Satoh, Nori; Takahashi, Hiroki
2009-04-01
The notochord, a midline organ, and its overlying dorsal neural tube are the most prominent features of the chordate body plan. Although the molecular mechanisms involved in the formation of the central nervous system (CNS) have been studied extensively in vertebrate embryos, none of the genes that are expressed exclusively in notochord cells has been shown to function in this process. Here, we report a gene in the urochordate Ciona intestinalis encoding a fibrinogen-like protein that plays a pivotal role in the notochord-dependent positioning of neuronal cells. While this gene (Ci-fibrn) is expressed exclusively in notochord cells, its protein product is not confined to these cells but is distributed underneath the CNS as fibril-like protrusions. We demonstrated that Ci-fibrn interacts physically and functionally with Ci-Notch, which is expressed in the central nervous system, and that the correct distribution of Ci-fibrn protein is dependent on Notch signaling. Disturbance of the Ci-fibrn distribution caused an abnormal positioning of neuronal cells and an abnormal track of axon extension. Therefore, it is highly likely that the interaction between the notochord-based fibrinogen-like protein and the neural tube-based Notch signaling plays an essential role in the proper patterning of the CNS.
deFUME: Dynamic exploration of functional metagenomic sequencing data.
van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander
2015-07-31
Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web interface. The deFUME web server provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The web server is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.
Yang, Hong; Lin, Shan; Cui, Jingru
2014-02-10
Arsenic trioxide (ATO) is presently the most active single agent in the treatment of acute promyelocytic leukemia (APL). To explore the molecular mechanism of ATO in leukemia cells over time, we adopted a bioinformatics strategy to analyze expression change patterns and changes in transcription regulation modules of time series genes filtered from a Gene Expression Omnibus dataset (GSE24946). In total, we screened out 1847 time series genes for subsequent analysis. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of these genes showed that oxidative phosphorylation and ribosome were the top 2 significantly enriched pathways. STEM software was employed to compare the changing patterns of gene expression with 50 assigned expression patterns. We screened out 7 significantly enriched patterns and 4 tendency charts of time series genes. The Gene Ontology results showed that the functions of the time series genes were mainly distributed in profiles 41, 40, 39 and 38. Seven genes with positive regulation of cell adhesion function were enriched in profile 40 and showed the same pattern as profile 40, first increasing and then decreasing. The transcription module analysis showed that they were mainly involved in the oxidative phosphorylation and ribosome pathways. Overall, our data summarized the gene expression changes over time in ATO-treated K562-r cell lines and suggested that time series genes mainly regulated cell adhesion. Furthermore, our results may provide a theoretical molecular-biology basis for treating acute promyelocytic leukemia. Copyright © 2013 Elsevier B.V. All rights reserved.
Bacci, Giovanni; Fiscarelli, Ersilia; Taccetti, Giovanni; Dolce, Daniela; Paganin, Patrizia; Morelli, Patrizia; Tuccio, Vanessa; De Alessandri, Alessandra; Lucidi, Vincenzina
2017-01-01
In recent years, next-generation sequencing (NGS) was employed to decipher the structure and composition of the microbiota of the airways in cystic fibrosis (CF) patients. However, little is still known about the overall gene functions harbored by the resident microbial populations and which specific genes are associated with various stages of CF lung disease. In the present study, we aimed to identify the microbial gene repertoire of CF microbiota in twelve patients with severe and normal/mild lung disease by performing sputum shotgun metagenome sequencing. The abundance of metabolic pathways encoded by microbes inhabiting CF airways was reconstructed from the metagenome. We identified a set of metabolic pathways differently distributed in patients with different pulmonary function; namely, pathways related to bacterial chemotaxis and flagellar assembly, as well as genes encoding efflux-mediated antibiotic resistance mechanisms and virulence-related genes. The results indicated that the microbiome of CF patients with low pulmonary function is enriched in virulence-related genes and in genes encoding efflux-mediated antibiotic resistance mechanisms. Overall, the microbiome of severely affected adults with CF seems to encode different mechanisms for the facilitation of microbial colonization and persistence in the lung, consistent with the characteristics of multidrug-resistant microbial communities that are commonly observed in patients with severe lung disease. PMID:28758937
Krienen, Fenna M.; Yeo, B. T. Thomas; Ge, Tian; Buckner, Randy L.; Sherwood, Chet C.
2016-01-01
The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute’s human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections. PMID:26739559
Lee, Ann-Ying; Chen, Chun-Yi; Chang, Yao-Chien Alex; Chao, Ya-Ting; Shih, Ming-Che
2013-01-01
Previously we developed genomic resources for orchids, including transcriptomic analyses using next-generation sequencing techniques and construction of a web-based orchid genomic database. Here, we report a modified molecular model of flower development in the Orchidaceae based on functional analysis of gene expression profiles in Phalaenopsis aphrodite (a moth orchid) that revealed novel roles for the transcription factors involved in floral organ pattern formation. Phalaenopsis orchid floral organ-specific genes were identified by microarray analysis. Several critical transcription factors, including AP3, PI, AP1 and AGL6, displayed distinct spatial distribution patterns. Phylogenetic analysis of orchid MADS box genes was conducted to infer the evolutionary relationship among floral organ-specific genes. The results suggest that duplication of MADS box genes in orchids may have resulted in their gaining novel functions during evolution. Based on these analyses, a modified model of orchid flowering was proposed. Comparison of the expression profiles of flowers of a peloric mutant and wild-type Phalaenopsis orchid further identified genes associated with lip morphology and peloric effects. Large-scale investigation of gene expression profiles revealed that homeotic genes from the ABCDE model of flower development classes A and B in the Phalaenopsis orchid have novel functions due to evolutionary diversification, and display differential expression patterns. PMID:24265826
Why is the correlation between gene importance and gene evolutionary rate so weak?
Wang, Zhi; Zhang, Jianzhi
2009-01-01
One of the few commonly believed principles of molecular evolution is that functionally more important genes (or DNA sequences) evolve more slowly than less important ones. This principle is widely used by molecular biologists in daily practice. However, recent genomic analysis of a diverse array of organisms found only weak, negative correlations between the evolutionary rate of a gene and its functional importance, typically measured under a single benign lab condition. A frequently suggested cause of the above finding is that gene importance determined in the lab differs from that in an organism's natural environment. Here, we test this hypothesis in yeast using gene importance values experimentally determined in 418 lab conditions or computationally predicted for 10,000 nutritional conditions. In no single condition or combination of conditions did we find a much stronger negative correlation, which is explainable by our subsequent finding that always-essential (enzyme) genes do not evolve significantly more slowly than sometimes-essential or always-nonessential ones. Furthermore, we verified that functional density, approximated by the fraction of amino acid sites within protein domains, is uncorrelated with gene importance. Thus, neither the lab-nature mismatch nor a potentially biased among-gene distribution of functional density explains the observed weakness of the correlation between gene importance and evolutionary rate. We conclude that the weakness is factual, rather than artifactual. In addition to being weakened by population genetic reasons, the correlation is likely to have been further weakened by the presence of multiple nontrivial rate determinants that are independent from gene importance. 
These findings notwithstanding, we show that the principle of slower evolution of more important genes does have some predictive power when genes with vastly different evolutionary rates are compared, explaining why the principle can be practically useful despite the weakness of the correlation.
Badhai, Jhasketan; Ghosh, Tarini S.; Das, Subrata K.
2015-01-01
This study describes microbial diversity in four tropical hot springs representing moderately thermophilic environments (temperature range: 40–58°C; pH: 7.2–7.4) with discrete geochemistry. Metagenome sequence data showed a dominance of Bacteria over Archaea; the most abundant phyla were Chloroflexi and Proteobacteria, although other phyla were also present, such as Acetothermia, Nitrospirae, Acidobacteria, Firmicutes, Deinococcus-Thermus, Bacteroidetes, Thermotogae, Euryarchaeota, Verrucomicrobia, Ignavibacteriae, Cyanobacteria, Actinobacteria, Planctomycetes, Spirochaetes, Armatimonadetes, Crenarchaeota, and Aquificae. The distribution of major genera and their statistical correlation analyses with the physicochemical parameters predicted that the temperature, aqueous concentrations of ions (such as sodium, chloride, sulfate, and bicarbonate), total hardness, dissolved solids and conductivity were the main environmental variables influencing microbial community composition and diversity. Despite the observed high taxonomic diversity, there was only little variation in the overall functional profiles of the microbial communities in the four springs. Genes involved in the metabolism of carbohydrates and carbon fixation were the most abundant functional class of genes present in these hot springs. The distribution of genes involved in carbon fixation predicted the presence of all six known autotrophic pathways in the metagenomes. A high prevalence of genes involved in membrane transport, signal transduction, stress response, bacterial chemotaxis, and flagellar assembly was observed, along with genes involved in the pathways of xenobiotic degradation and metabolism. The analysis of the metagenomic sequences affiliated to the candidate phylum Acetothermia from spring TB-3 provided new insight into the metabolism and physiology of yet-unknown members of this lineage of bacteria. PMID:26579081
Li, Qianqian; Liu, Jianguo; Zhang, Litao; Liu, Qian
2014-01-01
Background: Algae in the order Trentepohliales have a broad geographic distribution and are generally characterized by the presence of abundant β-carotene. The many monographs published to date have mainly focused on their morphology, taxonomy, phylogeny, distribution and reproduction; molecular studies of this order are still rare. High-throughput RNA sequencing (RNA-Seq) technology provides a powerful and efficient method for transcript analysis and gene discovery in Trentepohlia jolithus. Methods/Principal Findings: Illumina HiSeq 2000 sequencing generated 55,007,830 paired-end (PE) raw reads, which were assembled into 41,328 unigenes. Based on NR annotation, 53.28% of the unigenes (22,018) could be assigned to gene ontology classes with 54 subcategories and 161,451 functional terms. A total of 26,217 (63.44%) assembled unigenes were mapped to 128 KEGG pathways. Furthermore, a set of 5,798 SSRs in 5,206 unigenes and 131,478 putative SNPs were identified. Moreover, the fact that all of the C4 photosynthesis genes exist in T. jolithus suggests a complex carbon acquisition and fixation system. Similarities and differences between T. jolithus and other algae in carotenoid biosynthesis are also described in depth. Conclusions/Significance: This is the first broad transcriptome survey for T. jolithus, increasing the amount of molecular data available for the class Ulvophyceae. As well as providing resources for functional genomics studies, the functional genes and putative pathways identified here will contribute to a better understanding of carbon fixation and fatty acid and carotenoid biosynthesis in T. jolithus. PMID:25254555
Diversity and Phylogenetic Distribution of Extracellular Microbial Peptidases
NASA Astrophysics Data System (ADS)
Nguyen, Trang; Mueller, Ryan; Myrold, David
2017-04-01
Depolymerization of proteinaceous compounds by extracellular proteolytic enzymes is a bottleneck in the nitrogen cycle, limiting the rate of nitrogen turnover in soils. Protein degradation is accomplished by a diverse range of extracellular (secreted) peptidases. Our objective was to better understand the evolution of these enzymes and how their functional diversity corresponds to known phylogenetic diversity. Peptidase subfamilies from 110 archaeal, 1,860 bacterial, and 97 fungal genomes were extracted from the MEROPS database along with corresponding SSU sequences for each genome from the SILVA database, resulting in 43,177 secreted peptidases belonging to 34 microbial phyla and 149 peptidase subfamilies. We compared the distribution of each peptidase subfamily across all taxa to the phylogenetic relationships of these organisms based on their SSU gene sequences. The occurrence and abundance of genes coding for secreted peptidases varied across microbial taxa, distinguishing the peptidase complement of the three microbial kingdoms. Bacteria had the highest frequency of secreted peptidase coding genes per 1,000 genes, with these genes contributing from 1% to 6% of the gene content. Fungi had only a slightly higher secreted peptidase gene content than archaea when standardized by the total number of genes. The relative abundance profiles of secreted peptidases also varied among the microbial kingdoms: the aspartic family was most abundant in fungi (25%), whereas it accounted for only 12% in archaea and 4% in bacteria. The serine, metallo, and cysteine families consistently contributed up to 75% of the secreted peptidase abundance across the three kingdoms. Overall, bacteria had a much wider collection of secreted peptidases, whereas fungi and archaea shared most of their secreted peptidase families. Principal coordinate analysis of the peptidase subfamily-based dissimilarities showed distinguishable clusters for different groups of microorganisms. 
The distribution of secreted peptidases was found to be significantly correlated with phylogenetic relationships within kingdoms (archaea rMantel=0.364, p=0.001; bacteria rMantel=0.257, p=0.001; and fungi rMantel=0.281, p=0.005), implying an evolutionary relationship in which subsets of phylogenetically related organisms share similar types of secreted peptidases. We also tested the phylogenetic signal strength of each peptidase subfamily for each microbial kingdom based on the binary traits of the distribution (presence or absence of secreted peptidase subfamilies in individual species). About one-third of the peptidase subfamilies displayed a strong evolutionary signal; the rest were phylogenetically over-dispersed, suggesting that these subfamilies are randomly distributed across the tree of life or are the result of events such as horizontal gene transfer. Study of the diversity and phylogenetic distribution of secreted peptidases offers a mechanistic basis for anticipating the potential proteolytic function of microbial communities.
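The rMantel values quoted above come from Mantel tests, which correlate two distance matrices and assess significance by permuting the labels of one of them. The following is a minimal sketch of that general technique, not the authors' pipeline: the five-taxon matrices, the function names, and the choice of 999 permutations are all illustrative assumptions.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def upper_triangle(d, order):
    """Off-diagonal upper-triangle entries of d after relabelling taxa by `order`."""
    n = len(order)
    return [d[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: observed r and permutation p-value."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of d2 together
        if pearson(x, upper_triangle(d2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical toy data: five taxa placed on a line stand in for SSU-based
# phylogenetic distances; the second matrix mimics peptidase dissimilarities.
positions = [0.0, 1.0, 2.0, 6.0, 7.0]
n = len(positions)
phylo = [[abs(positions[i] - positions[j]) for j in range(n)] for i in range(n)]
pept = [[0.0 if i == j else 0.5 * phylo[i][j] + 0.1 * ((i + j) % 2)
         for j in range(n)] for i in range(n)]

r_mantel, p_value = mantel(phylo, pept)
print(f"rMantel = {r_mantel:.3f}, p = {p_value:.3f}")
```

Permuting taxon labels rather than individual matrix entries keeps each matrix internally consistent, which is why the test is preferred over an ordinary correlation test on non-independent pairwise distances.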
DOE Office of Scientific and Technical Information (OSTI.GOV)
Penn, Kevin; Jenkins, Caroline; Nett, Markus
Linking functional traits to bacterial phylogeny remains a fundamental but elusive goal of microbial ecology 1. Without this information, it becomes impossible to resolve meaningful units of diversity and the mechanisms by which bacteria interact with each other and adapt to environmental change. Ecological adaptations among bacterial populations have been linked to genomic islands, strain-specific regions of DNA that house functionally adaptive traits 2. In the case of environmental bacteria, these traits are largely inferred from bioinformatic or gene expression analyses 2, thus leaving few examples in which the functions of island genes have been experimentally characterized. Here we report the complete genome sequences of Salinispora tropica and S. arenicola, the first cultured, obligate marine Actinobacteria 3. These two species inhabit benthic marine environments and dedicate 8-10 percent of their genomes to the biosynthesis of secondary metabolites. Despite a close phylogenetic relationship, 25 of 37 secondary metabolic pathways are species-specific and located within 21 genomic islands, thus providing new evidence linking secondary metabolism to ecological adaptation. Species-specific differences are also observed in CRISPR sequences, suggesting that variations in phage immunity provide fitness advantages that contribute to the cosmopolitan distribution of S. arenicola 4. The two Salinispora genomes have evolved by complex processes that include the duplication and acquisition of secondary metabolite genes, the products of which provide immediate opportunities for molecular diversification and ecological adaptation. Evidence that secondary metabolic pathways are exchanged by Horizontal Gene Transfer (HGT) yet are fixed among globally distributed populations 5 supports a functional role for their products and suggests that pathway acquisition represents a previously unrecognized force driving bacterial diversification.
Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui
2017-06-01
The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4,873 sequences with 5' upstream regions from 8,530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4,572 TFBSs for the MADS TF family, which was more than twice as many as for the bHLH (1,951), B3 (1,951), HB superfamily (1,914), ERF (1,820), and AP2/ERF (1,725) TFs, and approximately four times as many as for the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs differed significantly among the upstream regions of wheat fl-cDNA sequences. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.
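As a rough illustration of how TFBS counts and positional preferences in upstream regions can be tallied, the sketch below scans a sequence for two well-known consensus motifs: the CArG box, CC(A/T)6GG, bound by MADS-domain proteins, and the E-box, CANNTG, bound by bHLH proteins. The abstract does not state which prediction tool was used, so the regular-expression approach, the toy sequence, and all names here are assumptions for illustration only.

```python
import re

# Consensus motifs written as regular expressions (illustrative subset).
MOTIFS = {
    "MADS (CArG box)": "CC[AT]{6}GG",
    "bHLH (E-box)": "CA[ACGT]{2}TG",
}

def find_sites(upstream, pattern):
    """Start indices of all (possibly overlapping) motif matches."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", upstream.upper())]

def relative_to_gene_start(upstream, starts):
    """Convert indices to negative offsets, assuming the sequence ends at the gene start."""
    return [s - len(upstream) for s in starts]

# Hypothetical 22-bp upstream fragment, not taken from the wheat data set.
toy_upstream = "TTCCATATATGGAACACGTGTT"

for family, pattern in MOTIFS.items():
    starts = find_sites(toy_upstream, pattern)
    offsets = relative_to_gene_start(toy_upstream, starts)
    print(family, "count:", len(starts), "offsets:", offsets)
```

Binning the offsets across many upstream sequences would give the kind of positional distribution the abstract describes, although dedicated tools based on position weight matrices are generally more sensitive than exact consensus matching.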
Ran, Jin-Hua; Shen, Ting-Ting; Liu, Wen-Juan; Wang, Xiao-Quan
2013-01-01
Stomata play significant roles in plant evolution. A trio of closely related basic Helix-Loop-Helix (bHLH) subgroup Ia genes, SPCH, MUTE and FAMA, mediates sequential steps of stomatal development, and their functions may be conserved in land plants. However, the evolutionary history of the putative SPCH/MUTE/FAMA genes is still highly controversial, especially the phylogenetic positions of the bHLH Ia members from basal land plants. To better understand the evolutionary pattern and functional diversity of the bHLH genes involved in stomatal development, we performed a comprehensive evolutionary analysis of the homologous genes from 54 species representing the major lineages of green plants. The phylogenetic analysis indicated: (1) all bHLH Ia genes from the two basal land plants Physcomitrella and Selaginella were closely related to the FAMA genes of seed plants; and (2) the gymnosperm ‘SPCH’ genes were sister to a clade comprising the angiosperm SPCH and MUTE genes, while the FAMA genes of gymnosperms and angiosperms had a sister relationship. The revealed phylogenetic relationships are also supported by the distribution of gene structures and previous functional studies. Therefore, we deduce that the function of FAMA might be ancestral in the bHLH Ia subgroup. In addition, the gymnosperm ‘SPCH’ genes may represent an ancestral state and have a dual function of SPCH and MUTE, two genes that could have originated from a duplication event in the common ancestor of angiosperms. Moreover, in angiosperms, SPCHs have experienced more duplications and harbor more copies than MUTEs and FAMAs, which, together with variation of the stomatal development in the entry division, implies that SPCH might have contributed greatly to the diversity of stomatal development. Based on the above, we propose a model for the correlation between the evolution of stomatal development and the genes involved in this developmental process in land plants. PMID:24244399
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.
Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael
2013-08-01
With the advent of next-generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Genome-wide identification and characterisation of F-box family in maize.
Jia, Fengjuan; Wu, Bingjiang; Li, Hui; Huang, Jinguang; Zheng, Chengchao
2013-11-01
F-box-containing proteins, as key components of the protein degradation machinery, are widely distributed in higher plants and are considered one of the largest known families of regulatory proteins. The F-box protein family plays a crucial role in plant growth and development and in response to biotic and abiotic stresses. However, systematic analysis of the F-box family in maize (Zea mays) has not been reported yet. In this paper, we identified and characterised the maize F-box genes on a genome-wide scale, including phylogenetic analysis, chromosome distribution, gene structure, promoter analysis and gene expression profiles. A total of 359 F-box genes were identified and divided into 15 subgroups by phylogenetic analysis. The F-box domain was relatively conserved, whereas additional motifs outside the F-box domain may indicate the functional diversification of maize F-box genes. These genes were unevenly distributed across the ten maize chromosomes, suggesting that they expanded in the maize genome because of tandem and segmental duplication events. The expression profiles suggested that the maize F-box genes had temporal and spatial expression patterns. Putative cis-acting regulatory DNA elements involved in abiotic stresses were observed in maize F-box gene promoters. The gene expression profiles under abiotic stresses also suggested that some genes participated in stress-responsive pathways. Furthermore, ten genes were chosen for quantitative real-time PCR analysis under drought stress and the results were consistent with the microarray data. This study has produced a comparative genomics analysis of the maize ZmFBX gene family that can be used in further studies to uncover their roles in maize growth and development.
Distribution of mutations in the PEX gene in families with X-linked hypophosphataemic rickets (HYP).
Rowe, P S; Oudet, C L; Francis, F; Sinding, C; Pannetier, S; Econs, M J; Strom, T M; Meitinger, T; Garabedian, M; David, A; Macher, M A; Questiaux, E; Popowska, E; Pronicka, E; Read, A P; Mokrzycki, A; Glorieux, F H; Drezner, M K; Hanauer, A; Lehrach, H; Goulding, J N; O'Riordan, J L
1997-04-01
Mutations in the PEX gene at Xp22.1 (phosphate-regulating gene with homologies to endopeptidases, on the X-chromosome), are responsible for X-linked hypophosphataemic rickets (HYP). Homology of PEX to the M13 family of Zn2+ metallopeptidases which include neprilysin (NEP) as prototype, has raised important questions regarding PEX function at the molecular level. The aim of this study was to analyse 99 HYP families for PEX gene mutations, and to correlate predicted changes in the protein structure with Zn2+ metallopeptidase gene function. Primers flanking 22 characterised exons were used to amplify DNA by PCR, and SSCP was then used to screen for mutations. Deletions, insertions, nonsense mutations, stop codons and splice mutations occurred in 83% of families screened for in all 22 exons, and 51% of a separate set of families screened in 17 PEX gene exons. Missense mutations in four regions of the gene were informative regarding function, with one mutation in the Zn2+-binding site predicted to alter substrate enzyme interaction and catalysis. Computer analysis of the remaining mutations predicted changes in secondary structure, N-glycosylation, protein phosphorylation and catalytic site molecular structure. The wide range of mutations that align with regions required for protease activity in NEP suggests that PEX also functions as a protease, and may act by processing factor(s) involved in bone mineral metabolism.
The genetics of fat distribution.
Schleinitz, Dorit; Böttcher, Yvonne; Blüher, Matthias; Kovacs, Peter
2014-07-01
Fat stored in visceral depots makes obese individuals more prone to complications than does subcutaneous fat. There is good evidence that body fat distribution (FD) is controlled by genetic factors. Waist-to-hip ratio (WHR), a surrogate measure of FD, shows significant heritability of up to ∼60%, even after adjusting for BMI. Genetic variants have been linked to various forms of altered FD such as lipodystrophies; however, the polygenic background of visceral obesity has only been sparsely investigated in the past. Recent genome-wide association studies (GWAS) for measures of FD revealed numerous loci harbouring genes potentially regulating FD. In addition, genes with fat depot-specific expression patterns (in particular subcutaneous vs visceral adipose tissue) provide plausible candidate genes involved in the regulation of FD. Many of these genes are differentially expressed in various fat compartments and correlate with obesity-related traits, thus further supporting their role as potential mediators of metabolic alterations associated with a distinct FD. Finally, developmental genes may at a very early stage determine specific FD in later life. Indeed, genes such as TBX15 not only manifest differential expression in various fat depots, but also correlate with obesity and related traits. Moreover, recent GWAS identified several polymorphisms in developmental genes (including TBX15, HOXC13, RSPO3 and CPEB4) strongly associated with FD. More accurate methods, including cardiometabolic imaging, for assessment of FD are needed to advance our understanding in this field, where the main focus is now to unravel the yet unknown biological function of these novel 'fat distribution genes'.
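To make the phrase "WHR after adjusting for BMI" concrete, the sketch below computes waist-to-hip ratio and BMI for a handful of made-up individuals and then takes the residuals of a simple linear regression of WHR on BMI as a BMI-adjusted trait. The measurements and function names are hypothetical, and real studies typically also adjust for covariates such as age and sex; this only illustrates the arithmetic.

```python
def waist_hip_ratio(waist_cm, hip_cm):
    """WHR: waist circumference divided by hip circumference."""
    return waist_cm / hip_cm

def body_mass_index(weight_kg, height_m):
    """BMI: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

def ols_residuals(x, y):
    """Residuals of y after a least-squares fit of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical measurements: (waist_cm, hip_cm, weight_kg, height_m).
people = [
    (80.0, 100.0, 60.0, 1.70),
    (95.0, 102.0, 85.0, 1.80),
    (102.0, 104.0, 98.0, 1.75),
    (88.0, 98.0, 70.0, 1.65),
    (110.0, 108.0, 110.0, 1.78),
]

whr = [waist_hip_ratio(w, h) for w, h, _, _ in people]
bmi = [body_mass_index(kg, m) for _, _, kg, m in people]
whr_adj_bmi = ols_residuals(bmi, whr)  # variation in WHR not explained by BMI

for raw, adjusted in zip(whr, whr_adj_bmi):
    print(f"WHR = {raw:.3f}, BMI-adjusted residual = {adjusted:+.4f}")
```

By construction the residuals are uncorrelated with BMI, so any heritability estimated on them reflects fat distribution rather than overall adiposity.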
[Genome-wide identification and expression analysis of auxin-related gene families in grape].
Yuan, Hua-zhao; Zhao, Mi-zhen; Wu, Wei-min; Yu, Hong-Mei; Qian, Ya-ming; Wang, Zhuang-wei; Wang, Xi-cheng
2015-07-01
The auxin response gene family adjusts the auxin balance and the growth hormone signaling pathways in plants. Using bioinformatics methods, we identified the auxin-response genes in the grape genome database and analyzed their chromosomal locations, gene collinearity and phylogeny. The probable genes include 25 AUX_IAA, 19 ARF, 9 GH3 and 42 LBD genes, which are unevenly distributed across all 19 chromosomes; some of them form distinct tandem duplicate gene clusters. The available grape microarray databases show that all of the auxin-response genes are expressed in fruit and leaf buds, and are significantly overexpressed during the fruit color-changing, bud break and bud dormancy periods. This paper provides a resource for functional studies of auxin-response genes in grape leaf and fruit development.
Despite extensive genetic, biochemical and structural studies on Escherichia coli RNA polymerase (RNAP), little is known about its location and distribution in response to environmental changes. To visualize the RNAP by fluorescence microscopy in E. coli under different physiological conditions, we constructed a functional rpoC-gfp gene fusion on the chromosome.
FunGene: the functional gene pipeline and repository.
Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R
2013-01-01
Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Non-Maxwellian fast particle effects in gyrokinetic GENE simulations
NASA Astrophysics Data System (ADS)
Di Siena, A.; Görler, T.; Doerk, H.; Bilato, R.; Citrin, J.; Johnson, T.; Schneider, M.; Poli, E.; JET Contributors
2018-04-01
Fast ions have recently been found to significantly impact and partially suppress plasma turbulence in both experimental and numerical studies in a number of scenarios. Understanding the underlying physics and identifying the range of their beneficial effect is an essential task for future fusion reactors, where highly energetic ions are generated through fusion reactions and external heating schemes. However, in many of the gyrokinetic codes fast ions are, for simplicity, treated as equivalent-Maxwellian-distributed particle species, although it is well known that to rigorously model highly non-thermalised particles, a non-Maxwellian background distribution function is needed. To study the impact of this assumption, the gyrokinetic code GENE has recently been extended to support arbitrary background distribution functions, which might be either analytical, e.g., slowing down and bi-Maxwellian, or obtained from numerical fast ion models. A particular JET plasma with strong fast-ion related turbulence suppression is revisited with these new code capabilities, with both linear and nonlinear gyrokinetic simulations. It appears that the fast ion stabilization tends to be less strong but still substantial with more realistic distributions, and this improves the quantitative power balance agreement with experiments.
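The difference between an "equivalent Maxwellian" and a slowing-down background can be illustrated numerically. The sketch below uses the standard isotropic slowing-down form, proportional to 1/(v_c^3 + v^3) below a birth speed v_b and zero above it, and builds the Maxwellian with the same mean square speed. It does not reproduce anything from GENE itself: the parameter values, the normalised units, and the function names are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def maxwellian(v, v_th):
    """Unnormalised isotropic Maxwellian, f(v) ~ exp(-(v / v_th)^2)."""
    return exp(-(v / v_th) ** 2)

def slowing_down(v, v_c, v_b):
    """Unnormalised isotropic slowing-down form, cut off above the birth speed v_b."""
    return 1.0 / (v_c ** 3 + v ** 3) if v <= v_b else 0.0

def speed_moment(f, power, v_max, n=20000):
    """Trapezoid estimate of the integral of 4*pi*v^(2+power)*f(v) on [0, v_max]."""
    dv = v_max / n
    total = 0.0
    for i in range(n + 1):
        v = i * dv
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * 4.0 * pi * v ** (2 + power) * f(v)
    return total * dv

def mean_square_speed(f, v_max):
    """<v^2> of the distribution f, independent of its normalisation."""
    return speed_moment(f, 2, v_max) / speed_moment(f, 0, v_max)

# Illustrative parameters in arbitrary normalised units (assumptions).
V_C, V_B, V_MAX = 0.3, 1.0, 6.0

def fast_ions(v):
    return slowing_down(v, V_C, V_B)

v2_sd = mean_square_speed(fast_ions, V_MAX)

# For f ~ exp(-(v / v_th)^2), <v^2> = 1.5 * v_th^2, so matching the second
# moment fixes the thermal speed of the equivalent Maxwellian.
v_th_equiv = sqrt(2.0 * v2_sd / 3.0)

def equivalent_maxwellian(v):
    return maxwellian(v, v_th_equiv)

v2_mx = mean_square_speed(equivalent_maxwellian, V_MAX)
print(f"<v^2> slowing-down: {v2_sd:.4f}, equivalent Maxwellian: {v2_mx:.4f}")
print(f"equivalent thermal speed: {v_th_equiv:.4f}")
```

Although the two distributions share the same energy content, their shapes differ: the slowing-down form has a sharp cutoff at v_b while the Maxwellian has an exponential tail beyond it, and differences of this kind are what a code supporting arbitrary backgrounds can capture.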
de Luis, Daniel Antonio; Almansa, Raquel; Aller, Rocío; Izaola, Olatz; Romero, E
2017-06-10
Understanding the molecular basis of overweight is an important first step in developing therapeutic pathways against excess body weight gain. The purpose of our pilot study was to evaluate the gene expression profiles in the peripheral blood of obese patients without other metabolic complications. A sample of 17 obese patients without metabolic syndrome and 15 non-obese control subjects was evaluated prospectively. Following the 'One-Color Microarray-Based Gene Expression Analysis' protocol Version 5.7 (Agilent p/n 4140-90040), cRNA was hybridized with the Whole Human Genome Oligo Microarray Kit (Agilent p/n G2519F-014850) containing 41,000+ unique human genes and transcripts. The average age of the study group was 43.6 ± 19.7 years with a sex distribution of 64.7% females and 35.3% males. No statistically significant differences were detected relative to healthy controls (41.9 ± 12.3 years, with a sex distribution of 70% females and 30% males). Obese patients showed 1436 genes that were differentially expressed compared to the control group. Ingenuity Pathway Analysis showed that these genes participated in 13 different categories related to metabolism and cellular functions. In the gene set of cellular function, the most important genes were C-terminal region of Nel-like molecule 1 protein (NELL1) and Pigment epithelium-derived factor (SPEDF); both genes were overexpressed. In the gene set of metabolism, insulin growth factor type 1 (IGF1), ApoA5 (apolipoprotein subtype 5), Foxo4 (Forkhead transcription factor 4), ADIPOR1 (receptor of adiponectin type 1) and AQP7 (aquaporin channel proteins7) were overexpressed. Moreover, PIKFYVE (PtdIns(3) P 5-kinase) and ROCK-2 (rho-kinase II) were underexpressed. We showed that PBMCs from obese subjects presented significant changes in gene expression, exhibiting 1436 differentially expressed genes compared to PBMCs from non-obese subjects. 
Furthermore, our data showed a number of genes involved in relevant processes implicated in metabolism, with genes presenting high fold-change values (up-regulation and down-regulation) associated with lipid, carbohydrate and protein metabolism. Copyright © 2017 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Selecting and validating reference genes for quantitative real-time PCR in Plutella xylostella (L.).
You, Yanchun; Xie, Miao; Vasseur, Liette; You, Minsheng
2018-05-01
Gene expression analysis provides important clues regarding gene functions, and quantitative real-time PCR (qRT-PCR) is a widely used method in gene expression studies. Reference genes are essential for normalizing and accurately assessing gene expression. In the present study, 16 candidate reference genes (ACTB, CyPA, EF1-α, GAPDH, HSP90, NDPk, RPL13a, RPL18, RPL19, RPL32, RPL4, RPL8, RPS13, RPS4, α-TUB, and β-TUB) from Plutella xylostella were selected to evaluate gene expression stability across different experimental conditions using five statistical algorithms (geNorm, NormFinder, Delta Ct, BestKeeper, and RefFinder). The results suggest that different reference genes or combinations of reference genes are suitable for normalization in gene expression studies of P. xylostella according to the different developmental stages, strains, tissues, and insecticide treatments. Based on the given experimental sets, the most stable reference genes were RPS4 across different developmental stages, RPL8 across different strains and tissues, and EF1-α across different insecticide treatments. A comprehensive and systematic assessment of potential reference genes for gene expression normalization is essential for post-genomic functional research in P. xylostella, a notorious pest with worldwide distribution and a high capacity to adapt and develop resistance to insecticides.
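As an illustration of how one of the listed algorithms ranks candidates, the comparative Delta Ct approach can be sketched as follows: for every pair of candidate genes, take the per-sample Ct difference, compute its standard deviation across samples, and score each gene by the mean of the standard deviations of all pairs it belongs to (lower means more stable). This is a minimal sketch, not the study's pipeline; the function name `delta_ct_stability`, the gene labels and the Ct values are all invented for the example.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Score candidate reference genes with a pairwise Delta Ct comparison.

    ct maps a gene label to its Ct values, one per sample, with the samples
    in the same order for every gene. Lower scores mean more stable genes.
    """
    pair_sd = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        # Per-sample Ct difference between the two genes of this pair.
        diffs = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(diffs)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    # A gene's score is the mean SD over all pairs that include it.
    return {gene: mean(sds) for gene, sds in pair_sd.items()}


# Hypothetical Ct values (four samples); not data from the study.
ct_values = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [18.0, 19.0, 20.0, 21.0],
    "geneC": [25.0, 24.0, 27.0, 22.0],
}
stability = delta_ct_stability(ct_values)
ranking = sorted(stability, key=stability.get)  # most stable first
```

In this toy input, geneA and geneB move in parallel across samples, so their pairwise difference has zero spread, and geneC ends up ranked last.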
White noise and synchronization shaping the age structure of the human population
NASA Astrophysics Data System (ADS)
Cebrat, Stanislaw; Biecek, Przemyslaw; Bonkowska, Katarzyna; Kula, Mateusz
2007-06-01
We have modified the standard diploid Penna model of ageing in such a way that, instead of a threshold of defective loci resulting in the genetic death of individuals, fluctuations of the environment and "personal" fluctuations of individuals were introduced. The sum of both fluctuations describes the health status of the individual. While environmental fluctuations are the same for all individuals in the population, the personal component is composed of fluctuations corresponding to each physiological function (gene, genetic locus). It is a fairly widely accepted hypothesis that the physiological parameters of any organism fluctuate highly nonlinearly. A transition to synchronized behavior could be a very strong diagnostic signal of a life-threatening disorder. Thus, in our model, mutations of genes change the chaotic fluctuations representing the function of a wild-type gene into the synchronized signals generated by mutated genes. Genes are switched on chronologically, as in the standard Penna model. The accumulation of defective genes predicted by Medawar's theory of ageing leads to the replacement of the uncorrelated white noise corresponding to a healthy organism by the correlated signals of defective functions. As a result, we obtained an age distribution of the population corresponding to human demographic data.
Diverse Antibiotic Resistance Genes in Dairy Cow Manure
Wichmann, Fabienne; Udikovic-Kolic, Nikolina; Andrew, Sheila; Handelsman, Jo
2014-01-01
Application of manure from antibiotic-treated animals to crops facilitates the dissemination of antibiotic resistance determinants into the environment. However, our knowledge of the identity, diversity, and patterns of distribution of these antibiotic resistance determinants remains limited. We used a new combination of methods to examine the resistome of dairy cow manure, a common soil amendment. Metagenomic libraries constructed with DNA extracted from manure were screened for resistance to beta-lactams, phenicols, aminoglycosides, and tetracyclines. Functional screening of fosmid and small-insert libraries identified 80 different antibiotic resistance genes whose deduced protein sequences were on average 50 to 60% identical to sequences deposited in GenBank. The resistance genes were frequently found in clusters and originated from a taxonomically diverse set of species, suggesting that some microorganisms in manure harbor multiple resistance genes. Furthermore, amid the great genetic diversity in manure, we discovered a novel clade of chloramphenicol acetyltransferases. Our study combined functional metagenomics with third-generation PacBio sequencing to significantly extend the roster of functional antibiotic resistance genes found in animal gut bacteria, providing a particularly broad resource for understanding the origins and dispersal of antibiotic resistance genes in agriculture and clinical settings. PMID:24757214
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2015-01-01
Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. One approach to this analysis is the projection of the amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. Exons encoding functional-site residues showed a reduced frequency of intercodon gaps (phase 0), which may be evidence of the existence of evolutionary limitations to exon shuffling. These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
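For readers unfamiliar with the phase terminology, the following sketch shows the usual convention: a boundary has phase 0 when it falls between two codons, phase 1 when it falls after the first nucleotide of a codon, and phase 2 when it falls after the second. It assumes the input is the list of coding-segment lengths of consecutive exons, starting at the first nucleotide of the start codon. The function name `intron_phases` and the example lengths are hypothetical and are not taken from the paper.

```python
def intron_phases(cds_exon_lengths):
    """Return the phase of each intron between consecutive coding exons.

    cds_exon_lengths lists the coding length (in nucleotides) of each exon,
    in transcript order, starting at the first base of the start codon.
    """
    phases = []
    coding_so_far = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_so_far += length
        # Remainder 0 means the boundary lies exactly between two codons.
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical three-exon gene: 90, 100 and 50 coding nucleotides.
example_phases = intron_phases([90, 100, 50])
```

Here the first boundary follows 90 coding nucleotides (30 whole codons, phase 0), while the second follows 190 nucleotides and therefore splits a codon after its first base (phase 1).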
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress
Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming
2017-01-01
The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, the knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family, and their expression analysis in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. There are 76 pairs of duplicate genes identified, including genome-wide segmental and tandem duplication events, which lead to the expansion of the number of F-box genes. The in silico expression analysis showed that these genes would be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance. PMID:28417911
Futagami, Taiki; Morono, Yuki; Terada, Takeshi; Kaksonen, Anna H.; Inagaki, Fumio
2013-01-01
Halogenated organic matter buried in marine subsurface sediment may serve as a source of electron acceptors for anaerobic respiration of subseafloor microbes. Detection of a diverse array of reductive dehalogenase-homologous (rdhA) genes suggests that subseafloor organohalide-respiring microbial communities may play significant ecological roles in the biogeochemical carbon and halogen cycle in the subseafloor biosphere. We report here the spatial distribution of dehalogenation activity in the Nankai Trough plate-subduction zone of the northwest Pacific off the Kii Peninsula of Japan. Incubation experiments with slurries of sediment collected at various depths and locations showed that degradation of several organohalides tested only occurred in the shallow sedimentary basin, down to 4.7 metres below the seafloor, despite detection of rdhA in the deeper sediments. We studied the phylogenetic diversity of the metabolically active microbes in positive enrichment cultures by extracting RNA, and found that Desulfuromonadales bacteria predominate. In addition, for the isolation of genes involved in the dehalogenation reaction, we performed a substrate-induced gene expression screening on DNA extracted from the enrichment cultures. Diverse DNA fragments were obtained and some of them showed best BLAST hit to known organohalide respirers such as Dehalococcoides, whereas no functionally known dehalogenation-related genes such as rdhA were found, indicating the need to improve the molecular approach to assess functional genes for organohalide respiration. PMID:23479745
Gruber, Ansgar; Kroth, Peter G
2017-09-05
Diatoms are important primary producers in the oceans and can also dominate other aquatic habitats. One reason for the success of this phylogenetically relatively young group of unicellular organisms could be the impressive redundancy and diversity of metabolic isoenzymes in diatoms. This redundancy is a result of the evolutionary origin of diatom plastids by a eukaryote-eukaryote endosymbiosis, a process that implies temporary redundancy of functionally complete eukaryotic genomes. During the establishment of the plastids, this redundancy was partially reduced via gene losses, and was partially retained via gene transfer to the nucleus of the respective host cell. These gene transfers required re-assignment of intracellular targeting signals, a process that simultaneously altered the intracellular distribution of metabolic enzymes compared with the ancestral cells. Genome annotation, the correct assignment of the gene products and the prediction of putative function, strongly depends on the correct prediction of the intracellular targeting of a gene product. Here again diatoms are very peculiar, because the targeting systems for organelle import are partially different to those in land plants. In this review, we describe methods of predicting intracellular enzyme locations, highlight findings of metabolic peculiarities in diatoms and present genome-enabled approaches to study their metabolism. This article is part of the themed issue 'The peculiar carbon metabolism in diatoms'. © 2017 The Author(s).
Luna-Zurita, Luis; Stirnimann, Christian U; Glatt, Sebastian; Kaynak, Bogac L; Thomas, Sean; Baudin, Florence; Samee, Md Abul Hassan; He, Daniel; Small, Eric M; Mileikovsky, Maria; Nagy, Andras; Holloway, Alisha K; Pollard, Katherine S; Müller, Christoph W; Bruneau, Benoit G
2016-02-25
Transcription factors (TFs) are thought to function with partners to achieve specificity and precise quantitative outputs. In the developing heart, heterotypic TF interactions, such as between the T-box TF TBX5 and the homeodomain TF NKX2-5, have been proposed as a mechanism for human congenital heart defects. We report extensive and complex interdependent genomic occupancy of TBX5, NKX2-5, and the zinc finger TF GATA4 coordinately controlling cardiac gene expression, differentiation, and morphogenesis. Interdependent binding serves not only to co-regulate gene expression but also to prevent TFs from distributing to ectopic loci and activating lineage-inappropriate genes. We define preferential motif arrangements for TBX5 and NKX2-5 cooperative binding sites, supported at the atomic level by their co-crystal structure bound to DNA, revealing a direct interaction between the two factors and induced DNA bending. Complex interdependent binding mechanisms reveal tightly regulated TF genomic distribution and define a combinatorial logic for heterotypic TF regulation of differentiation. Copyright © 2016 Elsevier Inc. All rights reserved.
Insect sex determination: it all evolves around transformer.
Verhulst, Eveline C; van de Zande, Louis; Beukeboom, Leo W
2010-08-01
Insects exhibit a variety of sex determining mechanisms including male or female heterogamety and haplodiploidy. The primary signal that starts sex determination is processed by a cascade of genes ending with the conserved switch doublesex that controls sexual differentiation. Transformer is the doublesex splicing regulator and has been found in all examined insects, indicating its ancestral function as a sex-determining gene. Despite this conserved function, the variation in transformer nucleotide sequence, amino acid composition and protein structure can accommodate a multitude of upstream sex determining signals. Transformer regulation of doublesex and its taxonomic distribution indicate that the doublesex-transformer axis is conserved among all insects and that transformer is the key gene around which variation in sex determining mechanisms has evolved.
Lijun Liu; Matthew S. Zinkgraf; H. Earl Petzold; Eric P. Beers; Vladimir Filkov; Andrew Groover
2014-01-01
The class I KNOX homeodomain transcription factor ARBORKNOX1 (ARK1) is a key regulator of vascular cambium maintenance and cell differentiation in Populus. Currently, basic information is lacking concerning the distribution, functional characteristics, and evolution of ARK1 binding in the Populus genome.
Phi Class of Glutathione S-transferase Gene Superfamily Widely Exists in Nonplant Taxonomic Groups.
Munyampundu, Jean-Pierre; Xu, You-Ping; Cai, Xin-Zhong
2016-01-01
Glutathione S-transferases (GSTs) constitute a superfamily of enzymes involved in detoxification of noxious compounds and protection against oxidative damage. GST class Phi (GSTF), one of the important classes of plant GSTs, has long been considered as plant specific but was recently found in basidiomycete fungi. However, the range of nonplant taxonomic groups containing GSTFs remains unknown. In this study, the distribution and phylogenetic relationships of nonplant GSTFs were investigated. We identified GSTFs in ascomycete fungi, myxobacteria, and protists Naegleria gruberi and Aureococcus anophagefferens. GSTF occurrence in these bacteria and protists correlated with their genome sizes and habitats. While this link was missing across ascomycetes, the distribution and abundance of GSTFs among ascomycete genomes could be associated with their lifestyles to some extent. Sequence comparison, gene structure, and phylogenetic analyses indicated divergence among nonplant GSTFs, suggesting polyphyletic origins during evolution. Furthermore, in silico prediction of functional partners suggested functional diversification among nonplant GSTFs.
Preston, Jill C.; Jorgensen, Stacy A.; Jha, Suryatapa G.
2014-01-01
Flowering time is strictly controlled by a combination of internal and external signals that match seed set with favorable environmental conditions. In the model plant species Arabidopsis thaliana (Brassicaceae), many of the genes underlying development and evolution of flowering have been discovered. However, much remains unknown about how conserved the flowering gene networks are in plants with different growth habits, gene duplication histories, and distributions. Here we functionally characterize three homologs of the flowering gene SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) in the short-lived perennial Petunia hybrida (petunia, Solanaceae). Similar to A. thaliana soc1 mutants, co-silencing of duplicated petunia SOC1-like genes results in late flowering. This phenotype is most severe when all three SOC1-like genes are silenced. Furthermore, expression levels of the SOC1-like genes UNSHAVEN (UNS) and FLORAL BINDING PROTEIN 21 (FBP21), but not FBP28, are positively correlated with developmental age. In contrast to A. thaliana, petunia SOC1-like gene expression did not increase with longer photoperiods, and FBP28 transcripts were actually more abundant under short days. Despite evidence of functional redundancy, differential spatio-temporal expression data suggest that SOC1-like genes might fine-tune petunia flowering in response to photoperiod and developmental stage. This likely resulted from modification of SOC1-like gene regulatory elements following recent duplication, and is a possible mechanism to ensure flowering under both inductive and non-inductive photoperiods. PMID:24787903
Factors affecting the concordance between orthologous gene trees and species tree in bacteria.
Castillo-Ramírez, Santiago; González, Víctor
2008-10-30
As originally defined, orthologous genes implied a reflection of the history of the species. In recent years, many studies have examined the concordance between orthologous gene trees and species trees in bacteria. These studies have produced contradictory results that may have been influenced by orthologous gene misidentification and artefactual phylogenetic reconstructions. Here, using a method that allows the detection and exclusion of false positives during identification of orthologous genes, we address the question of whether putative orthologous genes within bacteria really reflect the history of the species. We identified a set of 370 orthologous genes from the bacterial order Rhizobiales. Although manifesting strong vertical signal, almost every orthologous gene had a distinct phylogeny, and the most common topology among the orthologous gene trees did not correspond with the best estimate of the species tree. However, each orthologous gene tree shared an average of 70% of its bipartitions with the best estimate of the species tree. Stochastic error related to gene size affected the concordance between the best estimate of the species tree and the orthologous gene trees, although this effect was weak and distributed unevenly among the functional categories. The nodes showing the greatest discordance were those defined by the shortest internal branches in the best estimate of the species tree. Moreover, a clear bias was evident with respect to the function of the orthologous genes, and the degree of divergence among the orthologous genes appeared to be related to their functional classification. Orthologous genes do not reflect the history of the species when taken as individual markers, but they do when taken as a whole. Stochastic error affected the concordance of orthologous genes with the species tree, albeit weakly. We conclude that two important biological causes of discordance among orthologous genes are incomplete lineage sorting and functional restriction.
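The "shared bipartitions" measure used above can be made concrete with a small sketch. Each internal branch of a tree splits the taxa into two groups (a bipartition); the fraction of a gene tree's non-trivial bipartitions that also occur in the species tree gives a concordance score between 0 and 1. The code below is an illustrative sketch only, not the authors' implementation: it assumes trees are given as nested Python tuples of taxon names, and the helper names and the five-taxon example are invented.

```python
def leaves(tree):
    """Return the set of taxon names under a node (a str or a tuple of nodes)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions (splits) of a tree."""
    all_leaves = leaves(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        other = all_leaves - clade
        # Skip trivial splits that separate fewer than two taxa on a side.
        if len(clade) > 1 and len(other) > 1:
            # Storing the unordered pair makes the split rooting-independent.
            splits.add(frozenset([clade, other]))
        for child in node:
            visit(child)

    visit(tree)
    return splits


def shared_fraction(gene_tree, species_tree):
    """Fraction of the gene tree's bipartitions also found in the species tree."""
    gene_splits = bipartitions(gene_tree)
    species_splits = bipartitions(species_tree)
    if not gene_splits:
        return 1.0  # convention chosen here for trees with no internal splits
    return len(gene_splits & species_splits) / len(gene_splits)


# Hypothetical five-taxon trees, written as nested tuples.
species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "C"), "B"), ("D", "E"))
fraction = shared_fraction(gene_tree, species_tree)
```

In this toy case the two trees agree on the split separating A, B, C from D, E but disagree on which pair is grouped within A, B, C, so half of the gene tree's bipartitions are shared.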
Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu
2017-10-01
The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, known as the TCP domain, which is implicated in DNA binding and protein-protein interactions. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) has been conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. Moreover, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in developmental stages under plant growing conditions. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
The effect of NGATHA altered activity on auxin signaling pathways within the Arabidopsis gynoecium
Martínez-Fernández, Irene; Sanchís, Sofía; Marini, Naciele; Balanzá, Vicente; Ballester, Patricia; Navarrete-Gómez, Marisa; Oliveira, Antonio C.; Colombo, Lucia; Ferrándiz, Cristina
2014-01-01
The four NGATHA genes (NGA) form a small subfamily within the large family of B3-domain transcription factors of Arabidopsis thaliana. NGA genes act redundantly to direct the development of the apical tissues of the gynoecium, the style, and the stigma. Previous studies indicate that NGA genes could exert this function at least partially by directing the synthesis of auxin at the distal end of the developing gynoecium through the upregulation of two different YUCCA genes, which encode flavin monooxygenases involved in auxin biosynthesis. We have compared three developing pistil transcriptome data sets from wild type, nga quadruple mutants, and a 35S::NGA3 line. The differentially expressed genes showed a significant enrichment for auxin-related genes, supporting the idea of NGA genes as major regulators of auxin accumulation and distribution within the developing gynoecium. We have introduced reporter lines for several of these differentially expressed genes involved in synthesis, transport and response to auxin in NGA gain- and loss-of-function backgrounds. We present here a detailed map of the response of these reporters to NGA misregulation that could help to clarify the role of NGA in auxin-mediated gynoecium morphogenesis. Our data point to strongly reduced auxin synthesis in the developing apical gynoecium of nga mutants, likely responsible for the lack of DR5rev::GFP reporter activity observed in these mutants. In addition, altered NGA activity affects the expression of protein kinases that regulate the cellular localization of auxin efflux regulators, and thus likely impacts auxin transport. Finally, protein accumulation in pistils of several ARFs was differentially affected by nga mutations or NGA overexpression, suggesting that these accumulation patterns depend not only on auxin distribution but could also be regulated by transcriptional networks involving NGA factors. PMID:24904608
2010-01-01
Background: Cinnamyl Alcohol Dehydrogenase (CAD) proteins function in lignin biosynthesis and play a critical role in wood development and plant defense against stresses. Previous phylogenetic studies did not include genes from seedless plants and did not reflect the deep evolutionary history of this gene family. We reanalyzed the phylogeny of CAD and CAD-like genes using a representative dataset including lycophyte and bryophyte sequences. Many CAD/CAD-like genes do not seem to be associated with wood development under normal growth conditions. To gain insight into the functional evolution of CAD/CAD-like genes, we analyzed their expression in Populus plant tissues in response to feeding damage by gypsy moth larvae (Lymantria dispar L.). Expression of CAD/CAD-like genes in Populus tissues (xylem, leaves, and bark) was analyzed in herbivore-treated and non-treated plants by real time quantitative RT-PCR. Results: CAD family genes were distributed in three classes based on sequence conservation. All three classes are represented in seedless as well as seed plants, including the class of bona fide lignin pathway genes. The expression of some CAD/CAD-like genes that are not associated with xylem development was induced following herbivore damage in leaves, while other genes were induced only in bark or xylem tissues. Five of the CAD/CAD-like genes, however, showed a shift in expression from one tissue to another between non-treated and herbivore-treated plants. Systemic expression of the CAD/CAD-like genes was generally suppressed. Conclusions: Our results indicated a correlation between the evolution of the CAD gene family and lignin and that the three classes of genes may have evolved in the ancestor of land plants. Our results also suggest that the CAD/CAD-like genes have evolved a diversity of expression profiles and potentially different functions, but that they are nonetheless co-regulated under stress conditions. PMID:20509918
Shang, Haihong; Li, Wei; Zou, Changsong; Yuan, Youlu
2013-07-01
NAC domain proteins are plant-specific transcription factors known to play diverse roles in various plant developmental processes. In the present study, we performed the first comprehensive study of the NAC gene family in Gossypium raimondii Ulbr., incorporating phylogenetic, chromosomal location, gene structure, conserved motif, and expression profiling analyses. We identified 145 NAC transcription factor (NAC-TF) genes that were phylogenetically clustered into 18 distinct subfamilies. Of these, 127 NAC-TF genes were distributed across the 13 chromosomes, 80 (55%) were preferentially retained duplicates located in both duplicated regions and six were located in triplicated chromosomal regions. The majority of NAC-TF genes showed temporal-, spatial-, and tissue-specific expression patterns based on transcriptomic and qRT-PCR analyses. However, the expression patterns of several duplicate genes were partially redundant, suggesting the occurrence of sub-functionalization during their evolution. Based on their genomic organization, we concluded that genomic duplications contributed significantly to the expansion of the NAC-TF gene family in G. raimondii. Comprehensive analysis of their expression profiles could provide novel insights into the functional divergence among members of the NAC gene family in G. raimondii. © 2013 Institute of Botany, Chinese Academy of Sciences.
Singh, Vikash K.; Jain, Mukesh; Garg, Rohini
2014-01-01
The growth hormone auxin regulates various cellular processes by altering the expression of diverse genes in plants. Among the auxin-responsive genes, GH3 genes maintain endogenous auxin homeostasis by conjugating excess auxin with amino acids. GH3 genes have been characterized in many plant species, but not in legumes. In the present work, we identified members of the GH3 gene family and analyzed their chromosomal distribution, gene structure, gene duplication and phylogeny in different legumes, including chickpea, soybean, Medicago, and Lotus. A comprehensive expression analysis in different vegetative and reproductive tissues/stages revealed that many GH3 genes were expressed in a tissue-specific manner. Notably, chickpea CaGH3-3, soybean GmGH3-8 and -25, and Lotus LjGH3-4, -5, -9 and -18 genes were up-regulated in root, indicating their putative role in root development. In addition, chickpea CaGH3-1 and -7, and Medicago MtGH3-7, -8, and -9 were found to be highly induced under drought and/or salt stresses, suggesting their role in abiotic stress responses. We also observed examples of differential expression patterns of duplicated GH3 genes in soybean, indicating their functional diversification. Furthermore, analyses of three-dimensional structures, active site residues and ligand preferences provided molecular insights into the function of GH3 genes in legumes. The analysis presented here should help in investigating the precise function of GH3 genes in legumes during development and under stress conditions. PMID:25642236
Sineokiĭ, S P; Pogosov, V Z; Iankovskiĭ, N K; Krylov, V N
1976-01-01
A total of 123 amber mutants of the lambdoid bacteriophage phi81 were isolated and assigned to 19 complementation groups. Deletion mapping made it possible to locate 5 gene groups on the genetic map of bacteriophage phi81 and to determine the region in which the mm' sticky ends may lie on the prophage genetic map. A phi81 gene that controls adsorption specificity was localized, and its functional similarity to the corresponding gene of phage phi80 was demonstrated.
Wang, Yanping; Boyd, Eric; Crane, Sharron; Lu-Irving, Patricia; Krabbenhoft, David; King, Susan; Dighton, John; Geesey, Gill; Barkay, Tamar
2011-11-01
The distribution and phylogeny of extant protein-encoding genes recovered from geochemically diverse environments can provide insight into the physical and chemical parameters that led to the origin of a functional process and constrained its evolution. Mercuric reductase (MerA) plays an integral role in mercury (Hg) biogeochemistry by catalyzing the transformation of Hg(II) to Hg(0). Putative merA sequences were amplified from DNA extracts of microbial communities associated with mats and sulfur precipitates from physicochemically diverse Hg-containing springs in Yellowstone National Park, Wyoming, using four PCR primer sets that were designed to capture the known diversity of merA. The recovery of novel and deeply rooted MerA lineages from these habitats supports previous evidence that indicates merA originated in a thermophilic environment. Generalized linear models indicate that the distribution of putative archaeal merA lineages was constrained by a combination of pH, dissolved organic carbon, dissolved total mercury and sulfide. The models failed to identify statistically well supported trends for the distribution of putative bacterial merA lineages as a function of these or other measured environmental variables, suggesting either that these lineages were influenced by environmental parameters not considered in the present study, or that the bacterial primer sets were designed to target too broad a class of genes, which may have responded differently to environmental stimuli. The widespread occurrence of merA in the geothermal environments implies a prominent role for Hg detoxification in these environments. Moreover, the differences in the distribution of the merA genes amplified with the four merA primer sets suggest that the organisms putatively engaged in this activity have evolved to occupy different ecological niches within the geothermal gradient. © 2011 Springer Science+Business Media, LLC.
Robustness, evolvability, and the logic of genetic regulation.
Payne, Joshua L; Moore, Jason H; Wagner, Andreas
2014-01-01
In gene regulatory circuits, the expression of individual genes is commonly modulated by a set of regulating gene products, which bind to a gene's cis-regulatory region. This region encodes an input-output function, referred to as signal-integration logic, that maps a specific combination of regulatory signals (inputs) to a particular expression state (output) of a gene. The space of all possible signal-integration functions is vast and the mapping from input to output is many-to-one: For the same set of inputs, many functions (genotypes) yield the same expression output (phenotype). Here, we exhaustively enumerate the set of signal-integration functions that yield identical gene expression patterns within a computational model of gene regulatory circuits. Our goal is to characterize the relationship between robustness and evolvability in the signal-integration space of regulatory circuits, and to understand how these properties vary between the genotypic and phenotypic scales. Among other results, we find that the distributions of genotypic robustness are skewed, so that the majority of signal-integration functions are robust to perturbation. We show that the connected set of genotypes that make up a given phenotype are constrained to specific regions of the space of all possible signal-integration functions, but that as the distance between genotypes increases, so does their capacity for unique innovations. In addition, we find that robust phenotypes are (i) evolvable, (ii) easily identified by random mutation, and (iii) mutationally biased toward other robust phenotypes. We explore the implications of these latter observations for mutation-based evolution by conducting random walks between randomly chosen source and target phenotypes. We demonstrate that the time required to identify the target phenotype is independent of the properties of the source phenotype.
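The many-to-one mapping described in this abstract can be illustrated with a toy enumeration. The sketch below is not the authors' circuit model: it assumes a single gene with k binary regulatory inputs, treats each signal-integration function as a truth table (the "genotype"), and takes the "phenotype" to be the outputs on a hypothetical set of observed input combinations. The function names and the robustness definition (fraction of single-entry truth-table flips that leave the phenotype unchanged) are illustrative assumptions.

```python
from itertools import product

def all_functions(k):
    """Return every Boolean function of k inputs as a truth-table tuple."""
    return list(product((0, 1), repeat=2 ** k))

def row_index(inputs):
    """Map a tuple of binary regulatory signals to its truth-table row."""
    index = 0
    for bit in inputs:
        index = 2 * index + bit
    return index

def phenotype(table, observed_inputs):
    """Expression outputs of one function on the observed input combinations."""
    return tuple(table[row_index(i)] for i in observed_inputs)

def robustness(table, observed_inputs):
    """Fraction of single-entry truth-table flips that keep the phenotype."""
    original = phenotype(table, observed_inputs)
    neutral = 0
    for pos in range(len(table)):
        mutant = list(table)
        mutant[pos] = 1 - mutant[pos]  # flip one entry of the truth table
        if phenotype(mutant, observed_inputs) == original:
            neutral += 1
    return neutral / len(table)

k = 2                      # two regulators (assumed for illustration)
observed = [(1, 1)]        # only the "both regulators present" state is observed
functions = all_functions(k)
same_output = [f for f in functions if phenotype(f, observed) == (1,)]
print(len(functions), len(same_output), robustness((0, 0, 0, 1), observed))
```

With two inputs there are 16 possible functions, and half of them give the same output on the single observed input state, which is the sense in which many genotypes map to one phenotype in this simplified setting.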
Dact genes are chordate specific regulators at the intersection of Wnt and Tgf-β signaling pathways.
Schubert, Frank Richard; Sobreira, Débora Rodrigues; Janousek, Ricardo Guerreiro; Alvares, Lúcia Elvira; Dietrich, Susanne
2014-08-06
Dacts are multi-domain adaptor proteins. They have been implicated in Wnt and Tgfβ signaling and serve as a nodal point in regulating many cellular activities. Dact genes have so far only been identified in bony vertebrates. Also, the number of Dact genes in a given species, the number and roles of protein motifs and functional domains, and the overlap of gene expression domains are all not clear. To address these problems, we have taken an evolutionary approach, screening for Dact genes in the animal kingdom and establishing their phylogeny and the synteny of Dact loci. Furthermore, we performed a deep analysis of the various Dact protein motifs and compared the expression patterns of different Dacts. Our study identified previously unrecognized dact genes and showed that they evolved late in the deuterostome lineage. In gnathostomes, four Dact genes were generated by the two rounds of whole genome duplication in the vertebrate ancestor, with Dact1/3 and Dact2/4, respectively, arising from the two genes generated during the first genome duplication. In actinopterygians, a further dact4r gene arose from retrotranscription. The third genome duplication in the teleost ancestor and subsequent gene loss in most gnathostome lineages left extant species with a subset of Dact genes. The distribution of functional domains suggests that the ancestral Dact function lay with Wnt signaling, and a role in Tgfβ signaling may have emerged with the Dact2/4 ancestor. Motif reduction, in particular in Dact4, suggests that this protein may counteract the function of the other Dacts. Dact genes were expressed in both distinct and overlapping domains, suggesting possible combinatorial function. The gnathostome Dact gene family comprises four members, derived from a chordate-specific ancestor. The ability to control Wnt signaling seems to be part of the ancestral repertoire of Dact functions, while the ability to inhibit Tgfβ signaling and to carry out specialized, ortholog-specific roles may have evolved later. The complement of Dact genes coexpressed in a tissue provides a complex way to fine-tune Wnt and Tgfβ signaling. Our work provides the basis for future structural and functional studies aimed at unraveling intracellular regulatory networks.
Weighted functional linear regression models for gene-based association analysis.
Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I
2018-01-01
Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
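The idea of allele-frequency weights drawn from the beta distribution can be sketched in a few lines. This is only an illustration, not the FREGAT implementation: the abstract does not give the shape parameters, so a = 1 and b = 25 are assumed here because that choice is often seen in kernel-based tests, and the minor allele frequencies are made-up values. The weight of a variant is taken to be the beta density evaluated at its minor allele frequency (MAF).

```python
import math

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def variant_weights(mafs, a=1.0, b=25.0):
    """Weight each variant by the beta density at its minor allele frequency.

    a and b are assumed shape parameters, not values taken from the abstract.
    """
    return [beta_pdf(m, a, b) for m in mafs]

# Hypothetical minor allele frequencies for four variants in one gene.
mafs = [0.001, 0.01, 0.05, 0.20]
weights = variant_weights(mafs)
for maf, w in zip(mafs, weights):
    print(f"MAF={maf:<6} weight={w:.3f}")
```

With these assumed parameters the density falls steeply as MAF grows, so rare variants receive much larger weights than common ones.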
The miR-29 family: genomics, cell biology, and relevance to renal and cardiovascular injury.
Kriegel, Alison J; Liu, Yong; Fang, Yi; Ding, Xiaoqiang; Liang, Mingyu
2012-02-27
The human miR-29 family of microRNAs has three mature members, miR-29a, miR-29b, and miR-29c. miR-29s are encoded by two gene clusters. Binding sites for several transcriptional factors have been identified in the promoter regions of miR-29 genes. The miR-29 family members share a common seed region sequence and are predicted to target largely overlapping sets of genes. However, the miR-29 family members exhibit differential regulation in several cases and different subcellular distribution, suggesting their functional relevance may not be identical. miR-29s directly target at least 16 extracellular matrix genes, providing a dramatic example of a single microRNA targeting a large group of functionally related genes. Strong antifibrotic effects of miR-29s have been demonstrated in heart, kidney, and other organs. miR-29s have also been shown to be proapoptotic and involved in the regulation of cell differentiation. It remains to be explored how various cellular effects of miR-29s determine functional relevance of miR-29s to specific diseases and how the miR-29 family members may function cooperatively or separately.
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A.; Marks, Jonathan A.; Haiser, Henry J.; Turnbaugh, Peter J.
2015-01-01
Elucidation of the molecular mechanisms underlying the human gut microbiota’s effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. PMID:25873372
2012-01-01
Background Single nucleotide polymorphism (SNP) validation and large-scale genotyping are required to maximize the use of DNA sequence variation and determine the functional relevance of candidate genes for complex stress tolerance traits through genetic association in rice. We used the bead array platform-based Illumina GoldenGate assay to validate and genotype SNPs in a select set of stress-responsive genes to understand their functional relevance and study the population structure in rice. Results Of the 384 putative SNPs assayed, we successfully validated and genotyped 362 (94.3%). Of these 325 (84.6%) showed polymorphism among the 91 rice genotypes examined. Physical distribution, degree of allele sharing, admixtures and introgression, and amino acid replacement of SNPs in 263 abiotic and 62 biotic stress-responsive genes provided clues for identification and targeted mapping of trait-associated genomic regions. We assessed the functional and adaptive significance of validated SNPs in a set of contrasting drought tolerant upland and sensitive lowland rice genotypes by correlating their allelic variation with amino acid sequence alterations in catalytic domains and three-dimensional secondary protein structure encoded by stress-responsive genes. We found a strong genetic association among SNPs in the nine stress-responsive genes with upland and lowland ecological adaptation. Higher nucleotide diversity was observed in indica accessions compared with other rice sub-populations based on different population genetic parameters. The inferred ancestry of 16% among rice genotypes was derived from admixed populations with the maximum between upland aus and wild Oryza species. Conclusions SNPs validated in biotic and abiotic stress-responsive rice genes can be used in association analyses to identify candidate genes and develop functional markers for stress tolerance in rice. PMID:22921105
Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne
2015-01-01
Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026
Wang, Miao-Ying; Zhao, Pi-Ming; Cheng, Huan-Qing; Han, Li-Bo; Wu, Xiao-Min; Gao, Peng; Wang, Hai-Yun; Yang, Chun-Lin; Zhong, Nai-Qin; Zuo, Jian-Ru; Xia, Gui-Xian
2013-07-01
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play crucial roles in development, but their functional mechanisms remain largely unknown. Here, we characterized the cellular functions of the class I TCP transcription factor GhTCP14 from upland cotton (Gossypium hirsutum). GhTCP14 is expressed predominantly in fiber cells, especially at the initiation and elongation stages of development, and its expression increased in response to exogenous auxin. Induced heterologous overexpression of GhTCP14 in Arabidopsis (Arabidopsis thaliana) enhanced initiation and elongation of trichomes and root hairs. In addition, root gravitropism was severely affected, similar to mutant of the auxin efflux carrier PIN-FORMED2 (PIN2) gene. Examination of auxin distribution in GhTCP14-expressing Arabidopsis by observation of auxin-responsive reporters revealed substantial alterations in auxin distribution in sepal trichomes and root cortical regions. Consistent with these changes, expression of the auxin uptake carrier AUXIN1 (AUX1) was up-regulated and PIN2 expression was down-regulated in the GhTCP14-expressing plants. The association of GhTCP14 with auxin responses was also evidenced by the enhanced expression of auxin response gene IAA3, a gene in the AUXIN/INDOLE-3-ACETIC ACID (Aux/IAA) family. Electrophoretic mobility shift assays showed that GhTCP14 bound the promoters of PIN2, IAA3, and AUX1, and transactivation assays indicated that GhTCP14 had transcription activation activity. Taken together, these results demonstrate that GhTCP14 is a dual-function transcription factor able to positively or negatively regulate expression of auxin response and transporter genes, thus potentially acting as a crucial regulator in auxin-mediated differentiation and elongation of cotton fiber cells. PMID:23715527
Emergence of the self-similar property in gene expression dynamics
NASA Astrophysics Data System (ADS)
Ochiai, T.; Nacher, J. C.; Akutsu, T.
2007-08-01
Many theoretical models have recently been proposed to understand the structure of cellular systems composed of various types of elements (e.g., proteins, metabolites and genes) and their interactions. However, the cell is a highly dynamic system with thousands of functional elements fluctuating across temporal states. Therefore, structural analysis alone is not sufficient to reproduce the cell's observed behavior. In this article, we analyze gene expression dynamics (i.e., how the number of mRNA molecules in a cell fluctuates in time) by using a new constructive approach, which reveals a symmetry embedded in gene expression fluctuations and characterizes the dynamical equation of gene expression (i.e., a specific stochastic differential equation). First, by using experimental data of human and yeast gene expression time series, we found a symmetry in the short-time transition probability from time t to time t+1. We call it self-similarity symmetry (i.e., the gene expression short-time fluctuations contain a repeating pattern of smaller and smaller parts that are like the whole, but different in size). Second, we reconstruct the global behavior of the observed distribution of gene expression (i.e., scaling-law) and the local behavior of the power-law tail of this distribution. This approach may represent a step toward an integrated image of the basic elements of the whole cell.
Sanzol, Javier
2010-05-14
Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplications of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication, an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs, those that originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies allowed evaluation of the role of both polyploidy and small-scale duplications during this process. Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.
Kasacka, I
2009-06-01
Most research on calcitonin gene-related peptide (CGRP) in the stomach in hypertension has been devoted to submucosal blood flow, and no attention has been paid to its quantitative distribution in the gastric neuroendocrine cells. The aim of the present study was to examine the number and distribution of CGRP-containing cells in the pylorus in the "two kidney, one clip" (2K1C) renovascular hypertension model in rats. The studies were carried out on the stomachs of rats. Six weeks after the renal artery clipping procedure, eight 2K1C rats had developed stable hypertension. Hypertension significantly increased the number of pyloric endocrine cells immunoreactive to CGRP antisera. The differences between the hypertensive rats and the control group concerned not only the number of endocrine cells but also their distribution. CGRP participates in the regulation of cardiovascular functions both in the normal state and in the pathophysiology of hypertension through interactions with the prohypertensive systems. The changes induced by hypertension in the CGRP-containing neuroendocrine cells of the rats are discussed.
Aubourg, Sébastien; Brunaud, Véronique; Bruyère, Clémence; Cock, Mark; Cooke, Richard; Cottet, Annick; Couloux, Arnaud; Déhais, Patrice; Deléage, Gilbert; Duclert, Aymeric; Echeverria, Manuel; Eschbach, Aimée; Falconet, Denis; Filippi, Ghislain; Gaspin, Christine; Geourjon, Christophe; Grienenberger, Jean-Michel; Houlné, Guy; Jamet, Elisabeth; Lechauve, Frédéric; Leleu, Olivier; Leroy, Philippe; Mache, Régis; Meyer, Christian; Nedjari, Hafed; Negrutiu, Ioan; Orsini, Valérie; Peyretaillade, Eric; Pommier, Cyril; Raes, Jeroen; Risler, Jean-Loup; Rivière, Stéphane; Rombauts, Stéphane; Rouzé, Pierre; Schneider, Michel; Schwob, Philippe; Small, Ian; Soumayet-Kampetenga, Ghislain; Stankovski, Darko; Toffano, Claire; Tognolli, Michael; Caboche, Michel; Lecharny, Alain
2005-01-01
Genomic projects heavily depend on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that, improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot. PMID:15608279
Yang, Jing; Wang, Chao; Wu, Jinyu; Liu, Li; Zhang, Gang
2014-01-01
The genus Exiguobacterium can adapt readily to, and survive in, diverse environments. Our study demonstrated that Exiguobacterium sp. strain S3-2, isolated from marine sediment, is resistant to five antibiotics. The plasmid pMC1 in this strain carries seven putative resistance genes. We functionally characterized these resistance genes in Escherichia coli, and genes encoding dihydrofolate reductase and macrolide phosphotransferase were considered novel resistance genes based on their low similarities to known resistance genes. The plasmid G+C content distribution was highly heterogeneous. Only the G+C content of one block, which shared significant similarity with a plasmid from Exiguobacterium arabatum, fit well with the mean G+C content of the host. The remainder of the plasmid was composed of mobile elements with a markedly lower G+C ratio than the host. Interestingly, five mobile elements located on pMC1 showed significant similarities to sequences found in pathogens. Our data provided an example of the link between resistance genes in strains from the environment and the clinic and revealed the aggregation of antibiotic resistance genes in bacteria isolated from fish farms. PMID:24362420
Are there laws of genome evolution?
Koonin, Eugene V
2011-08-01
Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law-like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model. These models do not explicitly incorporate selection; therefore, the observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles. Although a complete physical theory of evolutionary biology is inconceivable, the universals of genome evolution might qualify as "laws of evolutionary genomics" in the same sense "law" is understood in modern physics.
SDN-1/syndecan regulates growth factor signaling in distal tip cell migrations in C. elegans.
Schwabiuk, Megan; Coudiere, Ludivine; Merz, David C
2009-10-01
Mutations in the sdn-1/syndecan gene act as genetic enhancers of the ventral-to-dorsal distal tip cell (DTC) migration defects caused by a weak allele of the netrin receptor gene unc-5. The sdn-1(ev697) allele was identified in a genetic screen for enhancers of unc-5 DTC migration defects, and carried a nonsense mutation predicted to truncate the SDN-1 protein prior to the transmembrane domain. The enhancement of unc-5 caused by an sdn-1 mutation was rescued by expression of wild-type sdn-1 in the hypodermis or nervous system rather than the DTCs, indicating a cell non-autonomous function of sdn-1. The enhancement was also partially reversed by mutations in the egl-17/FGF or egl-20/Wnt genes, suggesting that sdn-1 affects UNC-5 function through a mis-regulation of signaling in growth factor pathways. egl-20 reporter constructs exhibited increased and mis-localized EGL-20 distribution in sdn-1 mutants compared to wild-type animals. Finally, using loss of function mutations, we show that egl-17/Fgf and egl-20/Wnt are partially redundant in regulating the migration pattern of the posterior DTC, as double mutants exhibit significant frequencies of defects in migration phases along both the anteroposterior and dorsoventral axes. Together these results suggest that SDN-1 affects UNC-5 function by regulating the proper extracellular distribution of growth factors.
Widespread antisense transcription of Populus genome under drought.
Yuan, Yinan; Chen, Su
2018-06-06
Antisense transcription is widespread in many genomes and plays important regulatory roles in gene expression. The objective of our study was to investigate the extent and functional relevance of antisense transcription in forest trees. We employed Populus, a model tree species, to probe the antisense transcriptional response of tree genome under drought, through stranded RNA-seq analysis. We detected nearly 48% of annotated Populus gene loci with antisense transcripts and 44% of them with co-transcription from both DNA strands. Global distribution of reads pattern across annotated gene regions uncovered that antisense transcription was enriched in untranslated regions while sense reads were predominantly mapped in coding exons. We further detected 1185 drought-responsive sense and antisense gene loci and identified a strong positive correlation between the expression of antisense and sense transcripts. Additionally, we assessed the antisense expression in introns and found a strong correlation between intronic expression and exonic expression, confirming antisense transcription of introns contributes to transcriptional activity of Populus genome under drought. Finally, we functionally characterized drought-responsive sense-antisense transcript pairs through gene ontology analysis and discovered that functional groups including transcription factors and histones were concordantly regulated at both sense and antisense transcriptional level. Overall, our study demonstrated the extensive occurrence of antisense transcripts of Populus genes under drought and provided insights into genome structure, regulation pattern and functional significance of drought-responsive antisense genes in forest trees. Datasets generated in this study serve as a foundation for future genetic analysis to improve our understanding of gene regulation by antisense transcription.
[Advance of the study on LRRK2 gene in Parkinson's disease].
Zhang, Yu; Chen, Shengdi
2008-12-01
The leucine-rich repeat kinase 2 (LRRK2) gene has been identified as the gene causing autosomal dominant inherited Parkinson's disease (PD) 8. The clinical features of this type of PD are similar to those of idiopathic PD, but the pathological changes are diverse. The mutation types and frequencies of LRRK2 are unevenly distributed among populations. LRRK2 is a large, complex, multifunctional protein that is widely expressed in the human body. Sequence alignment suggests that LRRK2 might be a multifunctional kinase that phosphorylates substrates and might also act as a scaffolding protein. Further study of the physiological function and pathogenic mechanism of LRRK2 will help to clarify the pathogenesis of PD and to identify new treatments.
GATA simple sequence repeats function as enhancer blocker boundaries.
Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K
2013-01-01
Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate a role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in the transgenic as well as the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
Dynamics and Context-Dependent Roles of DNA Methylation.
Ambrosi, Christina; Manzo, Massimiliano; Baubec, Tuncay
2017-05-19
DNA methylation is one of the most extensively studied epigenetic marks. It is involved in transcriptional gene silencing and plays important roles during mammalian development. Its perturbation is often associated with human diseases. In mammalian genomes, DNA methylation is a prevalent modification that decorates the majority of cytosines. It is found at the promoters and enhancers of inactive genes, at repetitive elements, and within transcribed gene bodies. Its presence at promoters is dynamically linked to gene activity, suggesting that it could directly influence gene expression patterns and cellular identity. The genome-wide distribution and dynamic behaviour of this mark have been studied in great detail in a variety of tissues and cell lines, including early embryonic development and in embryonic stem cells. In combination with functional studies, these genome-wide maps of DNA methylation revealed interesting features of this mark and provided important insights into its dynamic nature and potential functional role in genome regulation. In this review, we discuss how these recent observations, in combination with insights obtained from biochemical and functional genetics studies, have expanded our current knowledge about the regulation and context-dependent roles of DNA methylation in mammalian genomes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Vascular gene expression: a hypothesis
Martínez-Navarro, Angélica C.; Galván-Gordillo, Santiago V.; Xoconostle-Cázares, Beatriz; Ruiz-Medrano, Roberto
2013-01-01
The phloem is the conduit through which photoassimilates are distributed from autotrophic to heterotrophic tissues and is involved in the distribution of signaling molecules that coordinate plant growth and responses to the environment. Phloem function depends on the coordinate expression of a large array of genes. We have previously identified conserved motifs in upstream regions of the Arabidopsis genes, encoding the homologs of pumpkin phloem sap mRNAs, displaying expression in vascular tissues. This tissue-specific expression in Arabidopsis is predicted by the overrepresentation of GA/CT-rich motifs in gene promoters. In this work we have searched for common motifs in upstream regions of the homologous genes from plants considered to possess a “primitive” vascular tissue (a lycophyte), as well as from others that lack a true vascular tissue (a bryophyte), and finally from chlorophytes. Both lycophyte and bryophyte display motifs similar to those found in Arabidopsis with a significantly low E-value, while the chlorophytes showed either a different conserved motif or no conserved motif at all. These results suggest that these same genes are expressed coordinately in non-vascular plants; this coordinate expression may have been one of the prerequisites for the development of conducting tissues in plants. We have also analyzed the phylogeny of conserved proteins that may be involved in phloem function and development. The presence of CmPP16, APL, FT, and YDA in chlorophytes suggests the recruitment of ancient regulatory networks for the development of the vascular tissue during evolution while OPS is a novel protein specific to vascular plants. PMID:23882276
Zhao, Yang; Zhou, Yuqiong; Jiang, Haiyang; Li, Xiaoyu; Gan, Defang; Peng, Xiaojian; Zhu, Suwen; Cheng, Beijiu
2011-01-01
Background Members of the homeodomain-leucine zipper (HD-Zip) gene family encode transcription factors that are unique to plants and have diverse functions in plant growth and development such as various stress responses, organ formation and vascular development. Although systematic characterization of this family has been carried out in Arabidopsis and rice, little is known about HD-Zip genes in maize (Zea mays L.). Methods and Findings In this study, we described the identification and structural characterization of HD-Zip genes in the maize genome. A complete set of 55 HD-Zip genes (Zmhdz1-55) were identified in the maize genome using Blast search tools and categorized into four classes (HD-Zip I-IV) based on phylogeny. Chromosomal location of these genes revealed that they are distributed unevenly across all 10 chromosomes. Segmental duplication contributed largely to the expansion of the maize HD-ZIP gene family, while tandem duplication was only responsible for the amplification of the HD-Zip II genes. Furthermore, most of the maize HD-Zip I genes were found to contain an overabundance of stress-related cis-elements in their promoter sequences. The expression levels of the 17 HD-Zip I genes under drought stress were also investigated by quantitative real-time PCR (qRT-PCR). All of the 17 maize HD-ZIP I genes were found to be regulated by drought stress, and the duplicated genes within a sister pair exhibited the similar expression patterns, suggesting their conserved functions during the process of evolution. Conclusions Our results reveal a comprehensive overview of the maize HD-Zip gene family and provide the first step towards the selection of Zmhdz genes for cloning and functional research to uncover their roles in maize growth and development. PMID:22164299
Hider, Jessica L; Gittelman, Rachel M; Shah, Tapan; Edwards, Melissa; Rosenbloom, Arnold; Akey, Joshua M; Parra, Esteban J
2013-07-12
Currently, there is very limited knowledge about the genes involved in normal pigmentation variation in East Asian populations. We carried out a genome-wide scan of signatures of positive selection using the 1000 Genomes Phase I dataset, in order to identify pigmentation genes showing putative signatures of selective sweeps in East Asia. We applied a broad range of methods to detect signatures of selection including: 1) Tests designed to identify deviations of the Site Frequency Spectrum (SFS) from neutral expectations (Tajima's D, Fay and Wu's H and Fu and Li's D* and F*), 2) Tests focused on the identification of high-frequency haplotypes with extended linkage disequilibrium (iHS and Rsb) and 3) Tests based on genetic differentiation between populations (LSBL). Based on the results obtained from a genome-wide analysis of 25 kb windows, we constructed an empirical distribution for each statistic across all windows, and identified pigmentation genes that are outliers in the distribution. Our tests identified twenty genes that are relevant for pigmentation biology. Of these, eight genes (ATRN, EDAR, KLHL7, MITF, OCA2, TH, TMEM33 and TRPM1) were extreme outliers (top 0.1% of the empirical distribution) for at least one statistic, and twelve genes (ADAM17, BNC2, CTSD, DCT, EGFR, LYST, MC1R, MLPH, OPRM1, PDIA6, PMEL (SILV) and TYRP1) were in the top 1% of the empirical distribution for at least one statistic. Additionally, eight of these genes (BNC2, EGFR, LYST, MC1R, OCA2, OPRM1, PMEL (SILV) and TYRP1) have been associated with pigmentary traits in association studies. We identified a number of putative pigmentation genes showing extremely unusual patterns of genetic variation in East Asia. Most of these genes are outliers for different tests and/or different populations, and have already been described in previous scans for positive selection, providing strong support to the hypothesis that recent selective sweeps left a signature in these regions. However, it will be necessary to carry out association and functional studies to demonstrate the implication of these genes in normal pigmentation variation. PMID:23848512
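The empirical-outlier approach described in the abstract above (ranking a per-window statistic and flagging the top 1% or top 0.1%) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the function name, the window identifiers and the toy values are assumptions, and it assumes that larger values are more extreme. For statistics where strongly negative values indicate a sweep (Tajima's D, for example), the ranking direction would need to be reversed.

```python
def top_fraction_windows(stats: dict[str, float], fraction: float) -> list[str]:
    """Return window ids in the top `fraction` of the empirical distribution.

    Assumes larger statistic values are more extreme (an assumption of this sketch).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank windows from most to least extreme.
    ranked = sorted(stats, key=stats.get, reverse=True)
    # Keep at least one window, even for very small fractions.
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[:n_keep]


# Hypothetical toy data: 1000 windows whose statistic equals their index.
toy_stats = {f"win{i}": float(i) for i in range(1000)}
top_1_percent = top_fraction_windows(toy_stats, 0.01)
top_01_percent = top_fraction_windows(toy_stats, 0.001)
print(len(top_1_percent), top_01_percent)
```

A gene would then be reported as an outlier for a given statistic if any window overlapping it appears in the returned list.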
NASA Astrophysics Data System (ADS)
Zhu, Y. G.
2015-12-01
In addition to material and energy flows, the dynamics and functions of the Earth's critical zone are intensively mediated by biological actions performed by diverse organisms. These biological actions are modulated by the expression of functional genes and their translation into enzymes that catalyze geochemical reactions, such as nutrient turnover and pollutant biodegradation. Although geobiology, as an interdisciplinary research area, is playing a vital role in linking biological and geochemical processes at different temporal and spatial scales, the distribution and transport of functional genes have rarely been investigated from the perspective of the Earth's critical zone. To illustrate the framework of studies on the transport and transformation of genetic information in the critical zone, antibiotic resistance is taken as an example. Antibiotic resistance genes are considered a group of emerging contaminants; their emergence and spread within the critical zone are, on the one hand, induced by anthropogenic activities and, on the other hand, a threat to human health worldwide. The transport and transformation of antibiotic resistance genes are controlled by both horizontal gene transfer between bacterial cells and the movement of bacteria harboring antibiotic resistance genes. In this paper, the fate and behavior of antibiotic resistance genes will be discussed in the following aspects: 1) general overview of environmental antibiotic resistance; 2) high-throughput quantification of the resistome in various environmental media; 3) pathways of resistance gene flow within the critical zone; and 4) potential strategies for mitigating antibiotic resistance, particularly from the critical zone perspective.
2016-10-01
Androgens are hormones that play a critical...role in stimulating prostate cancer growth. Androgens activate a protein called the androgen receptor (AR), which regulates genes involved in cell
The human visual cortex responds to gene therapy–mediated recovery of retinal function
Ashtari, Manzar; Cyckowski, Laura L.; Monroe, Justin F.; Marshall, Kathleen A.; Chung, Daniel C.; Auricchio, Alberto; Simonelli, Francesca; Leroy, Bart P.; Maguire, Albert M.; Shindler, Kenneth S.; Bennett, Jean
2011-01-01
Leber congenital amaurosis (LCA) is a rare degenerative eye disease, linked to mutations in at least 14 genes. A recent gene therapy trial in patients with LCA2, who have mutations in RPE65, demonstrated that subretinal injection of an adeno-associated virus (AAV) carrying the normal cDNA of that gene (AAV2-hRPE65v2) could markedly improve vision. However, it remains unclear how the visual cortex responds to recovery of retinal function after prolonged sensory deprivation. Here, 3 of the gene therapy trial subjects, treated at ages 8, 9, and 35 years, underwent functional MRI within 2 years of unilateral injection of AAV2-hRPE65v2. All subjects showed increased cortical activation in response to high- and medium-contrast stimuli after exposure to the treated compared with the untreated eye. Furthermore, we observed a correlation between the visual field maps and the distribution of cortical activations for the treated eyes. These data suggest that despite severe and long-term visual impairment, treated LCA2 patients have intact and responsive visual pathways. In addition, these data suggest that gene therapy resulted in not only sustained and improved visual ability, but also enhanced contrast sensitivity. PMID:21606598
Liu, Fuli; Hu, Zimin; Liu, Wenhui; Li, Jingjing; Wang, Wenjun; Liang, Zhourui; Wang, Feijiu; Sun, Xiutao
2016-01-01
Mining microsatellites from transcriptome data to develop markers has become increasingly common. However, the possible functions of microsatellites are rarely characterized. In this study, we explored microsatellites in the transcriptome of the brown alga Sargassum thunbergii, characterized their frequencies, distribution, function and evolution, and developed primers to validate them. Our results showed that tri-nucleotide repeats were the most abundant, followed by di- and mono-nucleotide repeats. Microsatellite length was significantly affected by repeat motif size. The density of microsatellites in the CDS region was significantly lower than that in the UTR region. Annotation of the microsatellite-containing transcripts showed that 573 transcripts have GO terms and can be categorized into 42 groups. Pathway enrichment showed that microsatellites were significantly overrepresented in genes involved in pathways such as ubiquitin-mediated proteolysis, RNA degradation and the spliceosome. Primers flanking 961 microsatellite loci were designed, and of the 30 primer pairs randomly selected for testing, 23 proved effective. These findings provide new insight into the function and evolution of microsatellites in the transcriptome, and the microsatellite loci identified within annotated genes will be useful for developing functional markers in S. thunbergii. PMID:26732855
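Microsatellite mining of the kind described in the abstract above can be sketched with a regular expression that finds short motifs repeated in tandem. This is a hedged illustration rather than the tool used in the study: the function name, the minimum repeat count of 4, the maximum motif length of 3 and the toy sequence are all assumptions.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4, max_motif: int = 3):
    """Return (motif, repeat_count, start) for each tandem repeat found.

    The lazy quantifier tries the shortest motif first, so "AAAA" is
    reported as motif "A" rather than motif "AA".
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((motif, repeat_count, match.start()))
    return hits


# Hypothetical toy transcript with one di- and one tri-nucleotide repeat.
toy_transcript = "TTC" + "AG" * 5 + "TTCC" + "GAT" * 4 + "GCC"
ssr_hits = find_microsatellites(toy_transcript)
print(ssr_hits)
```

Counting hits by motif length would give the mono-, di- and tri-nucleotide frequencies, and the reported start positions could be compared against CDS and UTR coordinates to estimate density per region.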
Massive expansion of the calpain gene family in unicellular eukaryotes.
Zhao, Sen; Liang, Zhe; Demko, Viktor; Wilson, Robert; Johansen, Wenche; Olsen, Odd-Arne; Shalchian-Tabrizi, Kamran
2012-09-29
Calpains are Ca2+-dependent cysteine proteases that participate in a range of crucial cellular processes. Dysfunction of these enzymes may cause, for instance, life-threatening diseases in humans, the loss of sex determination in nematodes and embryo lethality in plants. Although the calpain family is well characterized in animal and plant model organisms, there is a great lack of knowledge about these genes in unicellular eukaryote species (i.e. protists). Here, we study the distribution and evolution of calpain genes in a wide range of eukaryote genomes from major branches in the tree of life. Our investigations reveal 24 types of protein domains that are combined with the calpain-specific catalytic domain CysPc. In total we identify 41 different calpain domain architectures, 28 of these domain combinations have not been previously described. Based on our phylogenetic inferences, we propose that at least four calpain variants were established in the early evolution of eukaryotes, most likely before the radiation of all the major supergroups of eukaryotes. Many domains associated with eukaryotic calpain genes can be found among eubacteria or archaebacteria but never in combination with the CysPc domain. The analyses presented here show that ancient modules present in prokaryotes, and a few de novo eukaryote domains, have been assembled into many novel domain combinations along the evolutionary history of eukaryotes. Some of the new calpain genes show a narrow distribution in a few branches in the tree of life, likely representing lineage-specific innovations. Hence, the functionally important classical calpain genes found among humans and vertebrates make up only a tiny fraction of the calpain family. In fact, a massive expansion of the calpain family occurred by domain shuffling among unicellular eukaryotes and contributed to a wealth of functionally different genes.
Bek-Thomsen, Malene; Poulsen, Knud; Kilian, Mogens
2012-01-01
The distribution, genome location, and evolution of the four paralogous zinc metalloproteases, IgA1 protease, ZmpB, ZmpC, and ZmpD, in Streptococcus pneumoniae and related commensal species were studied by in silico analysis of whole genomes and by activity screening of 154 representatives of 20 species. ZmpB was ubiquitous in the Mitis and Salivarius groups of the genus Streptococcus and in the genera Gemella and Granulicatella, with the exception of a fragmented gene in Streptococcus thermophilus, the only species with a nonhuman habitat. IgA1 protease activity was observed in all members of S. pneumoniae, S. pseudopneumoniae, S. oralis, S. sanguinis, and Gemella haemolysans, was variably present in S. mitis and S. infantis, and absent in S. gordonii, S. parasanguinis, S. cristatus, S. oligofermentans, S. australis, S. peroris, and S. suis. Phylogenetic analysis of 297 zmp sequences and representative housekeeping genes provided evidence for an unprecedented selection for genetic diversification of the iga, zmpB, and zmpD genes in S. pneumoniae and evidence of very frequent intraspecies transfer of entire genes and combination of genes. Presumably due to their adaptation to a commensal lifestyle, largely unaffected by adaptive mucosal immune factors, the corresponding genes in commensal streptococci have remained conserved. The widespread distribution and significant sequence diversity indicate an ancient origin of the zinc metalloproteases predating the emergence of the humanoid species. zmpB, which appears to be the ancestral gene, subsequently duplicated and successfully diversified into distinct functions, is likely to serve an important but yet unknown housekeeping function associated with the human host. PMID:23033471
Wang, Ning; Kinoshita, Shigeharu; Nomura, Naoko; Riho, Chihiro; Maeyama, Kaoru; Nagai, Kiyohito; Watabe, Shugo
2012-04-01
Recent research has revealed regional preferences in the transcription of biomineralization genes in the pearl oyster Pinctada fucata: genes responsible for nacre secretion are transcribed mainly in the mantle pallial, whereas genes regulating the calcite shell are expressed in the mantle edge. This study exploited this characteristic to construct forward and reverse suppression subtractive hybridization (SSH) cDNA libraries. A total of 669 cDNA clones were sequenced and 360 expressed sequence tags (ESTs) longer than 100 bp were generated. Functional annotation associated 95 ESTs with specific functions, and 79 of them were identified in P. fucata for the first time. The forward SSH cDNA library contained a large number of nacre protein genes, biomineralization genes dominantly expressed in the mantle pallial, calcium-ion-binding genes, and other biomineralization-related genes important for pearl formation. Real-time PCR showed that the distribution of all examined genes in oyster mantle tissues was consistent with the SSH design. The detection of their RNA transcripts in the pearl sac confirmed that the identified genes are involved in pearl formation. Therefore, the data from this work will initiate a new round of studies on pearl formation genes and provide new insights into molluscan biomineralization.
Kuenne, Carsten; Billion, André; Mraheil, Mobarak Abu; Strittmatter, Axel; Daniel, Rolf; Goesmann, Alexander; Barbuddhe, Sukhadeo; Hain, Torsten; Chakraborty, Trinad
2013-01-22
Listeria monocytogenes is an important food-borne pathogen and model organism for host-pathogen interaction, thus representing an invaluable target considering research on the forces governing the evolution of such microbes. The diversity of this species has not been exhaustively explored yet, as previous efforts have focused on analyses of serotypes primarily implicated in human listeriosis. We conducted complete genome sequencing of 11 strains employing 454 GS FLX technology, thereby achieving full coverage of all serotypes including the first complete strains of serotypes 1/2b, 3c, 3b, 4c, 4d, and 4e. These were comparatively analyzed in conjunction with publicly available data and assessed for pathogenicity in the Galleria mellonella insect model. The species pan-genome of L. monocytogenes is highly stable but open, suggesting an ability to adapt to new niches by generating or including new genetic information. The majority of gene-scale differences represented by the accessory genome resulted from nine hyper variable hotspots, a similar number of different prophages, three transposons (Tn916, Tn554, IS3-like), and two mobilizable islands. Only a subset of strains showed CRISPR/Cas bacteriophage resistance systems of different subtypes, suggesting a supplementary function in maintenance of chromosomal stability. Multiple phylogenetic branches of the genus Listeria imply long common histories of strains of each lineage as revealed by a SNP-based core genome tree highlighting the impact of small mutations for the evolution of species L. monocytogenes. Frequent loss or truncation of genes described to be vital for virulence or pathogenicity was confirmed as a recurring pattern, especially for strains belonging to lineages III and II. New candidate genes implicated in virulence function were predicted based on functional domains and phylogenetic distribution. 
A comparative analysis of small regulatory RNA candidates supports observations of a differential distribution of trans-encoded RNA, hinting at a diverse range of adaptations and regulatory impact. This study determined commonly occurring hyper variable hotspots and mobile elements as primary effectors of quantitative gene-scale evolution of species L. monocytogenes, while gene decay and SNPs seem to represent major factors influencing long-term evolution. The discovery of common and disparately distributed genes considering lineages, serogroups, serotypes and strains of species L. monocytogenes will assist in diagnostic, phylogenetic and functional research, supported by the comparative genomic GECO-LisDB analysis server (http://bioinfo.mikrobio.med.uni-giessen.de/geco2lisdb).
Global biogeography of Prochlorococcus genome diversity in the surface ocean.
Kent, Alyssa G; Dupont, Chris L; Yooseph, Shibu; Martiny, Adam C
2016-08-01
Prochlorococcus, the smallest known photosynthetic bacterium, is abundant in the ocean's surface layer despite large variation in environmental conditions. There are several genetically divergent lineages within Prochlorococcus and superimposed on this phylogenetic diversity is extensive gene gain and loss. The environmental role in shaping the global ocean distribution of genome diversity in Prochlorococcus is largely unknown, particularly in a framework that considers the vertical and lateral mechanisms of evolution. Here we show that Prochlorococcus field populations from a global circumnavigation harbor extensive genome diversity across the surface ocean, but this diversity is not randomly distributed. We observed a significant correspondence between phylogenetic and gene content diversity, including regional differences in both phylogenetic composition and gene content that were related to environmental factors. Several gene families were strongly associated with specific regions and environmental factors, including the identification of a set of genes related to lower nutrient and temperature regions. Metagenomic assemblies of natural Prochlorococcus genomes reinforced this association by providing linkage of genes across genomic backbones. Overall, our results show that the phylogeography in Prochlorococcus taxonomy is echoed in its genome content. Thus environmental variation shapes the functional capabilities and associated ecosystem role of the globally abundant Prochlorococcus.
NASA Astrophysics Data System (ADS)
Ellis, K.; Cohen, N.; Moreno, C.; Marchetti, A.
2016-02-01
The requirement for cobalamin (vitamin B12) in microalgae is primarily a function of the type of methionine synthase present within their gene repertoires. This study validates this concept through analysis of the distribution of B12-independent methionine synthase in ecologically relevant diatom genera, including the closely related bloom-forming diatoms Pseudo-nitzschia and Fragilariopsis. Growth and gene expression analysis of the vitamin B12-requiring version of the methionine synthase enzyme, MetH, and the B12-independent version, MetE, demonstrate that it is the presence of the MetE gene which allows Fragilariopsis cylindrus to grow in the absence of B12, while P. granii's lack of a functional MetE gene means that it cannot survive without the vitamin. Through phylogenetic analysis, we further substantiate a lack of obvious grouping in MetE presence among diatom clades. In addition, we also show how this trend may have a biogeographical basis, particularly in High-Nutrient, Low-Chlorophyll (HNLC) regions such as the Southern Ocean where B12 concentrations may be consistently low. These results are paired with field experiments showing patterns of MetE and MetH gene expression in natural phytoplankton communities under a matrix of iron and B12 limitations in the HNLC NE Pacific. Our findings demonstrate the important role vitamins can play in diatom community dynamics within areas where vitamin supply may be variable and limiting.
Broad Phylogenetic Occurrence of the Oxygen-Binding Hemerythrins in Bilaterians
Schrago, Carlos G.; Halanych, Kenneth M.
2017-01-01
Animal tissues need to be properly oxygenated for carrying out catabolic respiration and, as such, natural selection has presumably favored special molecules that can reversibly bind and transport oxygen. Hemoglobins, hemocyanins, and hemerythrins (Hrs) fulfill this role, with Hrs being the least studied. Knowledge of oxygen-binding proteins is crucial for understanding animal physiology. Hr genes are present in the three domains of life, Archaea, Bacteria, and Eukaryota; however, within Animalia, Hrs have been reported only in marine species in six phyla (Annelida, Brachiopoda, Priapulida, Bryozoa, Cnidaria, and Arthropoda). Given this observed Hr distribution, whether all metazoan Hrs share a common origin is uncertain. We investigated Hr diversity and evolution in metazoans by employing in silico approaches to survey for Hrs from 120 metazoan transcriptomes and genomes. We found 58 candidate Hr genes actively transcribed in 36 species distributed in 11 animal phyla, with new records in Echinodermata, Hemichordata, Mollusca, Nemertea, Phoronida, and Platyhelminthes. Moreover, we found that “Hrs” reported from Cnidaria and Arthropoda were not consistent with those of other metazoans. Contrary to previous suggestions that Hr genes were absent in deuterostomes, we find that Hr genes are present in deuterostomes and were likely present in early bilaterians, but not in nonbilaterian animal lineages. As expected, the Hr gene tree did not mirror metazoan phylogeny, suggesting that Hr evolutionary history was complex and that, besides oxygen-carrying capacity, the drivers of Hr evolution may also include secondary functional specializations of the proteins, such as immunological functions. PMID:29016798
Macqueen, Daniel J; Wilcox, Alexander H
2014-04-09
The calpains are a superfamily of proteases with extensive relevance to human health and welfare. Vast research attention is given to the vertebrate 'classical' subfamily, making it surprising that the evolutionary origins, distribution and relationships of these genes is poorly characterized. Consequently, there exists uncertainty about the conservation of gene family structure, function and expression that has been principally defined from work with mammals. Here, more than 200 vertebrate classical calpains were incorporated in phylogenetic analyses spanning an unprecedented range of taxa, including jawless and cartilaginous fish. We demonstrate that the common vertebrate ancestor had at least six classical calpains, including a single gene that gave rise to CAPN11, 1, 2 and 8 in the early jawed fish lineage, plus CAPN3, 9, 12, 13 and a novel calpain gene, hereafter named CAPN17. We reveal that while all vertebrate classical calpains have been subject to persistent purifying selection during evolution, the degree and nature of selective pressure has often been lineage-dependent. The tissue expression of the complete classic calpain family was assessed in representative teleost fish, amphibians, reptiles and mammals. This highlighted systematic divergence in expression across vertebrate taxa, with most classic calpain genes from fish and amphibians having more extensive tissue distribution than in amniotes. Our data suggest that classical calpain functions have frequently diverged during vertebrate evolution and challenge the ongoing value of the established system of classifying calpains by expression.
Macqueen, Daniel J.; Wilcox, Alexander H.
2014-01-01
The calpains are a superfamily of proteases with extensive relevance to human health and welfare. Vast research attention is given to the vertebrate ‘classical’ subfamily, making it surprising that the evolutionary origins, distribution and relationships of these genes is poorly characterized. Consequently, there exists uncertainty about the conservation of gene family structure, function and expression that has been principally defined from work with mammals. Here, more than 200 vertebrate classical calpains were incorporated in phylogenetic analyses spanning an unprecedented range of taxa, including jawless and cartilaginous fish. We demonstrate that the common vertebrate ancestor had at least six classical calpains, including a single gene that gave rise to CAPN11, 1, 2 and 8 in the early jawed fish lineage, plus CAPN3, 9, 12, 13 and a novel calpain gene, hereafter named CAPN17. We reveal that while all vertebrate classical calpains have been subject to persistent purifying selection during evolution, the degree and nature of selective pressure has often been lineage-dependent. The tissue expression of the complete classic calpain family was assessed in representative teleost fish, amphibians, reptiles and mammals. This highlighted systematic divergence in expression across vertebrate taxa, with most classic calpain genes from fish and amphibians having more extensive tissue distribution than in amniotes. Our data suggest that classical calpain functions have frequently diverged during vertebrate evolution and challenge the ongoing value of the established system of classifying calpains by expression. PMID:24718597
A High Proportion of Chromosome 21 Promoter Polymorphisms Influence Transcriptional Activity
Buckland, Paul R.; Coleman, Sharol L.; Hoogendoorn, Bastiaan; Guy, Carol; Smith, S. Kaye; O’Donovan, Michael C.
2004-01-01
We have sought to obtain an unbiased estimate of the proportion of polymorphisms in promoters of human genes that have functional effects. We carried out polymorphism discovery on a randomly selected group of 51 gene promoters mapping to human chromosome 21 and successfully analyzed the effect on transcription of 38 of the sequence variants. To achieve this, a total of 53 different haplotypes from 20 promoters were cloned into a modified pGL3 luciferase reporter gene vector and were tested for their abilities to promote transcription in HEK293t and JEG-3 cells. Up to seven (18%) of the 38 tested variants altered transcription by 1.5-fold, confirming that a surprisingly high proportion of promoter region polymorphisms are likely to be functionally important. The functional variants were distributed across the promoters of CRYAA, IFNAR1, KCNJ15, NCAM2, IGSF5, and B3GALT5. Three of the genes (NCAM2, IFNAR1, and CRYAA) have been previously associated with human phenotypes and the polymorphisms we describe here may therefore play a role in those phenotypes. PMID:15200235
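The 18% figure in the abstract above follows directly from 7 functional variants out of 38 tested. As a minimal sketch, the arithmetic can be reproduced as below; the Wilson 95% interval is added here purely for illustration and is not reported in the abstract, and the function name `wilson_interval` is a hypothetical helper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (illustrative, not from the abstract)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the abstract: up to 7 of 38 tested variants altered transcription.
functional, tested = 7, 38
proportion = functional / tested
low, high = wilson_interval(functional, tested)
print(f"{proportion:.1%} functional (95% Wilson interval {low:.1%} to {high:.1%})")
```

With only 38 variants tested, the interval is wide, which is worth keeping in mind when reading "up to seven (18%)" as a genome-wide estimate.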
High-frequency promoter firing links THO complex function to heavy chromatin formation.
Mouaikel, John; Causse, Sébastien Z; Rougemaille, Mathieu; Daubenton-Carafa, Yves; Blugeon, Corinne; Lemoine, Sophie; Devaux, Frédéric; Darzacq, Xavier; Libri, Domenico
2013-11-27
The THO complex is involved in transcription, genome stability, and messenger ribonucleoprotein (mRNP) formation, but its precise molecular function remains enigmatic. Under heat shock conditions, THO mutants accumulate large protein-DNA complexes that alter the chromatin density of target genes (heavy chromatin), defining a specific biochemical facet of THO function and a powerful tool of analysis. Here, we show that heavy chromatin distribution is dictated by gene boundaries and that the gene promoter is necessary and sufficient to convey THO sensitivity in these conditions. Single-molecule fluorescence in situ hybridization measurements show that heavy chromatin formation correlates with an unusually high firing pace of the promoter with more than 20 transcription events per minute. Heavy chromatin formation closely follows the modulation of promoter firing and strongly correlates with polymerase occupancy genome wide. We propose that the THO complex is required for tuning the dynamic of gene-nuclear pore association and mRNP release to the same high pace of transcription initiation. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Why Is the Correlation between Gene Importance and Gene Evolutionary Rate So Weak?
Wang, Zhi; Zhang, Jianzhi
2009-01-01
One of the few commonly believed principles of molecular evolution is that functionally more important genes (or DNA sequences) evolve more slowly than less important ones. This principle is widely used by molecular biologists in daily practice. However, recent genomic analysis of a diverse array of organisms found only weak, negative correlations between the evolutionary rate of a gene and its functional importance, typically measured under a single benign lab condition. A frequently suggested cause of the above finding is that gene importance determined in the lab differs from that in an organism's natural environment. Here, we test this hypothesis in yeast using gene importance values experimentally determined in 418 lab conditions or computationally predicted for 10,000 nutritional conditions. In no single condition or combination of conditions did we find a much stronger negative correlation, which is explainable by our subsequent finding that always-essential (enzyme) genes do not evolve significantly more slowly than sometimes-essential or always-nonessential ones. Furthermore, we verified that functional density, approximated by the fraction of amino acid sites within protein domains, is uncorrelated with gene importance. Thus, neither the lab-nature mismatch nor a potentially biased among-gene distribution of functional density explains the observed weakness of the correlation between gene importance and evolutionary rate. We conclude that the weakness is factual, rather than artifactual. In addition to being weakened by population genetic reasons, the correlation is likely to have been further weakened by the presence of multiple nontrivial rate determinants that are independent from gene importance. 
These findings notwithstanding, we show that the principle of slower evolution of more important genes does have some predictive power when genes with vastly different evolutionary rates are compared, explaining why the principle can be practically useful despite the weakness of the correlation. PMID:19132081
An anatomically comprehensive atlas of the adult human brain transcriptome
Guillozet-Bongaarts, Angela L.; Shen, Elaine H.; Ng, Lydia; Miller, Jeremy A.; van de Lagemaat, Louie N.; Smith, Kimberly A.; Ebbert, Amanda; Riley, Zackery L.; Abajian, Chris; Beckmann, Christian F.; Bernard, Amy; Bertagnolli, Darren; Boe, Andrew F.; Cartagena, Preston M.; Chakravarty, M. Mallar; Chapin, Mike; Chong, Jimmy; Dalley, Rachel A.; David Daly, Barry; Dang, Chinh; Datta, Suvro; Dee, Nick; Dolbeare, Tim A.; Faber, Vance; Feng, David; Fowler, David R.; Goldy, Jeff; Gregor, Benjamin W.; Haradon, Zeb; Haynor, David R.; Hohmann, John G.; Horvath, Steve; Howard, Robert E.; Jeromin, Andreas; Jochim, Jayson M.; Kinnunen, Marty; Lau, Christopher; Lazarz, Evan T.; Lee, Changkyu; Lemon, Tracy A.; Li, Ling; Li, Yang; Morris, John A.; Overly, Caroline C.; Parker, Patrick D.; Parry, Sheana E.; Reding, Melissa; Royall, Joshua J.; Schulkin, Jay; Sequeira, Pedro Adolfo; Slaughterbeck, Clifford R.; Smith, Simon C.; Sodt, Andy J.; Sunkin, Susan M.; Swanson, Beryl E.; Vawter, Marquis P.; Williams, Derric; Wohnoutka, Paul; Zielke, H. Ronald; Geschwind, Daniel H.; Hof, Patrick R.; Smith, Stephen M.; Koch, Christof; Grant, Seth G. N.; Jones, Allan R.
2014-01-01
Neuroanatomically precise, genome-wide maps of transcript distributions are critical resources to complement genomic sequence data and to correlate functional and genetic brain architecture. Here we describe the generation and analysis of a transcriptional atlas of the adult human brain, comprising extensive histological analysis and comprehensive microarray profiling of ~900 neuroanatomically precise subdivisions in two individuals. Transcriptional regulation varies enormously by anatomical location, with different regions and their constituent cell types displaying robust molecular signatures that are highly conserved between individuals. Analysis of differential gene expression and gene co-expression relationships demonstrates that brain-wide variation strongly reflects the distributions of major cell classes such as neurons, oligodendrocytes, astrocytes and microglia. Local neighbourhood relationships between fine anatomical subdivisions are associated with discrete neuronal subtypes and genes involved with synaptic transmission. The neocortex displays a relatively homogeneous transcriptional pattern, but with distinct features associated selectively with primary sensorimotor cortices and with enriched frontal lobe expression. Notably, the spatial topography of the neocortex is strongly reflected in its molecular topography— the closer two cortical regions, the more similar their transcriptomes. This freely accessible online data resource forms a high-resolution transcriptional baseline for neurogenetic studies of normal and abnormal human brain function. PMID:22996553
Ancestry and evolution of a secretory pathway serpin
2008-01-01
Background The serpin (serine protease inhibitor) superfamily constitutes a class of functionally highly diverse proteins usually encompassing several dozens of paralogs in mammals. Though phylogenetic classification of vertebrate serpins into six groups based on gene organisation is well established, the evolutionary roots beyond the fish/tetrapod split are unresolved. The aim of this study was to elucidate the phylogenetic relationships of serpins involved in surveying the secretory pathway routes against uncontrolled proteolytic activity. Results Here, rare genomic characters are used to show that orthologs of neuroserpin, a prominent representative of vertebrate group 3 serpin genes, exist in early diverging deuterostomes and probably also in cnidarians, indicating that the origin of a mammalian serpin can be traced back far in the history of eumetazoans. A C-terminal address code assigning association with secretory pathway organelles is present in all neuroserpin orthologs, suggesting that supervision of cellular export/import routes by antiproteolytic serpins is an ancient trait, though subtle functional and compartmental specialisations have developed during their evolution. The results also suggest that massive changes in the exon-intron organisation of serpin genes have occurred along the lineage leading to vertebrate neuroserpin, in contrast with the immediately adjacent PDCD10 gene that is linked to its neighbour at least since divergence of echinoderms. The intron distribution pattern of closely adjacent and co-regulated genes thus may experience quite different fates during evolution of metazoans. Conclusion This study demonstrates that the analysis of microsynteny and other rare characters can provide insight into the intricate family history of metazoan serpins. 
Serpins with the capacity to defend the main cellular export/import routes against uncontrolled endogenous and/or foreign proteolytic activity represent an ancient trait in eukaryotes that has been maintained continuously in metazoans though subtle changes affecting function and subcellular location have evolved. It is shown that the intron distribution pattern of neuroserpin gene orthologs has undergone substantial rearrangements during metazoan evolution. PMID:18793432
Ancestry and evolution of a secretory pathway serpin.
Kumar, Abhishek; Ragg, Hermann
2008-09-15
The serpin (serine protease inhibitor) superfamily constitutes a class of functionally highly diverse proteins usually encompassing several dozens of paralogs in mammals. Though phylogenetic classification of vertebrate serpins into six groups based on gene organisation is well established, the evolutionary roots beyond the fish/tetrapod split are unresolved. The aim of this study was to elucidate the phylogenetic relationships of serpins involved in surveying the secretory pathway routes against uncontrolled proteolytic activity. Here, rare genomic characters are used to show that orthologs of neuroserpin, a prominent representative of vertebrate group 3 serpin genes, exist in early diverging deuterostomes and probably also in cnidarians, indicating that the origin of a mammalian serpin can be traced back far in the history of eumetazoans. A C-terminal address code assigning association with secretory pathway organelles is present in all neuroserpin orthologs, suggesting that supervision of cellular export/import routes by antiproteolytic serpins is an ancient trait, though subtle functional and compartmental specialisations have developed during their evolution. The results also suggest that massive changes in the exon-intron organisation of serpin genes have occurred along the lineage leading to vertebrate neuroserpin, in contrast with the immediately adjacent PDCD10 gene that is linked to its neighbour at least since divergence of echinoderms. The intron distribution pattern of closely adjacent and co-regulated genes thus may experience quite different fates during evolution of metazoans. This study demonstrates that the analysis of microsynteny and other rare characters can provide insight into the intricate family history of metazoan serpins. 
Serpins with the capacity to defend the main cellular export/import routes against uncontrolled endogenous and/or foreign proteolytic activity represent an ancient trait in eukaryotes that has been maintained continuously in metazoans though subtle changes affecting function and subcellular location have evolved. It is shown that the intron distribution pattern of neuroserpin gene orthologs has undergone substantial rearrangements during metazoan evolution.
Connallon, Tim; Clark, Andrew G
2010-12-01
Sex-biased genes--genes that are differentially expressed within males and females--are nonrandomly distributed across animal genomes, with sex chromosomes and autosomes often carrying markedly different concentrations of male- and female-biased genes. These linkage patterns are often gene- and lineage-dependent, differing between functional genetic categories and between species. Although sex-specific selection is often hypothesized to shape the evolution of sex-linked and autosomal gene content, population genetics theory has yet to account for many of the gene- and lineage-specific idiosyncrasies emerging from the empirical literature. With the goal of improving the connection between evolutionary theory and a rapidly growing body of genome-wide empirical studies, we extend previous population genetics theory of sex-specific selection by developing and analyzing a biologically informed model that incorporates sex linkage, pleiotropy, recombination, and epistasis, factors that are likely to vary between genes and between species. Our results demonstrate that sex-specific selection and sex-specific recombination rates can generate, and are compatible with, the gene- and species-specific linkage patterns reported in the genomics literature. The theory suggests that sexual selection may strongly influence the architectures of animal genomes, as well as the chromosomal distribution of fixed substitutions underlying sexually dimorphic traits. © 2010 The Author(s). Evolution © 2010 The Society for the Study of Evolution.
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa
2015-01-01
Background Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of the functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a relevant problem. Results One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. We observed a tendency for the exons that code functional sites to cluster, and such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. 
Conclusions These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
Hornoy, Benjamin; Pavy, Nathalie; Gérardi, Sébastien; Beaulieu, Jean; Bousquet, Jean
2015-11-11
Understanding the genetic basis of adaptation to climate is of paramount importance for preserving and managing genetic diversity in plants in a context of climate change. Yet, this objective has been addressed mainly in short-lived model species. Thus, expanding knowledge to nonmodel species with contrasting life histories, such as forest trees, appears necessary. To uncover the genetic basis of adaptation to climate in the widely distributed boreal conifer white spruce (Picea glauca), an environmental association study was conducted using 11,085 single nucleotide polymorphisms representing 7,819 genes, that is, approximately a quarter of the transcriptome. Linear and quadratic regressions controlling for isolation-by-distance, and the Random Forest algorithm, identified several dozen genes putatively under selection, among which 43 showed strongest signals along temperature and precipitation gradients. Most of them were related to temperature. Small to moderate shifts in allele frequencies were observed. Genes involved encompassed a wide variety of functions and processes, some of them being likely important for plant survival under biotic and abiotic environmental stresses according to expression data. Literature mining and sequence comparison also highlighted conserved sequences and functions with angiosperm homologs. Our results are consistent with theoretical predictions that local adaptation involves genes with small frequency shifts when selection is recent and gene flow among populations is high. Accordingly, genetic adaptation to climate in P. glauca appears to be complex, involving many independent and interacting gene functions, biochemical pathways, and processes. From an applied perspective, these results shall lead to specific functional/association studies in conifers and to the development of markers useful for the conservation of genetic resources. © The Author(s) 2015. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Cellular Retinoic Acid Binding Proteins: Genomic and Non-genomic Functions and their Regulation.
Wei, Li-Na
Cellular retinoic acid binding proteins (CRABPs) are high-affinity retinoic acid (RA) binding proteins that mainly reside in the cytoplasm. In mammals, this family has two members, CRABPI and II, both highly conserved during evolution. The two proteins share a very similar structure that is characteristic of a "β-clam" motif built up from 10 strands. The proteins are encoded by two different genes that share a very similar genomic structure. CRABPI is widely distributed, whereas CRABPII has restricted expression in only certain tissues. The CrabpI gene is driven by a housekeeping promoter, but can be regulated by numerous factors, including thyroid hormones and RA, which engage a specific chromatin-remodeling complex containing either TRAP220 or RIP140 as coactivator and corepressor, respectively. The chromatin-remodeling complex binds the DR4 element in the CrabpI gene promoter to activate or repress this gene in different cellular backgrounds. The CrabpII gene promoter contains a TATA-box and is rapidly activated by RA through an RA response element. Biochemical and cell culture studies carried out in vitro show the two proteins have distinct biological functions. CRABPII mainly functions to deliver RA to the nuclear RA receptors for gene regulation, although recent studies suggest that CRABPII may also be involved in other cellular events, such as RNA stability. In contrast, biochemical and cell culture studies suggest that CRABPI functions mainly in the cytoplasm to modulate intracellular RA availability/concentration and to engage other signaling components such as ERK activity. However, these functional studies remain inconclusive because knocking out one or both genes in mice does not produce definitive phenotypes. Further studies are needed to unambiguously decipher the exact physiological activities of these two proteins.
Hornoy, Benjamin; Pavy, Nathalie; Gérardi, Sébastien; Beaulieu, Jean; Bousquet, Jean
2015-01-01
Understanding the genetic basis of adaptation to climate is of paramount importance for preserving and managing genetic diversity in plants in a context of climate change. Yet, this objective has been addressed mainly in short-lived model species. Thus, expanding knowledge to nonmodel species with contrasting life histories, such as forest trees, appears necessary. To uncover the genetic basis of adaptation to climate in the widely distributed boreal conifer white spruce (Picea glauca), an environmental association study was conducted using 11,085 single nucleotide polymorphisms representing 7,819 genes, that is, approximately a quarter of the transcriptome. Linear and quadratic regressions controlling for isolation-by-distance, and the Random Forest algorithm, identified several dozen genes putatively under selection, among which 43 showed strongest signals along temperature and precipitation gradients. Most of them were related to temperature. Small to moderate shifts in allele frequencies were observed. Genes involved encompassed a wide variety of functions and processes, some of them being likely important for plant survival under biotic and abiotic environmental stresses according to expression data. Literature mining and sequence comparison also highlighted conserved sequences and functions with angiosperm homologs. Our results are consistent with theoretical predictions that local adaptation involves genes with small frequency shifts when selection is recent and gene flow among populations is high. Accordingly, genetic adaptation to climate in P. glauca appears to be complex, involving many independent and interacting gene functions, biochemical pathways, and processes. From an applied perspective, these results shall lead to specific functional/association studies in conifers and to the development of markers useful for the conservation of genetic resources. PMID:26560341
Multiple Multi-Copper Oxidase Gene Families in Basidiomycetes – What for?
Kües, Ursula; Rühl, Martin
2011-01-01
Genome analyses revealed in various basidiomycetes the existence of multiple genes for blue multi-copper oxidases (MCOs). Whole genomes are now available from saprotrophs, white rot and brown rot species, plant and animal pathogens and ectomycorrhizal species. Total numbers (from 1 to 17) and types of mco genes differ between analyzed species, with no easily recognizable connection of gene distribution to fungal life styles. Types of mco genes might be present in one and absent in another fungus. Distinct types of genes have been multiplied at speciation in different organisms. Phylogenetic analysis defined different subfamilies of laccases sensu stricto (specific to Agaricomycetes), classical Fe2+-oxidizing Fet3-like ferroxidases, potential ferroxidases/laccases exhibiting either one or both of these enzymatic functions, enzymes clustering with pigment MCOs and putative ascorbate oxidases. Biochemically best described are laccases sensu stricto due to their proposed roles in degradation of wood, straw and plant litter and due to the large interest in these enzymes in biotechnology. However, biological functions of laccases and other MCOs are generally little addressed. Functions in substrate degradation, symbiotic and pathogenic interactions, development, pigmentation and copper homeostasis have been put forward. Evidence for biological functions is in most instances rather circumstantial, resting on correlations of expression. Multiple factors impede research on biological functions, such as difficulties of defining suitable biological systems for molecular research, the broad and overlapping substrate spectrum multi-copper oxidases usually possess, the limited existing knowledge on their natural substrates, difficulties imposed by low expression or expression of multiple enzymes, and difficulties in expressing enzymes heterologously. PMID:21966246
Zhou, Ziyao; Zhou, Xiaoxiao; Zhong, Zhijun; Wang, Chengdong; Zhang, Hemin; Li, Desheng; He, Tingmei; Li, Caiwu; Liu, Xuehan; Yuan, Hui; Ji, Hanli; Luo, Yongjiu; Gu, Wuyang; Fu, Hualin; Peng, Guangneng
2014-12-01
The Bacillus group is a prevalent community of the Giant Panda's intestinal flora and plays a significant role in the biological control of pathogens. To understand the diversity of the Bacillus group from the Giant Panda intestine and its functions in maintaining the balance of the intestinal microflora of the Giant Panda, this study isolated a significant number of strains of Bacillus spp. from the feces of Giant Panda, compared the inhibitory effects of these strains on three common enteric pathogens, investigated by PCR the distributions of six universal antimicrobial genes (ituA, hag, tasA, sfp, spaS and mrsA) found within the Bacillus group, and characterized the antimicrobial gene distributions in these strains using statistical methods. In total, 34 strains of Bacillus spp. were isolated, a scale not previously reported; these Bacillus strains could be classified into five categories as well as an external strain by 16S rRNA. Most of the Bacillus strains are able to inhibit enteric pathogens, and the antimicrobial abilities may be correlated with their 16S rRNA categories. The detection rates of the six common antimicrobial genes are between 20.58% (7/34) and 79.41% (27/34), and the genes are distributed in three clusters in these strains. We found that the antimicrobial abilities of Bacillus strains can be one of the mechanisms by which the Giant Panda maintains its intestinal microflora balance, and may be correlated with their phylogeny.
Wen, Feng; Zhu, Hong; Li, Peng; Jiang, Min; Mao, Wenqing; Ong, Chermaine; Chu, Zhaoqing
2014-01-01
Members of plant WRKY gene family are ancient transcription factors that function in plant growth and development and respond to biotic and abiotic stresses. In our present study, we have investigated WRKY family genes in Brachypodium distachyon, a new model plant of family Poaceae. We identified a total of 86 WRKY genes from B. distachyon and explored their chromosomal distribution and evolution, domain alignment, promoter cis-elements, and expression profiles. Combining the analysis of phylogenetic tree of BdWRKY genes and the result of expression profiling, results showed that most of clustered gene pairs had higher similarities in the WRKY domain, suggesting that they might be functionally redundant. Neighbour-joining analysis of 301 WRKY domains from Oryza sativa, Arabidopsis thaliana, and B. distachyon suggested that BdWRKY domains are evolutionarily more closely related to O. sativa WRKY domains than those of A. thaliana. Moreover, tissue-specific expression profile of BdWRKY genes and their responses to phytohormones and several biotic or abiotic stresses were analysed by quantitative real-time PCR. The results showed that the expression of BdWRKY genes was rapidly regulated by stresses and phytohormones, and there was a strong correlation between promoter cis-elements and the phytohormones-induced BdWRKY gene expression. PMID:24453041
Serial analysis of gene expression in a rat lung model of asthma.
Yin, Lei-Miao; Jiang, Gong-Hao; Wang, Yu; Wang, Yan; Liu, Yan-Yan; Jin, Wei-Rong; Zhang, Zen; Xu, Yu-Dong; Yang, Yong-Qing
2008-11-01
The pathogenesis and molecular mechanism underlying asthma remain undetermined. The purpose of this study was to identify genes and pathways involved in the early airway response (EAR) phase of asthma by using serial analysis of gene expression (SAGE). Two SAGE tag libraries of lung tissues derived from a rat model of asthma and controls were generated. Bioinformatic analyses were carried out using the Database for Annotation, Visualization and Integrated Discovery Functional Annotation Tool, Gene Ontology (GO) TreeMachine and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. A total of 26 552 SAGE tags of asthmatic rat lung were obtained, of which 12 221 were unique tags. Of the unique tags, 55.5% were matched with known genes. By comparison of the two libraries, 186 differentially expressed tags (P < 0.05) were identified, of which 103 were upregulated and 83 were downregulated. Using the bioinformatic tools these genes were classified into 23 functional groups, 15 KEGG pathways and 37 enriched GO categories. The bioinformatic analyses of gene distribution, enriched categories and the involvement of specific pathways in the SAGE libraries have provided information on regulatory networks of the EAR phase of asthma. Analyses of the regulated genes of interest may inform new hypotheses, increase our understanding of the disease and provide a foundation for future research.
Genetic variation in the myeloperoxidase gene and cognitive impairment in Multiple Sclerosis
Manna, I; Valentino, P; La Russa, A; Condino, F; Nisticò, R; Liguori, M; Clodomiro, A; Andreoli, V; Pirritano, D; Cittadella, R; Quattrone, A
2006-01-01
There is evidence that multiple sclerosis (MS) may be associated with cognitive impairment in 25 to 40% of cases. The gene encoding myeloperoxidase (MPO) is involved in molecular pathways leading to β-amyloid deposition. We investigated a functional biallelic (G/A) polymorphism in the promoter region (-463) of the MPO gene in 465 patients affected by MS, divided into 204 cognitively normal and 261 impaired. We did not find significant differences in allele or genotype distributions between impaired and preserved MS patients. Our findings suggest that MPO polymorphism is not a risk factor for cognitive impairment in MS. PMID:16504169
Robustness, Evolvability, and the Logic of Genetic Regulation
Moore, Jason H.; Wagner, Andreas
2014-01-01
In gene regulatory circuits, the expression of individual genes is commonly modulated by a set of regulating gene products, which bind to a gene’s cis-regulatory region. This region encodes an input-output function, referred to as signal-integration logic, that maps a specific combination of regulatory signals (inputs) to a particular expression state (output) of a gene. The space of all possible signal-integration functions is vast and the mapping from input to output is many-to-one: for the same set of inputs, many functions (genotypes) yield the same expression output (phenotype). Here, we exhaustively enumerate the set of signal-integration functions that yield identical gene expression patterns within a computational model of gene regulatory circuits. Our goal is to characterize the relationship between robustness and evolvability in the signal-integration space of regulatory circuits, and to understand how these properties vary between the genotypic and phenotypic scales. Among other results, we find that the distributions of genotypic robustness are skewed, such that the majority of signal-integration functions are robust to perturbation. We show that the connected set of genotypes that make up a given phenotype are constrained to specific regions of the space of all possible signal-integration functions, but that as the distance between genotypes increases, so does their capacity for unique innovations. In addition, we find that robust phenotypes are (i) evolvable, (ii) easily identified by random mutation, and (iii) mutationally biased toward other robust phenotypes. We explore the implications of these latter observations for mutation-based evolution by conducting random walks between randomly chosen source and target phenotypes. We demonstrate that the time required to identify the target phenotype is independent of the properties of the source phenotype. PMID:23373974
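The many-to-one genotype-to-phenotype mapping described in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' circuit model: a genotype is the truth table of one Boolean signal-integration function of `k` inputs, the phenotype is that table restricted to a hypothetical set of `observed` input states, and robustness is the fraction of single-entry flips that leave the phenotype unchanged.

```python
from itertools import product


def enumerate_genotypes(k):
    """All Boolean functions of k inputs, each as a truth table of length 2**k."""
    return list(product((0, 1), repeat=2 ** k))


def phenotype(table, observed):
    """Toy phenotype: outputs at the input states the circuit is assumed to visit."""
    return tuple(table[i] for i in observed)


def robustness(table, observed):
    """Fraction of single-entry flips of the truth table that keep the phenotype."""
    target = phenotype(table, observed)
    neutral = 0
    for i in range(len(table)):
        mutant = list(table)
        mutant[i] ^= 1  # point mutation: flip one truth-table entry
        if phenotype(mutant, observed) == target:
            neutral += 1
    return neutral / len(table)


def genotypes_per_phenotype(k, observed):
    """Count how many genotypes map onto each phenotype."""
    counts = {}
    for table in enumerate_genotypes(k):
        key = phenotype(table, observed)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

With `k = 2` and `observed = (0, 3)`, the 16 genotypes fall into 4 phenotypes of 4 genotypes each. In this toy every genotype has the same robustness, so it shows only the many-to-one mapping and not the skewed robustness distributions reported in the abstract.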
Chu, Shuyuan; Zhong, Xiaoning; Zhang, Jianquan; Lai, Xiaoying; Xie, Jiajun; Li, Yu
2016-12-01
Forkhead box P3 (FOXP3) is the essential transcription factor for the function of regulatory T cells (Tregs). However, the gene mutation of FOXP3 in patients with chronic obstructive pulmonary disease (COPD) at different stages has not been reported. We aimed to investigate four single nucleotide polymorphisms (SNPs) and the mRNA expression of FOXP3 in smokers with normal lung function and smokers with COPD at different stages. FOXP3 mRNA expression and SNPs in FOXP3 were assessed in nonsmokers with normal lung function (N), smokers with normal lung function (S), smokers with COPD in the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 1 or 2 grade (COPD 1-2), and smokers with COPD in GOLD 3 or 4 grade (COPD 3-4). In peripheral blood samples, FOXP3 mRNA was assessed using real-time quantitative PCR and SNPs were analyzed by TaqMan PCR. The FOXP3 mRNA level in peripheral blood samples decreased as COPD became more severe. The frequencies of the FOXP3 rs5902434 genotype del/del and allele del were lower in COPD 1-2 and COPD 3-4 than in N or S. The rs5902434 genotype del/del and allele del were, respectively, associated with decreased risk of COPD and lung function decline. The rs5902434 genotypic distribution was correlated with FOXP3 mRNA level. In conclusion, both FOXP3 rs5902434 genotypes and alleles were differently distributed in COPD patients and smokers with normal lung function. The distribution of the del/del genotype was associated with systemic expression of FOXP3 mRNA. More research is needed to explore the role of FOXP3 gene polymorphism in immunoinflammation of COPD.
LINE-1 retrotransposons: from 'parasite' sequences to functional elements.
Paço, Ana; Adega, Filomena; Chaves, Raquel
2015-02-01
Long interspersed nuclear elements-1 (LINE-1) are the most abundant and active retrotransposons in the mammalian genomes. Traditionally, the occurrence of LINE-1 sequences in the genome of mammals has been explained by the selfish DNA hypothesis. Nevertheless, recently, it has also been argued that these sequences could play important roles in these genomes, as in the regulation of gene expression, genome modelling and X-chromosome inactivation. The non-random chromosomal distribution is a striking feature of these retroelements that somehow reflects its functionality. In the present study, we have isolated and analysed a fraction of the open reading frame 2 (ORF2) LINE-1 sequence from three rodent species, Cricetus cricetus, Peromyscus eremicus and Praomys tullbergi. Physical mapping of the isolated sequences revealed an interspersed longitudinal AT pattern of distribution along all the chromosomes of the complement in the three genomes. A detailed analysis shows that these sequences are preferentially located in the euchromatic regions, although some signals could be detected in the heterochromatin. In addition, a coincidence between the location of imprinted gene regions (as Xist and Tsix gene regions) and the LINE-1 retroelements was also observed. According to these results, we propose an involvement of LINE-1 sequences in different genomic events as gene imprinting, X-chromosome inactivation and evolution of repetitive sequences located at the heterochromatic regions (e.g. satellite DNA sequences) of the rodents' genomes analysed.
Tian, Feng-Xia; Zang, Jian-Lei; Wang, Tan; Xie, Yu-Li; Zhang, Jin; Hu, Jian-Jun
2015-01-01
Aldehyde dehydrogenases (ALDHs) constitute a superfamily of NAD(P)+-dependent enzymes that catalyze the irreversible oxidation of a wide range of reactive aldehydes to their corresponding nontoxic carboxylic acids. ALDHs have been studied in many organisms from bacteria to mammals; however, no systematic analyses incorporating genome organization, gene structure, expression profiles, and cis-acting elements have been conducted in the model tree species Populus trichocarpa thus far. In this study, a comprehensive analysis of the Populus ALDH gene superfamily was performed. A total of 26 Populus ALDH genes were found to be distributed across 12 chromosomes. Genomic organization analysis indicated that purifying selection may have played a pivotal role in the retention and maintenance of PtALDH gene families. The exon-intron organizations of PtALDHs were highly conserved within the same family, suggesting that the members of the same family also may have conserved functionalities. Microarray data and qRT-PCR analysis indicated that most PtALDHs had distinct tissue-specific expression patterns. The specificity of cis-acting elements in the promoter regions of the PtALDHs and the divergence of expression patterns between nine paralogous PtALDH gene pairs suggested that gene duplications may have freed the duplicate genes from the functional constraints. The expression levels of some ALDHs were up- or down-regulated by various abiotic stresses, implying that the products of these genes may be involved in the adaptation of Populus to abiotic stresses. Overall, the data obtained from our investigation contribute to a better understanding of the complexity of the Populus ALDH gene superfamily and provide insights into the function and evolution of ALDH gene families in vascular plants.
Pleiotropy Analysis of Quantitative Traits at Gene Level by Multivariate Functional Linear Models
Wang, Yifan; Liu, Aiyi; Mills, James L.; Boehnke, Michael; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Xiong, Momiao; Wu, Colin O.; Fan, Ruzong
2015-01-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks’s Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
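For reference, the three multivariate statistics named in this abstract have standard textbook definitions, restated below. The notation is an assumption of this note and is not taken from the paper: $H$ and $E$ are the hypothesis and error sums-of-squares-and-cross-products matrices of the multivariate linear model, and $\lambda_1, \dots, \lambda_s$ are the nonzero eigenvalues of $E^{-1}H$. The paper's F-distribution approximations are not reproduced here.

```latex
\begin{align}
  \text{Wilks's Lambda:} \quad
    \Lambda &= \frac{\det(E)}{\det(H + E)} = \prod_{i=1}^{s} \frac{1}{1 + \lambda_i}, \\
  \text{Pillai--Bartlett trace:} \quad
    V &= \operatorname{tr}\!\left[ H (H + E)^{-1} \right] = \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i}, \\
  \text{Hotelling--Lawley trace:} \quad
    U &= \operatorname{tr}\!\left[ H E^{-1} \right] = \sum_{i=1}^{s} \lambda_i .
\end{align}
```

Small $\Lambda$ and large $V$ or $U$ indicate evidence against the null hypothesis of no association; each statistic is converted to an approximate F statistic for testing.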
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.
Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong
2015-05-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. © 2015 WILEY PERIODICALS, INC.
NASA Astrophysics Data System (ADS)
Bucklin, A. C.; Batta Lona, P. G.; Maas, A. E.; O'Neill, R. J.; Wiebe, P. H.
2015-12-01
In response to the changing Antarctic climate, the Southern Ocean salp Salpa thompsoni has shown altered patterns of distribution and abundance that are anticipated to have profound impacts on pelagic food webs and ecosystem dynamics. The physiological and molecular processes that underlie ecological function and biogeographical distribution are key to understanding present-day dynamics and predicting future trajectories. This study examined transcriptome-wide patterns of gene expression in relation to biological and physical oceanographic conditions in coastal, shelf and offshore waters of the Western Antarctic Peninsula (WAP) region during austral spring and summer 2011. Based on field observations and collections, seasonal changes in the distribution and abundance of salps of different life stages were associated with differences in water mass structure of the WAP. Our observations are consistent with previous suggestions that bathymetry and currents in Bransfield Strait could generate a retentive cell for an overwintering population of S. thompsoni, which may generate the characteristic salp blooms found throughout the region later in summer. The statistical analysis of transcriptome-wide patterns of gene expression revealed differences among salps collected in different seasons and from different habitats (i.e., coastal versus offshore) in the WAP. Gene expression patterns also clustered by station in austral spring, but not summer, collections, suggesting stronger heterogeneity of environmental conditions. During the summer, differentially expressed genes covered a wider range of functions, including those associated with stress responses. Future research using novel molecular transcriptomic/genomic characterization of S. thompsoni will allow more complete understanding of individual-, population-, and species-level responses to environmental variability and prediction of future dynamics of Southern Ocean food webs and ecosystems.
Neimanis, Karina; Staples, James F; Hüner, Norman P A; McDonald, Allison E
2013-09-10
Alternative oxidase (AOX) is a terminal ubiquinol oxidase present in the respiratory chain of all angiosperms investigated to date, but AOX distribution in other members of the Viridiplantae is less clear. We assessed the taxonomic distribution of AOX using bioinformatics. Multiple sequence alignments compared AOX proteins and examined amino acid residues involved in AOX catalytic function and post-translational regulation. Novel AOX sequences were found in both Chlorophytes and Streptophytes and we conclude that AOX is widespread in the Viridiplantae. AOX multigene families are common in non-angiosperm plants and the appearance of AOX1 and AOX2 subtypes pre-dates the divergence of the Coniferophyta and Magnoliophyta. Residues involved in AOX catalytic function are highly conserved between Chlorophytes and Streptophytes, while AOX post-translational regulation likely differs in these two lineages. We demonstrate experimentally that an AOX gene is present in the moss Physcomitrella patens and that the gene is transcribed. Our findings suggest that AOX will likely exert an influence on plant respiration and carbon metabolism in non-angiosperms such as green algae, bryophytes, liverworts, lycopods, ferns, gnetophytes, and gymnosperms and that further research in these systems is required. Copyright © 2013 Elsevier B.V. All rights reserved.
Chu, Audrey Y; Deng, Xuan; Fisher, Virginia A; Drong, Alexander; Zhang, Yang; Feitosa, Mary F; Liu, Ching-Ti; Weeks, Olivia; Choh, Audrey C; Duan, Qing; Dyer, Thomas D; Eicher, John D; Guo, Xiuqing; Heard-Costa, Nancy L; Kacprowski, Tim; Kent, Jack W; Lange, Leslie A; Liu, Xinggang; Lohman, Kurt; Lu, Lingyi; Mahajan, Anubha; O'Connell, Jeffrey R; Parihar, Ankita; Peralta, Juan M; Smith, Albert V; Zhang, Yi; Homuth, Georg; Kissebah, Ahmed H; Kullberg, Joel; Laqua, René; Launer, Lenore J; Nauck, Matthias; Olivier, Michael; Peyser, Patricia A; Terry, James G; Wojczynski, Mary K; Yao, Jie; Bielak, Lawrence F; Blangero, John; Borecki, Ingrid B; Bowden, Donald W; Carr, John Jeffrey; Czerwinski, Stefan A; Ding, Jingzhong; Friedrich, Nele; Gudnason, Vilmunder; Harris, Tamara B; Ingelsson, Erik; Johnson, Andrew D; Kardia, Sharon L R; Langefeld, Carl D; Lind, Lars; Liu, Yongmei; Mitchell, Braxton D; Morris, Andrew P; Mosley, Thomas H; Rotter, Jerome I; Shuldiner, Alan R; Towne, Bradford; Völzke, Henry; Wallaschofski, Henri; Wilson, James G; Allison, Matthew; Lindgren, Cecilia M; Goessling, Wolfram; Cupples, L Adrienne; Steinhauser, Matthew L; Fox, Caroline S
2017-01-01
Variation in body fat distribution contributes to the metabolic sequelae of obesity. The genetic determinants of body fat distribution are poorly understood. The goal of this study was to gain new insights into the underlying genetics of body fat distribution by conducting sample-size-weighted fixed-effects genome-wide association meta-analyses in up to 9,594 women and 8,738 men of European, African, Hispanic and Chinese ancestry, with and without sex stratification, for six traits associated with ectopic fat (hereinafter referred to as ectopic-fat traits). In total, we identified seven new loci associated with ectopic-fat traits (ATXN1, UBE2E2, EBF1, RREB1, GSDMB, GRAMD3 and ENSA; P < 5 × 10⁻⁸; false discovery rate < 1%). Functional analysis of these genes showed that loss of function of either Atxn1 or Ube2e2 in primary mouse adipose progenitor cells impaired adipocyte differentiation, suggesting physiological roles for ATXN1 and UBE2E2 in adipogenesis. Future studies are necessary to further explore the mechanisms by which these genes affect adipocyte biology and how their perturbations contribute to systemic metabolic disease.
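A sample-size-weighted meta-analysis of the kind named in this abstract is commonly computed as a weighted Stouffer Z-score, with each study weighted by the square root of its sample size. The sketch below assumes that scheme; the abstract does not state the exact weights or software used, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z-scores with weights w_i = sqrt(N_i).

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)); each z_i must carry the sign
    of the effect direction for the same reference allele.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value of a Z-score under the standard normal distribution."""
    return 2.0 * NormalDist().cdf(-abs(z))
```

For example, two equally sized studies that each report Z = 2 combine to Z = 2√2 ≈ 2.83, which is how pooling cohorts can push a consistent but modest signal toward a genome-wide threshold such as P < 5 × 10⁻⁸.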
The Choice between MapMan and Gene Ontology for Automated Gene Function Prediction in Plant Science
Klie, Sebastian; Nikoloski, Zoran
2012-01-01
Since the introduction of the Gene Ontology (GO), the analysis of high-throughput data has become tightly coupled with the use of ontologies to establish associations between knowledge and data in an automated fashion. Ontologies provide a systematic description of knowledge by a controlled vocabulary of defined structure in which ontological concepts are connected by pre-defined relationships. In plant science, MapMan and GO offer two alternatives for ontology-driven analyses. Unlike GO, initially developed to characterize microbial systems, MapMan was specifically designed to cover plant-specific pathways and processes. While the dependencies between concepts in MapMan are modeled as a tree, in GO these are captured in a directed acyclic graph. Therefore, the difference in ontologies may cause discrepancies in data reduction, visualization, and hypothesis generation. Here we provide the first systematic comparative analysis of GO and MapMan for the case of the model plant species Arabidopsis thaliana (Arabidopsis) with respect to their structural properties and difference in distributions of information content. In addition, we investigate the effect of the two ontologies on the specificity and sensitivity of automated gene function prediction via the coupling of co-expression networks and the guilt-by-association principle. Automated gene function prediction is particularly needed for the model plant Arabidopsis in which only half of genes have been functionally annotated based on sequence similarity to known genes. The results highlight the need for structured representation of species-specific biological knowledge, and warrant caution in the design principles employed in future ontologies. PMID:22754563
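The structural difference between a tree and a directed acyclic graph, and its effect on information content, can be sketched as follows. This is a minimal illustration under assumptions: information content is taken as the negative base-2 logarithm of the fraction of genes annotated to a term or any of its descendants (a common definition that may differ from the one used in the paper), and the ontology and gene names in the usage note are invented.

```python
import math


def ancestors(term, parents):
    """Return the term plus all of its ancestors, given a child -> parents map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def information_content(annotations, parents):
    """IC(t) = -log2(fraction of genes annotated to t or to a descendant of t)."""
    genes_by_term = {}
    for gene, terms in annotations.items():
        for term in terms:
            # An annotation propagates upward along every parent link.
            for anc in ancestors(term, parents):
                genes_by_term.setdefault(anc, set()).add(gene)
    total = len(annotations)
    return {t: -math.log2(len(g) / total) for t, g in genes_by_term.items()}
```

With `parents = {"b": ["root"], "c": ["root"], "d": ["b", "c"]}`, term `d` has two parents, which a DAG allows and a tree does not. Annotating one of four genes to `d` then raises the gene counts of both `b` and `c`, so the same annotation set would yield a different information-content distribution if `d` were forced under a single parent.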
Analysis of functional importance of binding sites in the Drosophila gap gene network model.
Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria
2015-01-01
The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.
Niu, Xin; Guan, Yuxiang; Chen, Shoukun; Li, Haifeng
2017-08-15
As a superfamily of transcription factors (TFs), the basic helix-loop-helix (bHLH) proteins have been characterized functionally in many plants with a vital role in the regulation of diverse biological processes including growth, development, and response to various stresses. However, no systematic analysis of the bHLH TFs has been reported in Brachypodium distachyon, an emerging model plant in Poaceae. A total of 146 bHLH TFs were identified in the Brachypodium distachyon genome and classified into 24 subfamilies. BdbHLHs in the same subfamily share similar protein motifs and gene structures. Gene duplication events showed a close relationship to rice, maize and sorghum, and segment duplications might play a key role in the expansion of this gene family. The amino acid sequences of the bHLH domains were highly conserved, especially Leu-27 and Leu-54. Based on the predicted binding activities, the BdbHLHs were divided into DNA binding and non-DNA binding types. According to the gene ontology (GO) analysis, BdbHLHs were speculated to function in a homodimer or heterodimer manner. By integrating the available high throughput data in public databases and results of quantitative RT-PCR, we found the expression profiles of BdbHLHs were different, implying their differentiated functions. One hundred forty-six BdbHLHs were identified and their conserved domains, sequence features, phylogenetic relationship, chromosomal distribution, GO annotations, gene structures, gene duplication and expression profiles were investigated. Our findings lay a foundation for further evolutionary and functional elucidation of BdbHLH genes.
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
Lessons from the canine Oxtr gene: populations, variants and functional aspects.
Bence, M; Marx, P; Szantai, E; Kubinyi, E; Ronai, Z; Banlaki, Z
2017-04-01
Oxytocin receptor (OXTR) acts as a key behavioral modulator of the central nervous system, affecting social behavior, stress, affiliation and cognitive functions. Variants of the Oxtr gene are known to influence behavior both in animals and humans; however, canine Oxtr polymorphisms are less characterized in terms of possible relevance to function, selection criteria in breeding and domestication. In this report, we provide a detailed characterization of common variants of the canine Oxtr gene. In particular (1) novel polymorphisms were identified by direct sequencing of wolf and dog samples, (2) allelic distributions and pairwise linkage disequilibrium patterns of several canine populations were compared, (3) neighbor joining (NJ) tree based on common single nucleotide polymorphisms (SNPs) was constructed, (4) mRNA expression features were assessed, (5) a novel splice variant was detected and (6) in vitro functional assays were performed. Results indicate marked differences regarding Oxtr variations between purebred dogs of different breeds, free-ranging dog populations, wolf subspecies and golden jackals. This, together with existence of explicitly dog-specific alleles and data obtained from the NJ tree implies that Oxtr could indeed have been a target gene during domestication and selection for human preferred aspects of temperament and social behavior. This assumption is further supported by the present observations on gene expression patterns within the brain and luciferase reporter experiments, providing a molecular level link between certain canine Oxtr polymorphisms and differences in nervous system function and behavior. © 2016 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Adamek, Martina; Alanjary, Mohammad; Sales-Ortells, Helena; Goodfellow, Michael; Bull, Alan T; Winkler, Anika; Wibberg, Daniel; Kalinowski, Jörn; Ziemert, Nadine
2018-06-01
Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary. Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis, showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome, thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes. Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites.
Our results cast light on the interconnections between secondary metabolite gene clusters and provide a way to prioritize biosynthetic pathways in the search and discovery of novel compounds.
Mobile genes in the human microbiome are structured from global to individual scales
Brito, IL; Jupiter, SD; Jenkins, AP; Naisilisili, W; Tamminen, M; Smillie, CS; Wortman, JR; Birren, BW; Xavier, RJ; Blainey, PC; Singh, AK; Gevers, D; Alm, EJ
2016-01-01
Recent work has underscored the importance of the microbiome in human health, largely attributing differences in phenotype to differences in the species present across individuals. But mobile genes can confer profoundly different phenotypes on different strains of the same species. Little is known about the function and distribution of mobile genes in the human microbiome, and in particular whether the gene pool is globally homogeneous or constrained by human population structure. Here, we investigate this question by comparing the mobile genes found in the microbiomes of 81 metropolitan North Americans with those of 172 agrarian Fiji islanders using a combination of single-cell genomics and metagenomics. We find large differences in mobile gene content between the Fijian and North American microbiomes, with functional variation that mirrors known dietary differences such as the excess of plant-based starch degradation genes. Remarkably, differences are also observed between the mobile gene pools of proximal Fijian villages, even though microbiome composition across villages is similar. Finally, we observe high rates of recombination leading to individual-specific mobile elements, suggesting that the abundance of some genes may reflect environmental selection rather than dispersal limitation. Together, these data support the hypothesis that human activities and behaviors provide selective pressures that shape mobile gene pools, and that acquisition of mobile genes is important to colonizing specific human populations. PMID:27409808
Phi Class of Glutathione S-transferase Gene Superfamily Widely Exists in Nonplant Taxonomic Groups
Munyampundu, Jean-Pierre; Xu, You-Ping; Cai, Xin-Zhong
2016-01-01
Glutathione S-transferases (GSTs) constitute a superfamily of enzymes involved in detoxification of noxious compounds and protection against oxidative damage. GST class Phi (GSTF), one of the important classes of plant GSTs, has long been considered as plant specific but was recently found in basidiomycete fungi. However, the range of nonplant taxonomic groups containing GSTFs remains unknown. In this study, the distribution and phylogenetic relationships of nonplant GSTFs were investigated. We identified GSTFs in ascomycete fungi, myxobacteria, and protists Naegleria gruberi and Aureococcus anophagefferens. GSTF occurrence in these bacteria and protists correlated with their genome sizes and habitats. While this link was missing across ascomycetes, the distribution and abundance of GSTFs among ascomycete genomes could be associated with their lifestyles to some extent. Sequence comparison, gene structure, and phylogenetic analyses indicated divergence among nonplant GSTFs, suggesting polyphyletic origins during evolution. Furthermore, in silico prediction of functional partners suggested functional diversification among nonplant GSTFs. PMID:26884677
Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi
2010-01-01
The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci—spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid. PMID:20010798
Chen, Shuowen; Khan, Muhammad J.; Loor, Juan J.
2013-01-01
Characterization and biological roles of the peroxisome proliferator-activated receptor (PPAR) isotypes are well known in monogastrics, but not in ruminants. However, a wealth of information has accumulated in little more than a decade on ruminant PPARs including isotype tissue distribution, response to synthetic and natural agonists, gene targets, and factors affecting their expression. Functional characterization demonstrated that, as in monogastrics, the PPAR isotypes control expression of genes involved in lipid metabolism, anti-inflammatory response, development, and growth. Contrary to mouse, however, the PPARγ gene network appears to control milk fat synthesis in lactating ruminants. As in monogastrics, PPAR isotypes in ruminants are activated by long-chain fatty acids, making them ideal candidates for fine-tuning metabolism in these species via nutrients. In this regard, using information accumulated in ruminants and monogastrics, we propose a model of PPAR isotype-driven biological functions encompassing key tissues during the peripartal period in dairy cattle. PMID:23737762
Liu, Qinglong; Tang, Jingchun; Bai, Zhihui; Hecker, Markus; Giesy, John P.
2015-01-01
Genes that encode for enzymes that can degrade petroleum hydrocarbons (PHs) are critical for the ability of microorganisms to bioremediate soils contaminated with PHs. Distributions of two petroleum-degrading genes AlkB and Nah in soils collected from three zones of the Dagang Oilfield, Tianjin, China were investigated. Numbers of copies of AlkB ranged between 9.1 × 10⁵ and 1.9 × 10⁷ copies/g dry mass (dm) soil, and were positively correlated with total concentrations of PHs (TPH) (R² = 0.573, p = 0.032) and alkanes (C33 ~ C40) (R² = 0.914, p < 0.01). The Nah gene was distributed relatively evenly among sampling zones, ranging between 1.9 × 10⁷ and 1.1 × 10⁸ copies/g dm soil, and was negatively correlated with concentrations of total aromatic hydrocarbons (TAH) (R² = −0.567, p = 0.035) and ∑16 PAHs (R² = −0.599, p = 0.023). Results of a factor analysis showed that individual samples of soils were not ordinated as a function of the zones. PMID:26086670
Moreno-Sánchez, Natalia; Rueda, Julia; Reverter, Antonio; Carabaño, María Jesús; Díaz, Clara
2012-03-01
Variations in the transcriptome from one skeletal muscle type to another still remain unknown. The reliable identification of stable gene coexpression networks is essential to unravel gene functions and define biological processes. The differential expression of two distinct muscles, M. flexor digitorum (FD) and M. psoas major (PM), was studied using microarrays in cattle to illustrate muscle-specific transcription patterns and to quantify changes in connectivity regarding the expected gene coexpression pattern. A total of 206 genes were differentially expressed (DE), 94 upregulated in PM and 112 in FD. The distribution of DE genes in pathways and biological functions was explored in the context of systems biology. Global interactomes for genes of interest were predicted. Fast/slow twitch genes, genes coding for extracellular matrix, ribosomal and heat shock proteins, and fatty acid uptake centred the specific gene expression patterns per muscle. Genes involved in repairing mechanisms, such as ribosomal and heat shock proteins, suggested a differential ability of muscles to react to similar stressing factors, acting preferentially in slow twitch muscles. Muscle attributes do not seem to be completely explained by the muscle fibre composition. Changes in connectivity accounted for 24% of significant correlations between DE genes. Genes changing their connectivity mostly seem to contribute to the main differential attributes that characterize each specific muscle type. These results underscore the unique flexibility of skeletal muscle, where a substantial set of genes are able to change their behavior depending on the circumstances.
Ancient Eukaryotic Origin and Evolutionary Plasticity of Nuclear Lamina
Field, Mark C.
2016-01-01
The emergence of the nucleus was a major event of eukaryogenesis. How the nuclear envelope (NE) arose and acquired functions governing chromatin organization and epigenetic control has direct bearing on origins of developmental/stage-specific expression programs. The configuration of the NE and the associated lamina in the last eukaryotic common ancestor (LECA) is of major significance and can provide insight into activities within the LECA nucleus. Subsequent lamina evolution, alterations, and adaptations inform on the variation and selection of distinct mechanisms that subtend gene expression in distinct taxa. Understanding lamina evolution has been difficult due to the diversity and limited taxonomic distributions of the three currently known highly distinct nuclear lamina. We rigorously searched available sequence data for an expanded view of the distribution of known lamina and lamina-associated proteins. While the lamina proteins of plants and trypanosomes are indeed taxonomically restricted, homologs of metazoan lamins and key lamin-binding proteins have significantly broader distributions, and a lamin gene tree supports vertical evolution from the LECA. Two protist lamins from highly divergent taxa target the nucleus in mammalian cells and polymerize into filamentous structures, suggesting functional conservation of distant lamin homologs. Significantly, a high level of divergence of lamin homologs within certain eukaryotic groups and the apparent absence of lamins and/or the presence of seemingly different lamina proteins in many eukaryotes suggests great evolutionary plasticity in structures at the NE, and hence mechanisms of chromatin tethering and epigenetic gene control. PMID:27189989
Ramachandran, Arthi; Walsh, David A
2015-10-01
The diversity and distribution of methylotrophic bacteria have been investigated in the oceans and lakes using the methanol dehydrogenase mxaF gene as a functional marker. However, pelagic marine (OM43) and freshwater (LD28 and PRD01a001B) methylotrophs within the Betaproteobacteria lack mxaF, instead possessing a related xoxF4-encoded methanol dehydrogenase. Here, we developed and employed xoxF4 as a complementary functional gene marker to mxaF for studying methylotrophs in aquatic environments. Using xoxF4, we detected OM43-related and LD28-related methylotrophs in the ocean and freshwaters of North America, respectively, and showed the coexistence of these two lineages in a large estuarine system (St Lawrence Estuary). Gene expression patterns of xoxF4 supported a positive relationship between xoxF4-containing methylotroph activity and springtime productivity, suggesting phytoplankton blooms are a source of methylotrophic substrates. Further investigation of methanol dehydrogenase diversity in pelagic ecosystems using comparative metagenomics provided strong support for a widespread distribution of xoxF4 (as well as several distinct xoxF5) containing methylotrophs in marine and freshwater surface waters. In total, these results demonstrate a geographical distribution of OM43/LD28-related methylotrophs that includes marine and freshwaters and suggest that methylotrophy occurring in the water column is an important component of lake and estuary carbon cycling and biogeochemistry. © FEMS 2015. All rights reserved.
Gene–culture coevolution in whales and dolphins
Whitehead, Hal
2017-01-01
Whales and dolphins (Cetacea) have excellent social learning skills as well as a long and strong mother–calf bond. These features produce stable cultures, and, in some species, sympatric groups with different cultures. There is evidence and speculation that this cultural transmission of behavior has affected gene distributions. Culture seems to have driven killer whales into distinct ecotypes, which may be incipient species or subspecies. There are ecotype-specific signals of selection in functional genes that correspond to cultural foraging behavior and habitat use by the different ecotypes. The five species of whale with matrilineal social systems have remarkably low diversity of mtDNA. Cultural hitchhiking, the transmission of functionally neutral genes in parallel with selective cultural traits, is a plausible hypothesis for this low diversity, especially in sperm whales. In killer whales the ecotype divisions, together with founding bottlenecks, selection, and cultural hitchhiking, likely explain the low mtDNA diversity. Several cetacean species show habitat-specific distributions of mtDNA haplotypes, probably the result of mother–offspring cultural transmission of migration routes or destinations. In bottlenose dolphins, remarkable small-scale differences in haplotype distribution result from maternal cultural transmission of foraging methods, and large-scale redistributions of sperm whale cultural clans in the Pacific have likely changed mitochondrial genetic geography. With the acceleration of genomics new results should come fast, but understanding gene–culture coevolution will be hampered by the measured pace of research on the socio-cultural side of cetacean biology. PMID:28739936
Ocean biogeochemistry modeled with emergent trait-based genomics
NASA Astrophysics Data System (ADS)
Coles, V. J.; Stukel, M. R.; Brooks, M. T.; Burd, A.; Crump, B. C.; Moran, M. A.; Paul, J. H.; Satinsky, B. M.; Yager, P. L.; Zielinski, B. L.; Hood, R. R.
2017-12-01
Marine ecosystem models have advanced to incorporate metabolic pathways discovered with genomic sequencing, but direct comparisons between models and “omics” data are lacking. We developed a model that directly simulates metagenomes and metatranscriptomes for comparison with observations. Model microbes were randomly assigned genes for specialized functions, and communities of 68 species were simulated in the Atlantic Ocean. Unfit organisms were replaced, and the model self-organized to develop community genomes and transcriptomes. Emergent communities from simulations that were initialized with different cohorts of randomly generated microbes all produced realistic vertical and horizontal ocean nutrient, genome, and transcriptome gradients. Thus, the library of gene functions available to the community, rather than the distribution of functions among specific organisms, drove community assembly and biogeochemical gradients in the model ocean.
Applications of statistical physics and information theory to the analysis of DNA sequences
NASA Astrophysics Data System (ADS)
Grosse, Ivo
2000-10-01
DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
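The mutual information function mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition I(k) = Σ p_ab(k) log2[p_ab(k) / (p_a p_b)], where p_ab(k) is the frequency of finding nucleotide a and, k positions downstream, nucleotide b. The function names and the toy sequence generator below are assumptions made for illustration only; the generator is a simple codon-position bias model, not the pseudo-exon model of the thesis, and the code is not the author's implementation.

```python
import math
import random
from collections import Counter


def mutual_information(seq, k):
    """Estimate I(k) in bits between symbols separated by k positions."""
    # Count every pair (symbol at i, symbol at i + k).
    pairs = Counter(zip(seq, seq[k:]))
    n = sum(pairs.values())
    if n == 0:
        return 0.0
    # Marginal counts of the first and second member of each pair.
    left, right = Counter(), Counter()
    for (a, b), count in pairs.items():
        left[a] += count
        right[b] += count
    info = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts.
        info += p_ab * math.log2(p_ab * n * n / (left[a] * right[b]))
    return info


def toy_coding_sequence(n_codons, seed=0):
    """Hypothetical toy model: nucleotide bias depends on codon position."""
    rng = random.Random(seed)
    # Weights for A, C, G, T at codon positions 1, 2 and 3 (illustrative values).
    weights = [(1, 1, 3, 1), (1, 1, 1, 1), (1, 1, 0, 1)]
    return "".join(
        rng.choices("ACGT", weights=weights[pos])[0]
        for _ in range(n_codons)
        for pos in range(3)
    )


if __name__ == "__main__":
    seq = toy_coding_sequence(10_000)
    for k in range(1, 10):
        print(k, round(mutual_information(seq, k), 4))
```

In this toy model the printed values are larger when k is a multiple of three than at the neighbouring distances, which is the kind of period-three signal the abstract describes for genomic DNA. Real sequences would of course need to be read from a file rather than simulated.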
Cheviron, Zachary A.; Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Eddy, Douglas K.; Jones, Jennifer; Carling, Matthew D.; Witt, Christopher C.; Moriyama, Hideaki; Weber, Roy E.; Fago, Angela; Storz, Jay F.
2014-01-01
In air-breathing vertebrates, the physiologically optimal blood-O2 affinity is jointly determined by the prevailing partial pressure of atmospheric O2, the efficacy of pulmonary O2 transfer, and internal metabolic demands. Consequently, genetic variation in the oxygenation properties of hemoglobin (Hb) may be subject to spatially varying selection in species with broad elevational distributions. Here we report the results of a combined functional and evolutionary analysis of Hb polymorphism in the rufous-collared sparrow (Zonotrichia capensis), a species that is continuously distributed across a steep elevational gradient on the Pacific slope of the Peruvian Andes. We integrated a population genomic analysis that included all postnatally expressed Hb genes with functional studies of naturally occurring Hb variants, as well as recombinant Hb (rHb) mutants that were engineered through site-directed mutagenesis. We identified three clinally varying amino acid polymorphisms: Two in the αA-globin gene, which encodes the α-chain subunits of the major HbA isoform, and one in the αD-globin gene, which encodes the α-chain subunits of the minor HbD isoform. We then constructed and experimentally tested single- and double-mutant rHbs representing each of the alternative αA-globin genotypes that predominate at different elevations. Although the locus-specific patterns of altitudinal differentiation suggested a history of spatially varying selection acting on Hb polymorphism, the experimental tests demonstrated that the observed amino acid mutations have no discernible effect on respiratory properties of the HbA or HbD isoforms. These results highlight the importance of experimentally validating the hypothesized effects of genetic changes in protein function to avoid the pitfalls of adaptive storytelling. PMID:25135942
Mining biological databases for candidate disease genes
NASA Astrophysics Data System (ADS)
Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.
2001-07-01
The publicly-funded effort to sequence the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has currently produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary `draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of this data identified approximately 30,000 - 40,000 human `genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequence and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Henríquez-Valencia, Carlos; Arenas-M, Anita; Medina, Joaquín; Canales, Javier
2018-01-01
Sulfur is an essential nutrient for plant growth and development. Sulfur is a constituent of proteins, the plasma membrane and cell walls, among other important cellular components. To obtain new insights into the gene regulatory networks underlying the sulfate response, we performed an integrative meta-analysis of transcriptomic data from five different sulfate experiments available in public databases. This bioinformatic approach allowed us to identify a robust set of genes whose expression depends only on sulfate availability, indicating that those genes play an important role in the sulfate response. In relation to sulfate metabolism, the biological function of approximately 45% of these genes is currently unknown. Moreover, we found several consistent Gene Ontology terms related to biological processes that have not been extensively studied in the context of the sulfate response; these processes include cell wall organization, carbohydrate metabolism, nitrogen compound transport, and the regulation of proteolysis. Gene co-expression network analyses revealed relationships between the sulfate-responsive genes that were distributed among seven function-specific co-expression modules. The most connected genes in the sulfate co-expression network belong to a module related to the carbon response, suggesting that this biological function plays an important role in the control of the sulfate response. Temporal analyses of the network suggest that sulfate starvation generates a biphasic response, in which major changes in gene expression occur during both the early and late responses. Network analyses predicted that the sulfate response is regulated by a limited number of transcription factors, including MYBs, bZIPs, and NF-YAs. In conclusion, our analysis identified new candidate genes and provided new hypotheses to advance our understanding of the transcriptional regulation of sulfate metabolism in plants. PMID:29692794
Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan
2018-03-28
Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba) showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure-function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants.
Szperl, Agata M.; Golachowska, Magdalena R.; Bruinenberg, Marcel; Prekeris, Rytis; Thunnissen, Andy-Mark W. H.; Karrenbeld, Arend; Dijkstra, Gerard; Hoekstra, Dick; Mercer, David; Ksiazyk, Janusz; Wijmenga, Cisca; Wapenaar, Martin C.; Rings, Edmond H. H. M.; van IJzendoorn, Sven C. D.
2010-01-01
Objectives: Microvillus inclusion disease (MVID) is a rare autosomal recessive enteropathy characterized by intractable diarrhea and malabsorption. Recently, various MYO5B gene mutations have been identified in MVID patients. Interestingly, several MVID patients showed only a MYO5B mutation in one allele (heterozygous) or no mutations in the MYO5B gene, illustrating the need to further functionally characterize the cell biological effects of the MYO5B mutations. Methods: The genomic DNA of nine patients diagnosed with microvillus inclusion disease was screened for MYO5B mutations, and qPCR and immunohistochemistry on the material of two patients were performed to investigate resultant cellular consequences. Results: We demonstrate for the first time that MYO5B mutations can be correlated with altered myosin Vb mRNA expression and with an aberrant subcellular distribution of the myosin Vb protein. Moreover, we demonstrate that the typical and myosin Vb–controlled accumulation of rab11a- and FIP5-positive recycling endosomes in the apical cytoplasm of the cells is abolished in MVID enterocytes, which is indicative of altered myosin Vb function. Also, we report 8 novel MYO5B mutations in 9 MVID patients of various ethnic backgrounds, including compound heterozygous mutations. Conclusions: Our functional analysis indicates that MYO5B mutations can be correlated with an aberrant subcellular distribution of the myosin Vb protein and apical recycling endosomes which, together with the additional compound heterozygous mutations, significantly strengthens the link between MYO5B and MVID. PMID:21206382
Ye, Heng; Feng, Jiuhuan; Zhang, Lihua; Zhang, Jinfeng; Mispan, Muhamad S.; Cao, Zhuanqin; Beighley, Donn H.; Yang, Jianchang; Gu, Xing-You
2015-01-01
Natural variation in seed dormancy is controlled by multiple genes mapped as quantitative trait loci in major crop or model plants. This research aimed to clone and characterize the Seed Dormancy1-2 (qSD1-2) locus associated with endosperm-imposed dormancy and plant height in rice (Oryza sativa). qSD1-2 was delimited to a 20-kb region, which contains OsGA20ox2 and had an additive effect on germination. Naturally occurring or induced loss-of-function mutations of the gibberellin (GA) synthesis gene enhanced seed dormancy and also reduced plant height. Expression of this gene in seeds (including endospermic cells) during early development increased GA accumulation to promote tissue morphogenesis and maturation programs. The mutant allele prevalent in semidwarf cultivars reduced the seed GA content by up to 2-fold at the early stage, which decelerated tissue morphogenesis including endosperm cell differentiation, delayed abscisic acid accumulation by a shift in the temporal distribution pattern, and postponed dehydration, physiological maturity, and germinability development. As the endosperm of developing seeds dominates the moisture equilibrium and desiccation status of the embryo in cereal crops, qSD1-2 is proposed to control primary dormancy by a GA-regulated dehydration mechanism. Allelic distribution of OsGA20ox2, the rice Green Revolution gene, was associated with the indica and japonica subspeciation. However, this research provided no evidence that the primitive indica- and common japonica-specific alleles at the presumably domestication-related locus functionally differentiate in plant height and seed dormancy. Thus, the evolutionary mechanism of this agriculturally important gene remains open for discussion. PMID:26373662
Ye, Heng; Feng, Jiuhuan; Zhang, Lihua; Zhang, Jinfeng; Mispan, Muhamad S; Cao, Zhuanqin; Beighley, Donn H; Yang, Jianchang; Gu, Xing-You
2015-11-01
Natural variation in seed dormancy is controlled by multiple genes mapped as quantitative trait loci in major crop or model plants. This research aimed to clone and characterize the Seed Dormancy1-2 (qSD1-2) locus associated with endosperm-imposed dormancy and plant height in rice (Oryza sativa). qSD1-2 was delimited to a 20-kb region, which contains OsGA20ox2 and had an additive effect on germination. Naturally occurring or induced loss-of-function mutations of the gibberellin (GA) synthesis gene enhanced seed dormancy and also reduced plant height. Expression of this gene in seeds (including endospermic cells) during early development increased GA accumulation to promote tissue morphogenesis and maturation programs. The mutant allele prevalent in semidwarf cultivars reduced the seed GA content by up to 2-fold at the early stage, which decelerated tissue morphogenesis including endosperm cell differentiation, delayed abscisic acid accumulation by a shift in the temporal distribution pattern, and postponed dehydration, physiological maturity, and germinability development. As the endosperm of developing seeds dominates the moisture equilibrium and desiccation status of the embryo in cereal crops, qSD1-2 is proposed to control primary dormancy by a GA-regulated dehydration mechanism. Allelic distribution of OsGA20ox2, the rice Green Revolution gene, was associated with the indica and japonica subspeciation. However, this research provided no evidence that the primitive indica- and common japonica-specific alleles at the presumably domestication-related locus functionally differentiate in plant height and seed dormancy. Thus, the evolutionary mechanism of this agriculturally important gene remains open for discussion. © 2015 American Society of Plant Biologists. All Rights Reserved.
Aubry-Hivet, D; Nziengui, H; Rapp, K; Oliveira, O; Paponov, I A; Li, Y; Hauslage, J; Vagt, N; Braun, M; Ditengou, F A; Dovzhenko, A; Palme, K
2014-01-01
Plant roots are among the most intensively studied biological systems in gravity research. Altered gravity induces asymmetric cell growth leading to root bending. Differential distribution of the phytohormone auxin underlies root responses to gravity, being coordinated by auxin efflux transporters from the PIN family. The objective of this study was to compare early transcriptomic changes in roots of Arabidopsis thaliana wild type, and pin2 and pin3 mutants under parabolic flight conditions and to correlate these changes to auxin distribution. Parabolic flights allow comparison of transient 1-g, hypergravity and microgravity effects in living organisms in parallel. We found common and mutation-related genes differentially expressed in response to transient microgravity phases. Gene ontology analysis of common genes revealed lipid metabolism, response to stress factors and light categories as primarily involved in response to transient microgravity phases, suggesting that fundamental reorganisation of metabolic pathways functions upstream of a further signal mediating hormonal network. Gene expression changes in roots lacking the columella-located PIN3 were stronger than in those deprived of the epidermis and cortex cell-specific PIN2. Moreover, repetitive exposure to microgravity/hypergravity and gravity/hypergravity flight phases induced an up-regulation of auxin responsive genes in wild type and pin2 roots, but not in pin3 roots, suggesting a critical function of PIN3 in mediating auxin fluxes in response to transient microgravity phases. Our study provides important insights towards understanding signal transduction processes in transient microgravity conditions by combining for the first time the parabolic flight platform with the transcriptome analysis of different genetic mutants in the model plant, Arabidopsis. © 2013 German Botanical Society and The Royal Botanical Society of the Netherlands.
Genome-Wide Identification of the Invertase Gene Family in Populus.
Chen, Zhong; Gao, Kai; Su, Xiaoxing; Rao, Pian; An, Xinmin
2015-01-01
Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to the expansion of the neutral/alkaline sub-family. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials.
Genome-Wide Identification of the Invertase Gene Family in Populus
Su, Xiaoxing; Rao, Pian; An, Xinmin
2015-01-01
Invertase plays a crucial role in carbohydrate partitioning and plant development as it catalyses the irreversible hydrolysis of sucrose into glucose and fructose. The invertase family in plants is composed of two sub-families: acid invertases, which are targeted to the cell wall and vacuole; and neutral/alkaline invertases, which function in the cytosol. In this study, 5 cell wall invertase genes (PtCWINV1-5), 3 vacuolar invertase genes (PtVINV1-3) and 16 neutral/alkaline invertase genes (PtNINV1-16) were identified in the Populus genome and found to be distributed on 14 chromosomes. A comprehensive analysis of poplar invertase genes was performed, including structures, chromosome location, phylogeny, evolutionary pattern and expression profiles. Phylogenetic analysis indicated that the two sub-families were both divided into two clades. Segmental duplication contributed to the expansion of the neutral/alkaline sub-family. Furthermore, the Populus invertase genes displayed differential expression in roots, stems, leaves, leaf buds and in response to salt/cold stress and pathogen infection. In addition, the analysis of enzyme activity and sugar content revealed that invertase genes play key roles in the sucrose metabolism of various tissues and organs in poplar. This work lays the foundation for future functional analysis of the invertase genes in Populus and other woody perennials. PMID:26393355
Genome-wide identification and characterization of Fox genes in the silkworm, Bombyx mori.
Song, JiangBo; Li, ZhiQuan; Tong, XiaoLing; Chen, Cong; Chen, Min; Meng, Gang; Chen, Peng; Li, ChunLin; Xin, YaQun; Gai, TingTing; Dai, FangYin; Lu, Cheng
2015-09-01
The forkhead box (Fox) transcription factor family is characterized by the forkhead domain, a winged DNA-binding domain. The Fox genes have been classified into 23 subfamilies, designated FoxA to FoxS, of which the FoxR and FoxS subfamilies are specific to vertebrates. In this review, using whole-genome scanning, we identified 17 distinct Fox genes distributed on 13 chromosomes of the silkworm, Bombyx mori. A phylogenetic tree showed that the silkworm Fox genes could be classified into 13 subfamilies. The FoxK subfamily is specifically absent from the silkworm, although it is present in other lepidopteran insects, including Danaus plexippus and Heliconius melpomene. Microarray data revealed that the Fox genes have distinct expression patterns in the tissues on day 3 of the 5th instar larva. A Gene Ontology analysis suggested that the Fox genes have roles in cellular components, molecular functions, and biological processes, except in pore complex biogenesis. An analysis of the selective pressure on the proteins indicated that most of the amino acid sites in the Fox proteins are undergoing strong purifying selection. Here, we summarize the general characteristics of the Fox genes in the silkworm, which should support further functional studies of the silkworm Fox proteins.
A Solution to the C-Value Paradox and the Function of Junk DNA: The Genome Balance Hypothesis.
Freeling, Michael; Xu, Jie; Woodhouse, Margaret; Lisch, Damon
2015-06-01
The Genome Balance Hypothesis originated from a recent study that provided a mechanism for the phenomenon of genome dominance in ancient polyploids: unique 24nt RNA coverage near genes is greater in genes on the recessive subgenome irrespective of differences in gene expression. 24nt RNAs target transposons. Transposon position effects are now hypothesized to balance the expression of networked genes and provide spring-like tension between pericentromeric heterochromatin and microtubules. The balance (coordination) of gene expression and centromere movement is under selection. Our hypothesis states that this balance can be maintained by many or few transposons about equally well. We explain known balanced distributions of junk DNA within genomes and between subgenomes in allopolyploids (and our hypothesis passes "the onion test" for any so-called solution to the C-value paradox). Importantly, when the allotetraploid maize chromosomes delete redundant genes, their nearby transposons are also lost; this result is explained if transposons near genes function. The Genome Balance Hypothesis is hypothetical because the position effect mechanisms implicated are not proved to apply to all junk DNA, and the continuous nature of the centromeric and gene position effects has not yet been studied as a single phenomenon. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Cuadrado, A; Cardoso, M; Jouve, N
2008-01-01
A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs) or microsatellites. This type of sequence has sparked great interest as a means of studying genetic variation, linkage mapping, gene tagging and evolution. Although SSRs at different positions in a gene help determine the regulation of expression and the function of the protein produced, little attention has been paid to the chromosomal organisation and distribution of these sequences, even in model species. This review discusses the main achievements in the characterisation of long-range SSR organisation in the chromosomes of Triticum aestivum L., Secale cereale L., and Hordeum vulgare L. (all members of Triticeae). We have detected SSRs using an improved FISH technique based on the random primer labelling of synthetic oligonucleotides (15-24 bases) in multi-colour experiments. Detailed information on the presence and distribution of AC, AG and all the possible classes of trinucleotide repeats has been acquired. These data have revealed the motif-dependent and non-random chromosome distributions of SSRs in the different genomes, and allowed the correlation of particular SSRs with chromosome areas characterised by specific features (e.g., heterochromatin, euchromatin and centromeres) in all three species. The present review provides a detailed comparative study of the distribution of these SSRs in each of the seven chromosomes of the genomes A, B and D of wheat, H of barley and R of rye. The importance of SSRs in plant breeding and their possible role in chromosome structure, function and evolution is discussed. 2008 S. Karger AG, Basel
Rensing, Stefan A; Fritzowsky, Dana; Lang, Daniel; Reski, Ralf
2005-01-01
Background The moss Physcomitrella patens is an emerging plant model system due to its high rate of homologous recombination, haploidy, simple body plan, physiological properties as well as phylogenetic position. Available EST data was clustered and assembled, and provided the basis for a genome-wide analysis of protein encoding genes. Results We have clustered and assembled Physcomitrella patens EST and CDS data in order to represent the transcriptome of this non-seed plant. Clustering of the publicly available data and subsequent prediction resulted in a total of 19,081 non-redundant ORFs. Of these putative transcripts, approximately 30% have a homolog in both the rice and Arabidopsis transcriptomes. More than 130 transcripts are not present in seed plants but can be found in other kingdoms. These potential "retained genes" might have been lost during seed plant evolution. Functional annotation of these genes reveals unequal distribution among taxonomic groups and intriguing putative functions such as cytotoxicity and nucleic acid repair. Whereas introns in the moss are larger on average than in the seed plant Arabidopsis thaliana, the position and number of introns are approximately the same. In contrast to Arabidopsis, where CDS contain on average 44% G/C, in Physcomitrella the average G/C content is 50%. Interestingly, moss orthologs of Arabidopsis genes show a significant drift of codon fraction usage towards that of the seed plant. While averaged codon bias is the same in Physcomitrella and Arabidopsis, the distribution pattern is different, with 15% of moss genes being unbiased. Species-specific, sensitive and selective splice site prediction for Physcomitrella has been developed using a dataset of 368 donor and acceptor sites, utilizing a support vector machine. The prediction accuracy is better than that achieved with tools trained on Arabidopsis data.
Conclusion Analysis of the moss transcriptome displays differences in gene structure, codon and splice site usage in comparison with the seed plant Arabidopsis. Putative retained genes exhibit possible functions that might explain the peculiar physiological properties of mosses. Both the transcriptome representation (including a BLAST and retrieval service) and splice site prediction have been made available on , setting the basis for assembly and annotation of the Physcomitrella genome, of which draft shotgun sequences will become available in 2005. PMID:15784153
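The G/C comparison reported in the abstract above (about 44% in Arabidopsis CDS versus 50% in Physcomitrella) rests on a simple per-sequence statistic. The following is a minimal sketch of that statistic only; the function name `gc_fraction` and the toy sequences are illustrative assumptions and are not taken from the study.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence.

    Hypothetical helper for illustration; case-insensitive, and an
    empty sequence is defined here to have a G/C fraction of 0.0.
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def mean_gc(sequences) -> float:
    """Unweighted mean of per-sequence G/C fractions (one value per CDS)."""
    fractions = [gc_fraction(s) for s in sequences]
    return sum(fractions) / len(fractions) if fractions else 0.0
```

Averaging per sequence, as sketched here, weights every CDS equally; pooling all bases before dividing would weight long CDS more heavily, and the abstract does not say which convention was used.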
Hou, Xiao-Jin; Li, Si-Bei; Liu, Sheng-Rui; Hu, Chun-Gen; Zhang, Jin-Zhi
2014-01-01
MYB family genes are widely distributed in plants and comprise one of the largest transcription factor families involved in various developmental processes and defense responses of plants. To date, few MYB genes and little expression profiling have been reported for citrus. Here, we describe and classify 177 members of the sweet orange MYB gene (CsMYB) family in terms of their genomic gene structures and similarity to their putative Arabidopsis orthologs. According to these analyses, these CsMYBs were categorized into four groups (4R-MYB, 3R-MYB, 2R-MYB and 1R-MYB). Gene structure analysis revealed that 1R-MYB genes possess relatively more introns as compared with 2R-MYB genes. Investigation of their chromosomal localizations revealed that these CsMYBs are distributed across nine chromosomes. Sweet orange includes a relatively small number of MYB genes compared with the 198 members in Arabidopsis, presumably due to a paralog reduction related to repetitive sequence insertion into the promoter and non-coding transcribed regions of the genes. Comparative studies of CsMYBs and Arabidopsis showed that CsMYBs had fewer gene duplication events. Expression analysis revealed that the MYB gene family has a wide expression profile in sweet orange development and plays important roles in development and stress responses. In addition, 337 new putative microsatellites with flanking sequences sufficient for primer design were also identified from the 177 CsMYBs. These results provide a useful reference for the selection of candidate MYB genes for cloning and further functional analysis for citrus. PMID:25375352
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalyuzhnaya, Marina G.; Nercessian, Olivier; Lapidus, Alla
2004-07-01
The recently generated database of microbial genes from an oligotrophic environment populated by a calculated 1,800 major phylotypes (the Sargasso Sea metagenome) presents a great source for expanding local databases of genes indicative of a specific function. In this paper we analyze the Sargasso Sea metagenome in terms of the presence of methanopterin-linked C1 transfer genes that are a signature of methylotrophy. We conclude that more than 10 phylotypes possessing genes of interest are present in this environment, and a few of these are relatively abundant species. The sequences representative of the major phylotypes do not appear to belong to any known microbial group capable of methanopterin-linked C1 transfer. Instead, they separate from all known sequences on phylogenetic trees, pointing towards their affiliation with a novel microbial phylum. These data imply a broader distribution of methanopterin-linked functions in the microbial world than previously known.
A framework for the interpretation of de novo mutation in human disease
Samocha, Kaitlin E.; Robinson, Elise B.; Sanders, Stephan J.; Stevens, Christine; Sabo, Aniko; McGrath, Lauren M.; Kosmicki, Jack A.; Rehnström, Karola; Mallick, Swapan; Kirby, Andrew; Wall, Dennis P.; MacArthur, Daniel G.; Gabriel, Stacey B.; dePristo, Mark; Purcell, Shaun M.; Palotie, Aarno; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H.; Gibbs, Richard A.; Schellenberg, Gerard D.; Sutcliffe, James S.; Devlin, Bernie; Roeder, Kathryn; Neale, Benjamin M.; Daly, Mark J.
2014-01-01
Spontaneously arising (‘de novo’) mutations play an important role in medical genetics. For diseases with extensive locus heterogeneity – such as autism spectrum disorders (ASDs) – the signal from de novo mutations (DNMs) is distributed across many genes, making it difficult to distinguish disease-relevant mutations from background variation. We provide a statistical framework for the analysis of DNM excesses per gene and gene set by calibrating a model of de novo mutation. We applied this framework to DNMs collected from 1,078 ASD trios and – while affirming a significant role for loss-of-function (LoF) mutations – found no excess of de novo LoF mutations in cases with IQ above 100, suggesting that the role of DNMs in ASD may reside in fundamental neurodevelopmental processes. We also used our model to identify ~1,000 genes that are significantly lacking functional coding variation in non-ASD samples and are enriched for de novo LoF mutations identified in ASD cases. PMID:25086666
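A per-gene test for an excess of de novo mutations, as described in the abstract above, can be sketched as a comparison of an observed count with a model-derived expectation. The sketch below assumes a one-sided Poisson test and an expectation of 2 x (per-chromosome mutation probability for the gene) x (number of trios); both are illustrative assumptions, since the abstract does not state the test or the scaling. The function names and the example rate are hypothetical.

```python
import math


def expected_dnm(per_gene_rate: float, n_trios: int) -> float:
    """Expected de novo mutation count for one gene across all trios.

    Assumption: `per_gene_rate` is a per-chromosome, per-generation
    probability, so each child contributes two chances (factor of 2).
    """
    return 2.0 * per_gene_rate * n_trios


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected), computed from the CDF."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for k in range(observed):
        cdf += term                   # add P(X = k)
        term *= expected / (k + 1)    # advance to P(X = k + 1)
    return max(0.0, 1.0 - cdf)


# Hypothetical usage: an invented rate of 1e-5 in 1,078 trios, 3 observed hits.
lam = expected_dnm(1e-5, 1078)
p_value = poisson_upper_tail(3, lam)
```

Because many genes are tested at once, a p-value from such a sketch would still need a multiple-testing correction before any gene is called significant.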
NASA Astrophysics Data System (ADS)
Zhao, Xiaoqing; Li, Hong; Bao, Tonglaga; Ying, Zhiqiang
2012-09-01
Much experimental evidence has shown that the sequence structures of introns and intron loss/gain can influence gene expression, but current mechanisms do not refer directly to the functions of post-spliced introns. We propose that post-spliced introns exert their functions in gene expression by interacting with their mRNA sequences, and that the interaction is characterized by the matched segments between introns and their CDS. In this study, we investigated the interaction characteristics across a series of lengths using improved Smith-Waterman local alignment software for the ribosomal protein genes in C. elegans and D. melanogaster. Our results showed that the RF values of five intron groups are significantly high in the central non-conserved region and very low in the 5'-end and 3'-end splicing regions. Interestingly, the number of optimal matched regions gradually increases with intron length. The distributions of the optimal matched regions differ among the five intron groups. Our study revealed that there are more interaction regions between longer introns and their CDS than between shorter introns and their CDS, and it provides a positive pattern for regulating gene expression.
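The abstract above relies on Smith-Waterman local alignment to find matched segments between an intron and its CDS. The sketch below shows the standard Smith-Waterman recurrence with a linear gap penalty and returns only the best local score; it is not the authors' "improved" software, and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between sequences a and b.

    Standard Smith-Waterman dynamic programme with a linear gap penalty.
    Only two rows of the score matrix are kept, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Recovering the matched segment itself, rather than just its score, would require keeping the full matrix and tracing back from the highest-scoring cell.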
Accumulation of the antibiotic phenazine-1-carboxylic acid in the rhizosphere of dryland cereals
USDA-ARS's Scientific Manuscript database
Natural antibiotics are thought to function in microbial defense, fitness, competitiveness, biocontrol, communication and gene regulation. However, the frequency and amount of antibiotics produced in nature are poorly understood. In this study, we assessed the geographic distribution of indigenous p...
Salaneck, Erik; Ardell, David H; Larson, Earl T; Larhammar, Dan
2003-08-01
It has been debated whether the increase in gene number during early vertebrate evolution was due to multiple independent gene duplications or synchronous duplications of many genes. We describe here the cloning of three neuropeptide Y (NPY) receptor genes belonging to the Y1 subfamily in the spiny dogfish, Squalus acanthias, a cartilaginous fish. The three genes are orthologs of the mammalian subtypes Y1, Y4, and Y6, which are located in paralogous gene regions on different chromosomes in mammals. Thus, these genes arose by duplications of a chromosome region before the radiation of gnathostomes (jawed vertebrates). Estimates of duplication times from linearized trees, together with evidence from other gene families, support two rounds of chromosome duplications or tetraploidizations early in vertebrate evolution. The anatomical distribution of mRNA was determined by reverse-transcriptase PCR and was found to differ from that in mammals, suggesting differential functional diversification of the new gene copies during the radiation of the vertebrate classes.
Transcriptome profile and unique genetic evolution of positively selected genes in yak lungs.
Lan, DaoLiang; Xiong, XianRong; Ji, WenHui; Li, Jian; Mipam, Tserang-Donko; Ai, Yi; Chai, ZhiXin
2018-04-01
The yak (Bos grunniens), a unique bovine species distributed mainly on the Qinghai-Tibetan Plateau, is considered a good model for studying plateau adaptability in mammals. The lungs are important functional organs that enable animals to adapt to their external environment. However, the genetic mechanism underlying the adaptability of yak lungs to harsh plateau environments remains unknown. To explore the unique evolutionary process and genetic mechanism of yak adaptation to plateau environments, we performed transcriptome sequencing of yak and cattle (Bos taurus) lungs using RNA-Seq technology and a subsequent comparative analysis to identify the positively selected genes in the yak. After deep sequencing, a normal transcriptome profile of yak lung containing a total of 16,815 expressed genes was obtained, and the characteristics of the yak lung transcriptome were described by functional analysis. Furthermore, Ka/Ks comparison statistics showed that 39 strongly positively selected genes were identified in yak lungs. Further GO and KEGG analyses were conducted for the functional annotation of these genes. The results of this study provide valuable data for further exploration of the unique evolutionary process of high-altitude hypoxia adaptation in yaks on the Tibetan Plateau and of the genetic mechanism at the molecular level.
Mascotti, Maria Laura; Lapadula, Walter Jesús; Juri Ayub, Maximiliano
2015-01-01
The Baeyer-Villiger Monooxygenases (BVMOs) are enzymes belonging to the “Class B” of flavin monooxygenases and are capable of performing exquisitely selective oxidations. These enzymes have been studied from a biotechnological perspective, but their physiological substrates and functional roles are largely unknown. Here, we investigated the origin, taxonomic distribution and evolutionary history of the BVMO genes. By using in silico approaches, 98 BVMO encoding genes were detected in the three domains of life: Archaea, Bacteria and Eukarya. We found evidence for the presence of these genes in Metazoa (Hydra vulgaris, Oikopleura dioica and Adineta vaga) and Haptophyta (Emiliania huxleyi) for the first time. Furthermore, a search for other “Class B” monooxygenases (flavoprotein monooxygenases – FMOs – and N-hydroxylating monooxygenases – NMOs) was conducted. These sequences were also found in the three domains of life. Phylogenetic analyses of all “Class B” monooxygenases revealed that NMOs and BVMOs are monophyletic, whereas FMOs form a paraphyletic group. Based on these results, we propose that BVMO genes were already present in the last universal common ancestor (LUCA) and their current taxonomic distribution is the result of differential duplication and loss of paralogous genes. PMID:26161776
Polymorphisms in the type I deiodinase gene and frontal function in recurrent depressive disorder.
Gałecka, Elżbieta; Talarowska, Monika; Orzechowska, Agata; Górski, Paweł; Szemraj, Janusz
2016-09-01
Significant impairment of some psychological functions, including cognitive functioning, has been characteristically found in depressed patients. Memory disturbances may be related to the levels of thyroid hormones (TH) that are under the influence of different mechanisms and molecules, including deiodinase type 1(D1) - an important determinant of circulating triiodothyronine (T3). We investigated the relationship between two functionally known polymorphisms within the DIO1 gene, i.e. DIO1a-C/T and DIO1b-A/G, and cognitive functioning in patients diagnosed with recurrent depressive disorder (rDD). In the planned analysis we mainly concentrated on the frontal function: working memory, executive functions and verbal fluency. Genetic variants were genotyped in 128 patients using a method based on polymerase chain reaction (PCR). Cognitive functions were assessed by the Trail Making Test, the Stroop Test and the Verbal Fluency Test (VFT). No significant associations were found between DIO1 polymorphisms and cognitive functioning in rDD. Only the CT and TT genotypes of the DIO1a variant were significantly related to verbal fluency. There were no significant differences between the distribution of the genotypes and demographic/medical variables. Based on the study, the examined polymorphisms are not an important risk or protective factor for cognitive impairment in depressive patients. Functional variants within the DIO1 gene that affect triiodothyronine (T3) levels seem not to be associated with cognitive functions. Nevertheless, considering the fact that the DIO1 gene is related to the course and management of depression, further studies on a larger sample size might be suggested. Copyright © 2016 Medical University of Bialystok. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.
Comprehensive Analysis of the Soybean (Glycine max) GmLAX Auxin Transporter Gene Family
Chai, Chenglin; Wang, Yongqin; Valliyodan, Babu; Nguyen, Henry T.
2016-01-01
The phytohormone auxin plays a critical role in the regulation of plant growth and development as well as plant responses to abiotic stresses. This is mainly achieved through its uneven distribution in the plant via a polar auxin transport process. Auxin transporters are major players in polar auxin transport. The AUXIN RESISTANT 1/LIKE AUX1 (AUX/LAX) auxin influx carriers belong to the amino acid permease family of proton-driven transporters and function in the uptake of indole-3-acetic acid (IAA). In this study, genome-wide comprehensive analysis of the soybean AUX/LAX (GmLAX) gene family, including phylogenetic relationships, chromosome localization, and gene structure, was carried out. A total of 15 GmLAX genes, including seven duplicated gene pairs, were identified in the soybean genome. They were distributed on 10 chromosomes. Despite their higher percentage identities at the protein level, GmLAXs exhibited versatile tissue-specific expression patterns, indicating coordinated functioning during plant growth and development. Most GmLAXs were responsive to drought and dehydration stresses and auxin and abscisic acid (ABA) stimuli, in a tissue- and/or time point-sensitive mode. Several GmLAX members were involved in responding to salt stress. Sequence analysis revealed that promoters of GmLAXs contained different combinations of stress-related cis-regulatory elements. These studies suggest that the soybean GmLAXs were under the control of a very complex regulatory network, responding to various internal and external signals. This study helps to identify candidate GmLAXs for further analysis of their roles in soybean development and adaptation to adverse environments. PMID:27014306
Broad Phylogenetic Occurrence of the Oxygen-Binding Hemerythrins in Bilaterians.
Costa-Paiva, Elisa M; Schrago, Carlos G; Halanych, Kenneth M
2017-10-01
Animal tissues need to be properly oxygenated for carrying out catabolic respiration and, as such, natural selection has presumably favored special molecules that can reversibly bind and transport oxygen. Hemoglobins, hemocyanins, and hemerythrins (Hrs) fulfill this role, with Hrs being the least studied. Knowledge of oxygen-binding proteins is crucial for understanding animal physiology. Hr genes are present in the three domains of life, Archaea, Bacteria, and Eukaryota; however, within Animalia, Hrs have been reported only in marine species in six phyla (Annelida, Brachiopoda, Priapulida, Bryozoa, Cnidaria, and Arthropoda). Given this observed Hr distribution, whether all metazoan Hrs share a common origin is uncertain. We investigated Hr diversity and evolution in metazoans by employing in silico approaches to survey for Hrs in 120 metazoan transcriptomes and genomes. We found 58 candidate Hr genes actively transcribed in 36 species distributed in 11 animal phyla, with new records in Echinodermata, Hemichordata, Mollusca, Nemertea, Phoronida, and Platyhelminthes. Moreover, we found that "Hrs" reported from Cnidaria and Arthropoda were not consistent with those of other metazoan Hrs. Contrary to previous suggestions that Hr genes were absent in deuterostomes, we find that Hr genes are present in deuterostomes and were likely present in early bilaterians, but not in nonbilaterian animal lineages. As expected, the Hr gene tree did not mirror metazoan phylogeny, suggesting that Hr evolutionary history was complex and that, besides oxygen-carrying capacity, the drivers of Hr evolution may also include secondary functional specializations of the proteins, such as immunological functions. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zhang, Xiaoying; Hu, Bill X; Ren, Hejun; Zhang, Jin
2018-08-15
The gradient distribution of microbial communities has been detected in profiles along many natural environments. In a mangrove seedlings inhabited mudflat, the microbes drive a variety of biogeochemical processes and are associated with a dramatically changed environment across the tidal zones of mudflat. A better understanding of microbial composition, diversity and associated functional profiles in relation to physicochemical influences could provide more insights into the ecological functions of microbes in a coastal mangrove ecosystem. In this study, the variation of microbial community along successive tidal flats inhabited by mangrove seedlings were characterized based on the 16S rDNA gene sequences, and then the factors that shape the bacterial and archaeal communities were determined. Results showed that the tidal cycles strongly influence the distribution of bacterial and archaeal communities. Dissimilarity and gradient distribution of microbial communities were found among high tidal flat, mid-low tidal flat and seawater. Discrepancies were also as well observed from the surface to subsurface layers specifically in the high tidal flat. For example, Alphaproteobacteria displayed an increasing trend from low tidal to high tidal flat and vice versa for Deltaproteobacteria; Cyanobacteria and Thaumarchaeota were more dominant in the surface layer than the subsurface. In addition, by classifying the microorganisms into metabolic functional groups, we were able to identify the biogeochemical pathway that was dominant in each zone. The (oxygenic) photoautotrophy and nitrate reduction were enhanced in the mangrove inhabited mid tidal flat. It revealed the ability of xenobiotic metabolism microbes to degrade, transform, or accumulate environmental hydrocarbon pollutants in seawater, increasing sulfur-related respiration from high tidal to low tidal flat. An opposite distribution was found for major nitrogen cycling processes. 
Shifts in both the composition and function of microbial communities were significantly related to light, oxygen availability and total dissolved nitrogen rather than to sediment type or salinity. Copyright © 2018 Elsevier B.V. All rights reserved.
Kohl, Kevin D.; Dearing, M. Denise
2014-01-01
The microbiota inhabiting the mammalian gut is a functional organ that provides a number of services for the host. One factor that may regulate the composition and function of gut microbial communities is dietary toxins. Oxalate is a toxic plant secondary compound (PSC) produced in all major taxa of vascular plants and is consumed by a variety of animals. The mammalian herbivore Neotoma albigula is capable of consuming and degrading large quantities of dietary oxalate. We isolated and characterized oxalate-degrading bacteria from the gut contents of wild-caught animals and used high-throughput sequencing to determine the distribution of potential oxalate-degrading taxa along the gastrointestinal tract. Isolates spanned three genera: Lactobacillus, Clostridium, and Enterococcus. Over half of the isolates exhibited significant oxalate degradation in vitro, and all Lactobacillus isolates contained the oxc gene, one of the genes responsible for oxalate degradation. Although diverse potential oxalate-degrading genera were distributed throughout the gastrointestinal tract, they were most concentrated in the foregut, where dietary oxalate first enters the gastrointestinal tract. We hypothesize that unique environmental conditions present in each gut region provide diverse niches that select for particular functional taxa and communities. PMID:24362432
Billington, Stephen J; Songer, J Glenn; Jost, B Helen
2002-05-01
Tetracycline resistance is common among isolates of the animal commensal and opportunistic pathogen Arcanobacterium pyogenes. The tetracycline resistance determinant cloned from two bovine isolates of A. pyogenes was highly similar at the DNA level (92% identity) to the tet(W) gene, encoding a ribosomal protection tetracycline resistance protein, from the rumen bacterium Butyrivibrio fibrisolvens. The tet(W) gene was found in all 20 tetracycline-resistant isolates tested, indicating that it is a widely distributed determinant of tetracycline resistance in this organism. In 25% of tetracycline-resistant isolates, the tet(W) gene was associated with a mob gene, encoding a functional mobilization protein, and an origin of transfer, suggesting that the determinant may be transferable to other bacteria. In fact, low-frequency transfer of tet(W) was detected from mob+ A. pyogenes isolates to a tetracycline-sensitive A. pyogenes recipient. The mobile nature of this determinant and the presence of A. pyogenes in the gastrointestinal tract of cattle and pigs suggest that A. pyogenes may have inherited this determinant within the gastrointestinal tracts of these animals.
Advances in methods for detection of anaerobic ammonium oxidizing (anammox) bacteria.
Li, Meng; Gu, Ji-Dong
2011-05-01
Anaerobic ammonium oxidation (anammox), the biochemical process that oxidizes ammonium into dinitrogen gas using nitrite as an electron acceptor, was only recently recognized for its significant role in the global nitrogen cycle, and its ubiquitous distribution in a wide range of environments has changed our knowledge about the contributors to the global nitrogen cycle. Currently, several groups of methods are used to detect anammox bacteria based on their physiological and biochemical characteristics, cellular chemical composition, and both the 16S rRNA gene and selective functional genes as biomarkers, including the hydrazine oxidoreductase and nitrite reductase encoding genes hzo and nirS, respectively. Results from these methods, coupled with advances in quantitative PCR, reverse transcription of mRNA and stable isotope labeling, have improved our understanding of the distribution, diversity, and activity of anammox bacteria in different environments, both natural and engineered. In this review, we summarize the methods used to detect anammox bacteria in various environments, highlight the strengths and weaknesses of these methods, and discuss the potential for further development of existing and new techniques.
Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution.
Boueiz, Adel; Lutz, Sharon M; Cho, Michael H; Hersh, Craig P; Bowler, Russell P; Washko, George R; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M; Beaty, Terri H; Coxson, Harvey O; Crapo, James D; Silverman, Edwin K; Castaldi, Peter J; DeMeo, Dawn L
2017-03-15
Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe-predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. To identify the genetic influences of emphysema distribution in non-alpha-1 antitrypsin-deficient smokers. A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism-, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. 
These findings may point to new biologic pathways on which to expand diagnostic and therapeutic approaches in chronic obstructive pulmonary disease. Clinical trial registered with www.clinicaltrials.gov (NCT 00608764).
Groth-Malonek, Milena; Wahrmund, Ute; Polsakiewicz, Monika; Knoop, Volker
2007-04-01
Gene transfer from the mitochondrion into the nucleus is a corollary of the endosymbiont hypothesis. The frequent and independent transfer of genes for mitochondrial ribosomal proteins is well documented with many examples in angiosperms, whereas transfer of genes for components of the respiratory chain is a rarity. A notable exception is the nad7 gene, encoding subunit 7 of complex I, in the liverwort Marchantia polymorpha, which resides as a full-length, intron-carrying and transcribed, but nonspliced pseudogene in the chondriome, whereas its functional counterpart is nuclear encoded. To elucidate the patterns of pseudogene degeneration, we have investigated the mitochondrial nad7 locus in 12 other liverworts of broad phylogenetic distribution. We find that the mitochondrial nad7 gene is nonfunctional in 11 of them. However, the modes of pseudogene degeneration vary: whereas point mutations, accompanied by single-nucleotide indels, predominantly introduce stop codons into the reading frame in marchantiid liverworts, larger indels introduce frameshifts in the simple thalloid and leafy jungermanniid taxa. Most notably, however, the mitochondrial nad7 reading frame appears to be intact in the isolated liverwort genus Haplomitrium. Its functional expression is shown by cDNA analysis identifying typical RNA-editing events to reconstitute conserved codon identities and also confirming functional splicing of the 2 liverwort-specific group II introns. 
We interpret our results 1) to indicate the presence of a functional mitochondrial nad7 gene in the earliest land plants and strongly supporting a basal placement of Haplomitrium among the liverworts, 2) to indicate different modes of pseudogene degeneration and chondriome evolution in the later branching liverwort clades, 3) to suggest a surprisingly long maintenance of a nonfunctional gene in the presumed oldest group of land plants, and 4) to support the model of a secondary loss of RNA-editing activity in marchantiid liverworts.
Marsh, Adam G; Hoadley, Kenneth D; Warner, Mark E
2016-01-01
Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in a Scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggests a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis. 
Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to environmental stress in reef corals and potential roles of epigenetics on survival and fitness in the face of global climate change.
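The comparison of CpG density near the TSS against a distal upstream background can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the window sizes used as defaults, and the convention that the input string ends at position -1 relative to the TSS are all assumptions made for illustration.

```python
def cpg_density(seq: str) -> float:
    """Return CpG dinucleotides per base pair (0.0 for sequences shorter than 2 bp)."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return count / len(seq)


def promoter_enrichment(upstream: str, proximal_bp: int = 1000,
                        background_start: int = 2000) -> float:
    """Ratio of CpG density in the TSS-proximal window to a distal background.

    Assumption: `upstream` is oriented so that its last base is position -1
    relative to the TSS. The proximal window is the last `proximal_bp` bases;
    the background is everything further upstream than `background_start`.
    """
    proximal = upstream[-proximal_bp:]
    background = upstream[:-background_start]
    background_density = cpg_density(background)
    if background_density == 0.0:
        raise ValueError("background window is empty or contains no CpG sites")
    return cpg_density(proximal) / background_density
```

A ratio above 1 would indicate CpG enrichment near the TSS relative to the chosen background; the value depends strongly on how the two windows are defined.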
Small, A J; Todd, R B; Zanker, M C; Delimitrou, S; Hynes, M J; Davis, M A
2001-06-01
The tamA gene of Aspergillus nidulans encodes a 739-amino acid protein with similarity to Uga35p/Dal81p/DurLp of Saccharomyces cerevisiae. It has been proposed that TamA functions as a co-activator of AreA, the major nitrogen regulatory protein in A. nidulans. Because AreA functions as a transcriptional activator under nitrogen-limiting conditions, we investigated whether TamA was also present in the nucleus. We found that a GFP-TamA fusion protein was predominantly localised to the nucleus in the presence and absence of ammonium, and that AreA was not required for this distribution. As the predicted DNA-binding domain of TamA is not essential for function, we have used a number of approaches to further define functionally important regions. We have cloned the tamA gene of A. oryzae and compared its functional and sequence characteristics with those of A. nidulans tamA and S. cerevisiae UGA35/DAL81/DURL. The Aspergillus homologues are highly conserved and functionally interchangeable, whereas the S. cerevisiae gene does not complement a tamA mutant when expressed in A. nidulans. Uga35p/Dal81p/DurLp was also found to be unable to recruit AreA. The sequence changes in a number of tamA mutant alleles were determined, and altered versions of TamA were tested for tamA complementation and interaction with AreA. Changes in most regions of TamA appeared to destroy its function, suggesting that the overall conformation of the protein may be critical for its activity.
Marine Bacterial and Archaeal Ion-Pumping Rhodopsins: Genetic Diversity, Physiology, and Ecology
DeLong, Edward F.; Béjà, Oded; González, José M.; Pedrós-Alió, Carlos
2016-01-01
SUMMARY The recognition of a new family of rhodopsins in marine planktonic bacteria, proton-pumping proteorhodopsin, expanded the known phylogenetic range, environmental distribution, and sequence diversity of retinylidene photoproteins. At the time of this discovery, microbial ion-pumping rhodopsins were known solely in haloarchaea inhabiting extreme hypersaline environments. Shortly thereafter, proteorhodopsins and other light-activated energy-generating rhodopsins were recognized to be widespread among marine bacteria. The ubiquity of marine rhodopsin photosystems now challenges prior understanding of the nature and contributions of “heterotrophic” bacteria to biogeochemical carbon cycling and energy fluxes. Subsequent investigations have focused on the biophysics and biochemistry of these novel microbial rhodopsins, their distribution across the tree of life, evolutionary trajectories, and functional expression in nature. Later discoveries included the identification of proteorhodopsin genes in all three domains of life, the spectral tuning of rhodopsin variants to wavelengths prevailing in the sea, variable light-activated ion-pumping specificities among bacterial rhodopsin variants, and the widespread lateral gene transfer of biosynthetic genes for bacterial rhodopsins and their associated photopigments. Heterologous expression experiments with marine rhodopsin genes (and associated retinal chromophore genes) provided early evidence that light energy harvested by rhodopsins could be harnessed to provide biochemical energy. Importantly, some studies with native marine bacteria show that rhodopsin-containing bacteria use light to enhance growth or promote survival during starvation. We infer from the distribution of rhodopsin genes in diverse genomic contexts that different marine bacteria probably use rhodopsins to support light-dependent fitness strategies somewhere between these two extremes. PMID:27630250
Harvey-Girard, Erik; Giassi, Ana C C; Ellis, William; Maler, Leonard
2012-10-15
We have cloned the apteronotid homologs of FoxP2, Otx1, and FoxO3. There was, in the case of all three genes, good similarity between the apteronotid and human amino acid sequences: FoxP2, 78%; Otx1, 54%; FoxO3, 71%. The functional domains of these genes were conserved to a far greater extent, on average: FoxP2, 89%; Otx1, 76%; FoxO3, 82%. This led us to hypothesize that the cellular functions of these genes might also be conserved. We used in situ hybridization to examine the distribution of the mRNA transcripts of these genes in the apteronotid telencephalon. We confined our analysis to the pallial regions previously associated with learning about social signals, whose circuitry has been closely examined in the other articles of this series. We found that AptFoxP2 and AptOtx1 transcripts were expressed predominantly in the dorsocentral division of the pallium (DC); the dorsolateral division of the pallium (DL) contained only weakly labeled neurons. In both cases, the distribution of labeled neurons was very heterogeneous, and unlabeled neurons could be found adjacent to strongly labeled ones. In contrast, we found that most neurons in DL strongly expressed AptFoxO3 mRNA, although there was only weak expression in a small number of cells within DC. We briefly discuss the relevance of our results regarding the functional roles of AptFoxP2/AptOtx1-expressing neurons in DC for communication vs. foraging behavior. We extensively discuss the implications of our results for possible homologies between DL and DC and medial and dorsal pallium of tetrapods, respectively. Copyright © 2012 Wiley Periodicals, Inc.
Alwani, Saniya; Kaur, Randeep; Michel, Deborah; Chitanda, Jackson M; Verrall, Ronald E; Karunakaran, Chithra; Badea, Ildiko
2016-01-01
Purpose: Nanodiamonds (NDs) are emerging as an attractive tool for gene therapeutics. To reach their full potential for biological application, NDs should maintain their colloidal stability in biological milieu. This study describes the behavior of lysine-functionalized ND (lys-ND) in various dispersion media, with an aim to limit aggregation and improve the colloidal stability of ND-gene complexes called diamoplexes. Furthermore, cellular and macromolecular interactions of lys-NDs are also analyzed in vitro to establish the understanding of ND-mediated gene transfer in cells. Methods: lys-NDs were synthesized earlier through covalent conjugation of lysine amino acid to carboxylated NDs surface generated through re-oxidation in strong oxidizing acids. In this study, dispersions of lys-NDs were prepared in various media, and the degree of sedimentation was monitored for 72 hours. Particle size distributions and zeta potential measurements were performed for a period of 25 days to characterize the physicochemical stability of lys-NDs in the medium. The interaction profile of lys-NDs with fetal bovine serum showed formation of a protein corona, which was evaluated by size and charge distribution measurements. Uptake of lys-NDs in cervical cancer cells was analyzed by scanning transmission X-ray microscopy, flow cytometry, and confocal microscopy. Cellular uptake of diamoplexes (complex of lys-NDs with small interfering RNA) was also analyzed using flow cytometry. Results: Aqueous dispersion of lys-NDs showed minimum sedimentation and remained stable over a period of 25 days. Size distributions showed good stability, remaining under 100 nm throughout the testing period. A positive zeta potential of >+20 mV indicated a preservation of surface charges. Size distribution and zeta potential changed for lys-NDs after incubation with blood serum, suggesting an interaction with biomolecules, mainly proteins, and a possible formation of a protein corona. 
Cellular internalization of lys-NDs was confirmed by various techniques such as confocal microscopy, soft X-ray spectroscopy, and flow cytometry. Conclusion: This study establishes that dispersion of lys-NDs in aqueous medium maintains long-term stability and also provides evidence that lysine functionalization enables NDs to interact effectively with the biological system to be used for RNAi therapeutics. PMID:26929623
Evaluation of ACE gene I/D polymorphism in Iranian elite athletes.
Shahmoradi, Somayeh; Ahmadalipour, Ali; Salehi, Mansoor
2014-01-01
The angiotensin-converting enzyme (ACE) gene has been associated with physical performance. The ACE gene has a major polymorphism (I/D) in intron 16 that determines its plasma and tissue levels. In this study, we aimed to determine whether there is an association between this polymorphism and sports performance in our studied population, which included elite athletes of different sports disciplines. We investigated allele frequency and genotype distribution of the ACE gene in 156 Iranian elite athletes compared to 163 healthy individuals. We also compared allele frequency among elite athletes in three functional groups: endurance, power, and mixed sports performance. DNA was extracted from peripheral blood, and polymerase chain reaction (PCR) was performed on intron 16 of the ACE gene. The ACE genotype was determined for each subject. Statistical analysis was performed with SPSS 15, and results were analyzed by the Chi-Square test. There was a significant difference in genotype distribution and allele frequency of the ACE gene between athletes and the control group (P = 0.05, P = 0.03, respectively). There was also a significant difference in allele frequency of the ACE gene among the 3 groups of athletes with different sports disciplines (P = 0.045). The proportion of the ACE gene D allele was greater in elite endurance athletes (37 high-distance cyclists) than in the two other groups. Findings of the present study demonstrated that there is an association between the ACE gene I/D polymorphism and sports performance in Iranian elite athletes.
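A chi-square comparison of genotype distributions between two groups can be sketched as follows. The study used SPSS; this standard-library version is only an illustration, and the genotype counts are invented placeholders, not the study's data. The closed-form p-value used here holds only for exactly 2 degrees of freedom, which is the case for three genotypes (II, ID, DD) in two groups.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_dof2(stat):
    """Upper-tail p-value for a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical genotype counts in the order (II, ID, DD); NOT taken from the study.
group_a = [20, 50, 30]
group_b = [35, 45, 20]
stat, dof = chi_square_independence([group_a, group_b])
p = p_value_dof2(stat)
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {p:.4f}")
```

For tables with other dimensions, a statistics library would be needed to obtain the p-value, since the exponential shortcut no longer applies.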
Cui, Hao-Ran; Zhang, Zheng-Rong; Lv, Wei; Xu, Jia-Ning; Wang, Xiao-Yun
2015-08-01
The F-box protein family is a large family that is characterized by conserved F-box domains of approximately 40-50 amino acids in the N-terminus. F-box proteins participate in diverse cellular processes, such as development of floral organs, signal transduction and response to stress, primarily as a component of the Skp1-cullin-F-box (SCF) complex. In this study, using a global search of the apple genome, 517 F-box protein-encoding genes (F-box genes for short) were identified and further subdivided into 12 groups according to the characterization of known functional domains, which suggests the different potential functions or processes that they were involved in. Among these domains, the galactose oxidase domain was analyzed for the first time in plants, and this domain was present with or without the Kelch domain. The F-box genes were distributed in all 17 apple chromosomes with various densities and tended to form gene clusters. Spatial expression profile analysis revealed that F-box genes have organ-specific expression and are widely expressed in all organs. Proteins that contained the galactose oxidase domain were highly expressed in leaves, flowers and seeds. From a fruit ripening expression profile, 166 F-box genes were identified. The expressions of most of these genes changed little during maturation, but five of them increased significantly. Using qRT-PCR to examine the expression of F-box genes encoding proteins with domains related to stress, the results revealed that F-box proteins were up- or down-regulated, which suggests that F-box genes were involved in abiotic stress. The results of this study helped to elucidate the functions of F-box proteins, especially in Rosaceae plants.
Opazo, Juan C.; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F.
2015-01-01
Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. 
Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. PMID:25743544
Tu, N; Chen, H; Winnikes, U; Reinert, I; Marmann, G; Pirke, K M; Lentes, K U
1999-11-19
As a member of the uncoupling protein family, UCP2 is ubiquitously expressed in rodents and humans, implicating a major role in thermogenesis. To analyze promoter function and regulatory motifs involved in the transcriptional regulation of UCP2 gene expression, 3.3 kb of 5'-flanking region of the human UCP2 (hUCP2) gene have been cloned. Sequence analysis showed that the promoter region of hUCP2 lacks a classical TATA or CAAT box, however, appeared GC-rich resulting in the presence of several Sp-1 motifs and Ap-1/-2 binding sites near the transcription initiation site. Functional characterization of human UCP2 promoter-CAT fusion constructs in transient expression assays showed that minimal promoter activity was observed within 65 bp upstream of the transcriptional start site (+1). 75 bp further upstream (from nt -141 to -66) a strong cis-acting regulatory element (or enhancer) was identified, which significantly enhanced basal promoter activity. The regulation of human UCP2 gene expression involves complex interactions among positive and negative regulatory elements distributed over a minimum of 3.3 kb of the promoter region. Copyright 1999 Academic Press.
Yang, Jinying; Li, Jing; Luan, Xiwu; Zhang, Yunbo; Gu, Guizhou; Xue, Rongrong; Zong, Mingyue; Klotz, Martin G.
2013-01-01
The South China Sea (SCS), the largest marginal sea in the Western Pacific Ocean, is a huge oligotrophic water body with very limited influx of nitrogenous nutrients. This suggests that sediment microbial N2 fixation plays an important role in the production of bioavailable nitrogen. To test the molecular underpinning of this hypothesis, the diversity, abundance, biogeographical distribution, and community structure of the sediment diazotrophic microbiota were investigated at 12 sampling sites, including estuarine, coastal, offshore, deep-sea, and methane hydrate reservoirs or their prospective areas by targeting nifH and some other functional biomarker genes. Diverse and novel nifH sequences were obtained, significantly extending the evolutionary complexity of extant nifH genes. Statistical analyses indicate that sediment in situ temperature is the most significant environmental factor influencing the abundance, community structure, and spatial distribution of the sediment nifH-harboring microbial assemblages in the northern SCS (nSCS). The significantly positive correlation of the sediment pore water NH4+ concentration with the nifH gene abundance suggests that the nSCS sediment nifH-harboring microbiota is active in N2 fixation and NH4+ production. Several other environmental factors, including sediment pore water PO43− concentration, sediment organic carbon, nitrogen and phosphorus levels, etc., are also important in influencing the community structure, spatial distribution, or abundance of the nifH-harboring microbial assemblages. We also confirmed that the nifH genes encoded by archaeal diazotrophs in the ANME-2c subgroup occur exclusively in the deep-sea methane seep areas, providing for the possibility to develop ANME-2c nifH genes as a diagnostic tool for deep-sea methane hydrate reservoir discovery. PMID:23064334
Schwämmle, Veit; Jensen, Ole Nørregaard
2013-01-01
Chromatin is a highly compact and dynamic nuclear structure that consists of DNA and associated proteins. The main organizational unit is the nucleosome, which consists of a histone octamer with DNA wrapped around it. Histone proteins are implicated in the regulation of eukaryote genes and they carry numerous reversible post-translational modifications that control DNA-protein interactions and the recruitment of chromatin binding proteins. Heterochromatin, the transcriptionally inactive part of the genome, is densely packed and contains histone H3 that is methylated at Lys 9 (H3K9me). The propagation of H3K9me in nucleosomes along the DNA in chromatin is antagonized by methylation of H3 Lysine 4 (H3K4me) and acetylation of several lysines, which are related to euchromatin and active genes. We show that the related histone modifications form antagonistic domains on a coarse scale. These histone marks are assumed to be initiated within distinct nucleation sites in the DNA and to propagate bi-directionally. We propose a simple computer model that simulates the distribution of heterochromatin in human chromosomes. The simulations are in agreement with previously reported experimental observations from two different human cell lines. We reproduced different types of barriers between heterochromatin and euchromatin, providing a unified model for their function. The effects of changes in the nucleation site distribution and in propagation rates were studied. The former occurs mainly with the aim of (de-)activation of single genes or gene groups and the latter has the power of controlling the transcriptional programs of entire chromosomes. Generally, the regulatory program of gene transcription is controlled by the distribution of nucleation sites along the DNA string.
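The idea of marks that start at nucleation sites, spread in both directions, and block one another can be illustrated with a toy one-dimensional simulation. This is not the model from the paper: the lattice representation, the single shared `rate` parameter, the synchronous update, and the rule that only unmodified nucleosomes can be converted are simplifying assumptions chosen for illustration.

```python
import random


def simulate_marks(n, hetero_sites, eu_sites, steps, rate=0.5, seed=0):
    """Spread two antagonistic marks along a 1D array of n nucleosomes.

    'H' = heterochromatin-like mark, 'E' = euchromatin-like mark, '.' = unmodified.
    All names and parameters here are hypothetical.
    """
    rng = random.Random(seed)
    state = ["."] * n
    for i in hetero_sites:
        state[i] = "H"
    for i in eu_sites:
        state[i] = "E"
    for _ in range(steps):
        nxt = state[:]
        for i, mark in enumerate(state):
            if mark == ".":
                continue
            for j in (i - 1, i + 1):  # bidirectional propagation
                # Only unmodified neighbours are converted, so opposing
                # domains halt where they meet and form a boundary.
                if 0 <= j < n and nxt[j] == "." and rng.random() < rate:
                    nxt[j] = mark
        state = nxt
    return "".join(state)


print(simulate_marks(10, hetero_sites=[0], eu_sites=[9], steps=10, rate=1.0))
```

Moving the nucleation sites shifts where the boundary forms, while lowering `rate` slows domain growth; both effects are qualitative only in this sketch.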
Wang, Ting-Ting; Si, Feng-Ling; He, Zheng-Bo; Chen, Bin
2018-01-15
Ionotropic glutamate receptors (iGluRs) are conserved ligand-gated ion channel receptors, and ionotropic receptors (IRs) were revealed as a new family of iGluRs. Their subdivision was unsettled, and their characteristics are little known. Anopheles sinensis is a major malaria vector in eastern Asia, and its genome was recently well sequenced and annotated. We identified iGluR genes in the An. sinensis genome, analyzed their characteristics including gene structure, genome distribution, domains and specific sites by bioinformatic methods, and deduced phylogenetic relationships of all iGluRs in An. sinensis, Anopheles gambiae and Drosophila melanogaster. Based on the characteristics and phylogenetics, we generated the classification of iGluRs, and comparatively analyzed the intron number and selective pressure of three iGluRs subdivisions, iGluR group, Antenna IR and Divergent IR subfamily. A total of 56 iGluR genes were identified and named in the whole-genome of An. sinensis. These genes were located on 18 scaffolds, and 31 of them (29 being IRs) are distributed into 10 clusters that are suggested to form mainly from recent gene duplication. These iGluRs can be divided into four groups: NMDA, non-NMDA, Antenna IR and Divergent IR based on feature comparison and phylogenetic analysis. IR8a and IR25a were suggested to be monophyletic, named as Putative in the study, and moved from the Antenna subfamily in the IR family to the non-NMDA group as a sister of traditional non-NMDA. The generated iGluRs of genes (including NMDA and regenerated non-NMDA) are relatively conserved, and have a more complicated gene structure, smaller ω values and some specific functional sites. The iGluR genes in An. sinensis, An. gambiae and D. melanogaster have amino-terminal domain (ATD), ligand binding domain (LBD) and Lig_Chan domains, except for IR8a that only has the LBD and Lig_Chan domains. 
However, the new concept IR family of genes (including regenerated Antenna IR, and Divergent IR), especially for Divergent IR are more variable, have a simpler gene structure (intron loss phenomenon) and larger ω values, and lack specific functional sites. These IR genes have no other domains except for Antenna IRs that only have the Lig_Chan domain. This study provides a comprehensive information framework for iGluR genes in An. sinensis, and generated the classification of iGluRs by feature and bioinformatics analyses. The work lays the foundation for further functional study of these genes.
USDA-ARS's Scientific Manuscript database
Cotton fibers represent the largest single cell in the plant kingdom, and they have been used as a model to study cell function, differentiation, maturation, and cell death. The cotton fiber transcriptome can be clustered into two genomic regions: conserved and recombination hotspots. Genetic link...
The molecular analysis of drinking water microbial communities has focused primarily on 16S rRNA gene sequence analysis. Since this approach provides limited information on function potential of microbial communities, analysis of whole-metagenome pyrosequencing data was used to...
Identification of Cell Cycle-Regulated Genes by Convolutional Neural Network.
Liu, Chenglin; Cui, Peng; Huang, Tao
2017-01-01
Cell cycle-regulated genes are expressed periodically across the cell cycle stages, and identifying and studying these genes can provide a deep understanding of the cell cycle process. Many false positives and low overlap between results are major problems in cell cycle-regulated gene detection. Here, a computational framework called DLGene is proposed for cell cycle-regulated gene detection. It is based on the convolutional neural network, a deep learning algorithm that represents the raw form of data patterns without assumptions about their distribution. First, the expression data were transformed into categorical state data denoting the changing state of gene expression, and four different expression patterns were revealed for the reported cell cycle-regulated genes. Then, DLGene was applied to discriminate non-cell cycle genes from the four subtypes of cell cycle genes. Its performance was compared with that of six traditional machine learning methods. Finally, the biological functions of representative cell cycle genes for each subtype were analyzed. Our method showed better and more balanced sensitivity and specificity than the other machine learning algorithms. The cell cycle genes had expression patterns very different from those of non-cell cycle genes, and among the cell cycle genes there were four subtypes. Our method not only detects cell cycle genes but also describes their expression patterns, such as when the highest expression level is reached and how it changes with time. For each subtype, we analyzed the biological functions of the representative genes, and the results provided novel insight into cell cycle mechanisms.
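The abstract says expression data were converted to categorical state data before classification, but it does not give the encoding. A minimal sketch of one plausible encoding, assuming three states per time step (up, down, steady) and a hypothetical tolerance `tol`; the function name and state labels are illustrative, not DLGene's:

```python
def to_states(expression, tol=0.1):
    """Map a time series of expression values to per-step change states.

    Each consecutive pair of time points becomes 'U' (up), 'D' (down)
    or 'S' (steady). `tol` is an assumed tolerance on the absolute change.
    """
    states = []
    for prev, curr in zip(expression, expression[1:]):
        delta = curr - prev
        if delta > tol:
            states.append("U")
        elif delta < -tol:
            states.append("D")
        else:
            states.append("S")
    return "".join(states)


# Hypothetical expression profile over six time points.
profile = [0.2, 0.9, 1.5, 1.45, 0.7, 0.1]
print(to_states(profile))
```

A string of states like this could then be one-hot encoded as input to a classifier; how DLGene does this is not stated in the abstract.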
Kaur, Amritpreet; Pati, Pratap Kumar; Pati, Aparna Maitra; Nagpal, Avinash Kaur
2017-01-01
Pathogenesis related (PR) proteins are low molecular weight family of proteins induced in plants under various biotic and abiotic stresses. They play an important role in plant-defense mechanism. PRs have wide range of functions, acting as hydrolases, peroxidases, chitinases, anti-fungal, protease inhibitors etc. In the present study, an attempt has been made to analyze promoter regions of PR1, PR2, PR5, PR9, PR10 and PR12 of Arabidopsis thaliana and Oryza sativa. Analysis of cis-element distribution revealed the functional multiplicity of PRs and provides insight into the gene regulation. CpG islands are observed only in rice PRs, which indicates that monocot genome contains more GC rich motifs than dicots. Tandem repeats were also observed in 5’ UTR of PR genes. Thus, the present study provides an understanding of regulation of PR genes and their versatile roles in plants. PMID:28910327
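The abstract does not say which tool or thresholds were used to call CpG islands. A minimal sketch of the two quantities such calls usually rest on, GC content and the observed/expected CpG ratio; the function names are illustrative, and the thresholds mentioned afterwards are the commonly cited ones, not necessarily those of the study:

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Toy promoter fragment (hypothetical sequence).
fragment = "CGCGCGAT"
print(gc_content(fragment), cpg_obs_exp(fragment))
```

A widely used rule of thumb treats a region of at least 200 bp with GC content above 0.5 and an observed/expected ratio above 0.6 as a candidate CpG island.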
Xu, Minli; Lawrence, Jeffrey G; Durand, Dannie
2018-01-01
Highly Iterated Palindrome 1 (HIP1, GCGATCGC) is hyper-abundant in most cyanobacterial genomes. In some cyanobacteria, average HIP1 abundance exceeds one motif per gene. Such high abundance suggests a significant role in cyanobacterial biology. However, 20 years of study have not revealed whether HIP1 has a function, much less what that function might be. We show that HIP1 is 15- to 300-fold over-represented in genomes analyzed. More importantly, HIP1 sites are conserved both within and between open reading frames, suggesting that their overabundance is maintained by selection rather than by continual replenishment by neutral processes, such as biased DNA repair. This evidence for selection suggests a functional role for HIP1. No evidence was found to support a functional role as a peptide or RNA motif or a role in the regulation of gene expression. Rather, we demonstrate that the distribution of HIP1 along cyanobacterial chromosomes is significantly periodic, with periods ranging from 10 to 90 kb, consistent in scale with periodicities reported for co-regulated, co-expressed and evolutionarily correlated genes. The periodicity we observe is also comparable in scale to chromosomal interaction domains previously described in other bacteria. In this context, our findings imply HIP1 functions associated with chromosome and nucleoid structure. PMID:29432573
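The fold over-representation reported above compares the observed HIP1 count with an expectation under some background model, which the abstract does not specify. A minimal sketch, assuming the simplest background (independent bases at the genome's own frequencies); the helper names are illustrative:

```python
HIP1 = "GCGATCGC"


def count_overlapping(seq, motif):
    """Count motif occurrences, allowing overlaps."""
    count = start = 0
    while True:
        idx = seq.find(motif, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1


def expected_count(seq, motif):
    """Expected occurrences if bases were independent (zero-order model)."""
    n = len(seq)
    freq = {base: seq.count(base) / n for base in "ACGT"}
    prob = 1.0
    for base in motif:
        prob *= freq[base]
    return prob * (n - len(motif) + 1)


def fold_enrichment(seq, motif=HIP1):
    """Observed / expected count; infinite if the expectation is zero."""
    expected = expected_count(seq, motif)
    if expected == 0:
        return float("inf")
    return count_overlapping(seq, motif) / expected
```

Because HIP1 is its own reverse complement, counting one strand already covers both. Higher-order Markov backgrounds would give different expectations, so the fold values from this sketch are not comparable to the published ones.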
Wecke, Tina; Halang, Petra; Staroń, Anna; Dufour, Yann S; Donohue, Timothy J; Mascher, Thorsten
2012-01-01
Bacteria need signal transducing systems to respond to environmental changes. Next to one- and two-component systems, alternative σ factors of the extra-cytoplasmic function (ECF) protein family represent the third fundamental mechanism of bacterial signal transduction. A comprehensive classification of these proteins identified more than 40 phylogenetically distinct groups, most of which are not experimentally investigated. Here, we present the characterization of such a group with unique features, termed ECF41. Among analyzed bacterial genomes, ECF41 σ factors are widely distributed with about 400 proteins from 10 different phyla. They lack obvious anti-σ factors that typically control activity of other ECF σ factors, but their structural genes are often predicted to be cotranscribed with carboxymuconolactone decarboxylases, oxidoreductases, or epimerases based on genomic context conservation. We demonstrate for Bacillus licheniformis and Rhodobacter sphaeroides that the corresponding genes are preceded by a highly conserved promoter motif and are the only detectable targets of ECF41-dependent gene regulation. In contrast to other ECF σ factors, proteins of group ECF41 contain a large C-terminal extension, which is crucial for σ factor activity. Our data demonstrate that ECF41 σ factors are regulated by a novel mechanism based on the presence of a fused regulatory domain. PMID:22950025
Peri, A; Cordella-Miele, E; Miele, L; Mukherjee, A B
1993-01-01
Clara cell 10-kD protein (cc10kD), a secretory phospholipase A2 inhibitor, is suggested to be the human counterpart of rabbit uteroglobin (UG). Because cc10kD is expressed constitutively at a very high level in the human respiratory epithelium, the 5' region of its gene may be useful in achieving organ-specific expression of recombinant DNA in gene therapy of diseases such as cystic fibrosis. However, it is important to establish the tissue-specific expression of this gene before designing gene transfer experiments. Since the UG gene in the rabbit is expressed in many other organs besides the lung and the endometrium, we investigated the organ and tissue specificity of human cc10kD gene expression using polymerase chain reaction, nucleotide sequence analysis, immunofluorescence, and Northern blotting. Our results indicate that, in addition to the lung, cc10kD is expressed in several nonrespiratory organs, with a distribution pattern very similar, if not identical, to that of UG in the rabbit. These results underscore the necessity for more detailed analyses of the 5' region of the human cc10kD gene before its usefulness in gene therapy could be fully assessed. These data also suggest that cc10kD and UG may have similar physiological function(s). PMID:8227325
Thomas, David; Finan, Chris; Newport, Melanie J; Jones, Susan
2015-10-01
The complexity of DNA can be quantified using estimates of entropy. Variation in DNA complexity is expected between the promoters of genes with different transcriptional mechanisms, namely housekeeping (HK) and tissue specific (TS). The former are transcribed constitutively to maintain general cellular functions, and the latter are transcribed in restricted tissues and cell types for specific molecular events. It is known that promoter features in the human genome are related to tissue specificity, but this has been difficult to quantify on a genomic scale. If entropy effectively quantifies DNA complexity, calculating the entropies of HK and TS gene promoters as profiles may reveal significant differences. Entropy profiles were calculated for a total dataset of 12,003 human gene promoters and for 501 HK and 587 TS human gene promoters. The mean profiles show that the TS promoters have a significantly lower entropy (p<2.2e-16) than HK gene promoters. The entropy distributions for the 3 datasets show that promoter entropies could be used to identify novel HK genes. Functional features comprise DNA sequence patterns that are non-random and hence have lower entropies. The lower entropy of TS gene promoters can be explained by a higher density of positive and negative regulatory elements, required for genes with complex spatial and temporal expression.
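The abstract does not state which entropy estimator or window settings were used for the promoter profiles. A minimal sketch, assuming Shannon entropy over k-mer frequencies in a sliding window; `window`, `step` and `k` are illustrative parameters, not the authors' settings:

```python
import math
from collections import Counter


def shannon_entropy(seq, k=1):
    """Shannon entropy (bits) of the k-mer distribution in a sequence."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_profile(seq, window=50, step=10, k=1):
    """Entropy of each sliding window along the sequence."""
    return [
        shannon_entropy(seq[i:i + window], k)
        for i in range(0, len(seq) - window + 1, step)
    ]


# A uniform mix of four bases reaches the single-base maximum of 2 bits;
# a homopolymer has zero entropy.
print(shannon_entropy("ACGT"), shannon_entropy("AAAA"))
```

Averaging such profiles position by position over a set of aligned promoters would give a mean profile of the kind compared between HK and TS genes.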
Wen, Feng; Zhu, Hong; Li, Peng; Jiang, Min; Mao, Wenqing; Ong, Chermaine; Chu, Zhaoqing
2014-06-01
Members of the plant WRKY gene family are ancient transcription factors that function in plant growth and development and respond to biotic and abiotic stresses. In the present study, we investigated WRKY family genes in Brachypodium distachyon, a new model plant of the family Poaceae. We identified a total of 86 WRKY genes from B. distachyon and explored their chromosomal distribution and evolution, domain alignment, promoter cis-elements, and expression profiles. Combining the phylogenetic tree of BdWRKY genes with the expression profiling results showed that most clustered gene pairs had higher similarities in the WRKY domain, suggesting that they might be functionally redundant. Neighbour-joining analysis of 301 WRKY domains from Oryza sativa, Arabidopsis thaliana, and B. distachyon suggested that BdWRKY domains are evolutionarily more closely related to O. sativa WRKY domains than to those of A. thaliana. Moreover, tissue-specific expression profiles of BdWRKY genes and their responses to phytohormones and several biotic or abiotic stresses were analysed by quantitative real-time PCR. The results showed that the expression of BdWRKY genes was rapidly regulated by stresses and phytohormones, and there was a strong correlation between promoter cis-elements and phytohormone-induced BdWRKY gene expression.
Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin
2012-01-01
The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance. PMID:22279089
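The split into stably and variably expressed transcripts above rests on a coefficient of variation (CV) cut-off of 15%. A minimal sketch of that calculation, assuming the CV is taken across tissue or organ samples using the sample standard deviation (the abstract does not say which variant was used); the function names and example values are hypothetical:

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def is_stably_expressed(values, cutoff=15.0):
    """True if the CV falls below the cut-off (15% in the abstract)."""
    return coefficient_of_variation(values) < cutoff


# Hypothetical expression values for one transcript across three tissues.
print(coefficient_of_variation([8.0, 10.0, 12.0]))
```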
mRNA N6-methyladenosine methylation of postnatal liver development in pig.
He, Shen; Wang, Hong; Liu, Rui; He, Mengnan; Che, Tiandong; Jin, Long; Deng, Lamei; Tian, Shilin; Li, Yan; Lu, Hongfeng; Li, Xuewei; Jiang, Zhi; Li, Diyan; Li, Mingzhou
2017-01-01
N6-methyladenosine (m6A) is a ubiquitous, reversible epigenetic RNA modification that plays an important role in the post-transcriptional regulation of protein-coding gene expression. The liver is a vital organ that plays a major role in metabolism and has numerous functions. Information on the dynamic patterns of mRNA m6A methylation during postnatal liver development is long overdue, and elucidating it will help decipher the many functional outcomes of mRNA m6A methylation. Here, we profile transcriptome-wide m6A in porcine liver at three developmental stages: newborn (0 day), suckling (21 days) and adult (2 years). About 33% of transcribed genes were modified by m6A, with 1.33 to 1.42 m6A peaks per modified gene. m6A was distributed predominantly around stop codons. The consensus motif sequence RRm6ACH was observed in 78.90% of m6A peaks. A negative correlation (average Pearson's r = -0.45, P < 10^-16) was found between levels of m6A methylation and gene expression. Functional enrichment analysis of genes consistently modified by m6A methylation at all three stages showed genes relevant to important functions, including regulation of growth and development, regulation of metabolic processes and protein catabolic processes. Genes with higher m6A methylation and lower expression levels at any particular stage were associated with the biological processes required for or unique to that stage. We suggest that differential m6A methylation may be important for the regulation of nutrient metabolism in porcine liver.
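The RRACH consensus mentioned above uses IUPAC codes: R is A or G, and H is A, C or U (T in DNA). A minimal sketch of scanning a sequence for candidate sites, assuming a plain regular-expression match; the function name is illustrative, and the study's own motif discovery method is not described in the abstract:

```python
import re

# R = A/G, H = A/C/U (T accepted for DNA-style input).
# The lookahead lets overlapping candidate sites be reported.
RRACH = re.compile(r"(?=([AG][AG]AC[ACUT]))")


def rrach_sites(seq):
    """Return 0-based positions of the central A in each RRACH match."""
    return [m.start() + 2 for m in RRACH.finditer(seq.upper())]


print(rrach_sites("UUGGACAAAACC"))
```

A motif match only marks a candidate position; whether the adenosine is actually methylated has to come from the sequencing peaks.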
Morton, Nicholas M.; Nelson, Yvonne B.; Michailidou, Zoi; Di Rollo, Emma M.; Ramage, Lynne; Hadoke, Patrick W. F.; Seckl, Jonathan R.; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J.; Dunbar, Donald R.
2011-01-01
Background: Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. Results: To enrich for adipose tissue obesity genes a 'snap-shot' pooled-sample transcriptome comparison of key fat depots and non adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity.
Conclusions: A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes contributing to obesity. PMID:21915269
Statistical assessment of crosstalk enrichment between gene groups in biological networks.
McCormack, Theodore; Frings, Oliver; Alexeyenko, Andrey; Sonnhammer, Erik L L
2013-01-01
Analyzing groups of functionally coupled genes or proteins in the context of global interaction networks has become an important aspect of bioinformatic investigations. Assessing the statistical significance of crosstalk enrichment between or within groups of genes can be a valuable tool for functional annotation of experimental gene sets. Here we present CrossTalkZ, a statistical method and software to assess the significance of crosstalk enrichment between pairs of gene or protein groups in large biological networks. We demonstrate that the standard z-score is generally an appropriate and unbiased statistic. We further evaluate the ability of four different methods to reliably recover crosstalk within known biological pathways. We conclude that the methods preserving the second-order topological network properties perform best. Finally, we show how CrossTalkZ can be used to annotate experimental gene sets using known pathway annotations and that its performance at this task is superior to gene enrichment analysis (GEA). CrossTalkZ (available at http://sonnhammer.sbc.su.se/download/software/CrossTalkZ/) is implemented in C++, easy to use, fast, accepts various input file formats, and produces a number of statistics. These include z-score, p-value, false discovery rate, and a test of normality for the null distributions.
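The z-score described above compares the observed number of links between two gene groups with the counts seen in randomized networks. A minimal sketch of that final step only, assuming the null counts have already been produced by some network randomization and that a one-sided normal p-value is wanted; the function name and these choices are illustrative and not taken from the CrossTalkZ source:

```python
import math
import statistics


def crosstalk_z(observed, null_counts):
    """Z-score and one-sided p-value of an observed link count.

    `null_counts` are link counts between the same two groups in
    randomized networks (assumed to be given).
    """
    mu = statistics.mean(null_counts)
    sigma = statistics.stdev(null_counts)
    if sigma == 0:
        raise ValueError("null distribution has zero variance")
    z = (observed - mu) / sigma
    # Upper-tail probability of a standard normal variable.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


# Hypothetical counts: 14 observed links against three randomizations.
print(crosstalk_z(14, [8, 10, 12]))
```

The normal approximation is only reasonable if the null counts are roughly normally distributed, which is why the abstract lists a test of normality among the reported statistics.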
Wang, Shan-Ning; Peng, Yong; Lu, Zi-Yun; Dhiloo, Khalid Hussain; Zheng, Yao; Shan, Shuang; Li, Rui-Jun; Zhang, Yong-Jun; Guo, Yu-Yuan
2016-07-01
Ionotropic receptors (IRs), an ancient olfactory receptor family in insects, mainly detect acids and amines that are of great importance to many insect species. In the present work, we performed RNAseq of Microplitis mediator antennae and identified seventeen IRs. Full-length MmedIRs were cloned and sequenced. Phylogenetic analysis of the Hymenoptera IRs revealed that ten MmedIR genes encoded "antennal IRs" and seven encoded "divergent IRs". Among the IR25a orthologous groups, two genes, MmedIR25a.1 and MmedIR25a.2, were found in M. mediator. Gene structure analysis of MmedIR25a revealed a tandem duplication of IR25a in M. mediator. The tissue distribution and development-specific expression of the MmedIR genes suggested that these genes have a broad expression profile. Quantitative gene expression analysis showed that most of the genes are highly enriched in adult antennae, indicating a candidate chemosensory function of this family in parasitic wasps. Using immunocytochemistry, we confirmed that one co-receptor, MmedIR8a, was expressed in the olfactory sensory neurons. Our data supply fundamental information for functional analysis of the IRs in parasitoid wasp chemoreception.
Nikitin, Aleksey G; Potapov, Viktor Y; Brovkina, Olga I; Koksharova, Ekaterina O; Khodyrev, Dmitry S; Philippov, Yury I; Michurova, Marina S; Shamkhalova, Minara S; Vikulova, Olga K; Smetanina, Svetlana A; Suplotova, Lyudmila A; Kononenko, Irina V; Kalashnikov, Viktor Y; Smirnova, Olga M; Mayorov, Alexander Y; Nosikov, Valery V; Averyanov, Alexander V; Shestakova, Marina V
2017-01-01
The association of type 2 diabetes mellitus (T2DM) with the KCNJ11, CDKAL1, SLC30A8, CDKN2B, and FTO genes in the Russian population has not been well studied. In this study, we analysed the population frequencies of polymorphic markers of these genes. The study included 862 patients with T2DM and 443 control subjects of Russian origin. All subjects were genotyped for 10 single nucleotide polymorphisms (SNPs) of the genes using real-time PCR (TaqMan assays). HOMA-IR and HOMA-β were used to measure insulin resistance and β-cell secretory function, respectively. The analysis of the frequency distribution of polymorphic markers for genes KCNJ11, CDKAL1, SLC30A8 and CDKN2B showed statistically significant associations with T2DM in the Russian population. The association between the FTO gene and T2DM was not statistically significant. The polymorphic markers rs5219 of the KCNJ11 gene, rs13266634 of the SLC30A8 gene, rs10811661 of the CDKN2B gene and rs9465871, rs7756992 and rs10946398 of the CDKAL1 gene showed a significant association with impaired glucose metabolism or impaired β-cell function. In the Russian population, genes that affect insulin synthesis and secretion in the β-cells of the pancreas play a central role in the development of T2DM.
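The abstract does not say which HOMA variant was applied. A minimal sketch, assuming the original (HOMA1) closed-form equations with fasting glucose in mmol/L and fasting insulin in µU/mL; the function names and example values are hypothetical:

```python
def homa_ir(glucose_mmol_l, insulin_uu_ml):
    """HOMA1 insulin resistance index: glucose * insulin / 22.5."""
    return glucose_mmol_l * insulin_uu_ml / 22.5


def homa_beta(glucose_mmol_l, insulin_uu_ml):
    """HOMA1 beta-cell function (%): 20 * insulin / (glucose - 3.5)."""
    if glucose_mmol_l <= 3.5:
        raise ValueError("formula is undefined for glucose <= 3.5 mmol/L")
    return 20.0 * insulin_uu_ml / (glucose_mmol_l - 3.5)


# Hypothetical fasting values: glucose 5.0 mmol/L, insulin 9.0 µU/mL.
print(homa_ir(5.0, 9.0), homa_beta(5.0, 9.0))
```

The computer-based HOMA2 model gives different values, so indices from this sketch should not be compared directly with results obtained that way.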
Lam, L T; Pickeral, O K; Peng, A C; Rosenwald, A; Hurt, E M; Giltnane, J M; Averett, L M; Zhao, H; Davis, R E; Sathyamoorthy, M; Wahl, L M; Harris, E D; Mikovits, J A; Monks, A P; Hollingshead, M G; Sausville, E A; Staudt, L M
2001-01-01
Flavopiridol, a flavonoid currently in cancer clinical trials, inhibits cyclin-dependent kinases (CDKs) by competitively blocking their ATP-binding pocket. However, the mechanism of action of flavopiridol as an anti-cancer agent has not been fully elucidated. Using DNA microarrays, we found that flavopiridol inhibited gene expression broadly, in contrast to two other CDK inhibitors, roscovitine and 9-nitropaullone. The gene expression profile of flavopiridol closely resembled the profiles of two transcription inhibitors, actinomycin D and 5,6-dichloro-1-beta-D-ribofuranosyl-benzimidazole (DRB), suggesting that flavopiridol inhibits transcription globally. We were therefore able to use flavopiridol to measure mRNA turnover rates comprehensively and we found that different functional classes of genes had distinct distributions of mRNA turnover rates. In particular, genes encoding apoptosis regulators frequently had very short half-lives, as did several genes encoding key cell-cycle regulators. Strikingly, genes that were transcriptionally inducible were disproportionately represented in the class of genes with rapid mRNA turnover. The present genomic-scale measurement of mRNA turnover uncovered a regulatory logic that links gene function with mRNA half-life. The observation that transcriptionally inducible genes often have short mRNA half-lives demonstrates that cells have a coordinated strategy to rapidly modulate the mRNA levels of these genes. In addition, the present results suggest that flavopiridol may be more effective against types of cancer that are highly dependent on genes with unstable mRNAs.
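Once transcription is blocked, mRNA turnover can be summarized as a half-life estimated from the decline in transcript level over time. A minimal sketch, assuming first-order (exponential) decay and a least-squares fit on log-transformed levels; the abstract does not describe the authors' fitting procedure, and the function name and data are hypothetical:

```python
import math


def mrna_half_life(times, levels):
    """Estimate half-life from a log-linear fit of level against time.

    Assumes levels follow N(t) = N0 * exp(-k * t), so that
    half-life = ln(2) / k. Returned in the same unit as `times`.
    """
    logs = [math.log(x) for x in levels]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("levels do not decay over time")
    return math.log(2) / -slope


# Hypothetical time course (hours) in which the signal halves every hour.
print(mrna_half_life([0, 1, 2, 3], [100.0, 50.0, 25.0, 12.5]))
```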
Eco-Evolutionary Dynamics of Episomes among Ecologically Cohesive Bacterial Populations
Xue, Hong; Cordero, Otto X.; Camas, Francisco M.; ...
2015-05-05
Although plasmids and other episomes are recognized as key players in horizontal gene transfer among microbes, their diversity and dynamics among ecologically structured host populations in the wild remain poorly understood. Here, we show that natural populations of marine Vibrionaceae bacteria host large numbers of families of episomes, consisting of plasmids and a surprisingly high fraction of plasmid-like temperate phages. Episomes are unevenly distributed among host populations, and contrary to the notion that high-density communities in biofilms act as hot spots of gene transfer, we identified a strong bias for episomes to occur in free-living as opposed to particle-attached cells. Mapping of episomal families onto host phylogeny shows that, with the exception of all phage and a few plasmid families, most are of recent evolutionary origin and appear to have spread rapidly by horizontal transfer. Such high eco-evolutionary turnover is particularly surprising for plasmids that are, based on previously suggested categorization, putatively nontransmissible, indicating that this type of plasmid is indeed frequently transferred by currently unknown mechanisms. Finally, analysis of recent gene transfer among plasmids reveals a network of extensive exchange connecting nearly all episomes. Genes functioning in plasmid transfer and maintenance are frequently exchanged, suggesting that plasmids can be rapidly transformed from one category to another. The broad distribution of episomes among distantly related hosts and the observed promiscuous recombination patterns show how episomes can offer their hosts rapid assembly and dissemination of novel functions.
Zallot, Rémi; Brochier-Armanet, Céline; Gaston, Kirk W; Forouhar, Farhad; Limbach, Patrick A; Hunt, John F; de Crécy-Lagard, Valérie
2014-08-15
Queuosine (Q) is a modification found at the wobble position of tRNAs with GUN anticodons. Although Q is present in most eukaryotes and bacteria, only bacteria can synthesize Q de novo. Eukaryotes acquire queuine (q), the free base of Q, from diet and/or microflora, making q an important but under-recognized micronutrient for plants, animals, and fungi. Eukaryotic type tRNA-guanine transglycosylases (eTGTs) are composed of a catalytic subunit (QTRT1) and a homologous accessory subunit (QTRTD1) forming a complex that catalyzes q insertion into target tRNAs. Phylogenetic analysis of eTGT subunits revealed a patchy distribution pattern in which gene losses occurred independently in different clades. Searches for genes co-distributing with eTGT family members identified DUF2419 as a potential Q salvage protein family. This prediction was experimentally validated in Schizosaccharomyces pombe by confirming that Q was present by analyzing tRNA(Asp) with anticodon GUC purified from wild-type cells and by showing that Q was absent from strains carrying deletions in the QTRT1 or DUF2419 encoding genes. DUF2419 proteins occur in most Eukarya with a few possible cases of horizontal gene transfer to bacteria. The universality of the DUF2419 function was confirmed by complementing the S. pombe mutant with the Zea mays (maize), human, and Sphaerobacter thermophilus homologues. The enzymatic function of this family is yet to be determined, but structural similarity with DNA glycosidases suggests a ribonucleoside hydrolase activity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sundstrom, Magnus; Chatterji, Udayan; Schaffer, Lana
2008-02-20
Expression of the feline immunodeficiency virus (FIV) accessory protein OrfA (or Orf2) is critical for efficient viral replication in lymphocytes, both in vitro and in vivo. OrfA has been reported to exhibit functions in common with the human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV) accessory proteins Vpr and Tat, although the function of OrfA has not been fully explained. Here, we use microarray analysis to characterize how OrfA modulates the gene expression profile of T-lymphocytes. The primary IL-2-dependent T-cell line 104-C1 was transduced to express OrfA. Functional expression of OrfA was demonstrated by trans complementation of the OrfA-defective clone, FIV-34TF10. OrfA-expressing cells had a slightly reduced cell proliferation rate but did not exhibit any significant alteration in cell cycle distribution. Reverse-transcribed RNA from cells expressing green fluorescent protein (GFP) or GFP + OrfA was hybridized to Affymetrix HU133 Plus 2.0 microarray chips representing more than 47,000 genome-wide transcripts. By using two statistical approaches, 461 (Rank Products) and 277 (ANOVA) genes were identified as modulated by OrfA expression. The functional relevance of the differentially expressed genes was explored by Ingenuity Pathway Analysis. The analyses revealed alterations in genes critical for RNA post-transcriptional modifications and protein ubiquitination as the two most significant functional outcomes of OrfA expression. In these two groups, several subunits of the spliceosome, cellular splicing factors and family members of the proteasome-ubiquitination system were identified. These findings provide novel information on the versatile function of OrfA during FIV infection and indicate a fine-tuning mechanism of the cellular environment by OrfA to facilitate efficient FIV replication.
Massive GGAAs in genomic repetitive sequences serve as a nuclear reservoir of NF-κB.
Wu, Jian; Wang, Qiao; Dai, Wei; Wang, Wei; Yue, Ming; Wang, Jinke
2018-04-13
Nuclear factor κB (NF-κB) is a DNA-binding transcription factor. Characterizing its genomic binding sites is crucial for understanding its gene regulatory function and mechanism in cells. This study characterized the binding sites of NF-κB RelA/p65 in tumor necrosis factor-α (TNFα)-stimulated HeLa cells by precise chromatin immunoprecipitation-sequencing (ChIP-seq). The results revealed that NF-κB binds nontraditional motifs (nt-motifs) containing a conserved GGAA quadruplet. Moreover, nt-motifs are mainly distributed in peaks near centromeres that contain a larger number of repetitive elements such as satellites, simple repeats and short interspersed nuclear elements (SINEs). This intracellular binding pattern was then confirmed by in vitro detection, indicating that NF-κB dimers can bind the nontraditional κB (nt-κB) sites with low affinity. However, this binding hardly activates transcription. This study thus deduced that NF-κB binding to nt-motifs may serve functions other than gene regulation, unlike NF-κB binding to traditional motifs (t-motifs). To test this deduction, ChIP-seq data from many other cell lines were then analyzed. The results indicate that NF-κB binding to nt-motifs is also widely present in other cells. The ChIP-seq data analysis also revealed that nt-motifs are more widely distributed in peaks with low-fold enrichment. Importantly, it was also found that NF-κB binding to nt-motifs is mainly present in resting cells, whereas NF-κB binding to t-motifs is mainly present in stimulated cells. Astonishingly, no known function was enriched by the gene annotation of nt-motif peaks. Based on these results, this study proposed that the nt-κB sites, which are extensively distributed in large numbers of repeat elements, function as a nuclear reservoir of NF-κB. The nuclear NF-κB proteins stored at nt-κB sites in resting cells may be recruited to the t-κB sites to regulate target genes upon stimulation.
Copyright © 2018 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
What can flies tell us about copper homeostasis?
Southon, Adam; Burke, Richard; Camakaris, James
2013-10-01
Copper (Cu) is an essential redox active metal that is potentially toxic in excess. Multicellular organisms acquire Cu from the diet and must regulate uptake, storage, distribution and export of Cu at both the cellular and organismal levels. Systemic Cu deficiency can be fatal, as seen in Menkes disease patients. Conversely Cu toxicity occurs in patients with Wilson disease. Cu dyshomeostasis has also been implicated in neurodegenerative disorders such as Alzheimer's disease. Over the last decade, the fly Drosophila melanogaster has become an important model organism for the elucidation of eukaryotic Cu regulatory mechanisms. Gene discovery approaches with Drosophila have identified novel genes with conserved protein functions relevant to Cu homeostasis in humans. This review focuses on our current understanding of Cu uptake, distribution and export in Drosophila and the implications for mammals.
Valentin-Kahan, Adrián; García-Tejedor, Gabriela B; Robello, Carlos; Trujillo-Cenóz, Omar; Russo, Raúl E; Alvarez-Valin, Fernando
2017-01-01
Slider turtles are the only known amniotes with self-repair mechanisms of the spinal cord that lead to substantial functional recovery. Their strategic phylogenetic position makes them a relevant model to investigate the peculiar genetic programs that allow anatomical reconnection in some vertebrate groups but are absent in others. Here, we analyze the gene expression profile of the response to spinal cord injury (SCI) in the turtle Trachemys scripta elegans. We found that this response comprises more than 1000 genes affecting diverse functions: reaction to ischemic insult, extracellular matrix re-organization, cell proliferation and death, immune response, and inflammation. Genes related to synapses and cholesterol biosynthesis are down-regulated. The analysis of the evolutionary distribution of these genes shows that almost all are present in most vertebrates. Additionally, we failed to find genes that were exclusive to regenerating taxa. The comparison of expression patterns among species shows that the response to SCI in the turtle is more similar to that of mice and non-regenerative Xenopus than to Xenopus during its regenerative stage. This observation, along with the lack of conserved "regeneration genes" and the currently accepted phylogenetic placement of turtles (sister group of crocodilians and birds), indicates that the ability of spinal cord self-repair of turtles does not represent the retention of an ancestral vertebrate character. Instead, our results suggest that turtles developed this capability from a non-regenerative ancestor (i.e., a lineage-specific innovation) that was achieved by re-organizing gene expression patterns on an essentially non-regenerative genetic background. Among the genes activated by SCI exclusively in turtles, those related to anoxia tolerance, extracellular matrix remodeling, and axonal regrowth are good candidates to underlie functional recovery.
The spatial distribution of fixed mutations within genes coding for proteins
NASA Technical Reports Server (NTRS)
Holmquist, R.; Goodman, M.; Conroy, T.; Czelusniak, J.
1983-01-01
An examination has been conducted of the extensive amino acid sequence data now available for five protein families - the alpha crystallin A chain, myoglobin, alpha and beta hemoglobin, and the cytochromes c - with the goal of estimating the true spatial distribution of base substitutions within genes that code for proteins. In every case the commonly used Poisson density failed to even approximate the experimental pattern of base substitution. For the 87 species of beta hemoglobin examined, for example, the probability that the observed results were from a Poisson process was the minuscule 10 to the -44th. Analogous results were obtained for the other functional families. All the data were reasonably, but not perfectly, described by the negative binomial density. In particular, most of the data were described by one of the very simple limiting forms of this density, the geometric density. The implications of this for evolutionary inference are discussed. It is evident that most estimates of total base substitutions between genes are badly in need of revision.
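The comparison described in this abstract, a Poisson density against the geometric limiting form of the negative binomial, can be illustrated with a small likelihood calculation. The sketch below is not the authors' procedure: the substitution counts per site are hypothetical values invented for illustration, and the function names are assumptions. It fits each density by maximum likelihood and compares log-likelihoods.

```python
import math

def poisson_loglik(counts):
    """Log-likelihood of counts under a Poisson density at its MLE (lambda = mean).

    Assumes at least one count is nonzero, so that log(lambda) is defined.
    """
    lam = sum(counts) / len(counts)
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def geometric_loglik(counts):
    """Log-likelihood under a geometric density P(k) = p * (1 - p)**k, k = 0, 1, ...

    The MLE is p = 1 / (1 + mean); this is the negative binomial with r = 1.
    """
    mean = sum(counts) / len(counts)
    p = 1.0 / (1.0 + mean)
    return sum(k * math.log(1.0 - p) + math.log(p) for k in counts)

# Hypothetical substitutions per site (NOT data from the paper): many
# unchanged sites plus a long tail of heavily substituted sites.
counts = [0] * 50 + [1] * 25 + [2] * 12 + [3] * 6 + [4] * 3 + [5] * 2 + [8, 10]

print("Poisson log-likelihood:  ", poisson_loglik(counts))
print("Geometric log-likelihood:", geometric_loglik(counts))
```

For overdispersed counts like these (variance well above the mean), the geometric density attains the higher log-likelihood, which is the qualitative pattern the abstract reports for real sequence data.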
Phylogenetic Origin and Diversification of RNAi Pathway Genes in Insects.
Dowling, Daniel; Pauli, Thomas; Donath, Alexander; Meusemann, Karen; Podsiadlowski, Lars; Petersen, Malte; Peters, Ralph S; Mayer, Christoph; Liu, Shanlin; Zhou, Xin; Misof, Bernhard; Niehuis, Oliver
2016-12-01
RNA interference (RNAi) refers to the set of molecular processes found in eukaryotic organisms in which small RNA molecules mediate the silencing or down-regulation of target genes. In insects, RNAi serves a number of functions, including regulation of endogenous genes, anti-viral defense, and defense against transposable elements. Despite being well studied in model organisms, such as Drosophila, the distribution of core RNAi pathway genes and their evolution in insects is not well understood. Here we present the most comprehensive overview of the distribution and diversity of core RNAi pathway genes across 100 insect species, encompassing all currently recognized insect orders. We inferred the phylogenetic origin of insect-specific RNAi pathway genes and also identified several hitherto unrecorded gene expansions using whole-body transcriptome data from the international 1KITE (1000 Insect Transcriptome Evolution) project as well as other resources such as i5K (5000 Insect Genome Project). Specifically, we traced the origin of the double stranded RNA binding protein R2D2 to the last common ancestor of winged insects (Pterygota), the loss of Sid-1/Tag-130 orthologs in Antliophora (fleas, flies and relatives, and scorpionflies in a broad sense), and confirm previous evidence for the splitting of the Argonaute proteins Aubergine and Piwi in Brachyceran flies (Diptera, Brachycera). Our study offers new reference points for future experimental research on RNAi-related pathway genes in insects. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Mu, Min; Lu, Xu-Ke; Wang, Jun-Juan; Wang, De-Long; Yin, Zu-Jun; Wang, Shuai; Fan, Wei-Li; Ye, Wu-Wei
2016-03-18
Trehalose (α-D-glucopyranosyl α-D-glucopyranoside) is a nonreducing disaccharide and is widely distributed in bacteria, fungi, algae, plants and invertebrates. In this study, stress-related trehalose-6-phosphate synthase (TPS) genes were identified in cotton, and the genetic structure and molecular evolution of TPSs were analyzed with bioinformatics methods, which could lay a foundation for further research on TPS functions in cotton. The genome information of Gossypium raimondii (group D), G. arboreum L. (group A), and G. hirsutum L. (group AD) was used in the study. Fifty-three TPSs were identified, comprising 15 genes in group D, 14 in group A, and 24 in group AD. Real-time PCR analysis was performed to investigate the expression patterns of gene family members. All TPS family members in cotton can be divided into two subfamilies: Class I and Class II. The similarity of the TPS sequence is high within the same species and close within their family relatives. The genetic structures of the two TPS subfamilies differ, with more introns and a more complicated gene structure in Class I. There is a TPS domain (Glyco_transf_20) at the N-terminal in all TPS family members and a TPP domain (Trehalose_PPase) at the C-terminal in all except GrTPS6, GhTPS4, and GhTPS9. All Class II members contain a UDP-forming domain. The responses to environmental stresses showed that stresses could induce the expression of TPSs but that the expression patterns vary with different stresses. The distribution of TPSs varies with different species but is relatively uniform on chromosomes. Genetic structure varies with different gene members, and expression levels vary with different stresses and exhibit tissue specificity. The number of upregulated genes in upland cotton TM-1 is significantly greater than that in G. raimondii and G. arboreum L. Shixiya 1.
Lin, Hailan; Xia, Xiaofeng; Yu, Liying; Vasseur, Liette; Gurr, Geoff M; Yao, Fengluan; Yang, Guang; You, Minsheng
2015-12-10
Serine proteases (SPs) are crucial proteolytic enzymes responsible for digestion and other processes including signal transduction and immune responses in insects. Serine protease homologs (SPHs) lack catalytic activity but are involved in innate immunity. This study presents a genome-wide investigation of SPs and SPHs in the diamondback moth, Plutella xylostella (L.), a globally-distributed destructive pest of cruciferous crops. A total of 120 putative SPs and 101 putative SPHs were identified in the P. xylostella genome by bioinformatics analysis. Based on the features of trypsin, 38 SPs were putatively designated as trypsin genes. The distribution, transcription orientation, exon-intron structure and sequence alignments suggested that the majority of trypsin genes evolved from tandem duplications. Among the 221 SP/SPH genes, ten SP and three SPH genes with one or more clip domains were predicted and designated as PxCLIPs. Phylogenetic analysis of CLIPs in P. xylostella, two other Lepidoptera species (Bombyx mori and Manduca sexta), and two more distantly related insects (Drosophila melanogaster and Apis mellifera) showed that seven of the 13 PxCLIPs were clustered with homologs of the Lepidoptera rather than other species. Expression profiling of the P. xylostella SP and SPH genes in different developmental stages and tissues showed diverse expression patterns, suggesting high functional diversity with roles in digestion and development. This is the first genome-wide investigation on the SP and SPH genes in P. xylostella. The characterized features and profiled expression patterns of the P. xylostella SPs and SPHs suggest their involvement in digestion, development and immunity of this species. Our findings provide a foundation for further research on the functions of this gene family in P. xylostella, and a better understanding of its capacity to rapidly adapt to a wide range of environmental variables including host plants and insecticides.
Buchner, Peter; Hawkesford, Malcolm J.
2014-01-01
NPF (formerly referred to as low-affinity NRT1) and ‘high-affinity’ NRT2 nitrate transporter genes are involved in nitrate uptake by the root, and transport and distribution of nitrate within the plant. The NPF gene family consists of 53 members in Arabidopsis thaliana; however, only 11 of these have been functionally characterized. Although homologous genes have been identified in genomes of different plant species including some cereals, there is little information available for wheat (Triticum aestivum). Sixteen genes were identified in wheat homologous to characterized Arabidopsis low-affinity nitrate transporter NPF genes, suggesting a complex wheat NPF gene family. The regulation of wheat NPF genes by plant N-status indicated involvement of these transporters in substrate transport in relation to N-metabolism. The complex expression pattern in relation to tissue specificity, nitrate availability and senescence may be associated with the complex growth patterns of wheat depending on sink/source demands, as well as remobilization during grain filling. PMID:24913625
Niche specialization of terrestrial archaeal ammonia oxidizers.
Gubry-Rangin, Cécile; Hai, Brigitte; Quince, Christopher; Engel, Marion; Thomson, Bruce C; James, Phillip; Schloter, Michael; Griffiths, Robert I; Prosser, James I; Nicol, Graeme W
2011-12-27
Soil pH is a major determinant of microbial ecosystem processes and potentially a major driver of evolution, adaptation, and diversity of ammonia oxidizers, which control soil nitrification. Archaea are major components of soil microbial communities and contribute significantly to ammonia oxidation in some soils. To determine whether pH drives evolutionary adaptation and community structure of soil archaeal ammonia oxidizers, sequences of amoA, a key functional gene of ammonia oxidation, were examined in soils at global, regional, and local scales. Globally distributed database sequences clustered into 18 well-supported phylogenetic lineages that dominated specific soil pH ranges classified as acidic (pH <5), acido-neutral (5 ≤ pH <7), or alkalinophilic (pH ≥ 7). To determine whether patterns were reproduced at regional and local scales, amoA gene fragments were amplified from DNA extracted from 47 soils in the United Kingdom (pH 3.5-8.7), including a pH-gradient formed by seven soils at a single site (pH 4.5-7.5). High-throughput sequencing and analysis of amoA gene fragments identified an additional, previously undiscovered phylogenetic lineage and revealed similar pH-associated distribution patterns at global, regional, and local scales, which were most evident for the five most abundant clusters. Archaeal amoA abundance and diversity increased with soil pH, which was the only physicochemical characteristic measured that significantly influenced community structure. These results suggest evolution based on specific adaptations to soil pH and niche specialization, resulting in a global distribution of archaeal lineages that have important consequences for soil ecosystem function and nitrogen cycling.
Ancient Eukaryotic Origin and Evolutionary Plasticity of Nuclear Lamina.
Koreny, Ludek; Field, Mark C
2016-09-19
The emergence of the nucleus was a major event of eukaryogenesis. How the nuclear envelope (NE) arose and acquired functions governing chromatin organization and epigenetic control has direct bearing on origins of developmental/stage-specific expression programs. The configuration of the NE and the associated lamina in the last eukaryotic common ancestor (LECA) is of major significance and can provide insight into activities within the LECA nucleus. Subsequent lamina evolution, alterations, and adaptations inform on the variation and selection of distinct mechanisms that subtend gene expression in distinct taxa. Understanding lamina evolution has been difficult due to the diversity and limited taxonomic distributions of the three currently known highly distinct nuclear lamina. We rigorously searched available sequence data for an expanded view of the distribution of known lamina and lamina-associated proteins. While the lamina proteins of plants and trypanosomes are indeed taxonomically restricted, homologs of metazoan lamins and key lamin-binding proteins have significantly broader distributions, and a lamin gene tree supports vertical evolution from the LECA. Two protist lamins from highly divergent taxa target the nucleus in mammalian cells and polymerize into filamentous structures, suggesting functional conservation of distant lamin homologs. Significantly, a high level of divergence of lamin homologs within certain eukaryotic groups and the apparent absence of lamins and/or the presence of seemingly different lamina proteins in many eukaryotes suggests great evolutionary plasticity in structures at the NE, and hence mechanisms of chromatin tethering and epigenetic gene control. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Characterizing virus-induced gene silencing at the cellular level with in situ multimodal imaging
Burkhow, Sadie J.; Stephens, Nicole M.; Mei, Yu; ...
2018-05-25
Reverse genetic strategies, such as virus-induced gene silencing, are powerful techniques to study gene function. Currently, there are few tools to study the spatial dependence of the consequences of gene silencing at the cellular level. Here, we report the use of multimodal Raman and mass spectrometry imaging to study the cellular-level biochemical changes that occur from silencing the phytoene desaturase (pds) gene using a Foxtail mosaic virus (FoMV) vector in maize leaves. The multimodal imaging method allows the localized carotenoid distribution to be measured and reveals differences lost in the spatial average when analyzing a carotenoid extraction of the whole leaf. The Raman and mass spectrometry signals are complementary in nature: silencing pds reduces the downstream carotenoid Raman signal and increases the phytoene mass spectrometry signal.
Hu, Hongshuang; Xin, Nian; Liu, Jinxiang; Liu, Mengmeng; Wang, Zhenwei; Wang, Wenji; Zhang, Quanqi; Qi, Jie
2016-01-10
F-spondin was originally isolated from the developing embryonic floor plate of vertebrates, secreting numerous kinds of neuron-related molecules. The protein performs a positive function in nervous system development, which is attributed to the high conservation of F-spondin protein, an extracellular matrix (ECM) protein in several species. However, its precise function remains unknown, especially in marine fish. In this study, the F-spondin of Japanese flounder (Paralichthys olivaceus) was cloned, and its expression pattern and structural characteristics were analyzed. The 2421 bp cDNA ORF of PoF-spondin was obtained and divided into 14 exons spread over 61,496 bp of the genomic sequence. Phylogenetic analysis showed that PoF-spondin was the ortholog of the human spon1 gene and shared high identities with other teleost spon1a genes. Quantitative RT-PCR analysis showed that PoF-spondin was maternally expressed, and transcripts were present from the one-cell stage to the hatching stage, peaking at the tailbud stage. Tissue distribution analysis indicated that PoF-spondin was detectable mainly in the gonads (especially in the ovary) and the brain. Whole mount in situ hybridization analysis revealed that PoF-spondin transcripts were distributed throughout the cleavage of the ball in the early stage and were expressed at a high level in the floor plate of the trunk at tailbud and pre-hatching stages. Furthermore, the expression of genes related to nervous system development (spon1b, foxo3b, and foxj1a) was significantly increased after the injection of PoF-spondin into the embryos of wild-type zebrafish. In addition, PoF-spondin significantly suppressed the expression of the chordamesoderm marker gene ntl, increased the expression of the ectoderm marker genes otx2/krox20, and left the expression of the dorsal mesodermal marker gene gsc unaffected at 50% epiboly stage in zebrafish. In short, our results suggest that PoF-spondin functions in the development of the teleost nervous system. Copyright © 2015 Elsevier B.V. All rights reserved.
Genome-wide loss of 5-hmC is a novel epigenetic feature of Huntington's disease.
Wang, Fengli; Yang, Yeran; Lin, Xiwen; Wang, Jiu-Qiang; Wu, Yong-Sheng; Xie, Wenjuan; Wang, Dandan; Zhu, Shu; Liao, You-Qi; Sun, Qinmiao; Yang, Yun-Gui; Luo, Huai-Rong; Guo, Caixia; Han, Chunsheng; Tang, Tie-Shan
2013-09-15
5-Hydroxymethylcytosine (5-hmC) may represent a new epigenetic modification of cytosine. While the dynamics of 5-hmC during neurodevelopment have recently been reported, little is known about its genomic distribution and function(s) in neurodegenerative diseases such as Huntington's disease (HD). We here observed a marked reduction of the 5-hmC signal in YAC128 (yeast artificial chromosome transgene with 128 CAG repeats) HD mouse brain tissues when compared with age-matched wild-type (WT) mice, suggesting a deficiency of 5-hmC reconstruction in HD brains during postnatal development. Genome-wide distribution analysis of 5-hmC further confirmed the diminishment of the 5-hmC signal in striatum and cortex in YAC128 HD mice. General genomic features of 5-hmC are highly conserved, not being affected by either disease or brain region. Intriguingly, we have identified disease-specific (YAC128 versus WT) differentially hydroxymethylated regions (DhMRs), and found that acquisition of DhMRs in the gene body is a positive epigenetic regulator of gene expression. Ingenuity pathway analysis (IPA) of genotype-specific DhMR-annotated genes revealed that alteration of a number of canonical pathways involving neuronal development/differentiation (Wnt/β-catenin/Sox pathway, axonal guidance signaling pathway) and neuronal function/survival (glutamate receptor/calcium/CREB, GABA receptor signaling, dopamine-DARPP32 feedback pathway, etc.) could be important for the onset of HD. Our results indicate that loss of the 5-hmC marker is a novel epigenetic feature in HD, and that this aberrant epigenetic regulation may impair neurogenesis, neuronal function and survival in the HD brain. Our study also opens a new avenue for HD treatment; re-establishing the native 5-hmC landscape may have the potential to slow/halt the progression of HD.
Basolateral membrane K+ channels in renal epithelial cells
Devor, Daniel C.
2012-01-01
The major function of epithelial tissues is to maintain proper ion, solute, and water homeostasis. The tubule of the renal nephron has an amazingly simple structure, lined by epithelial cells, yet the segments (i.e., proximal tubule vs. collecting duct) of the nephron have unique transport functions. The functional differences are because epithelial cells are polarized and thus possess different patterns (distributions) of membrane transport proteins in the apical and basolateral membranes of the cell. K+ channels play critical roles in normal physiology. Over 90 different genes for K+ channels have been identified in the human genome. Epithelial K+ channels can be located within either or both the apical and basolateral membranes of the cell. One of the primary functions of basolateral K+ channels is to recycle K+ across the basolateral membrane for proper function of the Na+-K+-ATPase, among other functions. Mutations of these channels can cause significant disease. The focus of this review is to provide an overview of the basolateral K+ channels of the nephron, providing potential physiological functions and pathophysiology of these channels, where appropriate. We have taken a “K+ channel gene family” approach in presenting the representative basolateral K+ channels of the nephron. The basolateral K+ channels of the renal epithelia are represented by members of the KCNK, KCNJ, KCNQ, KCNE, and SLO gene families. PMID:22338089
Chen, Hongfei; Zuo, Xiya; Shao, Hongxia; Fan, Sheng; Ma, Juanjuan; Zhang, Dong; Zhao, Caiping; Yan, Xiangyan; Liu, Xiaojie; Han, Mingyu
2018-02-01
Carotenoid cleavage oxygenases (CCOs) are able to cleave carotenoids to produce apocarotenoids and their derivatives, which are important for plant growth and development. In this study, 21 apple CCO genes were identified and divided into six groups based on their phylogenetic relationships. We further characterized the apple CCO genes in terms of chromosomal distribution, structure and the presence of cis-elements in the promoter. We also predicted the cellular localization of the encoded proteins. An analysis of the synteny within the apple genome revealed that tandem, segmental, and whole-genome duplication events likely contributed to the expansion of the apple carotenoid oxygenase gene family. An additional integrated synteny analysis identified orthologous carotenoid oxygenase genes between apple and Arabidopsis thaliana, which served as references for the functional analysis of the apple CCO genes. The net photosynthetic rate, transpiration rate, and stomatal conductance of leaves decreased, while leaf stomatal density increased under drought and saline conditions. Tissue-specific gene expression analyses revealed diverse spatiotemporal expression patterns. Finally, hormone and abiotic stress treatments indicated that many apple CCO genes are responsive to various phytohormones as well as drought and salinity stresses. The genome-wide identification of apple CCO genes and the analyses of their expression patterns described herein may provide a solid foundation for future studies examining the regulation and functions of this gene family. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Cao, Yunpeng; Han, Yahui; Li, Dahui; Lin, Yi; Cai, Yongping
2016-01-01
In plants, 4-coumarate:coenzyme A ligases (4CLs), comprising some of the adenylate-forming enzymes, are key enzymes involved in regulating lignin metabolism and the biosynthesis of flavonoids and other secondary metabolites. Although several 4CL-related proteins were shown to play roles in secondary metabolism, no comprehensive study on 4CL-related genes in the pear and other Rosaceae species has been reported. In this study, we identified 4CL-related genes in the apple, peach, yangmei, and pear genomes using DNATOOLS software and inferred their evolutionary relationships using phylogenetic analysis, collinearity analysis, conserved motif analysis, and structure analysis. A total of 149 4CL-related genes in four Rosaceous species (pear, apple, peach, and yangmei) were identified, with 30 members in the pear. We explored the functions of several 4CL and acyl-coenzyme A synthetase (ACS) genes during the development of pear fruit by quantitative real-time PCR (qRT-PCR). We found that duplication events had occurred in the 30 4CL-related genes in the pear. These duplicated 4CL-related genes are distributed unevenly across all pear chromosomes except chromosomes 4, 8, 11, and 12. The results of this study provide a basis for further investigation of both the functions and evolutionary history of 4CL-related genes. PMID:27775579
Understanding genetic regulatory networks
NASA Astrophysics Data System (ADS)
Kauffman, Stuart
2003-04-01
Random Boolean networks (RBNs) were introduced about 35 years ago as first crude models of genetic regulatory networks. RBNs consist of N on-off genes, connected by a randomly assigned regulatory wiring diagram where each gene has K inputs, and each gene is controlled by a randomly assigned Boolean function. This procedure samples at random from the ensemble of all possible NK Boolean networks. The central ideas are to study the typical, or generic, properties of this ensemble, and see 1) whether characteristic differences appear as K and biases in Boolean functions are introduced, and 2) whether a subclass of this ensemble has properties matching real cells. Such networks behave in an ordered or a chaotic regime, with a phase transition, "the edge of chaos", between the two regimes. Networks with continuous variables exhibit the same two regimes. Substantial evidence suggests that real cells are in the ordered regime. A key concept is that of an attractor. This is a reentrant trajectory of states of the network, called a state cycle. The central biological interpretation is that cell types are attractors. A number of properties differentiate the ordered and chaotic regimes. These include the size and number of attractors, the existence in the ordered regime of a percolating "sea" of genes frozen in the on or off state, with a remainder of isolated twinkling islands of genes, a power law distribution of avalanches of gene activity changes following perturbation to a single gene in the ordered regime versus a similar power law distribution plus a spike of enormous avalanches of gene changes in the chaotic regime, and the existence of branching pathways of "differentiation" between attractors induced by perturbations in the ordered regime. Noise is a serious issue, since noise disrupts attractors. But numerical evidence suggests that attractors can be made very stable to noise, and meanwhile, metaplasias may be a biological manifestation of noise.
As we learn more about the wiring diagram and constraints on rules controlling real genes, we can build refined ensembles reflecting these properties, study the generic properties of the refined ensembles, and hope to gain insight into the dynamics of real cells.
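The NK construction described in this abstract can be illustrated with a minimal sketch. It assumes synchronous updating and uniformly random truth tables (no bias); the function names `make_rbn`, `step`, and `find_attractor` are hypothetical and are not taken from the cited work.

```python
import random


def make_rbn(n, k, seed=0):
    """Build a random NK Boolean network (illustrative, unbiased functions).

    Each of the n genes gets k distinct randomly chosen inputs and a
    randomly filled truth table with 2**k entries.
    """
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every gene from the current values of its inputs."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for i in gene_inputs:
            # Pack the input values into an index into the truth table.
            index = (index << 1) | state[i]
        new_state.append(table[index])
    return tuple(new_state)


def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length).

    The state space is finite and the update is deterministic, so every
    trajectory must eventually enter a state cycle (an attractor).
    """
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return first_seen[state], t - first_seen[state]


if __name__ == "__main__":
    inputs, tables = make_rbn(n=12, k=2, seed=42)
    start = tuple([0] * 12)
    transient, cycle = find_attractor(start, inputs, tables)
    print(f"transient={transient}, cycle length={cycle}")
```

Repeating the search from many random initial states and for different values of K gives a rough feel for how the number and length of state cycles change between the ordered and chaotic regimes.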
Korenjak, Michael; Kwon, Eunjeong; Morris, Robert T.; Anderssen, Endre; Amzallag, Arnaud; Ramaswamy, Sridhar; Dyson, Nicholas J.
2014-01-01
dREAM complexes represent the predominant form of E2F/RBF repressor complexes in Drosophila. dREAM associates with thousands of sites in the fly genome but its mechanism of action is unknown. To understand the genomic context in which dREAM acts we examined the distribution and localization of Drosophila E2F and dREAM proteins. Here we report a striking and unexpected overlap between dE2F2/dREAM sites and binding sites for the insulator-binding proteins CP190 and Beaf-32. Genetic assays show that these components functionally co-operate and chromatin immunoprecipitation experiments on mutant animals demonstrate that dE2F2 is important for association of CP190 with chromatin. dE2F2/dREAM binding sites are enriched at divergently transcribed genes, and the majority of genes upregulated by dE2F2 depletion represent the repressed half of a differentially expressed, divergently transcribed pair of genes. Analysis of mutant animals confirms that dREAM and CP190 are similarly required for transcriptional integrity at these gene pairs and suggest that dREAM functions in concert with CP190 to establish boundaries between repressed/activated genes. Consistent with the idea that dREAM co-operates with insulator-binding proteins, genomic regions bound by dREAM possess enhancer-blocking activity that depends on multiple dREAM components. These findings suggest that dREAM functions in the organization of transcriptional domains. PMID:25053843
Winata, Cecilia L; Kondrychyn, Igor; Kumar, Vibhor; Srinivasan, Kandhadayar G; Orlov, Yuriy; Ravishankar, Ashwini; Prabhakar, Shyam; Stanton, Lawrence W; Korzh, Vladimir; Mathavan, Sinnakaruppan
2013-10-01
Zic3 regulates early embryonic patterning in vertebrates. Loss of Zic3 function is known to disrupt gastrulation, left-right patterning, and neurogenesis. However, molecular events downstream of this transcription factor are poorly characterized. Here we use the zebrafish as a model to study the developmental role of Zic3 in vivo, by applying a combination of two powerful genomics approaches: ChIP-seq and microarray. Besides confirming direct regulation of previously implicated Zic3 targets of the Nodal and canonical Wnt pathways, analysis of gastrula stage embryos uncovered a number of novel candidate target genes, among which were members of the non-canonical Wnt pathway and the neural pre-pattern genes. A similar analysis in zic3-expressing cells obtained by FACS at segmentation stage revealed a dramatic shift in Zic3 binding site locations and identified an entirely distinct set of target genes associated with later developmental functions such as neural development. We demonstrate cis-regulation of several of these target genes by Zic3 using an in vivo enhancer assay. Analysis of Zic3 binding sites revealed a distribution biased towards distal intergenic regions, indicative of a long-distance regulatory mechanism; some of these binding sites are highly conserved during evolution and act as functional enhancers. This demonstrated that Zic3 regulation of developmental genes is achieved predominantly through a long-distance regulatory mechanism and revealed that developmental transitions could be accompanied by dramatic changes in the regulatory landscape.
Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan
2014-01-01
Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, and carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in the metabolism of amino acids, and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837
DNA methylation in insects: on the brink of the epigenomic era.
Glastad, K M; Hunt, Brendan G; Yi, S V; Goodisman, M A D
2011-10-01
DNA methylation plays an important role in gene regulation in animals. However, the evolution and function of DNA methylation has only recently emerged as the subject of widespread study in insects. In this review we profile the known distribution of DNA methylation systems across insect taxa and synthesize functional inferences from studies of DNA methylation in insects and vertebrates. Unlike vertebrate genomes, which tend to be globally methylated, DNA methylation is primarily targeted to genes in insects. Nevertheless, mounting evidence suggests that a specialized role exists for genic methylation in the regulation of transcription, and possibly mRNA splicing, in both insects and mammals. Investigations in several insect taxa further reveal that DNA methylation is preferentially targeted to ubiquitously expressed genes and may play a key role in the regulation of phenotypic plasticity. We suggest that insects are particularly amenable to advancing our understanding of the biological functions of DNA methylation, because insects are evolutionarily diverse, display several lineage-specific losses of DNA methylation and possess tractable patterns of DNA methylation in moderately sized genomes. © 2011 The Authors. Insect Molecular Biology © 2011 The Royal Entomological Society.
The schizophrenia risk gene product miR-137 alters presynaptic plasticity
Siegert, Sandra; Seo, Jinsoo; Kwon, Ester J.; Rudenko, Andrii; Cho, Sukhee; Wang, Wenyuan; Flood, Zachary; Martorell, Anthony J.; Ericsson, Maria; Mungenast, Alison E.; Tsai, Li-Huei
2015-01-01
Non-coding variants in the human MIR137 gene locus increase schizophrenia risk at a genome-wide significance level. However, the functional consequence of these risk alleles is unknown. Here, we examined induced human neurons harboring the minor alleles of four disease-associated single nucleotide polymorphisms (SNPs) in MIR137, and observed increased MIR137 levels compared to major allele-carrying cells. We found that miR-137 gain-of-function causes downregulation of the presynaptic target genes, Complexin-1 (Cplx1), Nsf, and Synaptotagmin-1 (Syt1), leading to impaired vesicle release. In vivo, miR-137 gain-of-function results in changes in synaptic vesicle pool distribution, impaired mossy fiber-LTP induction and deficits in hippocampus-dependent learning and memory. By sequestering endogenous miR-137, we were able to ameliorate the synaptic phenotypes. Moreover, reinstatement of Syt1 expression partially restored synaptic plasticity, demonstrating the importance of Syt1 as a miR-137 target. Our data provide new insight into the mechanism by which miR-137 dysregulation can impair synaptic plasticity in the hippocampus. PMID:26005852
2013-01-01
Background: Transcription factors (TFs) are vital elements that regulate transcription and the spatio-temporal expression of genes, thereby ensuring the accurate development and functioning of an organism. The identification of TF-encoding genes in a liverwort, Marchantia polymorpha, offers insights into TF organization in the members of the most basal lineages of land plants (embryophytes). Therefore, a comparison of Marchantia TF genes with other land plants (monocots, dicots, bryophytes) and algae (chlorophytes, rhodophytes) provides the most comprehensive view of the rates of expansion or contraction of TF genes in plant evolution. Results: In this study, we report the identification of TF-encoding transcripts in M. polymorpha for the first time, as evidenced by deep RNA sequencing data. In total, 3,471 putative TF encoding transcripts, distributed in 80 families, were identified, representing 7.4% of the generated Marchantia gametophytic transcriptome dataset. Overall, TF basic functions and distribution across families appear to be conserved when compared to other plant species. However, it is of interest to observe the genesis of novel sequences in 24 TF families and the apparent termination of 2 TF families with the emergence of Marchantia. Out of 24 TF families, 6 are known to be associated with plant reproductive development processes. We also examined the expression pattern of these TF-encoding transcripts in six male and female developmental stages in vegetative and reproductive gametophytic tissues of Marchantia. Conclusions: The analysis highlighted the importance of Marchantia, a model plant system, in an evolutionary context. The dataset generated here provides a scientific resource for TF gene discovery and other comparative evolutionary studies of land plants. PMID:24365221
Tollenaere, C; Jacquet, S; Ivanova, S; Loiseau, A; Duplantier, J-M; Streiff, R; Brouat, C
2013-01-01
Genome scans using amplified fragment length polymorphism (AFLP) markers became popular in nonmodel species within the last 10 years, but few studies have tried to characterize the anonymous outliers identified. This study follows on from an AFLP genome scan in the black rat (Rattus rattus), the reservoir of plague (Yersinia pestis infection) in Madagascar. We successfully sequenced 17 of the 22 markers previously shown to be potentially affected by plague-mediated selection and associated with a plague resistance phenotype. Searching these sequences in the genome of the closely related species Rattus norvegicus assigned them to 14 genomic regions, revealing a random distribution of outliers in the genome (no clustering). We compared these results with those of an in silico AFLP study of the R. norvegicus genome, which showed that outlier sequences could not have been inferred by this method in R. rattus (only four of the 15 sequences were predicted). However, in silico analysis allowed the prediction of AFLP markers distribution and the estimation of homoplasy rates, confirming its potential utility for designing AFLP studies in nonmodel species. The 14 genomic regions surrounding AFLP outliers (less than 300 kb from the marker) contained 75 genes encoding proteins of known function, including nine involved in immune function and pathogen defence. We identified the two interleukin 1 genes (Il1a and Il1b) that share homology with an antigen of Y. pestis, as the best candidates for genes subject to plague-mediated natural selection. At least six other genes known to be involved in proinflammatory pathways may also be affected by plague-mediated selection. © 2012 Blackwell Publishing Ltd.
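The idea of an in silico AFLP analysis, as mentioned in this abstract, can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: it assumes the commonly used EcoRI/MseI enzyme pair (the abstract does not name the enzymes), ignores adapters and selective primer nucleotides, and scans one strand only. The 50 to 500 bp size window and all function names are hypothetical choices.

```python
import re
from collections import Counter

# Recognition motif and cut offset within the motif (assumed enzyme pair).
SITES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}


def cut_positions(seq, sites=SITES):
    """Return sorted (cut position, enzyme) pairs for every recognition site."""
    cuts = []
    for enzyme, (motif, offset) in sites.items():
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer(f"(?={motif})", seq):
            cuts.append((match.start() + offset, enzyme))
    return sorted(cuts)


def aflp_fragments(seq, min_len=50, max_len=500):
    """Keep fragments cut by a different enzyme at each end, within a size window.

    Each fragment is reported as (start, end, length).
    """
    cuts = cut_positions(seq)
    fragments = []
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        length = p2 - p1
        if e1 != e2 and min_len <= length <= max_len:
            fragments.append((p1, p2, length))
    return fragments


def homoplasy_rate(fragments):
    """Fraction of fragments whose length is shared with at least one other fragment."""
    if not fragments:
        return 0.0
    counts = Counter(length for _, _, length in fragments)
    shared = sum(c for c in counts.values() if c > 1)
    return shared / len(fragments)
```

Fragments of identical length from different loci would be indistinguishable on a gel or capillary trace, which is why size collisions are used here as a crude proxy for homoplasy.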
Distribution and Evolution of Yersinia Leucine-Rich Repeat Proteins
Hu, Yueming; Huang, He; Hui, Xinjie; Cheng, Xi; White, Aaron P.
2016-01-01
Leucine-rich repeat (LRR) proteins are widely distributed in bacteria, playing important roles in various protein-protein interaction processes. In Yersinia, the well-characterized type III secreted effector YopM also belongs to the LRR protein family and is encoded by virulence plasmids. However, little has been known about other LRR members encoded by Yersinia genomes or their evolution. In this study, the Yersinia LRR proteins were comprehensively screened, categorized, and compared. The LRR proteins encoded by chromosomes (LRR1 proteins) appeared to be more similar to each other and different from those encoded by plasmids (LRR2 proteins) with regard to repeat-unit length, amino acid composition profile, and gene expression regulation circuits. LRR1 proteins were also different from LRR2 proteins in that the LRR1 proteins contained an E3 ligase domain (NEL domain) in the C-terminal region or an NEL domain-encoding nucleotide relic in flanking genomic sequences. The LRR1 protein-encoding genes (LRR1 genes) varied dramatically and were categorized into 4 subgroups (a to d), with the LRR1a to -c genes evolving from the same ancestor and LRR1d genes evolving from another ancestor. The consensus and ancestor repeat-unit sequences were inferred for different LRR1 protein subgroups by use of a maximum parsimony modeling strategy. Structural modeling disclosed very similar repeat-unit structures between LRR1 and LRR2 proteins despite the different unit lengths and amino acid compositions. Structural constraints may serve as the driving force to explain the observed mutations in the LRR regions. This study suggests that there may be functional variation and lays the foundation for future experiments investigating the functions of the chromosomally encoded LRR proteins of Yersinia. PMID:27217422
Distribution of anaerobic carbon monoxide dehydrogenase genes in deep subseafloor sediments.
Hoshino, T; Inagaki, F
2017-05-01
Carbon monoxide (CO) is the simplest oxocarbon generated by the decomposition of organic compounds, and it is expected to be in marine sediments in substantial amounts. However, the availability of CO in the deep subseafloor sedimentary biosphere is largely unknown even though anaerobic oxidation of CO is a thermodynamically favourable reaction that possibly occurs with sulphate reduction, methanogenesis, acetogenesis and hydrogenesis. In this study, we surveyed for the first time the distribution of the CO dehydrogenase gene (cooS), which encodes the catalytic beta subunit of anaerobic CO dehydrogenase (CODH), in subseafloor sediment-core samples from the eastern flank of the Juan de Fuca Ridge, Mars-Ursa Basin, Kumano Basin, and off the Shimokita Peninsula, Japan, during Integrated Ocean Drilling Program (IODP) Expeditions 301, 308 and 315 and the D/V Chikyu shakedown cruise CK06-06, respectively. Our results show the occurrence of diverse cooS genes from the seafloor down to about 390 m below the seafloor, suggesting that microbial communities have metabolic functions to utilize CO in anoxic microbial ecosystems beneath the ocean floor, and that the microbial community potentially responsible for anaerobic CO oxidation differs in accordance with possible energy-yielding metabolic reactions in the deep subseafloor sedimentary biosphere. Little is known about the microbial community associated with carbon monoxide (CO) in the deep subseafloor. This study is the first survey of a functional gene encoding anaerobic carbon monoxide dehydrogenase (CODH). The widespread occurrence of previously undiscovered CO dehydrogenase genes (cooS) suggests that diverse micro-organisms are capable of anaerobic oxidation of CO in the deep subseafloor sedimentary biosphere. © 2017 The Society for Applied Microbiology.
Spaceflight effects on T lymphocyte distribution, function and gene expression
Gridley, Daila S.; Slater, James M.; Luo-Owen, Xian; Rizvi, Asma; Chapes, Stephen K.; Stodieck, Louis S.; Ferguson, Virginia L.; Pecaut, Michael J.
2009-01-01
The immune system is highly sensitive to stressors present during spaceflight. The major emphasis of this study was on the T lymphocytes in C57BL/6NTac mice after return from a 13-day space shuttle mission (STS-118). Spleens and thymuses from flight animals (FLT) and ground controls similarly housed in animal enclosure modules (AEM) were evaluated within 3–6 h after landing. Phytohemagglutinin-induced splenocyte DNA synthesis was significantly reduced in FLT mice when based on both counts per minute and stimulation indexes (P < 0.05). Flow cytometry showed that CD3+ T and CD19+ B cell counts were low in spleens from the FLT group, whereas the number of NK1.1+ natural killer (NK) cells was increased (P < 0.01 for all three populations vs. AEM). The numerical changes resulted in a low percentage of T cells and high percentage of NK cells in FLT animals (P < 0.05). After activation of spleen cells with anti-CD3 monoclonal antibody, interleukin-2 (IL-2) was decreased, but IL-10, interferon-γ, and macrophage inflammatory protein-1α were increased in FLT mice (P < 0.05). Analysis of cancer-related genes in the thymus showed that the expression of 30 of 84 genes was significantly affected by flight (P < 0.05). Genes that differed from AEM controls by at least 1.5-fold were Birc5, Figf, Grb2, and Tert (upregulated) and Fos, Ifnb1, Itgb3, Mmp9, Myc, Pdgfb, S100a4, Thbs, and Tnf (downregulated). Collectively, the data show that T cell distribution, function, and gene expression are significantly modified shortly after return from the spaceflight environment. PMID:18988762
Yang, Dong-Dong; de Billerbeck, Gustavo M; Zhang, Jin-Jing; Rosenzweig, Frank; Francois, Jean-Marie
2018-01-01
Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5' sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family.
Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. Copyright © 2017 Yang et al.
de Billerbeck, Gustavo M.; Zhang, Jin-jing; Rosenzweig, Frank
2017-01-01
ABSTRACT Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5′ sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family. 
Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. PMID:29079624
Ahadian, Samad; Ramón-Azcón, Javier; Estili, Mehdi; Liang, Xiaobin; Ostrovidov, Serge; Shiku, Hitoshi; Ramalingam, Murugan; Nakajima, Ken; Sakka, Yoshio; Bae, Hojae; Matsue, Tomokazu; Khademhosseini, Ali
2014-03-19
Biological scaffolds with tunable electrical and mechanical properties are of great interest in many different fields, such as regenerative medicine, biorobotics, and biosensing. In this study, dielectrophoresis (DEP) was used to vertically align carbon nanotubes (CNTs) within methacrylated gelatin (GelMA) hydrogels in a robust, simple, and rapid manner. GelMA-aligned CNT hydrogels showed anisotropic electrical conductivity and superior mechanical properties compared with pristine GelMA hydrogels and GelMA hydrogels containing randomly distributed CNTs. Skeletal muscle cells grown on vertically aligned CNTs in GelMA hydrogels yielded a higher number of functional myofibers than cells that were cultured on hydrogels with randomly distributed CNTs and horizontally aligned CNTs, as confirmed by the expression of myogenic genes and proteins. In addition, the myogenic gene and protein expression increased more profoundly after applying electrical stimulation along the direction of the aligned CNTs due to the anisotropic conductivity of the hybrid GelMA-vertically aligned CNT hydrogels. We believe that this platform could attract great attention in other biomedical applications, such as biosensing, bioelectronics, and creating functional biomedical devices.
Ahadian, Samad; Ramón-Azcón, Javier; Estili, Mehdi; Liang, Xiaobin; Ostrovidov, Serge; Shiku, Hitoshi; Ramalingam, Murugan; Nakajima, Ken; Sakka, Yoshio; Bae, Hojae; Matsue, Tomokazu; Khademhosseini, Ali
2014-01-01
Biological scaffolds with tunable electrical and mechanical properties are of great interest in many different fields, such as regenerative medicine, biorobotics, and biosensing. In this study, dielectrophoresis (DEP) was used to vertically align carbon nanotubes (CNTs) within methacrylated gelatin (GelMA) hydrogels in a robust, simple, and rapid manner. GelMA-aligned CNT hydrogels showed anisotropic electrical conductivity and superior mechanical properties compared with pristine GelMA hydrogels and GelMA hydrogels containing randomly distributed CNTs. Skeletal muscle cells grown on vertically aligned CNTs in GelMA hydrogels yielded a higher number of functional myofibers than cells that were cultured on hydrogels with randomly distributed CNTs and horizontally aligned CNTs, as confirmed by the expression of myogenic genes and proteins. In addition, the myogenic gene and protein expression increased more profoundly after applying electrical stimulation along the direction of the aligned CNTs due to the anisotropic conductivity of the hybrid GelMA-vertically aligned CNT hydrogels. We believe that this platform could attract great attention in other biomedical applications, such as biosensing, bioelectronics, and creating functional biomedical devices. PMID:24642903
Genetic recombination is associated with intrinsic disorder in plant proteomes.
Yruela, Inmaculada; Contreras-Moreira, Bruno
2013-11-09
Intrinsically disordered proteins, found in all living organisms, are essential for basic cellular functions and complement the function of ordered proteins. It has been shown that protein disorder is linked to the G + C content of the genome. Furthermore, recent investigations have suggested that the evolutionary dynamics of the plant nucleus adds disordered segments to open reading frames alike, and these segments are not necessarily conserved among orthologous genes. In the present work the distribution of intrinsically disordered proteins along the chromosomes of several representative plants was analyzed. The reported results support a non-random distribution of disordered proteins along the chromosomes of Arabidopsis thaliana and Oryza sativa, two model eudicot and monocot plant species, respectively. In fact, for most chromosomes positive correlations between the frequency of disordered segments of 30+ amino acids and both recombination rates and G + C content were observed. These analyses demonstrate that the presence of disordered segments among plant proteins is associated with the rates of genetic recombination of their encoding genes. Altogether, these findings suggest that high recombination rates, as well as chromosomal rearrangements, could induce disordered segments in proteins during evolution.
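The kind of association reported in this abstract, between the frequency of disordered segments and recombination rate, could be quantified per chromosome window with a correlation coefficient. The sketch below assumes Pearson's r (the abstract does not state which statistic was used), and the function name `pearson_r` and the window-based layout are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    xs might hold, for example, the frequency of disordered segments per
    chromosome window and ys the recombination rate in the same windows
    (an assumed data layout, not one described in the abstract).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

A positive value for a chromosome would be consistent with the association described above, although a rank-based statistic may be preferable if the window values are strongly skewed.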
Ecophysiology of Freshwater Verrucomicrobia Inferred from Metagenome-Assembled Genomes
He, Shaomei; Stevens, Sarah L. R.; Chan, Leong-Keat; Bertilsson, Stefan; Glavina del Rio, Tijana; Tringe, Susannah G.; Malmstrom, Rex R.
2017-01-01
ABSTRACT Microbes are critical in carbon and nutrient cycling in freshwater ecosystems. Members of the Verrucomicrobia are ubiquitous in such systems, and yet their roles and ecophysiology are not well understood. In this study, we recovered 19 Verrucomicrobia draft genomes by sequencing 184 time-series metagenomes from a eutrophic lake and a humic bog that differ in carbon source and nutrient availabilities. These genomes span four of the seven previously defined Verrucomicrobia subdivisions and greatly expand knowledge of the genomic diversity of freshwater Verrucomicrobia. Genome analysis revealed their potential role as (poly)saccharide degraders in freshwater, uncovered interesting genomic features for this lifestyle, and suggested their adaptation to nutrient availabilities in their environments. Verrucomicrobia populations differ significantly between the two lakes in glycoside hydrolase gene abundance and functional profiles, reflecting the autochthonous and terrestrially derived allochthonous carbon sources of the two ecosystems, respectively. Interestingly, a number of genomes recovered from the bog contained gene clusters that potentially encode a novel porin-multiheme cytochrome c complex and might be involved in extracellular electron transfer in the anoxic humus-rich environment. Notably, most epilimnion genomes have large numbers of so-called “Planctomycete-specific” cytochrome c-encoding genes, which exhibited distribution patterns nearly opposite to those seen with glycoside hydrolase genes, probably associated with the different levels of environmental oxygen availability and carbohydrate complexity between lakes/layers. Overall, the recovered genomes represent a major step toward understanding the role, ecophysiology, and distribution of Verrucomicrobia in freshwater. IMPORTANCE Freshwater Verrucomicrobia spp. are cosmopolitan in lakes and rivers, and yet their roles and ecophysiology are not well understood, as cultured freshwater Verrucomicrobia spp. are restricted to one subdivision of this phylum. Here, we greatly expanded the known genomic diversity of this freshwater lineage by recovering 19 Verrucomicrobia draft genomes from 184 metagenomes collected from a eutrophic lake and a humic bog across multiple years. Most of these genomes represent the first freshwater representatives of several Verrucomicrobia subdivisions. Genomic analysis revealed Verrucomicrobia to be potential (poly)saccharide degraders and suggested their adaptation to carbon sources of different origins in the two contrasting ecosystems. We identified putative extracellular electron transfer genes and so-called “Planctomycete-specific” cytochrome c-encoding genes and identified their distinct distribution patterns between the lakes/layers. Overall, our analysis greatly advances the understanding of the function, ecophysiology, and distribution of freshwater Verrucomicrobia, while highlighting their potential role in freshwater carbon cycling. PMID:28959738
Gene expression profiles of fin regeneration in loach (Paramisgurnus dabryanus).
Li, Li; He, Jingya; Wang, Linlin; Chen, Weihua; Chang, Zhongjie
2017-11-01
Teleost fins can regenerate accurate position-matched structure and function after amputation. However, we still lack systematic transcriptional profiling and methodologies to understand the molecular basis of fin regeneration. After histological analysis, we established a suppression subtraction hybridization library containing 418 distinct sequences expressed differentially during the process of blastema formation and differentiation in caudal fin regeneration. Gene Ontology and comparative analysis of differential distribution of our data and the reference zebrafish genome showed notable subcategories, including multi-organism processes, response to stimuli, extracellular matrix, antioxidant activity, and cell junction function. KEGG pathway analysis allowed the effective identification of relevant genes in those pathways involved in tissue morphogenesis and regeneration, including the tight junction, cell adhesion molecule, mTOR, and Jak-STAT signaling pathways. From relevant function subcategories and signaling pathways, 78 clones were examined further by Southern-blot hybridization. Seventeen genes were then chosen and characterized using semi-quantitative PCR, and 4 candidate genes were identified, including F11r, Mmp9, Agr2, and one without a match in any database. Real-time quantitative PCR showed obvious expression changes in different periods of caudal fin regeneration. We can assume that the 4 candidates, likely valuable genes associated with fin regeneration, deserve additional attention. Thus, our study demonstrated how to investigate the transcript profiles with an emphasis on bioinformatics intervention and how to identify potential genes related to fin regeneration processes. The results also provide a foundation for further research into genes and molecular mechanisms of fin regeneration. Copyright © 2017 Elsevier B.V. All rights reserved.
Martínez-Castilla, León Patricio; Alvarez-Buylla, Elena R.
2003-01-01
Gene duplication is a substrate of evolution. However, the relative importance of positive selection versus relaxation of constraints in the functional divergence of gene copies is still under debate. Plant MADS-box genes encode transcriptional regulators key in various aspects of development and have undergone extensive duplications to form a large family. We recovered 104 MADS sequences from the Arabidopsis genome. Bayesian phylogenetic trees recover type II lineage as a monophyletic group and resolve a branching sequence of monophyletic groups within this lineage. The type I lineage is comprised of several divergent groups. However, contrasting gene structure and patterns of chromosomal distribution between type I and II sequences suggest that they had different evolutionary histories and support the placement of the root of the gene family between these two groups. Site-specific and site-branch analyses of positive Darwinian selection (PDS) suggest that different selection regimes could have affected the evolution of these lineages. We found evidence for PDS along the branch leading to flowering time genes that have a direct impact on plant fitness. Sites with high probabilities of having been under PDS were found in the MADS and K domains, suggesting that these played important roles in the acquisition of novel functions during MADS-box diversification. Detected sites are targets for further experimental analyses. We argue that adaptive changes in MADS-domain protein sequences have been important for their functional divergence, suggesting that changes within coding regions of transcriptional regulators have influenced phenotypic evolution of plants. PMID:14597714
Palmer, Jeffrey D.; Adams, Keith L.; Cho, Yangrae; Parkinson, Christopher L.; Qiu, Yin-Long; Song, Keming
2000-01-01
We summarize our recent studies showing that angiosperm mitochondrial (mt) genomes have experienced remarkably high rates of gene loss and concomitant transfer to the nucleus and of intron acquisition by horizontal transfer. Moreover, we find substantial lineage-specific variation in rates of these structural mutations and also point mutations. These findings mostly arise from a Southern blot survey of gene and intron distribution in 281 diverse angiosperms. These blots reveal numerous losses of mt ribosomal protein genes but, with one exception, only rare loss of respiratory genes. Some lineages of angiosperms have kept all of their mt ribosomal protein genes whereas others have lost most of them. These many losses appear to reflect remarkably high (and variable) rates of functional transfer of mt ribosomal protein genes to the nucleus in angiosperms. The recent transfer of cox2 to the nucleus in legumes provides both an example of interorganellar gene transfer in action and a starting point for discussion of the roles of mechanistic and selective forces in determining the distribution of genetic labor between organellar and nuclear genomes. Plant mt genomes also acquire sequences by horizontal transfer. A striking example of this is a homing group I intron in the mt cox1 gene. This extraordinarily invasive mobile element has probably been acquired over 1,000 times separately during angiosperm evolution via a recent wave of cross-species horizontal transfers. Finally, whereas all previously examined angiosperm mtDNAs have low rates of synonymous substitutions, mtDNAs of two distantly related angiosperms have highly accelerated substitution rates. PMID:10860957
Ito, T M; Polido, P B; Rampim, M C; Kaschuk, G; Souza, S G H
2014-09-26
Sweet orange (Citrus sinensis) plays an important role in the economy of more than 140 countries, but it is grown in areas with intermittent stressful soil and climatic conditions. The stress tolerance could be addressed by manipulating the ethylene response factor (ERF) transcription factors because they orchestrate plant responses to environmental stress. We performed an in silico study on the ERFs in the expressed sequence tag database of C. sinensis to identify potential genes that regulate plant responses to stress. We identified 108 putative genes encoding protein sequences of the AP2/ERF superfamily distributed within 10 groups of amino acid sequences. Ninety-one genes were assembled from the ERF family containing only one AP2/ERF domain, 13 genes were assembled from the AP2 family containing two AP2/ERF domains, and four other genes were assembled from the RAV family containing one AP2/ERF domain and a B3 domain. Some conserved domains of the ERF family genes were disrupted into a few segments by introns. This irregular distribution of genes in the AP2/ERF superfamily in different plant species could be a result of genomic losses or duplication events in a common ancestor. The in silico gene expression revealed that 67% of AP2/ERF genes are expressed in tissues with usual plant development, and 14% were expressed in stressed tissues. Because the AP2/ERF superfamily is expressed in an orchestrated way, it is possible that the manipulation of only one gene may result in changes in the whole plant function, which could result in more tolerant crops.
Zhu, Qiyun; Kosoy, Michael; Olival, Kevin J.; Dittmar, Katharina
2014-01-01
Bartonellae are mammalian pathogens vectored by blood-feeding arthropods. Although of increasing medical importance, little is known about their ecological past, and host associations are underexplored. Previous studies suggest an influence of horizontal gene transfers in ecological niche colonization by acquisition of host pathogenicity genes. We here expand these analyses to metabolic pathways of 28 Bartonella genomes, and experimentally explore the distribution of bartonellae in 21 species of blood-feeding arthropods. Across genomes, repeated gene losses and horizontal gains in the phospholipid pathway were found. The evolutionary timing of these patterns suggests functional consequences likely leading to an early intracellular lifestyle for stem bartonellae. Comparative phylogenomic analyses discover three independent lineage-specific reacquisitions of a core metabolic gene—NAD(P)H-dependent glycerol-3-phosphate dehydrogenase (gpsA)—from Gammaproteobacteria and Epsilonproteobacteria. Transferred genes are significantly closely related to invertebrate Arsenophonus-, and Serratia-like endosymbionts, and mammalian Helicobacter-like pathogens, supporting a cellular association with arthropods and mammals at the base of extant Bartonella spp. Our studies suggest that the horizontal reacquisitions had a key impact on bartonellae lineage specific ecological and functional evolution. PMID:25106622
Revised phylogeny of the Cellulose Synthase gene superfamily: insights into cell wall evolution.
Little, Alan; Schwerdt, Julian G; Shirley, Neil J; Khor, Shi F; Neumann, Kylie; O'Donovan, Lisa A; Lahnstein, Jelle; Collins, Helen M; Henderson, Marilyn; Fincher, Geoffrey B; Burton, Rachel A
2018-05-20
Cell walls are crucial for the integrity and function of all land plants, and are of central importance in human health, livestock production, and as a source of renewable bioenergy. Many enzymes that mediate the biosynthesis of cell wall polysaccharides are encoded by members of the large cellulose synthase (CesA) gene superfamily. Here, we analyzed 29 sequenced genomes and 17 transcriptomes to revise the phylogeny of the CesA gene superfamily in angiosperms. Our results identify ancestral gene clusters that predate the monocot-eudicot divergence and reveal several novel evolutionary observations, including the expansion of the Poaceae-specific cellulose synthase-like CslF family to the graminids and restiids and the characterisation of a previously unreported eudicot lineage, CslM, that forms a reciprocally monophyletic eudicot-monocot grouping with the CslJ clade. The CslM lineage is widely distributed in eudicots, and the CslJ clade, which was previously thought to be restricted to the Poales, is widely distributed in monocots. Our analyses show that some members of the CslJ lineage, but not the newly identified CslM genes, are capable of directing (1,3;1,4)-β-glucan biosynthesis, which, contrary to current dogma, is not restricted to Poaceae. © 2018 American Society of Plant Biologists. All rights reserved.
Gao, Chao; Sun, Jianlei; Wang, Chongqi; Dong, Yumei; Xiao, Shouhua; Wang, Xingjun; Jiao, Zigao
2017-01-01
The basic/helix-loop-helix (bHLH) proteins constitute a superfamily of transcription factors that are known to play a range of regulatory roles in eukaryotes. Over the past few decades, many bHLH family genes have been well-characterized in model plants, such as Arabidopsis, rice and tomato. However, the bHLH protein family in peanuts has not yet been systematically identified and characterized. Here, 132 and 129 bHLH proteins were identified from two wild ancestral diploid subgenomes of cultivated tetraploid peanuts, Arachis duranensis (AA) and Arachis ipaensis (BB), respectively. Phylogenetic analysis indicated that these bHLHs could be classified into 19 subfamilies. Distribution mapping results showed that peanut bHLH genes were randomly and unevenly distributed within the 10 AA chromosomes and 10 BB chromosomes. In addition, 120 bHLH gene pairs between the AA-subgenome and BB-subgenome were found to be orthologous and 101 of these pairs were highly syntenic in AA and BB chromosomes. Furthermore, we confirmed that 184 bHLH genes expressed in different tissues, 22 of which exhibited tissue-specific expression. Meanwhile, we identified 61 bHLH genes that may be potentially involved in peanut-specific subterranean. Our comprehensive genomic analysis provides a foundation for future functional dissection and understanding of the regulatory mechanisms of bHLH transcription factors in peanuts.
Zhang, Xian; Liu, Xueduan; Liang, Yili; Xiao, Yunhua; Ma, Liyuan; Guo, Xue; Miao, Bo; Liu, Hongwei; Peng, Deliang; Huang, Wenkun; Yin, Huaqun
2017-01-01
The spatial-temporal distribution of populations in various econiches is thought to be potentially related to individual differences in the utilization of nutrients or other resources, but their functional roles in the microbial communities remain elusive. We compared differentiation in gene repertoire and metabolic profiles, with a focus on the potential functional traits of three commonly recognized members (Acidithiobacillus caldus, Leptospirillum ferriphilum, and Sulfobacillus thermosulfidooxidans) in bioleaching heaps. Comparative genomics revealed that intra-species divergence might be driven by horizontal gene transfer. These co-occurring bacteria shared only a few homologous genes, which suggested substantial genomic differences between these organisms. Notably, relatively more genes assigned to the Clusters of Orthologous Groups category [G] (carbohydrate transport and metabolism) were identified in Sulfobacillus thermosulfidooxidans compared to the two other species, which probably indicated its mixotrophic capability to assimilate both organic and inorganic forms of carbon. Further inspection revealed distinctive metabolic capabilities involving carbon assimilation, nitrogen uptake, and iron-sulfur cycling, providing robust evidence for functional differences with respect to nutrient utilization. Therefore, we proposed that the mutual compensation of functionalities among these co-occurring organisms might provide a selective advantage for efficiently utilizing the limited resources in their habitats. Furthermore, it might be favorable to chemoautotrophs' lifestyles to form mutualistic interactions with these heterotrophic and/or mixotrophic acidophiles, whereby the latter could degrade organic compounds to effectively detoxify the environments. Collectively, the findings shed light on the genetic traits and potential metabolic activities of these organisms, and enable us to make some inferences about genomic and functional differences that might allow them to co-exist. PMID:28529505
Valentin-Kahan, Adrián; García-Tejedor, Gabriela B.; Robello, Carlos; Trujillo-Cenóz, Omar; Russo, Raúl E.; Alvarez-Valin, Fernando
2017-01-01
Slider turtles are the only known amniotes with self-repair mechanisms of the spinal cord that lead to substantial functional recovery. Their strategic phylogenetic position makes them a relevant model to investigate the peculiar genetic programs that allow anatomical reconnection in some vertebrate groups but are absent in others. Here, we analyze the gene expression profile of the response to spinal cord injury (SCI) in the turtle Trachemys scripta elegans. We found that this response comprises more than 1000 genes affecting diverse functions: reaction to ischemic insult, extracellular matrix re-organization, cell proliferation and death, immune response, and inflammation. Genes related to synapses and cholesterol biosynthesis are down-regulated. The analysis of the evolutionary distribution of these genes shows that almost all are present in most vertebrates. Additionally, we failed to find genes that were exclusive of regenerating taxa. The comparison of expression patterns among species shows that the response to SCI in the turtle is more similar to that of mice and non-regenerative Xenopus than to Xenopus during its regenerative stage. This observation, along with the lack of conserved “regeneration genes” and the current accepted phylogenetic placement of turtles (sister group of crocodilians and birds), indicates that the ability of spinal cord self-repair of turtles does not represent the retention of an ancestral vertebrate character. Instead, our results suggest that turtles developed this capability from a non-regenerative ancestor (i.e., a lineage specific innovation) that was achieved by re-organizing gene expression patterns on an essentially non-regenerative genetic background. Among the genes activated by SCI exclusively in turtles, those related to anoxia tolerance, extracellular matrix remodeling, and axonal regrowth are good candidates to underlie functional recovery. PMID:28223917
Ji, Jing; Wang, Gang; Wang, Jiehua; Wang, Ping
2009-02-01
Carotenoids are red, yellow and orange pigments, which are widely distributed in nature and are especially abundant in yellow-orange fruits and vegetables and dark green leafy vegetables. Carotenoids are essential for photosynthesis and photoprotection in plant life and also have different beneficial effects in humans and animals (van den Berg et al. 2000). For example, beta-carotene plays an essential role as the main dietary source of vitamin A. To obtain further insight into beta-carotene biosynthesis in two important economic plant species, Lycium barbarum and Gentiana lutea L., and to investigate and prioritize potential genetic engineering targets in the pathway, the effects of five carotenogenic genes from these two species, encoding geranylgeranyl diphosphate synthase, phytoene synthase, delta-carotene desaturase, lycopene beta-cyclase, and lycopene epsilon-cyclase, were functionally analyzed in transgenic tobacco (Nicotiana tabacum) plants. All transgenic tobacco plants constitutively expressing these genes showed enhanced beta-carotene contents in their leaves and flowers to different extents. The additive effects of co-ordinate expression of double transgenes have also been investigated.
Limited mitogenomic degradation in response to a parasitic lifestyle in Orobanchaceae
Fan, Weishu; Zhu, Andan; Kozaczek, Melisa; Shah, Neethu; Pabón-Mora, Natalia; González, Favio; Mower, Jeffrey P.
2016-01-01
In parasitic plants, the reduction in plastid genome (plastome) size and content is driven predominantly by the loss of photosynthetic genes. The first completed mitochondrial genomes (mitogenomes) from parasitic mistletoes also exhibit significant degradation, but the generality of this observation for other parasitic plants is unclear. We sequenced the complete mitogenome and plastome of the hemiparasite Castilleja paramensis (Orobanchaceae) and compared them with additional holoparasitic, hemiparasitic and nonparasitic species from Orobanchaceae. Comparative mitogenomic analysis revealed minimal gene loss among the seven Orobanchaceae species, indicating the retention of typical mitochondrial function among Orobanchaceae species. Phylogenetic analysis demonstrated that the mobile cox1 intron was acquired vertically from a nonparasitic ancestor, arguing against a role for Orobanchaceae parasites in the horizontal acquisition or distribution of this intron. The C. paramensis plastome has retained nearly all genes except for the recent pseudogenization of four subunits of the NAD(P)H dehydrogenase complex, indicating a very early stage of plastome degradation. These results lend support to the notion that loss of ndh gene function is the first step of plastome degradation in the transition to a parasitic lifestyle. PMID:27808159
The mouse bagpipe gene controls development of axial skeleton, skull, and spleen
Lettice, Laura A.; Purdie, Lorna A.; Carlson, Geoffrey J.; Kilanowski, Fiona; Dorin, Julia; Hill, Robert E.
1999-01-01
The mouse Bapx1 gene is homologous to the Drosophila homeobox containing bagpipe (bap) gene. A shared characteristic of the genes in these two organisms is expression in gut mesoderm. In Drosophila, bap functions to specify the formation of the musculature of the midgut. To determine the function of the mammalian cognate, we targeted a mutation into the Bapx1 locus. Bapx1, similar to Drosophila, does have a conspicuous role in gut mesoderm; however, this appears to be restricted to development of the spleen. In addition, Bapx1 has a major role in the development of the axial skeleton. Loss of Bapx1 affects the distribution of sclerotomal cells, markedly reducing the number that appear ventromedially around the notochord. Subsequently, the structures in the midaxial region, the intervertebral discs, and centra of the vertebral bodies, fail to form. Abnormalities are also found in those bones of the basal skull (basioccipital and basisphenoid bones) associated with the notochord. We postulate that Bapx1 confers the capacity of cells to interact with the notochord, effecting inductive interactions essential for development of the vertebral column and chondrocranium. PMID:10449756
Ling, Hong; Zeng, Xu; Guo, Shunxing
2016-01-01
Late embryogenesis abundant (LEA) proteins, a diverse family, accumulate during seed desiccation in the later stages of embryogenesis. LEA proteins are associated with tolerance to abiotic stresses, such as drought, salinity and high or cold temperature. Here, we report the first comprehensive survey of the LEA gene family in Dendrobium officinale, an important and widely grown medicinal orchid in China. Based on phylogenetic relationships with the complete set of Arabidopsis and Oryza LEA proteins, 17 genes encoding D. officinale LEAs (DofLEAs) were identified and their deduced proteins were classified into seven groups. The motif composition of these deduced proteins was correlated with the gene structure found in each LEA group. Our results reveal the DofLEA genes are widely distributed and expressed in tissues. Additionally, 11 genes from different groups were introduced into Escherichia coli to assess the functions of DofLEAs. Expression of 6 and 7 DofLEAs in E. coli improved growth performance compared with the control under salt and heat stress, respectively. Based on qPCR data, all of these genes were up-regulated in various tissues following exposure to salt and heat stresses. Our results suggest that DofLEAs play an important role in responses to abiotic stress. PMID:28004781
Distribution and regulation of stochasticity and plasticity in Saccharomyces cerevisiae
Dar, R. D.; Karig, D. K.; Cooke, J. F.; ...
2010-09-01
Stochasticity is an inherent feature of complex systems with nanoscale structure. In such systems information is represented by small collections of elements (e.g. a few electrons on a quantum dot), and small variations in the populations of these elements may lead to big uncertainties in the information. Unfortunately, little is known about how to work within this inherently noisy environment to design robust functionality into complex nanoscale systems. Here, we look to the biological cell as an intriguing model system where evolution has mediated the trade-offs between fluctuations and function, and in particular we look at the relationships and trade-offs between stochastic and deterministic responses in the gene expression of budding yeast (Saccharomyces cerevisiae). We find gene regulatory arrangements that control the stochastic and deterministic components of expression, and show that genes that have evolved to respond to stimuli (stress) in the most strongly deterministic way exhibit the most noise in the absence of the stimuli. We show that this relationship is consistent with a bursty 2-state model of gene expression, and demonstrate that this regulatory motif generates the most uncertainty in gene expression when there is the greatest uncertainty in the optimal level of gene expression.
Evolution of the snake body form reveals homoplasy in amniote Hox gene function.
Head, Jason J; Polly, P David
2015-04-02
Hox genes regulate regionalization of the axial skeleton in vertebrates, and changes in their expression have been proposed to be a fundamental mechanism driving the evolution of new body forms. The origin of the snake-like body form, with its deregionalized pre-cloacal axial skeleton, has been explained as either homogenization of Hox gene expression domains, or retention of standard vertebrate Hox domains with alteration of downstream expression that suppresses development of distinct regions. Both models assume a highly regionalized ancestor, but the extent of deregionalization of the primaxial domain (vertebrae, dorsal ribs) of the skeleton in snake-like body forms has never been analysed. Here we combine geometric morphometrics and maximum-likelihood analysis to show that the pre-cloacal primaxial domain of elongate, limb-reduced lizards and snakes is not deregionalized compared with limbed taxa, and that the phylogenetic structure of primaxial morphology in reptiles does not support a loss of regionalization in the evolution of snakes. We demonstrate that morphometric regional boundaries correspond to mapped gene expression domains in snakes, suggesting that their primaxial domain is patterned by a normally functional Hox code. Comparison of primaxial osteology in fossil and modern amniotes with Hox gene distributions within Amniota indicates that a functional, sequentially expressed Hox code patterned a subtle morphological gradient along the anterior-posterior axis in stem members of amniote clades and extant lizards, including snakes. The highly regionalized skeletons of extant archosaurs and mammals result from independent evolution in the Hox code and do not represent ancestral conditions for clades with snake-like body forms. The developmental origin of snakes is best explained by decoupling of the primaxial and abaxial domains and by increases in somite number, not by changes in the function of primaxial Hox genes.
A statistical method for measuring activation of gene regulatory networks.
Esteves, Gustavo H; Reis, Luiz F L
2018-06-13
Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and activation measurements for two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
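The permutation step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the maigesPack procedure: the statistic used here (sum over network genes of the absolute difference in group means) and the names `activation_score` and `permutation_p_value` are assumptions made only for illustration.

```python
import random
from statistics import mean


def activation_score(expr, labels):
    """Hypothetical statistic (an assumption, not the published one):
    sum over genes of |mean(group 1) - mean(group 0)|."""
    score = 0.0
    for values in expr.values():
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        score += abs(mean(g1) - mean(g0))
    return score


def permutation_p_value(expr, labels, n_perm=2000, seed=0):
    """Estimate the null distribution of the statistic by shuffling
    sample labels, then return (observed score, one-sided p-value)."""
    rng = random.Random(seed)
    observed = activation_score(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # group sizes are preserved by shuffling
        if activation_score(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: expression of two network genes in 5 + 5 samples.
toy_expr = {
    "geneA": [1.0, 1.1, 0.9, 1.0, 1.2, 3.0, 3.1, 2.9, 3.2, 3.0],
    "geneB": [2.0, 2.1, 1.9, 2.2, 2.0, 4.1, 3.9, 4.0, 4.2, 4.0],
}
toy_labels = [0] * 5 + [1] * 5
toy_observed, toy_p = permutation_p_value(toy_expr, toy_labels)
```

Shuffling labels rather than expression values keeps the correlation between genes of the same sample intact, which is the usual reason for permuting at the sample level.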
Dong, Chen; Hu, Huigang; Xie, Jianghui
2016-12-01
DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.
Ocean biogeochemistry modeled with emergent trait-based genomics.
Coles, V J; Stukel, M R; Brooks, M T; Burd, A; Crump, B C; Moran, M A; Paul, J H; Satinsky, B M; Yager, P L; Zielinski, B L; Hood, R R
2017-12-01
Marine ecosystem models have advanced to incorporate metabolic pathways discovered with genomic sequencing, but direct comparisons between models and "omics" data are lacking. We developed a model that directly simulates metagenomes and metatranscriptomes for comparison with observations. Model microbes were randomly assigned genes for specialized functions, and communities of 68 species were simulated in the Atlantic Ocean. Unfit organisms were replaced, and the model self-organized to develop community genomes and transcriptomes. Emergent communities from simulations that were initialized with different cohorts of randomly generated microbes all produced realistic vertical and horizontal ocean nutrient, genome, and transcriptome gradients. Thus, the library of gene functions available to the community, rather than the distribution of functions among specific organisms, drove community assembly and biogeochemical gradients in the model ocean. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling
2015-03-01
Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by the first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
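As a rough illustration of the Expectation-Maximization clustering described above, the following Python sketch fits a mixture of Poisson distributions to read counts over time. It is a deliberate simplification and not the authors' model: time points are treated as independent (no multivariate Poisson coupling and no first-order autoregressive structure), the number of clusters is fixed rather than chosen by model selection, and all function names are hypothetical.

```python
import math

EPS = 1e-9  # floor that keeps logarithms and divisions well defined


def log_poisson(k, lam):
    """Log of the Poisson probability mass function."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def farthest_point_init(counts, n_clusters):
    """Deterministic start: gene 0, then the gene farthest from those chosen."""
    chosen = [0]
    while len(chosen) < n_clusters:
        def dist_to_chosen(i):
            return min(sum((a - b) ** 2 for a, b in zip(counts[i], counts[j]))
                       for j in chosen)
        chosen.append(max(range(len(counts)), key=dist_to_chosen))
    # A small offset avoids zero rates at the start.
    return [[c + 0.5 for c in counts[i]] for i in chosen]


def em_poisson_mixture(counts, n_clusters, n_iter=50):
    """counts[g][t] is the read count of gene g at time point t."""
    n_genes, n_times = len(counts), len(counts[0])
    rates = farthest_point_init(counts, n_clusters)
    weights = [1.0 / n_clusters] * n_clusters
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed in log space.
        resp = []
        for row in counts:
            logp = [math.log(weights[k])
                    + sum(log_poisson(c, rates[k][t]) for t, c in enumerate(row))
                    for k in range(n_clusters)]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: responsibility-weighted mixing proportions and rates.
        for k in range(n_clusters):
            nk = max(sum(r[k] for r in resp), EPS)
            weights[k] = max(nk / n_genes, EPS)
            for t in range(n_times):
                weighted = sum(r[k] * row[t] for r, row in zip(resp, counts))
                rates[k][t] = max(weighted / nk, EPS)
    labels = [max(range(n_clusters), key=lambda k: r[k]) for r in resp]
    return weights, rates, labels


# Toy counts: three genes rising over time, three genes falling.
toy_counts = [
    [2, 5, 11, 20], [3, 4, 10, 22], [1, 6, 9, 19],
    [21, 10, 5, 2], [19, 11, 4, 3], [20, 9, 6, 1],
]
toy_weights, toy_rates, toy_labels = em_poisson_mixture(toy_counts, n_clusters=2)
```

The farthest-point initialization is chosen here only to make the sketch deterministic; random restarts are a common alternative.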
Verma, Jitendra Kumar; Wardhan, Vijay; Singh, Deepali; Chakraborty, Subhra; Chakraborty, Niranjan
2018-01-01
Architectural proteins play key roles in genome construction and regulate the expression of many genes, albeit the modulation of genome plasticity by these proteins is largely unknown. A critical screening of the architectural proteins in five crop species, viz., Oryza sativa, Zea mays, Sorghum bicolor, Cicer arietinum, and Vitis vinifera, and in the model plant Arabidopsis thaliana along with evolutionary relevant species such as Chlamydomonas reinhardtii, Physcomitrella patens, and Amborella trichopoda, revealed 9, 20, 10, 7, 7, 6, 1, 4, and 4 Alba (acetylation lowers binding affinity) genes, respectively. A phylogenetic analysis of the genes and of their counterparts in other plant species indicated evolutionary conservation and diversification. In each group, the structural components of the genes and motifs showed significant conservation. The chromosomal location of the Alba genes of rice (OsAlba), showed an unequal distribution on 8 of its 12 chromosomes. The expression profiles of the OsAlba genes indicated a distinct tissue-specific expression in the seedling, vegetative, and reproductive stages. The quantitative real-time PCR (qRT-PCR) analysis of the OsAlba genes confirmed their stress-inducible expression under multivariate environmental conditions and phytohormone treatments. The evaluation of the regulatory elements in 68 Alba genes from the 9 species studied led to the identification of conserved motifs and overlapping microRNA (miRNA) target sites, suggesting the conservation of their function in related proteins and a divergence in their biological roles across species. The 3D structure and the prediction of putative ligands and their binding sites for OsAlba proteins offered a key insight into the structure–function relationship. These results provide a comprehensive overview of the subtle genetic diversification of the OsAlba genes, which will help in elucidating their functional role in plants. PMID:29597290
Ancient genes establish stress-induced mutation as a hallmark of cancer.
Cisneros, Luis; Bussey, Kimberly J; Orr, Adam J; Miočević, Milica; Lineweaver, Charles H; Davies, Paul
2017-01-01
Cancer is sometimes depicted as a reversion to single cell behavior in cells adapted to live in a multicellular assembly. If this is the case, one would expect that mutation in cancer disrupts functional mechanisms that suppress cell-level traits detrimental to multicellularity. Such mechanisms should have evolved with or after the emergence of multicellularity. This leads to two related, but distinct hypotheses: 1) Somatic mutations in cancer will occur in genes that are younger than the emergence of multicellularity (1000 million years [MY]); and 2) genes that are frequently mutated in cancer and whose mutations are functionally important for the emergence of the cancer phenotype evolved within the past 1000 million years, and thus would exhibit an age distribution that is skewed to younger genes. In order to investigate these hypotheses we estimated the evolutionary ages of all human genes and then studied the probability of mutation and their biological function in relation to their age and genomic location for both normal germline and cancer contexts. We observed that under a model of uniform random mutation across the genome, controlled for gene size, genes less than 500 MY were more frequently mutated in both cases. Paradoxically, causal genes, defined in the COSMIC Cancer Gene Census, were depleted in this age group. When we used functional enrichment analysis to explain this unexpected result we discovered that COSMIC genes with recessive disease phenotypes were enriched for DNA repair and cell cycle control. The non-mutated genes in these pathways are orthologous to those underlying stress-induced mutation in bacteria, which results in the clustering of single nucleotide variations. COSMIC genes were less common in regions where the probability of observing mutational clusters is high, although they are approximately 2-fold more likely to harbor mutational clusters compared to other human genes. 
Our results suggest this ancient mutational response to stress that evolved among prokaryotes was co-opted to maintain diversity in the germline and immune system, while the original phenotype is restored in cancer. Reversion to a stress-induced mutational response is a hallmark of cancer that allows for effectively searching "protected" genome space where genes causally implicated in cancer are located and underlies the high adaptive potential and concomitant therapeutic resistance that is characteristic of cancer. PMID:28441401
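The abstract above compares mutation frequency between age groups under a model of uniform random mutation controlled for gene size. A minimal sketch of that kind of comparison follows, assuming the expected count for a gene is simply proportional to its length; the function names, the 500 MY cutoff argument, and the toy numbers are all illustrative assumptions and are not taken from the paper's pipeline.

```python
def expected_counts(gene_lengths, total_mutations):
    """Expected mutations per gene if every base pair is equally likely to mutate."""
    total_length = sum(gene_lengths.values())
    return {gene: total_mutations * length / total_length
            for gene, length in gene_lengths.items()}


def observed_over_expected(gene_lengths, gene_ages_my, observed, cutoff_my=500):
    """Observed/expected mutation ratio for genes younger vs. older than the cutoff."""
    expected = expected_counts(gene_lengths, sum(observed.values()))
    groups = {"younger": [], "older": []}
    for gene, age in gene_ages_my.items():
        groups["younger" if age < cutoff_my else "older"].append(gene)
    ratios = {}
    for label, genes in groups.items():
        exp = sum(expected[g] for g in genes)
        obs = sum(observed.get(g, 0) for g in genes)
        # A ratio above 1 means the group is mutated more often than its size predicts.
        ratios[label] = obs / exp if exp > 0 else float("nan")
    return ratios


# Toy, made-up inputs (hypothetical gene names, lengths in bp, ages in MY).
toy_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000, "geneD": 4000}
toy_ages = {"geneA": 200, "geneB": 300, "geneC": 800, "geneD": 1500}
toy_observed = {"geneA": 3, "geneB": 7, "geneC": 4, "geneD": 6}
toy_ratios = observed_over_expected(toy_lengths, toy_ages, toy_observed)
```

With these toy inputs the younger group has a ratio above 1 and the older group a ratio below 1; a real analysis would also need a significance test, which this sketch omits.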
Green, Benjamin B; Houseman, E Andres; Johnson, Kevin C; Guerin, Dylan J; Armstrong, David A; Christensen, Brock C; Marsit, Carmen J
2016-08-01
The conversion of cytosine to 5-methylcytosine (5mC) is an important regulator of gene expression. 5mC may be enzymatically converted to 5-hydroxymethylcytosine (5hmC), with a potentially distinct regulatory function. We sought to investigate these cytosine modifications and their effect on gene expression by parallel processing of genomic DNA using bisulfite and oxidative bisulfite conversion in conjunction with RNA sequencing. Although values of 5hmC across the placental genome were generally low, we identified ∼21,000 loci with consistently elevated levels of 5-hydroxymethylcytosine. Absence of 5hmC was observed in CpG islands and, to a greater extent, in non-CpG island-associated regions. 5hmC was enriched within poised enhancers, and depleted within active enhancers, as defined by H3K27ac and H3K4me1 measurements. 5hmC and 5mC were significantly elevated in transcriptionally silent genes when compared with actively transcribed genes. 5hmC was positively associated with transcription in actively transcribed genes only. Our data suggest that dynamic cytosine regulation, associated with transcription, provides the most complete epigenomic landscape of the human placenta, and will be useful for future studies of the placental epigenome.-Green, B. B., Houseman, E. A., Johnson, K. C., Guerin, D. J., Armstrong, D. A., Christensen, B. C., Marsit, C. J. Hydroxymethylation is uniquely distributed within term placenta, and is associated with gene expression. © FASEB. PMID:27118675
Hindt, Maria; Socha, Amanda L.; Zuber, Hélène
2013-01-01
Here we present approaches for using multi-elemental imaging (specifically synchrotron X-ray fluorescence microscopy, SXRF) in ionomics, with examples using the model plant Arabidopsis thaliana. The complexity of each approach depends on the amount of a priori information available for the gene and/or phenotype being studied. Three approaches are outlined, which apply to experimental situations where a gene of interest has been identified but has an unknown phenotype (Phenotyping), an unidentified gene is associated with a known phenotype (Gene Cloning) and finally, a Screening approach, where both gene and phenotype are unknown. These approaches make use of open-access, online databases with which plant molecular genetics researchers working in the model plant Arabidopsis will be familiar, in particular the Ionomics Hub and online transcriptomic databases such as the Arabidopsis eFP browser. The approaches and examples we describe are based on the assumption that altering the expression of ion transporters can result in changes in elemental distribution. We provide methodological details on using elemental imaging to aid or accelerate gene functional characterization by narrowing down the search for candidate genes to the tissues in which elemental distributions are altered. We use synchrotron X-ray microprobes as a technique of choice, which can now be used to image all parts of an Arabidopsis plant in a hydrated state. We present elemental images of leaves, stem, root, siliques and germinating hypocotyls. PMID:23912758
Distribution of RPTLN Genes Across Reptilia: Hypothesized Role for RPTLN in the Evolution of SVMPs.
Sanz-Soler, Raquel; Sanz, Libia; Calvete, Juan J
2016-11-01
We report the cloning, full-length sequencing, and broad distribution of reptile-specific RPTLN genes across a number of Anapsida (Testudines), Diapsida (Serpentes, Sauria), and Archosauria (Crocodylia) taxa. The remarkable structural conservation of RPTLN genes in species that had a common ancestor more than 250 million years ago, their low transcriptional level, and the lack of evidence for RPTLN translation in any reptile organ investigated, suggest for this ancient gene family a yet elusive function as long noncoding RNAs. The high conservation in extant snake venom metalloproteinases (SVMPs) of the signal peptide sequence coded for by RPTLN genes strongly suggests that this region may have played a key role in the recruitment and restricted expression of SVMP genes in the venom gland of Caenophidian snakes, some 60-50 Mya. More recently, 23-16 Mya, the neofunctionalization of an RPTLN copy in the venom gland of snakes of the genera Macrovipera and Daboia marked the beginning of the evolutionary history of a new family of disintegrins, the α1β1-collagen binding antagonists, short-RTS/KTS disintegrins. This evolutionary scenario predicts that venom gland RPTLN and SVMP genes may share tissue-specific regulatory elements. Future genomic studies should support or refute this hypothesis. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology.
Esaki, Masahiro; Hoshijima, Kazuyuki; Nakamura, Nobuhiro; Munakata, Keijiro; Tanaka, Mikiko; Ookata, Kayoko; Asakawa, Kazuhide; Kawakami, Koichi; Wang, Weiyi; Weinberg, Eric S.; Hirose, Shigehisa
2009-01-01
Mitochondrion-rich cells (MRCs), or ionocytes, play a central role in aquatic species, maintaining body fluid ionic homeostasis by actively taking up or excreting ions. Since their first description in 1932 in eel gills, extensive morphological and physiological analyses have yielded important insights into ionocyte structure and function, but understanding the developmental pathway specifying these cells remains an ongoing challenge. We previously succeeded in identifying a key transcription factor, Foxi3a, in zebrafish larvae by database mining. In the present study, we analyzed a zebrafish mutant, quadro (quo), deficient in foxi1 gene expression and found that foxi1 is essential for development of an MRC subpopulation rich in vacuolar-type H+-ATPase (vH-MRC). foxi1 acts upstream of Delta-Notch signaling that determines sporadic distribution of vH-MRC and regulates foxi3a expression. Through gain- and loss-of-function assays and cell transplantation experiments, we further clarified that (1) the expression level of foxi3a is maintained by a positive feedback loop between foxi3a and its downstream gene gcm2 and (2) Foxi3a functions cell-autonomously in the specification of vH-MRC. These observations provide a better understanding of the differentiation and distribution of the vH-MRC subtype. PMID:19268451
High level of microsynteny and purifying selection affect the evolution of WRKY family in Gramineae.
Jin, Jing; Kong, Jingjing; Qiu, Jianle; Zhu, Huasheng; Peng, Yuancheng; Jiang, Haiyang
2016-01-01
The WRKY gene family, which encodes proteins involved in regulating diverse developmental stages, is one of the largest families of transcription factors in higher plants. In this study, by searching for interspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that 35 chromosomal segments of subgroup I genes of the WRKY family (WRKY I) in four Gramineae species (Brachypodium, rice, sorghum, and maize) formed eight orthologous groups. After a stepwise gene-by-gene reciprocal comparison of all the protein sequences in the WRKY I gene flanking areas, highly conserved regions of microsynteny were found in the four Gramineae species. Most gene pairs showed conserved orientation within syntenic genome regions. Furthermore, tandem duplication events played the leading role in gene expansion. Finally, environmental selection pressure analysis indicated strong purifying selection for the WRKY I genes in Gramineae, which may have been followed by gene loss and rearrangement. The results presented in this study provide basic information on Gramineae WRKY I genes and form the foundation for future functional studies of these genes. The high level of microsynteny in the four grass species provides further evidence that a large-scale genome duplication event predated speciation.
Genome-wide investigation and transcriptome analysis of the WRKY gene family in Gossypium.
Ding, Mingquan; Chen, Jiadong; Jiang, Yurong; Lin, Lifeng; Cao, YueFen; Wang, Minhua; Zhang, Yuting; Rong, Junkang; Ye, Wuwei
2015-02-01
WRKY transcription factors play important roles in various stress responses in diverse plant species. In cotton, this family has not been well studied, especially in relation to fiber development. Here, the genomes and transcriptomes of Gossypium raimondii and Gossypium arboreum were investigated to identify fiber development-related WRKY genes. This represents the first comprehensive comparative study of WRKY transcription factors in both diploid A and D cotton species. In total, 112 G. raimondii and 109 G. arboreum WRKY genes were identified. No significant gene structure or domain alterations were detected between the two species, but many SNPs were distributed unequally in exon and intron regions. Physical mapping revealed that the WRKY genes in G. arboreum were not located in the corresponding chromosomes of G. raimondii, suggesting great chromosome rearrangement in the diploid cotton genomes. The cotton WRKY genes, especially subgroups I and II, have expanded through multiple whole genome duplications and tandem duplications compared with other plant species. Sequence comparison showed many functionally divergent sites between WRKY subgroups, while the genes within each group are under strong purifying selection. Transcriptome analysis suggested that many WRKY genes participate in specific fiber development processes such as fiber initiation, elongation and maturation with different expression patterns between species. Complex WRKY gene expression, such as differential Dt and At allelic gene expression in G. hirsutum and alternative splicing events, was also observed in both diploid and tetraploid cottons during the fiber development process. In conclusion, this study provides important information on the evolution and function of the WRKY gene family in cotton species.
Horizontal transfer of a eukaryotic plastid-targeted protein gene to cyanobacteria
Rogers, Matthew B; Patron, Nicola J; Keeling, Patrick J
2007-01-01
Background: Horizontal or lateral transfer of genetic material between distantly related prokaryotes has been shown to play a major role in the evolution of bacterial and archaeal genomes, but exchange of genes between prokaryotes and eukaryotes is not as well understood. In particular, gene flow from eukaryotes to prokaryotes is rarely documented with strong support, which is unusual since prokaryotic genomes appear to readily accept foreign genes. Results: Here, we show that abundant marine cyanobacteria in the related genera Synechococcus and Prochlorococcus acquired a key Calvin cycle/glycolytic enzyme from a eukaryote. Two non-homologous forms of fructose bisphosphate aldolase (FBA) are characteristic of eukaryotes and prokaryotes respectively. However, a eukaryotic gene has been inserted immediately upstream of the ancestral prokaryotic gene in several strains (ecotypes) of Synechococcus and Prochlorococcus. In one lineage this new gene has replaced the ancestral gene altogether. The eukaryotic gene is most closely related to the plastid-targeted FBA from red algae. This eukaryotic-type FBA once replaced the plastid/cyanobacterial type in photosynthetic eukaryotes, hinting at a possible functional advantage in Calvin cycle reactions. The strains that now possess this eukaryotic FBA are scattered across the tree of Synechococcus and Prochlorococcus, perhaps because the gene has been transferred multiple times among cyanobacteria, or more likely because it has been selectively retained only in certain lineages. Conclusion: A gene for plastid-targeted FBA has been transferred from red algae to cyanobacteria, where it has inserted itself beside its non-homologous, functional analogue. Its current distribution in Prochlorococcus and Synechococcus is punctate, suggesting a complex history since its introduction to this group. PMID:17584924
NASA Astrophysics Data System (ADS)
Ceja Navarro, J. A.; Karaoz, U.; White, R. A., III; Lipton, M. S.; Adkins, J.; Mayali, X.; Blackwell, M.; Pett-Ridge, J.; Brodie, E.; Hao, Z.
2015-12-01
Odontotaenius disjunctus is a wood-feeding beetle that processes large amounts of hardwoods and plays an important role in forest carbon cycling. In its gut, plant material is transformed into simple molecules by sequential processing during passage through the insect's digestive system. In this study, we used multiple 'omics approaches to analyze the distribution of microbial communities and their specific functions in lignocellulose deconstruction within the insect's gut. Fosmid clones were selected and sequenced from a pool of clones based on their expression of plant polymer degrading enzymes, allowing the identification of a wide range of carbohydrate degrading enzymes. Comparison of metagenomes of all gut regions demonstrated the distribution of genes across the beetle gut. Cellulose, starch, and xylan degradation genes were particularly abundant in the midgut and posterior hindgut. Genes involved in hydrogenotrophic production of methane and nitrogenases were more abundant in the anterior hindgut. Assembled contigs were binned into 127 putative genomes representing Bacteria, Archaea, Fungi and Nematodes. Eleven complete genomes were reconstructed, allowing the identification of linked functions/traits, including organisms with cellulosomes, and a combined potential for cellulose, xylan and starch hydrolysis and nitrogen fixation. A metaproteomic study was conducted to test the expression of the pathways identified in the metagenomic study. Preliminary analyses suggest enrichment of pathways related to hemicellulosic degradation. A complete xylan degradation pathway was reconstructed and GC-MS/MS based metabolomics identified xylobiose and xylose as major metabolite pools. To relate microbial identity to function in the beetle gut, Chip-SIP isotope tracing was conducted with RNA extracted from beetles fed 13C-cellulose. Multiple 13C enriched bacterial groups were detected, mainly in the midgut.
Our multi-omics approach has allowed us to characterize the contribution of the gut microbiota to the transformation of woody biomass and the distribution of microbial-driven function in the beetle's gut. Through the study of such a highly evolved polymer deconstruction and fermentation system, we aim to identify criteria for the design of improved lignocellulosic fuel production processes.
Schöner, Tim A; Gassel, Sören; Osawa, Ayako; Tobias, Nicholas J; Okuno, Yukari; Sakakibara, Yui; Shindo, Kazutoshi; Sandmann, Gerhard; Bode, Helge B
2016-02-02
Bacterial pigments of the aryl polyene type are structurally similar to the well-known carotenoids with respect to their polyene systems. Their biosynthetic gene cluster is widespread in taxonomically distant bacteria, and four classes of such pigments have been found. Here we report the structure elucidation of the aryl polyene/dialkylresorcinol hybrid pigments of Variovorax paradoxus B4 by HPLC-UV-MS, MALDI-MS and NMR. Furthermore, we show for the first time that this pigment class protects the bacterium from reactive oxygen species, similarly to what is known for carotenoids. An analysis of the distribution of biosynthetic genes for aryl polyenes and carotenoids in bacterial genomes is presented; it shows a complementary distribution of these protective pigments in bacteria. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Noise-induced multistability in the regulation of cancer by genes and pseudogenes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petrosyan, K. G.; Hu, Chin-Kun; National Center for Theoretical Sciences, National Tsing Hua University, Hsinchu 30013, Taiwan
2016-07-28
We extend a previously introduced model of stochastic gene regulation of cancer to a nonlinear case having both gene and pseudogene messenger RNAs (mRNAs) self-regulated. The model consists of stochastic Boolean genetic elements and possesses noise-induced multistability (multimodality). We obtain analytical expressions for probabilities for the case of a constant but finite number of microRNA molecules which act as a noise source for the competing gene and pseudogene mRNAs. The probability distribution functions display both the global bistability regime as well as even-odd number oscillations for a certain range of model parameters. Statistical characteristics of the mRNA level fluctuations are evaluated. The obtained results of the extended model advance our understanding of the process of stochastic gene and pseudogene expression that is crucial in the regulation of cancer.
Bayesian median regression for temporal gene expression data
NASA Astrophysics Data System (ADS)
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T²-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
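One way to write down the kind of model the abstract describes is sketched below. The abstract does not give the paper's exact parameterization, so the symbols (polynomial degree p, baseline coefficients β, condition-specific coefficients γ, reference condition c = 0) are assumptions introduced here for illustration only.

```latex
% Expression of one gene in condition c at time t (illustrative notation):
y_{c}(t) \;=\; \sum_{j=0}^{p} \beta_{j}\, t^{j}
        \;+\; \sum_{j=0}^{p} \gamma_{c,j}\, t^{j}
        \;+\; \varepsilon ,
\qquad \operatorname{median}(\varepsilon) = 0 ,
\qquad \gamma_{0,j} = 0 \ \text{for all } j .

% Median regression replaces squared error with absolute error:
(\hat\beta, \hat\gamma) \;=\; \arg\min_{\beta,\gamma}
   \sum_{i} \Bigl|\, y_{i} - \sum_{j=0}^{p} \bigl(\beta_{j} + \gamma_{c_i,j}\bigr)\, t_{i}^{j} \Bigr| .
```

In this notation, γ with j = 0 is a condition effect and γ with j ≥ 1 is a time-by-condition interaction, so a gene whose temporal profile does not differ across conditions corresponds to all γ being zero. The absolute-error criterion is what makes the fit less sensitive to outlying expression values than a mean-based test.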
Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady
2017-02-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
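To illustrate what "variable length mutation clusters" at several scales can look like, here is a deliberately simple stand-in: mutation positions are grouped whenever neighbouring positions lie within a maximum gap, and the grouping is repeated for several gap sizes. This is not the M2C algorithm from the repository cited above; the function names, gap values, and positions are hypothetical.

```python
def gap_clusters(positions, max_gap, min_size=2):
    """Group sorted positions whose neighbours are at most max_gap apart.

    Returns (start, end, count) tuples for groups with at least min_size members.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current group.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


def multiscale_clusters(positions, gaps=(2, 10, 50)):
    """Run the gap-based grouping once per length scale."""
    return {gap: gap_clusters(positions, gap) for gap in gaps}


# Made-up amino-acid positions of mutations in a hypothetical protein.
toy_positions = [12, 13, 14, 40, 45, 48, 130, 131, 300]
toy_clusters = multiscale_clusters(toy_positions)
```

Small gaps recover tight hotspots, while larger gaps merge them into domain-sized regions, which is the intuition behind looking at more than one scale.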
Imanian, Behzad; Keeling, Patrick J
2007-01-01
Background: The dinoflagellates Durinskia baltica and Kryptoperidinium foliaceum are distinguished by the presence of a tertiary plastid derived from a diatom endosymbiont. The diatom is fully integrated with the host cell cycle and is so altered in structure as to be difficult to recognize as a diatom, and yet it retains a number of features normally lost in tertiary and secondary endosymbionts, most notably mitochondria. The dinoflagellate host is also reported to retain mitochondrion-like structures, making these cells unique in retaining two evolutionarily distinct mitochondria. This redundancy raises the question of whether the organelles share any functions or have divided functions between them. Results: We show that both host and endosymbiont mitochondrial genomes encode genes for electron transport proteins. We have characterized cytochrome c oxidase 1 (cox1), cytochrome oxidase 2 (cox2), cytochrome oxidase 3 (cox3), cytochrome b (cob), and large subunit of ribosomal RNA (LSUrRNA) of endosymbiont mitochondrial ancestry, and cox1 and cob of host mitochondrial ancestry. We show that all genes are transcribed and that those ascribed to the host mitochondrial genome are extensively edited at the RNA level, as expected for a dinoflagellate mitochondrion-encoded gene. We also found evidence for extensive recombination in the host mitochondrial genes and that recombination products are also transcribed, as expected for a dinoflagellate. Conclusion: Durinskia baltica and K. foliaceum retain two mitochondria from evolutionarily distinct lineages, and the functions of these organelles are at least partially overlapping, since both express genes for proteins in electron transport. PMID:17892581
Gao, Minghong; Liu, Jiwen; Qiao, Yanlu; Zhao, Meixun; Zhang, Xiao-Hua
2017-04-01
Investigating the environmental influence on the community composition and abundance of denitrifiers in marine sediment ecosystems is essential for understanding the ecosystem-level controls on the biogeochemical process of denitrification. In the present study, nirK-harboring denitrifying communities in different mud deposit zones of eastern China marginal seas (ECMS) were investigated via clone library analysis. The abundance of three functional genes affiliated with denitrification (narG, nirK, nosZ) was assessed by fluorescent quantitative PCR. The nirK-harboring microbiota were dominated by a few operational taxonomic units (OTUs), which were widely distributed across sites, with each site also harboring its own unique phylotypes. The mean abundance of nirK was significantly higher than that of narG and nosZ genes, and the abundance of narG was higher than that of nosZ. The inconsistent abundance profile of different functional genes along the process of denitrification might indicate that nitrite reduction occurred independently of denitrification in the mud deposit zones of ECMS, and that sedimentary denitrification was accomplished by cooperation of different denitrifying species rather than a single species. Such important information would be missed when targeting only a single denitrifying functional gene. Analysis of correlation between abundance ratios and environmental factors revealed that the response of denitrifiers to environmental factors varied among mud deposit zones. Our results suggest that a comprehensive analysis of different denitrifying functional genes may yield more information about the dynamics of denitrifying microbiota in marine sediments.
Baumgartner, Desiree; Kopf, Matthias; Klähn, Stephan; Steglich, Claudia; Hess, Wolfgang R
2016-11-28
Despite their versatile functions in multimeric protein complexes, in the modification of enzymatic activities, intercellular communication or regulatory processes, proteins shorter than 80 amino acids (μ-proteins) are a systematically underestimated class of gene products in bacteria. Photosynthetic cyanobacteria provide a paradigm for small protein functions due to extensive work on the photosynthetic apparatus that led to the functional characterization of 19 small proteins of less than 50 amino acids. In analogy, previously unstudied small ORFs with similar degrees of conservation might encode small proteins of high relevance also in other functional contexts. Here we used comparative transcriptomic information available for two model cyanobacteria, Synechocystis sp. PCC 6803 and Synechocystis sp. PCC 6714 for the prediction of small ORFs. We found 293 transcriptional units containing candidate small ORFs ≤80 codons in Synechocystis sp. PCC 6803, also including the known mRNAs encoding small proteins of the photosynthetic apparatus. From these transcriptional units, 146 are shared between the two strains, 42 are shared with the higher plant Arabidopsis thaliana and 25 with E. coli. To verify the existence of the respective μ-proteins in vivo, we selected five genes as examples to which a FLAG tag sequence was added and re-introduced them into Synechocystis sp. PCC 6803. These were the previously annotated gene ssr1169, two newly defined genes norf1 and norf4, as well as nsiR6 (nitrogen stress-induced RNA 6) and hliR1(high light-inducible RNA 1) , which originally were considered non-coding. Upon activation of expression via the Cu 2+. responsive petE promoter or from the native promoters, all five proteins were detected in Western blot experiments. 
The distribution and conservation of these five genes as well as their regulation of expression and the physico-chemical properties of the encoded proteins underline the likely great bandwidth of small protein functions in bacteria and make them attractive candidates for functional studies.
Sherwood, Chet C; Raghanti, Mary Ann; Stimpson, Cheryl D; Spocter, Muhammad A; Uddin, Monica; Boddy, Amy M; Wildman, Derek E; Bonar, Christopher J; Lewandowski, Albert H; Phillips, Kimberley A; Erwin, Joseph M; Hof, Patrick R
2010-04-07
Inhibitory interneurons participate in local processing circuits, playing a central role in executive cognitive functions of the prefrontal cortex. Although humans differ from other primates in a number of cognitive domains, it is not currently known whether the interneuron system has changed in the course of primate evolution leading to our species. In this study, we examined the distribution of different interneuron subtypes in the prefrontal cortex of anthropoid primates as revealed by immunohistochemistry against the calcium-binding proteins calbindin, calretinin and parvalbumin. In addition, we tested whether genes involved in the specification, differentiation and migration of interneurons show evidence of positive selection in the evolution of humans. Our findings demonstrate that cellular distributions of interneuron subtypes in human prefrontal cortex are similar to other anthropoid primates and can be explained by general scaling rules. Furthermore, genes underlying interneuron development are highly conserved at the amino acid level in primate evolution. Taken together, these results suggest that the prefrontal cortex in humans retains a similar inhibitory circuitry to that in closely related primates, even though it performs functional operations that are unique to our species. Thus, it is likely that other significant modifications to the connectivity and molecular biology of the prefrontal cortex were overlaid on this conserved interneuron architecture in the course of human evolution.
Yazdani Foshtomi, Maryam; Leliaert, Frederik; Derycke, Sofie; Willems, Anne; Vincx, Magda
2018-01-01
The presence of large densities of the piston-pumping polychaete Lanice conchilega can have important consequences for the functioning of marine sediments. It is considered both an allogenic and an autogenic ecosystem engineer, affecting spatial and temporal biogeochemical gradients (oxygen concentrations, oxygen penetration depth and nutrient concentrations) and physical properties (grain size) of marine sediments, which could affect functional properties of sediment-inhabiting microbial communities. Here we investigated whether density-dependent effects of L. conchilega affected horizontal (m-scale) and vertical (cm-scale) patterns in the distribution, diversity and composition of the typical nosZ gene in the active denitrifying organisms. This gene plays a major role in N2O reduction in coastal ecosystems as the last step completing the denitrification pathway. We showed that both vertical and horizontal composition and richness of nosZ gene were indeed significantly affected when large densities of the bio-irrigator were present. This could be directly related to allogenic ecosystem engineering effects on the environment, reflected in increased oxygen penetration depth and oxygen concentrations in the upper cm of the sediment in high densities of L. conchilega. A higher diversity (Shannon diversity and inverse Simpson) of nosZ observed in patches with high L. conchilega densities (3,185–3,440 ind. m-2) at deeper sediment layers could suggest a downward transport of NO3− to deeper layers resulting from bio-irrigation as well. Hence, our results show the effect of L. conchilega bio-irrigation activity on denitrifying organisms in L. conchilega reefs. PMID:29408934
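The two diversity measures named in this abstract, Shannon diversity and inverse Simpson, can be illustrated with a minimal Python sketch. This is not the authors' pipeline; the function names and the count vectors are hypothetical, and the inputs are assumed to be per-sample counts of nosZ sequence variants.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2) over non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    props = [c / total for c in counts if c > 0]
    return 1.0 / sum(p * p for p in props)


# Hypothetical counts of nosZ variants in two sediment samples.
even_sample = [10, 10, 10, 10]   # four equally abundant variants
skewed_sample = [37, 1, 1, 1]    # one dominant variant

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, round(shannon_index(sample), 3), round(inverse_simpson(sample), 3))
```

Both indices rise with richness and evenness, so a sample dominated by a single variant scores lower than an even one with the same number of variants.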
Singh, Himanshu Narayan; Rajeswari, Moganty R
2016-01-01
Purine repeat sequences present in a gene are unique as they have a high propensity to form unusual DNA triple-helix structures. Friedreich's ataxia is the only human disease that is well known to be associated with DNA triplexes formed by purine repeats. The purpose of this study was to identify the expanded purine repeats (EPRs) in the human genome and find their correlation with cancer pathogenesis. We developed the "PuRepeatFinder.pl" algorithm to identify non-overlapping EPRs without pyrimidine interruptions in the human genome and customized it to search for repeat lengths n ≥ 200. A total of 1158 EPRs were identified in the genome, and their distribution followed a Wakeby distribution. Two hundred and ninety-six EPRs were found in genic regions of 282 genes (EPR-genes). EPR-genes were clustered based on their cellular function, and a large number of them were found to be enzymes/enzyme modulators. Meta-analysis of the 282 EPR-genes identified only 63 EPR-genes in association with cancer, mostly in breast, lung, and blood cancers. Protein-protein interaction network analysis of all 282 EPR-genes identified proteins including those in cadherins and VEGF. The two observations, that EPRs can induce mutations under malignant conditions and that some EPR-gene products were identified in vital cell signaling-mediated pathways, together suggest a crucial role of EPRs in carcinogenesis. The new link between EPR-genes and their functionally interacting proteins adds a new dimension to the present understanding of cancer pathogenesis and can help in planning therapeutic strategies. Validation of the present results using techniques like NGS is required to establish the role of the EPR-genes in cancer pathology.
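The search described above (non-overlapping purine runs without pyrimidine interruptions, above a minimum length) can be sketched in a few lines of Python. This is not PuRepeatFinder.pl itself; the function name is hypothetical, and the sketch assumes that the length cutoff n counts nucleotides, which the abstract does not state explicitly.

```python
import re

# A maximal run of purines (A or G) with no pyrimidine (C or T) interruption.
PURINE_RUN = re.compile(r"[AG]+", re.IGNORECASE)


def find_purine_repeats(sequence, min_length=200):
    """Return (start, end, length) for each maximal purine run of at least min_length.

    Coordinates are 0-based and end-exclusive. Because each match is a maximal
    run, the reported runs cannot overlap.
    """
    hits = []
    for match in PURINE_RUN.finditer(sequence):
        length = match.end() - match.start()
        if length >= min_length:
            hits.append((match.start(), match.end(), length))
    return hits


# Toy sequence with a short cutoff so the behaviour is easy to follow.
toy = "CT" + "AG" * 5 + "C" + "GAA" * 2 + "T"
print(find_purine_repeats(toy, min_length=8))
```

On a real chromosome the same function would be called with the default cutoff of 200; any other base code, such as N, also terminates a run in this sketch.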
Porcelli, Damiano; Barsanti, Paolo; Pesole, Graziano; Caggese, Corrado
2007-01-01
Background When orthologous sequences from species distributed throughout an optimal range of divergence times are available, comparative genomics is a powerful tool to address problems such as the identification of the forces that shape gene structure during evolution, although the functional constraints involved may vary in different genes and lineages. Results We identified and annotated in the MitoComp2 dataset the orthologs of 68 nuclear genes controlling oxidative phosphorylation in 11 Drosophilidae species and in five non-Drosophilidae insects, and compared them with each other and with their counterparts in three vertebrates (Fugu rubripes, Danio rerio and Homo sapiens) and in the cnidarian Nematostella vectensis, taking into account conservation of gene structure and regulatory motifs, and preservation of gene paralogs in the genome. Comparative analysis indicates that the ancestral insect OXPHOS genes were intron rich and that extensive intron loss and lineage-specific intron gain occurred during evolution. Comparison with vertebrates and cnidarians also shows that many OXPHOS gene introns predate the cnidarian/Bilateria evolutionary split. The nuclear respiratory gene element (NRG) has played a key role in the evolution of the insect OXPHOS genes; it is consistently conserved in the OXPHOS orthologs of all the insect species examined, while their duplicates either completely lack the element or possess only relics of the motif. Conclusion Our observations reinforce the notion that the common ancestor of most animal phyla had intron-rich genes, and suggest that changes in the pattern of expression of a gene facilitate the fixation of duplications in the genome and the development of novel genetic functions. PMID:18315839
Hu, Anyi; Jiao, Nianzhi; Zhang, Chuanlun L
2011-10-01
Marine Crenarchaeota represent a widespread and abundant microbial group in marine ecosystems. Here, we investigated the abundance, diversity, and distribution of planktonic Crenarchaeota in the epi-, meso-, and bathypelagic zones at three stations in the South China Sea (SCS) by analysis of crenarchaeal 16S rRNA gene, ammonia monooxygenase gene amoA involved in ammonia oxidation, and biotin carboxylase gene accA putatively involved in archaeal CO2 fixation. Quantitative PCR analyses indicated that crenarchaeal amoA and accA gene abundances varied similarly with archaeal and crenarchaeal 16S rRNA gene abundances at all stations, except that crenarchaeal accA genes were almost absent in the epipelagic zone. Ratios of the crenarchaeal amoA gene to 16S rRNA gene abundances decreased ~2.6 times from the epi- to bathypelagic zones, whereas the ratios of crenarchaeal accA gene to marine group I crenarchaeal 16S rRNA gene or to crenarchaeal amoA gene abundances increased with depth, suggesting that the metabolism of Crenarchaeota may change from the epi- to meso- or bathypelagic zones. Denaturing gradient gel electrophoresis profiling of the 16S rRNA genes revealed depth partitioning in archaeal community structures. Clone libraries of crenarchaeal amoA and accA genes showed two clusters: the "shallow" cluster was exclusively derived from epipelagic water and the "deep" cluster was from meso- and/or bathypelagic waters, suggesting that niche partitioning may take place between the shallow and deep marine Crenarchaeota. Overall, our results show strong depth partitioning of crenarchaeal populations in the SCS and suggest a shift in their community structure and ecological function with increasing depth.
Stochastic model of transcription factor-regulated gene expression
NASA Astrophysics Data System (ADS)
Karmakar, Rajesh; Bose, Indrani
2006-09-01
We consider a stochastic model of transcription factor (TF)-regulated gene expression. The model describes two genes, gene A and gene B, which synthesize the TFs and the target gene proteins, respectively. We show through analytic calculations that the TF fluctuations have a significant effect on the distribution of the target gene protein levels when the mean TF level falls in the highest sensitive region of the dose-response curve. We further study the effect of reducing the copy number of gene A from two to one. The enhanced TF fluctuations yield results different from those in the deterministic case. The probability that the target gene protein level exceeds a threshold value is calculated with the knowledge of the probability density functions associated with the TF and target gene protein levels. Numerical simulation results for a more detailed stochastic model are shown to be in agreement with those obtained through analytic calculations. The relevance of these results in the context of the genetic disorder haploinsufficiency is pointed out. Some experimental observations on the haploinsufficiency of the tumour suppressor gene, Nkx 3.1, are explained with the help of the stochastic model of TF-regulated gene expression.
Cheviron, Zachary A; Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Eddy, Douglas K; Jones, Jennifer; Carling, Matthew D; Witt, Christopher C; Moriyama, Hideaki; Weber, Roy E; Fago, Angela; Storz, Jay F
2014-11-01
In air-breathing vertebrates, the physiologically optimal blood-O2 affinity is jointly determined by the prevailing partial pressure of atmospheric O2, the efficacy of pulmonary O2 transfer, and internal metabolic demands. Consequently, genetic variation in the oxygenation properties of hemoglobin (Hb) may be subject to spatially varying selection in species with broad elevational distributions. Here we report the results of a combined functional and evolutionary analysis of Hb polymorphism in the rufous-collared sparrow (Zonotrichia capensis), a species that is continuously distributed across a steep elevational gradient on the Pacific slope of the Peruvian Andes. We integrated a population genomic analysis that included all postnatally expressed Hb genes with functional studies of naturally occurring Hb variants, as well as recombinant Hb (rHb) mutants that were engineered through site-directed mutagenesis. We identified three clinally varying amino acid polymorphisms: Two in the α(A)-globin gene, which encodes the α-chain subunits of the major HbA isoform, and one in the α(D)-globin gene, which encodes the α-chain subunits of the minor HbD isoform. We then constructed and experimentally tested single- and double-mutant rHbs representing each of the alternative α(A)-globin genotypes that predominate at different elevations. Although the locus-specific patterns of altitudinal differentiation suggested a history of spatially varying selection acting on Hb polymorphism, the experimental tests demonstrated that the observed amino acid mutations have no discernible effect on respiratory properties of the HbA or HbD isoforms. These results highlight the importance of experimentally validating the hypothesized effects of genetic changes in protein function to avoid the pitfalls of adaptive storytelling. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. 
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria.
Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song
2012-11-16
Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. The phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of sub-families like PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes.
All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria
2012-01-01
Background Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. The phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of sub-families like PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes.
All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
Underwater Application of Quantitative PCR on an Ocean Mooring
Preston, Christina M.; Harris, Adeline; Ryan, John P.; Roman, Brent; Marin, Roman; Jensen, Scott; Everlove, Cheri; Birch, James; Dzenitis, John M.; Pargett, Douglas; Adachi, Masao; Turk, Kendra; Zehr, Jonathon P.; Scholin, Christopher A.
2011-01-01
The Environmental Sample Processor (ESP) is a device that allows for the underwater, autonomous application of DNA and protein probe array technologies as a means to remotely identify and quantify, in situ, marine microorganisms and substances they produce. Here, we added functionality to the ESP through the development and incorporation of a module capable of solid-phase nucleic acid extraction and quantitative PCR (qPCR). Samples collected by the instrument were homogenized in a chaotropic buffer compatible with direct detection of ribosomal RNA (rRNA) and nucleic acid purification. From a single sample, both an rRNA community profile and select gene abundances were ascertained. To illustrate this functionality, we focused on bacterioplankton that are commonly found along the central coast of California and are known to vary in accordance with different oceanic conditions. DNA probe arrays targeting rRNA revealed the presence of 16S rRNA indicative of marine crenarchaea, SAR11 and marine cyanobacteria; in parallel, qPCR was used to detect 16S rRNA genes from the former two groups and the large subunit RuBisCo gene (rbcL) from Synechococcus. The PCR-enabled ESP was deployed on a coastal mooring in Monterey Bay for 28 days during the spring-summer upwelling season. The distributions of the targeted bacterioplankton groups were as expected, with the exception of an increase in abundance of marine crenarchaea in anomalous nitrate-rich, low-salinity waters. The unexpected co-occurrence demonstrated the utility of the ESP in detecting novel events relative to previously described distributions of particular bacterioplankton groups. The ESP can easily be configured to detect and enumerate genes and gene products from a wide range of organisms. This study demonstrated for the first time that gene abundances could be assessed autonomously, underwater in near real-time and referenced against prevailing chemical, physical and bulk biological conditions. PMID:21829630
Genome-wide analysis of tandem repeats in plants and green algae
Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang
2014-01-01
Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...
Bettembourg, Charles; Diot, Christian; Dameron, Olivier
2015-01-01
Background The analysis of gene annotations referencing back to Gene Ontology plays an important role in the interpretation of high-throughput experiments results. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method supporting the interpretation of the similarity and particularity values in order to determine whether two genes are similar or whether one gene has some significant particular function. Interpretation is frequently based either on an implicit threshold, or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds supporting the interpretation of the results of a semantic comparison. Results We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a similarity value lower than the similarity value of some non-similar genes. We then extend this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds. Conclusion We developed a method for determining optimal semantic similarity and particularity thresholds. 
We applied this method on the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically-relevant patterns. PMID:26230274
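The threshold selection described in this abstract, minimizing the proportions of false-positive and false-negative similarity matches, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the scores are hypothetical, and the sketch assumes the objective is the unweighted sum of the two proportions, evaluated only at observed similarity values.

```python
def optimal_threshold(similar_scores, non_similar_scores):
    """Return (threshold, error) minimizing FN proportion + FP proportion.

    A pair is called "similar" when its score is >= threshold.
    FN: similar pairs scoring below the threshold.
    FP: non-similar pairs scoring at or above the threshold.
    Ties are resolved in favour of the lowest candidate threshold.
    """
    if not similar_scores or not non_similar_scores:
        raise ValueError("both score lists must be non-empty")
    candidates = sorted(set(similar_scores) | set(non_similar_scores))
    best_threshold, best_error = None, float("inf")
    for t in candidates:
        fn = sum(s < t for s in similar_scores) / len(similar_scores)
        fp = sum(s >= t for s in non_similar_scores) / len(non_similar_scores)
        error = fn + fp
        if error < best_error:
            best_threshold, best_error = t, error
    return best_threshold, best_error


# Hypothetical similarity values for pairs of similar and non-similar genes.
similar = [0.9, 0.8, 0.7, 0.4]
non_similar = [0.1, 0.2, 0.3, 0.6]
print(optimal_threshold(similar, non_similar))
```

Because the two distributions overlap in this toy data (0.4 versus 0.6), no threshold separates them perfectly, which mirrors the overlap the abstract reports for real annotations.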
Johansson, Martin M; Lundin, Elin; Qian, Xiaoyan; Mirzazadeh, Mohammadreza; Halvardson, Jonatan; Darj, Elisabeth; Feuk, Lars; Nilsson, Mats; Jazin, Elena
2016-01-01
Renewed attention has been directed to the functions of the Y chromosome in the central nervous system during early human male development, owing to its recently proposed involvement in neurodevelopmental diseases. PCDH11Y and NLGN4Y are of special interest because they belong to gene families involved in cell fate determination and formation of dendrites and axons. We used RNA sequencing, immunocytochemistry and a padlock probing and rolling circle amplification strategy to distinguish the expression of X and Y homologs in situ in the human brain for the first time. To minimize the influence of androgens on the sex differences in the brain, we focused our investigation on human embryos at 8-11 weeks post-gestation. We found that the X- and Y-encoded genes are expressed in specific and heterogeneous cellular sub-populations of both glial and neuronal origins. More importantly, we found differential distribution patterns of X and Y homologs in the male developing central nervous system. This study has visualized the spatial distribution of PCDH11X/Y and NLGN4X/Y in human developing nervous tissue. The observed spatial distribution patterns suggest the existence of an additional layer of complexity in the development of the male CNS.
Opazo, Juan C; Lee, Alison P; Hoffmann, Federico G; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F
2015-07-01
Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. 
Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Spatial Genetic Structure of the Abundant and Widespread Peatmoss Sphagnum magellanicum Brid.
Kyrkjeeide, Magni Olsen; Hassel, Kristian; Flatberg, Kjell Ivar; Shaw, A. Jonathan; Yousefi, Narjes; Stenøien, Hans K.
2016-01-01
Spore-producing organisms have small dispersal units enabling them to become widespread across continents. However, barriers to gene flow and cryptic speciation may exist. The common, haploid peatmoss Sphagnum magellanicum occurs in both the Northern and Southern Hemispheres, and is commonly used as a model in studies of peatland ecology and peatmoss physiology. Even though it will likely act as a rich source in functional genomics studies in years to come, surprisingly little is known about levels of genetic variability and structuring in this species. Here, we assess for the first time how genetic variation in S. magellanicum is spatially structured across its full distribution range (Northern Hemisphere and South America). The morphologically similar species S. alaskense was included for comparison. In total, 195 plants were genotyped at 15 microsatellite loci. Sequences from two plastid loci (trnG and trnL) were obtained from 30 samples. Our results show that S. alaskense and almost all plants of S. magellanicum in the northern Pacific area are diploids and share the same gene pool. Haploid plants occur in South America, Europe, eastern North America, western North America, and southern Asia, and five genetically differentiated groups with different distribution ranges were found. Our results indicate that S. magellanicum consists of several distinct genetic groups, seemingly with little or no gene flow among them. Notably, the geographical separation of diploids and haploids is strikingly similar to patterns found within other haploid Sphagnum species spanning the Northern Hemisphere. Our results confirm a genetic division between the Beringian and the Atlantic that seems to be a general pattern in Sphagnum taxa. The pattern of strong genetic population structuring throughout the distribution range of morphologically similar plants needs to be considered in future functional genomic studies of S. magellanicum. PMID:26859563
Niche specialization of terrestrial archaeal ammonia oxidizers
Gubry-Rangin, Cécile; Hai, Brigitte; Quince, Christopher; Engel, Marion; Thomson, Bruce C.; James, Phillip; Schloter, Michael; Griffiths, Robert I.; Prosser, James I.; Nicol, Graeme W.
2011-01-01
Soil pH is a major determinant of microbial ecosystem processes and potentially a major driver of evolution, adaptation, and diversity of ammonia oxidizers, which control soil nitrification. Archaea are major components of soil microbial communities and contribute significantly to ammonia oxidation in some soils. To determine whether pH drives evolutionary adaptation and community structure of soil archaeal ammonia oxidizers, sequences of amoA, a key functional gene of ammonia oxidation, were examined in soils at global, regional, and local scales. Globally distributed database sequences clustered into 18 well-supported phylogenetic lineages that dominated specific soil pH ranges classified as acidic (pH <5), acido-neutral (5≤ pH <7), or alkalinophilic (pH ≥7). To determine whether patterns were reproduced at regional and local scales, amoA gene fragments were amplified from DNA extracted from 47 soils in the United Kingdom (pH 3.5–8.7), including a pH-gradient formed by seven soils at a single site (pH 4.5–7.5). High-throughput sequencing and analysis of amoA gene fragments identified an additional, previously undiscovered phylogenetic lineage and revealed similar pH-associated distribution patterns at global, regional, and local scales, which were most evident for the five most abundant clusters. Archaeal amoA abundance and diversity increased with soil pH, which was the only physicochemical characteristic measured that significantly influenced community structure. These results suggest evolution based on specific adaptations to soil pH and niche specialization, resulting in a global distribution of archaeal lineages that have important consequences for soil ecosystem function and nitrogen cycling. PMID:22158986
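The three soil pH classes used in this abstract (acidic, pH <5; acido-neutral, 5≤ pH <7; alkalinophilic, pH ≥7) translate directly into a small binning function. The sketch below is only an illustration of those cutoffs; the function name and the example pH values are hypothetical and are not data from the study.

```python
from collections import Counter


def classify_soil_ph(ph):
    """Assign a soil pH value to one of the three classes named in the abstract."""
    if ph < 5:
        return "acidic"
    if ph < 7:
        return "acido-neutral"
    return "alkalinophilic"


# Hypothetical pH measurements spanning the range reported for the UK soils.
example_ph_values = [3.5, 4.5, 5.0, 6.2, 7.0, 7.5, 8.7]
class_counts = Counter(classify_soil_ph(ph) for ph in example_ph_values)
print(dict(class_counts))
```

Binning soils this way makes it straightforward to tabulate which amoA lineages dominate each pH class.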
Abruzzi, Katharine C; Zadina, Abigail; Luo, Weifei; Wiyanto, Evelyn; Rahman, Reazur; Guo, Fang; Shafer, Orie; Rosbash, Michael
2017-02-01
Locomotor activity rhythms are controlled by a network of ~150 circadian neurons within the adult Drosophila brain. They are subdivided based on their anatomical locations and properties. We profiled transcripts "around the clock" from three key groups of circadian neurons with different functions. We also profiled a non-circadian outgroup, dopaminergic (TH) neurons. They have cycling transcripts but fewer than clock neurons as well as low expression and poor cycling of clock gene transcripts. This suggests that TH neurons do not have a canonical circadian clock and that their gene expression cycling is driven by brain systemic cues. The three circadian groups are surprisingly diverse in their cycling transcripts and overall gene expression patterns, which include known and putative novel neuropeptides. Even the overall phase distributions of cycling transcripts are distinct, indicating that different regulatory principles govern transcript oscillations. This surprising cell-type diversity parallels the functional heterogeneity of the different neurons.
Dewey, Frederick E; Murray, Michael F; Overton, John D; Habegger, Lukas; Leader, Joseph B; Fetterolf, Samantha N; O'Dushlaine, Colm; Van Hout, Cristopher V; Staples, Jeffrey; Gonzaga-Jauregui, Claudia; Metpally, Raghu; Pendergrass, Sarah A; Giovanni, Monica A; Kirchner, H Lester; Balasubramanian, Suganthi; Abul-Husn, Noura S; Hartzel, Dustin N; Lavage, Daniel R; Kost, Korey A; Packer, Jonathan S; Lopez, Alexander E; Penn, John; Mukherjee, Semanti; Gosalia, Nehal; Kanagaraj, Manoj; Li, Alexander H; Mitnaul, Lyndon J; Adams, Lance J; Person, Thomas N; Praveen, Kavita; Marcketta, Anthony; Lebo, Matthew S; Austin-Tse, Christina A; Mason-Suares, Heather M; Bruse, Shannon; Mellis, Scott; Phillips, Robert; Stahl, Neil; Murphy, Andrew; Economides, Aris; Skelding, Kimberly A; Still, Christopher D; Elmore, James R; Borecki, Ingrid B; Yancopoulos, George D; Davis, F Daniel; Faucett, William A; Gottesman, Omri; Ritchie, Marylyn D; Shuldiner, Alan R; Reid, Jeffrey G; Ledbetter, David H; Baras, Aris; Carey, David J
2016-12-23
The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery. Copyright © 2016, American Association for the Advancement of Science.
Garenc, Christophe; Aubert, Samuel; Laroche, Jèrôme; Girouard, Joël; Vohl, Marie-Claude; Bergeron, Jean; Rousseau, François; Julien, Pierre
2004-01-01
Hypertriglyceridemia (HTG) is known as a common metabolic disorder associated with increased production, decreased catabolism and/or decreased hepatic uptake of triglyceride (TG)-rich particles. We assessed, in the Quebec City population, the allele frequency and haplotype distributions of mutations in genes related to HTG, such as the apolipoprotein E (APOE) (C112R and C158R), the apolipoprotein CIII (APOC3) (C-482T and C3238G) and the peroxisome proliferator-activated receptor alpha (PPARalpha) (L162V) genes. A total of 938 anonymous unlinked newborns from the metropolitan Quebec City area have been genotyped. Allele frequencies observed in the Quebec City population differed from known frequencies determined in other Caucasian populations. The co-transmitted allele distribution between the two-marker genotypes APOE/APOC3(C3238G) and APOC3(C-482T)/PPARalpha(L162V) presented a weak deviation from the assumption of genetic independence. Also, we observed a non-independent distribution of the T-482/G3238 allele combinations within the APOC3 gene, suggesting strong linkage disequilibrium between the C-482T and C3238G polymorphisms. Moreover, comparisons of allele frequencies observed in the population of Québec City to those obtained in other Caucasian populations suggested that the population of Québec City may be at a lower risk of developing HTG due to APOE, APOC3 and PPARalpha genetic variants. However, the strong linkage disequilibrium and the two-marker genotype distributions observed in the APOC3 gene suggest that these two variants may functionally interact in the Québec City population.
Functional and Evolutionary Characterization of a Gene Transfer Agent’s Multilocus “Genome”
Hynes, Alexander P.; Shakya, Migun; Mercer, Ryan G.; Grüll, Marc P.; Bown, Luke; Davidson, Fraser; Steffen, Ekaterina; Matchem, Heidi; Peach, Mandy E.; Berger, Tim; Grebe, Katherine; Zhaxybayeva, Olga; Lang, Andrew S.
2016-01-01
Gene transfer agents (GTAs) are phage-like particles that can package and transfer a random piece of the producing cell’s genome, but are unable to transfer all the genes required for their own production. As such, GTAs represent an evolutionary conundrum: are they selfish genetic elements propagating through an unknown mechanism, defective viruses, or viral structures “repurposed” by cells for gene exchange, as their name implies? In Rhodobacter capsulatus, production of the R. capsulatus GTA (RcGTA) particles is associated with a cluster of genes resembling a small prophage. Utilizing transcriptomic, genetic and biochemical approaches, we report that the RcGTA “genome” consists of at least 24 genes distributed across five distinct loci. We demonstrate that, of these additional loci, two are involved in cell recognition and binding and one in the production and maturation of RcGTA particles. The five RcGTA “genome” loci are widespread within Rhodobacterales, but not all loci have the same evolutionary histories. Specifically, two of the loci have been subject to frequent, probably virus-mediated, gene transfer events. We argue that it is unlikely that RcGTA is a selfish genetic element. Instead, our findings are compatible with the scenario that RcGTA is a virus-derived element maintained by the producing organism due to a selective advantage of within-population gene exchange. The modularity of the RcGTA “genome” is presumably a result of selection on the host organism to retain GTA functionality. PMID:27343288
Kumar, Hirdesh; Frischknecht, Friedrich; Mair, Gunnar R; Gomes, James
2015-12-01
Genetically attenuated parasites (GAPs) that lack genes essential for the liver stage of the malaria parasite, and therefore cause developmental arrest, have been developed as live vaccines in rodent malaria models and have recently been tested in humans. The genes targeted for deletion were often identified by trial and error. Here we present a systematic analysis of gene expression, at both the protein and transcript level, in several Plasmodium species with the aim of identifying candidate genes for the generation of novel GAPs. With a lack of liver stage expression data for human malaria parasites, we used data available for liver stage development of Plasmodium yoelii, a rodent malaria model, to identify proteins expressed in the liver stage but absent from blood stage parasites. An orthology-based search was then employed to identify orthologous proteins in the human malaria parasite Plasmodium falciparum, resulting in a total of 310 genes expressed in the liver stage but lacking evidence of protein expression in blood stage parasites. Among these 310 possible GAP candidates, we further studied Plasmodium liver stage proteins by phyletic distribution and functional domain analyses and shortlisted twenty GAP candidates; these are: fabB/F, fabI, arp, 3 genes encoding subunits of the PDH complex, dnaJ, urm1, rS5, ancp, mcp, arh, gk, lisp2, valS, palm, and four conserved Plasmodium proteins of unknown function. Parasites lacking one or several of these genes might yield new attenuated malaria parasites for experimental vaccination studies. Copyright © 2015 Elsevier B.V. All rights reserved.
Chidebe, Ifeoma N.
2017-01-01
ABSTRACT Cowpea derives most of its N nutrition from biological nitrogen fixation (BNF) via symbiotic bacteroids in root nodules. In Sub-Saharan Africa, the diversity and biogeographic distribution of bacterial microsymbionts nodulating cowpea and other indigenous legumes are not well understood, though needed for increased legume production. The aim of this study was to describe the distribution and phylogenies of rhizobia at different agroecological regions of Mozambique using PCR of the BOX element (BOX-PCR), restriction fragment length polymorphism of the internal transcribed spacer (ITS-RFLP), and sequence analysis of ribosomal, symbiotic, and housekeeping genes. A total of 122 microsymbionts isolated from two cowpea varieties (IT-1263 and IT-18) grouped into 17 clades within the BOX-PCR dendrogram. The PCR-ITS analysis yielded 17 ITS types for the bacterial isolates, while ITS-RFLP analysis placed all test isolates in six distinct clusters (I to VI). BLASTn sequence analysis of 16S rRNA and four housekeeping genes (glnII, gyrB, recA, and rpoB) showed their alignment with Rhizobium and Bradyrhizobium species. The results revealed a group of highly diverse and adapted cowpea-nodulating microsymbionts which included Bradyrhizobium pachyrhizi, Bradyrhizobium arachidis, Bradyrhizobium yuanmingense, and a novel Bradyrhizobium sp., as well as Rhizobium tropici, Rhizobium pusense, and Neorhizobium galegae in Mozambican soils. Discordances observed in single-gene phylogenies could be attributed to horizontal gene transfer and/or subsequent recombinations of the genes. Natural deletion of 60 bp of the gyrB region was observed in isolate TUTVU7; however, this deletion effect on DNA gyrase function still needs to be confirmed. The inconsistency of nifH with core gene phylogenies suggested differences in the evolutionary history of both chromosomal and symbiotic genes. 
IMPORTANCE A diverse group of both Bradyrhizobium and Rhizobium species responsible for cowpea nodulation in Mozambique was found in this study. Future studies could prove useful in evaluating these bacterial isolates for symbiotic efficiency and strain competitiveness in Mozambican soils. PMID:29101189
Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution
Boueiz, Adel; Lutz, Sharon M.; Cho, Michael H.; Hersh, Craig P.; Bowler, Russell P.; Washko, George R.; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M.; Beaty, Terri H.; Coxson, Harvey O.; Crapo, James D.; Silverman, Edwin K.; Castaldi, Peter J.
2017-01-01
Rationale: Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe–predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. Objectives: To identify the genetic influences of emphysema distribution in non–alpha-1 antitrypsin–deficient smokers. Methods: A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism–, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. Measurements and Main Results: We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. Conclusions: This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. 
These findings may point to new biologic pathways on which to expand diagnostic and therapeutic approaches in chronic obstructive pulmonary disease. Clinical trial registered with www.clinicaltrials.gov (NCT 00608764). PMID:27669027
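The two emphysema distribution measures described in the Methods above (a difference and a ratio of upper-third to lower-third emphysema) can be illustrated with a short Python sketch. The abstract gives no units, transformations, or handling of zero values, so the function name, the use of plain percentages, and the zero check below are all assumptions made for illustration.

```python
from typing import Tuple


def emphysema_distribution(upper_third: float, lower_third: float) -> Tuple[float, float]:
    """Return (difference, ratio) of upper-third to lower-third emphysema.

    Inputs are assumed to be emphysema measurements on the same scale
    (for example, percent emphysema per lung third). This helper is
    hypothetical; the study's exact definitions may differ.
    """
    difference = upper_third - lower_third
    if lower_third == 0:
        # The ratio is undefined when the lower third shows no emphysema.
        raise ValueError("lower_third must be non-zero to compute a ratio")
    ratio = upper_third / lower_third
    return difference, ratio


if __name__ == "__main__":
    diff, ratio = emphysema_distribution(12.0, 4.0)
    print(f"difference={diff}, ratio={ratio}")
```

Under these assumptions, a positive difference or a ratio above 1 indicates upper-predominant emphysema.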
Ota, Satoshi; Taimatsu, Kiyohito; Yanagi, Kanoko; Namiki, Tomohiro; Ohga, Rie; Higashijima, Shin-ichi; Kawahara, Atsuo
2016-01-01
The CRISPR/Cas9 complex, which is composed of a guide RNA (gRNA) and the Cas9 nuclease, is useful for carrying out genome modifications in various organisms. Recently, the CRISPR/Cas9-mediated locus-specific integration of a reporter, which contains the Mbait sequence targeted using Mbait-gRNA, the hsp70 promoter and the eGFP gene, has allowed the visualization of the target gene expression. However, it has not been ascertained whether the reporter integrations at both targeted alleles cause loss-of-function phenotypes in zebrafish. In this study, we have inserted the Mbait-hs-eGFP reporter into the pax2a gene because the disruption of pax2a causes the loss of the midbrain-hindbrain boundary (MHB) in zebrafish. In the heterozygous Tg[pax2a-hs:eGFP] embryos, MHB formed normally and the eGFP expression recapitulated the endogenous pax2a expression, including the MHB. We observed the loss of the MHB in homozygous Tg[pax2a-hs:eGFP] embryos. Furthermore, we succeeded in integrating the Mbait-hs-eGFP reporter into an uncharacterized gene epdr1. The eGFP expression in heterozygous Tg[epdr1-hs:eGFP] embryos overlapped the epdr1 expression, whereas the distribution of eGFP-positive cells was disorganized in the MHB of homozygous Tg[epdr1-hs:eGFP] embryos. We propose that the locus-specific integration of the Mbait-hs-eGFP reporter is a powerful method to investigate both gene expression profiles and loss-of-function phenotypes. PMID:27725766
Molecular characterization of the apical organ of the anthozoan Nematostella vectensis
Sinigaglia, Chiara; Busengdal, Henriette; Lerner, Avi; Oliveri, Paola; Rentzsch, Fabian
2015-01-01
Apical organs are sensory structures present in many marine invertebrate larvae where they are considered to be involved in their settlement, metamorphosis and locomotion. In bilaterians they are characterised by a tuft of long cilia and receptor cells and they are associated with groups of neurons, but their relatively low morphological complexity and dispersed phylogenetic distribution have left their evolutionary relationship unresolved. Moreover, since apical organs are not present in the standard model organisms, their development and function are not well understood. To provide a foundation for a better understanding of this structure we have characterised the molecular composition of the apical organ of the sea anemone Nematostella vectensis. In a microarray-based comparison of the gene expression profiles of planulae with either a wildtype or an experimentally expanded apical organ, we identified 78 evolutionarily conserved genes, which are predominantly or specifically expressed in the apical organ of Nematostella. This gene set comprises signalling molecules, transcription factors, structural and metabolic genes. The majority of these genes, including several conserved, but previously uncharacterized ones, are potentially involved in different aspects of the development or function of the long cilia of the apical organ. To demonstrate the utility of this gene set for comparative analyses, we further analysed the expression of a subset of previously uncharacterized putative orthologs in sea urchin larvae and detected expression for twelve out of eighteen of them in the apical domain. Our study provides a molecular characterization of the apical organ of Nematostella and represents an informative tool for future studies addressing the development, function and evolutionary history of apical organ cells. PMID:25478911
Tang, Bin; Wang, Su; Wang, Shi-Gui; Wang, Hui-Juan; Zhang, Jia-Yong; Cui, Shuai-Ying
2018-01-01
The non-reducing disaccharide trehalose is widely distributed among various organisms. It plays a crucial role as an instant source of energy, being the major blood sugar in insects. In addition, it helps counter abiotic stresses. Trehalose synthesis in insects and other invertebrates is thought to occur via the trehalose-6-phosphate synthase (TPS) and trehalose-6-phosphate phosphatase (TPP) pathways. In many insects, the TPP gene has not been identified, whereas multiple TPS genes that encode proteins harboring TPS/OtsA and TPP/OtsB conserved domains have been found and cloned in the same species. The function of the TPS gene in insects and other invertebrates has not been reviewed in depth, and the available information is quite fragmented. The present review discusses the current understanding of the trehalose synthesis pathway, TPS genetic architecture, biochemistry, physiological function, and potential sensitivity to insecticides. We note the variability in the number of TPS genes in different invertebrate species, consider whether trehalose synthesis may rely only on the TPS gene, and discuss the results of in vitro TPS overexpression experiments. The tissue expression profile and developmental characteristics of the TPS gene indicate that it is important in energy production, growth and development, metamorphosis, stress recovery, chitin synthesis, insect flight, and other biological processes. We highlight the molecular and biochemical properties of insect TPS that make it a suitable target of potential pest control inhibitors. The application of trehalose synthesis inhibitors is a promising direction in insect pest control because vertebrates do not synthesize trehalose; therefore, TPS inhibitors would be relatively safe for humans and higher animals, making them ideal insecticidal agents without off-target effects. PMID:29445344
Spiegel, S; Chiu, A; James, A S; Jentsch, J D; Karlsgodt, K H
2015-11-01
Numerous studies have implicated DTNBP1, the gene encoding dystrobrevin-binding protein or dysbindin, as a candidate risk gene for schizophrenia, though this relationship remains somewhat controversial. Variation in dysbindin, and its location on chromosome 6p, has been associated with cognitive processes, including those relying on a complex system of glutamatergic and dopaminergic interactions. Dysbindin is one of the seven protein subunits that comprise the biogenesis of lysosome-related organelles complex 1 (BLOC-1). Dysbindin protein levels are lower in mice with null mutations in pallidin, another gene in the BLOC-1, and pallidin levels are lower in mice with null mutations in the dysbindin gene, suggesting that multiple subunit proteins must be present to form a functional oligomeric complex. Furthermore, pallidin and dysbindin have similar distribution patterns in mouse and human brain. Here, we investigated whether the apparent correspondence of pallid and dysbindin at the level of gene expression is also found at the level of behavior. Hypothesizing that a mutation leading to underexpression of either of these proteins should produce similar phenotypic effects, we studied recognition memory in both strains using the novel object recognition task (NORT) and social novelty recognition task (SNRT). We found that mice with a null mutation in either gene are impaired on SNRT and NORT when compared with wild-type controls. These results support the conclusion that deficits consistent with recognition memory impairment, a cognitive function that is impaired in schizophrenia, result from either pallidin or dysbindin mutations, possibly through degradation of BLOC-1 expression and/or function. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Bowen, Lizabeth; Miles, A. Keith; Murray, Michael; Haulena, Martin; Tuttle, Judy; van Bonn, William; Adams, Lance; Bodkin, James L.; Ballachey, Brenda E.; Estes, James A.; Tinker, M. Tim; Keister, Robin; Stott, Jeffrey L.
2012-01-01
Gene transcription analysis for diagnosing or monitoring wildlife health requires the ability to distinguish pathophysiological change from natural variation. Herein, we describe methodology for the development of quantitative real-time polymerase chain reaction (qPCR) assays to measure differential transcript levels of multiple immune function genes in the sea otter (Enhydra lutris); sea otter-specific qPCR primer sequences for the genes of interest are defined. We establish a ‘reference’ range of transcripts for each gene in a group of clinically healthy captive and free-ranging sea otters. The 10 genes of interest represent multiple physiological systems that play a role in immuno-modulation, inflammation, cell protection, tumour suppression, cellular stress response, xenobiotic metabolizing enzymes, antioxidant enzymes and cell–cell adhesion. The cycle threshold (CT) measures for most genes were normally distributed; the complement cytolysis inhibitor was the exception. The relative enumeration of multiple gene transcripts in simple peripheral blood samples expands the diagnostic capability currently available to assess the health of sea otters in situ and provides a better understanding of the state of their environment.
2010-01-01
Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
Embryonic expression of zebrafish MiT family genes tfe3b, tfeb, and tfec.
Lister, James A; Lane, Brandon M; Nguyen, Anhthu; Lunney, Katherine
2011-11-01
The MiT family comprises four genes in mammals: Mitf, Tfe3, Tfeb, and Tfec, which encode transcription factors of the basic-helix-loop-helix/leucine zipper class. Mitf is well known for its essential role in the development of melanocytes; however, the functions of the other members of this family, and of interactions between them, are less well understood. We have now characterized the complete set of MiT genes from zebrafish, which totals six instead of four. The zebrafish genome contains two mitf (mitfa and mitfb), two tfe3 (tfe3a and tfe3b), and single tfeb and tfec genes; this distribution is shared with other teleosts. We present here the sequence and embryonic expression patterns for the zebrafish tfe3b, tfeb, and tfec genes, and identify a new isoform of tfe3a. These findings will assist in elucidating the roles of the MiT gene family over the course of vertebrate evolution. Copyright © 2011 Wiley-Liss, Inc.
Ma, Jun; Liu, Fang; Wang, Qinglian; Wang, Kunbo; Jones, Don C.; Zhang, Baohong
2016-01-01
TCP proteins are plant-specific transcription factors implicated in a variety of physiological functions during plant growth and development. In the current study, we performed, for the first time, a comprehensive analysis of the TCP gene family in a diploid cotton species, Gossypium arboreum, including phylogenetic analysis, chromosome location, gene duplication status, gene structure and conserved motif analysis, as well as expression profiles in fiber at different developmental stages. Our results showed that G. arboreum contains 36 TCP genes, distributed across all thirteen chromosomes. GaTCPs within the same subclade of the phylogenetic tree shared similar exon/intron organization and motif composition. In addition, both segmental duplication and whole-genome duplication contributed significantly to the expansion of GaTCPs. Many of these TCP transcription factor genes are specifically expressed in cotton fiber during different developmental stages, including cotton fiber initiation and early development. This suggests that TCP genes may play important roles in cotton fiber development. PMID:26857372
Watanabe, T; Aonuma, H
2012-01-01
The biogenic amine serotonin (5-HT) modulates various aspects of behavior, such as aggressive behavior and circadian behavior, in the cricket. In our previous report, in order to elucidate the molecular basis of the cricket 5-HT system, we identified three genes involved in 5-HT biosynthesis, as well as four 5-HT receptor genes (5-HT1A, 5-HT1B, 5-HT2α, and 5-HT7) expressed in the brain of the field cricket Gryllus bimaculatus DeGeer [7]. In the present study, we identified the Gryllus 5-HT2β gene, an additional 5-HT receptor gene expressed in the cricket brain, and examined its tissue-specific distribution and embryonic stage-dependent expression. The Gryllus 5-HT2β gene was ubiquitously expressed in all examined adult tissues, and was expressed during early embryonic development, as well as during later stages. This study suggests functional differences between the two 5-HT2 receptors in the cricket.
Distribution of nitrogen fixation and nitrogenase-like sequences amongst microbial genomes
2012-01-01
Background The metabolic capacity for nitrogen fixation is known to be present in several prokaryotic species scattered across taxonomic groups. Experimental detection of nitrogen fixation in microbes requires species-specific conditions, making it difficult to obtain a comprehensive census of this trait. The recent and rapid increase in the availability of microbial genome sequences affords novel opportunities to re-examine the occurrence and distribution of nitrogen fixation genes. The current practice for computational prediction of nitrogen fixation is to use the presence of the nifH and/or nifD genes. Results Based on a careful comparison of the repertoire of nitrogen fixation genes in known diazotroph species, we propose a new criterion for computational prediction of nitrogen fixation: the presence of a minimum set of six genes coding for structural and biosynthetic components, namely NifHDK and NifENB. Using this criterion, we conducted a comprehensive search in fully sequenced genomes and identified 149 diazotrophic species, including 82 known diazotrophs and 67 species not known to fix nitrogen. The taxonomic distribution of nitrogen fixation in Archaea was limited to the Euryarchaeota phylum; within the Bacteria domain we predict that nitrogen fixation occurs in 13 different phyla. Of these, seven phyla had not hitherto been known to contain species capable of nitrogen fixation. Our analyses also identified protein sequences that are similar to nitrogenase in organisms that do not meet the minimum-gene-set criteria. The existence of nitrogenase-like proteins lacking conserved co-factor ligands in both diazotrophs and non-diazotrophs suggests their potential for performing other, as yet unidentified, metabolic functions. Conclusions Our predictions expand the known phylogenetic diversity of nitrogen fixation, and suggest that this trait may be much more common in nature than is currently thought. The diverse phylogenetic distribution of nitrogenase-like proteins indicates potential new roles for anciently duplicated and divergent members of this group of enzymes. PMID:22554235
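The minimum-gene-set criterion described in this abstract (all of NifHDK and NifENB present) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names are hypothetical, and matching genes by annotation name is a simplification introduced here, since the abstract does not say how the genes were detected in each genome.

```python
# Minimum gene set named in the abstract: NifHDK plus NifENB.
REQUIRED_NIF_GENES = frozenset({"nifH", "nifD", "nifK", "nifE", "nifN", "nifB"})


def missing_nif_genes(annotated_genes):
    """Return the required nif genes absent from a genome's gene-name list.

    Assumption: genes are identified by annotation name (case-insensitive).
    """
    present = {name.strip().lower() for name in annotated_genes}
    return sorted(g for g in REQUIRED_NIF_GENES if g.lower() not in present)


def is_predicted_diazotroph(annotated_genes):
    """Apply the six-gene criterion: predicted only if no required gene is missing."""
    return not missing_nif_genes(annotated_genes)


if __name__ == "__main__":
    # Hypothetical genome annotations, for illustration only.
    complete = ["nifH", "nifD", "nifK", "nifE", "nifN", "nifB", "recA"]
    partial = ["nifH", "nifD"]  # would pass a nifH/nifD-only rule
    print(is_predicted_diazotroph(complete))
    print(is_predicted_diazotroph(partial), missing_nif_genes(partial))
```

A genome carrying only nifH and nifD passes the older nifH/nifD rule but fails this stricter check, which is the distinction the abstract draws.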
Complexin2 modulates working memory-related neural activity in patients with schizophrenia
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hass, Johanna; Walton, Esther; Kirsten, Holger; ...
2014-10-09
The specific contribution of risk or candidate gene variants to the complex phenotype of schizophrenia is largely unknown. Studying the effects of such variants on brain function can provide insight into disease-associated mechanisms on a neural systems level. Previous studies found common variants in the complexin2 (CPLX2) gene to be highly associated with cognitive dysfunction in schizophrenia patients. Similarly, cognitive functioning was found to be impaired in Cplx2 gene-deficient mice if they were subjected to maternal deprivation or mild brain trauma during puberty. Here, we aimed to study seven common CPLX2 single-nucleotide polymorphisms (SNPs) and their neurogenetic risk mechanisms by investigating their relationship to a schizophrenia-related functional neuroimaging intermediate phenotype. In this paper, we examined functional MRI and genotype data collected from 104 patients with DSM-IV-diagnosed schizophrenia and 122 healthy controls who participated in the Mind Clinical Imaging Consortium study of schizophrenia. Seven SNPs distributed over the whole CPLX2 gene were tested for association with working memory-elicited neural activity in a frontoparietal neural network. Three CPLX2 SNPs were significantly associated with increased neural activity in the dorsolateral prefrontal cortex and intraparietal sulcus in the schizophrenia sample, but showed no association in healthy controls. Finally, since increased working memory-related neural activity in individuals with or at risk for schizophrenia has been interpreted as ‘neural inefficiency,’ these findings suggest that certain variants of CPLX2 may contribute to impaired brain function in schizophrenia, possibly combined with other deleterious genetic variants, adverse environmental events, or developmental insults.
A functional U-statistic method for association analysis of sequencing data.
Jadhav, Sneha; Tong, Xiaoran; Lu, Qing
2017-11-01
Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanisms. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
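To make the idea of a pairwise (U-statistic) association measure concrete, the sketch below averages the product of a phenotype similarity and a genotype similarity over all pairs of distinct individuals. It is a generic kernel-weighted U-statistic written for illustration, not the FU method itself: the smoothing step that builds functions from sequencing data is omitted, the centered cross-product kernel is an assumption made here, and no null distribution or p-value is computed.

```python
import numpy as np


def centered_cross_products(values):
    """Pairwise similarity: product of mean-centered values for each pair (i, j)."""
    centered = np.asarray(values, dtype=float) - np.mean(values)
    return np.outer(centered, centered)


def pairwise_u_statistic(pheno_sim, geno_sim):
    """Average of pheno_sim[i, j] * geno_sim[i, j] over all ordered pairs i != j."""
    pheno_sim = np.asarray(pheno_sim, dtype=float)
    geno_sim = np.asarray(geno_sim, dtype=float)
    n = pheno_sim.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)  # exclude self-pairs
    return float((pheno_sim * geno_sim)[off_diagonal].sum() / (n * (n - 1)))


if __name__ == "__main__":
    # Hypothetical toy data: one genotype score per individual.
    genotype = [0, 0, 2, 2]
    aligned_phenotype = [0, 0, 1, 1]    # tracks the genotype
    unrelated_phenotype = [0, 1, 0, 1]  # does not track the genotype
    g = centered_cross_products(genotype)
    print(pairwise_u_statistic(centered_cross_products(aligned_phenotype), g))
    print(pairwise_u_statistic(centered_cross_products(unrelated_phenotype), g))
```

In this toy case the statistic is larger when the phenotype tracks the genotype score; a real analysis would also need a reference distribution (for example from permutations) to judge significance.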
DNA mimic proteins: functions, structures, and bioinformatic analysis.
Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J
2014-05-13
DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Monte, D.; Coutte, L.; Dewitte, F.
The ERM protein belongs to the family of Ets transcription factors. We show here that the human ERM gene is organized into 14 exons distributed along 65 kb of genomic DNA on chromosome 3. The two main functional domains of ERM, the acidic domain and the DNA-binding ETS domain, are overlapped by three different exons each. The 3′-untranslated region of ERM is 2.1 kb, whereas the 5′-untranslated region is about 0.3 kb; this allows the transcription of ERM transcripts of approximately 4 kb. The human ERM gene is localized to the q27-q29 region of chromosome 3. 17 refs., 3 figs.
Abundant raw material for cis-regulatory evolution in humans
NASA Technical Reports Server (NTRS)
Rockman, Matthew V.; Wray, Gregory A.
2002-01-01
Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.
Mizan, Md Furkanur Rahaman; Jahid, Iqbal Kabir; Kim, Minhui; Lee, Ki-Hoon; Kim, Tae Jo; Ha, Sang-Do
2016-01-01
Vibrio parahaemolyticus is one of the leading foodborne pathogens causing seafood contamination. Here, 22 V. parahaemolyticus strains were analyzed for biofilm formation to determine whether there is a correlation between biofilm formation and quorum sensing (QS), swimming motility, or hydrophobicity. The results indicate that the biofilm formation ability of V. parahaemolyticus is positively correlated with cell surface hydrophobicity, autoinducer (AI-2) production, and protease activity. Field emission scanning electron microscopy (FESEM) showed that strong-biofilm-forming strains established thick 3-D structures, whereas poor-biofilm-forming strains produced thin inconsistent biofilms. In addition, the distribution of the genes encoding pandemic clone factors, type VI secretion systems (T6SS), biofilm functions, and the type I pilus in the V. parahaemolyticus seafood isolates was examined. Biofilm-associated genes were present in almost all the strains, irrespective of other phenotypes. These results indicate that biofilm formation on/in seafood may constitute a major factor in the dissemination of V. parahaemolyticus and the ensuing diseases.
Collod-Béroud, G; Béroud, C; Adès, L; Black, C; Boxer, M; Brock, D J; Godfrey, M; Hayward, C; Karttunen, L; Milewicz, D; Peltonen, L; Richards, R I; Wang, M; Junien, C; Boileau, C
1997-01-01
Fibrillin is the major component of extracellular microfibrils. Mutations in the fibrillin gene on chromosome 15 (FBN1) were first described in the heritable connective tissue disorder Marfan syndrome (MFS). More recently, FBN1 has also been shown to harbor mutations related to a spectrum of conditions phenotypically related to MFS. These mutations are private, essentially missense, generally non-recurrent and widely distributed throughout the gene. To date no clear genotype/phenotype relationship has been observed, except for the localization of neonatal mutations in a cluster between exons 24 and 32. The second version of the computerized Marfan database contains 89 entries. The software has been modified to accommodate new functions and routines. PMID:9016526
2013-01-01
Background High-throughput (HT) technologies provide huge amounts of gene expression data that can be used to identify biomarkers useful in clinical practice. The most frequently used approaches first select a set of genes (i.e. a gene signature) able to characterize differences between two or more phenotypical conditions, and then provide a functional assessment of the selected genes with an a posteriori enrichment analysis, based on biological knowledge. However, this approach comes with some drawbacks. First, the gene selection procedure often requires tunable parameters that affect the outcome, typically producing many false hits. Second, a posteriori enrichment analysis is based on a mapping between biological concepts and gene expression measurements, which is hard to compute because of constant changes in biological knowledge and genome analysis. Third, such a mapping is typically used in the assessment of the coverage of the gene signature by biological concepts, which is either score-based or requires tunable parameters as well, limiting its power. Results We present Knowledge Driven Variable Selection (KDVS), a framework that uses a priori biological knowledge in HT data analysis. The expression data matrix is transformed, according to prior knowledge, into smaller matrices, easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike most approaches, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Unlike the standard approach, where gene selection and functional assessment are applied independently, KDVS embeds these two steps into a unified statistical framework, decreasing the variability derived from the threshold-dependent selection, the mapping to the biological concepts, and the signature coverage. We present three case studies to assess the usefulness of the method.
Conclusions We showed that KDVS not only enables the selection of known biological functionalities with accuracy, but also the identification of new ones. An efficient implementation of KDVS was devised to obtain results in a fast and robust way. Computing time is drastically reduced by the effective use of distributed resources. Finally, integrated visualization techniques immediately increase the interpretability of results. Overall, the KDVS approach can be considered a viable alternative to enrichment-based approaches. PMID:23302187
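The transformation step described in this abstract, in which the expression matrix is split into smaller matrices according to prior knowledge, might look roughly like the sketch below. It is not taken from the KDVS implementation: the function name, the dictionary-based term-to-gene mapping, and the genes-in-rows layout are all assumptions made for illustration.

```python
import numpy as np


def split_by_prior_knowledge(expression, gene_ids, term_to_genes):
    """Build one sub-matrix per biological term.

    expression    : 2-D array, rows = genes, columns = samples (assumed layout)
    gene_ids      : gene identifier for each row of `expression`
    term_to_genes : hypothetical mapping {term: [gene identifiers]}
    Returns {term: (measured_genes, submatrix)}; terms with no measured gene are skipped.
    """
    row_of = {gene: i for i, gene in enumerate(gene_ids)}
    submatrices = {}
    for term, genes in term_to_genes.items():
        measured = [g for g in genes if g in row_of]
        if measured:
            rows = [row_of[g] for g in measured]
            submatrices[term] = (measured, expression[rows, :])
    return submatrices


if __name__ == "__main__":
    # Toy data: 4 genes x 3 samples, with made-up term annotations.
    expression = np.arange(12).reshape(4, 3)
    gene_ids = ["g1", "g2", "g3", "g4"]
    terms = {"termA": ["g1", "g3"], "termB": ["g4", "gX"], "termC": ["gY"]}
    for term, (genes, sub) in split_by_prior_knowledge(expression, gene_ids, terms).items():
        print(term, genes, sub.shape)
```

Each sub-matrix can then be analyzed on its own, which is the sense in which no function or process is excluded before the statistical step.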
Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A
2013-09-02
In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis.
The obtained genome-wide collection of reference RNA motif regulons is available in the RegPrecise database (http://regprecise.lbl.gov/).
Zarzycki, Jan; Sutter, Markus; Cortina, Niña Socorro; Erb, Tobias J; Kerfeld, Cheryl A
2017-02-16
Many bacteria encode proteinaceous bacterial microcompartments (BMCs) that encapsulate sequential enzymatic reactions of diverse metabolic pathways. Well-characterized BMCs include carboxysomes for CO2-fixation, and propanediol- and ethanolamine-utilizing microcompartments that contain B12-dependent enzymes. Genes required to form BMCs are typically organized in gene clusters, which promoted their distribution across phyla by horizontal gene transfer. Recently, BMCs associated with glycyl radical enzymes (GREs) were discovered; these are widespread and comprise at least three functionally distinct types. Previously, we predicted one type of these GRE-associated microcompartments (GRMs) represents a B12-independent propanediol-utilizing BMC. Here we functionally and structurally characterize enzymes of the GRM of Rhodopseudomonas palustris BisB18 and demonstrate their concerted function in vitro. The GRM signature enzyme, the GRE, is a dedicated 1,2-propanediol dehydratase with a new type of intramolecular encapsulation peptide. It forms a complex with its activating enzyme and, in conjunction with an aldehyde dehydrogenase, converts 1,2-propanediol to propionyl-CoA. Notably, homologous GRMs are also encoded in pathogenic Escherichia coli strains. Our high-resolution crystal structures of the aldehyde dehydrogenase lead to a revised reaction mechanism. The successful in vitro reconstitution of a part of the GRM metabolism provides insights into the metabolic function and steps in the assembly of this BMC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zarzycki, Jan; Sutter, Markus; Cortina, Niña Socorro
Many bacteria encode proteinaceous bacterial microcompartments (BMCs) that encapsulate sequential enzymatic reactions of diverse metabolic pathways. Well-characterized BMCs include carboxysomes for CO2-fixation, and propanediol- and ethanolamine-utilizing microcompartments that contain B12-dependent enzymes. Genes thus required to form BMCs are typically organized in gene clusters, which promoted their distribution across phyla by horizontal gene transfer. Recently, BMCs associated with glycyl radical enzymes (GREs) were discovered; these are widespread and comprise at least three functionally distinct types. Previously, we predicted one type of these GRE-associated microcompartments (GRMs) represents a B12-independent propanediol-utilizing BMC. We functionally and structurally characterize enzymes of the GRM of Rhodopseudomonas palustris BisB18 and demonstrate their concerted function in vitro. The GRM signature enzyme, the GRE, is a dedicated 1,2-propanediol dehydratase with a new type of intramolecular encapsulation peptide. It forms a complex with its activating enzyme and, in conjunction with an aldehyde dehydrogenase, converts 1,2-propanediol to propionyl-CoA. Notably, homologous GRMs are also encoded in pathogenic Escherichia coli strains. Our high-resolution crystal structures of the aldehyde dehydrogenase lead to a revised reaction mechanism. The successful in vitro reconstitution of a part of the GRM metabolism provides insights into the metabolic function and steps in the assembly of this BMC.
Genome-Wide Analysis of the NAC Gene Family in Physic Nut (Jatropha curcas L.)
Wu, Zhenying; Xu, Xueqin; Xiong, Wangdan; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Wu, Guojiang; Jiang, Huawu
2015-01-01
The NAC proteins (NAM, ATAF1/2 and CUC2) are plant-specific transcriptional regulators that have a conserved NAM domain in the N-terminus. They are involved in various biological processes, including both biotic and abiotic stress responses. In the present study, a total of 100 NAC genes (JcNAC) were identified in physic nut (Jatropha curcas L.). Based on phylogenetic analysis and gene structures, 83 JcNAC genes were classified as members of, or proposed to be diverged from, 39 previously predicted orthologous groups (OGs) of NAC sequences. Physic nut has a single intron-containing NAC gene subfamily that has been lost in many plants. The JcNAC genes are non-randomly distributed across the 11 linkage groups of the physic nut genome, and appear to be preferentially retained duplicates that arose from both ancient and recent duplication events. Digital gene expression analysis indicates that some of the JcNAC genes have tissue-specific expression profiles (e.g. in leaves, roots, stem cortex or seeds), and 29 genes differentially respond to abiotic stresses (drought, salinity, phosphorus deficiency and nitrogen deficiency). Our results will be helpful for further functional analysis of the NAC genes in physic nut. PMID:26125188
The Mechanism of Gene Targeting in Human Somatic Cells
Kan, Yinan; Ruis, Brian; Lin, Sherry; Hendrickson, Eric A.
2014-01-01
Gene targeting in human somatic cells is of importance because it can be used to either delineate the loss-of-function phenotype of a gene or correct a mutated gene back to wild-type. Both of these outcomes require a form of DNA double-strand break (DSB) repair known as homologous recombination (HR). The mechanism of HR leading to gene targeting, however, is not well understood in human cells. Here, we demonstrate that a two-end, ends-out HR intermediate is valid for human gene targeting. Furthermore, the resolution step of this intermediate occurs via the classic DSB repair model of HR while synthesis-dependent strand annealing and Holliday Junction dissolution are, at best, minor pathways. Moreover, and in contrast to other systems, the positions of Holliday Junction resolution are evenly distributed along the homology arms of the targeting vector. Most unexpectedly, we demonstrate that when a meganuclease is used to introduce a chromosomal DSB to augment gene targeting, the mechanism of gene targeting is inverted to an ends-in process. Finally, we demonstrate that the anti-recombination activity of mismatch repair is a significant impediment to gene targeting. These observations significantly advance our understanding of HR and gene targeting in human cells. PMID:24699519
Du, Jiancan; Hu, Simin; Yu, Qin; Wang, Chongde; Yang, Yunqiang; Sun, Hang; Yang, Yongping; Sun, Xudong
2017-01-01
The teosinte branched1/cycloidea/proliferating cell factor (TCP) gene family encodes plant-specific transcription factors that participate in the control of plant development by regulating cell proliferation. However, no report is currently available about this gene family in turnips (Brassica rapa ssp. rapa). In this study, a genome-wide analysis of TCP genes was performed in turnips. Thirty-nine TCP genes were identified in the turnip genome, distributed across 10 chromosomes. Phylogenetic analysis clearly showed that the family was classified into two clades: class I and class II. Gene structure and conserved motif analysis showed that genes in the same clade have similar gene structures and conserved motifs. The expression profiles of the 39 TCP genes were determined through quantitative real-time PCR. Most CIN-type BrrTCP genes were highly expressed in leaf. The members of the CYC/TB1 subclade were highly expressed in flower bud and weakly expressed in root. By contrast, the class I clade showed more widespread but less tissue-specific expression patterns. Yeast two-hybrid data show that BrrTCP proteins preferentially formed heterodimers. The function of BrrTCP2 was confirmed through ectopic expression of BrrTCP2 in wild-type Arabidopsis and in a loss-of-function ortholog mutant of Arabidopsis. Overexpression of BrrTCP2 in wild-type Arabidopsis resulted in diminished leaf size. Overexpression of BrrTCP2 in the tcp2/4/10 triple mutant restored the leaf phenotype to that of the wild type. This comprehensive analysis of the turnip TCP gene family provides a foundation for further study of the roles of TCP genes in turnips.
Constant, Philippe; Chowdhury, Soumitra Paul; Hesse, Laura; Pratscher, Jennifer; Conrad, Ralf
2011-01-01
Streptomyces soil isolates exhibiting the unique ability to oxidize atmospheric H2 possess genes specifying a putative high-affinity [NiFe]-hydrogenase. This study was undertaken to explore the taxonomic diversity and the ecological importance of this novel functional group. We propose to designate the genes encoding the small and large subunits of the putative high-affinity hydrogenase hhyS and hhyL, respectively. Genome data mining revealed that the hhyL gene is unevenly distributed in the phyla Actinobacteria, Proteobacteria, Chloroflexi, and Acidobacteria. The hhyL gene sequences comprised a phylogenetically distinct group, namely, the group 5 [NiFe]-hydrogenase genes. The presumptive high-affinity H2-oxidizing bacteria constituting group 5 were shown to possess a hydrogenase gene cluster, including the genes encoding auxiliary and structural components of the enzyme and four additional open reading frames (ORFs) of unknown function. A soil survey confirmed that both high-affinity H2 oxidation activity and the hhyL gene are ubiquitous. A quantitative PCR assay revealed that soil contained 10^6 to 10^8 hhyL gene copies g (dry weight)^-1. Assuming one hhyL gene copy per genome, the abundance of presumptive high-affinity H2-oxidizing bacteria was higher than the maximal population size for which maintenance energy requirements would be fully supplied through the H2 oxidation activity measured in soil. Our data indicate that the abundance of the hhyL gene should not be taken as a reliable proxy for the uptake of atmospheric H2 by soil, because high-affinity H2 oxidation is a facultatively mixotrophic metabolism, and microorganisms harboring a nonfunctional group 5 [NiFe]-hydrogenase may occur. PMID:21742924
Rey, Elodie; Abrouk, Michael; Keeble-Gagnère, Gabriel; Karafiátová, Miroslava; Vrána, Jan; Balzergue, Sandrine; Soubigou-Taconnat, Ludivine; Brunaud, Véronique; Martin-Magniette, Marie-Laure; Endo, Takashi R; Bartoš, Jan; Appels, Rudi; Doležel, Jaroslav
2018-03-06
Despite a long history, the production of useful alien introgression lines in wheat remains difficult mainly due to linkage drag and incomplete genetic compensation. In addition, little is known about the molecular mechanisms underlying the impact of foreign chromatin on plant phenotype. Here, a comparison of the transcriptomes of barley, wheat and a wheat-barley 7HL addition line allowed the transcriptional impact both on 7HL genes of a non-native genetic background and on the wheat gene complement as a result of the presence of 7HL to be assessed. Some 42% (389/923) of the 7HL genes assayed were differentially transcribed, which was the case for only 3% (960/35 301) of the wheat gene complement. The absence of any transcript in the addition line of a suite of chromosome 7A genes implied the presence of a 36 Mbp deletion at the distal end of the 7AL arm; this deletion was found to be in common across the full set of Chinese Spring/Betzes barley addition lines. The remaining differentially transcribed wheat genes were distributed across the whole genome. The up-regulated barley genes were mostly located in the proximal part of the 7HL arm, while the down-regulated ones were concentrated in the distal part; as a result, genes encoding basal cellular functions tended to be transcribed, while those encoding specific functions were suppressed. An insight has been gained into gene transcription in an alien introgression line, thereby providing a basis for understanding the interactions between wheat and exotic genes in introgression materials. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Gjini, Erida; Haydon, Daniel T; David Barry, J; Cobbold, Christina A
2014-01-21
Genetic diversity in multigene families is shaped by multiple processes, including gene conversion and point mutation. Because multi-gene families are involved in crucial traits of organisms, quantifying the rates of their genetic diversification is important. With increasing availability of genomic data, there is a growing need for quantitative approaches that integrate the molecular evolution of gene families with their higher-scale function. In this study, we integrate a stochastic simulation framework with population genetics theory, namely the diffusion approximation, to investigate the dynamics of genetic diversification in a gene family. Duplicated genes can diverge and encode new functions as a result of point mutation, and become more similar through gene conversion. To model the evolution of pairwise identity in a multigene family, we first consider all conversion and mutation events in a discrete manner, keeping track of their details and times of occurrence; second we consider only the infinitesimal effect of these processes on pairwise identity accounting for random sampling of genes and positions. The purely stochastic approach is closer to biological reality and is based on many explicit parameters, such as conversion tract length and family size, but is more challenging analytically. The population genetics approach is an approximation accounting implicitly for point mutation and gene conversion, only in terms of per-site average probabilities. Comparison of these two approaches across a range of parameter combinations reveals that they are not entirely equivalent, but that for certain relevant regimes they do match. As an application of this modelling framework, we consider the distribution of nucleotide identity among VSG genes of African trypanosomes, representing the most prominent example of a multi-gene family mediating parasite antigenic variation and within-host immune evasion. © 2013 Published by Elsevier Ltd. All rights reserved.
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
Jue, Dengwei; Sang, Xuelian; Lu, Shengqiao; Dong, Chen; Zhao, Qiufang; Chen, Hongliang; Jia, Liqiang
2015-01-01
Background: Ubiquitination is a post-translational modification in which ubiquitin is attached to a substrate. Ubiquitin-conjugating enzymes (E2s) play a major role in the ubiquitin transfer pathway, as well as a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). Methodology/Principal Findings: In the present study, a total of 75 putative ZmUBC genes were identified and located in the maize genome. Phylogenetic analysis revealed that ZmUBC proteins could be divided into 15 subfamilies, which include 13 ubiquitin-conjugating enzyme (ZmE2) groups and two independent ubiquitin-conjugating enzyme variant (UEV) groups. The predicted ZmUBC genes were distributed across 10 chromosomes at different densities. In addition, analysis of exon-intron junctions and sequence motifs in each candidate gene revealed high levels of conservation within and between phylogenetic groups. Tissue expression analysis indicated that most ZmUBC genes were expressed in at least one of the tissues, suggesting that they are involved in various physiological and developmental processes in maize. Moreover, expression profile analyses of ZmUBC genes under different stress treatments (4°C, 20% PEG6000, and 200 mM NaCl) revealed various expression patterns, indicating that these genes may play crucial roles in the response of plants to stress. Conclusions: Genome-wide identification, chromosome organization, gene structure, evolutionary and expression analyses of ZmUBC genes have facilitated the characterization of this gene family and indicated its potential involvement in growth, development, and stress responses. This study provides valuable information for better understanding the classification and putative functions of the UBC-encoding genes of maize. PMID:26606743
Karanja, Bernard Kinuthia; Fan, Lianxue; Xu, Liang; Wang, Yan; Zhu, Xianwen; Tang, Mingjia; Wang, Ronghua; Zhang, Fei; Muleke, Everlyne M'mbone; Liu, Liwang
2017-11-01
The radish WRKY gene family was identified genome-wide, and its members play critical roles in the response to multiple abiotic stresses. WRKY is among the largest families of transcription factors (TFs) and is associated with multiple biological activities required for plant survival, including the control of response mechanisms against abiotic stresses such as heat, salinity, and heavy metals. Radish is an important root vegetable crop, and therefore characterization and expression pattern investigation of WRKY transcription factors in radish is imperative. In the present study, 126 putative WRKY genes were retrieved from the radish genome database. Scrutiny of protein sequences and annotations confirmed that RsWRKY proteins possessed highly conserved domains and a zinc finger motif. Based on the phylogenetic analysis, RsWRKY candidate genes were divided into three groups (Groups I, II and III) with 31, 74, and 20 members, respectively. Additionally, gene structure analysis revealed that intron-exon patterns of the WRKY genes are highly conserved in radish. Linkage map analysis indicated that RsWRKY genes were distributed with varying densities over nine linkage groups. Further, RT-qPCR analysis illustrated the significant variation of 36 RsWRKY genes under one or more abiotic stress treatments, implying that they might be stress-responsive genes. In total, 126 WRKY TFs were identified from the R. sativus genome, of which 35 showed abiotic stress-induced expression patterns. These results provide a genome-wide characterization of RsWRKY TFs and a baseline for further functional dissection and molecular evolution investigation, specifically for improving abiotic stress resistance with the ultimate goal of increasing the yield and quality of radish.
Pang, Xiaocong; Zhao, Ying; Wang, Jinhua; Zhou, Qimeng; Xu, Lvjie; Kang, De
2017-01-01
Aim: The incidence of Alzheimer's disease (AD) has been increasing in recent years, but there exists no cure and the pathological mechanisms are not fully understood. This study aimed to investigate the pathogenesis of learning and memory impairment and to identify new biomarkers, potential therapeutic targets, and drugs for AD. Methods: We downloaded the microarray data of entorhinal cortex (EC) and hippocampus (HIP) of AD patients and controls from the Gene Expression Omnibus (GEO) database, and then the differentially expressed genes (DEGs) in the EC and HIP regions were analyzed for functional and pathway enrichment. Furthermore, we utilized the DEGs to construct coexpression networks to identify hub genes and discover small molecules capable of reversing the gene expression profile of AD. Finally, we also analyzed microarray and RNA-seq datasets of blood samples to find biomarkers related to gene expression in the brain. Results: We found some functional hub genes, such as ErbB2, ErbB4, OCT3, MIF, CDK13, and GPI. According to GO and KEGG pathway enrichment, several pathways were significantly dysregulated in EC and HIP. CTSD and VCAM1 were dysregulated significantly in blood, EC, and HIP, making them potential biomarkers for AD. Target genes of four microRNAs had a GO term distribution similar to that of the DEGs in EC and HIP. In addition, small molecules were screened out for AD treatment. Conclusion: These biological pathways and DEGs or hub genes will be useful to elucidate AD pathogenesis and identify novel biomarkers or drug targets for developing improved diagnostics and therapeutics against AD. PMID:29359159
Two Paralogous Families of a Two-Gene Subtilisin Operon Are Widely Distributed in Oral Treponemes
Correia, Frederick F.; Plummer, Alvin R.; Ellen, Richard P.; Wyss, Chris; Boches, Susan K.; Galvin, Jamie L.; Paster, Bruce J.; Dewhirst, Floyd E.
2003-01-01
Certain oral treponemes express a highly proteolytic phenotype and have been associated with periodontal diseases. The periodontal pathogen Treponema denticola produces dentilisin, a serine protease of the subtilisin family. The two-gene operon prcA-prtP is required for expression of active dentilisin (PrtP), a putative lipoprotein attached to the treponeme's outer membrane or sheath. The purpose of this study was to examine the diversity and structure of treponemal subtilisin-like proteases in order to better understand their distribution and function. The complete sequences of five prcA-prtP operons were determined for Treponema lecithinolyticum, “Treponema vincentii,” and two canine species. Partial operon sequences were obtained for T. socranskii subsp. 04 as well as 450- to 1,000-base fragments of prtP genes from four additional treponeme strains. Phylogenetic analysis demonstrated that the sequences fall into two paralogous families. The first family includes the sequence from T. denticola. Treponemes possessing this operon family express chymotrypsin-like protease activity and can cleave the substrate N-succinyl-alanyl-alanyl-prolyl-phenylalanine-p-nitroanilide (SAAPFNA). Treponemes possessing the second paralog family do not possess chymotrypsin-like activity or cleave SAAPFNA. Despite examination of a range of protein and peptide substrates, the specificity of the second protease family remains unknown. Each of the fully sequenced prcA and prtP genes contains a 5′ hydrophobic leader sequence with a treponeme lipobox. The two paralogous families of treponeme subtilisins represent a new subgroup within the subtilisin family of proteases and are the only subtilisin lipoprotein family. The present study demonstrated that the subtilisin paralogs comprising a two-gene operon are widely distributed among treponemes. PMID:14617650
Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun
2014-11-25
The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. 
It is important to know whether there is a set of genes that is not only conserved in position among isolates but also functionally essential for a given species, and to further evaluate the stability or flexibility of such genome structures across lineages. Based on a large number of multi-isolate pangenomic data, our analysis reveals that a subset of core genes is organized into a core-gene-defined genome organizational framework, or cGOF. Furthermore, the lineage-associated cGOFs among Gram-positive and Gram-negative bacteria behave differently: the former, composed of 2 to 4 segments, have their fragments symmetrically rearranged around the origin-terminus axis, whereas the latter show more complex segmentation and are partitioned asymmetrically into chromosomal structures. The definition of cGOFs provides new insights into prokaryotic genome organization and efficient guidance for genome assembly and analysis. Copyright © 2014 Kang et al.
Villada, Juan C.; Brustolini, Otávio José Bernardes
2017-01-01
Abstract Gene codon optimization may be impaired by the misinterpretation of frequency and optimality of codons. Although recent studies have revealed the effects of codon usage bias (CUB) on protein biosynthesis, an integrated perspective of the biological role of individual codons is still lacking. Unlike previous studies, we show, through an integrated framework, that attributes of codons such as frequency, optimality and positional dependency should be combined to unveil the contribution of individual codons to protein biosynthesis. We designed a codon quantification method for assessing CUB as a function of position within genes with a novel constraint: the relativity of position-dependent codon usage shaped by coding sequence length. Thus, we propose a new way of identifying the enrichment, depletion and non-uniform positional distribution of codons in different regions of yeast genes. We clustered codons that shared attributes of frequency and optimality. The cluster of non-optimal codons with rare occurrence displayed two remarkable characteristics: a higher codon decoding time than the frequent–non-optimal cluster and enrichment at the 5′-end region, where optimal codons with the highest frequency are depleted. Interestingly, frequent codons with non-optimal adaptation to tRNAs are uniformly distributed in the Saccharomyces cerevisiae genes, suggesting their determinant role as a speed regulator in protein elongation. PMID:28449100
Villada, Juan C; Brustolini, Otávio José Bernardes; Batista da Silveira, Wendel
2017-08-01
Gene codon optimization may be impaired by the misinterpretation of frequency and optimality of codons. Although recent studies have revealed the effects of codon usage bias (CUB) on protein biosynthesis, an integrated perspective of the biological role of individual codons is still lacking. Unlike previous studies, we show, through an integrated framework, that attributes of codons such as frequency, optimality and positional dependency should be combined to unveil the contribution of individual codons to protein biosynthesis. We designed a codon quantification method for assessing CUB as a function of position within genes with a novel constraint: the relativity of position-dependent codon usage shaped by coding sequence length. Thus, we propose a new way of identifying the enrichment, depletion and non-uniform positional distribution of codons in different regions of yeast genes. We clustered codons that shared attributes of frequency and optimality. The cluster of non-optimal codons with rare occurrence displayed two remarkable characteristics: a higher codon decoding time than the frequent-non-optimal cluster and enrichment at the 5'-end region, where optimal codons with the highest frequency are depleted. Interestingly, frequent codons with non-optimal adaptation to tRNAs are uniformly distributed in the Saccharomyces cerevisiae genes, suggesting their determinant role as a speed regulator in protein elongation. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
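The idea of measuring codon usage relative to coding sequence length can be illustrated with a minimal sketch. It is not the authors' quantification method: the function name `codon_position_profile`, the choice of equal-width relative bins, and the default of ten bins are all assumptions made for illustration. Each codon is assigned to a bin by its fractional position in the gene, so that genes of different lengths contribute to the same positional scale.

```python
from collections import Counter
from typing import List


def codon_position_profile(cds: str, n_bins: int = 10) -> List[Counter]:
    """Count codons in relative-position bins along one coding sequence.

    Hypothetical helper: bin 0 covers the 5' end and bin n_bins - 1 the
    3' end, independent of the absolute length of the sequence.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    bins = [Counter() for _ in range(n_bins)]
    n_codons = len(codons)
    for idx, codon in enumerate(codons):
        # Integer arithmetic maps idx in [0, n_codons) onto [0, n_bins).
        bins[idx * n_bins // n_codons][codon] += 1
    return bins


if __name__ == "__main__":
    profile = codon_position_profile("ATGAAAGGGTAA", n_bins=2)
    for b, counts in enumerate(profile):
        print(b, dict(counts))
```

Summing the per-bin counters over many genes would give a positional profile per codon, which could then be compared against a uniform expectation to flag enrichment or depletion near the 5′ end.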
Neurogenic gene regulatory pathways in the sea urchin embryo.
Wei, Zheng; Angerer, Lynne M; Angerer, Robert C
2016-01-15
During embryogenesis the sea urchin early pluteus larva differentiates 40-50 neurons marked by expression of the pan-neural marker synaptotagmin B (SynB) that are distributed along the ciliary band, in the apical plate and pharyngeal endoderm, and 4-6 serotonergic neurons that are confined to the apical plate. Development of all neurons has been shown to depend on the function of Six3. Using a combination of molecular screens and tests of gene function by morpholino-mediated knockdown, we identified SoxC and Brn1/2/4, which function sequentially in the neurogenic regulatory pathway and are also required for the differentiation of all neurons. Misexpression of Brn1/2/4 at low dose caused an increase in the number of serotonin-expressing cells and at higher dose converted most of the embryo to a neurogenic epithelial sphere expressing the Hnf6 ciliary band marker. A third factor, Z167, was shown to work downstream of the Six3 and SoxC core factors and to define a branch specific for the differentiation of serotonergic neurons. These results provide a framework for building a gene regulatory network for neurogenesis in the sea urchin embryo. © 2016. Published by The Company of Biologists Ltd.
Molecular Phylogeny of Heme Peroxidases
NASA Astrophysics Data System (ADS)
Zámocký, Marcel; Obinger, Christian
All currently available gene sequences of heme peroxidases can be phylogenetically divided into two superfamilies and three families. In this chapter, the phylogenetics and genomic distribution of each group are presented. Within the peroxidase-cyclooxygenase superfamily, the main evolutionary direction developed peroxidatic heme proteins involved in the innate immune defense system and in biosynthesis of (iodinated) hormones. The peroxidase-catalase superfamily is widely spread mainly among bacteria, fungi, and plants, and particularly in Class I led to the evolution of bifunctional catalase-peroxidases. Its numerous fungal representatives of Class II are involved in carbon recycling via lignin degradation, whereas Class III secretory peroxidases from algae and plants are included in various forms of secondary metabolism. The family of di-heme peroxidases are predominantly bacteria-inducible enzymes; however, a few corresponding genes were also detected in archaeal genomes. Four subfamilies of dyp-type peroxidases capable of degradation of various xenobiotics are abundant mainly among bacteria and fungi. Heme-haloperoxidase genes are widely spread among sac and club fungi, but corresponding genes were recently found also among oomycetes. All families described herein represent heme peroxidases of broad diversity in structure and function. Our accumulating knowledge about the evolution of various enzymatic functions and physiological roles can be exploited in future directed evolution approaches for engineering peroxidase genes de novo for various demands.
Vatansever, Recep; Koc, Ibrahim; Ozyigit, Ibrahim Ilker; Sen, Ugur; Uras, Mehmet Emin; Anjum, Naser A; Pereira, Eduarda; Filiz, Ertugrul
2016-12-01
Solanum tuberosum genome analysis revealed 12 StSULTR genes encoding 18 transcripts. Among genes annotated at group level (StSULTR I-IV), group III members formed the largest SULTR cluster and were potentially involved in biotic/abiotic stress responses via various regulatory factors, and stress and signaling proteins. Employing bioinformatics tools, this study performed genome-wide identification and expression analysis of SULTR (StSULTR) genes in potato (Solanum tuberosum L.). A very strict homology search and subsequent domain verification with a Hidden Markov Model revealed 12 StSULTR genes encoding 18 transcripts. StSULTR genes were mapped on seven S. tuberosum chromosomes. StSULTR genes were also annotated as StSULTR I-IV at group level, based mainly on their phylogenetic distribution with Arabidopsis SULTRs. Several tandem and segmental duplications were identified between StSULTR genes. Among these duplications, Ka/Ks ratios indicated the neutral nature of mutations that might not be under any selection. Two segmental duplications and one tandem duplication were estimated to have occurred around 147.69, 180.80 and 191.00 million years ago (MYA), approximately corresponding to the time of monocot/dicot divergence. Two other segmental duplications were found to have occurred around 61.23 and 67.83 MYA, which is very close to the origination of monocotyledons. Most cis-regulatory elements in StSULTRs were found to be associated with major hormones (such as abscisic acid and methyl jasmonate), and with defense and stress responsiveness. The cis-element distribution in duplicated gene pairs indicated the contribution of duplication events to neofunctionalization in StSULTR genes. Notably, RNAseq data analyses unveiled expression profiles of StSULTR genes under different stress conditions. In particular, expression profiles of StSULTR III members suggested their involvement in plant stress responses. 
Additionally, gene co-expression networks of these group members included various regulatory factors, stress and signaling proteins, and housekeeping and some other proteins with unknown functions.
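Duplication ages of the kind quoted above are commonly derived from synonymous divergence. A minimal sketch follows, assuming the widely used relation T = Ks / (2λ), where λ is the synonymous substitution rate per site per year. The abstract does not state the rate or the Ks values used, so the default rate (6.5e-9), the example Ks, the neutrality tolerance, and both function names are illustrative assumptions rather than values from the study.

```python
def duplication_age_mya(ks: float, rate_per_site_per_year: float = 6.5e-9) -> float:
    """Approximate duplication age in million years from synonymous divergence.

    Uses T = Ks / (2 * lambda); the default rate is an assumed placeholder.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


def selection_mode(ka: float, ks: float, tolerance: float = 0.1) -> str:
    """Classify a duplicate pair by its Ka/Ks ratio.

    Ratios within `tolerance` of 1 are called neutral (the tolerance is an
    arbitrary illustrative threshold), below that purifying, above positive.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


if __name__ == "__main__":
    print(duplication_age_mya(1.3))   # about 100 MYA under the assumed rate
    print(selection_mode(0.5, 0.5))   # neutral
```

Because the age scales inversely with λ, the choice of substitution rate dominates the estimate; any comparison with divergence times should use a rate calibrated for the lineage in question.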
Drath, Miriam; Baier, Kerstin; Forchhammer, Karl
2009-05-01
Methionine aminopeptidases (MetAPs or MAPs, encoded by map genes) are ubiquitous and pivotal enzymes for protein maturation in all living organisms. Whereas most bacteria harbour only one map gene, many cyanobacterial genomes contain two map paralogues, the genome of Synechocystis sp. PCC 6803 even three. The physiological function of multiple map paralogues remains elusive so far. This communication reports for the first time differential MetAP function in a cyanobacterium. In Synechocystis sp. PCC 6803, the universally conserved mapC gene (sll0555) is predominantly expressed in exponentially growing cells and appears to be a housekeeping gene. By contrast, expression of mapA (slr0918) and mapB (slr0786) genes increases during stress conditions. The mapB paralogue is only transiently expressed, whereas the widely distributed mapA gene appears to be the major MetAP during stress conditions. A mapA-deficient Synechocystis mutant shows a subtle impairment of photosystem II properties even under non-stressed conditions. In particular, the binding site for the quinone Q(B) is affected, indicating specific N-terminal methionine processing requirements of photosystem II components. MAP-A-specific processing becomes essential under certain stress conditions, since the mapA-deficient mutant is severely impaired in surviving conditions of prolonged nitrogen starvation and high light exposure.
Green, Robert; Hanfrey, Colin C.; Elliott, Katherine A.; McCloskey, Diane E.; Wang, Xiaojing; Kanugula, Sreenivas; Pegg, Anthony E.; Michael, Anthony J.
2011-01-01
Summary We have identified gene fusions of polyamine biosynthetic enzymes S-adenosylmethionine decarboxylase (AdoMetDC, speD) and aminopropyltransferase (speE) orthologues in diverse bacterial phyla. Both domains are functionally active and we demonstrate the novel de novo synthesis of the triamine spermidine from the diamine putrescine by fusion enzymes from β-proteobacterium Delftia acidovorans and δ-proteobacterium Syntrophus aciditrophicus, in a ΔspeDE gene deletion strain of Salmonella enterica sv. Typhimurium. Fusion proteins from marine α-proteobacterium Candidatus Pelagibacter ubique, actinobacterium Nocardia farcinica, chlorobi species Chloroherpeton thalassium, and β-proteobacterium Delftia acidovorans each produce a different profile of non-native polyamines including sym-norspermidine when expressed in Escherichia coli. The different aminopropyltransferase activities together with phylogenetic analysis confirm independent evolutionary origins for some fusions. Comparative genomic analysis strongly indicates that gene fusions arose by merger of adjacent open reading frames. Independent fusion events, and horizontal and vertical gene transfer contributed to the scattered phyletic distribution of the gene fusions. Surprisingly, expression of fusion genes in E. coli and S. Typhimurium revealed novel latent spermidine catabolic activity producing non-native 1,3-diaminopropane in these species. We have also identified fusions of polyamine biosynthetic enzymes agmatine deiminase and N-carbamoylputrescine amidohydrolase in archaea, and of S-adenosylmethionine decarboxylase and ornithine decarboxylase in the single-celled green alga Micromonas. PMID:21762220
Clustering of change patterns using Fourier coefficients.
Kim, Jaehee; Kim, Haseong
2008-01-15
To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. 
It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.
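The feature-extraction step described above can be sketched as follows. This is an illustration rather than the authors' R program: it assumes equally spaced time points covering one period on [0, 1), uses the standard discrete estimates of the cosine and sine coefficients, and obtains the derivative coefficients by differentiating each harmonic term. The function names are hypothetical, and the model-based clustering itself is not reproduced.

```python
import math
from typing import List, Sequence, Tuple


def fourier_coefficients(y: Sequence[float], n_harmonics: int) -> List[Tuple[float, float]]:
    """Sample Fourier coefficients (a_k, b_k) for k = 1..n_harmonics.

    Assumes y[j] was observed at t_j = j / n, j = 0..n-1, and that
    n_harmonics is below n / 2.
    """
    n = len(y)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(y[j] * math.cos(2 * math.pi * k * j / n) for j in range(n))
        b_k = 2.0 / n * sum(y[j] * math.sin(2 * math.pi * k * j / n) for j in range(n))
        coeffs.append((a_k, b_k))
    return coeffs


def derivative_coefficients(coeffs: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Coefficients of the derivative of the fitted Fourier series.

    d/dt [a cos(2*pi*k*t) + b sin(2*pi*k*t)]
        = (2*pi*k*b) cos(2*pi*k*t) + (-2*pi*k*a) sin(2*pi*k*t)
    """
    return [(2 * math.pi * k * b, -2 * math.pi * k * a)
            for k, (a, b) in enumerate(coeffs, start=1)]


if __name__ == "__main__":
    n = 16
    profile = [math.sin(2 * math.pi * j / n) for j in range(n)]
    coeffs = fourier_coefficients(profile, n_harmonics=2)
    print(derivative_coefficients(coeffs))
```

Each expression profile is thereby reduced to a short vector of derivative coefficients, which could be passed to any Gaussian mixture implementation to approximate the model-based clustering step.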
NASA Astrophysics Data System (ADS)
Scholz, Jan; Dejori, Mathäus; Stetter, Martin; Greiner, Martin
2005-05-01
The impact of observational noise on the analysis of scale-free networks is studied. Various noise sources are modeled as random link removal, random link exchange and random link addition. Emphasis is on the resulting modifications for the node-degree distribution and for a functional ranking based on betweenness centrality. The implications for estimated gene-expressed networks for childhood acute lymphoblastic leukemia are discussed.
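One of the noise models named above, random link removal, and its effect on the node-degree distribution can be sketched in a few lines. This is a minimal illustration under stated assumptions: the network is an undirected edge list, a fixed fraction of links is deleted uniformly at random, and the helper names are hypothetical. Link exchange, link addition, and the betweenness-based ranking are not covered.

```python
import random
from collections import Counter
from typing import Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def remove_random_links(edges: List[Edge], fraction: float, rng: random.Random) -> List[Edge]:
    """Return a copy of the edge list with a random fraction of links removed."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    n_remove = int(round(fraction * len(edges)))
    # Sampling the survivors is equivalent to deleting n_remove random links.
    return rng.sample(edges, len(edges) - n_remove)


def degree_distribution(edges: List[Edge]) -> Counter:
    """Map each degree value to the number of nodes having that degree.

    Nodes that lose all their links no longer appear in the edge list and
    are therefore not counted.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


if __name__ == "__main__":
    # Toy star network: one hub connected to ten leaves.
    star = [(0, leaf) for leaf in range(1, 11)]
    noisy = remove_random_links(star, 0.3, random.Random(0))
    print(degree_distribution(star))
    print(degree_distribution(noisy))
```

Comparing the two distributions over repeated random draws would indicate how strongly a given noise level distorts the observed degree statistics of a hub-dominated network.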
The Gene Set Builder: collation, curation, and distribution of sets of genes
Yusuf, Dimas; Lim, Jonathan S; Wasserman, Wyeth W
2005-01-01
Background In bioinformatics and genomics, there are many applications designed to investigate the common properties for a set of genes. Often, these multi-gene analysis tools attempt to reveal sequential, functional, and expressional ties. However, while tremendous effort has been invested in developing tools that can analyze a set of genes, minimal effort has been invested in developing tools that can help researchers compile, store, and annotate gene sets in the first place. As a result, the process of making or accessing a set often involves tedious and time consuming steps such as finding identifiers for each individual gene. These steps are often repeated extensively to shift from one identifier type to another; or to recreate a published set. In this paper, we present a simple online tool which – with the help of the gene catalogs Ensembl and GeneLynx – can help researchers build and annotate sets of genes quickly and easily. Description The Gene Set Builder is a database-driven, web-based tool designed to help researchers compile, store, export, and share sets of genes. This application supports the 17 eukaryotic genomes found in version 32 of the Ensembl database, which includes species from yeast to human. User-created information such as sets and customized annotations are stored to facilitate easy access. Gene sets stored in the system can be "exported" in a variety of output formats – as lists of identifiers, in tables, or as sequences. In addition, gene sets can be "shared" with specific users to facilitate collaborations or fully released to provide access to published results. The application also features a Perl API (Application Programming Interface) for direct connectivity to custom analysis tools. A downloadable Quick Reference guide and an online tutorial are available to help new users learn its functionalities. 
Conclusion The Gene Set Builder is an Ensembl-facilitated online tool designed to help researchers compile and manage sets of genes in a user-friendly environment. The application can be accessed via . PMID:16371163
Flores-Ponce, Mitzi; Vallebueno-Estrada, Miguel; González-Orozco, Eduardo; Ramos-Aboites, Hilda E; García-Chávez, J Noé; Simões, Nelson; Montiel, Rafael
2017-04-26
The entomopathogenic nematode Steinernema carpocapsae has been used worldwide as a biocontrol agent for insect pests, making it an interesting model for understanding parasite-host interactions. Two models propose that these interactions are co-evolutionary processes in such a way that equilibrium is never reached. In one model, known as "arms race", new alleles in relevant genes are fixed in both host and pathogens by directional positive selection, producing recurrent and alternating selective sweeps. In the other model, known as "trench warfare", persistent dynamic fluctuations in allele frequencies are sustained by balancing selection. There are some examples of genes evolving according to both models; however, it is not clear to what extent these interactions might alter genome-level evolutionary patterns and intraspecific diversity. Here we investigate some of these aspects by studying genomic variation in S. carpocapsae and other pathogenic and free-living nematodes from phylogenetic clades IV and V. To look for signatures of an arms-race dynamic, we conducted massive scans to detect directional positive selection in interspecific data. In free-living nematodes, we detected a significantly higher proportion of genes with sites under positive selection than in parasitic nematodes. However, in these genes, we found more enriched Gene Ontology terms in parasites. To detect possible effects of dynamic polymorphism interactions, we looked for signatures of balancing selection in intraspecific genomic data. The observed distribution of Tajima's D values in S. carpocapsae was more skewed to positive values and significantly different from the observed distribution in the free-living Caenorhabditis briggsae. Also, the proportion of significant positive values of Tajima's D was elevated in genes that were differentially expressed after induction with insect tissues as compared to both non-differentially expressed genes and the global scan.
Our study provides a first portrait of the effects that lifestyle might have in shaping the patterns of selection at the genomic level. An arms-race between hosts and pathogens seems to be affecting specific genetic functions but not necessarily increasing the number of positively selected genes. Trench warfare dynamics seem to be acting more generally in the genome, likely focusing on genes responding to the interaction, rather than targeting specific genetic functions.
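The Tajima's D statistic used above as a signature of balancing selection can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `tajimas_d` and the toy alignments are hypothetical, and the code assumes gap-free, equal-length aligned sequences. It applies the standard formula, comparing mean pairwise diversity with the diversity expected from the number of segregating sites; positive values indicate an excess of intermediate-frequency variants.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, gap-free sequences (illustrative only)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if seg_sites == 0:
        return math.nan  # no variation, D is undefined

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    ) / n_pairs

    # Constants of the standard variance estimate.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

With toy data, an alignment whose variants are all singletons gives a negative D, while one whose variants are all at intermediate frequency gives a positive D, the direction reported above for S. carpocapsae.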
Larder, Rachel; Karali, Dimitra; Nelson, Nancy; Brown, Pamela
2006-12-01
GnRH binds its cognate G protein-coupled GnRH receptor (GnRHR) located on pituitary gonadotropes and drives expression of gonadotropin hormones. There are two gonadotropin hormones, comprised of a common alpha- and hormone-specific beta-subunit, which are required for gonadal function. Recently we identified that Fanconi anemia a (Fanca), a DNA damage repair gene, is differentially expressed within the LbetaT2 gonadotrope cell line in response to stimulation with GnRH. FANCA is mutated in more than 60% of cases of Fanconi anemia (FA), a rare genetically heterogeneous autosomal recessive disorder characterized by bone marrow failure, endocrine tissue cancer susceptibility, and infertility. Here we show that induction of FANCA protein is mediated by the GnRHR and that the protein constitutively adopts a nucleocytoplasmic intracellular distribution pattern. Using inhibitors to block nuclear import and export and a GnRHR antagonist, we demonstrated that GnRH induces nuclear accumulation of FANCA and green fluorescent protein (GFP)-FANCA before exporting back to the cytoplasm using the nuclear export receptor CRM1. Using FANCA point mutations that locate GFP-FANCA to the cytoplasm (H1110P) or functionally uncouple GFP-FANCA (Q1128E) from the wild-type nucleocytoplasmic distribution pattern, we demonstrated that wild-type FANCA was required for GnRH-induced activation of gonadotrope cell markers. Cotransfection of H1110P and Q1128E blocked GnRH activation of the alphaGSU and GnRHR but not the beta-subunit gene promoters. We conclude that nucleocytoplasmic shuttling of FANCA is required for GnRH transduction of the alphaGSU and GnRHR gene promoters and propose that FANCA functions as a GnRH-induced signal transducer.
Larder, Rachel; Karali, Dimitra; Nelson, Nancy; Brown, Pamela
2007-01-01
GnRH binds its cognate G protein-coupled GnRH receptor (GnRHR) located on pituitary gonadotropes and drives expression of gonadotropin hormones. There are two gonadotropin hormones, comprised of a common α- and hormone-specific β-subunit, which are required for gonadal function. Recently we identified that Fanconi anemia a (Fanca), a DNA damage repair gene, is differentially expressed within the LβT2 gonadotrope cell line in response to stimulation with GnRH. FANCA is mutated in more than 60% of cases of Fanconi anemia (FA), a rare genetically heterogeneous autosomal recessive disorder characterized by bone marrow failure, endocrine tissue cancer susceptibility, and infertility. Here we show that induction of FANCA protein is mediated by the GnRHR and that the protein constitutively adopts a nucleocytoplasmic intracellular distribution pattern. Using inhibitors to block nuclear import and export and a GnRHR antagonist, we demonstrated that GnRH induces nuclear accumulation of FANCA and green fluorescent protein (GFP)-FANCA before exporting back to the cytoplasm using the nuclear export receptor CRM1. Using FANCA point mutations that locate GFP-FANCA to the cytoplasm (H1110P) or functionally uncouple GFP-FANCA (Q1128E) from the wild-type nucleocytoplasmic distribution pattern, we demonstrated that wild-type FANCA was required for GnRH-induced activation of gonadotrope cell markers. Cotransfection of H1110P and Q1128E blocked GnRH activation of the αGSU and GnRHR but not the β-subunit gene promoters. We conclude that nucleocytoplasmic shuttling of FANCA is required for GnRH transduction of the αGSU and GnRHR gene promoters and propose that FANCA functions as a GnRH-induced signal transducer. PMID:16946016
Richa, Kumari; Balestra, Cecilia; Piredda, Roberta; Benes, Vladimir; Borra, Marco; Passarelli, Augusto; Margiotta, Francesca; Saggiomo, Maria; Biffali, Elio; Sanges, Remo; Scanlan, David J; Casotti, Raffaella
2017-09-01
Bacterioplankton are fundamental components of marine ecosystems and influence the entire biosphere by contributing to the global biogeochemical cycles of key elements. Yet, there is a significant gap in knowledge about their diversity and specific activities, as well as environmental factors that shape their community composition and function. Here, the distribution and diversity of surface bacterioplankton along the coastline of the Gulf of Naples (GON; Italy) were investigated using flow cytometry coupled with high-throughput sequencing of the 16S rRNA gene. Heterotrophic bacteria numerically dominated the bacterioplankton and comprised mainly Alphaproteobacteria, Gammaproteobacteria, and Bacteroidetes. Distinct communities occupied river-influenced, coastal, and offshore sites, as indicated by Bray-Curtis dissimilarity, distance metric (UniFrac), linear discriminant analysis effect size (LEfSe), and multivariate analyses. The heterogeneity in diversity and community composition was mainly due to salinity and changes in environmental conditions across sites, as defined by nutrient and chlorophyll a concentrations. Bacterioplankton communities were composed of a few dominant taxa and a large proportion (92%) of rare taxa (here defined as operational taxonomic units [OTUs] accounting for <0.1% of the total sequence abundance), the majority of which were unique to each site. The relationship between 16S rRNA and the 16S rRNA gene, i.e., between potential metabolic activity and abundance, was positive for the whole community. However, analysis of individual OTUs revealed high rRNA-to-rRNA gene ratios for most (71.6% ± 16.7%) of the rare taxa, suggesting that these low-abundance organisms were potentially active and hence might be playing an important role in ecosystem diversity and functioning in the GON.
IMPORTANCE The study of bacterioplankton in coastal zones is of critical importance, considering that these areas are highly productive and anthropogenically impacted. Their richness and evenness, as well as their potential activity, are very important to assess ecosystem health and functioning. Here, we investigated bacterial distribution, community composition, and potential metabolic activity in the GON, which is an ideal test site due to its heterogeneous environment characterized by complex hydrodynamics and terrestrial inputs of varied quantity and quality. Our study demonstrates that bacterioplankton communities in this region are highly diverse and strongly regulated by a combination of different environmental factors leading to their heterogeneous distribution, with the rare taxa contributing to a major proportion of diversity and shifts in community composition and potentially holding a key role in ecosystem functioning. Copyright © 2017 American Society for Microbiology.
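The rare-taxon definition and the rRNA-to-rRNA gene ratio described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's workflow: the function names, the OTU labels and the counts are hypothetical, and the ratio is assumed here to be the quotient of relative abundances in the rRNA (activity) and rRNA gene (abundance) libraries, which the abstract does not spell out.

```python
def relative_abundance(counts):
    """Convert raw OTU read counts into fractions of the library total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {otu: n / total for otu, n in counts.items()}


def rare_otus(gene_counts, threshold=0.001):
    """OTUs accounting for < 0.1% of total 16S rRNA gene sequence abundance."""
    rel = relative_abundance(gene_counts)
    return {otu for otu, frac in rel.items() if 0 < frac < threshold}


def rrna_to_gene_ratio(rrna_counts, gene_counts):
    """Assumed ratio: relative abundance in rRNA / relative abundance in rRNA gene.

    OTUs absent from the gene library are skipped to avoid division by zero.
    """
    rna = relative_abundance(rrna_counts)
    dna = relative_abundance(gene_counts)
    return {
        otu: rna.get(otu, 0.0) / frac for otu, frac in dna.items() if frac > 0
    }


# Hypothetical toy libraries: one dominant OTU and two rare ones.
gene_library = {"otu1": 9990, "otu2": 5, "otu3": 5}
rrna_library = {"otu1": 900, "otu2": 50, "otu3": 50}

rare = rare_otus(gene_library)
ratios = rrna_to_gene_ratio(rrna_library, gene_library)
```

In this toy case the two rare OTUs have ratios well above 1 while the dominant OTU is below 1, which mirrors the kind of pattern the abstract reports for rare taxa.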
Richa, Kumari; Balestra, Cecilia; Piredda, Roberta; Benes, Vladimir; Borra, Marco; Passarelli, Augusto; Margiotta, Francesca; Saggiomo, Maria; Biffali, Elio; Sanges, Remo; Scanlan, David J.
2017-01-01
ABSTRACT Bacterioplankton are fundamental components of marine ecosystems and influence the entire biosphere by contributing to the global biogeochemical cycles of key elements. Yet, there is a significant gap in knowledge about their diversity and specific activities, as well as environmental factors that shape their community composition and function. Here, the distribution and diversity of surface bacterioplankton along the coastline of the Gulf of Naples (GON; Italy) were investigated using flow cytometry coupled with high-throughput sequencing of the 16S rRNA gene. Heterotrophic bacteria numerically dominated the bacterioplankton and comprised mainly Alphaproteobacteria, Gammaproteobacteria, and Bacteroidetes. Distinct communities occupied river-influenced, coastal, and offshore sites, as indicated by Bray-Curtis dissimilarity, distance metric (UniFrac), linear discriminant analysis effect size (LEfSe), and multivariate analyses. The heterogeneity in diversity and community composition was mainly due to salinity and changes in environmental conditions across sites, as defined by nutrient and chlorophyll a concentrations. Bacterioplankton communities were composed of a few dominant taxa and a large proportion (92%) of rare taxa (here defined as operational taxonomic units [OTUs] accounting for <0.1% of the total sequence abundance), the majority of which were unique to each site. The relationship between 16S rRNA and the 16S rRNA gene, i.e., between potential metabolic activity and abundance, was positive for the whole community. However, analysis of individual OTUs revealed high rRNA-to-rRNA gene ratios for most (71.6% ± 16.7%) of the rare taxa, suggesting that these low-abundance organisms were potentially active and hence might be playing an important role in ecosystem diversity and functioning in the GON. 
IMPORTANCE The study of bacterioplankton in coastal zones is of critical importance, considering that these areas are highly productive and anthropogenically impacted. Their richness and evenness, as well as their potential activity, are very important to assess ecosystem health and functioning. Here, we investigated bacterial distribution, community composition, and potential metabolic activity in the GON, which is an ideal test site due to its heterogeneous environment characterized by complex hydrodynamics and terrestrial inputs of varied quantity and quality. Our study demonstrates that bacterioplankton communities in this region are highly diverse and strongly regulated by a combination of different environmental factors leading to their heterogeneous distribution, with the rare taxa contributing to a major proportion of diversity and shifts in community composition and potentially holding a key role in ecosystem functioning. PMID:28667110
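Bray-Curtis dissimilarity, one of the measures cited above for separating river-influenced, coastal and offshore communities, can be shown with a short sketch. The function name and the site counts are hypothetical; the formula is the standard one, the sum of absolute count differences divided by the total counts of both samples, giving 0 for identical samples and 1 for samples with no shared OTUs.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two OTU count tables (dicts)."""
    otus = set(sample_a) | set(sample_b)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    # An OTU missing from a sample counts as zero reads there.
    diff = sum(abs(sample_a.get(o, 0) - sample_b.get(o, 0)) for o in otus)
    return diff / total


# Hypothetical OTU counts for two sites.
coastal = {"otuA": 6, "otuB": 7, "otuC": 4}
offshore = {"otuA": 10, "otuB": 0, "otuC": 6}
distance = bray_curtis(coastal, offshore)
```

A full analysis would compute this for every pair of sites and pass the resulting matrix to an ordination or clustering step.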
Major, Peter; Embley, T. Martin
2017-01-01
Plasma membrane-located nucleotide transport proteins (NTTs) underpin the lifestyle of important obligate intracellular bacterial and eukaryotic pathogens by importing energy and nucleotides from infected host cells that the pathogens can no longer make for themselves. As such, their presence is often seen as a hallmark of an intracellular lifestyle associated with reductive genome evolution and loss of primary biosynthetic pathways. Here, we investigate the phylogenetic distribution of NTT sequences across the domains of cellular life. Our analysis reveals an unexpectedly broad distribution of NTT genes in both host-associated and free-living prokaryotes and eukaryotes. We also identify cases of within-bacteria and bacteria-to-eukaryote horizontal NTT transfer, including into the base of the oomycetes, a major clade of parasitic eukaryotes. In addition to identifying sequences that retain the canonical NTT structure, we detected NTT gene fusions with HEAT-repeat and cyclic nucleotide binding domains in Cyanobacteria, pathogenic Chlamydiae and Oomycetes. Our results suggest that NTTs are versatile functional modules with a much wider distribution and a broader range of potential roles than has previously been appreciated. PMID:28164241