Analysis of multiplex gene expression maps obtained by voxelation.
An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios
2009-04-29
Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays to acquire genome-wide atlases of expression patterns in the mouse brain. Other work has studied gene functions without taking into account where in the mouse brain a gene is expressed. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by extracting wavelet features from the gene expression maps averaged over the left and right hemispheres and computing the Euclidean distance between each pair of feature vectors. We also applied a multiple clustering approach to the gene expression maps, combined with hierarchical clustering. Within each group of similar genes and each cluster, gene function similarity was measured by calculating the average gene function distance in the gene ontology structure. By applying our methodology to find genes similar to certain target genes, we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters that have both very similar gene expression maps and very similar gene functions with respect to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum.
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
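Illustrative sketch (not from the paper): the query-by-similarity step in the abstract above, wavelet features followed by Euclidean distance, can be approximated as below. The choice of a single-level 2D Haar transform, the tiny 2x2 maps, and the gene names are all assumptions made for illustration; the abstract does not state which wavelet or map size the authors used.

```python
import math

def average_hemispheres(left, right):
    """Element-wise mean of two equally sized 2D expression maps."""
    return [[(a + b) / 2.0 for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]

def haar_features(grid):
    """Single-level 2D Haar transform of a grid with even dimensions.

    For every 2x2 block, emits one approximation coefficient followed by
    three detail coefficients, all concatenated into a flat feature vector.
    """
    feats = []
    for i in range(0, len(grid), 2):
        for j in range(0, len(grid[0]), 2):
            a, b = grid[i][j], grid[i][j + 1]
            c, d = grid[i + 1][j], grid[i + 1][j + 1]
            feats.extend([
                (a + b + c + d) / 2.0,  # approximation (scaled block sum)
                (a - b + c - d) / 2.0,  # left column minus right column
                (a + b - c - d) / 2.0,  # top row minus bottom row
                (a - b - c + d) / 2.0,  # diagonal difference
            ])
    return feats

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def rank_similar(query, features):
    """Names of all other genes, nearest to the query gene first."""
    q = features[query]
    return sorted((g for g in features if g != query),
                  key=lambda g: euclidean(q, features[g]))

# Hypothetical (left, right) hemisphere maps for three made-up genes.
maps = {
    "geneA": ([[1, 2], [3, 4]], [[1, 2], [3, 4]]),
    "geneB": ([[1, 2], [3, 5]], [[1, 2], [3, 5]]),
    "geneC": ([[9, 9], [9, 9]], [[9, 9], [9, 9]]),
}
features = {g: haar_features(average_hemispheres(l, r))
            for g, (l, r) in maps.items()}
ranking = rank_similar("geneA", features)
```

Because the Haar transform used here is orthonormal, distances between feature vectors equal distances between the raw maps; in practice one would keep only a subset of coefficients to obtain a compact descriptor.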
SGFSC: speeding the gene functional similarity calculation based on hash tables.
Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia
2016-11-04
In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they are dependent on the combination of semantic similarity. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. To speed current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. There is no need to traverse the GO graph repeatedly for each method with the help of the hash table. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. The proposed strategy is successful in speeding current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC . The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ .
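Illustrative sketch (not the SGFSC code): the two-step strategy in the abstract above can be pictured as one traversal of the GO graph that fills a hash table, followed by similarity queries that only read the table. The toy graph, the term names, and the Jaccard overlap of ancestor sets are assumptions chosen for brevity; they are not claimed to be one of the seven measures bundled with SGFSC.

```python
def build_ancestor_table(parents):
    """Step 1: visit each term once and cache its ancestor set in a dict.

    `parents` maps a term to the list of its direct parents in an acyclic
    graph. Each cached set includes the term itself.
    """
    table = {}

    def visit(term):
        if term not in table:
            ancestors = {term}
            for parent in parents.get(term, ()):
                ancestors |= visit(parent)
            table[term] = frozenset(ancestors)
        return table[term]

    for term in parents:
        visit(term)
    return table

def term_similarity(table, t1, t2):
    """Step 2: answer a query from the table without touching the graph.

    Uses the Jaccard overlap of ancestor sets as a stand-in measure.
    """
    a, b = table[t1], table[t2]
    return len(a & b) / len(a | b)

# Hypothetical miniature ontology: term -> direct parents.
toy_parents = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],
    "D": ["A"],
}
table = build_ancestor_table(toy_parents)
sim_c_d = term_similarity(table, "C", "D")
```

The design point is that the cost of graph traversal is paid once per term, so scoring many gene pairs reduces to dictionary lookups and set operations.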
An improved method for functional similarity analysis of genes based on Gene Ontology.
Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia
2016-12-23
Measures of gene functional similarity are essential tools for gene clustering, gene function prediction, evaluation of protein-protein interaction, disease gene prioritization and other applications. In recent years, many gene functional similarity methods have been proposed based on the semantic similarity of GO terms. However, these leading approaches may make error-prone judgments, especially when they measure the specificity of GO terms as well as the information content (IC) of a term set. Therefore, how to estimate the gene functional similarity reliably is still a challenging problem. We propose WIS, an effective method to measure the gene functional similarity. First, WIS computes the IC of a term by employing its depth, the number of its ancestors as well as the topology of its descendants in the GO graph. Second, WIS calculates the IC of a term set by considering the weighted inherited semantics of terms. Finally, WIS estimates the gene functional similarity based on the IC overlap ratio of term sets. WIS is superior to some other representative measures in experiments on functional classification of genes in a biological pathway, collaborative evaluation of GO-based semantic similarity measures, protein-protein interaction prediction and correlation with gene expression. Further analysis suggests that WIS takes fully into account the specificity of terms and the weighted inherited semantics of terms between GO terms. The proposed WIS method is an effective and reliable way to compare gene function. The web service of WIS is freely available at http://nclab.hit.edu.cn/WIS/ .
Peng, Jiajie; Zhang, Xuanshuo; Hui, Weiwei; Lu, Junya; Li, Qianqian; Liu, Shuhui; Shang, Xuequn
2018-03-19
Gene Ontology (GO) is one of the most popular bioinformatics resources. In the past decade, Gene Ontology-based gene semantic similarity has been effectively used to model gene-to-gene interactions in multiple research areas. However, most existing semantic similarity approaches rely only on GO annotations and structure, or incorporate only local interactions in the co-functional network. This may lead to inaccurate GO-based similarity resulting from the incomplete GO topology structure and gene annotations. We present NETSIM2, a new network-based method that allows researchers to measure GO-based gene functional similarities by considering the global structure of the co-functional network with a random walk with restart (RWR)-based method, and by selecting the significant term pairs to reduce noise. Based on the Enzyme Commission (EC) number-based groups of yeast and Arabidopsis, evaluation tests show that NETSIM2 can enhance the accuracy of Gene Ontology-based gene functional similarity. Using NETSIM2 as an example, we found that the accuracy of semantic similarities can be significantly improved after effectively incorporating the global gene-to-gene interactions in the co-functional network, especially for species whose gene annotations in GO are far from complete.
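Illustrative sketch (not the NETSIM2 implementation): the abstract above relies on a random walk with restart (RWR) to capture the global structure of a co-functional network. The generic RWR iteration is shown below on a made-up, unweighted four-gene network; the restart probability of 0.3, the gene names, and the convergence tolerance are assumptions, and how NETSIM2 turns the resulting scores into term similarities is not covered.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    `adj` maps each node to a non-empty list of neighbours (undirected,
    unweighted). At every step the walker jumps back to the seed with
    probability `restart`, otherwise it moves to a uniformly chosen neighbour.
    """
    nodes = sorted(adj)
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        if max(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical co-functional network: a simple chain g1 - g2 - g3 - g4.
toy_network = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3"],
}
scores = random_walk_with_restart(toy_network, seed="g1")
```

Unlike a direct-neighbour count, the resulting scores are non-zero for every gene reachable from the seed and decay with network distance, which is what "global structure" refers to here.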
Measuring semantic similarities by combining gene ontology annotations and gene co-function networks
Peng, Jiajie; Uygun, Sahra; Kim, Taehyong; ...
2015-02-14
Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.
Wei, Qing; Khan, Ishita K; Ding, Ziyun; Yerneni, Satwica; Kihara, Daisuke
2017-03-20
The number of genomics and proteomics experiments is growing rapidly, producing an ever-increasing amount of data that are awaiting functional interpretation. A number of function prediction algorithms were developed and improved to enable fast and automatic function annotation. With the well-defined structure and manual curation, Gene Ontology (GO) is the most frequently used vocabulary for representing gene functions. To understand relationship and similarity between GO annotations of genes, it is important to have a convenient pipeline that quantifies and visualizes the GO function analyses in a systematic fashion. NaviGO is a web-based tool for interactive visualization, retrieval, and computation of functional similarity and associations of GO terms and genes. Similarity of GO terms and gene functions is quantified with six different scores including protein-protein interaction and context based association scores we have developed in our previous works. Interactive navigation of the GO function space provides intuitive and effective real-time visualization of functional groupings of GO terms and genes as well as statistical analysis of enriched functions. We developed NaviGO, which visualizes and analyses functional similarity and associations of GO terms and genes. The NaviGO webserver is freely available at: http://kiharalab.org/web/navigo .
Utility and Limitations of Using Gene Expression Data to Identify Functional Associations
Peng, Cheng; Shiu, Shin-Han
2016-01-01
Gene co-expression has been widely used to hypothesize gene function through guilt-by-association. However, it is not clear to what degree co-expression is informative, whether it can be applied to genes involved in different biological processes, and how the type of dataset impacts inferences about gene functions. Here our goal is to assess the utility and limitations of using co-expression as a criterion to recover functional associations between genes. Using Arabidopsis thaliana as an example, we determined the percentage of gene pairs in a metabolic pathway with significant expression correlation. We found that many genes in the same pathway do not have similar transcript profiles, and that the choice of dataset, annotation quality, gene function, expression similarity measure, and clustering approach significantly impacts the ability to recover functional associations between genes. Some datasets are more informative in capturing coordinated expression profiles and larger data sets are not always better. In addition, to recover the maximum number of known pathways and identify candidate genes with similar functions, it is important to explore rather exhaustively multiple dataset combinations, similarity measures, clustering algorithms and parameters. Finally, we validated the biological relevance of co-expression cluster memberships with an independent phenomics dataset and found that genes that consistently cluster with leucine degradation genes tend to have similar leucine levels in mutants. This study provides a framework for obtaining gene functional associations by maximizing the information that can be obtained from gene expression datasets. PMID:27935950
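Illustrative sketch (not the authors' pipeline): the quantity described in the abstract above, the share of gene pairs within a pathway whose expression profiles are correlated, can be computed roughly as follows. A fixed cutoff on the Pearson coefficient stands in for a proper significance test, and the profiles and gene names are made up.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def fraction_coexpressed(profiles, pathway_genes, threshold=0.8):
    """Fraction of gene pairs in the pathway with correlation >= threshold.

    The threshold is an assumed stand-in for a significance criterion.
    """
    pairs = list(combinations(pathway_genes, 2))
    hits = sum(1 for g1, g2 in pairs
               if pearson(profiles[g1], profiles[g2]) >= threshold)
    return hits / len(pairs)

# Hypothetical expression profiles over four conditions.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],   # rises with g1
    "g3": [4.0, 3.0, 2.0, 1.0],   # falls as g1 rises
}
fraction = fraction_coexpressed(toy_profiles, ["g1", "g2", "g3"])
```

Repeating this calculation across datasets and similarity measures is one way to see the dataset dependence the abstract reports.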
InteGO2: A web tool for measuring and visualizing gene semantic similarities using Gene Ontology
Peng, Jiajie; Li, Hongxiang; Liu, Yongzhuang; ...
2016-08-31
Here, the Gene Ontology (GO) has been used in high-throughput omics research as a major bioinformatics resource. The hierarchical structure of GO provides users a convenient platform for biological information abstraction and hypothesis testing. Computational methods have been developed to identify functionally similar genes. However, none of the existing measurements take into account all the rich information in GO. Similarly, using these existing methods, web-based applications have been constructed to compute gene functional similarities, and to provide pure text-based outputs. Without a graphical visualization interface, it is difficult for result interpretation. As a result, we present InteGO2, a web tool that allows researchers to calculate the GO-based gene semantic similarities using seven widely used GO-based similarity measurements. Also, we provide an integrative measurement that synergistically integrates all the individual measurements to improve the overall performance. Using HTML5 and cytoscape.js, we provide a graphical interface in InteGO2 to visualize the resulting gene functional association networks. In conclusion, InteGO2 is an easy-to-use HTML5 based web tool. With it, researchers can measure gene or gene product functional similarity conveniently, and visualize the network of functional interactions in a graphical interface.
InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology.
Peng, Jiajie; Li, Hongxiang; Liu, Yongzhuang; Juan, Liran; Jiang, Qinghua; Wang, Yadong; Chen, Jin
2016-08-31
The Gene Ontology (GO) has been used in high-throughput omics research as a major bioinformatics resource. The hierarchical structure of GO provides users a convenient platform for biological information abstraction and hypothesis testing. Computational methods have been developed to identify functionally similar genes. However, none of the existing measurements take into account all the rich information in GO. Similarly, using these existing methods, web-based applications have been constructed to compute gene functional similarities, and to provide pure text-based outputs. Without a graphical visualization interface, it is difficult for result interpretation. We present InteGO2, a web tool that allows researchers to calculate the GO-based gene semantic similarities using seven widely used GO-based similarity measurements. Also, we provide an integrative measurement that synergistically integrates all the individual measurements to improve the overall performance. Using HTML5 and cytoscape.js, we provide a graphical interface in InteGO2 to visualize the resulting gene functional association networks. InteGO2 is an easy-to-use HTML5 based web tool. With it, researchers can measure gene or gene product functional similarity conveniently, and visualize the network of functional interactions in a graphical interface. InteGO2 can be accessed via http://mlg.hit.edu.cn:8089/ .
Complexity of Gene Expression Evolution after Duplication: Protein Dosage Rebalancing
Rogozin, Igor B.
2014-01-01
Ongoing debates about functional importance of gene duplications have been recently intensified by a heated discussion of the “ortholog conjecture” (OC). Under the OC, which is central to functional annotation of genomes, orthologous genes are functionally more similar than paralogous genes at the same level of sequence divergence. However, a recent study challenged the OC by reporting a greater functional similarity, in terms of gene ontology (GO) annotations and expression profiles, among within-species paralogs compared to orthologs. These findings were taken to indicate that functional similarity of homologous genes is primarily determined by the cellular context of the genes, rather than evolutionary history. Subsequent studies suggested that the OC appears to be generally valid when applied to mammalian evolution but the complete picture of evolution of gene expression also has to incorporate lineage-specific aspects of paralogy. The observed complexity of gene expression evolution after duplication can be explained through selection for gene dosage effect combined with the duplication-degeneration-complementation model. This paper discusses expression divergence of recent duplications occurring before functional divergence of proteins encoded by duplicate genes. PMID:25197576
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
Fuzzy measures on the Gene Ontology for gene product similarity.
Popescu, Mihail; Keller, James M; Mitchell, Joyce A
2006-01-01
One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or other taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for the variations in the annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. The initial testing on a group of 194 sequences representing three protein families shows a higher correlation of the FMS and Choquet similarities to the BLAST sequence similarities than the traditional similarity measures such as pairwise average or pairwise maximum.
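Illustrative sketch (baselines only): the abstract above compares FMS against the traditional pairwise average and pairwise maximum measures. Those two baselines are simple enough to show directly; the FMS and Choquet-integral measures themselves are not reproduced here. The term names and the term-to-term similarity values are made up.

```python
def pairwise_scores(terms1, terms2, term_sim):
    """Similarity of every term of one gene product to every term of the other."""
    return [term_sim(a, b) for a in terms1 for b in terms2]

def pairwise_average(terms1, terms2, term_sim):
    """Mean over all cross-product term similarities."""
    scores = pairwise_scores(terms1, terms2, term_sim)
    return sum(scores) / len(scores)

def pairwise_maximum(terms1, terms2, term_sim):
    """Best single term-to-term match between the two annotation sets."""
    return max(pairwise_scores(terms1, terms2, term_sim))

# Hypothetical symmetric term similarities; identical terms score 1.0.
_toy_sims = {
    frozenset({"t1", "t2"}): 0.5,
    frozenset({"t1", "t3"}): 0.2,
    frozenset({"t2", "t3"}): 0.4,
}

def toy_term_sim(a, b):
    return 1.0 if a == b else _toy_sims[frozenset({a, b})]

product1_terms = ["t1", "t2"]
product2_terms = ["t2", "t3"]
avg_score = pairwise_average(product1_terms, product2_terms, toy_term_sim)
max_score = pairwise_maximum(product1_terms, product2_terms, toy_term_sim)
```

The example also shows why these baselines lose context: the maximum ignores every term except the best-matching pair, and the average is pulled down by unrelated terms even when one annotation is shared exactly.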
An integrative approach for measuring semantic similarities using gene ontology.
Peng, Jiajie; Li, Hongxiang; Jiang, Qinghua; Wang, Yadong; Chen, Jin
2014-01-01
Gene Ontology (GO) provides rich information and a convenient way to study gene functional similarity, which has been successfully used in various applications. However, the existing GO-based similarity measurements are limited because each measure considers only a subset of GO information. An appropriate integration of the existing measures that takes more of the information in GO into account is therefore in demand. We propose a novel integrative measure called InteGO2 to automatically select appropriate seed measures and then to integrate them using a metaheuristic search method. The experimental results show that InteGO2 significantly improves the performance of gene similarity in human, Arabidopsis and yeast on both molecular function and biological process GO categories. InteGO2 computes gene-to-gene similarities more accurately than the existing measures tested and is highly robust. The supplementary document and software are available at http://mlg.hit.edu.cn:8082/.
Functional clustering of time series gene expression data by Granger causality
2012-01-01
Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
A transversal approach to predict gene product networks from ontology-based similarity
Chabalier, Julie; Mosser, Jean; Burgun, Anita
2007-01-01
Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
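Illustrative sketch (not the authors' method in detail): the abstract above describes representing each gene product as a weighted annotation vector and comparing vectors to obtain a similarity matrix. A minimal Vector Space Model version is shown below. The inverse-document-frequency style weight is an assumption standing in for the paper's weighting scheme, which the abstract does not specify, and the gene products and terms are made up.

```python
import math

def term_weights(annotations):
    """Assumed weighting: rarer terms get larger weights (IDF-like, plus 1)."""
    n = len(annotations)
    doc_freq = {}
    for terms in annotations.values():
        for t in set(terms):
            doc_freq[t] = doc_freq.get(t, 0) + 1
    return {t: math.log(n / count) + 1.0 for t, count in doc_freq.items()}

def annotation_vector(terms, weights):
    """Sparse vector: annotated term -> weight."""
    return {t: weights[t] for t in set(terms)}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)

def similarity_matrix(annotations):
    """Pairwise cosine similarities between all gene products."""
    weights = term_weights(annotations)
    vectors = {g: annotation_vector(t, weights) for g, t in annotations.items()}
    return {(g1, g2): cosine(vectors[g1], vectors[g2])
            for g1 in vectors for g2 in vectors}

# Hypothetical annotations: gene product -> GO-like term labels.
toy_annotations = {
    "gpA": ["t1", "t2"],
    "gpB": ["t1", "t2"],
    "gpC": ["t3"],
}
matrix = similarity_matrix(toy_annotations)
```

Thresholding such a matrix, possibly after combining it with expression data as the abstract describes, would yield the edges of a functional network.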
DOSim: an R package for similarity between diseases based on Disease Ontology.
Li, Jiang; Gong, Binsheng; Chen, Xi; Liu, Tao; Wu, Chao; Zhang, Fan; Li, Chunquan; Li, Xiang; Rao, Shaoqi; Li, Xia
2011-06-29
The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, and has expanded our understanding of the relationships between different diseases and our ability to classify them. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease feature of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and pathways.
The DOSim package can also be used to visualise DO structure. DOSim can reflect the modular characteristic of disease related genes and promote our understanding of the complex pathogenesis of diseases. DOSim is available on the Comprehensive R Archive Network (CRAN) or http://bioinfo.hrbmu.edu.cn/dosim.
An integrative approach to inferring biologically meaningful gene modules.
Cho, Ji-Hoon; Wang, Kai; Galas, David J
2011-07-26
The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
GO(vis), a gene ontology visualization tool based on multi-dimensional values.
Ning, Zi; Jiang, Zhenran
2010-05-01
Most gene product similarity measurements concentrate on the information content of Gene Ontology (GO) terms or use a path-based similarity between GO terms, which may ignore other important information contained in the structure of the ontology. In our study, we integrate different GO similarity measure approaches to analyze the functional relationship of genes and gene products with a new triangle-based visualization tool called GO(Vis). The purpose of this tool is to demonstrate the effect of three important information factors when measuring the similarity between gene products. One advantage of this tool is that the relative importance of each factor can be adjusted to meet different measuring requirements according to the biological knowledge of each factor. The experimental results demonstrate that GO(Vis) can display diagrams of the functional relationship for gene products effectively.
Horizontal functional gene transfer from bacteria to fishes.
Sun, Bao-Fa; Li, Tong; Xiao, Jin-Hua; Jia, Ling-Yi; Liu, Li; Zhang, Peng; Murphy, Robert W; He, Shun-Min; Huang, Da-Wei
2015-12-22
Invertebrates can acquire functional genes via horizontal gene transfer (HGT) from bacteria but fishes are not known to do so. We provide the first reliable evidence of one HGT event from marine bacteria to fishes. The HGT appears to have occurred after emergence of the teleosts. The transferred gene is expressed and regulated developmentally. Its successful integration and expression may change the genetic and metabolic repertoire of fishes. In addition, this gene contains conserved domains and similar tertiary structures in fishes and their putative donor bacteria. Thus, it may function similarly in both groups. Evolutionary analyses indicate that it evolved under purifying selection, further indicating its conserved function. We document the first likely case of HGT of a functional gene from a prokaryote to fishes. This discovery certifies that HGT can influence vertebrate evolution.
An integrative approach to inferring biologically meaningful gene modules
2011-01-01
Background The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. Results We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level. PMID:21791051
A graph-based semantic similarity measure for the gene ontology.
Alvarez, Marco A; Yan, Changhui
2011-12-01
Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like Gene Ontology Annotation (GOA) that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure for semantic similarity independent of external databases of functional-annotation observations.
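Two of the ingredients named in the abstract above, the shortest path between two terms and the depth of their nearest common ancestor, can be illustrated on a toy ontology. The following is a minimal sketch, not the SSA scoring function itself: the ontology, the term names, and the choice to measure the path through a common ancestor are all assumptions made for illustration, and the definition-based score is omitted.

```python
from collections import deque

# Hypothetical toy ontology (child -> parents); these are not real GO terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}


def ancestors_with_distance(term):
    """Map every ancestor of `term` (including itself) to its minimum edge count from `term`."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in PARENTS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def depth(term):
    """Shortest number of edges from `term` up to the root."""
    return ancestors_with_distance(term)["root"]


def path_and_ancestor(a, b):
    """Return (path length via nearest common ancestor, that ancestor, its depth)."""
    da, db = ancestors_with_distance(a), ancestors_with_distance(b)
    common = set(da) & set(db)
    # Nearest common ancestor: minimises the summed distance; ties broken by name.
    nca = min(common, key=lambda t: (da[t] + db[t], t))
    return da[nca] + db[nca], nca, depth(nca)


sibling_result = path_and_ancestor("dna_binding", "rna_binding")
distant_result = path_and_ancestor("dna_binding", "kinase")
```

In this toy example the two binding terms are two edges apart with a common ancestor at depth 1, while the binding and kinase terms meet only at the root. A measure of this kind would typically treat the first pair as more similar.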
Li, Zhiguang; Kwekel, Joshua C; Chen, Tao
2012-01-01
Functional comparison across microarray platforms is used to assess the comparability or similarity of the biological relevance associated with the gene expression data generated by multiple microarray platforms. Comparisons at the functional level are very important considering that the ultimate purpose of microarray technology is to determine the biological meaning behind the gene expression changes under a specific condition, not just to generate a list of genes. Herein, we present a method named percentage of overlapping functions (POF) and illustrate how it is used to perform the functional comparison of microarray data generated across multiple platforms. This method facilitates the determination of functional differences or similarities in microarray data generated from multiple array platforms across all the functions that are presented on these platforms. This method can also be used to compare the functional differences or similarities between experiments, projects, or laboratories.
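The abstract above does not state how the percentage of overlapping functions is computed. As a hedged sketch, one plausible reading is the share of functions detected on either platform that are detected on both; the function name, the choice of denominator, and the example function lists below are assumptions, not the authors' definition.

```python
def percentage_of_overlapping_functions(functions_a, functions_b):
    """Percent of functions seen on either platform that are seen on both (assumed definition)."""
    a, b = set(functions_a), set(functions_b)
    union = a | b
    if not union:
        return 0.0  # no functions reported on either platform
    return 100.0 * len(a & b) / len(union)


# Hypothetical significant-function lists from two microarray platforms.
platform_a = {"apoptosis", "cell cycle", "DNA repair", "lipid metabolism"}
platform_b = {"apoptosis", "cell cycle", "DNA repair", "immune response"}

pof = percentage_of_overlapping_functions(platform_a, platform_b)
```

Here three of the five distinct functions are shared, so the sketch reports 60 percent. The same function could be applied to function lists from two experiments or laboratories.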
Salem, Saeed; Ozcaglar, Cagri
2014-01-01
Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways.
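The edge weight described above, a linear combination of a topological similarity and a co-appearance similarity between two coexpression links, can be sketched as follows. The abstract does not define either similarity, so both definitions here are assumptions: topological similarity is taken as 1 when the two links share a gene, and co-appearance similarity as the Jaccard index of the datasets in which each link occurs. The mixing parameter `alpha` and the example data are hypothetical as well.

```python
def jaccard(x, y):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def hybrid_weight(link1, link2, datasets_of, alpha=0.5):
    """alpha * topological similarity + (1 - alpha) * co-appearance similarity."""
    # Assumed topological similarity: do the two links share an endpoint gene?
    topological = 1.0 if set(link1) & set(link2) else 0.0
    # Assumed co-appearance similarity: overlap of datasets containing each link.
    co_appearance = jaccard(datasets_of[link1], datasets_of[link2])
    return alpha * topological + (1 - alpha) * co_appearance


# Hypothetical coexpression links and the datasets in which each was observed.
datasets_of = {
    ("g1", "g2"): {"d1", "d2", "d3"},
    ("g2", "g3"): {"d2", "d3"},
    ("g4", "g5"): {"d1"},
}

w_adjacent = hybrid_weight(("g1", "g2"), ("g2", "g3"), datasets_of)
w_disjoint = hybrid_weight(("g1", "g2"), ("g4", "g5"), datasets_of)
```

Links that share a gene and recur in the same datasets receive a higher weight, which is what lets a clustering step group them into recurrent modules.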
Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.
Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C
2009-11-24
Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
Prioritization of Disease Susceptibility Genes Using LSM/SVD.
Gong, Lejun; Yang, Ronggen; Yan, Qin; Sun, Xiao
2013-12-01
Understanding the role of genetics in diseases is one of the most important tasks in the postgenome era. It is generally too expensive and time consuming to perform experimental validation for all candidate genes related to disease. Computational methods play important roles for prioritizing these candidates. Herein, we propose an approach to prioritize disease genes using latent semantic mapping based on singular value decomposition. Our hypothesis is that functionally similar genes are likely to cause similar diseases. We measure the functional similarity between known disease susceptibility genes and unknown genes in order to predict new disease susceptibility genes. Taking autism as an example, analysis of the top ten prioritized genes suggests that they might be autism susceptibility genes, which also indicates that our approach could discover new disease susceptibility genes. The novel approach of disease gene prioritization could discover new disease susceptibility genes and latent disease-gene relations. The prioritized results could also support the interpretive diversity and experimental views as computational evidence for disease researchers.
Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.
Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian
2018-02-23
Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate the given gene products carrying out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from a massive set of GO terms as defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. From the experimental results, HPHash again significantly improves the prediction performance. The codes of HPHash are available at: http://mlda.swu.edu.cn/codes.php?name=HPHash.
The effects of shared information on semantic calculations in the gene ontology.
Bible, Paul W; Sun, Hong-Wei; Morasso, Maria I; Loganantharaj, Rasiah; Wei, Lai
2017-01-01
The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts then substitutes this calculation into traditional term similarity measures such as Resnik, Lin, and Jiang-Conrath. Alternative SI approaches, when combined with ontology choice and term similarity type, lead to many gene-to-gene similarity measures. No thorough investigation has been made into the behavior, complexity, and performance of semantic methods derived from distinct SI approaches. We apply bootstrapping to compare the generalized performance of 57 gene-to-gene semantic measures across six benchmarks. Considering the number of measures, we additionally evaluate whether these methods can be leveraged through ensemble machine learning to improve prediction performance. Results showed that the choice of ontology type most strongly influenced performance across all evaluations. Combining measures into an ensemble classifier reduces cross-validation error beyond any individual measure for protein interaction prediction. This improvement resulted from information gained through the combination of ontology types as ensemble methods within each GO type offered no improvement. These results demonstrate that multiple SI measures can be leveraged for machine learning tasks such as automated gene function prediction by incorporating methods from across the ontologies. To facilitate future research in this area, we developed the GO Graph Tool Kit (GGTK), an open source C++ library with Python interface (github.com/paulbible/ggtk).
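The traditional term similarity measures named above (Resnik, Lin, and Jiang-Conrath) all take a shared-information value as input. The sketch below uses the classical choice, the information content of the most informative common ancestor, which is only one of the SI approaches the abstract refers to. The term names and annotation probabilities are hypothetical, and the common ancestor is supplied by hand rather than looked up in an ontology.

```python
import math

# Hypothetical annotation probabilities p(term); not taken from real GO data.
P = {"root": 1.0, "binding": 0.5, "dna_binding": 0.125, "rna_binding": 0.25}


def ic(term):
    """Information content: -log2 of the term's annotation probability."""
    return -math.log2(P[term])


def resnik(common_ancestor):
    """Resnik similarity: the shared information itself."""
    return ic(common_ancestor)


def lin(a, b, common_ancestor):
    """Lin similarity: shared information normalised by the terms' own information."""
    total = ic(a) + ic(b)
    return 2 * ic(common_ancestor) / total if total else 0.0


def jiang_conrath_distance(a, b, common_ancestor):
    """Jiang-Conrath distance: information not shared between the two terms."""
    return ic(a) + ic(b) - 2 * ic(common_ancestor)


res = resnik("binding")
lin_sim = lin("dna_binding", "rna_binding", "binding")
jc = jiang_conrath_distance("dna_binding", "rna_binding", "binding")
```

Swapping in a different shared-information calculation changes only the value passed as the common ancestor's information, which is why one SI approach yields a whole family of term measures.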
Shen, Congcong; Shi, Yu; Ni, Yingying; Deng, Ye; Van Nostrand, Joy D.; He, Zhili; Zhou, Jizhong; Chu, Haiyan
2016-01-01
The elevational and latitudinal diversity patterns of microbial taxa have attracted great attention in the past decade. Recently, the distribution of functional attributes has been in the spotlight. Here, we report a study profiling soil microbial communities along an elevation gradient (500–2200 m) on Changbai Mountain. Using a comprehensive functional gene microarray (GeoChip 5.0), we found that microbial functional gene richness exhibited a dramatic increase at the treeline ecotone, but the bacterial taxonomic and phylogenetic diversity based on 16S rRNA gene sequencing did not exhibit such a similar trend. However, the β-diversity (compositional dissimilarity among sites) pattern for both bacterial taxa and functional genes was similar, showing significant elevational distance-decay patterns which presented increased dissimilarity with elevation. The bacterial taxonomic diversity/structure was strongly influenced by soil pH, while the functional gene diversity/structure was significantly correlated with soil dissolved organic carbon (DOC). This finding highlights that soil DOC may be a good predictor in determining the elevational distribution of microbial functional genes. The finding of significant shifts in functional gene diversity at the treeline ecotone could also provide valuable information for predicting the responses of microbial functions to climate change. PMID:27524983
Obayashi, Takeshi; Kinoshita, Kengo
2010-05-01
Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as a guide-gene approach and a narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.
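The basic idea stated above, that functionally related genes show similar expression patterns across conditions, is commonly quantified with a correlation coefficient. The sketch below ranks genes by Pearson correlation with a query gene. It does not reproduce how ATTED-II computes coexpression; the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Hypothetical expression levels of four genes under four conditions.
EXPRESSION = {
    "query": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.0, 4.0, 6.0, 8.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.0, 3.0, 2.0, 4.0],
}


def rank_coexpressed(query, expression):
    """List all other genes, most strongly coexpressed with `query` first."""
    scores = {
        gene: pearson(expression[query], profile)
        for gene, profile in expression.items()
        if gene != query
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_coexpressed("query", EXPRESSION)
```

A ranked list of this kind corresponds to the gene list view described in the abstract: starting from one guide gene, the top of the list gives candidate functionally related genes.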
2014-01-01
Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family
Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.
2013-01-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Relationships among msx gene structure and function in zebrafish and other vertebrates.
Ekker, M; Akimenko, M A; Allende, M L; Smith, R; Drouin, G; Langille, R M; Weinberg, E S; Westerfield, M
1997-10-01
The zebrafish genome contains at least five msx homeobox genes, msxA, msxB, msxC, msxD, and the newly isolated msxE. Although these genes share structural features common to all Msx genes, phylogenetic analyses of protein sequences indicate that the msx genes from zebrafish are not orthologous to the Msx1 and Msx2 genes of mammals, birds, and amphibians. The zebrafish msxB and msxC are more closely related to each other and to the mouse Msx3. Similarly, although the combinatorial expression of the zebrafish msx genes in the embryonic dorsal neuroectoderm, visceral arches, fins, and sensory organs suggests functional similarities with the Msx genes of other vertebrates, differences in the expression patterns preclude precise assignment of orthological relationships. Distinct duplication events may have given rise to the msx genes of modern fish and other vertebrate lineages whereas many aspects of msx gene functions during embryonic development have been preserved.
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study
Weißenborn, Sandra; Walther, Dirk
2017-01-01
Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. 
This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
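Phylogenetic profiling, as described above, compares presence/absence patterns of gene families across genomes. A minimal sketch of the profile comparison step follows, using the Jaccard index as the similarity; the study's actual similarity measure is not given in the abstract, so this choice, the family names, and the profiles are assumptions.

```python
# Hypothetical presence (1) / absence (0) calls for three gene families
# across five species, in a fixed species order.
PROFILES = {
    "famA": (1, 1, 0, 1, 0),
    "famB": (1, 1, 0, 1, 1),
    "famC": (0, 0, 1, 0, 1),
}


def profile_similarity(p, q):
    """Jaccard index: species where both are present / species where either is present."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


sim_ab = profile_similarity(PROFILES["famA"], PROFILES["famB"])
sim_ac = profile_similarity(PROFILES["famA"], PROFILES["famC"])
sim_bc = profile_similarity(PROFILES["famB"], PROFILES["famC"])
```

Families with high profile similarity, such as the first two here, would be the candidates for a shared pathway or a physical interaction under the profiling hypothesis.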
Computational gene expression profiling under salt stress reveals patterns of co-expression
Sanchita; Sharma, Ashok
2016-01-01
Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
The petunia AGL6 gene has a SEPALLATA-like function in floral patterning.
Rijpkema, Anneke S; Zethof, Jan; Gerats, Tom; Vandenbussche, Michiel
2009-10-01
SEPALLATA (SEP) MADS-box genes are required for the regulation of floral meristem determinacy and the specification of sepals, petals, stamens, carpels and ovules, specifically in angiosperms. The SEP subfamily is closely related to the AGAMOUS LIKE6 (AGL6) and SQUAMOSA (SQUA) subfamilies. So far, of these three groups only AGL6-like genes have been found in extant gymnosperms. AGL6 genes are more similar to SEP than to SQUA genes, both in sequence and in expression pattern. Despite the ancestry and wide distribution of AGL6-like MADS-box genes, not a single loss-of-function mutant exhibiting a clear phenotype has yet been reported; consequently the function of AGL6-like genes has remained elusive. Here, we characterize the Petunia hybrida AGL6 (PhAGL6, formerly called PETUNIA MADS BOX GENE4/pMADS4) gene, and show that it functions redundantly with the SEP genes FLORAL BINDING PROTEIN2 (FBP2) and FBP5 in petal and anther development. Moreover, expression analysis suggests a function for PhAGL6 in ovary and ovule development. The PhAGL6 and FBP2 proteins interact in in vitro experiments overall with the same partners, indicating that the two proteins are biochemically quite similar. It will be interesting to determine the functions of AGL6-like genes of other species, especially those of gymnosperms.
Li, Min; Li, Qi; Ganegoda, Gamage Upeksha; Wang, JianXin; Wu, FangXiang; Pan, Yi
2014-11-01
Identification of disease-causing genes among a large number of candidates is a fundamental challenge in human disease studies. However, it is still time-consuming and laborious to determine the real disease-causing genes by biological experiments. With the advances of the high-throughput techniques, a large number of protein-protein interactions have been produced. Therefore, to address this issue, several methods based on protein interaction network have been proposed. In this paper, we propose a shortest path-based algorithm, named SPranker, to prioritize disease-causing genes in protein interaction networks. Considering the fact that diseases with similar phenotypes are generally caused by functionally related genes, we further propose an improved algorithm SPGOranker by integrating the semantic similarity of GO annotations. SPGOranker not only considers the topological similarity between protein pairs in a protein interaction network but also takes their functional similarity into account. The proposed algorithms SPranker and SPGOranker were applied to 1598 known orphan disease-causing genes from 172 orphan diseases and compared with three state-of-the-art approaches, ICN, VS and RWR. The experimental results show that SPranker and SPGOranker outperform ICN, VS, and RWR for the prioritization of orphan disease-causing genes. Importantly, for the case study of severe combined immunodeficiency, SPranker and SPGOranker predict several novel causal genes.
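The idea of prioritizing candidates by shortest paths in a protein interaction network can be sketched with a breadth-first search from each known disease gene. This is not the SPranker or SPGOranker scoring scheme, which the abstract does not specify: ranking by mean distance to the seed genes is an assumption, and the network and gene names are hypothetical.

```python
from collections import deque

# Hypothetical undirected protein interaction network (adjacency lists).
NETWORK = {
    "seed1": ["p1", "p2"],
    "seed2": ["p2"],
    "p1": ["seed1", "p3"],
    "p2": ["seed1", "seed2"],
    "p3": ["p1", "p4"],
    "p4": ["p3"],
}


def distances_from(source, network):
    """Breadth-first search: shortest hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in network[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def rank_candidates(candidates, seeds, network):
    """Order candidates by mean shortest-path distance to the seed genes, closest first."""
    per_seed = [distances_from(seed, network) for seed in seeds]

    def mean_distance(candidate):
        # Unreachable candidates get an infinite distance and sort last.
        return sum(d.get(candidate, float("inf")) for d in per_seed) / len(per_seed)

    return sorted(candidates, key=lambda c: (mean_distance(c), c))


ranked = rank_candidates(["p1", "p2", "p3", "p4"], ["seed1", "seed2"], NETWORK)
```

A functional term, such as a GO semantic similarity between candidate and seed, could be combined with the distance in the sort key to move toward the integrated ranking the abstract describes.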
Global Mapping of the Yeast Genetic Interaction Network
NASA Astrophysics Data System (ADS)
Tong, Amy Hin Yan; Lesage, Guillaume; Bader, Gary D.; Ding, Huiming; Xu, Hong; Xin, Xiaofeng; Young, James; Berriz, Gabriel F.; Brost, Renee L.; Chang, Michael; Chen, YiQun; Cheng, Xin; Chua, Gordon; Friesen, Helena; Goldberg, Debra S.; Haynes, Jennifer; Humphries, Christine; He, Grace; Hussein, Shamiza; Ke, Lizhu; Krogan, Nevan; Li, Zhijian; Levinson, Joshua N.; Lu, Hong; Ménard, Patrice; Munyana, Christella; Parsons, Ainslie B.; Ryan, Owen; Tonikian, Raffi; Roberts, Tania; Sdicu, Anne-Marie; Shapiro, Jesse; Sheikh, Bilal; Suter, Bernhard; Wong, Sharyl L.; Zhang, Lan V.; Zhu, Hongwei; Burd, Christopher G.; Munro, Sean; Sander, Chris; Rine, Jasper; Greenblatt, Jack; Peter, Matthias; Bretscher, Anthony; Bell, Graham; Roth, Frederick P.; Brown, Grant W.; Andrews, Brenda; Bussey, Howard; Boone, Charles
2004-02-01
A genetic interaction network containing ~1000 genes and ~4000 interactions was mapped by crossing mutations in 132 different query genes into a set of ~4700 viable yeast gene deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.
2014-05-16
native uncharacterized genes for characterized genes from Bacillus subtilis, that is presented in a constitutive expression module. If the B... subtilis gene containing M. mycoides mutant is viable then the function of the conserved hypothetical gene is the same as the input B. subtilis gene...Characterized genes from B. subtilis were swapped with similar, but not so similar as to be clearly the same, essential genes from M. mycoides. The B. subtilis
German, M S; Moss, L G; Wang, J; Rutter, W J
1992-01-01
The pancreatic beta cell makes several unique gene products, including insulin, islet amyloid polypeptide (IAPP), and beta-cell-specific glucokinase (beta GK). The functions of isolated portions of the insulin, IAPP, and beta GK promoters were studied by using transient expression and DNA binding assays. A short portion (-247 to -197 bp) of the rat insulin I gene, the FF minienhancer, contains three interacting transcriptional regulatory elements. The FF minienhancer binds at least two nuclear complexes with limited tissue distribution. Sequences similar to that of the FF minienhancer are present in the 5' flanking DNA of the human IAPP and rat beta GK genes and also the rat insulin II and mouse insulin I and II genes. Similar minienhancer constructs from the insulin and IAPP genes function as cell-specific transcriptional regulatory elements and compete for binding of the same nuclear factors, while the beta GK construct competes for protein binding but functions poorly as a minienhancer. These observations suggest that the patterns of expression of the beta-cell-specific genes result in part from sharing the same transcriptional regulators. PMID:1549125
ERIC Educational Resources Information Center
Gericke, Niklas Markus; Hagberg, Mariana
2007-01-01
Models are often used when teaching science. In this paper historical models and students' ideas about genetics are compared. The historical development of the scientific idea of the gene and its function is described and categorized into five historical models of gene function. Differences and similarities between these historical models are made…
Gene function prediction with gene interaction networks: a context graph kernel approach.
Li, Xin; Chen, Hsinchun; Li, Jiexun; Zhang, Zhu
2010-01-01
Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels.
Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A; Rathbun, Stephen L; Whitman, William B
2016-01-01
K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.
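The intuition behind a K-function for sequence libraries, counting how many pairs fall within a given distance, can be shown with a much simplified analogue. The sketch below is not the published K-shuff statistic: the normalisation, the use of Hamming distance on pre-aligned equal-length sequences, and the example libraries are all assumptions, and the Monte Carlo significance test is omitted.

```python
from itertools import combinations


def hamming_fraction(s, t):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(s, t)) / len(s)


def intra_k(library, d):
    """Fraction of sequence pairs within one library whose distance is <= d."""
    pairs = list(combinations(library, 2))
    close = sum(1 for s, t in pairs if hamming_fraction(s, t) <= d)
    return close / len(pairs)


def cross_k(library_a, library_b, d):
    """Fraction of between-library sequence pairs whose distance is <= d."""
    pairs = [(s, t) for s in library_a for t in library_b]
    close = sum(1 for s, t in pairs if hamming_fraction(s, t) <= d)
    return close / len(pairs)


# Hypothetical pre-aligned sequence libraries.
LIB_A = ["ACGTACGT", "ACGTACGA", "ACGTTCGT"]
LIB_B = ["TTTTACGT", "ACGTACGT"]

intra_tight = intra_k(LIB_A, 0.125)
intra_loose = intra_k(LIB_A, 0.25)
cross_tight = cross_k(LIB_A, LIB_B, 0.125)
```

Evaluating these fractions over a range of distance thresholds gives a curve per library or library pair. Comparing such curves against those from randomly reshuffled libraries is the general shape of a permutation test of the kind the abstract mentions.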
Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L
2014-04-08
In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
Thomas, Paul D.; Wood, Valerie; Mungall, Christopher J.; Lewis, Suzanna E.; Blake, Judith A.
2012-01-01
A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the “functional similarity” between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the “ortholog conjecture” (or, more properly, the “ortholog functional conservation hypothesis”). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an “open world assumption” (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis. PMID:22359495
Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil
Howe, Adina; Yang, Fan; Williams, Ryan J.; ...
2016-11-17
Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the “core” set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved in the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. As a result, in soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.
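The idea of a "core" set shared by replicate metagenomes can be illustrated with a plain set intersection. This is a minimal sketch, not the authors' pipeline: the sequence identifiers are hypothetical, and the strict "present in every replicate" rule is an assumption, since the abstract does not define how "consistently prevalent" was measured.

```python
def core_set(replicates):
    """Return the items present in every replicate (strict intersection)."""
    replicates = [set(r) for r in replicates]
    if not replicates:
        return set()
    core = replicates[0].copy()
    for r in replicates[1:]:
        core &= r
    return core


def core_fraction(replicates):
    """Fraction of all observed items that belong to the core."""
    union = set().union(*map(set, replicates)) if replicates else set()
    return len(core_set(replicates)) / len(union) if union else 0.0


# Hypothetical sequence identifiers for four replicate metagenomes.
reps = [
    {"seqA", "seqB", "seqC", "seqD"},
    {"seqA", "seqB", "seqD", "seqE"},
    {"seqA", "seqB", "seqF"},
    {"seqA", "seqB", "seqC"},
]

print(sorted(core_set(reps)))   # ['seqA', 'seqB']
print(core_fraction(reps))      # 2 core items out of 6 observed
```

A looser prevalence rule (for example, present in at least three of four replicates) would replace the intersection with a per-item count.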
A premeiotic function for boule in the planarian Schmidtea mediterranea.
Iyer, Harini; Issigonis, Melanie; Sharma, Prashant P; Extavour, Cassandra G; Newmark, Phillip A
2016-06-21
Mutations in Deleted in Azoospermia (DAZ), a Y chromosome gene, are an important cause of human male infertility. DAZ is found exclusively in primates, limiting functional studies of this gene to its homologs: boule, required for meiotic progression of germ cells in invertebrate model systems, and Daz-like (Dazl), required for early germ cell maintenance in vertebrates. Dazl is believed to have acquired its premeiotic role in a vertebrate ancestor following the duplication and functional divergence of the single-copy gene boule. However, multiple homologs of boule have been identified in some invertebrates, raising the possibility that some of these genes may play other roles, including a premeiotic function. Here we identify two boule paralogs in the freshwater planarian Schmidtea mediterranea. Smed-boule1 is necessary for meiotic progression of male germ cells, similar to the known function of boule in invertebrates. By contrast, Smed-boule2 is required for the maintenance of early male germ cells, similar to vertebrate Dazl. To examine if Boule2 may be functionally similar to vertebrate Dazl, we identify and functionally characterize planarian homologs of human DAZL/DAZ-interacting partners and DAZ family mRNA targets. Finally, our phylogenetic analyses indicate that premeiotic functions of planarian boule2 and vertebrate Dazl evolved independently. Our study uncovers a premeiotic role for an invertebrate boule homolog and offers a tractable invertebrate model system for studying the premeiotic functions of the DAZ protein family.
Gillespie, Meagan J.; Stanley, Dragana; Chen, Honglei; Donald, John A.; Nicholas, Kevin R.; Moore, Robert J.; Crowley, Tamsyn M.
2012-01-01
Pigeon ‘milk’ and mammalian milk have functional similarities in terms of nutritional benefit and delivery of immunoglobulins to the young. Mammalian milk has been clearly shown to aid in the development of the immune system and microbiota of the young, but similar effects have not yet been attributed to pigeon ‘milk’. Therefore, using a chicken model, we investigated the effect of pigeon ‘milk’ on immune gene expression in the Gut Associated Lymphoid Tissue (GALT) and on the composition of the caecal microbiota. Chickens fed pigeon ‘milk’ had a faster rate of growth and a better feed conversion ratio than control chickens. There was significantly enhanced expression of immune-related gene pathways and interferon-stimulated genes in the GALT of pigeon ‘milk’-fed chickens. These pathways include the innate immune response, regulation of cytokine production and regulation of B cell activation and proliferation. The caecal microbiota of pigeon ‘milk’-fed chickens was significantly more diverse than control chickens, and appears to be affected by prebiotics in pigeon ‘milk’, as well as being directly seeded by bacteria present in pigeon ‘milk’. Our results demonstrate that pigeon ‘milk’ has further modes of action which make it functionally similar to mammalian milk. We hypothesise that pigeon ‘lactation’ and mammalian lactation evolved independently but resulted in similarly functional products. PMID:23110233
Ambroise, Jérôme; Robert, Annie; Macq, Benoit; Gala, Jean-Luc
2012-01-06
An important challenge in systems biology is the inference of biological networks from postgenomic data. Among these biological networks, a gene transcriptional regulatory network focuses on interactions existing between transcription factors (TFs) and their corresponding target genes. A large number of reverse engineering algorithms were proposed to infer such networks from gene expression profiles, but most current methods have relatively low predictive performances. In this paper, we introduce the novel TNIFSED method (Transcriptional Network Inference from Functional Similarity and Expression Data), which infers a transcriptional network from the integration of correlations and partial correlations of gene expression profiles and gene functional similarities through a supervised classifier. In the current work, TNIFSED was applied to predict the transcriptional network in Escherichia coli and in Saccharomyces cerevisiae, using datasets of 445 and 170 Affymetrix arrays, respectively. Using the area under the curve of the receiver operating characteristics and the F-measure as indicators, we showed the predictive performance of TNIFSED to be better than unsupervised state-of-the-art methods. TNIFSED performed slightly worse than the supervised SIRENE algorithm for target gene identification of TFs that have a wide range of already identified target genes, but better for TFs that have only a few identified target genes. Our results indicate that TNIFSED is complementary to the SIRENE algorithm, and particularly suitable to discover target genes of "orphan" TFs.
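The abstract combines correlations and partial correlations of expression profiles as features. The sketch below shows only the textbook first-order partial correlation between two profiles given a third. It is not the TNIFSED implementation; the profiles are made-up numbers, and how TNIFSED conditions on other genes is not stated in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical expression profiles: x and y both track a common driver z.
z = [1, 2, 3, 4, 5, 6]
x = [2, 1, 4, 3, 6, 5]
y = [2, 1, 2, 5, 6, 5]

print(round(pearson(x, y), 3))          # plain correlation, inflated by z
print(round(partial_corr(x, y, z), 3))  # smaller once z is controlled for
```

In this toy case the partial correlation is much lower than the plain correlation, which is the usual motivation for using both as features: the first flags co-variation, the second discounts co-variation explained by a shared regulator.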
Bettembourg, Charles; Diot, Christian; Dameron, Olivier
2015-01-01
Background The analysis of gene annotations referencing back to Gene Ontology plays an important role in the interpretation of high-throughput experiments results. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method supporting the interpretation of the similarity and particularity values in order to determine whether two genes are similar or whether one gene has some significant particular function. Interpretation is frequently based either on an implicit threshold, or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds supporting the interpretation of the results of a semantic comparison. Results We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a similarity value lower than the similarity value of some non-similar genes. We then extend this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds. Conclusion We developed a method for determining optimal semantic similarity and particularity thresholds. 
We applied this method on the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically-relevant patterns. PMID:26230274
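The threshold search described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the two error proportions are simply summed with equal weight, candidate thresholds are taken from the observed values, and the similarity scores are invented.

```python
def best_threshold(similar, non_similar):
    """Pick the threshold minimizing false-negative + false-positive rates.

    A pair is called 'similar' when its score is >= the threshold.
    """
    candidates = sorted(set(similar) | set(non_similar))
    best_t, best_err = None, float("inf")
    for t in candidates:
        fn = sum(s < t for s in similar) / len(similar)           # missed similar pairs
        fp = sum(s >= t for s in non_similar) / len(non_similar)  # wrongly accepted pairs
        if fn + fp < best_err:
            best_t, best_err = t, fn + fp
    return best_t, best_err


# Hypothetical similarity scores; note the two distributions overlap.
similar_pairs = [0.9, 0.8, 0.7, 0.6, 0.4]
non_similar_pairs = [0.1, 0.2, 0.3, 0.5, 0.45]

threshold, error = best_threshold(similar_pairs, non_similar_pairs)
print(threshold, error)
```

Because the distributions overlap, no threshold reaches zero error here; the search returns 0.6, which misclassifies one of the five similar pairs and none of the non-similar ones.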
Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A.; Rathbun, Stephen L.; Whitman, William B.
2016-01-01
K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley’s K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing. PMID:27911946
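The Monte Carlo testing step can be illustrated with a generic label-shuffling test. This sketch does not implement the IKF or CKF statistics: the separation statistic (mean between-library distance minus mean within-library distance), the Hamming distance, and the toy sequences are all assumptions chosen to keep the example short.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def separation(lib1, lib2):
    """Mean cross-library distance minus mean within-library distance."""
    cross = [hamming(a, b) for a in lib1 for b in lib2]
    within = [hamming(a, b) for lib in (lib1, lib2)
              for a, b in combinations(lib, 2)]
    return sum(cross) / len(cross) - sum(within) / len(within)


def monte_carlo_p(lib1, lib2, n_perm=199, seed=0):
    """P-value from randomly reassigning sequences to the two libraries."""
    rng = random.Random(seed)
    observed = separation(lib1, lib2)
    pooled = list(lib1) + list(lib2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = separation(pooled[:len(lib1)], pooled[len(lib1):])
        if shuffled >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Two toy libraries with no shared sequences (each needs >= 2 sequences).
lib_a = ["AAAA"] * 5
lib_b = ["TTTT"] * 5
p_value = monte_carlo_p(lib_a, lib_b)
```

The `+ 1` terms count the observed arrangement as one of the permutations, so the reported p-value can never be exactly zero.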
Leach, J E; White, F F
1996-01-01
Although more than 30 bacterial avirulence genes have been cloned and characterized, the function of the gene products in the elicitation of resistance is unknown in all cases but one. The product of avrD from Pseudomonas syringae pv. glycinea likely functions indirectly to elicit resistance in soybean, that is, evidence suggests the gene product is an enzyme involved in elicitor production. In most if not all cases, bacterial avirulence gene function is dependent on interactions with the hypersensitive response and pathogenicity (hrp) genes. Many hrp genes are similar to genes involved in delivery of pathogenicity factors in mammalian bacterial pathogens. Thus, analogies between mammalian and plant pathogens may provide needed clues to elucidate how virulence gene products control induction of resistance.
He, Feng; Zeng, An-Ping
2006-01-01
Background The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. Present methods used to infer functional linkages between genes from expression data are mainly based on a point-to-point comparison. Change trends between consecutive time points in time-series data have been so far not well explored. Results In this work we present a new method based on extracting main features of the change trend and level of gene expression between consecutive time points. The method, termed as trend correlation (TC), includes two major steps: 1, calculating a maximal local alignment of change trend score by dynamic programming and a change trend correlation coefficient between the maximal matched change levels of each gene pair; 2, inferring relationships of gene pairs based on two statistical extraction procedures. The new method considers time shifts and inverted relationships in a similar way as the local clustering (LC) method but the latter is merely based on a point-to-point comparison. The TC method is demonstrated with data from yeast cell cycle and compared with the LC method and the widely used Pearson correlation coefficient (PCC) based clustering method. The biological significance of the gene pairs is examined with several large-scale yeast databases. Although the TC method predicts an overall lower number of gene pairs than the other two methods at a same p-value threshold, the additional number of gene pairs inferred by the TC method is considerable: e.g. 20.5% compared with the LC method and 49.6% with the PCC method for a p-value threshold of 2.7E-3. Moreover, the percentage of the inferred gene pairs consistent with databases by our method is generally higher than the LC method and similar to the PCC method. 
A significant number of the gene pairs only inferred by the TC method are process-identity or function-similarity pairs or have well-documented biological interactions, including 443 known protein interactions and some known cell cycle related regulatory interactions. It should be emphasized that the overlapping of gene pairs detected by the three methods is normally not very high, indicating a necessity of combining the different methods in search of functional association of genes from time-series data. For a p-value threshold of 1E-5 the percentage of process-identity and function-similarity gene pairs among the shared part of the three methods reaches 60.2% and 55.6% respectively, building a good basis for further experimental and functional study. Furthermore, the combined use of methods is important to infer more complete regulatory circuits and network as exemplified in this study. Conclusion The TC method can significantly augment the current major methods to infer functional linkages and biological network and is well suitable for exploring temporal relationships of gene expression in time-series data. PMID:16478547
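The contrast between point-to-point comparison and change-trend comparison can be sketched in a few lines. This is a deliberately simplified stand-in for the TC method: it scores agreement of up/down trends at fixed time shifts instead of computing a maximal local alignment by dynamic programming, and it omits the statistical extraction steps. The function names and time series are hypothetical.

```python
def trends(series):
    """+1, 0 or -1 for a rise, no change or fall between consecutive points."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def trend_score(x, y, max_shift=2):
    """Best trend agreement over small time shifts.

    Returns (score, shift). A score near +1 means matching trends, near -1
    means inverted trends; a positive shift means y lags behind x.
    """
    tx, ty = trends(x), trends(y)
    best = (0.0, 0)
    # Try zero shift first so ties favour the simplest explanation.
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        pairs = [(tx[i], ty[i + shift]) for i in range(len(tx))
                 if 0 <= i + shift < len(ty)]
        if not pairs:
            continue
        score = sum(a * b for a, b in pairs) / len(pairs)
        if abs(score) > abs(best[0]):
            best = (score, shift)
    return best


# Hypothetical expression time series.
gene_x = [1, 2, 3, 2, 1, 2, 3]
gene_inverted = [3, 2, 1, 2, 3, 2, 1]   # mirror image of gene_x
gene_q = [0, 1, 3, 2, 0, 0, 1]
gene_delayed = [0, 0, 1, 3, 2, 0, 0]    # gene_q delayed by one time point

inverted_result = trend_score(gene_x, gene_inverted)
delayed_result = trend_score(gene_q, gene_delayed)
```

With these inputs the inverted pair scores -1 at zero shift, and the delayed pair scores highest at a shift of one time point, which mirrors the two relationship types (inverted and time-shifted) that the abstract says the method accounts for.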
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
GFD-Net: A novel semantic similarity methodology for the analysis of gene networks.
Díaz-Montaña, Juan J; Díaz-Díaz, Norberto; Gómez-Vela, Francisco
2017-04-01
Since the popularization of biological network inference methods, it has become crucial to create methods to validate the resulting models. Here we present GFD-Net, the first methodology that applies the concept of semantic similarity to gene network analysis. GFD-Net combines the concept of semantic similarity with the use of gene network topology to analyze the functional dissimilarity of gene networks based on Gene Ontology (GO). The main innovation of GFD-Net lies in the way that semantic similarity is used to analyze gene networks taking into account the network topology. GFD-Net selects a functionality for each gene (specified by a GO term), weights each edge according to the dissimilarity between the nodes at its ends and calculates a quantitative measure of the network functional dissimilarity, i.e. a quantitative value of the degree of dissimilarity between the connected genes. The robustness of GFD-Net as a gene network validation tool was demonstrated by performing a ROC analysis on several network repositories. Furthermore, a well-known network was analyzed showing that GFD-Net can also be used to infer knowledge. The relevance of GFD-Net becomes more evident in Section "GFD-Net applied to the study of human diseases" where an example of how GFD-Net can be applied to the study of human diseases is presented. GFD-Net is available as an open-source Cytoscape app which offers a user-friendly interface to configure and execute the algorithm as well as the ability to visualize and interact with the results (http://apps.cytoscape.org/apps/gfdnet).
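The edge-weighting idea can be sketched on a toy ontology. This is an illustration only, not the GFD-Net algorithm: it assumes each gene already has a single selected term, uses path length through the lowest common ancestor of a tree-shaped ontology as the dissimilarity, and averages edge weights. The term names, gene names and the tree itself are hypothetical (the real GO is a directed acyclic graph).

```python
def ancestors(term, parent):
    """Path from a term up to the root, starting with the term itself."""
    path = [term]
    while term in parent:
        term = parent[term]
        path.append(term)
    return path


def term_distance(a, b, parent):
    """Number of edges between two terms via their lowest common ancestor."""
    path_a, path_b = ancestors(a, parent), ancestors(b, parent)
    depth_b = {t: i for i, t in enumerate(path_b)}
    for i, t in enumerate(path_a):
        if t in depth_b:
            return i + depth_b[t]
    raise ValueError("terms share no common ancestor")


def network_dissimilarity(edges, gene_term, parent):
    """Mean term distance over all edges of the gene network."""
    weights = [term_distance(gene_term[u], gene_term[v], parent)
               for u, v in edges]
    return sum(weights) / len(weights)


# Toy ontology (child -> parent) and a three-gene network.
parent = {"T1": "root", "T2": "root", "T1a": "T1", "T1b": "T1"}
gene_term = {"g1": "T1a", "g2": "T1b", "g3": "T2"}
edges = [("g1", "g2"), ("g2", "g3")]

print(network_dissimilarity(edges, gene_term, parent))  # 2.5
```

The edge g1-g2 links sibling terms (distance 2) and g2-g3 links terms in different branches (distance 3), so the network score is 2.5; a lower score would indicate that connected genes are functionally closer.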
A premeiotic function for boule in the planarian Schmidtea mediterranea
Iyer, Harini; Issigonis, Melanie; Sharma, Prashant P.; Extavour, Cassandra G.; Newmark, Phillip A.
2016-01-01
Mutations in Deleted in Azoospermia (DAZ), a Y chromosome gene, are an important cause of human male infertility. DAZ is found exclusively in primates, limiting functional studies of this gene to its homologs: boule, required for meiotic progression of germ cells in invertebrate model systems, and Daz-like (Dazl), required for early germ cell maintenance in vertebrates. Dazl is believed to have acquired its premeiotic role in a vertebrate ancestor following the duplication and functional divergence of the single-copy gene boule. However, multiple homologs of boule have been identified in some invertebrates, raising the possibility that some of these genes may play other roles, including a premeiotic function. Here we identify two boule paralogs in the freshwater planarian Schmidtea mediterranea. Smed-boule1 is necessary for meiotic progression of male germ cells, similar to the known function of boule in invertebrates. By contrast, Smed-boule2 is required for the maintenance of early male germ cells, similar to vertebrate Dazl. To examine if Boule2 may be functionally similar to vertebrate Dazl, we identify and functionally characterize planarian homologs of human DAZL/DAZ-interacting partners and DAZ family mRNA targets. Finally, our phylogenetic analyses indicate that premeiotic functions of planarian boule2 and vertebrate Dazl evolved independently. Our study uncovers a premeiotic role for an invertebrate boule homolog and offers a tractable invertebrate model system for studying the premeiotic functions of the DAZ protein family. PMID:27330085
Mills, Brian D; Grayson, David S; Shunmugavel, Anandakumar; Miranda-Dominguez, Oscar; Feczko, Eric; Earl, Eric; Neve, Kim; Fair, Damien A
2018-05-22
Cognition and behavior depend on synchronized intrinsic brain activity that is organized into functional networks across the brain. Research has investigated how anatomical connectivity both shapes and is shaped by these networks, but not how anatomical connectivity interacts with intra-areal molecular properties to drive functional connectivity. Here, we present a novel linear model to explain functional connectivity by integrating systematically obtained measurements of axonal connectivity, gene expression, and resting state functional connectivity MRI in the mouse brain. The model suggests that functional connectivity arises from both anatomical links and inter-areal similarities in gene expression. By estimating these effects, we identify anatomical modules in which correlated gene expression and anatomical connectivity support functional connectivity. Along with providing evidence that not all genes equally contribute to functional connectivity, this research establishes new insights regarding the biological underpinnings of coordinated brain activity measured by BOLD fMRI. SIGNIFICANCE STATEMENT Efforts at characterizing the functional connectome with fMRI have risen exponentially over the last decade. Yet despite this rise, the biological underpinnings of these functional measurements are still largely unknown. The current report begins to fill this void by investigating the molecular underpinnings of the functional connectome through an integration of systematically obtained structural information and gene expression data throughout the rodent brain. We find that both white matter connectivity and similarity in regional gene expression relate to resting state functional connectivity. The current report furthers our understanding of the biological underpinnings of the functional connectome and provides a linear model that can be utilized to streamline preclinical animal studies of disease.
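A linear model of this general kind can be sketched as an ordinary least-squares fit of functional connectivity on anatomical connectivity and gene expression similarity. The exact form of the authors' model is not given in the abstract, so the two-predictor form with an intercept, the variable names and the synthetic region-pair values below are all assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_linear(anat, gene_sim, func):
    """Least-squares fit of func ~ b0 + b1 * anat + b2 * gene_sim."""
    X = [[1.0, a, g] for a, g in zip(anat, gene_sim)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(X, func)) for i in range(3)]
    return solve(XtX, Xty)  # normal equations: (X^T X) beta = X^T y


# Synthetic values for five region pairs, generated from known coefficients.
anat = [0.0, 1.0, 0.0, 1.0, 0.5]
gene_sim = [0.0, 0.0, 1.0, 1.0, 0.2]
func = [0.1 + 0.5 * a + 0.3 * g for a, g in zip(anat, gene_sim)]

b0, b1, b2 = fit_linear(anat, gene_sim, func)
```

On this noise-free synthetic data the fit recovers the generating coefficients (0.1, 0.5, 0.3); with real measurements the relative sizes of `b1` and `b2` would indicate how much each predictor contributes.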
Saliva Microbiota Carry Caries-Specific Functional Gene Signatures
Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L.; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian
2014-01-01
Human saliva microbiota is phylogenetically divergent among host individuals, yet its roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, although both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis. PMID:24533043
Saliva microbiota carry caries-specific functional gene signatures.
Yang, Fang; Ning, Kang; Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian
2014-01-01
Human saliva microbiota is phylogenetically divergent among host individuals, yet its roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, although both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis.
Yang, Jing; Wang, Chao; Wu, Jinyu; Liu, Li; Zhang, Gang
2014-01-01
The genus Exiguobacterium can adapt readily to, and survive in, diverse environments. Our study demonstrated that Exiguobacterium sp. strain S3-2, isolated from marine sediment, is resistant to five antibiotics. The plasmid pMC1 in this strain carries seven putative resistance genes. We functionally characterized these resistance genes in Escherichia coli, and genes encoding dihydrofolate reductase and macrolide phosphotransferase were considered novel resistance genes based on their low similarities to known resistance genes. The plasmid G+C content distribution was highly heterogeneous. Only the G+C content of one block, which shared significant similarity with a plasmid from Exiguobacterium arabatum, fit well with the mean G+C content of the host. The remainder of the plasmid was composed of mobile elements with a markedly lower G+C ratio than the host. Interestingly, five mobile elements located on pMC1 showed significant similarities to sequences found in pathogens. Our data provided an example of the link between resistance genes in strains from the environment and the clinic and revealed the aggregation of antibiotic resistance genes in bacteria isolated from fish farms. PMID:24362420
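Heterogeneity in G+C content along a plasmid is typically inspected with a sliding window. The sketch below shows that calculation only; the window size, step and toy sequence are arbitrary assumptions and are not taken from the study.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc_windows(seq, window, step):
    """G+C fraction in successive windows of fixed size along the sequence."""
    return [gc_content(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Toy sequence with a GC-rich block followed by an AT-rich block.
toy_plasmid = "GGGGAAAA"
print(gc_windows(toy_plasmid, window=4, step=2))  # [1.0, 0.5, 0.0]
```

Comparing each window with the host genome's mean G+C content is one common way to flag blocks that may have been acquired horizontally, which is the kind of contrast the abstract describes between the host-like block and the mobile elements.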
Examination of a Palatogenic Gene Program in Zebrafish
Swartz, Mary E.; Sheehan-Rooney, Kelly; Dixon, Michael J.; Eberhart, Johann K.
2011-01-01
Human palatal clefting is debilitating and difficult to rectify surgically. Animal models enhance our understanding of palatogenesis and are essential in strategies designed to ameliorate palatal malformations in humans. Recent studies have shown that the zebrafish palate, or anterior neurocranium, is under similar genetic control to the amniote palatal skeleton. We extensively analyzed palatogenesis in zebrafish to determine the similarity of gene expression and function across vertebrates. By 36 hpf palatogenic cranial neural crest cells reside in homologous regions of the developing face compared to amniote species. Transcription factors and signaling molecules regulating mouse palatogenesis are expressed in similar domains during palatogenesis in zebrafish. Functional investigation of a subset of these genes, fgf10a, tgfb2, pax9 and smad5 revealed their necessity in zebrafish palatogenesis. Collectively, these results suggest that the gene regulatory networks regulating palatogenesis may be conserved across vertebrate species, demonstrating the utility of zebrafish as a model for palatogenesis. PMID:22016187
GEsture: an online hand-drawing tool for gene expression pattern search.
Wang, Chunyan; Xu, Yiqing; Wang, Xuelin; Zhang, Li; Wei, Suyun; Ye, Qiaolin; Zhu, Youxiang; Yin, Hengfu; Nainwal, Manoj; Tanon-Reyes, Luis; Cheng, Feng; Yin, Tongming; Ye, Ning
2018-01-01
Gene expression profiling data provide useful information for the investigation of biological function and process. However, identifying a specific expression pattern from extensive time series gene expression data is not an easy task. Clustering, a popular method, is often used to group genes with similar expression; however, genes with a 'desirable' or 'user-defined' pattern cannot be efficiently detected by clustering methods. To address these limitations, we developed an online tool called GEsture. Users can draw or graph a curve using a mouse instead of entering abstract parameters of clustering methods. Given a gene expression curve as input, GEsture searches time series datasets for genes showing similar, opposite and time-delayed expression patterns. We present three examples that illustrate the capacity of GEsture in gene hunting while following users' requirements. GEsture also provides visualization tools (such as expression pattern figures, heat maps and correlation networks) to display the search results. The resulting outputs may provide useful information for researchers to understand the targets, function and biological processes of the involved genes.
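The three search modes described above (similar, opposite, time-delay) can be illustrated with a small sketch. The abstract does not state which similarity measure GEsture uses, so Pearson correlation, the 0.9 threshold, the maximum lag and the function names below are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def classify_pattern(query, profile, max_lag=2, threshold=0.9):
    """Label a gene profile relative to a query curve (hypothetical helper).

    Assumed rules: 'similar' if r >= threshold, 'opposite' if r <= -threshold,
    'time-delay' if the profile matches the query after shifting it by 1..max_lag
    time points.
    """
    r = pearson(query, profile)
    if r >= threshold:
        return "similar"
    if r <= -threshold:
        return "opposite"
    for lag in range(1, max_lag + 1):
        if len(query) - lag < 3:  # too few overlapping points to correlate
            break
        # Compare the query with the profile observed `lag` time points later.
        if pearson(query[:-lag], profile[lag:]) >= threshold:
            return f"time-delay (lag {lag})"
    return "no match"


query = [0, 1, 2, 3, 2, 1, 0]  # stands in for a hand-drawn curve
print(classify_pattern(query, [1, 3, 5, 7, 5, 3, 1]))
print(classify_pattern(query, [3, 2, 1, 0, 1, 2, 3]))
print(classify_pattern(query, [0, 0, 1, 2, 3, 2, 1]))
```

Correlation is scale-invariant, which suits a drawn curve whose absolute units are arbitrary; a distance-based measure would need the curve and the profiles normalized first.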
SoFoCles: feature filtering for microarray classification based on gene ontology.
Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A
2010-02-01
Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.
Construction and comparison of gene co-expression networks shows complex plant immune responses
López, Camilo; López-Kleine, Liliana
2014-01-01
Gene co-expression networks (GCNs) are graphic representations that depict the coordinated transcription of genes in response to certain stimuli. GCNs provide functional annotations of genes whose function is unknown and are further used in studies of translational functional genomics among species. In this work, a methodology for the reconstruction and comparison of GCNs is presented. This approach was applied using gene expression data that were obtained from immunity experiments in Arabidopsis thaliana, rice, soybean, tomato and cassava. After the evaluation of diverse similarity metrics for the GCN reconstruction, we recommended the mutual information coefficient measurement and a clustering coefficient-based method for similarity threshold selection. To compare GCNs, we proposed a multivariate approach based on Principal Component Analysis (PCA). Branches of plant immunity that were exemplified by each experiment were analyzed in conjunction with the PCA results, suggesting both the robustness and the dynamic nature of the cellular responses. The dynamics of molecular plant responses produced networks with different characteristics that are differentiable using our methodology. The comparison of GCNs from plant pathosystems showed that, in response to similar pathogens, plants could activate conserved signaling pathways. The results confirmed that the closeness of GCNs projected on the principal component space is indicative of similarity among GCNs. This can also be used to understand global patterns of events triggered during plant immune responses. PMID:25320678
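As a rough illustration of the quantity behind a clustering coefficient-based threshold choice, the sketch below thresholds a table of pairwise gene similarities into a network and reports its average local clustering coefficient. The abstract does not describe how the coefficient is turned into a threshold, so no selection rule is attempted here; the function names, the input layout and the toy similarity values are hypothetical.

```python
from itertools import combinations


def build_network(similarity, threshold):
    """Build an undirected network from pairwise similarities.

    `similarity` maps (gene_a, gene_b) tuples to a similarity score (assumed
    layout); an edge is kept when the score is at or above `threshold`.
    """
    adjacency = {}
    for (a, b), score in similarity.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if a != b and score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbours count as 0."""
    coefficients = []
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


# Toy similarity scores, invented for illustration.
toy = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.85, ("g3", "g4"): 0.30}
for t in (0.2, 0.7):
    print(t, round(average_clustering(build_network(toy, t)), 3))
```

Scanning such a value over a grid of thresholds is one way a threshold could be compared against alternatives, but the criterion actually used by the authors should be taken from the paper itself.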
Goedbloed, D J; Czypionka, T; Altmüller, J; Rodriguez, A; Küpfer, E; Segev, O; Blaustein, L; Templeton, A R; Nolte, A W; Steinfartz, S
2017-12-01
The utilization of similar habitats by different species provides an ideal opportunity to identify genes underlying adaptation and acclimatization. Here, we analysed the gene expression of two closely related salamander species: Salamandra salamandra in Central Europe and Salamandra infraimmaculata in the Near East. These species inhabit similar habitat types: 'temporary ponds' and 'permanent streams' during larval development. We developed two species-specific gene expression microarrays, each targeting over 12 000 transcripts, including an overlapping subset of 8331 orthologues. Gene expression was examined for systematic differences between temporary ponds and permanent streams in larvae from both salamander species to establish gene sets and functions associated with these two habitat types. Only 20 orthologues were associated with a habitat in both species, but these orthologues did not show parallel expression patterns across species more than expected by chance. Functional annotation of a set of 106 genes with the highest effect size for a habitat suggested four putative gene function categories associated with a habitat in both species: cell proliferation, neural development, oxygen responses and muscle capacity. Among these high effect size genes was a single orthologue (14-3-3 protein zeta/YWHAZ) that was downregulated in temporary ponds in both species. The emergence of four gene function categories combined with a lack of parallel expression of orthologues (except 14-3-3 protein zeta) suggests that parallel habitat adaptation or acclimatization by larvae from S. salamandra and S. infraimmaculata to temporary ponds and permanent streams is mainly realized by different genes with a converging functionality.
New genes contribute to genetic and phenotypic novelties in human evolution
Zhang, Yong E.; Long, Manyuan
2014-01-01
New genes in the human genome have been found to be relevant to human evolution and biology. It was conservatively estimated that the human genome encodes more than 300 human-specific genes and 1,000 primate-specific genes. These new arrivals appear to be implicated in brain function and male reproduction. Surprisingly, increasing evidence indicates that they may also bring negative pleiotropic effects, while assuming various possible biological functions as sources of phenotypic novelties, suggesting a non-progressive route for functional evolution. Similar to these fixed new genes, polymorphic new genes were found to contribute to functional evolution within species, e.g. with respect to digestion or disease resistance, revealing that new genes can acquire new or diverged functions in their initial stage as prototypic genes. This progress has provided new opportunities to explore the genetic basis of human biology and human evolutionary history in a new dimension. PMID:25218862
GO-based functional dissimilarity of gene sets.
Díaz-Díaz, Norberto; Aguilar-Ruiz, Jesús S
2011-09-01
The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, identifying the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and a quantification of functional coherence can help to clarify the role of a group of genes working together. To implement this approach to functional assessment, we present GFD (GO-based Functional Dissimilarity), a novel dissimilarity measure for evaluating groups of genes based on the most relevant functions of the whole set. The measure assigns a numerical value to the gene set for each of the three GO sub-ontologies. Results show that GFD performs robustly when applied to gene sets of known functionality (extracted from KEGG). It performs particularly well on randomly generated gene sets. An ROC analysis reveals that the performance of GFD in evaluating the functional dissimilarity of gene sets is very satisfactory. A comparative analysis against other functional measures, such as GS2 and those presented by Resnik and Wang, also demonstrates the robustness of GFD.
Pairwise gene GO-based measures for biclustering of high-dimensional expression data.
Nepomuceno, Juan A; Troncoso, Alicia; Nepomuceno-Chamorro, Isabel A; Aguilar-Ruiz, Jesús S
2018-01-01
Biclustering algorithms search for groups of genes that share the same behavior under a subset of samples in gene expression data. Nowadays, the biological knowledge available in public repositories can be used to drive these algorithms to find biclusters composed of functionally coherent groups of genes. In addition, a distance among genes can be defined according to the information stored about them in the Gene Ontology (GO). Gene pairwise GO semantic similarity measures report a value for each pair of genes which establishes their functional similarity. A scatter search-based algorithm that optimizes a merit function integrating GO information is studied in this paper. This merit function uses a term that addresses the information through a GO measure. The effect of two possible different gene pairwise GO measures on the performance of the algorithm is analyzed. Firstly, three well-known yeast datasets with approximately one thousand genes are studied. Secondly, a group of human datasets related to clinical data of cancer is also explored by the algorithm. Most of these data are high-dimensional datasets composed of a huge number of genes. The resultant biclusters reveal groups of genes linked by the same functionality when the search procedure is driven by one of the proposed GO measures. Furthermore, a qualitative biological study of a group of biclusters shows their relevance from a cancer disease perspective. It can be concluded that the integration of biological information improves the performance of the biclustering process. The two different GO measures studied show an improvement in the results obtained for the yeast datasets. However, if datasets are composed of a huge number of genes, only one of them really improves the algorithm performance. This second case constitutes a clear option to explore interesting datasets from a clinical point of view.
Coevolution of Siglec-11 and Siglec-16 via gene conversion in primates.
Hayakawa, Toshiyuki; Khedri, Zahra; Schwarz, Flavio; Landig, Corinna; Liang, Suh-Yuen; Yu, Hai; Chen, Xi; Fujito, Naoko T; Satta, Yoko; Varki, Ajit; Angata, Takashi
2017-11-23
Siglecs-11 and -16 are members of the sialic acid recognizing Ig-like lectin family, and are expressed in the same cells. Siglec-11 functions as an inhibitory receptor, whereas Siglec-16 exhibits activating properties. In humans, SIGLEC11 and SIGLEC16 gene sequences are extremely similar in the region encoding the extracellular domain due to gene conversions. Human SIGLEC11 was converted by the nonfunctional SIGLEC16P allele, and the converted SIGLEC11 allele became fixed in humans, possibly because it provides novel neuroprotective functions in brain microglia. However, the detailed evolutionary history of SIGLEC11 and SIGLEC16 in other primates remains unclear. We analyzed SIGLEC11 and SIGLEC16 gene sequences of multiple primate species, and examined glycan binding profiles of these Siglecs. The phylogenetic tree demonstrated that gene conversions between SIGLEC11 and SIGLEC16 occurred in the region including the exon encoding the sialic acid binding domain in every primate examined. Functional assays showed that glycan binding preference is similar between Siglec-11 and Siglec-16 in all analyzed hominid species. Taken together with the fact that Siglec-11 and Siglec-16 are expressed in the same cells, Siglec-11 and Siglec-16 are regarded as paired receptors that have maintained similar ligand binding preferences via gene conversions. Relaxed functional constraints were detected on the SIGLEC11 and SIGLEC16 exons that underwent gene conversions, possibly contributing to the evolutionary acceptance of repeated gene conversions. The frequency of nonfunctional SIGLEC16P alleles is much higher than that of SIGLEC16 alleles in every human population. Our findings indicate that Siglec-11 and Siglec-16 have been maintained as paired receptors by repeated gene conversions under relaxed functional constraints in the primate lineage. The high prevalence of the nonfunctional SIGLEC16P allele and the fixation of the converted SIGLEC11 imply that the loss of Siglec-16 and the gain of Siglec-11 in microglia might have been favored during the evolution of the human lineage.
Microarray-based analysis of survival of soil microbial community during ozonation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jian; Van Nostrand, Joy D.; He, Zhili
A 15 h ozonation was performed on bioremediated soil to remove recalcitrant residual oil. To monitor the survival of indigenous microorganisms in the soil during in-situ chemical oxidation (ISCO), culturing and a functional gene array, GeoChip, were used to examine the functional genes and structure of the microbial community during ozonation (0 h, 2 h, 4 h, 6 h, 10 h and 15 h). Breakthrough ozonation decreased the population of cultivable heterotrophic bacteria by about 3 orders of magnitude. The total functional gene abundance and diversity decreased during ozonation, as the number of functional genes was reduced by 48% after 15 h. However, functional genes were evenly distributed during ozonation as judged by the Shannon-Weaver Evenness index. A sharp decrease in gene number was observed in the first 6 h of ozonation followed by a slower decrease in the next 9 h, which was consistent with microbial populations measured by a culture-based method. Functional genes involved in carbon, nitrogen, phosphorus and sulfur cycling, metal resistance and organic remediation were detected in all samples. Though the pattern of gene categories detected was similar for all time points, hierarchical clustering of all functional genes and of the major functional categories showed a time-serial pattern. Bacteria, archaea and fungi decreased by 96.1%, 95.1% and 91.3%, respectively, after 15 h ozonation. Delta proteobacteria, which were reduced by 94.3%, showed the highest resistance to ozonation while Actinobacteria, reduced by 96.3%, showed the lowest resistance. Microorganisms similar to Rhodothermus, Obesumbacterium, Staphylothermus, Gluconobacter, and Enterococcus were dominant at all time points. Functional genes related to petroleum degradation decreased by 1 to 2 orders of magnitude. Most of the key functional genes were still detected after ozonation, allowing a rapid recovery of the microbial community after ozonation. While ozone had a large impact on the indigenous soil microorganisms, a fraction of the key functional gene-containing microorganisms survived during ozonation and kept the community functional.
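The evenness judgment mentioned in this abstract can be made concrete with a short sketch. The abstract does not give the formula, so this assumes the commonly used form in which the Shannon index H' = -Σ p_i ln p_i is divided by ln S, where S is the number of detected categories; the function name and the toy abundances are hypothetical.

```python
from math import log


def shannon_evenness(abundances):
    """Shannon index divided by ln(S), assuming the common H'/ln(S) definition.

    Zero abundances are ignored. Returns 0.0 when fewer than two categories are
    present, a convention chosen here because evenness is undefined in that case.
    """
    counts = [a for a in abundances if a > 0]
    richness = len(counts)
    if richness < 2:
        return 0.0
    total = sum(counts)
    shannon = -sum((c / total) * log(c / total) for c in counts)
    return shannon / log(richness)


# Toy signal intensities for four gene categories, invented for illustration.
print(shannon_evenness([10, 10, 10, 10]))  # perfectly even community
print(shannon_evenness([97, 1, 1, 1]))     # dominated by one category
```

Values near 1 mean the detected genes are spread evenly across categories, which is consistent with how a community can lose many genes overall while remaining evenly distributed.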
Pesaranghader, Ahmad; Matwin, Stan; Sokolova, Marina; Beiko, Robert G
2016-05-01
Measures of protein functional similarity are essential tools for function prediction, evaluation of protein-protein interactions (PPIs) and other applications. Several existing methods perform comparisons between proteins based on the semantic similarity of their GO terms; however, these measures are highly sensitive to modifications in the topological structure of GO, tend to be focused on specific analytical tasks and concentrate on the GO terms themselves rather than considering their textual definitions. We introduce simDEF, an efficient method for measuring semantic similarity of GO terms using their GO definitions, which is based on the Gloss Vector measure commonly used in natural language processing. The simDEF approach builds optimized definition vectors for all relevant GO terms, and expresses the similarity of a pair of proteins as the cosine of the angle between their definition vectors. Relative to existing similarity measures, when validated on a yeast reference database, simDEF improves correlation with sequence homology by up to 50%, shows a correlation improvement >4% with gene expression in the biological process hierarchy of GO and increases PPI predictability by >2.5% in F1 score for the molecular function hierarchy. Datasets, results and source code are available at http://kiwi.cs.dal.ca/Software/simDEF. Contact: ahmad.pgh@dal.ca or beiko@cs.dal.ca. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
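The final step described here, taking the cosine of the angle between two definition vectors, is sketched below. How simDEF builds and optimizes its definition vectors is not given in the abstract, so the sparse word-weight dictionaries and their values are invented placeholders, and the function name is hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for an empty vector
    return dot / (norm_u * norm_v)


# Placeholder definition vectors (assumed weights, not real simDEF output).
protein_a = {"kinase": 2.0, "phosphorylation": 1.0, "binding": 1.0}
protein_b = {"kinase": 1.0, "phosphorylation": 2.0}
protein_c = {"transport": 1.0, "membrane": 1.0}

print(round(cosine_similarity(protein_a, protein_b), 3))
print(cosine_similarity(protein_a, protein_c))
```

Because the cosine depends only on direction, a protein with a longer or more heavily weighted definition vector is not penalized relative to one with a short vector.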
Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas
2017-01-01
B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.
Annotation of gene function in citrus using gene expression information and co-expression networks
2014-01-01
Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for gene co-expression analysis in citrus. PMID:25023870
Piao, Hailan; Froula, Jeff; Du, Changbin; Kim, Tae-Wan; Hawley, Erik R; Bauer, Stefan; Wang, Zhong; Ivanova, Nathalia; Clark, Douglas S; Klenk, Hans-Peter; Hess, Matthias
2014-08-01
Although recent nucleotide sequencing technologies have significantly enhanced our understanding of microbial genomes, the function of ∼35% of genes identified in a genome currently remains unknown. To improve the understanding of microbial genomes and consequently of microbial processes it will be crucial to assign a function to this "genomic dark matter." Due to the urgent need for additional carbohydrate-active enzymes for improved production of transportation fuels from lignocellulosic biomass, we screened the genomes of more than 5,500 microorganisms for hypothetical proteins that are located in the proximity of already known cellulases. We identified, synthesized and expressed a total of 17 putative cellulase genes with insufficient sequence similarity to currently known cellulases to be identified as such using traditional sequence annotation techniques that rely on significant sequence similarity. The recombinant proteins of the newly identified putative cellulases were subjected to enzymatic activity assays to verify their hydrolytic activity towards cellulose and lignocellulosic biomass. Eleven (65%) of the tested enzymes had significant activity towards at least one of the substrates. This high success rate highlights that a gene context-based approach can be used to assign function to genes that are otherwise categorized as "genomic dark matter" and to identify biomass-degrading enzymes that have little sequence similarity to already known cellulases. The ability to assign function to genes that have no related sequence representatives with functional annotation will be important to enhance our understanding of microbial processes and to identify microbial proteins for a wide range of applications. © 2014 Wiley Periodicals, Inc.
PaperBLAST: Text Mining Papers for Information about Homologs
Price, Morgan N.; Arkin, Adam P.
2017-08-15
Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.
PaperBLAST: Text Mining Papers for Information about Homologs
Arkin, Adam P.
2017-01-01
ABSTRACT Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. IMPORTANCE With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions. PMID:28845458
Liu, Gangbiao; Zou, Yangyun; Cheng, Qiqun; Zeng, Yanwu; Gu, Xun; Su, Zhixi
2014-04-01
The age distribution of gene duplication events within the human genome exhibits two waves of duplications along with an ancient component. However, because of functional constraint differences, genes in different functional categories might show dissimilar retention patterns after duplication. It is known that genes in some functional categories are highly duplicated in the early stage of vertebrate evolution. However, the correlations of the age distribution pattern of gene duplication between the different functional categories are still unknown. To investigate this issue, we developed a robust pipeline to date the gene duplication events in the human genome. We successfully estimated about three-quarters of the duplication events within the human genome, along with the age distribution pattern in each Gene Ontology (GO) slim category. We found that some GO slim categories show different distribution patterns when compared to the whole genome. Further hierarchical clustering of the GO slim functional categories enabled grouping into two main clusters. We found that human genes located in the duplicated copy number variant regions, whose duplicate genes have not been fixed in the human population, were mainly enriched in the groups with a high proportion of recently duplicated genes. Moreover, we used a phylogenetic tree-based method to date the age of duplications in three signaling-related gene superfamilies: transcription factors, protein kinases and G-protein coupled receptors. These superfamilies were expressed in different subcellular localizations. They showed a similar age distribution as the signaling-related GO slim categories. We also compared the differences between the age distributions of gene duplications in multiple subcellular localizations. We found that the distribution patterns of the major subcellular localizations were similar to that of the whole genome. 
This study revealed the whole picture of the evolution patterns of gene functional categories in the human genome.
GeneRIF indexing: sentence selection based on machine learning.
Jimeno-Yepes, Antonio J; Sticco, J Caitlin; Mork, James G; Aronson, Alan R
2013-05-31
A Gene Reference Into Function (GeneRIF) describes novel functionality of genes. GeneRIFs are available from the National Center for Biotechnology Information (NCBI) Gene database. GeneRIF indexing is performed manually, and the intention of our work is to provide methods to support creating the GeneRIF entries. The creation of GeneRIF entries involves the identification of the genes mentioned in MEDLINE® citations and the sentences describing a novel function. We have compared several learning algorithms and several features extracted or derived from MEDLINE sentences to determine if a sentence should be selected for GeneRIF indexing. Features are derived from the sentences or using mechanisms to augment the information provided by them: assigning a discourse label using a previously trained model, for example. We show that machine learning approaches with specific feature combinations achieve results close to one of the annotators. We have evaluated different feature sets and learning algorithms. In particular, Naïve Bayes achieves better performance with a selection of features similar to one used in related work, which considers the location of the sentence, the discourse of the sentence and the functional terminology in it. The current performance is at a level similar to human annotation and it shows that machine learning can be used to automate the task of sentence selection for GeneRIF annotation. The current experiments are limited to the human species. We would like to see how the methodology can be extended to other species, specifically the normalization of gene mentions in other species.
Yan, Lijie; Jackson, Andrew O.; Liu, Zhiyong; Han, Chenggui; Yu, Jialin; Li, Dawei
2011-01-01
Barley stripe mosaic virus (BSMV) is a single-stranded RNA virus with three genome components designated alpha, beta, and gamma. BSMV vectors have previously been shown to be efficient virus induced gene silencing (VIGS) vehicles in barley and wheat and have provided important information about host genes functioning during pathogenesis as well as various aspects of genes functioning in development. To permit more effective use of BSMV VIGS for functional genomics experiments, we have developed an Agrobacterium delivery system for BSMV and have coupled this with a ligation independent cloning (LIC) strategy to mediate efficient cloning of host genes. Infiltrated Nicotiana benthamiana leaves provided excellent sources of virus for secondary BSMV infections and VIGS in cereals. The Agro/LIC BSMV VIGS vectors were able to function in high efficiency down regulation of phytoene desaturase (PDS), magnesium chelatase subunit H (ChlH), and plastid transketolase (TK) gene silencing in N. benthamiana and in the monocots, wheat, barley, and the model grass, Brachypodium distachyon. Suppression of an Arabidopsis orthologue cloned from wheat (TaPMR5) also interfered with wheat powdery mildew (Blumeria graminis f. sp. tritici) infections in a manner similar to that of the A. thaliana PMR5 loss-of-function allele. These results imply that the PMR5 gene has maintained similar functions across monocot and dicot families. Our BSMV VIGS system provides substantial advantages in expense, cloning efficiency, ease of manipulation and ability to apply VIGS for high throughput genomics studies. PMID:22031834
Dehydration stress memory genes of Zea mays; comparison with Arabidopsis thaliana
2014-01-01
Background Pre-exposing plants to diverse abiotic stresses may alter their physiological and transcriptional responses to a subsequent stress, suggesting a form of “stress memory”. Arabidopsis thaliana plants that have experienced multiple exposures to dehydration stress display transcriptional behavior suggesting “memory” from an earlier stress. Genes that respond to a first stress by up-regulating or down-regulating their transcription but in a subsequent stress provide a significantly different response define the ‘memory genes’ category. Genes responding similarly to each stress form the ‘non-memory’ category. It is unknown whether such memory responses exist in other Angiosperm lineages and whether memory is an evolutionarily conserved response to repeated dehydration stresses. Results Here, we determine the transcriptional responses of maize (Zea mays L.) plants that have experienced repeated exposures to dehydration stress in comparison with plants encountering the stress for the first time. Four distinct transcription memory response patterns similar to those displayed by A. thaliana were revealed. The most important contribution is the evidence that monocot and eudicot plants, two lineages that diverged 140 to 200 million years ago, display similar abilities to ‘remember’ a dehydration stress and to modify their transcriptional responses accordingly. The highly sensitive RNA-Seq analyses allowed the identification of genes that function similarly in the two lineages, as well as genes that function in species-specific ways. Memory transcription patterns indicate that the transcriptional behavior of responding genes under repeated stresses is different from the behavior during an initial dehydration stress, suggesting that stress memory is a complex phenotype resulting from coordinated responses of multiple signaling pathways.
Conclusions Structurally related genes displaying the same memory responses in the two species would suggest conservation of the genes’ memory during the evolution of plants’ dehydration stress response systems. On the other hand, divergent transcription memory responses by genes encoding similar functions would suggest occurrence of species-specific memory responses. The results provide novel insights into our current knowledge of how plants respond to multiple dehydration stresses, as compared to a single exposure, and may serve as a reference platform to study the functions of memory genes in adaptive responses to water deficit in monocot and eudicot plants. PMID:24885787
Acharya, Debarun; Ghosh, Tapash C
2016-01-22
Gene duplication is a genetic mutation that creates functionally redundant gene copies that are initially relieved from selective pressures and may adapt themselves to new functions with time. The levels of gene duplication may vary from small-scale duplication (SSD) to whole genome duplication (WGD). Studies with yeast revealed ample differences between these duplicates: Yeast WGD pairs were functionally more similar, less divergent in subcellular localization and contained a lesser proportion of essential genes. In this study, we explored the differences in evolutionary genomic properties of human SSD and WGD genes, with the identifiable human duplicates coming from the two rounds of whole genome duplication that occurred early in vertebrate evolution. We observed that these two groups of duplicates were also dissimilar in terms of their evolutionary and genomic properties. Interestingly, however, the pattern differs from that observed in yeast. The human WGDs were found to be functionally less similar, to diverge more at the subcellular level and to contain a higher proportion of essential genes than the SSDs, all of which are opposite to the findings in yeast. Additionally, we found that human WGDs were more divergent in their gene expression profiles, had higher multifunctionality, were more often associated with disease, and were evolutionarily more conserved than human SSDs. Our study suggests that human WGD duplicates are more divergent, which entails the adaptation of WGDs to novel and important functions that consequently lead to their evolutionary conservation.
Shimada, Norimoto; Sato, Shusei; Akashi, Tomoyoshi; Nakamura, Yasukazu; Tabata, Satoshi; Ayabe, Shin-ichi; Aoki, Toshio
2007-01-01
A model legume Lotus japonicus (Regel) K. Larsen is one of the subjects of genome sequencing and functional genomics programs. In the course of targeted approaches to the legume genomics, we analyzed the genes encoding enzymes involved in the biosynthesis of the legume-specific 5-deoxyisoflavonoid of L. japonicus, which produces isoflavan phytoalexins on elicitor treatment. The paralogous biosynthetic genes were assigned as comprehensively as possible by biochemical experiments, similarity searches, comparison of the gene structures, and phylogenetic analyses. Among the 10 biosynthetic genes investigated, six comprise multigene families, and in many cases they form gene clusters in the chromosomes. Semi-quantitative reverse transcriptase–PCR analyses showed coordinate up-regulation of most of the genes during phytoalexin induction and complex accumulation patterns of the transcripts in different organs. Some paralogous genes exhibited similar expression specificities, suggesting their genetic redundancy. The molecular evolution of the biosynthetic genes is discussed. The results presented here provide reliable annotations of the genes and genetic markers for comparative and functional genomics of leguminous plants. PMID:17452423
GOMA: functional enrichment analysis tool based on GO modules
Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun
2013-01-01
Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results. PMID:23237213
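As a rough illustration of the kind of per-term enrichment score that tools in this area typically start from, the sketch below computes a hypergeometric over-representation p-value. This is an assumed, generic test and not GOMA's optimization model or its module ranking, which the abstract does not specify; the function name and the numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the query set, k: query genes annotated with the term.
    """
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

For example, with a hypothetical background of 10 genes of which 3 carry a term, a query of 2 genes that both carry it gives `hypergeom_pvalue(2, 2, 3, 10)`, i.e. 3/45.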
Mutant phenotypes for thousands of bacterial genes of unknown function
Price, Morgan N.; Wetmore, Kelly M.; Waters, R. Jordan; ...
2018-05-16
One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Lastly, our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
Bossi, Flavia; Fan, Jue; Xiao, Jun; Chandra, Lilyana; Shen, Max; Dorone, Yanniv; Wagner, Doris; Rhee, Seung Y
2017-06-26
The molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. To identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D
2013-10-01
The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.
Mazandu, Gaston K; Mulder, Nicola J
2013-09-25
The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
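To make the idea of an annotation-based information content (IC) measure concrete, here is a minimal sketch on a toy ontology. It assumes the common definition IC(t) = -log p(t), with p(t) the fraction of proteins annotated to t or any of its descendants, and uses Resnik similarity (the IC of the most informative common ancestor) as one example measure. The term names, the annotations, and the function names are all hypothetical; this is not DaGO-Fun's code or API.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms (not real GO IDs).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}

# Hypothetical direct annotations: protein -> set of terms.
ANNOTATIONS = {
    "P1": {"dna_binding"},
    "P2": {"rna_binding"},
    "P3": {"catalysis"},
    "P4": {"dna_binding"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """Compute IC(t) = -log p(t) for every term with at least one annotation."""
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for term in terms:
            covered |= ancestors(term)  # annotations propagate up to the root
        for term in covered:
            counts[term] += 1
    total = len(ANNOTATIONS)
    return {t: -math.log(c / total) for t, c in counts.items() if c > 0}


def resnik(term_a, term_b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[t] for t in common if t in ic)
```

In this toy setup the two binding terms share the ancestor `binding`, which covers three of the four proteins, so their similarity is -log(0.75); terms that share only the root have similarity zero.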
Aguilar-Martínez, José A.; Sinha, Neelima
2013-01-01
The TCP family of plant-specific transcription factors regulates plant form through control of cell proliferation and differentiation. This gene family is comprised of two groups, class I and class II. While the role of class II TCP genes in plant development is well known, data about the function of some class I TCP genes is lacking. We studied a group of phylogenetically related class I TCP genes: AtTCP7, AtTCP8, AtTCP22, and AtTCP23. The similar expression pattern in young growing leaves found for this group suggests similarity in gene function. Gene redundancy is characteristic in this group, as also seen in the class II TCP genes. We generated a pentuple mutant tcp8 tcp15 tcp21 tcp22 tcp23 and show that loss of function of these genes results in changes in leaf developmental traits. We also determined that these factors are able to mutually interact in a yeast two-hybrid assay and regulate the expression of KNOX1 genes. To circumvent the issue of genetic redundancy, dominant negative forms with SRDX repressor domain were used. Analysis of transgenic plants expressing AtTCP7-SRDX and AtTCP23-SRDX indicates a role of these factors in the control of cell proliferation. PMID:24137171
Fusing literature and full network data improves disease similarity computation.
Li, Ping; Nie, Yaling; Yu, Jingkai
2016-08-30
Identifying relatedness among diseases could help deepen understanding of the underlying pathogenic mechanisms of diseases, and facilitate drug repositioning projects. A number of methods for computing disease similarity have been developed; however, none of them were designed to utilize information from the entire protein interaction network, using instead only those interactions involving disease-causing genes. Most previously published methods required gene-disease association data; unfortunately, many diseases still have very few or no associated genes, which has impeded broad adoption of those methods. In this study, we propose a new method (MedNetSim) for computing disease similarity by integrating medical literature and protein interaction network. MedNetSim consists of a network-based method (NetSim), which employs the entire protein interaction network, and a MEDLINE-based method (MedSim), which computes disease similarity by mining the biomedical literature. Among function-based methods, NetSim achieved the best performance. Its average AUC (area under the receiver operating characteristic curve) reached 95.2 %. MedSim, whose performance was even comparable to some function-based methods, acquired the highest average AUC in all semantic-based methods. Integration of MedSim and NetSim (MedNetSim) further improved the average AUC to 96.4 %. We further studied the effectiveness of different data sources. It was found that quality of protein interaction data was more important than its volume. On the contrary, higher volume of gene-disease association data was more beneficial, even with a lower reliability. Utilizing higher volume of disease-related gene data further improved the average AUC of MedNetSim and NetSim to 97.5 % and 96.7 %, respectively. Integrating biomedical literature and protein interaction network can be an effective way to compute disease similarity.
Lacking sufficient disease-related gene data, literature-based methods such as MedSim can be a great addition to function-based algorithms. It may be beneficial to steer more resources toward studying gene-disease associations and improving the quality of protein interaction data. Disease similarities can be computed using the proposed methods at http://www.digintelli.com:8000/.
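Since the results above are reported as AUC values, a short sketch of how an AUC can be computed from similarity scores may help. It uses the standard pairwise interpretation (the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as one half). The scores and labels in the usage note are made up for illustration and are not data from the study; the function name is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Each positive/negative pair contributes 1 for a win, 0.5 for a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

With hypothetical similarity scores `[0.9, 0.2, 0.8, 0.1]` and labels `[1, 1, 0, 0]` (1 marking truly related disease pairs), three of the four positive/negative comparisons are won, so the AUC is 0.75. The quadratic pairwise loop is chosen for readability; a rank-based formulation would scale better.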
Segmental expression of Pax3/7 and engrailed homologs in tardigrade development.
Gabriel, Willow N; Goldstein, Bob
2007-06-01
How morphological diversity arises through evolution of gene sequence is a major question in biology. In Drosophila, the genetic basis for body patterning and morphological segmentation has been studied intensively. It is clear that some of the genes in the Drosophila segmentation program are functioning similarly in certain other taxa, although many questions remain about when these gene functions arose and which taxa use these genes similarly to establish diverse body plans. Tardigrades are an outgroup to arthropods in the Ecdysozoa and, as such, can provide insight into how gene functions have evolved among the arthropods and their close relatives. We developed immunostaining methods for tardigrade embryos, and we used cross-reactive antibodies to investigate the expression of homologs of the pair-rule gene paired (Pax3/7) and the segment polarity gene engrailed in the tardigrade Hypsibius dujardini. We find that in H. dujardini embryos, Pax3/7 protein localizes not in a pair-rule pattern but in a segmentally iterated pattern, after the segments are established, in regions of the embryo where neurons later arise. Engrailed protein localizes in the posterior ectoderm of each segment before ectodermal segmentation is apparent. Together with previous results from others, our data support the conclusions that the pair-rule function of Pax3/7 is specific to the arthropods, that some of the ancient functions of Pax3/7 and Engrailed in ancestral bilaterians may have been in neurogenesis, and that Engrailed may have a function in establishing morphological boundaries between segments that is conserved at least among the Panarthropoda.
Defining functional distance using manifold embeddings of gene ontology annotations
Lerman, Gilad; Shakhnovich, Boris E.
2007-01-01
Although rigorous measures of similarity for sequence and structure are now well established, the problem of defining functional relationships has been particularly daunting. Here, we present several manifold embedding techniques to compute distances between Gene Ontology (GO) functional annotations and consequently estimate functional distances between protein domains. To evaluate accuracy, we correlate the functional distance to the well established measures of sequence, structural, and phylogenetic similarities. Finally, we show that manual classification of structures into folds and superfamilies is mirrored by proximity in the newly defined function space. We show how functional distances place structure–function relationships in biological context resulting in insight into divergent and convergent evolution. The methods and results in this paper can be readily generalized and applied to a wide array of biologically relevant investigations, such as accuracy of annotation transference, the relationship between sequence, structure, and function, or coherence of expression modules. PMID:17595300
Lu, Shun-Wen; Chen, Shiyan; Wang, Jianying; Yu, Hang; Chronis, Demosthenis; Mitchum, Melissa G; Wang, Xiaohong
2009-09-01
Plant CLAVATA3/ESR-related (CLE) peptides have diverse roles in plant growth and development. Here, we report the isolation and functional characterization of five new CLE genes from the potato cyst nematode Globodera rostochiensis. Unlike typical plant CLE peptides that contain a single CLE motif, four of the five Gr-CLE genes encode CLE proteins with multiple CLE motifs. These Gr-CLE genes were found to be specifically expressed within the dorsal esophageal gland cell of nematode parasitic stages, suggesting a role for their encoded proteins in plant parasitism. Overexpression phenotypes of Gr-CLE genes in Arabidopsis mimicked those of plant CLE genes, and Gr-CLE proteins could rescue the Arabidopsis clv3-2 mutant phenotype when expressed within meristems. A short root phenotype was observed when synthetic GrCLE peptides were exogenously applied to roots of Arabidopsis or potato similar to the overexpression of Gr-CLE genes in Arabidopsis and potato hairy roots. These results reveal that G. rostochiensis CLE proteins with either single or multiple CLE motifs function similarly to plant CLE proteins and that CLE signaling components are conserved in both Arabidopsis and potato roots. Furthermore, our results provide evidence to suggest that the evolution of multiple CLE motifs may be an important mechanism for generating functional diversity in nematode CLE proteins to facilitate parasitism.
[Fish interferon response and its molecular regulation: a review].
Zhang, Yibing; Gui, Jianfang
2011-05-01
Interferon response is the first line of host defense against virus infection. Recent years have witnessed tremendous progress in understanding of fish innate response to virus infection, especially in fish interferon antiviral response. A series of fish genes involved in interferon antiviral response have been identified, and functional studies further reveal that fish possess an IFN antiviral system similar to that of mammals. However, fish virus-induced interferon genes contain introns similar to mammalian type III interferon genes although they encode proteins similar to type I interferons. This makes it hard to understand the evolution of vertebrate interferon genes and has resulted in a debate on the nomenclature of fish interferon genes. Fish also display some unique mechanisms underlying interferon antiviral response. This review documents the recent progress on fish interferon response and its molecular mechanism.
Developmentally distinct MYB genes encode functionally equivalent proteins in Arabidopsis.
Lee, M M; Schiefelbein, J
2001-05-01
The duplication and divergence of developmental control genes is thought to have driven morphological diversification during the evolution of multicellular organisms. To examine the molecular basis of this process, we analyzed the functional relationship between two paralogous MYB transcription factor genes, WEREWOLF (WER) and GLABROUS1 (GL1), in Arabidopsis. The WER and GL1 genes specify distinct cell types and exhibit non-overlapping expression patterns during Arabidopsis development. Nevertheless, reciprocal complementation experiments with a series of gene fusions showed that WER and GL1 encode functionally equivalent proteins, and their unique roles in plant development are entirely due to differences in their cis-regulatory sequences. Similar experiments with a distantly related MYB gene (MYB2) showed that its product cannot functionally substitute for WER or GL1. Furthermore, an analysis of the WER and GL1 proteins shows that conserved sequences correspond to specific functional domains. These results provide new insights into the evolution of the MYB gene family in Arabidopsis, and, more generally, they demonstrate that novel developmental gene function may arise solely by the modification of cis-regulatory sequences.
Howe, J G; Shu, M D
1988-08-01
Genes for the Epstein-Barr virus-encoded RNAs (EBERs), two low-molecular-weight RNAs encoded by the human gammaherpesvirus Epstein-Barr virus (EBV), hybridize to two small RNAs in a baboon cell line that contains a similar virus, herpesvirus papio (HVP). The genes for the HVP RNAs (HVP-1 and HVP-2) are located together in the small unique region at the left end of the viral genome and are transcribed by RNA polymerase III in a rightward direction, similar to the EBERs. There is significant similarity between EBER1 and HVP-1 RNA, except for an insert of 22 nucleotides which increases the length of HVP-1 RNA to 190 nucleotides. There is less similarity between the sequences of EBER2 and HVP-2 RNA, but both have a length of about 170 nucleotides. The predicted secondary structure of each HVP RNA is remarkably similar to that of the respective EBER, implying that the secondary structures are important for function. Upstream from the initiation sites of all four RNA genes are several highly conserved sequences which may function in the regulation of transcription. The HVP RNAs, together with the EBERs, are highly abundant in transformed cells and are efficiently bound by the cellular La protein.
Comparison of the Heme Iron Utilization Systems of Pathogenic Vibrios
O’Malley, S. M.; Mouton, S. L.; Occhino, D. A.; Deanda, M. T.; Rashidi, J. R.; Fuson, K. L.; Rashidi, C. E.; Mora, M. Y.; Payne, S. M.; Henderson, D. P.
1999-01-01
Vibrio alginolyticus, Vibrio fluvialis, and Vibrio parahaemolyticus utilized heme and hemoglobin as iron sources and contained chromosomal DNA similar to several Vibrio cholerae heme iron utilization genes. A V. parahaemolyticus gene that performed the function of V. cholerae hutA was isolated. A portion of the tonB1 locus of V. parahaemolyticus was sequenced and found to encode proteins similar in amino acid sequence to V. cholerae HutW, TonB1, and ExbB1. A recombinant plasmid containing the V. cholerae tonB1 and exbB1D1 genes complemented a V. alginolyticus heme utilization mutant. These data suggest that the heme iron utilization systems of the pathogenic vibrios tested, particularly V. parahaemolyticus and V. alginolyticus, are similar at the DNA level, the functional level, and, in the case of V. parahaemolyticus, the amino acid sequence or protein level to that of V. cholerae. PMID:10348876
Molecular and Functional Characterization of Broccoli EMBRYONIC FLOWER 2 Genes
Chen, Long-Fang O.; Lin, Chun-Hung; Lai, Ying-Mi; Huang, Jia-Yuan; Sung, Zinmay Renee
2012-01-01
Polycomb group (PcG) proteins regulate major developmental processes in Arabidopsis. EMBRYONIC FLOWER 2 (EMF2), the VEFS domain-containing PcG gene, regulates diverse genetic pathways and is required for vegetative development and plant survival. Despite widespread EMF2-like sequences in plants, little is known about their function other than in Arabidopsis and rice. To study the role of EMF2 in broccoli (Brassica oleracea var. italica cv. Elegance) development, we identified two broccoli EMF2 (BoEMF2) genes with sequence homology to and a similar gene expression pattern to that in Arabidopsis (AtEMF2). Reducing their expression in broccoli resulted in aberrant phenotypes and gene expression patterns. BoEMF2 regulates genes involved in diverse developmental and stress programs similar to AtEMF2 in Arabidopsis. However, BoEMF2 differs from AtEMF2 in the regulation of flower organ identity, cell proliferation and elongation, and death-related genes, which may explain the distinct phenotypes. The expression of BoEMF2.1 in the Arabidopsis emf2 mutant (Rescued emf2) partially rescued the mutant phenotype and restored the gene expression pattern to that of the wild type. Many EMF2-mediated molecular and developmental functions are conserved in broccoli and Arabidopsis. Furthermore, the restored gene expression pattern in Rescued emf2 provides insights into the molecular basis of PcG-mediated growth and development. PMID:22537758
Functional networks inference from rule-based machine learning models.
Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume
2016-01-01
Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those found by similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in the context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. 
The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html.
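The core assumption in the abstract above, that genes used together in a classification rule may be functionally related, can be illustrated with a minimal sketch. This is not the FuNeL implementation: the rule representation (a list of gene names per rule), the function name `cooccurrence_network`, and the example gene lists are all assumptions made for illustration, and the published protocol may involve further steps not shown here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(rules):
    """Count how often each pair of genes appears together in a rule.

    `rules` is a hypothetical representation: one iterable of gene
    names per rule. Returns a Counter keyed by sorted gene pairs.
    """
    edges = Counter()
    for genes in rules:
        # Sort and de-duplicate so each pair has one canonical key.
        for a, b in combinations(sorted(set(genes)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical rules, each listing the genes it tests.
rules = [["TP53", "BRCA1", "MYC"], ["TP53", "MYC"], ["EGFR"]]
network = cooccurrence_network(rules)
```

Edge weights here are simply the number of rules in which a pair co-occurs; a rule that uses a single gene contributes no edge.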
A new family of β-helix proteins with similarities to the polysaccharide lyases
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
2014-09-27
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bossi, Flavia; Fan, Jue; Xiao, Jun
2017-06-26
Here, the molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. As a result, to identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Seifi Moroudi, Reihane; Masoudi, Ali Akbar; Vaez Torshizi, Rasoul; Zandi, Mohammad
2014-12-01
One of the important behaviors of dogs is trainability, which is affected by learning and memory genes. Such genes have not yet been identified in dogs. In the current research, these genes were found in animal models by mining biological data and the scientific literature. The proteins of these genes were obtained from the UniProt database for dogs and humans. Not all homologous proteins perform similar functions; thus these proteins were compared in terms of protein families, domains, biological processes, molecular functions, and cellular location of metabolic pathways in the Interpro, KEGG, Quick Go and Psort databases. The results showed that some of these proteins have the same function in the rat or mouse, dog, and human. It is anticipated that the proteins of these genes may be involved in learning and memory in dogs. Then, the expression pattern of the recognized genes was investigated in the dog hippocampus using the existing information in the GEO profile. The results showed that the BDNF, TAC1 and CCK genes are expressed in the dog hippocampus; therefore, these genes could be strong candidates associated with learning and memory in dogs. Subsequently, due to the importance of the promoter regions in gene function, this region was investigated in the above genes. Analysis of the promoter indicated that the HNF-4 site of the BDNF gene and the transcription start site of the CCK gene are exposed to methylation. Phylogenetic analysis of protein sequences of these genes showed high similarity in each of these three genes among the studied species. The dN/dS ratio for the BDNF, TAC1 and CCK genes indicates purifying selection during the evolution of the genes.
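The closing statement about dN/dS relies on the standard reading of that ratio: values well below 1 suggest purifying selection, values near 1 suggest neutral evolution, and values above 1 suggest positive selection. A minimal sketch of that interpretation follows. The function name, the `tolerance` band around 1, and the example ratios are hypothetical and are not taken from the study, which does not report the values here.

```python
def classify_selection(dn_ds, tolerance=0.05):
    """Label a dN/dS ratio using the conventional thresholds.

    `tolerance` is an arbitrary illustrative band around 1.0,
    not a value from the study.
    """
    if dn_ds < 1 - tolerance:
        return "purifying"
    if dn_ds > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical ratios for illustration only.
example_ratios = {"gene_a": 0.12, "gene_b": 1.0, "gene_c": 1.8}
labels = {gene: classify_selection(r) for gene, r in example_ratios.items()}
```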
Follin, Elna; Karlsson, Maria; Lundegaard, Claus; Nielsen, Morten; Wallin, Stefan; Paulsson, Kajsa; Westerdahl, Helena
2013-04-01
The major histocompatibility complex (MHC) genes are the most polymorphic genes found in the vertebrate genome, and they encode proteins that play an essential role in the adaptive immune response. Many songbirds (passerines) have been shown to have a large number of transcribed MHC class I genes compared to most mammals. To elucidate the reason for this large number of genes, we compared 14 MHC class I alleles (α1-α3 domains), from great reed warbler, house sparrow and tree sparrow, via phylogenetic analysis, homology modelling and in silico peptide-binding predictions to investigate their functional and genetic relationships. We found more pronounced clustering of the MHC class I allomorphs (allele-specific proteins) with regard to their function (peptide-binding specificities) than with regard to their genetic relationships (amino acid sequences), indicating that the high number of alleles is of functional significance. The MHC class I allomorphs from house sparrow and tree sparrow, species that diverged 10 million years ago (MYA), had overlapping peptide-binding specificities, and these similarities across species were also confirmed in phylogenetic analyses based on amino acid sequences. Notably, there were also overlapping peptide-binding specificities in the allomorphs from house sparrow and great reed warbler, although these species diverged 30 MYA. This overlap was not found in a tree based on amino acid sequences. Our interpretation is that convergent evolution on the level of the protein function, possibly driven by selection from shared pathogens, has resulted in allomorphs with similar peptide-binding repertoires, although trans-species evolution in combination with gene conversion cannot be ruled out.
Impact of Cigarette Smoke on the Human and Mouse Lungs: A Gene-Expression Comparison Study
Morissette, Mathieu C.; Lamontagne, Maxime; Bérubé, Jean-Christophe; Gaschler, Gordon; Williams, Andrew; Yauk, Carole; Couture, Christian; Laviolette, Michel; Hogg, James C.; Timens, Wim; Halappanavar, Sabina; Stampfli, Martin R.; Bossé, Yohan
2014-01-01
Cigarette smoke is well known for its adverse effects on human health, especially on the lungs. Basic research is essential to identify the mechanisms involved in the development of cigarette smoke-related diseases, but translation of new findings from pre-clinical models to the clinic remains difficult. In the present study, we aimed at comparing the gene expression signature between the lungs of human smokers and mice exposed to cigarette smoke to identify the similarities and differences. Using human and mouse whole-genome gene expression arrays, changes in gene expression, signaling pathways and biological functions were assessed. We found that genes significantly modulated by cigarette smoke in humans were enriched for genes modulated by cigarette smoke in mice, suggesting a similar response of both species. Sixteen smoking-induced genes were in common between humans and mice including six newly reported to be modulated by cigarette smoke. In addition, we identified a new conserved pulmonary response to cigarette smoke in the induction of phospholipid metabolism/degradation pathways. Finally, the majority of biological functions modulated by cigarette smoke in humans were also affected in mice. Altogether, the present study provides information on similarities and differences in lung gene expression response to cigarette smoke that exist between human and mouse. Our results foster the idea that animal models should be used to study the involvement of pathways rather than single genes in human diseases. PMID:24663285
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures
2013-01-01
Background: The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. Results: We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. Conclusions: The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis. PMID:24067102
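To make the notion of annotation-based information content concrete, the sketch below computes IC as the negative log of a term's annotation frequency and uses it in the Resnik measure (the IC of the most informative common ancestor), one widely used IC-based similarity. It is an illustration only and not DaGO-Fun code: the term names, the counts, the `ancestors` mapping, and both function names are hypothetical, and the counts are assumed to already include annotations to descendant terms.

```python
import math


def information_content(term_counts, root):
    """IC(t) = -log(count(t) / count(root)) for each annotated term.

    Counts are assumed to be cumulative (term plus its descendants).
    """
    total = term_counts[root]
    return {t: -math.log(c / total) for t, c in term_counts.items() if c > 0}


def resnik(ic, ancestors, t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = (ancestors[t1] | {t1}) & (ancestors[t2] | {t2})
    return max(ic[t] for t in common)


# Hypothetical toy ontology: root -> A -> {B, C}.
counts = {"root": 100, "A": 50, "B": 10, "C": 5}
ancestors = {"root": set(), "A": {"root"}, "B": {"A", "root"}, "C": {"A", "root"}}
ic = information_content(counts, "root")
sim_b_c = resnik(ic, ancestors, "B", "C")
```

In this toy ontology the terms B and C share the ancestors A and root, so their Resnik similarity equals the IC of A.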
Mina, Eleni; van Roon-Mom, Willeke; Hettne, Kristina; van Zwet, Erik; Goeman, Jelle; Neri, Christian; A C 't Hoen, Peter; Mons, Barend; Roos, Marco
2016-08-01
Huntington's disease (HD) is a devastating brain disorder with no effective treatment or cure available. The scarcity of brain tissue makes it hard to study changes in the brain and impossible to perform longitudinal studies. However, peripheral pathology in HD suggests that it is possible to study the disease using peripheral tissue as a monitoring tool for disease progression and/or efficacy of novel therapies. In this study, we investigated if blood can be used to monitor disease severity and progression in brain. Since previous attempts using only gene expression proved unsuccessful, we compared blood and brain Huntington's disease signatures in a functional context. Microarray HD gene expression profiles from three brain regions were compared to the transcriptome of HD blood generated by next generation sequencing. The comparison was performed with a combination of weighted gene co-expression network analysis and literature based functional analysis (Concept Profile Analysis). Uniquely, our comparison of blood and brain datasets was not based on (the very limited) gene overlap but on the similarity between the gene annotations in four different semantic categories: "biological process", "cellular component", "molecular function" and "disease or syndrome". We identified signatures in HD blood reflecting a broad pathophysiological spectrum, including alterations in the immune response, sphingolipid biosynthetic processes, lipid transport, cell signaling, protein modification, spliceosome, RNA splicing, vesicle transport, cell signaling and synaptic transmission. Part of this spectrum was reminiscent of the brain pathology. The HD signatures in caudate nucleus and BA4 exhibited the highest similarity with blood, irrespective of the category of semantic annotations used. BA9 exhibited an intermediate similarity, while cerebellum had the least similarity. 
We present two signatures that were shared between blood and brain: immune response and spinocerebellar ataxias. Our results demonstrate that HD blood exhibits dysregulation that is similar to brain at a functional level, but not necessarily at the level of individual genes. We report two common signatures that can be used to monitor the pathology in brain of HD patients in a non-invasive manner. Our results are an exemplar of how signals in blood data can be used to represent brain disorders. Our methodology can be used to study disease specific signatures in diseases where heterogeneous tissues are involved in the pathology.
Diallinas, G; Gorfinkiel, L; Arst, H N; Cecchetto, G; Scazzocchio, C
1995-04-14
In Aspergillus nidulans, loss-of-function mutations in the uapA and azgA genes, encoding the major uric acid-xanthine and hypoxanthine-adenine-guanine permeases, respectively, result in impaired utilization of these purines as sole nitrogen sources. The residual growth of the mutant strains is due to the activity of a broad specificity purine permease. We have identified uapC, the gene coding for this third permease through the isolation of both gain-of-function and loss-of-function mutations. Uptake studies with wild-type and mutant strains confirmed the genetic analysis and showed that the UapC protein contributes 30% and 8-10% to uric acid and hypoxanthine transport rates, respectively. The uapC gene was cloned, its expression studied, its sequence and transcript map established, and the sequence of its putative product analyzed. uapC message accumulation is: (i) weakly induced by 2-thiouric acid; (ii) repressed by ammonium; (iii) dependent on functional uaY and areA regulatory gene products (mediating uric acid induction and nitrogen metabolite repression, respectively); (iv) increased by uapC gain-of-function mutations which specifically, but partially, suppress a leucine to valine mutation in the zinc finger of the protein coded by the areA gene. The putative uapC gene product is a highly hydrophobic protein of 580 amino acids (M(r) = 61,251) including 12-14 putative transmembrane segments. The UapC protein is highly similar (58% identity) to the UapA permease and significantly similar (23-34% identity) to a number of bacterial transporters. Comparisons of the sequences and hydropathy profiles of members of this novel family of transporters yield insights into their structure, functionally important residues, and possible evolutionary relationships.
Tran-Nguyen, L. T. T.; Kube, M.; Schneider, B.; Reinhardt, R.; Gibb, K. S.
2008-01-01
The chromosome sequence of “Candidatus Phytoplasma australiense” (subgroup tuf-Australia I; rp-A), associated with dieback in papaya, Australian grapevine yellows in grapevine, and several other important plant diseases, was determined. The circular chromosome is represented by 879,324 nucleotides, a GC content of 27%, and 839 protein-coding genes. Five hundred two of these protein-coding genes were functionally assigned, while 337 genes were hypothetical proteins with unknown function. Potential mobile units (PMUs) containing clusters of DNA repeats comprised 12.1% of the genome. These PMUs encoded genes involved in DNA replication, repair, and recombination; nucleotide transport and metabolism; translation; and ribosomal structure. Elements with similarities to phage integrases found in these mobile units were difficult to classify, as they were similar to both insertion sequences and bacteriophages. Comparative analysis of “Ca. Phytoplasma australiense” with “Ca. Phytoplasma asteris” strains OY-M and AY-WB showed that the gene order was more conserved between the closely related “Ca. Phytoplasma asteris” strains than to “Ca. Phytoplasma australiense.” Differences observed between “Ca. Phytoplasma australiense” and “Ca. Phytoplasma asteris” strains included the chromosome size (18,693 bp larger than OY-M), a larger number of genes with assigned function, and hypothetical proteins with unknown function. PMID:18359806
Bogdanov, Yuri F; Dadashev, Sergei Y; Grishaeva, Tatiana M
2003-01-01
Evolutionarily distant organisms have not only orthologs, but also nonhomologous proteins that build functionally similar subcellular structures. For instance, this is true of the protein components of the synaptonemal complex (SC), a universal ultrastructure that ensures the successful pairing and recombination of homologous chromosomes during meiosis. We aimed at developing a method to search databases for genes that code for such nonhomologous but functionally analogous proteins. Advantage was taken of the ultrastructural parameters of the SC and the conformation of the SC proteins responsible for these. Proteins involved in the SC central space are known to be similar in secondary structure. Using published data, we found a highly significant correlation between the width of the SC central space and the length of the rod-shaped central domain of mammalian and yeast intermediate proteins forming transversal filaments in the SC central space. Based on this, we suggested a method for searching genome databases of distant organisms for genes whose virtual proteins meet the above correlation requirement. Our recent finding of the Drosophila melanogaster CG17604 gene coding for a synaptonemal complex transversal filament protein received experimental support from another lab. With the same strategy, we showed that the Arabidopsis thaliana and Caenorhabditis elegans genomes contain unique genes coding for such proteins.
Vicente, Juan J; Galardi-Castilla, María; Escalante, Ricardo; Sastre, Leandro
2008-01-03
The social amoeba Dictyostelium discoideum executes a multicellular development program upon starvation. This morphogenetic process requires the differential regulation of a large number of genes and is coordinated by extracellular signals. The MADS-box transcription factor SrfA is required for several stages of development, including slug migration and spore terminal differentiation. Subtractive hybridization allowed the isolation of a gene, sigN (SrfA-induced gene N), that was dependent on the transcription factor SrfA for expression at the slug stage of development. Homology searches detected the existence of a large family of sigN-related genes in the Dictyostelium discoideum genome. The 13 most similar genes are grouped in two regions of chromosome 2 and have been named Group1 and Group2 sigN genes. The putative encoded proteins are 87-89 amino acids long. All these genes have a similar structure, composed of a first exon containing a 13-nucleotide-long open reading frame and a second exon comprising the remainder of the putative coding region. The expression of these genes is induced at 10 hours of development. Analyses of their promoter regions indicate that these genes are expressed in the prestalk region of developing structures. The addition of antibodies raised against SigN Group 2 proteins induced disintegration of multicellular structures at the mound stage of development. A large family of genes coding for small proteins has been identified in D. discoideum. Two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development. Functional studies using antibodies raised against Group 2 SigN proteins indicate that these genes could play a role during multicellular development.
Microbial Gene Abundance and Expression Patterns across a River to Ocean Salinity Gradient
Fortunato, Caroline S.; Crump, Byron C.
2015-01-01
Microbial communities mediate the biogeochemical cycles that drive ecosystems, and it is important to understand how these communities are affected by changing environmental conditions, especially in complex coastal zones. As fresh and marine waters mix in estuaries and river plumes, the salinity, temperature, and nutrient gradients that are generated strongly influence bacterioplankton community structure, yet, a parallel change in functional diversity has not been described. Metagenomic and metatranscriptomic analyses were conducted on five water samples spanning the salinity gradient of the Columbia River coastal margin, including river, estuary, plume, and ocean, in August 2010. Samples were pre-filtered through 3 μm filters and collected on 0.2 μm filters, thus results were focused on changes among free-living microbial communities. Results from metagenomic 16S rRNA sequences showed taxonomically distinct bacterial communities in river, estuary, and coastal ocean. Despite the strong salinity gradient observed over sampling locations (0 to 33), the functional gene profiles in the metagenomes were very similar from river to ocean with an average similarity of 82%. The metatranscriptomes, however, had an average similarity of 31%. Although differences were few among the metagenomes, we observed a change from river to ocean in the abundance of genes encoding for catabolic pathways, osmoregulators, and metal transporters. Additionally, genes specifying both bacterial oxygenic and anoxygenic photosynthesis were abundant and expressed in the estuary and plume. Denitrification genes were found throughout the Columbia River coastal margin, and most highly expressed in the estuary. 
Across a river to ocean gradient, the free-living microbial community followed three different patterns of diversity: 1) the taxonomy of the community changed strongly with salinity, 2) metabolic potential was highly similar across samples, with few differences in functional gene abundance from river to ocean, and 3) gene expression was highly variable and generally was independent of changes in salinity. PMID:26536246
Functional Annotation of the Arabidopsis Genome Using Controlled Vocabularies
Berardini, Tanya Z.; Mundodi, Suparna; Reiser, Leonore; Huala, Eva; Garcia-Hernandez, Margarita; Zhang, Peifen; Mueller, Lukas A.; Yoon, Jungwoon; Doyle, Aisling; Lander, Gabriel; Moseyko, Nick; Yoo, Danny; Xu, Iris; Zoeckler, Brandon; Montoya, Mary; Miller, Neil; Weems, Dan; Rhee, Seung Y.
2004-01-01
Controlled vocabularies are increasingly used by databases to describe genes and gene products because they facilitate identification of similar genes within an organism or among different organisms. One of The Arabidopsis Information Resource's goals is to associate all Arabidopsis genes with terms developed by the Gene Ontology Consortium that describe the molecular function, biological process, and subcellular location of a gene product. We have also developed terms describing Arabidopsis anatomy and developmental stages and use these to annotate published gene expression data. As of March 2004, we used computational and manual annotation methods to make 85,666 annotations representing 26,624 unique loci. We focus on associating genes to controlled vocabulary terms based on experimental data from the literature and use The Arabidopsis Information Resource-developed PubSearch software to facilitate this process. Each annotation is tagged with a combination of evidence codes, evidence descriptions, and references that provide a robust means to assess data quality. Annotation of all Arabidopsis genes will allow quantitative comparisons between sets of genes derived from sources such as microarray experiments. The Arabidopsis annotation data will also facilitate annotation of newly sequenced plant genomes by using sequence similarity to transfer annotations to homologous genes. In addition, complete and up-to-date annotations will make unknown genes easy to identify and target for experimentation. Here, we describe the process of Arabidopsis functional annotation using a variety of data sources and illustrate several ways in which this information can be accessed and used to infer knowledge about Arabidopsis and other plant species. PMID:15173566
Ortholog-based screening and identification of genes related to intracellular survival.
Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin
2018-04-20
Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. A total of 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.
Singh, Sanjay K; Roy, Sujit; Choudhury, Swarup Roy; Sengupta, Dibyendu N
2010-07-21
The DNA repair and recombination (DRR) proteins protect organisms against genetic damage caused by environmental and other genotoxic agents, either by removing DNA lesions or by helping to tolerate them. We identified genes potentially involved in DRR mechanisms in Arabidopsis and rice using similarity searches and conserved domain analysis against proteins known to be involved in DRR in human, yeast and E. coli. As expected, many of the DRR genes are very similar to those found in other eukaryotes. Besides these eukaryote-specific genes, several prokaryote-specific genes were also found to be well conserved in plants. In Arabidopsis, several functionally important DRR gene duplications are present that do not occur in rice. Among DRR proteins, we found that proteins belonging to the nucleotide excision repair pathway were relatively more conserved than proteins needed for the other DRR pathways. Sub-cellular localization studies of DRR genes suggest that these proteins mostly reside in the nucleus, while gene drain between the nucleus and cell organelles was also found in some cases. The similarities and dissimilarities between the DRR pathways of plants and other organisms are discussed. The observed differences broaden our knowledge about DRR in the plant world and raise the question of whether differentiated functions have evolved in some cases. Altogether, these results provide a useful framework for further experimental studies in these organisms.
Functional Annotations of Paralogs: A Blessing and a Curse
Zallot, Rémi; Harrison, Katherine J.; Kolaczkowski, Bryan; de Crécy-Lagard, Valérie
2016-01-01
Gene duplication followed by mutation is a classic mechanism of neofunctionalization, producing gene families with functional diversity. In some cases, a single point mutation is sufficient to change the substrate specificity and/or the chemistry performed by an enzyme, making it difficult to accurately separate enzymes with identical functions from homologs with different functions. Because sequence similarity is often used as a basis for assigning functional annotations to genes, non-isofunctional gene families pose a great challenge for genome annotation pipelines. Here we describe how integrating evolutionary and functional information such as genome context, phylogeny, metabolic reconstruction and signature motifs may be required to correctly annotate multifunctional families. These integrative analyses can also lead to the discovery of novel gene functions, as hints from specific subgroups can guide the functional characterization of other members of the family. We demonstrate how careful manual curation processes using comparative genomics can disambiguate subgroups within large multifunctional families and discover their functions. We present the COG0720 protein family as a case study. We also discuss strategies to automate this process to improve the accuracy of genome functional annotation pipelines. PMID:27618105
Jain, Shruti; Bhattacharyya, Kausik; Bakshi, Rachit; Narang, Ankita; Brahmachari, Vani
2017-04-01
The genome annotation and identification of gene function depend on conserved biochemical activity. However, in the cell, proteins with the same biochemical function can participate in different cellular pathways and cannot complement one another. Similarly, two proteins of very different biochemical functions are put in the same class of cellular function; for example, the classification of a gene as an oncogene or a tumour suppressor gene is not related to its biochemical function, but is related to its cellular function. We have taken an approach to identify peptide signatures for cellular function in proteins with known biochemical function. Using ATPases as a test case, we classified ATPases (2360 proteins) and kinases (517 proteins) from the human genome into different cellular function categories such as transcriptional, replicative, and chromatin remodelling proteins. Using the publicly available tool MEME, we identify peptide signatures shared among the members of a given category but not between cellular functional categories; for example, no motif sharing is seen between chromatin remodelling and transporter ATPases, or between receptor serine/threonine kinases and receptor tyrosine kinases. There are motifs shared within each category with significant E value and high occurrence. This concept of a signature for cellular function was applied to developmental regulators, the polycomb and trithorax proteins, which led to the prediction of the role of INO80, a chromatin remodelling protein, in development. This has been experimentally validated earlier for its role in homeotic gene regulation and its interaction with regulatory complexes like the Polycomb and Trithorax complex. Proteins 2017; 85:682-693. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Zhang, Li; Qian, Liqiang; Ding, Chuntao; Zhou, Weida; Li, Fanzhang
2015-09-01
The family of discriminant neighborhood embedding (DNE) methods comprises typical graph-based methods for dimension reduction and has been successfully applied to face recognition. This paper proposes a new variant of DNE, called similarity-balanced discriminant neighborhood embedding (SBDNE), and applies it to cancer classification using gene expression data. By introducing a novel similarity function, SBDNE treats pairs of data points from the same class differently from pairs drawn from different classes. The homogeneous and heterogeneous neighbors are selected according to the new similarity function instead of the Euclidean distance. SBDNE constructs two adjacent graphs, a between-class adjacent graph and a within-class adjacent graph, using the new similarity function. From these two graphs, we can generate the local between-class scatter and the local within-class scatter, respectively. Thus, SBDNE can maximize the between-class scatter and simultaneously minimize the within-class scatter to find the optimal projection matrix. Experimental results on six microarray datasets show that SBDNE is a promising method for cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Insights into social insects from the genome of the honeybee Apis mellifera
2007-01-01
Here we report the genome sequence of the honeybee Apis mellifera, a key model for social behaviour and essential to global ecology through pollination. Compared with other sequenced insect genomes, the A. mellifera genome has high A+T and CpG contents, lacks major transposon families, evolves more slowly, and is more similar to vertebrates for circadian rhythm, RNA interference and DNA methylation genes, among others. Furthermore, A. mellifera has fewer genes for innate immunity, detoxification enzymes, cuticle-forming proteins and gustatory receptors, more genes for odorant receptors, and novel genes for nectar and pollen utilization, consistent with its ecology and social organization. Compared to Drosophila, genes in early developmental pathways differ in Apis, whereas similarities exist for functions that differ markedly, such as sex determination, brain function and behaviour. Population genetics suggests a novel African origin for the species A. mellifera and insights into whether Africanized bees spread throughout the New World via hybridization or displacement. PMID:17073008
Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong
2014-05-15
In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
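The correlation step mentioned above can be sketched as follows. This is a simplified illustration that links probes whose absolute Pearson correlation exceeds a hard cutoff; it is not the weighted (soft-threshold) procedure of the study, and the probe names, expression values, and the 0.9 threshold are assumptions invented for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(profiles, threshold=0.9):
    """Return (probe, probe, r) for pairs with |r| >= threshold (hypothetical cutoff)."""
    names = sorted(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges

# Toy expression profiles across four samples, invented for illustration.
toy = {
    "probe_A": [1.0, 2.0, 3.0, 4.0],
    "probe_B": [2.0, 4.0, 6.0, 8.0],
    "probe_C": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy)
```

In this toy data only probe_A and probe_B are linked, because probe_C correlates weakly and negatively with both.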
Defoort, Jonas; Van de Peer, Yves; Vermeirssen, Vanessa
2018-06-05
Gene regulatory networks (GRNs) consist of different molecular interactions that closely work together to establish proper gene expression in time and space. Especially in higher eukaryotes, many questions remain on how these interactions collectively coordinate gene regulation. We study high quality GRNs consisting of undirected protein-protein, genetic and homologous interactions, and directed protein-DNA, regulatory and miRNA-mRNA interactions in the worm Caenorhabditis elegans and the plant Arabidopsis thaliana. Our data-integration framework integrates interactions in composite network motifs, clusters these in biologically relevant, higher-order topological network motif modules, overlays these with gene expression profiles and discovers novel connections between modules and regulators. Similar modules exist in the integrated GRNs of worm and plant. We show how experimental or computational methodologies underlying a certain data type impact network topology. Through phylogenetic decomposition, we found that proteins of worm and plant tend to functionally interact with proteins of a similar age, while at the regulatory level TFs favor same age, but also older target genes. Despite some influence of the duplication mode difference, we also observe at the motif and module level for both species a preference for age homogeneity for undirected and age heterogeneity for directed interactions. This leads to a model where novel genes are added together to the GRNs in a specific biological functional context, regulated by one or more TFs that also target older genes in the GRNs. Overall, we detected topological, functional and evolutionary properties of GRNs that are potentially universal in all species.
Towards an informative mutant phenotype for every bacterial gene
Deutschbauer, Adam; Price, Morgan N.; Wetmore, Kelly M.; ...
2014-08-11
Mutant phenotypes provide strong clues to the functions of the underlying genes and could allow annotation of the millions of sequenced yet uncharacterized bacterial genes. However, it is not known how many genes have a phenotype under laboratory conditions, how many phenotypes are biologically interpretable for predicting gene function, and what experimental conditions are optimal to maximize the number of genes with a phenotype. To address these issues, we measured the mutant fitness of 1,586 genes of the ethanol-producing bacterium Zymomonas mobilis ZM4 across 492 diverse experiments and found statistically significant phenotypes for 89% of all assayed genes. Thus, in Z. mobilis, most genes have a functional consequence under laboratory conditions. We demonstrate that 41% of Z. mobilis genes have both a strong phenotype and a similar fitness pattern (cofitness) to another gene, and are therefore good candidates for functional annotation using mutant fitness. Among 502 poorly characterized Z. mobilis genes, we identified a significant cofitness relationship for 174. For 57 of these genes without a specific functional annotation, we found additional evidence to support the biological significance of these gene-gene associations, and in 33 instances, we were able to predict specific physiological or biochemical roles for the poorly characterized genes. Last, we identified a set of 79 diverse mutant fitness experiments in Z. mobilis that are nearly as biologically informative as the entire set of 492 experiments. Therefore, our work provides a blueprint for the functional annotation of diverse bacteria using mutant fitness.
Dt2 is a gain-of-function MADS-Domain factor gene that controls semi-determinacy in soybean
USDA-ARS's Scientific Manuscript database
Similar to Arabidopsis, the wild soybean (Glycine soja) and many soybean (Glycine max) cultivars exhibit indeterminate stem growth controlled by a gene Dt1 – the functional counterpart of the Arabidopsis TFL1. Mutations in TFL1 and Dt1 both result in the shoot apical meristem (SAM) switching from ve...
A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.
Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J
2016-02-01
Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also of allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
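To make the idea of an information content-based semantic similarity concrete, here is a minimal sketch of the Resnik measure, in which the similarity of two terms is the information content (IC) of their most informative common ancestor and IC(t) = -log p(t). The ontology, term names, and annotation counts are invented toy values, and the functions are hypothetical; this is not A-DaGO-Fun's API.

```python
import math

# Toy ontology: term -> set of parent terms (hypothetical names).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}

# Direct annotation counts per term (invented numbers).
DIRECT_COUNTS = {
    "root": 0, "binding": 2, "catalysis": 4,
    "dna_binding": 1, "rna_binding": 1,
}

def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def information_content(term):
    """IC(t) = -log p(t), where p(t) is the share of annotations at t or below."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)

def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)
```

With these toy counts, two sibling terms under "binding" share an ancestor covering half of all annotations, while terms that meet only at the root share no information.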
Evolution of the bovine lysozyme gene family: changes in gene expression and reversion of function.
Irwin, D M
1995-09-01
Recruitment of lysozyme to a digestive function in ruminant artiodactyls is associated with amplification of the gene. At least four of the approximately ten genes are expressed in the stomach, and several are expressed in nonstomach tissues. Characterization of additional lysozymelike sequences in the bovine genome has identified most, if not all, of the members of this gene family. There are at least six stomachlike lysozyme genes, two of which are pseudogenes. The stomach lysozyme pseudogenes show a pattern of concerted evolution similar to that of the functional stomach genes. At least four nonstomach lysozyme genes exist. The nonstomach lysozyme genes are not monophyletic. A gene encoding a tracheal lysozyme was isolated, and the stomach lysozyme of advanced ruminants was found to be more closely related to the tracheal lysozyme than to the stomach lysozyme of the camel or other nonstomach lysozyme genes of ruminants. The tracheal lysozyme shares with stomach lysozymes of advanced ruminants the deletion of amino acid 103, and several other adaptive sequence characteristics of stomach lysozymes. I suggest here that tracheal lysozyme has reverted from a functional stomach lysozyme. Tracheal lysozyme then represents a second instance of a change in lysozyme gene expression and function within ruminants.
Howe, J G; Shu, M D
1988-01-01
Genes for the Epstein-Barr virus-encoded RNAs (EBERs), two low-molecular-weight RNAs encoded by the human gammaherpesvirus Epstein-Barr virus (EBV), hybridize to two small RNAs in a baboon cell line that contains a similar virus, herpesvirus papio (HVP). The genes for the HVP RNAs (HVP-1 and HVP-2) are located together in the small unique region at the left end of the viral genome and are transcribed by RNA polymerase III in a rightward direction, similar to the EBERs. There is significant similarity between EBER1 and HVP-1 RNA, except for an insert of 22 nucleotides which increases the length of HVP-1 RNA to 190 nucleotides. There is less similarity between the sequences of EBER2 and HVP-2 RNA, but both have a length of about 170 nucleotides. The predicted secondary structure of each HVP RNA is remarkably similar to that of the respective EBER, implying that the secondary structures are important for function. Upstream from the initiation sites of all four RNA genes are several highly conserved sequences which may function in the regulation of transcription. The HVP RNAs, together with the EBERs, are highly abundant in transformed cells and are efficiently bound by the cellular La protein. PMID:2839701
Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru
2007-01-01
The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
Ma, Jinxing; Wang, Zhiwei; Li, Huan; Park, Hee-Deung; Wu, Zhichao
2016-06-01
Metagenomic sequencing was used to investigate the microbial structures, functional potentials, and biofouling-related genes in a membrane bioreactor (MBR). The results showed that the microbial community in the MBR was highly diverse. Notably, function analysis of the dominant genera indicated that common genes from different phylotypes were identified for important functional potentials with the observation of variation of abundances of genes in a certain taxon (e.g., Dechloromonas). Despite maintaining similar metabolic functional potentials with a parallel full-scale conventional activated sludge (CAS) system due to treating the identical wastewater, the MBR had more abundant nitrification-related bacteria and coding genes of ammonia monooxygenase, which could well explain its excellent ammonia removal in the low-temperature period. Furthermore, according to quantification of the genes involved in exopolysaccharide and extracellular polymeric substance (EPS) protein metabolism, the MBR did not show a much different potential in producing EPS compared to the CAS system, and bacteria from the membrane biofilm had lower abundances of genes associated with EPS biosynthesis and transport compared to the activated sludge in the MBR.
Cioffi, Anna Valentina; Ferrara, Diana; Cubellis, Maria Vittoria; Aniello, Francesco; Corrado, Marcella; Liguori, Francesca; Amoroso, Alessandro; Fucci, Laura; Branno, Margherita
2002-08-01
Analysis of the genome structure of the Paracentrotus lividus (sea urchin) DNA methyltransferase (DNA MTase) gene showed the presence of an open reading frame, named METEX, in intron 7 of the gene. METEX expression is developmentally regulated, showing no correlation with DNA MTase expression. In fact, DNA MTase transcripts are present at high concentrations in the early developmental stages, while METEX is expressed at late stages of development. Two METEX cDNA clones (Met1 and Met2) that are different in the 3' end have been isolated in a cDNA library screening. The putative translated protein from Met2 cDNA clone showed similarity with Escherichia coli endonuclease III on the basis of sequence and predictive three-dimensional structure. The protein, overexpressed in E. coli and purified, had functional properties similar to the endonuclease specific for apurinic/apyrimidinic (AP) sites on the basis of the lyase activity. Therefore the open reading frame, present in intron 7 of the P. lividus DNA MTase gene, codes for a functional AP endonuclease designated SuAP1.
Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag
2015-01-01
Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation, containing at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729
Zwaenepoel, Arthur; Diels, Tim; Amar, David; Van Parys, Thomas; Shamir, Ron; Van de Peer, Yves; Tzfadia, Oren
2018-01-01
Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
Khadjeh, Sara; Turetzek, Natascha; Pechmann, Matthias; Schwager, Evelyn E; Wimmer, Ernst A; Damen, Wim G M; Prpic, Nikola-Michael
2012-03-27
Evolution often results in morphologically similar solutions in different organisms, a phenomenon known as convergence. However, there is little knowledge of the processes that lead to convergence at the genetic level. The genes of the Hox cluster control morphology in animals. They may also be central to the convergence of morphological traits, but whether morphological similarities also require similar changes in Hox gene function is disputed. In arthropods, body subdivision into a region with locomotory appendages ("thorax") and a region with reduced appendages ("abdomen") has evolved convergently in several groups, e.g., spiders and insects. In insects, legs develop in the expression domain of the Hox gene Antennapedia (Antp), whereas the Hox genes Ultrabithorax (Ubx) and abdominal-A mediate leg repression in the abdomen. Here, we show that, unlike Antp in insects, the Antp gene in the spider Achaearanea tepidariorum represses legs in the first segment of the abdomen (opisthosoma), and that Antp and Ubx are redundant in the following segment. The down-regulation of Antp in A. tepidariorum leads to a striking 10-legged phenotype. We present evidence from ectopic expression of the spider Antp gene in Drosophila embryos and imaginal tissue that this unique function of Antp is not due to changes in the Antp protein, but likely due to divergent evolution of cofactors, Hox collaborators or target genes in spiders and flies. Our results illustrate an interesting example of convergent evolution of abdominal leg repression in arthropods by altering the role of distinct Hox genes at different levels of their action.
DynGO: a tool for visualizing and mining of Gene Ontology and its associations
Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H
2005-01-01
Background: A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results: We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The results are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing the orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion: We have presented a standalone package, DynGO, that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval.
The complete documentation and software are freely available for download from the website . PMID:16091147
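The batch and semantic retrieval described in this abstract can be illustrated with a minimal sketch. It assumes annotations are held as a plain gene-to-GO-term mapping and ranks genes by the Jaccard overlap of their term sets. The abstract does not state how DynGO scores similarity, so the Jaccard measure, the function names and the toy identifiers below are illustrative assumptions, not DynGO's implementation.

```python
# Hypothetical toy annotations: gene -> set of GO term identifiers.
# The identifiers are placeholders, not real annotation data.
TOY_ANNOTATIONS = {
    "geneA": {"GO:0000001", "GO:0000002", "GO:0000003"},
    "geneB": {"GO:0000002", "GO:0000003"},
    "geneC": {"GO:0000004"},
}


def jaccard(a, b):
    """Jaccard overlap of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def batch_retrieve(annotations, genes):
    """Return the GO terms for each requested gene (empty set if unknown)."""
    return {gene: annotations.get(gene, set()) for gene in genes}


def semantic_retrieve(annotations, query):
    """Rank all other genes that share at least one GO term with the query."""
    query_terms = annotations[query]
    hits = []
    for gene, terms in annotations.items():
        if gene == query:
            continue
        score = jaccard(query_terms, terms)
        if score > 0.0:
            hits.append((gene, score))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    print(batch_retrieve(TOY_ANNOTATIONS, ["geneC", "geneX"]))
    print(semantic_retrieve(TOY_ANNOTATIONS, "geneA"))
```

A hierarchy-aware measure (one that also credits shared ancestor terms in the GO graph) would be closer to what "semantic" usually implies; flat set overlap is used here only to keep the sketch short.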
Cellular Retinoic Acid Binding Proteins: Genomic and Non-genomic Functions and their Regulation.
Wei, Li-Na
Cellular retinoic acid binding proteins (CRABPs) are high-affinity retinoic acid (RA) binding proteins that mainly reside in the cytoplasm. In mammals, this family has two members, CRABPI and II, both highly conserved during evolution. The two proteins share a very similar structure that is characteristic of a "β-clam" motif built up from 10 β-strands. The proteins are encoded by two different genes that share a very similar genomic structure. CRABPI is widely distributed and CRABPII has restricted expression in only certain tissues. The CrabpI gene is driven by a housekeeping promoter, but can be regulated by numerous factors, including thyroid hormones and RA, which engage a specific chromatin-remodeling complex containing either TRAP220 or RIP140 as coactivator and corepressor, respectively. The chromatin-remodeling complex binds the DR4 element in the CrabpI gene promoter to activate or repress this gene in different cellular backgrounds. The CrabpII gene promoter contains a TATA-box and is rapidly activated by RA through an RA response element. Biochemical and cell culture studies carried out in vitro show the two proteins have distinct biological functions. CRABPII mainly functions to deliver RA to the nuclear RA receptors for gene regulation, although recent studies suggest that CRABPII may also be involved in other cellular events, such as RNA stability. In contrast, biochemical and cell culture studies suggest that CRABPI functions mainly in the cytoplasm to modulate intracellular RA availability/concentration and to engage other signaling components such as ERK activity. However, these functional studies remain inconclusive because knocking out one or both genes in mice does not produce definitive phenotypes. Further studies are needed to unambiguously decipher the exact physiological activities of these two proteins.
Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk
2016-11-16
Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Zhao, Feng; Meng, Songsong; Zhou, Deqing
2016-02-04
The aim was to construct a heptosyltransferase II gene (waaF) deletion mutant of Vibrio parahaemolyticus and to explore the function of the waaF gene in this species. The waaF deletion mutant was constructed from clinical isolates by chitin-based transformation, and its growth rate, morphology and serotype were then characterized. Complementation strains carrying waaF genes from different sources (O3, O5 and O10) were constructed by conjugative transfer from E. coli S17λpir strains into Vibrio parahaemolyticus, and the function of the waaF gene was further verified by serotyping. The waaF deletion mutant was successfully constructed and grew normally. Its growth rate and morphology were similar to those of the wild-type (WT) strains, but the mutant did not agglutinate with O antisera. The strains complemented with waaF from the O3 and O5 sources agglutinated with O antisera, whereas the strain complemented with waaF from the O10 source did not. The waaF gene is therefore related to O-antigen synthesis and is a key gene of the O-antigen synthesis pathway in Vibrio parahaemolyticus. The waaF genes from different sources do not have the same function.
Wang, X; Zhao, L; Zhang, L; Wu, Y; Chou, M; Wei, G
2018-07-01
Rhizobial symbiotic plasmids play vital roles in mutualistic symbiosis with legume plants by executing the functions of nodulation and nitrogen fixation. To explore the gene composition and genetic constitution of rhizobial symbiotic plasmids, a comparative analysis of 24 rhizobial symbiotic plasmids derived from four rhizobial genera was carried out. The results showed that rhizobial symbiotic plasmids had a higher proportion of functional genes participating in amino acid transport and metabolism; replication, recombination and repair; carbohydrate transport and metabolism; energy production and conversion; and transcription. The Mesorhizobium amorphae CCNWGS0123 symbiotic plasmid pM0123d had a gene composition similar to that of pR899b and pSNGR234a. All symbiotic plasmids shared 13 orthologous genes, including five nod and eight nif/fix genes which participate in the rhizobia-legume symbiosis process. These plasmids contained nod genes from four ancestors and fix genes from six ancestors. The ancestral type of the pM0123d nod genes was similar to that of Rhizobium etli plasmids, while the ancestral type of the pM0123d fix genes was the same as that of pM7653Rb. The phylogenetic trees constructed based on nodCIJ and fixABC displayed different topological structures, mainly due to discordance between the nodCIJ and fixABC ancestral types. The study presents valuable insights into mosaic structures and the evolution of rhizobial symbiotic plasmids. This study compared 24 rhizobial symbiotic plasmids that included four genera and 11 species, illuminating the functional gene composition and symbiosis gene ancestor types of symbiotic plasmids from higher taxonomy. It provides valuable insights into mosaic structures and the evolution of symbiotic plasmids. © 2018 The Society for Applied Microbiology.
NASA Astrophysics Data System (ADS)
Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf
2002-04-01
Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.
Homologues of CsLOB1 in citrus function as disease susceptibility genes in citrus canker.
Zhang, Junli; Huguet-Tapia, Jose Carlos; Hu, Yang; Jones, Jeffrey; Wang, Nian; Liu, Sanzhen; White, Frank F
2017-08-01
The lateral organ boundary domain (LBD) genes encode a group of plant-specific proteins that function as transcription factors in the regulation of plant growth and development. Citrus sinensis lateral organ boundary 1 (CsLOB1) is a member of the LBD family and functions as a disease susceptibility gene in citrus bacterial canker (CBC). Thirty-four LBD members have been identified from the Citrus sinensis genome. We assessed the potential for additional members of LBD genes in citrus to function as surrogates for CsLOB1 in CBC, and compared host gene expression on induction of different LBD genes. Using custom-designed transcription activator-like (TAL) effectors, two members of the same clade as CsLOB1, named CsLOB2 and CsLOB3, were found to be capable of functioning similarly to CsLOB1 in CBC. RNA sequencing and quantitative reverse transcription-polymerase chain reaction analyses revealed a set of cell wall metabolic genes that are associated with CsLOB1, CsLOB2 and CsLOB3 expression and may represent downstream genes involved in CBC. © 2016 BSPP AND JOHN WILEY & SONS LTD.
Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S
2010-10-07
The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, including roles in development, senescence and defence. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. To better guide gene function analysis and to understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains were also conserved during evolution. Despite this conservation, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in the expansion of the Arabidopsis PHB gene family, with segmental duplication predominant. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlight that PHB genes might be involved in important functions, so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family that accounts for diverse but important biological functions based on similar molecular mechanisms. The highly diverse biological functions indicate that more research is needed to dissect PHB gene function. The conserved gene evolution indicates that studies in the model species can be translated to human and mammalian studies.
Evidence for Moonlighting Functions of the θ Subunit of Escherichia coli DNA Polymerase III
Dietrich, M.; Pedró, L.; García, J.; Pons, M.; Hüttener, M.; Paytubi, S.; Madrid, C.
2014-01-01
The holE gene is an enterobacterial ORFan gene (open reading frame [ORF] with no detectable homology to other ORFs in a database). It encodes the θ subunit of the DNA polymerase III core complex. The precise function of the θ subunit within this complex is not well established, and loss of holE does not result in a noticeable phenotype. Paralogs of holE are also present on many conjugative plasmids and on phage P1 (hot gene). In this study, we provide evidence indicating that θ (HolE) exhibits structural and functional similarities to a family of nucleoid-associated regulatory proteins, the Hha/YdgT-like proteins that are also encoded by enterobacterial ORFan genes. Microarray studies comparing the transcriptional profiles of Escherichia coli holE, hha, and ydgT mutants revealed highly similar expression patterns for strains harboring holE and ydgT alleles. Among the genes differentially regulated in both mutants were genes of the tryptophanase (tna) operon. The tna operon consists of a transcribed leader region, tnaL, and two structural genes, tnaA and tnaB. Further experiments with transcriptional lacZ fusions (tnaL::lacZ and tnaA::lacZ) indicate that HolE and YdgT downregulate expression of the tna operon by possibly increasing the level of Rho-dependent transcription termination at the tna operon's leader region. Thus, for the first time, a regulatory function can be attributed to HolE, in addition to its role as structural component of the DNA polymerase III complex. PMID:24375106
Human Intellectual Disability Genes Form Conserved Functional Modules in Drosophila
Oortveld, Merel A. W.; Keerthikumar, Shivakumar; Oti, Martin; Nijhof, Bonnie; Fernandes, Ana Clara; Kochinke, Korinna; Castells-Nobau, Anna; van Engelen, Eva; Ellenkamp, Thijs; Eshuis, Lilian; Galy, Anne; van Bokhoven, Hans; Habermann, Bianca; Brunner, Han G.; Zweier, Christiane; Verstreken, Patrik; Huynen, Martijn A.; Schenck, Annette
2013-01-01
Intellectual Disability (ID) disorders, defined by an IQ below 70, are genetically and phenotypically highly heterogeneous. Identification of common molecular pathways underlying these disorders is crucial for understanding the molecular basis of cognition and for the development of therapeutic intervention strategies. To systematically establish their functional connectivity, we used transgenic RNAi to target 270 ID gene orthologs in the Drosophila eye. Assessment of neuronal function in behavioral and electrophysiological assays and multiparametric morphological analysis identified phenotypes associated with knockdown of 180 ID gene orthologs. Most of these genotype-phenotype associations were novel. For example, we uncovered 16 genes that are required for basal neurotransmission and have not previously been implicated in this process in any system or organism. ID gene orthologs with morphological eye phenotypes, in contrast to genes without phenotypes, are relatively highly expressed in the human nervous system and are enriched for neuronal functions, suggesting that eye phenotyping can distinguish different classes of ID genes. Indeed, grouping genes by Drosophila phenotype uncovered 26 connected functional modules. Novel links between ID genes successfully predicted that MYCN, PIGV and UPF3B regulate synapse development. Drosophila phenotype groups show, in addition to ID, significant phenotypic similarity also in humans, indicating that functional modules are conserved. The combined data indicate that ID disorders, despite their extreme genetic diversity, are caused by disruption of a limited number of highly connected functional modules. PMID:24204314
Predicting taxonomic and functional structure of microbial communities in acid mine drainage
Kuang, Jialiang; Huang, Linan; He, Zhili; Chen, Linxing; Hua, Zhengshuang; Jia, Pu; Li, Shengjin; Liu, Jun; Li, Jintian; Zhou, Jizhong; Shu, Wensheng
2016-01-01
Predicting the dynamics of community composition and functional attributes responding to environmental changes is an essential goal in community ecology but remains a major challenge, particularly in microbial ecology. Here, by targeting a model system with low species richness, we explore the spatial distribution of taxonomic and functional structure of 40 acid mine drainage (AMD) microbial communities across Southeast China profiled by 16S ribosomal RNA pyrosequencing and a comprehensive microarray (GeoChip). Similar environmentally dependent patterns of dominant microbial lineages and key functional genes were observed regardless of the large-scale geographical isolation. Functional and phylogenetic β-diversities were significantly correlated, whereas functional metabolic potentials were strongly influenced by environmental conditions and community taxonomic structure. Using advanced modeling approaches based on artificial neural networks, we successfully predicted the taxonomic and functional dynamics with significantly higher prediction accuracies of metabolic potentials (average Bray–Curtis similarity 87.8) as compared with relative microbial abundances (similarity 66.8), implying that natural AMD microbial assemblages may be better predicted at the functional genes level rather than at taxonomic level. Furthermore, relative metabolic potentials of genes involved in many key ecological functions (for example, nitrogen and phosphate utilization, metals resistance and stress response) were extrapolated to increase under more acidic and metal-rich conditions, indicating a critical strategy of stress adaptation in these extraordinary communities. 
Collectively, our findings indicate that natural selection rather than geographic distance has a more crucial role in shaping the taxonomic and functional patterns of AMD microbial communities, which can be readily predicted by modeling methods, and suggest that the model-based approach is essential to better understand natural acidophilic microbial communities. PMID:26943622
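The prediction accuracies in this abstract are reported as Bray-Curtis similarities between observed and predicted profiles. A minimal sketch of that measure follows, assuming two equal-length vectors of non-negative abundances (or gene intensities). The function name and the toy numbers are illustrative and are not taken from the study. The function returns a value between 0 and 1; the reported figures (87.8 and 66.8) are presumably on a percentage scale, which is an assumption here.

```python
def bray_curtis_similarity(observed, predicted):
    """Bray-Curtis similarity: 1 - sum|x_i - y_i| / sum(x_i + y_i).

    Both inputs are assumed to be equal-length sequences of
    non-negative values (e.g. relative abundances).
    """
    if len(observed) != len(predicted):
        raise ValueError("profiles must have the same length")
    total = sum(observed) + sum(predicted)
    if total == 0:
        raise ValueError("similarity is undefined for two all-zero profiles")
    difference = sum(abs(o - p) for o, p in zip(observed, predicted))
    return 1.0 - difference / total


if __name__ == "__main__":
    # Hypothetical observed vs. model-predicted profile for one community.
    observed = [10.0, 20.0, 30.0]
    predicted = [12.0, 18.0, 30.0]
    print(round(100 * bray_curtis_similarity(observed, predicted), 1))
```

Identical profiles give a similarity of 1 and profiles with no shared non-zero entries give 0, so averaging this value over communities yields a summary comparable in form to the one quoted in the abstract.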
Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Lu, Kun; Xu, Xinfu; Wang, Rui; Li, Jiana; Qu, Cunmin
2017-10-24
The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus.
Zhou, Yan; Xu, Daixiang; Jia, Ledong; Huang, Xiaohu; Ma, Guoqiang; Wang, Shuxian; Zhu, Meichen; Zhang, Aoxiang; Guan, Mingwei; Xu, Xinfu; Wang, Rui; Li, Jiana
2017-01-01
The basic region/leucine zipper motif (bZIP) transcription factor family is one of the largest families of transcriptional regulators in plants. bZIP genes have been systematically characterized in some plants, but not in rapeseed (Brassica napus). In this study, we identified 247 BnbZIP genes in the rapeseed genome, which we classified into 10 subfamilies based on phylogenetic analysis of their deduced protein sequences. The BnbZIP genes were grouped into functional clades with Arabidopsis genes with similar putative functions, indicating functional conservation. Genome mapping analysis revealed that the BnbZIPs are distributed unevenly across all 19 chromosomes, and that some of these genes arose through whole-genome duplication and dispersed duplication events. All expression profiles of 247 bZIP genes were extracted from RNA-sequencing data obtained from 17 different B. napus ZS11 tissues with 42 various developmental stages. These genes exhibited different expression patterns in various tissues, revealing that these genes are differentially regulated. Our results provide a valuable foundation for functional dissection of the different BnbZIP homologs in B. napus and its parental lines and for molecular breeding studies of bZIP genes in B. napus. PMID:29064393
Bedell, Victoria M; Person, Anthony D; Larson, Jon D; McLoon, Anna; Balciunas, Darius; Clark, Karl J; Neff, Kevin I; Nelson, Katie E; Bill, Brent R; Schimmenti, Lisa A; Beiraghi, Soraya; Ekker, Stephen C
2012-02-01
The Homeobox (Hox) and Paired box (Pax) gene families are key determinants of animal body plans and organ structure. In particular, they function within regulatory networks that control organogenesis. How these conserved genes elicit differences in organ form and function in response to evolutionary pressures is incompletely understood. We molecularly and functionally characterized one member of an evolutionarily dynamic gene family, plac8 onzin related protein 1 (ponzr1), in the zebrafish. ponzr1 mRNA is expressed early in the developing kidney and pharyngeal arches. Using ponzr1-targeting morpholinos, we show that ponzr1 is required for formation of the glomerulus. Loss of ponzr1 results in a nonfunctional glomerulus but retention of a functional pronephros, an arrangement similar to the aglomerular kidneys found in a subset of marine fish. ponzr1 is integrated into the pax2a pathway, with ponzr1 expression requiring pax2a gene function, and proper pax2a expression requiring normal ponzr1 expression. In addition to pronephric function, ponzr1 is required for pharyngeal arch formation. We functionally demonstrate that ponzr1 can act as a transcription factor or co-factor, providing the first molecular mode of action for this newly described gene family. Together, this work provides experimental evidence of an additional mechanism that incorporates evolutionarily dynamic, lineage-specific gene families into conserved regulatory gene networks to create functional organ diversity.
USDA-ARS's Scientific Manuscript database
Such Biomedical vocabularies and ontologies aid in recapitulating biological knowledge. The annotation of gene products is mainly accelerated by Gene Ontology (GO) and more recently by Medical Subject Headings (MeSH). MeSH is the National Library of Medicine's controlled vocabulary and it is making ...
Functions of the gene products of Escherichia coli.
Riley, M
1993-01-01
A list of currently identified gene products of Escherichia coli is given, together with a bibliography that provides pointers to the literature on each gene product. A scheme to categorize cellular functions is used to classify the gene products of E. coli so far identified. A count shows that the numbers of genes concerned with small-molecule metabolism are on the same order as the numbers concerned with macromolecule biosynthesis and degradation. One large category is the category of tRNAs and their synthetases. Another is the category of transport elements. The categories of cell structure and cellular processes other than metabolism are smaller. Other subjects discussed are the occurrence in the E. coli genome of redundant pairs and groups of genes of identical or closely similar function, as well as variation in the degree of density of genetic information in different parts of the genome. PMID:7508076
Halbleib, Jennifer M.; Sääf, Annika M.
2007-01-01
Although there is considerable evidence implicating posttranslational mechanisms in the development of epithelial cell polarity, little is known about the patterns of gene expression and transcriptional regulation during this process. We characterized the temporal program of gene expression during cell–cell adhesion–initiated polarization of human Caco-2 cells in tissue culture, which develop structural and functional polarity similar to that of enterocytes in vivo. A distinctive switch in gene expression patterns occurred upon formation of cell–cell contacts between neighboring cells. Expression of genes involved in cell proliferation was down-regulated concomitant with induction of genes necessary for functional specialization of polarized epithelial cells. Transcriptional up-regulation of these latter genes correlated with formation of important structural and functional features in enterocyte differentiation and establishment of structural and functional cell polarity; components of the apical microvilli were induced as the brush border formed during polarization; as barrier function was established, expression of tight junction transmembrane proteins peaked; transcripts encoding components of the apical, but not the basal-lateral trafficking machinery were increased during polarization. Coordinated expression of genes encoding components of functional cell structures were often observed indicating temporal control of expression and assembly of multiprotein complexes. PMID:17699590
Protein-protein interaction network-based detection of functionally similar proteins within species.
Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli
2012-07-01
Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, the evolution of protein families, the progression of co-evolution, convergent evolution and other aspects that cannot be obtained by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After representing protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
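The first step of the method in this abstract, splitting an interaction graph into 1-hop subgraphs, can be sketched as follows. The sketch assumes that a 1-hop subgraph consists of a protein, its direct interaction partners, and the interactions among them; the abstract does not define the term further, so this reading, the function names and the toy edge list are assumptions. The modified shortest-path comparison is not described in enough detail to reproduce and is deliberately left out.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions in this sketch
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def one_hop_subgraph(adjacency, center):
    """Return (nodes, edges) of the subgraph induced by center and its neighbours."""
    nodes = {center} | adjacency.get(center, set())
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in adjacency.get(u, set())
        if v in nodes
    }
    return nodes, edges


if __name__ == "__main__":
    # Hypothetical toy interaction network.
    toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
    adjacency = build_adjacency(toy_edges)
    for protein in sorted(adjacency):
        print(protein, one_hop_subgraph(adjacency, protein))
```

Each protein then carries its own small neighbourhood graph, and pairs of proteins can be compared neighbourhood against neighbourhood rather than sequence against sequence.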
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia
2014-08-28
The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.
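The idea of grouping genes into co-expression modules can be illustrated with a small sketch. It is not the method used in the study, which the abstract does not specify; it assumes a deliberately simple stand-in in which genes are linked when the absolute Pearson correlation of their profiles reaches a cutoff, and modules are the connected components of the resulting graph. The gene names, profiles, and the 0.9 cutoff are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_modules(profiles, threshold=0.9):
    """Link genes with |r| >= threshold and return the connected components."""
    genes = list(profiles)
    neighbours = {g: set() for g in genes}
    for g, h in combinations(genes, 2):
        if abs(pearson(profiles[g], profiles[h])) >= threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)
    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        stack, module = [g], set()
        while stack:  # depth-first walk over the thresholded graph
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbours[node] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical toy data: one expression profile (four conditions) per gene.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 1.0, 3.5, 1.2],
    "geneD": [8.1, 2.0, 7.0, 2.5],
}
modules = coexpression_modules(profiles)
print(modules)
```

With these toy profiles, geneA and geneB fall into one module and geneC and geneD into another. A hard threshold is the simplest choice; weighted network methods avoid the cutoff but need more machinery than fits here.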
Sundström, Jens; Engström, Peter
2002-07-01
The Norway spruce MADS-box genes DAL11, DAL12 and DAL13 are phylogenetically related to the angiosperm B-function MADS-box genes: genes that act together with A-function genes in specifying petal identity and with C-function genes in specifying stamen identity to floral organs. In this report we present evidence to suggest that the B-gene function in the specification of identity of the pollen-bearing organs has been conserved between conifers and angiosperms. Expression of DAL11 or DAL12 in transgenic Arabidopsis causes phenotypic changes which partly resemble those caused by ectopic expression of the endogenous B-genes. In similar experiments, flowers of Arabidopsis plants expressing DAL13 showed a different homeotic change in that they formed ectopic anthers in whorls one, two or four. We also demonstrate the capacity of the spruce gene products to form homodimers, and that DAL11 and DAL13 may form heterodimers with each other and with the Arabidopsis B-protein AP3, but not with PI, the second B-gene product in Arabidopsis. In situ hybridization experiments show that the conifer B-like genes are expressed specifically in developing pollen cones, but differ in both temporal and spatial distribution patterns. These results suggest that the B-function in conifers is dual and is separated into a meristem identity and an organ identity function, the latter function possibly being independent of an interaction with the C-function. Thus, even though an ancestral B-function may have acted in combination with C to specify micro- and megasporangia, the B-function has evolved differently in conifers and angiosperms.
Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.
Xu, Lijing; Furlotte, Nicholas; Lin, Yunyue; Heinrich, Kevin; Berry, Michael W; George, Ebenezer O; Homayouni, Ramin
2011-04-14
High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.
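A cohesion p-value of this kind can be sketched as follows. The abstract does not say how the contingency table for Fisher's exact test is built, so the construction below is an assumption: every gene pair is classified by whether both genes are in the query set and whether the pair's similarity reaches a cutoff, and a one-sided test asks whether high-similarity pairs are over-represented inside the set. The function names, the cutoff, and the similarity values are hypothetical.

```python
from itertools import combinations
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    denom = comb(total, col1)
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denom


def cohesion_p_value(similarity, gene_set, threshold):
    """Assumed table: (pair within set?) x (pair similarity >= threshold?)."""
    genes = sorted({g for pair in similarity for g in pair})
    inside = set(gene_set)
    a = b = c = d = 0
    for g, h in combinations(genes, 2):
        # Pairs missing from the dictionary are treated as dissimilar.
        sim = similarity.get((g, h), similarity.get((h, g), 0.0))
        high = sim >= threshold
        within = g in inside and h in inside
        if within and high:
            a += 1
        elif within:
            b += 1
        elif high:
            c += 1
        else:
            d += 1
    return fisher_exact_greater(a, b, c, d)


# Hypothetical pairwise similarities for five genes.
similarity = {
    ("g1", "g2"): 0.8, ("g1", "g3"): 0.7, ("g2", "g3"): 0.9,
    ("g1", "g4"): 0.1, ("g2", "g5"): 0.2, ("g4", "g5"): 0.3,
}
p = cohesion_p_value(similarity, ["g1", "g2", "g3"], threshold=0.5)
print(p)
```

In this toy case all three within-set pairs are above the cutoff and none of the other seven pairs are, so the p-value is 1/120.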
Fast gene ontology based clustering for microarray experiments.
Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa
2008-11-21
Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
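Clustering genes by the semantic similarity of their ontology annotations can be sketched in a few lines. This is not the package described in the abstract; it assumes a toy ontology, a Jaccard overlap of ancestor-closed term sets as the similarity measure, and average-linkage agglomerative clustering with a similarity cutoff. All term identifiers and gene names are hypothetical.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology DAG."""
    found, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in found:
            found.add(t)
            stack.extend(parents.get(t, ()))
    return found


def gene_similarity(terms_a, terms_b, parents):
    """Jaccard overlap of ancestor-closed term sets (assumes non-empty annotations)."""
    closed_a = set().union(*(ancestors(t, parents) for t in terms_a))
    closed_b = set().union(*(ancestors(t, parents) for t in terms_b))
    return len(closed_a & closed_b) / len(closed_a | closed_b)


def cluster_genes(annotations, parents, cutoff=0.5):
    """Average-linkage agglomerative clustering; stop when the best link < cutoff."""
    clusters = [[g] for g in sorted(annotations)]

    def linkage(c1, c2):
        sims = [gene_similarity(annotations[g], annotations[h], parents)
                for g in c1 for h in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best, i, j = max(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < cutoff:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Hypothetical ontology: child term -> list of parent terms.
parents = {
    "T:kinase": ["T:catalytic"],
    "T:phosphatase": ["T:catalytic"],
    "T:catalytic": ["T:root"],
    "T:transporter": ["T:root"],
    "T:ion_transporter": ["T:transporter"],
}
annotations = {
    "gene1": ["T:kinase"],
    "gene2": ["T:phosphatase"],
    "gene3": ["T:ion_transporter"],
    "gene4": ["T:transporter"],
}
clusters = cluster_genes(annotations, parents)
print(clusters)
```

The two enzyme-annotated genes end up in one cluster and the two transporter-annotated genes in the other. Information-content-based measures are a common alternative to the Jaccard overlap used here.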
Functions and impact of tal-like genes in animals with regard to applied aspects.
Zhu, Min; Hu, Xiaolong; Cao, Guangli; Xue, Renyu; Gong, Chengliang
2018-06-16
A large number of DNA sequences in eukaryote genomes can code for atypical transcripts, and their functions are controversial. It has been reported that these transcripts, which were originally considered non-translatable RNAs, contain many small open reading frames (sORFs). However, increasing evidence suggests that some of these sORFs encode small peptides and that some are conserved across large evolutionary distances. These small peptides have been reported to be functional and may be involved in a variety of cellular processes, playing important roles in development, physiology, and metabolism. Among the sORFs, the non-canonical genes polished rice/tarsal-less (pri/tal) in Drosophila and mille-pattes (mlpt) in Tribolium have been studied most thoroughly. Genes similar to pri/tal in other species have been defined as the tarsal-less-related (tal-like) gene family. In this review, we describe recent progress in the discovery and functional characterization of the small peptides encoded by tal-like genes and their possible functions.
Burt, Andrew J.; William, H. Manilal; Perry, Gregory; Khanal, Raja; Pauls, K. Peter; Kelly, James D.; Navabi, Alireza
2015-01-01
Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine-rich repeat/receptor-like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes for phosphoinositide-specific phospholipases C associated with the Co-x and COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine-rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases. PMID:26431031
Vazquez, Miguel; Nogales-Cadenas, Ruben; Arroyo, Javier; Botías, Pedro; García, Raul; Carazo, Jose M; Tirado, Francisco; Pascual-Montano, Alberto; Carmona-Saez, Pedro
2010-07-01
The enormous amount of data available in public gene expression repositories such as Gene Expression Omnibus (GEO) offers an inestimable resource to explore gene expression programs across several organisms and conditions. This information can be used to discover experiments that induce similar or opposite gene expression patterns to a given query, which in turn may lead to the discovery of new relationships among diseases, drugs or pathways, as well as the generation of new hypotheses. In this work, we present MARQ, a web-based application that allows researchers to compare a query set of genes, e.g. a set of over- and under-expressed genes, against a signature database built from GEO datasets for different organisms and platforms. MARQ offers an easy-to-use and integrated environment to mine GEO, in order to identify conditions that induce similar or opposite gene expression patterns to a given experimental condition. MARQ also includes additional functionalities for the exploration of the results, including a meta-analysis pipeline to find genes that are differentially expressed across different experiments. The application is freely available at http://marq.dacya.ucm.es.
Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita
2016-04-01
Gene expression data clustering is an important step in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As the MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (the k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from the k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms.
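The neighborhood graph built from several rounds of MST can be sketched as below. The abstract does not define a "round", so this assumes the usual reading: each round computes a minimum spanning tree (or forest) on the edges not used by earlier rounds, and the graph is the union of the trees. The eigenanalysis step of the proposed algorithm is omitted, and the points are hypothetical.

```python
from itertools import combinations
from math import dist


def minimum_spanning_tree(n, edges):
    """Kruskal's algorithm; edges are (weight, u, v) tuples over nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def k_round_mst_graph(points, rounds):
    """Union of the MSTs from successive rounds, each on the still-unused edges."""
    n = len(points)
    remaining = [(dist(points[u], points[v]), u, v)
                 for u, v in combinations(range(n), 2)]
    graph = set()
    for _ in range(rounds):
        tree = minimum_spanning_tree(n, remaining)
        graph.update((u, v) for _, u, v in tree)
        used = set(tree)
        remaining = [e for e in remaining if e not in used]
    return graph


# Hypothetical 2-D points forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
graph_one = k_round_mst_graph(points, rounds=1)
graph_two = k_round_mst_graph(points, rounds=2)
print(sorted(graph_one), len(graph_two))
```

One round yields the plain MST; with four points, the second round already adds every remaining edge. On larger data the union stays sparse, which is what makes it usable as a similarity graph for a spectral step.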
MADS-Box gene diversity in seed plants 300 million years ago.
Becker, A; Winter, K U; Meyer, B; Saedler, H; Theissen, G
2000-10-01
MADS-box genes encode a family of transcription factors which control diverse developmental processes in flowering plants ranging from root development to flower and fruit development. Through phylogeny reconstructions, most of these genes can be subdivided into defined monophyletic gene clades whose members share similar expression patterns and functions. Therefore, the establishment of the diversity of gene clades was probably an important event in land plant evolution. In order to determine when these clades originated, we isolated cDNAs of 19 different MADS-box genes from Gnetum gnemon, a gymnosperm model species and thus a representative of the sister group of the angiosperms. Phylogeny reconstructions involving all published MADS-box genes were then used to identify gene clades containing putative orthologs from both angiosperm and gymnosperm lineages. Thus, the minimal number of MADS-box genes that were already present in the last common ancestor of extant gymnosperms and angiosperms was determined. Comparative expression studies involving pairs of putatively orthologous genes revealed a diversity of patterns that has been largely conserved since the time when the angiosperm and gymnosperm lineages separated. Taken together, our data suggest that there were already at least seven different MADS-box genes present at the base of extant seed plants about 300 MYA. These genes were probably already quite diverse in terms of both sequence and function. In addition, our data demonstrate that the MADS-box gene families of extant gymnosperms and angiosperms are of similar complexities.
Evolution of an Expanded Mannose Receptor Gene Family
Staines, Karen; Hunt, Lawrence G.; Young, John R.; Butter, Colin
2014-01-01
Sequences of peptides from a protein specifically immunoprecipitated by an antibody, KUL01, that recognises chicken macrophages, identified a homologue of the mammalian mannose receptor, MRC1, which we called MRC1L-B. Inspection of the genomic environment of the chicken gene revealed an array of five paralogous genes, MRC1L-A to MRC1L-E, located between conserved flanking genes found either side of the single MRC1 gene in mammals. Transcripts of all five genes were detected in RNA from a macrophage cell line and other RNAs, whose sequences allowed the precise definition of spliced exons, confirming or correcting existing bioinformatic annotation. The confirmed gene structures were used to locate orthologues of all five genes in the genomes of two other avian species and of the painted turtle, all with intact coding sequences. The lizard genome had only three genes, one orthologue of MRC1L-A and two orthologues of the MRC1L-B antigen gene resulting from a recent duplication. The Xenopus genome, like that of most mammals, had only a single MRC1-like gene at the corresponding locus. MRC1L-A and MRC1L-B genes had similar cytoplasmic regions that may be indicative of similar subcellular migration and functions. Cytoplasmic regions of the other three genes were very divergent, possibly indicating the evolution of a new functional repertoire for this family of molecules, which might include novel interactions with pathogens. PMID:25390371
Heritability of Lung Disease Severity in Cystic Fibrosis
Vanscoy, Lori L.; Blackman, Scott M.; Collaco, Joseph M.; Bowers, Amanda; Lai, Teresa; Naughton, Kathleen; Algire, Marilyn; McWilliams, Rita; Beck, Suzanne; Hoover-Fong, Julie; Hamosh, Ada; Cutler, Dave; Cutting, Garry R.
2007-01-01
Rationale: Obstructive lung disease, the major cause of mortality in cystic fibrosis (CF), is poorly correlated with mutations in the disease-causing gene, indicating that other factors determine severity of lung disease. Objectives: To quantify the contribution of modifier genes to variation in CF lung disease severity. Methods: Pulmonary function data from patients with CF living with their affected twin or sibling were converted into reference values based on both healthy and CF populations. The best measure of FEV1 within the last year was used for cross-sectional analysis. FEV1 measures collected over at least 4 years were used for longitudinal analysis. Genetic contribution to disease variation (i.e., heritability) was estimated in two ways: by comparing similarity of lung function in monozygous (MZ) twins (∼ 100% gene sharing) with that of dizygous (DZ) twins/siblings (∼ 50% gene sharing), and by comparing similarity of lung function measures for related siblings to similarity for all study subjects. Measurements and Main Results: Forty-seven MZ twin pairs, 10 DZ twin pairs, and 231 sibling pairs (of a total of 526 patients) with CF were studied. Correlations for all measures of lung function for MZ twins (0.82–0.91, p < 0.0001) were higher than for DZ twins and siblings (0.50–0.64, p < 0.001). Heritability estimates from both methods were consistent for each measure of lung function and ranged from 0.54 to 1.0. Heritability estimates generally increased after adjustment for differences in nutritional status (measured as body mass index z-score). Conclusions: Our heritability estimates indicate substantial genetic control of variation in CF lung disease severity, independent of CFTR genotype. PMID:17332481
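The first of the two approaches, comparing MZ with DZ similarity, can be illustrated with Falconer's classical formula, h² = 2(r_MZ − r_DZ). The abstract does not state which estimator the authors used, so this choice is an assumption, and the correlation values below are hypothetical rather than taken from the study.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: trait correlation within monozygous twin pairs (~100% gene sharing)
    r_dz: trait correlation within dizygous twin or sibling pairs (~50% gene sharing)
    """
    return 2.0 * (r_mz - r_dz)


# Hypothetical correlations for a lung function measure.
h2 = falconer_heritability(r_mz=0.8, r_dz=0.5)
print(round(h2, 2))
```

The factor of two reflects that MZ pairs share about twice as much segregating genetic variation as DZ pairs, so the difference in correlations captures roughly half of the additive genetic variance.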
Adam, Helene; Jouannic, Stefan; Morcillo, Fabienne; Verdeil, Jean-Luc; Duval, Yves; Tregear, James W.
2007-01-01
Aims: This article reviews data recently obtained on the structural diversity and possible functions of MADS box genes in the determination of flower structure in the African oil palm (Elaeis guineensis). MADS box genes play a dominant role in the ABC model established to explain how floral organ identity is determined in model dicotyledon species such as Arabidopsis thaliana and Antirrhinum majus. In the monocotyledons, although there appears to be a broad general conservation of ABC gene functions, the model itself needs to be adapted in some cases, notably for certain species which produce flowers with sepals and petals of similar appearance. For the moment, ABC genes remain unstudied in a number of key monocot clades, so only a partial picture is available for the Liliopsida as a whole. The aim of this article is to summarize data recently obtained for the African oil palm Elaeis guineensis, a member of the family Arecaceae (Arecales), and to discuss their significance with respect to knowledge gained from other Angiosperm groups, particularly within the monocotyledons. Scope: The essential details of reproductive development in oil palm are discussed and an overview is provided of the structural and functional characterization of MADS box genes likely to play a homeotic role in flower development in this species. Conclusions: The structural and functional data provide evidence for a general conservation of the generic ‘ABC’ model in oil palm, rather than the ‘modified ABC model’ proposed for some other monocot species which produce homochlamydeous flowers (i.e. with morphologically similar organs in both perianth whorls), such as members of the Liliales. Our oil palm data therefore follow a similar pattern to those obtained for other Commelinid species in the orders Commelinales and Poales. The significance of these findings is discussed. PMID:17355996
Dt2 Is a Gain-of-Function MADS-Domain Factor Gene That Specifies Semideterminacy in Soybean
Ping, Jieqing; Liu, Yunfeng; Sun, Lianjun; Zhao, Meixia; Li, Yinghui; She, Maoyun; Sui, Yi; Lin, Feng; Liu, Xiaodong; Tang, Zongxiang; Nguyen, Hanh; Tian, Zhixi; Qiu, Lijuan; Nelson, Randall L.; Clemente, Thomas E.; Specht, James E.; Ma, Jianxin
2014-01-01
Similar to Arabidopsis thaliana, the wild soybeans (Glycine soja) and many cultivars exhibit indeterminate stem growth specified by the shoot identity gene Dt1, the functional counterpart of Arabidopsis TERMINAL FLOWER1 (TFL1). Mutations in TFL1 and Dt1 both result in the shoot apical meristem (SAM) switching from vegetative to reproductive state to initiate terminal flowering and thus produce determinate stems. A second soybean gene (Dt2) regulating stem growth was identified, which, in the presence of Dt1, produces semideterminate plants with terminal racemes similar to those observed in determinate plants. Here, we report positional cloning and characterization of Dt2, a dominant MADS domain factor gene classified into the APETALA1/SQUAMOSA (AP1/SQUA) subfamily that includes floral meristem (FM) identity genes AP1, FUL, and CAL in Arabidopsis. Unlike AP1, whose expression is limited to FMs in which the expression of TFL1 is repressed, Dt2 appears to repress the expression of Dt1 in the SAMs to promote early conversion of the SAMs into reproductive inflorescences. Given that Dt2 is not the gene most closely related to AP1 and that semideterminacy is rarely seen in wild soybeans, Dt2 appears to be a recent gain-of-function mutation, which has modified the genetic pathways determining the stem growth habit in soybean. PMID:25005919
Khalyfa, Abdelnaby; Capdevila, Oscar Sans; Kheirandish-Gozal, Leila; Khalyfa, Ahamed A.; Kim, Jinkwan
2012-01-01
Pediatric obstructive sleep apnea (OSA) may lead to neurocognitive dysfunction, but not in everyone affected. The frequencies of NADPH oxidase (NOX) polymorphisms in the p22phox subunit were similar between children with OSA and controls, except for rs6520785 and rs4673, the latter being significantly more frequent among the OSA children without deficits than with deficits (p<0.02). Similarly, 8-hydroxydeoxyguanine urine levels and NOX activity were lower among children without cognitive deficits and particularly among those with the rs4673 polymorphism. Thus, polymorphisms within the NOX gene or its functional subunits may account for important components of the variance in cognitive function deficits associated with OSA in children. Antioxid. Redox Signal. 16, 171–177. PMID:21902598
Koyama, Fernanda C; Carvalho, Thais L G; Alves, Eduardo; da Silva, Henrique B; de Azevedo, Mauro F; Hemerly, Adriana S; Garcia, Célia R S
2013-01-01
Indole compounds are involved in a range of functions in many organisms. In the human malaria parasite Plasmodium falciparum, melatonin and other tryptophan derivatives are able to modulate its intraerythrocytic cycle, increasing the schizont population as well as parasitemia, likely through ubiquitin-proteasome system (UPS) gene regulation. In plants, melatonin regulates root development, in a similar way to that described for indoleacetic acid, suggesting that melatonin and indoleacetic acid could co-participate in some physiological processes due to structural similarities. In the present work, we evaluate whether the chemical structure similarity found in indoleacetic acid and melatonin can lead to similar effects in Arabidopsis thaliana lateral root formation and P. falciparum cell cycle modulation, as well as in UPS gene regulation, assessed by qRT-PCR. Our data show that P. falciparum is not able to respond to indoleacetic acid either in the modulation of the intraerythrocytic cycle or in the gene regulation mediated by the UPS as observed for melatonin. The similarities of these indole compounds are not sufficient to confer synergistic functions in P. falciparum cell cycle modulation, but could interplay in A. thaliana lateral root formation.
2010-01-01
Background: Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results: We have identified a total of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expressions of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions: The genome scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
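The co-expression reasoning used here, that a gene of unknown function is likely related to the genes whose expression tracks it most closely, can be sketched as a ranking by Pearson correlation against a bait gene. This is an illustration under assumptions, not the authors' pipeline: the gene labels and expression values are hypothetical, and real analyses use many more conditions and a significance filter.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_coexpressed(bait, profiles):
    """Return (gene, r) pairs for all other genes, most correlated first."""
    scores = [(gene, pearson(profiles[bait], profile))
              for gene, profile in profiles.items() if gene != bait]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical expression values across five tissues or conditions.
profiles = {
    "bait":  [1.0, 3.0, 2.0, 5.0, 4.0],
    "geneX": [2.0, 6.0, 4.0, 10.0, 8.5],
    "geneY": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneZ": [3.0, 3.1, 2.9, 3.0, 3.2],
}
ranked = rank_coexpressed("bait", profiles)
for gene, r in ranked:
    print(gene, round(r, 2))
```

Here geneX tracks the bait almost perfectly and would be the first candidate for a shared pathway, while geneY is anti-correlated and ranks last.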
The Rhizobium etli cyaC Product: Characterization of a Novel Adenylate Cyclase Class
Téllez-Sosa, Juan; Soberón, Nora; Vega-Segura, Alicia; Torres-Márquez, María E.; Cevallos, Miguel A.
2002-01-01
Adenylate cyclases (ACs) catalyze the formation of 3′,5′-cyclic AMP (cAMP) from ATP. A novel AC-encoding gene, cyaC, was isolated from Rhizobium etli by phenotypic complementation of an Escherichia coli cya mutant. The functionality of the cyaC gene was corroborated by its ability to restore cAMP accumulation in an E. coli cya mutant. Further, overexpression of a malE::cyaC fusion protein allowed the detection of significant AC activity levels in cell extracts of an E. coli cya mutant. CyaC is unrelated to any known AC or to any other protein exhibiting a currently known function. Thus, CyaC represents the first member of a novel class of ACs (class VI). Hypothetical genes of unknown function similar to cyaC have been identified in the genomes of the related bacterial species Mesorhizobium loti, Sinorhizobium meliloti, and Agrobacterium tumefaciens. The cyaC gene is cotranscribed with a gene similar to ohr of Xanthomonas campestris and is expressed only in the presence of organic hydroperoxides. The physiological performance of an R. etli cyaC mutant was indistinguishable from that of the wild-type parent strain both under free-living conditions and during symbiosis. PMID:12057950
Hutton, John J; Jegga, Anil G; Kong, Sue; Gupta, Ashima; Ebert, Catherine; Williams, Sarah; Katz, Jonathan D; Aronow, Bruce J
2004-01-01
Background: In this study we have built and mined a gene expression database composed of 65 diverse mouse tissues for genes preferentially expressed in immune tissues and cell types. Using expression pattern criteria, we identified 360 genes with preferential expression in thymus, spleen, peripheral blood mononuclear cells, lymph nodes (unstimulated or stimulated), or in vitro activated T-cells. Results: Gene clusters, formed based on similarity of expression-pattern across either all tissues or the immune tissues only, had highly significant associations both with immunological processes such as chemokine-mediated response, antigen processing, receptor-related signal transduction, and transcriptional regulation, and also with more general processes such as replication and cell cycle control. Within-cluster gene correlations implicated known associations of known genes, as well as immune process-related roles for poorly described genes. To characterize regulatory mechanisms and cis-elements of genes with similar patterns of expression, we used a new version of a comparative genomics-based cis-element analysis tool to identify clusters of cis-elements with compositional similarity among multiple genes. Several clusters contained genes that shared 5–6 cis-elements that included ETS and zinc-finger binding sites. cis-Elements AP2 EGRF ETSF MAZF SP1F ZF5F and AREB ETSF MZF1 PAX5 STAT were shared in a thymus-expressed set; AP4R E2FF EBOX ETSF MAZF SP1F ZF5F and CREB E2FF MAZF PCAT SP1F STAT cis-clusters occurred in activated T-cells; CEBP CREB NFKB SORY and GATA NKXH OCT1 RBIT occurred in stimulated lymph nodes. Conclusion: This study demonstrates a series of analytic approaches that have allowed the implication of genes and regulatory elements that participate in the differentiation, maintenance, and function of the immune system. Polymorphism or mutation of these could adversely impact immune system functions. PMID:15504237
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jing; Ma, Zihao; Carr, Steven A.
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC).more » Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies. 
Molecular & Cellular Proteomics 16: 10.1074/mcp.M116.060301, 121–134, 2017.
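The guilt-by-association idea in the abstract above can be illustrated with a minimal sketch. It assumes Pearson correlation as the coexpression measure and a fixed cutoff of 0.8; neither choice is stated in the abstract, and the gene names, profiles and annotation table are hypothetical toy data rather than TCGA or CPTAC values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_edges(profiles, threshold=0.8):
    """Return gene pairs whose |r| meets the threshold (an assumed cutoff)."""
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def guilt_by_association(gene, edges, annotations):
    """Collect the functions annotated to the network neighbours of `gene`."""
    predicted = set()
    for a, b in edges:
        if gene in (a, b):
            neighbour = b if a == gene else a
            predicted |= annotations.get(neighbour, set())
    return predicted


# Hypothetical genes measured across five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # unrelated pattern
}
annotations = {"geneB": {"lipid biosynthetic process"}}

edges = coexpression_edges(profiles)
prediction = guilt_by_association("geneA", edges, annotations)
```

Running the same functions on matched mRNA and protein profiles and comparing the resulting edge sets would be one simple way to look at the wiring differences the abstract describes.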
Bruce, A. Gregory; Ryan, Jonathan T.; Thomas, Mathew J.; Peng, Xinxia; Grundhoff, Adam; Tsai, Che-Chung
2013-01-01
The complete sequence of retroperitoneal fibromatosis-associated herpesvirus Macaca nemestrina (RFHVMn), the pig-tailed macaque homolog of Kaposi's sarcoma-associated herpesvirus (KSHV), was determined by next-generation sequence analysis of a Kaposi's sarcoma (KS)-like macaque tumor. Colinearity of genes was observed with the KSHV genome, and the core herpesvirus genes had strong sequence homology to the corresponding KSHV genes. RFHVMn lacked homologs of open reading frame 11 (ORF11) and KSHV ORFs K5 and K6, which appear to have been generated by duplication of ORFs K3 and K4 after the divergence of KSHV and RFHV. RFHVMn contained positional homologs of all other unique KSHV genes, although some showed limited sequence similarity. RFHVMn contained a number of candidate microRNA genes. Although there was little sequence similarity with KSHV microRNAs, one candidate contained the same seed sequence as the positional homolog, kshv-miR-K12-10a, suggesting functional overlap. RNA transcript splicing was highly conserved between RFHVMn and KSHV, and strong sequence conservation was noted in specific promoters and putative origins of replication, predicting important functional similarities. Sequence comparisons indicated that RFHVMn and KSHV developed in long-term synchrony with the evolution of their hosts, and both viruses phylogenetically group within the RV1 lineage of Old World primate rhadinoviruses. RFHVMn is the closest homolog of KSHV to be completely sequenced and the first sequenced RV1 rhadinovirus homolog of KSHV from a nonhuman Old World primate. The strong genetic and sequence similarity between RFHVMn and KSHV, coupled with similarities in biology and pathology, demonstrate that RFHVMn infection in macaques offers an important and relevant model for the study of KSHV in humans. PMID:24109218
APETALA2 like genes from Picea abies show functional similarities to their Arabidopsis homologues.
Nilsson, Lars; Carlsbecker, Annelie; Sundås-Larsson, Annika; Vahala, Tiina
2007-02-01
In angiosperm flower development the identity of the floral organs is determined by the A, B and C factors. Here we present the characterisation of three homologues of the A class gene APETALA2 (AP2) from the conifer Picea abies (Norway spruce), Picea abies APETALA2 LIKE1 (PaAP2L1), PaAP2L2 and PaAP2L3. Similar to AP2 these genes contain sequence motifs complementary to miRNA172 that has been shown to regulate AP2 in Arabidopsis. The genes display distinct expression patterns during plant development; in the female-cone bud PaAP2L1 and PaAP2L3 are expressed in the seed-bearing ovuliferous scale in a pattern complementary to each other, and overlapping with the expression of the C class-related gene DAL2. To study the function of PaAP2L1 and PaAP2L2 the genes were expressed in Arabidopsis. The transgenic PaAP2L2 plants were stunted and flowered later than control plants. Flowers were indeterminate and produced an excess of floral organs most severely in the two inner whorls, associated with an ectopic expression of the meristem-regulating gene WUSCHEL. No homeotic changes in floral-organ identities occurred, but in the ap2-1 mutant background PaAP2L2 was able to promote petal identity, indicating that the spruce AP2 gene has the capacity to substitute for an A class gene in Arabidopsis. In spite of the long evolutionary distance between angiosperms and gymnosperms and the fact that gymnosperms lack structures homologous to sepals and petals our data supports a functional conservation of AP2 genes among the seed plants.
YY1 Regulates Melanocyte Development and Function by Cooperating with MITF
Bell, Robert J. A.; Tran, Thanh-Nga T.; Haq, Rizwan; Liu, Huifei; Love, Kevin T.; Langer, Robert; Anderson, Daniel G.; Larue, Lionel; Fisher, David E.
2012-01-01
Studies of coat color mutants have greatly contributed to the discovery of genes that regulate melanocyte development and function. Here, we generated Yy1 conditional knockout mice in the melanocyte-lineage and observed profound melanocyte deficiency and premature gray hair, similar to the loss of melanocytes in human piebaldism and Waardenburg syndrome. Although YY1 is a ubiquitous transcription factor, YY1 interacts with M-MITF, the Waardenburg Syndrome IIA gene and a master transcriptional regulator of melanocytes. YY1 cooperates with M-MITF in regulating the expression of piebaldism gene KIT and multiple additional pigmentation genes. Moreover, ChIP–seq identified genome-wide YY1 targets in the melanocyte lineage. These studies mechanistically link genes implicated in human conditions of melanocyte deficiency and reveal how a ubiquitous factor (YY1) gains lineage-specific functions by co-regulating gene expression with a lineage-restricted factor (M-MITF)—a general mechanism which may confer tissue-specific gene expression in multiple lineages. PMID:22570637
Inferring gene regression networks with model trees
2010-01-01
Background: Novel strategies are required to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building so-called gene co-expression networks. These are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, our approach generates a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant; to this end, we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: Saccharomyces cerevisiae and E. coli. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results with those of a correlation-based method. This experiment shows that REGNET detects true gene associations more accurately than the Pearson and Spearman zeroth- and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and the others is calculated simultaneously. Model trees are very useful techniques for estimating the numerical values of the target genes by linear regression functions. They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
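The abstract above says only that a statistical procedure controls the false discovery rate when edges are kept. As an illustration, the sketch below uses the Benjamini-Hochberg step-up procedure, a common choice; this is an assumption, since the abstract does not say which procedure REGNET uses. The candidate edges and their p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the test is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for candidate gene-gene edges.
edge_p = {
    ("g1", "g2"): 0.001,
    ("g1", "g3"): 0.008,
    ("g2", "g3"): 0.039,
    ("g2", "g4"): 0.041,
    ("g3", "g4"): 0.60,
}
pairs = list(edge_p)
keep = benjamini_hochberg([edge_p[p] for p in pairs])
significant_edges = [p for p, k in zip(pairs, keep) if k]
```

With these toy values only the first two edges survive, although four of the five raw p-values fall below 0.05; that gap is the reason for applying a multiple-testing correction before drawing the graph.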
Burgess, Diane; Freeling, Michael
2014-01-01
In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Characterisation of a collagen gene subfamily from the potato cyst nematode Globodera pallida.
Gray, L J; Curtis, R H; Jones, J T
2001-01-24
We have isolated two full-length genomic DNA sequences, which encode the cuticle collagen proteins GP-COL-1 and GP-COL-2, from the potato cyst nematode Globodera pallida. A third, partial collagen gene ORF termed gp-col-t(t=truncated) has also been isolated and appears to represent an unexpressed pseudogene. The gp-col-1 and gp-col-2 genes both contain three short (<97 bp) introns which disrupt coding regions predicted to specify proteins with molecular weights of 33 and 32.7 kDa respectively. All three sequences show high similarity to each other and to the previously isolated G. pallida cDNA clone gp-col-8. The conserved pattern of cysteine residues and non-(Gly-X-Y)(n) region sequence similarity observed in all four G. pallida genes suggests that these molecules form part of the same subfamily of collagens. Southern analysis indicates that this subfamily is likely to contain further members. The G. pallida collagen sequences show striking similarity to twelve genes from Caenorhabditis elegans which collectively represent the recently classified Group 1a collagen subfamily. No data exists on the function of this subfamily in C. elegans. gp-col-1 and gp-col-2 are developmentally regulated with transcripts of both genes detected in adult virgin and gravid females but not in pre-parasitic second stage juveniles. A similar expression pattern is observed for the Group 1a collagen lemmi 5 from Meloidogyne incognita perhaps indicating a generic link between subfamily and function during the various changes in cuticular structure which accompany nematode growth and reproduction. Immunochemical studies indicate that the GP-COL-1 protein is specifically located in the hypodermis of G. pallida adult females.
Guo, Wei-Li; Chen, Ru-Gang; Gong, Zhen-Hui; Yin, Yan-Xu; Li, Da-Wei
2013-01-01
Low temperature is one of the major factors limiting pepper (Capsicum annuum L.) production during winter and early spring in non-tropical regions. Application of exogenous abscisic acid (ABA) effectively alleviates the symptoms of chilling injury, such as wilting and formation of necrotic lesions on pepper leaves; however, the underlying molecular mechanism is not understood. The aim of this study was to identify genes that are differentially up- or downregulated in ABA-pretreated hot pepper seedlings incubated at 6°C for 48 h, using a suppression subtractive hybridization (SSH) method. A total of 235 high-quality ESTs were isolated, clustered and assembled into a collection of 73 unigenes including 18 contigs and 55 singletons. A total of 37 unigenes (50.68%) showed similarities to genes with known functions in the non-redundant database; the other 36 unigenes (49.32%) showed low similarities or unknown functions. Gene ontology analysis revealed that the 37 unigenes could be classified into nine functional categories. The expression profiles of 18 selected genes were analyzed using quantitative RT-PCR; the expression levels of 10 of these genes were at least two-fold higher in the ABA-pretreated seedlings under chilling stress than water-pretreated (control) plants under chilling stress. In contrast, the other eight genes were downregulated in ABA-pretreated seedlings under chilling stress, with expression levels that were one-third or less of the levels observed in control seedlings under chilling stress. These results suggest that ABA can positively and negatively regulate genes in pepper plants under chilling stress.
Peri, A; Cordella-Miele, E; Miele, L; Mukherjee, A B
1993-01-01
Clara cell 10-kD protein (cc10kD), a secretory phospholipase A2 inhibitor, is suggested to be the human counterpart of rabbit uteroglobin (UG). Because cc10kD is expressed constitutively at a very high level in the human respiratory epithelium, the 5' region of its gene may be useful in achieving organ-specific expression of recombinant DNA in gene therapy of diseases such as cystic fibrosis. However, it is important to establish the tissue-specific expression of this gene before designing gene transfer experiments. Since the UG gene in the rabbit is expressed in many other organs besides the lung and the endometrium, we investigated the organ and tissue specificity of human cc10kD gene expression using polymerase chain reaction, nucleotide sequence analysis, immunofluorescence, and Northern blotting. Our results indicate that, in addition to the lung, cc10kD is expressed in several nonrespiratory organs, with a distribution pattern very similar, if not identical, to that of UG in the rabbit. These results underscore the necessity for more detailed analyses of the 5' region of the human cc10kD gene before its usefulness in gene therapy could be fully assessed. These data also suggest that cc10kD and UG may have similar physiological function(s). PMID:8227325
Versatile types of polysaccharide-based supramolecular polycation/pDNA nanoplexes for gene delivery
NASA Astrophysics Data System (ADS)
Hu, Yang; Zhao, Nana; Yu, Bingran; Liu, Fusheng; Xu, Fu-Jian
2014-06-01
Different polysaccharide-based supramolecular polycations were readily synthesized by assembling multiple β-cyclodextrin-cored star polycations with an adamantane-functionalized dextran via host-guest interaction in the absence or presence of bioreducible linkages. Compared with nanoplexes of the starting star polycation and pDNA, the supramolecular polycation/pDNA nanoplexes exhibited similarly low cytotoxicity, improved cellular internalization and significantly higher gene transfection efficiencies. The incorporation of disulfide linkages imparted the supramolecular polycation/pDNA nanoplexes with the advantage of intracellular bioreducibility, resulting in better gene delivery properties. In addition, the antitumor properties of supramolecular polycation/pDNA nanoplexes were also investigated using a suicide gene therapy system.
The present study demonstrates that the proper assembly of cyclodextrin-cored polycations with adamantane-functionalized polysaccharides is an effective strategy for the production of new nanoplex delivery systems. Electronic supplementary information (ESI) available: 1H NMR assay and synthetic route of Dex-Ad and Dex-SS-Ad. See DOI: 10.1039/c4nr01590h
Edelmann, Lisa; Stankiewicz, Pavel; Spiteri, Elizabeth; Pandita, Raj K.; Shaffer, Lisa; Lupski, James; Morrow, Bernice E.
2001-01-01
The DGCR6 (DiGeorge critical region) gene encodes a putative protein with sequence similarity to gonadal (gdl), a Drosophila melanogaster gene of unknown function. We mapped the DGCR6 gene to chromosome 22q11 within a low copy repeat, termed sc11.1a, and identified a second copy of the gene, DGCR6L, within the duplicate locus, termed sc11.1b. Both sc11.1 repeats are deleted in most persons with velo-cardio-facial syndrome/DiGeorge syndrome (VCFS/DGS), and they map immediately adjacent and internal to the low copy repeats, termed LCR22, that mediate the deletions associated with VCFS/DGS. We sequenced genomic clones from both loci and determined that the putative initiator methionine is located further upstream than originally described, but in a position similar to the mouse and chicken orthologs. DGCR6L encodes a highly homologous, functional copy of DGCR6, with some base changes rendering amino acid differences. Expression studies of the two genes indicate that both genes are widely expressed in fetal and adult tissues. Evolutionary studies using FISH mapping in several different species of ape combined with sequence analysis of DGCR6 in a number of different primate species indicate that the duplication is at least 12 million years old and may date back to before the divergence of Catarrhines from Platyrrhines, 35 mya. These data suggest that there has been selective evolutionary pressure toward the functional maintenance of both paralogs. Interestingly, a full-length HERV-K provirus integrated into the sc11.1a locus after the divergence of chimpanzees and humans. PMID:11157784
Bedell, Victoria M.; Person, Anthony D.; Larson, Jon D.; McLoon, Anna; Balciunas, Darius; Clark, Karl J.; Neff, Kevin I.; Nelson, Katie E.; Bill, Brent R.; Schimmenti, Lisa A.; Beiraghi, Soraya; Ekker, Stephen C.
2012-01-01
The Homeobox (Hox) and Paired box (Pax) gene families are key determinants of animal body plans and organ structure. In particular, they function within regulatory networks that control organogenesis. How these conserved genes elicit differences in organ form and function in response to evolutionary pressures is incompletely understood. We molecularly and functionally characterized one member of an evolutionarily dynamic gene family, plac8 onzin related protein 1 (ponzr1), in the zebrafish. ponzr1 mRNA is expressed early in the developing kidney and pharyngeal arches. Using ponzr1-targeting morpholinos, we show that ponzr1 is required for formation of the glomerulus. Loss of ponzr1 results in a nonfunctional glomerulus but retention of a functional pronephros, an arrangement similar to the aglomerular kidneys found in a subset of marine fish. ponzr1 is integrated into the pax2a pathway, with ponzr1 expression requiring pax2a gene function, and proper pax2a expression requiring normal ponzr1 expression. In addition to pronephric function, ponzr1 is required for pharyngeal arch formation. We functionally demonstrate that ponzr1 can act as a transcription factor or co-factor, providing the first molecular mode of action for this newly described gene family. Together, this work provides experimental evidence of an additional mechanism that incorporates evolutionarily dynamic, lineage-specific gene families into conserved regulatory gene networks to create functional organ diversity. PMID:22274699
Zhang, Wei-Dong; Zhao, Yong; Zhang, Hong-Fu; Wang, Shu-Kun; Hao, Zhi-Hui; Liu, Jing; Yuan, Yu-Qing; Zhang, Peng-Fei; Yang, Hong-Di; Shen, Wei; Li, Lan
2016-08-01
Granulosa cells (GCs) are the somatic cells closest to the female germ cell. GCs play a vital role in oocyte growth and development, and the oocyte is necessary for multiplication of a species. Zinc oxide (ZnO) nanoparticles (NPs) readily cross biologic barriers to be absorbed into biologic systems, which makes them promising candidates as food additives. The objective of the present investigation was to explore the impact of intact NPs on gene expression and the functional classification of altered genes in hen GCs in vivo, to compare the data from in vivo and in vitro studies, and finally to point out the adverse effects of ZnO NPs on the reproductive system. After a 24-week treatment, hen GCs were isolated and gene expression was quantified. Intact NPs were found in the ovary and other organs. Zn levels were similar in ZnO-NP-100 mg/kg- and ZnSO4-100 mg/kg-treated hen ovaries. ZnO-NP-100 mg/kg and ZnSO4-100 mg/kg regulated the expression of the same sets of genes, and they also altered the expression of different sets of genes individually. The number of genes altered by the ZnO-NP-100 mg/kg and ZnSO4-100 mg/kg treatments was different. Gene Ontology (GO) functional analysis reported different results for the two treatments and, in Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, 12 pathways (out of the top 20 pathways) in each treatment were different. These results suggested that intact NPs and Zn(2+) had different effects on gene expression in GCs in vivo. In our recent publication, we noted that intact NPs and Zn(2+) differentially altered gene expression in GCs in vitro. However, GO functional classification and KEGG pathway enrichment analyses revealed close similarities for the changed genes in vivo and in vitro after ZnO NP treatment. Furthermore, close similarities were observed for the changed genes after ZnSO4 treatments in vivo and in vitro by GO functional classification and KEGG pathway enrichment analyses.
Therefore, the effects of ZnO NPs on gene expression in vitro might represent their effects on gene expression in vivo. The results from this study and our earlier studies support previous findings indicating ZnO NPs promote adverse effects on organisms. Therefore, precautions should be taken when ZnO NPs are used as diet additives for hens because they might cause reproductive issues.
Ehlers, Claudia; Veit, Katharina; Gottschalk, Gerhard; Schmitz, Ruth A.
2002-01-01
The mesophilic methanogenic archaeon Methanosarcina mazei strain Gö1 is able to utilize molecular nitrogen (N2) as its sole nitrogen source. We have identified and characterized a single nitrogen fixation (nif) gene cluster in M. mazei Gö1 with an approximate length of 9 kbp. Sequence analysis revealed seven genes with sequence similarities to nifH, nifI1, nifI2, nifD, nifK, nifE and nifN, similar to other diazotrophic methanogens and certain bacteria such as Clostridium acetobutylicum, with the two glnB-like genes (nifI1 and nifI2) located between nifH and nifD. Phylogenetic analysis of deduced amino acid sequences for the nitrogenase structural genes of M. mazei Gö1 showed that they are most closely related to Methanosarcina barkeri nif2 genes, and also closely resemble those for the corresponding nif products of the gram-positive bacterium C. acetobutylicum. Northern blot analysis and reverse transcription PCR analysis demonstrated that the M. mazei nif genes constitute an operon transcribed only under nitrogen starvation as a single 8 kb transcript. Sequence analysis revealed a palindromic sequence at the transcriptional start site in front of the M. mazei nifH gene, which may have a function in transcriptional regulation of the nif operon. PMID:15803652
Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures
Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Park, Seung-Won; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J.; Robson, Bryanne E.; Aerts, Stein; van Helden, Jacques; Hassan, Bassem; Gilbert, Donald G.; Eastman, Deborah A.; Rice, Michael; Weir, Michael; Hahn, Matthew W.; Park, Yongkyu; Dewey, Colin N.; Pachter, Lior; Kent, W. James; Haussler, David; Lai, Eric C.; Bartel, David P.; Hannon, Gregory J.; Kaufman, Thomas C.; Eisen, Michael B.; Clark, Andrew G.; Smith, Douglas; Celniker, Susan E.; Gelbart, William M.; Kellis, Manolis
2008-01-01
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies. PMID:17994088
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
Grau, Jan; Reschke, Maik; Erkes, Annett; Streubel, Jana; Morgan, Richard D.; Wilson, Geoffrey G.; Koebnik, Ralf; Boch, Jens
2016-01-01
Transcription activator-like effectors (TALEs) are virulence factors, produced by the bacterial plant-pathogen Xanthomonas, that function as gene activators inside plant cells. Although the contribution of individual TALEs to infectivity has been shown, the specific roles of most TALEs, and the overall TALE diversity in Xanthomonas spp. is not known. TALEs possess a highly repetitive DNA-binding domain, which is notoriously difficult to sequence. Here, we describe an improved method for characterizing TALE genes by the use of PacBio sequencing. We present ‘AnnoTALE’, a suite of applications for the analysis and annotation of TALE genes from Xanthomonas genomes, and for grouping similar TALEs into classes. Based on these classes, we propose a unified nomenclature for Xanthomonas TALEs that reveals similarities pointing to related functionalities. This new classification enables us to compare related TALEs and to identify base substitutions responsible for the evolution of TALE specificities. PMID:26876161
The Saccharomyces cerevisiae enolase-related regions encode proteins that are active enolases.
Kornblatt, M J; Richard Albert, J; Mattie, S; Zakaib, J; Dayanandan, S; Hanic-Joyce, P J; Joyce, P B M
2013-02-01
In addition to two genes (ENO1 and ENO2) known to code for enolase (EC4.2.1.11), the Saccharomyces cerevisiae genome contains three enolase-related regions (ERR1, ERR2 and ERR3) which could potentially encode proteins with enolase function. Here, we show that products of these genes (Err2p and Err3p) have secondary and quaternary structures similar to those of yeast enolase (Eno1p). In addition, Err2p and Err3p can convert 2-phosphoglycerate to phosphoenolpyruvate, with kinetic parameters similar to those of Eno1p, suggesting that these proteins could function as enolases in vivo. To address this possibility, we overexpressed the ERR2 and ERR3 genes individually in a double-null yeast strain lacking ENO1 and ENO2, and showed that either ERR2 or ERR3 could complement the growth defect in this strain when cells are grown in medium with glucose as the carbon source. Taken together, these data suggest that the ERR genes in Saccharomyces cerevisiae encode a protein that could function in glycolysis as enolase. The presence of these enolase-related regions in Saccharomyces cerevisiae and their absence in other related yeasts suggests that these genes may play some unique role in Saccharomyces cerevisiae. Further experiments will be required to determine whether these functions are related to glycolysis or other cellular processes.
Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C
2013-01-01
The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. Such a test requires hundreds of networks, and hence a tool to construct them rapidly. To examine the functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested the functional robustness of networks for human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate a dramatic decrease in network construction time and computational requirements and show that, despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.
Discover mouse gene coexpression landscapes using dictionary learning and sparse coding.
Li, Yujie; Chen, Hanbo; Jiang, Xi; Li, Xiang; Lv, Jinglei; Peng, Hanchuan; Tsien, Joe Z; Liu, Tianming
2017-12-01
Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will illuminate the higher-order transcriptome organization and offer genetic foundations of functional circuitry. Here, using dictionary learning and sparse coding, we derived coexpression networks from the spatially resolved, anatomically comprehensive in situ hybridization data of the Allen Mouse Brain Atlas dataset. The key idea is that if two genes use the same dictionary to represent their original signals, then their gene expressions must share similar patterns, so the genes can be considered "coexpressed." For each network, we have simultaneous knowledge of spatial distributions, the genes in the network and the extent to which a particular gene conforms to the coexpression pattern. Gene ontologies and the comparisons with published gene lists reveal biologically identified coexpression networks, some of which correspond to major cell types, biological pathways, and/or anatomical regions.
Yang, Yang; Fu, Xiaofeng; Qu, Wenhao; Xiao, Yiqun; Shen, Hong-Bin
2018-04-27
Thanks to high-throughput experimental technologies, whole-genome analysis of microRNAs (miRNAs) has become increasingly common as a way to uncover important regulatory roles of miRNAs and to identify miRNA biomarkers for disease diagnosis. As information complementary to the high-throughput experimental data, domain knowledge such as the Gene Ontology and KEGG pathways is usually used to guide gene function analysis. However, functional annotation for miRNAs is scarce in the public databases. Until now, only a few methods have been proposed for measuring the functional similarity between miRNAs based on public annotation data, and these methods cover a very limited number of miRNAs, so they are not applicable to large-scale miRNA analysis. In this paper, we propose a new method to measure the functional similarity for miRNAs, called miRGOFS, which has two notable features: (i) it adopts a new GO semantic similarity metric which considers both common ancestors and descendants of GO terms; (ii) it computes similarity between GO sets in an asymmetric manner, and weights each GO term by its statistical significance. The miRGOFS-based predictor achieves an F1 of 61.2% on a benchmark data set of miRNA localization, and AUC values of 87.7% and 81.1% on two benchmark sets of miRNA-disease association, respectively. Compared with the existing functional similarity measurements of miRNAs, miRGOFS has the advantages of higher accuracy and larger coverage of human miRNAs (over 1000 miRNAs). Availability: http://www.csbio.sjtu.edu.cn/bioinf/MiRGOFS/. Contact: yangyang@cs.sjtu.edu.cn or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
Ahn, Suzie E.; Lim, Chul-Hong; Lee, Jin-Young; Bae, Seung-Min; Kim, Jinyoung; Bazer, Fuller W.; Song, Gwonhwa
2013-01-01
The reproductive system of chickens undergoes dynamic morphological and functional tissue remodeling during the molting period. The present study identified global gene expression profiles following oviductal tissue regression and regeneration in laying hens in which molting was induced by feeding high levels of zinc in the diet. During the molting and recrudescence processes, progressive morphological and physiological changes included regression and re-growth of reproductive organs and fluctuations in concentrations of testosterone, progesterone, estradiol and corticosterone in blood. The cDNA microarray analysis of oviductal tissues revealed the biological significance of gene expression-based modulation in oviductal tissue during its remodeling. Based on the gene expression profiles, selected genes such as TF, ANGPTL3, p20K, PTN, AvBD11 and SERPINB3 exhibited similar expression patterns, with gradual decreases during regression of the oviduct and sequential increases during resurrection of the functional oviduct. Also, miR-1689* inhibited expression of Sp1, while miR-17-3p, miR-22* and miR-1764 inhibited expression of STAT1. Similarly, chicken miR-1562 and miR-138 reduced the expression of ANGPTL3 and p20K, respectively. These results suggest that these differentially regulated genes are closely correlated with the molecular mechanism(s) for development and tissue remodeling of the avian female reproductive tract, and that miRNA-mediated regulation of key genes likely contributes to remodeling of the avian reproductive tract by controlling expression of those genes post-transcriptionally. The discovered global gene profiles provide new molecular candidates responsible for regulating morphological and functional recrudescence of the avian reproductive tract, and provide novel insights into understanding the remodeling process at the genomic and epigenomic levels. PMID:24098561
Early gene expression during natural spinal cord regeneration in the salamander Ambystoma mexicanum.
Monaghan, James R; Walker, John A; Page, Robert B; Putta, Srikrishna; Beachy, Christopher K; Voss, S Randal
2007-04-01
In contrast to mammals, salamanders have a remarkable ability to regenerate their spinal cord and recover full movement and function after tail amputation. To identify genes that may be associated with this greater regenerative ability, we designed an oligonucleotide microarray and profiled early gene expression during natural spinal cord regeneration in Ambystoma mexicanum. We sampled tissue at five early time points after tail amputation and identified genes that registered significant changes in mRNA abundance during the first 7 days of regeneration. A list of 1036 statistically significant genes was identified. Additional statistical and fold change criteria were applied to identify a smaller list of 360 genes that were used to describe predominant expression patterns and gene functions. Our results show that a diverse injury response is activated in concert with extracellular matrix remodeling mechanisms during the early acute phase of natural spinal cord regeneration. We also report gene expression similarities and differences between our study and studies that have profiled gene expression after spinal cord injury in rat. Our study illustrates the utility of a salamander model for identifying genes and gene functions that may enhance regenerative ability in mammals.
Kaneko, Kumi; Hori, Sayaka; Morimoto, Mai M; Nakaoka, Takayoshi; Paul, Rajib Kumar; Fujiyuki, Tomoko; Shirai, Kenichi; Wakamoto, Akiko; Tsuboko, Satomi; Takeuchi, Hideaki; Kubo, Takeo
2010-02-16
The importance of visual sense in Hymenopteran social behavior is suggested by the existence of a Hymenopteran insect-specific neural circuit related to visual processing and the fact that worker honeybee brain changes morphologically according to its foraging experience. To analyze molecular and neural bases that underlie the visual abilities of the honeybees, we used a cDNA microarray to search for gene(s) expressed in a neural cell-type preferential manner in a visual center of the honeybee brain, the optic lobes (OLs). Expression analysis of candidate genes using in situ hybridization revealed two genes expressed in a neural cell-type preferential manner in the OLs. One is a homologue of Drosophila futsch, which encodes a microtubule-associated protein and is preferentially expressed in the monopolar cells in the lamina of the OLs. The gene for another microtubule-associated protein, tau, which functionally overlaps with futsch, was also preferentially expressed in the monopolar cells, strongly suggesting the functional importance of these two microtubule-associated proteins in monopolar cells. The other gene encoded a homologue of Misexpression Suppressor of Dominant-negative Kinase Suppressor of Ras 2 (MESK2), which might activate Ras/MAPK-signaling in Drosophila. MESK2 was expressed preferentially in a subclass of neurons located in the ventral region between the lamina and medulla neuropil in the OLs, suggesting that this subclass is a novel OL neuron type characterized by MESK2-expression. These three genes exhibited similar expression patterns in the worker, drone, and queen brains, suggesting that they function similarly irrespective of the honeybee sex or caste. Here we identified genes that are expressed in a monopolar cell (Amfutsch and Amtau) or ventral medulla-preferential manner (AmMESK2) in insect OLs. These genes may aid in visualizing neurites of monopolar cells and ventral medulla cells, as well as in analyzing the function of these neurons.
Aging is associated with a predictable loss of cellular homeostasis, a decline in physiological function and an increase in various diseases. We hypothesized that similar age-related gene expression profiles would be observed in mice across independent studies. Employing a metaan...
Surprises in the maize pollen transcriptome: Inbred differences and developmental similarities
Pollen is the primary means of gene flow between plants and plant populations and plays a critical role in seed production. Our overall objective is to better understand the molecular and genetic basis of the pollen function. We compared gene expression levels in seedlings, mat...
A Weighted Multipath Measurement Based on Gene Ontology for Estimating Gene Products Similarity
Liu, Lizhen; Dai, Xuemin; Song, Wei; Lu, Jingli
2014-01-01
Many different methods have been proposed for calculating the semantic similarity of term pairs based on gene ontology (GO). Most existing methods are based on information content (IC), and the methods based on IC are used more commonly than those based on the structure of GO. However, most IC-based methods not only fail to handle identical annotations but also show a strong bias toward well-annotated proteins. We propose a new method called weighted multipath measurement (WMM) for estimating the semantic similarity of gene products based on the structure of the GO. We not only considered the contribution of every path between two GO terms but also took the depth of the lowest common ancestors into account. We assigned different weights for different kinds of edges in GO graph. The similarity values calculated by WMM can be reused because they are only relative to the characteristics of GO terms. Experimental results showed that the similarity values obtained by WMM have a higher accuracy. We compared the performance of WMM with that of other methods using GO data and gene annotation datasets for yeast and humans downloaded from the GO database. We found that WMM is more suited for prediction of gene function than most existing IC-based methods and that it can distinguish proteins with identical annotations (two proteins are annotated with the same terms) from each other. PMID:25229994
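To make the notion of a structure-based (rather than IC-based) GO similarity concrete, here is a minimal sketch that uses only the depth of the lowest common ancestor. It is a Wu-Palmer-style measure chosen for illustration, not the WMM formula, which also weights every path and edge type. The toy ontology and its term names are invented and are not real GO identifiers.

```python
# Hypothetical miniature ontology: term -> list of parent terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "protein_binding": ["binding"],
    "dna_binding": ["binding"],
    "kinase": ["catalysis"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(term):
    """Depth along the longest path to the root (the root has depth 1)."""
    parents = PARENTS[term]
    return 1 if not parents else 1 + max(depth(p) for p in parents)


def structure_similarity(a, b):
    """2 * depth(deepest common ancestor) / (depth(a) + depth(b))."""
    common = ancestors(a) & ancestors(b)
    lca_depth = max(depth(t) for t in common)
    return 2 * lca_depth / (depth(a) + depth(b))


sim_siblings = structure_similarity("protein_binding", "dna_binding")
sim_distant = structure_similarity("protein_binding", "kinase")
```

Because the score depends only on the graph, it can be computed once per term pair and reused, which matches the reuse property the abstract mentions for WMM.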
Zhang, Cui; Li, Zhenkui; Cui, Huiting; Jiang, Yuanyuan; Yang, Zhenke; Wang, Xu; Gao, Han; Liu, Cong; Zhang, Shujia
2017-01-01
Malaria parasites have a complex life cycle with multiple developmental stages in mosquito and vertebrate hosts, and different developmental stages express unique sets of genes. Unexpectedly, many transcription factors (TFs) commonly found in eukaryotic organisms are absent in malaria parasites; instead, a family of genes encoding proteins similar to the plant Apetala2 (ApiAP2) transcription factors is expanded in the parasites. Several malaria ApiAP2 genes have been shown to play a critical role in parasite development; however, the functions of the majority of the ApiAP2 genes remain to be elucidated. In particular, no study on the Plasmodium yoelii ApiAP2 (PyApiAP2) gene family has been reported so far. This study systematically investigated the functional roles of PyApiAP2 genes in parasite development. Twenty-four of the 26 PyApiAP2 genes were selected for disruption, and 12 were successfully knocked out using the clustered regularly interspaced short palindromic repeat–CRISPR-associated protein 9 (CRISPR-Cas9) method. The effects of gene knockout (KO) on parasite development in mouse and mosquito stages were evaluated. Ten of 12 successfully disrupted genes, including two genes that have not been functionally characterized in any Plasmodium species previously, were shown to be critical for P. yoelii development of sexual and mosquito stages. Additionally, seven of the genes were labeled for protein expression analysis, revealing important information supporting their functions. This study represents the first systematic functional characterization of the P. yoelii ApiAP2 gene family and provides important insights into the roles of the ApiAP2 genes in parasite development. PMID:29233900
Xu, Xiao Hui; Chen, Hao; Sang, Ya Lin; Wang, Fang; Ma, Jun Ping; Gao, Xin-Qi; Zhang, Xian Sheng
2012-07-02
In plants, pollination is a critical step in reproduction. During pollination, constant communication between male pollen and the female stigma is required for pollen adhesion, germination, and tube growth. The detailed mechanisms of stigma-mediated reproductive processes, however, remain largely unknown. Maize (Zea mays L.), one of the world's most important crops, has been extensively used as a model species to study molecular mechanisms of pollen and stigma interaction. A comprehensive analysis of maize silk transcriptome may provide valuable information for investigating stigma functionality. A comparative analysis of expression profiles between maize silk and dry stigmas of other species might reveal conserved and diverse mechanisms that underlie stigma-mediated reproductive processes in various plant species. Transcript abundance profiles of mature silk, mature pollen, mature ovary, and seedling were investigated using RNA-seq. By comparing the transcriptomes of these tissues, we identified 1,427 genes specifically or preferentially expressed in maize silk. Bioinformatic analyses of these genes revealed many genes with known functions in plant reproduction as well as novel candidate genes that encode amino acid transporters, peptide and oligopeptide transporters, and cysteine-rich receptor-like kinases. In addition, comparison of gene sets specifically or preferentially expressed in stigmas of maize, rice (Oryza sativa L.), and Arabidopsis (Arabidopsis thaliana [L.] Heynh.) identified a number of homologous genes involved either in pollen adhesion, hydration, and germination or in initial growth and penetration of pollen tubes into the stigma surface. The comparison also indicated that maize shares a more similar profile and larger number of conserved genes with rice than with Arabidopsis, and that amino acid and lipid transport-related genes are distinctively overrepresented in maize. 
Many of the novel genes uncovered in this study are potentially involved in stigma-mediated reproductive processes, including genes encoding amino acid transporters, peptide and oligopeptide transporters, and cysteine-rich receptor-like kinases. The data also suggest that dry stigmas share similar mechanisms at early stages of pollen-stigma interaction. Compared with Arabidopsis, maize and rice appear to have more conserved functional mechanisms. Genes involved in amino acid and lipid transport may be responsible for mechanisms in the reproductive process that are unique to maize silk. PMID:22748054
A Limited Role for Gene Duplications in the Evolution of Platypus Venom
Wong, Emily S. W.; Papenfuss, Anthony T.; Whittington, Camilla M.; Warren, Wesley C.; Belov, Katherine
2012-01-01
Gene duplication followed by adaptive selection is believed to be the primary driver of venom evolution. However, to date, no studies have evaluated the importance of gene duplications for venom evolution using a genomic approach. The availability of a sequenced genome and a venom gland transcriptome for the enigmatic platypus provides a unique opportunity to explore the role that gene duplication plays in venom evolution. Here, we identify gene duplication events and correlate them with expressed transcripts in an in-season venom gland. Gene duplicates (1,508) were identified. These duplicated pairs (421), including genes that have undergone multiple rounds of gene duplications, were expressed in the venom gland. The majority of these genes are involved in metabolism and protein synthesis not toxin functions. Twelve secretory genes including serine proteases, metalloproteinases, and protease inhibitors likely to produce symptoms of envenomation such as vasodilation and pain were detected. Only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication. Platypus venom C-type natriuretic peptides and nerve growth factor do not possess lineage-specific gene duplicates. Extensive duplications, believed to increase the potency of toxic content and promote toxin diversification, were not found. This is the first study to take a genome-wide approach in order to examine the impact of gene duplication on venom evolution. Our findings support the idea that adaptive selection acts on gene duplicates to drive the independent evolution and functional diversification of similar venom genes in venomous species. However, gene duplications alone do not explain the “venome” of the platypus. Other mechanisms, such as alternative splicing and mutation, may be important in venom innovation. PMID:21816864
Xi, Jianing; Wang, Minghui; Li, Ao
2017-09-26
The growing availability of next-generation sequencing data offers an opportunity to pinpoint driver genes that are causally implicated in oncogenesis through computational models. Despite previous efforts on this challenging problem, there is still room for improvement in the accuracy of driver gene identification. In this paper, we propose a novel integrated approach called IntDriver for prioritizing driver genes. Based on a matrix factorization framework, IntDriver can effectively incorporate functional information from both the interaction network and Gene Ontology similarity, and detect driver genes mutated in different sets of patients at the same time. When evaluated against known benchmark driver genes, the top-ranked genes in our results show highly significant enrichment for the known genes. Meanwhile, IntDriver also detects some known driver genes that are not found by the other competing approaches. When measured by precision, recall and F1 score, the performance of our approach is comparable to or better than that of the competing approaches.
Benoit, Joshua B; Attardo, Geoffrey M; Michalkova, Veronika; Krause, Tyler B; Bohova, Jana; Zhang, Qirui; Baumann, Aaron A; Mireji, Paul O; Takáč, Peter; Denlinger, David L; Ribeiro, Jose M; Aksoy, Serap
2014-04-01
In tsetse flies, nutrients for intrauterine larval development are synthesized by the modified accessory gland (milk gland) and provided in mother's milk during lactation. Interference with at least two milk proteins has been shown to extend larval development and reduce fecundity. The goal of this study was to perform a comprehensive characterization of tsetse milk proteins using lactation-specific transcriptome/milk proteome analyses and to define functional role(s) for the milk proteins during lactation. Differential analysis of RNA-seq data from lactating and dry (non-lactating) females revealed enrichment of transcripts coding for protein synthesis machinery, lipid metabolism and secretory proteins during lactation. Among the genes induced during lactation were those encoding the previously identified milk proteins (milk gland proteins 1-3, transferrin and acid sphingomyelinase 1) and seven new genes (mgp4-10). The genes encoding mgp2-10 are organized on a 40 kb syntenic block in the tsetse genome, have similar exon-intron arrangements, and share regions of amino acid sequence similarity. Expression of mgp2-10 is female-specific and high during milk secretion. While knockdown of a single mgp failed to reduce fecundity, simultaneous knockdown of multiple variants reduced milk protein levels and lowered fecundity. The genomic localization, gene structure similarities, and functional redundancy of MGP2-10 suggest that they constitute a novel highly divergent protein family. Our data indicates that MGP2-10 function both as the primary amino acid resource for the developing larva and in the maintenance of milk homeostasis, similar to the function of the mammalian casein family of milk proteins. This study underscores the dynamic nature of the lactation cycle and identifies a novel family of lactation-specific proteins, unique to Glossina sp., that are essential to larval development. 
The specificity of MGP2-10 to tsetse and their critical role during lactation suggests that these proteins may be an excellent target for tsetse-specific population control approaches.
Bacterial infection as assessed by in vivo gene expression
Heithoff, Douglas M.; Conner, Christopher P.; Hanna, Philip C.; Julio, Steven M.; Hentschel, Ute; Mahan, Michael J.
1997-01-01
In vivo expression technology (IVET) has been used to identify >100 Salmonella typhimurium genes that are specifically expressed during infection of BALB/c mice and/or murine cultured macrophages. Induction of these genes is shown to be required for survival in the animal under conditions of the IVET selection. One class of in vivo induced (ivi) genes, iviVI-A and iviVI-B, constitute an operon that resides in a region of the Salmonella genome with low G+C content and presumably has been acquired by horizontal transfer. These ivi genes encode predicted proteins that are similar to adhesins and invasins from prokaryotic and eukaryotic pathogens (Escherichia coli [tia], Plasmodium falciparum [PfEMP1]) and have coopted the PhoPQ regulatory circuitry of Salmonella virulence genes. Examination of the in vivo induction profile indicates (i) many ivi genes encode regulatory functions (e.g., phoPQ and pmrAB) that serve to enhance the sensitivity and amplitude of virulence gene expression (e.g., spvB); (ii) the biochemical function of many metabolic genes may not represent their sole contribution to virulence; (iii) the host ecology can be inferred from the biochemical functions of ivi genes; and (iv) nutrient limitation plays a dual signaling role in pathogenesis: to induce metabolic functions that complement host nutritional deficiencies and to induce virulence functions required for immediate survival and spread to subsequent host sites. PMID:9023360
Identifying metabolic enzymes with multiple types of association evidence
Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M
2006-01-01
Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as a top candidate within 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
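The "within the top 10 candidates" style of evaluation reported above can be sketched as follows. The sketch assumes that each candidate gene already has one score per evidence type, and it combines them by a plain unweighted average, which stands in for whatever combination scheme the authors actually used. All gene names, evidence labels and scores are hypothetical.

```python
def combined_ranking(evidence):
    """Rank candidate genes by the mean of their evidence scores.

    evidence maps gene -> {evidence type: score}; higher means stronger
    association with the enzymatic function being filled.
    """
    mean_score = {
        gene: sum(scores.values()) / len(scores)
        for gene, scores in evidence.items()
    }
    return sorted(mean_score, key=lambda g: (-mean_score[g], g))


def top_k_rate(cases, k):
    """Fraction of cases whose true gene is among the top k candidates."""
    hits = sum(
        1 for evidence, true_gene in cases
        if true_gene in combined_ranking(evidence)[:k]
    )
    return hits / len(cases)


# Two hypothetical enzymatic functions, each with three candidate genes.
toy_cases = [
    ({"g1": {"chromosome": 0.9, "phylogeny": 0.2, "expression": 0.7},
      "g2": {"chromosome": 0.1, "phylogeny": 0.9, "expression": 0.2},
      "g3": {"chromosome": 0.5, "phylogeny": 0.5, "expression": 0.6}}, "g1"),
    ({"g1": {"chromosome": 0.2, "phylogeny": 0.3, "expression": 0.1},
      "g2": {"chromosome": 0.8, "phylogeny": 0.4, "expression": 0.6},
      "g3": {"chromosome": 0.7, "phylogeny": 0.9, "expression": 0.8}}, "g2"),
]

rate_top1 = top_k_rate(toy_cases, k=1)
rate_top2 = top_k_rate(toy_cases, k=2)
```

In the first toy case the true gene ranks first; in the second it ranks second, so it counts as a hit only when k is at least 2.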
Carré, Aurore; Hamza, Rasha T.; Kariyawasam, Dulanjalee; Guillot, Loïc; Teissier, Raphaël; Tron, Elodie; Castanet, Mireille; Dupuy, Corinne; El Kholy, Mohamed; Polak, Michel
2014-01-01
Background: Homozygous loss-of-function mutations in the FOXE1 gene have been reported in several patients with partial or complete Bamforth–Lazarus syndrome: congenital hypothyroidism (CH) with thyroid dysgenesis (usually athyreosis), cleft palate, spiky hair, with or without choanal atresia, and bifid epiglottis. Here, our objective was to evaluate potential functional consequences of a FOXE1 mutation in a patient with a similar clinical phenotype. Methods: FOXE1 was sequenced in eight patients with thyroid dysgenesis and cleft palate. Transient transfection was performed in HEK293 cells using the thyroglobulin (TG) and thyroid peroxidase (TPO) promoters in luciferase reporter plasmids to assess the functional impact of the FOXE1 mutations. Primary human thyrocytes transfected with wild type and mutant FOXE1 served to assess the impact of the mutation on endogenous TG and TPO expression. Results: We identified and characterized the function of a new homozygous FOXE1 missense mutation (p.R73S) in a boy with a typical phenotype (athyreosis, cleft palate, and partial choanal atresia). This new mutation located within the forkhead domain was inherited from the heterozygous healthy consanguineous parents. In vitro functional studies in HEK293 cells showed that this mutant gene enhanced the activity of the TG and TPO gene promoters (1.5-fold and 1.7-fold respectively vs. wild type FOXE1; p<0.05), unlike the five mutations previously reported in Bamforth–Lazarus syndrome. The gain-of-function effect of the FOXE1-p.R73S mutant gene was confirmed by an increase in endogenous TG production in primary human thyrocytes. Conclusion: We identified a new homozygous FOXE1 mutation responsible for enhanced expression of the TG and TPO genes in a boy whose phenotype is similar to that reported previously in patients with loss-of-function FOXE1 mutations. 
This finding further delineates the role for FOXE1 in both thyroid and palate development, and shows that enhanced gene activity should be considered among the mechanisms underlying Bamforth–Lazarus syndrome. PMID:24219130
Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R
1996-01-01
Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852
Hefer, Charles A; Mizrachi, Eshchar; Myburg, Alexander A; Douglas, Carl J; Mansfield, Shawn D
2015-06-01
Wood formation is a complex developmental process governed by genetic and environmental stimuli. Populus and Eucalyptus are fast-growing, high-yielding tree genera that represent ecologically and economically important species suitable for generating significant lignocellulosic biomass. Comparative analysis of the developing xylem and leaf transcriptomes of Populus trichocarpa and Eucalyptus grandis together with phylogenetic analyses identified clusters of homologous genes preferentially expressed during xylem formation in both species. A conserved set of 336 single gene pairs showed highly similar xylem preferential expression patterns, as well as evidence of high functional constraint. Individual members of multi-gene orthologous clusters known to be involved in secondary cell wall biosynthesis also showed conserved xylem expression profiles. However, species-specific expression as well as opposite (xylem versus leaf) expression patterns observed for a subset of genes suggest subtle differences in the transcriptional regulation important for xylem development in each species. Using sequence similarity and gene expression status, we identified functional homologs likely to be involved in xylem developmental and biosynthetic processes in Populus and Eucalyptus. Our study suggests that, while genes involved in secondary cell wall biosynthesis show high levels of gene expression conservation, differential regulation of some xylem development genes may give rise to unique xylem properties. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk.
Cheng, Liang; Jiang, Yue; Ju, Hong; Sun, Jie; Peng, Jiajie; Zhou, Meng; Hu, Yang
2018-01-19
Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. Nowadays over 300 ontologies have been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because of the advantage of identifying novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Though similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies were not investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology similarities of terms. However, it only used gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existing methods are designed specifically for GO, and their performance on the extended ontology community remains unknown. We proposed a method, InfAcrOnt, to infer similarities between terms across ontologies utilizing the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and acquires similarities between terms across ontologies by modeling the information flow within the network with a random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) in both human and yeast benchmark datasets, exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results and prior knowledge on pair-wise DO-HPO terms and pair-wise DO-GO terms show high correlations. The experimental results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies in the benchmark set.
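The idea of scoring cross-ontology term pairs by information flow over a term-gene-gene network can be illustrated with a generic random walk with restart. This is only a minimal sketch under stated assumptions, not the InfAcrOnt scoring itself: the toy network, the term and gene names (`DO:x`, `HPO:y`, `HPO:z`, `g1` to `g4`), the restart probability and the helper functions are all hypothetical and chosen for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def random_walk_with_restart(adjacency, start, restart=0.5, iters=200):
    """Approximate the stationary visiting probabilities of a walker that
    jumps back to `start` with probability `restart` at every step."""
    nodes = sorted(adjacency)
    prob = {n: 1.0 if n == start else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: (restart if n == start else 0.0) for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # Isolated node: send its mass back to the start node.
                nxt[start] += (1 - restart) * prob[n]
                continue
            share = (1 - restart) * prob[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        prob = nxt
    return prob


# Hypothetical toy network: term-gene annotation edges plus gene-gene edges.
toy_edges = [
    ("DO:x", "g1"), ("DO:x", "g2"),    # annotations of a disease term
    ("HPO:y", "g2"), ("HPO:y", "g3"),  # phenotype term sharing gene g2
    ("HPO:z", "g4"),                   # phenotype term reachable only via genes
    ("g1", "g2"), ("g2", "g3"), ("g3", "g4"),  # gene functional interactions
]
scores = random_walk_with_restart(build_adjacency(toy_edges), "DO:x")
print(scores["HPO:y"], scores["HPO:z"])
```

In this toy network the walker started at `DO:x` visits `HPO:y`, which shares an annotated gene with it, more often than `HPO:z`, which it can reach only through a chain of gene-gene interactions. Using the whole network in this way is what lets connectivity among gene nodes, and not just direct interactions, contribute to the score.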
Li, Wan; Zhu, Lina; Huang, Hao; He, Yuehan; Lv, Junjie; Li, Weimin; Chen, Lina; He, Weiming
2017-10-01
Complex chronic diseases are caused by the effects of genetic and environmental factors. Single nucleotide polymorphisms (SNPs), one common type of genetic variation, play vital roles in diseases. We hypothesized that disease risk functional SNPs in coding regions and protein interaction network modules were more likely to contribute to the identification of disease susceptible genes for complex chronic diseases. This could help to further reveal the pathogenesis of complex chronic diseases. Disease risk SNPs were first recognized from public SNP data for coronary heart disease (CHD), hypertension (HT) and type 2 diabetes (T2D). SNPs in coding regions that were classified as nonsense or missense by integrating several SNP functional annotation databases were treated as functional SNPs. Then, regions significantly associated with each disease were screened using random permutations for disease risk functional SNPs. Corresponding to these regions, 155, 169 and 173 potential disease susceptible genes were identified for CHD, HT and T2D, respectively. A disease-related gene product interaction network in environmental context was constructed for interacting gene products of both disease genes and potential disease susceptible genes for these diseases. After functional enrichment analysis for disease associated modules, 5 CHD susceptible genes, 7 HT susceptible genes and 3 T2D susceptible genes were finally identified, some of which had pleiotropic effects. Most of these genes were verified to be related to these diseases in the literature. This was similar for disease genes identified by another method, proposed by Lee et al., from a different aspect. This research could provide novel perspectives for diagnosis and treatment of complex chronic diseases and for identification of susceptible genes for other diseases. Copyright © 2017 Elsevier Inc. All rights reserved.
Preparation of fosmid libraries and functional metagenomic analysis of microbial community DNA.
Martínez, Asunción; Osburne, Marcia S
2013-01-01
One of the most important challenges in contemporary microbial ecology is to assign a functional role to the large number of novel genes discovered through large-scale sequencing of natural microbial communities that lack similarity to genes of known function. Functional screening of metagenomic libraries, that is, screening environmental DNA clones for the ability to confer an activity of interest to a heterologous bacterial host, is a promising approach for bridging the gap between metagenomic DNA sequencing and functional characterization. Here, we describe methods for isolating environmental DNA and constructing metagenomic fosmid libraries, as well as methods for designing and implementing successful functional screens of such libraries. © 2013 Elsevier Inc. All rights reserved.
Los1p, involved in yeast pre-tRNA splicing, positively regulates members of the SOL gene family.
Shen, W C; Stanford, D R; Hopper, A K
1996-06-01
To understand the role of Los1p in pre-tRNA splicing, we sought los1 multicopy suppressors. We found SOL1, which suppresses both point and null LOS1 mutations. Since, when fused to the Gal4p DNA-binding domain, Los1p activates transcription, we tested whether Los1p regulates SOL1. We found that los1 mutants have depleted levels of SOL1 mRNA and Sol1p. Thus, LOS1 appears to positively regulate SOL1. SOL1 belongs to a multigene family with at least two additional members, SOL2 and SOL3. Sol proteins have extensive similarity to an unusual group of glucose-6-phosphate dehydrogenases. As the similarities are restricted to areas separate from the catalytic domain, these G6PDs may have more than one function. The SOL family appears to be unessential since cells with a triple disruption of all three SOL genes are viable. SOL gene disruptions negatively affect tRNA-mediated nonsense suppression and the severity increases with the number of mutant SOL genes. However, tRNA levels do not vary with either multicopy SOL genes or with SOL disruptions. Therefore, the Sol proteins affect tRNA expression/function at steps other than transcription or splicing. We propose that LOS1 regulates gene products involved in tRNA expression/function as well as pre-tRNA splicing.
Diametrical clustering for identifying anti-correlated gene clusters.
Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman
2003-09-01
Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, with each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
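The two alternating steps named in this abstract (re-partitioning the genes, then recomputing each cluster's dominant singular vector) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes profiles are centered and scaled to unit length, assigns each gene by the squared dot product with a prototype so that correlated and anti-correlated genes land together, and uses a deterministic farthest-first initialization and power iteration that are choices made here, not taken from the abstract. All function names are hypothetical.

```python
import math


def normalize(vec):
    """Center a profile to zero mean and scale it to unit length."""
    mean = sum(vec) / len(vec)
    centered = [v - mean for v in vec]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dominant_direction(rows, start, iters=100):
    """Power iteration for the dominant right singular vector of `rows`."""
    v = start[:]
    for _ in range(iters):
        # Apply X^T X to v without forming the matrix: sum_i (x_i . v) x_i
        w = [0.0] * len(v)
        for x in rows:
            c = dot(x, v)
            for k, xk in enumerate(x):
                w[k] += c * xk
        norm = math.sqrt(dot(w, w))
        if norm == 0:
            return v
        v = [wk / norm for wk in w]
    return v


def farthest_first(data, k):
    """Assumed initialization: start from the first profile, then add the
    profile least aligned (in squared dot product) with those chosen so far."""
    chosen = [data[0][:]]
    while len(chosen) < k:
        best = min(data, key=lambda x: max(dot(x, p) ** 2 for p in chosen))
        chosen.append(best[:])
    return chosen


def diametrical_clustering(profiles, k, rounds=20):
    data = [normalize(p) for p in profiles]
    prototypes = farthest_first(data, k)
    labels = [0] * len(data)
    for _ in range(rounds):
        # (i) re-partition: the squared dot product ignores the correlation sign
        labels = [max(range(k), key=lambda j: dot(x, prototypes[j]) ** 2)
                  for x in data]
        # (ii) update each prototype to its cluster's dominant singular vector
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                prototypes[j] = dominant_direction(members, prototypes[j])
    return labels, prototypes


# Toy profiles: the first two are anti-correlated with each other, as are
# the last two, and the two pairs are uncorrelated with one another.
toy_profiles = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 0, 0, 2], [0, 2, 2, 0]]
toy_labels, toy_prototypes = diametrical_clustering(toy_profiles, k=2)
print(toy_labels)
```

On the toy data each anti-correlated pair shares a cluster, which is the behavior that separates this scheme from clustering on positive correlation alone.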
Borchert, S; Stachelhaus, T; Marahiel, M A
1994-01-01
The deduced amino acid sequence of the gsp gene, located upstream of the 5' end of the gramicidin S operon (grs operon) in Bacillus brevis, showed a high degree of similarity to the sfp gene product, which is located downstream of the srfA operon in B. subtilis. The gsp gene complemented in trans a defect in the sfp gene (sfpO) and promoted production of the lipopeptide antibiotic surfactin. The functional homology of Gsp and Sfp and the sequence similarity of these two proteins to EntD suggest that the three proteins represent a new class of proteins involved in peptide secretion, in support of a hypothesis published previously (T. H. Grossman, M. Tuckman, S. Ellestad, and M. S. Osburne, J. Bacteriol. 175:6203-6211, 1993). PMID:7512553
Neuhaus, Klaus; Landstorfer, Richard; Fellner, Lea; Simon, Svenja; Schafferhans, Andrea; Goldberg, Tatyana; Marx, Harald; Ozoline, Olga N; Rost, Burkhard; Kuster, Bernhard; Keim, Daniel A; Scherer, Siegfried
2016-02-24
Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.
Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function
Noble, Luke M.; Andrianopoulos, Alex
2013-01-01
Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226
DIA1R is an X-linked gene related to Deleted In Autism-1.
Aziz, Azhari; Harrop, Sean P; Bishop, Naomi E
2011-01-17
Autism spectrum disorders (ASDs) are frequently occurring disorders diagnosed by deficits in three core functional areas: social skills, communication, and behaviours and/or interests. Mental retardation frequently accompanies the most severe forms of ASDs, while overall ASDs are more commonly diagnosed in males. Most ASDs have a genetic origin and one gene recently implicated in the etiology of autism is the Deleted-In-Autism-1 (DIA1) gene. Using a bioinformatics-based approach, we have identified a human gene closely related to DIA1, which we term DIA1R (DIA1-Related). While DIA1 is autosomal (chromosome 3, position 3q24), DIA1R localizes to the X chromosome at position Xp11.3 and is known to escape X-inactivation. The gene products are of similar size, with DIA1 encoding 430, and DIA1R 433, residues. At the amino acid level, DIA1 and DIA1R are 62% similar overall (28% identical), and both encode signal peptides for targeting to the secretory pathway. Both genes are ubiquitously expressed, including in fetal and adult brain tissue. Examination of published literature revealed point mutations in DIA1R are associated with X-linked mental retardation (XLMR) and DIA1R deletion is associated with syndromes with ASD-like traits and/or XLMR. Together, these results support a model where the DIA1 and DIA1R gene products regulate molecular traffic through the cellular secretory pathway or affect the function of secreted factors, and functional deficits cause disorders with ASD-like symptoms and/or mental retardation.
Di Meglio, Paola; Di Cesare, Antonella; Laggner, Ute; Chu, Chung-Ching; Napolitano, Luca; Villanova, Federica; Tosi, Isabella; Capon, Francesca; Trembath, Richard C.; Peris, Ketty; Nestle, Frank O.
2011-01-01
IL-23 and Th17 cells are key players in tissue immunosurveillance and are implicated in human immune-mediated diseases. Genome-wide association studies have shown that the IL23R R381Q gene variant protects against psoriasis, Crohn's disease and ankylosing spondylitis. We investigated the immunological consequences of the protective IL23R R381Q gene variant in healthy donors. The IL23R R381Q gene variant had no major effect on Th17 cell differentiation as the frequency of circulating Th17 cells was similar in carriers of the IL23R protective (A) and common (G) allele. Accordingly, Th17 cells generated from A and G donors produced similar amounts of Th17 cytokines. However, IL-23-mediated Th17 cell effector function was impaired, as Th17 cells from A allele carriers had significantly reduced IL-23-induced IL-17A production and STAT3 phosphorylation compared to G allele carriers. Our functional analysis of a human disease-associated gene variant demonstrates that IL23R R381Q exerts its protective effects through selective attenuation of IL-23-induced Th17 cell effector function without interfering with Th17 differentiation, and highlights its importance in the protection against IL-23-induced tissue pathologies. PMID:21364948
Di Meglio, Paola; Di Cesare, Antonella; Laggner, Ute; Chu, Chung-Ching; Napolitano, Luca; Villanova, Federica; Tosi, Isabella; Capon, Francesca; Trembath, Richard C; Peris, Ketty; Nestle, Frank O
2011-02-22
IL-23 and Th17 cells are key players in tissue immunosurveillance and are implicated in human immune-mediated diseases. Genome-wide association studies have shown that the IL23R R381Q gene variant protects against psoriasis, Crohn's disease and ankylosing spondylitis. We investigated the immunological consequences of the protective IL23R R381Q gene variant in healthy donors. The IL23R R381Q gene variant had no major effect on Th17 cell differentiation as the frequency of circulating Th17 cells was similar in carriers of the IL23R protective (A) and common (G) allele. Accordingly, Th17 cells generated from A and G donors produced similar amounts of Th17 cytokines. However, IL-23-mediated Th17 cell effector function was impaired, as Th17 cells from A allele carriers had significantly reduced IL-23-induced IL-17A production and STAT3 phosphorylation compared to G allele carriers. Our functional analysis of a human disease-associated gene variant demonstrates that IL23R R381Q exerts its protective effects through selective attenuation of IL-23-induced Th17 cell effector function without interfering with Th17 differentiation, and highlights its importance in the protection against IL-23-induced tissue pathologies.
Los1p, involved in yeast pre-tRNA splicing, positively regulates members of the SOL gene family
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shen, W.C.; Stanford, D.R.; Hopper, A.K.
1996-06-01
To understand the role of Los1p in pre-tRNA splicing, we sought los1 multicopy suppressors. We found SOL1 that suppresses both point and null LOS1 mutations. Since, when fused to the Gal4p DNA-binding domain, Los1p activates transcription, we tested whether Los1p regulates SOL1. We found that los1 mutants have depleted levels of SOL1 mRNA and Sol1p. Thus, LOS1 appears to positively regulate SOL1. SOL1 belongs to a multigene family with at least two additional members, SOL2 and SOL3. Sol proteins have extensive similarity to an unusual group of glucose-6-phosphate dehydrogenases (G6PDs). As the similarities are restricted to areas separate from the catalytic domain, these G6PDs may have more than one function. The SOL gene disruptions negatively affect tRNA-mediated nonsense suppression and the severity increases with the number of mutant SOL genes. However, tRNA levels do not vary with either multicopy SOL genes or with SOL disruptions. Therefore, the Sol proteins affect tRNA expression/function at steps other than transcription or splicing. We propose that LOS1 regulates gene products involved in tRNA expression/function as well as pre-tRNA splicing. 64 refs., 6 figs., 6 tabs.
Land scale biogeography of arsenic biotransformation genes in estuarine wetland.
Zhang, Si-Yu; Su, Jian-Qiang; Sun, Guo-Xin; Yang, Yunfeng; Zhao, Yi; Ding, Junjun; Chen, Yong-Shan; Shen, Yu; Zhu, Guibing; Rensing, Christopher; Zhu, Yong-Guan
2017-06-01
As an analogue of phosphorus, arsenic (As) has a biogeochemical cycle coupled closely with other key elements on the Earth, such as iron, sulfate and phosphate. It has been documented that microbial genes associated with As biotransformation are widely present in As-rich environments. Nonetheless, their presence in natural environment with low As levels remains unclear. To address this issue, we investigated the abundance levels and diversities of aioA, arrA, arsC and arsM genes in estuarine sediments at low As levels across Southeastern China to uncover biogeographic patterns at a large spatial scale. Unexpectedly, genes involved in As biotransformation were characterized by high abundance and diversity. The functional microbial communities showed a significant decrease in similarity along the geographic distance, with higher turnover rates than taxonomic microbial communities based on the similarities of 16S rRNA genes. Further investigation with niche-based models showed that deterministic processes played primary roles in shaping both functional and taxonomic microbial communities. Temperature, pH, total nitrogen concentration, carbon/nitrogen ratio and ferric iron concentration rather than As content in these sediments were significantly linked to functional microbial communities, while sediment temperature and pH were linked to taxonomic microbial communities. We proposed several possible mechanisms to explain these results. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
Huang, Pengyun; Lin, Fucheng
2014-01-01
Because of the great challenges and workload involved in deleting genes on a large scale, the functions of most genes in pathogenic fungi are still unclear. In this study, we developed a high-throughput gene knockout system using a novel yeast-Escherichia-Agrobacterium shuttle vector, pKO1B, in the rice blast fungus Magnaporthe oryzae. Using this method, we deleted 104 fungal-specific Zn2Cys6 transcription factor (TF) genes in M. oryzae. We then analyzed the phenotypes of these mutants with regard to growth, asexual and infection-related development, pathogenesis, and 9 abiotic stresses. The resulting data provide new insights into how this rice pathogen of global significance regulates important traits in the infection cycle through Zn2Cys6 TF genes. A large variation in biological functions of Zn2Cys6 TF genes was observed under the conditions tested. Sixty-one of 104 Zn2Cys6 TF genes were found to be required for fungal development. In-depth analysis of TF genes revealed that TF genes involved in pathogenicity frequently tend to function in multiple development stages, and disclosed many highly conserved but unidentified functional TF genes of importance in the fungal kingdom. We further found that the virulence-required TF genes GPF1 and CNF2 have similar regulation mechanisms in the gene expression involved in pathogenicity. These experimental validations clearly demonstrated the value of a high-throughput gene knockout system in understanding the biological functions of genes on a genome scale in fungi, and provided a solid foundation for elucidating the gene expression network that regulates the development and pathogenicity of M. oryzae. PMID:25299517
Molecular definition of the identity and activation of natural killer cells.
Bezman, Natalie A; Kim, Charles C; Sun, Joseph C; Min-Oo, Gundula; Hendricks, Deborah W; Kamimura, Yosuke; Best, J Adam; Goldrath, Ananda W; Lanier, Lewis L
2012-10-01
Using whole-genome microarray data sets of the Immunological Genome Project, we demonstrate a closer transcriptional relationship between NK cells and T cells than between any other leukocytes, distinguished by their shared expression of genes encoding molecules with similar signaling functions. Whereas resting NK cells are known to share expression of a few genes with cytotoxic CD8(+) T cells, our transcriptome-wide analysis demonstrates that the commonalities extend to hundreds of genes, many encoding molecules with unknown functions. Resting NK cells demonstrate a 'preprimed' state compared with naive T cells, which allows NK cells to respond more rapidly to viral infection. Collectively, our data provide a global context for known and previously unknown molecular aspects of NK cell identity and function by delineating the genome-wide repertoire of gene expression of NK cells in various states.
Strakova, Eva; Zikova, Alice; Vohradsky, Jiri
2014-01-01
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of the prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded at 13 time points, which provided a database of gene expression time series on a genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
Integrative and conjugative elements and their hosts: composition, distribution and organization
Touchon, Marie; Rocha, Eduardo P. C.
2017-01-01
Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and was widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICE), even if they are more abundant, was only studied in a few model systems. Comparative genomics of ICE has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species’ pan-genome. We delimited 200 ICEs and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even if the gene repertoires of ICEs can be grouped in function of conjugation types. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, composed of a majority of unknown function genes. PMID:28911112
A graph-theoretic approach for inparalog detection.
Tremblay-Savard, Olivier; Swenson, Krister M
2012-01-01
Understanding the history of a gene family that evolves through duplication, speciation, and loss is a fundamental problem in comparative genomics. Features such as function, position, and structural similarity between genes are intimately connected to this history; relationships between genes such as orthology (genes related through a speciation event) or paralogy (genes related through a duplication event) are usually correlated with these features. For example, recent work has shown that in human and mouse there is a strong connection between function and inparalogs, the paralogs that were created since the speciation event separating the human and mouse lineages. Methods exist for detecting inparalogs that either use information from only two species, or consider a set of species but rely on clustering methods. In this paper we present a graph-theoretic approach for finding lower bounds on the number of inparalogs for a given set of species; we pose an edge covering problem on the similarity graph and give an efficient 2/3-approximation as well as a faster heuristic. Since the physical position of inparalogs corresponding to recent speciations is not likely to have changed since the duplication, we also use our predictions to estimate the types of duplications that have occurred in some vertebrates and Drosophila.
Protein classification using probabilistic chain graphs and the Gene Ontology structure.
Carroll, Steven; Pavlovic, Vladimir
2006-08-01
Probabilistic graphical models have been developed in the past for the task of protein classification. In many cases, classifications obtained from the Gene Ontology have been used to validate these models. In this work we directly incorporate the structure of the Gene Ontology into the graphical representation for protein classification. We present a method in which each protein is represented by a replicate of the Gene Ontology structure, effectively modeling each protein in its own 'annotation space'. Proteins are also connected to one another according to different measures of functional similarity, after which belief propagation is run to make predictions at all ontology terms. The proposed method was evaluated on a set of 4879 proteins from the Saccharomyces Genome Database whose interactions were also recorded in the GRID project. Results indicate that direct utilization of the Gene Ontology improves predictive ability, outperforming traditional models that do not take advantage of dependencies among functional terms. Average increase in accuracy (precision) of positive and negative term predictions of 27.8% (2.0%) over three different similarity measures and three subontologies was observed. C/C++/Perl implementation is available from authors upon request.
2013-01-01
Background: MicroRNAs (miRNAs) are small non-coding RNAs that play critical roles in regulating post-transcriptional gene expression. Gall midges encompass a large group of insects that are of economic importance and also possess fascinating biological traits. The gall midge Mayetiola destructor, commonly known as the Hessian fly, is a destructive pest of wheat and a model organism for studying gall midge biology and insect-host plant interactions. Results: In this study, we systematically analyzed miRNAs from the Hessian fly. Deep-sequencing a Hessian fly larval transcriptome led to the identification of 89 miRNA species that are either identical or very similar to known miRNAs from other insects, and 184 novel miRNAs that have not been reported from other species. A genome-wide search through a draft Hessian fly genome sequence identified a total of 611 putative miRNA-encoding genes based on sequence similarity and the existence of a stem-loop structure for miRNA precursors. Analysis of the 611 putative genes revealed a striking feature: the dramatic expansion of several miRNA gene families. The largest family contained 91 genes that encoded 20 different miRNAs. Microarray analyses revealed that the expression of miRNA genes was strictly regulated during Hessian fly larval development and that the abundance of many miRNA genes was affected by host genotypes. Conclusion: The identification of a large number of miRNAs for the first time from a gall midge provides a foundation for further studies of miRNA functions in gall midge biology and behavior. The dramatic expansion of identical or similar miRNAs provides a unique system to study functional relations among miRNA iso-genes as well as changes in sequence specificity due to small changes in miRNAs and in their mRNA targets. These results may also facilitate the identification of miRNA genes for potential pest control through transgenic approaches. PMID:23496979
A gene-targeted approach to investigate the intestinal butyrate-producing bacterial community
2013-01-01
Background: Butyrate, which is produced by the human microbiome, is essential for a well-functioning colon. Bacteria that produce butyrate are phylogenetically diverse, which hinders their accurate detection based on conventional phylogenetic markers. As a result, reliable information on this important bacterial group is often lacking in microbiome research. Results: In this study we describe a gene-targeted approach for 454 pyrotag sequencing and quantitative polymerase chain reaction for the final genes in the two primary bacterial butyrate synthesis pathways, butyryl-CoA:acetate CoA-transferase (but) and butyrate kinase (buk). We monitored the establishment and early succession of butyrate-producing communities in four patients with ulcerative colitis who underwent a colectomy with ileal pouch anal anastomosis and compared it with three control samples from healthy colons. All patients established an abundant butyrate-producing community (approximately 5% to 26% of the total community) in the pouch within the 2-month study, but patterns were distinctive among individuals. Only one patient harbored a community profile similar to the healthy controls, in which there was a predominance of but genes that are similar to reference genes from Acidaminococcus sp., Eubacterium sp., Faecalibacterium prausnitzii and Roseburia sp., and an almost complete absence of buk genes. Two patients were greatly enriched in buk genes similar to those of Clostridium butyricum and C. perfringens, whereas a fourth patient displayed abundant communities containing both genes. Most butyrate producers identified in previous studies were detected and the general patterns of taxa found were supported by 16S rRNA gene pyrotag analysis, but the gene-targeted approach provided more detail about the potential butyrate-producing members of the community.
Conclusions: The presented approach provides quantitative and genotypic insights into butyrate-producing communities and facilitates a more specific functional characterization of the intestinal microbiome. Furthermore, our analysis refines but and buk reference annotations found in central databases. PMID:24451334
Gong, Zhen-Hui; Yin, Yan-Xu; Li, Da-Wei
2013-01-01
Low temperature is one of the major factors limiting pepper (Capsicum annuum L.) production during winter and early spring in non-tropical regions. Application of exogenous abscisic acid (ABA) effectively alleviates the symptoms of chilling injury, such as wilting and formation of necrotic lesions on pepper leaves; however, the underlying molecular mechanism is not understood. The aim of this study was to identify genes that are differentially up- or downregulated in ABA-pretreated hot pepper seedlings incubated at 6°C for 48 h, using a suppression subtractive hybridization (SSH) method. A total of 235 high-quality ESTs were isolated, clustered and assembled into a collection of 73 unigenes including 18 contigs and 55 singletons. A total of 37 unigenes (50.68%) showed similarities to genes with known functions in the non-redundant database; the other 36 unigenes (49.32%) showed low similarities or unknown functions. Gene ontology analysis revealed that the 37 unigenes could be classified into nine functional categories. The expression profiles of 18 selected genes were analyzed using quantitative RT-PCR; the expression levels of 10 of these genes were at least two-fold higher in the ABA-pretreated seedlings under chilling stress than in water-pretreated (control) plants under chilling stress. In contrast, the other eight genes were downregulated in ABA-pretreated seedlings under chilling stress, with expression levels that were one-third or less of the levels observed in control seedlings under chilling stress. These results suggest that ABA can positively and negatively regulate genes in pepper plants under chilling stress. PMID:23825555
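The unigene counts and percentages in the abstract above are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable and function names are illustrative.

```python
# Counts quoted in the abstract above.
contigs = 18
singletons = 55
unigenes = contigs + singletons  # the abstract reports 73 unigenes

known = 37                  # unigenes similar to genes with known functions
unknown = unigenes - known  # the remaining unigenes


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


known_pct = percent(known, unigenes)
unknown_pct = percent(unknown, unigenes)

print(unigenes, unknown, known_pct, unknown_pct)
```

The two percentages computed this way match the 50.68% and 49.32% given in the abstract.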
Liang, Yuting; Zhao, Huihui; Zhang, Xu; Zhou, Jizhong; Li, Guanghe
2014-07-15
To compare the functional gene structure and diversity of microbial communities in saline-alkali and slightly acidic oil-contaminated sites, 40 soil samples were collected from two typical oil exploration sites in North and South China and analyzed with a comprehensive functional gene array (GeoChip 3.0). The overall microbial pattern was significantly different between the two sites, and a more divergent pattern was observed in slightly acidic soils. Response ratio was calculated to compare the microbial functional genes involved in organic contaminant degradation and carbon, nitrogen, phosphorus, and sulfur cycling. The results indicated a significantly low abundance of most genes involved in organic contaminant degradation and in the cycling of nitrogen and phosphorus in saline-alkali soils. By contrast, most carbon degradation genes and all carbon fixation genes had similar abundance at both sites. Based on the relationship between the environmental variables and microbial functional structure, pH was the major factor influencing the microbial distribution pattern in the two sites. This study demonstrated that microbial functional diversity and heterogeneity in oil-contaminated environments can vary significantly in relation to local environmental conditions. The limitation of nitrogen and phosphorus and the low degradation capacity of organic contaminant should be carefully considered, particularly in most oil-exploration sites with saline-alkali soils. Copyright © 2014 Elsevier B.V. All rights reserved.
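The abstract above does not state how the response ratio was computed. A common definition in the ecological literature is the natural log of the ratio of group means, and the sketch below assumes that definition; the function name and the signal values are hypothetical, not data from the study.

```python
import math
from statistics import mean


def response_ratio(group_a, group_b):
    """Natural-log ratio of mean signal intensities (assumed definition).

    Positive values mean the gene is more abundant in group_a,
    negative values mean it is more abundant in group_b.
    """
    return math.log(mean(group_a) / mean(group_b))


# Hypothetical signal intensities for one functional gene at two sites.
saline_alkali = [1.0, 1.2, 0.8]
slightly_acidic = [2.0, 2.4, 1.6]

rr = response_ratio(saline_alkali, slightly_acidic)
print(rr < 0)  # True: lower abundance at the saline-alkali site in this toy case
```

In practice a confidence interval around the ratio would also be needed before calling a difference significant; that step is omitted here.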
Yang, Laurence; Tan, Justin; O'Brien, Edward J; Monk, Jonathan M; Kim, Donghyuk; Li, Howard J; Charusanti, Pep; Ebrahim, Ali; Lloyd, Colton J; Yurkovich, James T; Du, Bin; Dräger, Andreas; Thomas, Alex; Sun, Yuekai; Saunders, Michael A; Palsson, Bernhard O
2015-08-25
Finding the minimal set of gene functions needed to sustain life is of both fundamental and practical importance. Minimal gene lists have been proposed by using comparative genomics-based core proteome definitions. A definition of a core proteome that is supported by empirical data, is understood at the systems-level, and provides a basis for computing essential cell functions is lacking. Here, we use a systems biology-based genome-scale model of metabolism and expression to define a functional core proteome consisting of 356 gene products, accounting for 44% of the Escherichia coli proteome by mass based on proteomics data. This systems biology core proteome includes 212 genes not found in previous comparative genomics-based core proteome definitions, accounts for 65% of known essential genes in E. coli, and has 78% gene function overlap with minimal genomes (Buchnera aphidicola and Mycoplasma genitalium). Based on transcriptomics data across environmental and genetic backgrounds, the systems biology core proteome is significantly enriched in nondifferentially expressed genes and depleted in differentially expressed genes. Compared with the noncore, core gene expression levels are also similar across genetic backgrounds (two times higher Spearman rank correlation) and exhibit significantly more complex transcriptional and posttranscriptional regulatory features (40% more transcription start sites per gene, 22% longer 5'UTR). Thus, genome-scale systems biology approaches rigorously identify a functional core proteome needed to support growth. This framework, validated by using high-throughput datasets, facilitates a mechanistic understanding of systems-level core proteome function through in silico models; it de facto defines a paleome.
Recombineering Pseudomonas syringae
USDA-ARS's Scientific Manuscript database
Here we report the identification of functions that promote genomic recombination of linear DNA introduced into Pseudomonas cells by electroporation. The genes encoding these functions were identified in Pseudomonas syringae pv. syringae B728a based on similarity to the lambda Red Exo/Beta and RecE...
Jiang, P; Stone, S; Wagner, R; Wang, S; Dayananth, P; Kozak, C A; Wold, B; Kamb, A
1995-12-01
Cyclin-dependent kinase inhibitors are a growing family of molecules that regulate important transitions in the cell cycle. At least one of these molecules, p16, has been implicated in human tumorigenesis while its close homolog, p15, is induced by cell contact and transforming growth factor-beta (TGF-beta). To investigate the evolutionary and functional features of p15 and p16, we have isolated mouse (Mus musculus) homologs of each gene. Comparative analysis of these sequences provides evidence that the genes have similar functions in mouse and human. In addition, the comparison suggests that a gene conversion event is part of the evolution of the human p15 and p16 genes.
Yoshihara, Takeshi; Spalding, Edgar P; Iino, Moritoshi
2013-04-01
The present study identified a family of six A. thaliana genes that share five limited regions of sequence similarity with LAZY1, a gene in Oryza sativa (rice) shown to participate in the early gravity signaling for shoot gravitropism. A T-DNA insertion into the Arabidopsis gene (At5g14090) most similar to LAZY1 increased the inflorescence branch angle to 81° from the wild type value of 42°. RNA interference lines and molecular rescue experiments confirmed the linkage between the branch-angle phenotype and the gene consequently named AtLAZY1. Time-resolved gravitropism measurements of atlazy1 hypocotyls and primary inflorescence stems showed a significantly reduced bending rate during the first hour of response. The subcellular localization of AtLAZY1 protein was investigated to determine if the nuclear localization predicted from the gene sequence was observable and important to its function in shoot gravity responses. AtLAZY1 fused to green fluorescent protein largely rescued the branch-angle phenotype of atlazy1, and was observed by confocal microscopy at the cell periphery and within the nucleus. Mutation of the nuclear localization signal prevented detectable levels of AtLAZY1 in the nucleus without affecting the ability of the gene to rescue the atlazy1 branch-angle phenotype. These results indicate that AtLAZY1 functions in gravity signaling during shoot gravitropism, being a functional ortholog of rice LAZY1. The nuclear pool of the protein appears to be unnecessary for this function, which instead relies on a pool that appears to reside at the cell periphery. © 2013 The Authors The Plant Journal © 2013 Blackwell Publishing Ltd.
Jay, Z. J.; Rusch, D. B.; Tringe, S. G.; Bailey, C.; Jennings, R. M.
2014-01-01
High-temperature (>70°C) ecosystems in Yellowstone National Park (YNP) provide an unparalleled opportunity to study chemotrophic archaea and their role in microbial community structure and function under highly constrained geochemical conditions. Acidilobus spp. (order Desulfurococcales) comprise one of the dominant phylotypes in hypoxic geothermal sulfur sediment and Fe(III)-oxide environments along with members of the Thermoproteales and Sulfolobales. Consequently, the primary goals of the current study were to analyze and compare replicate de novo sequence assemblies of Acidilobus-like populations from four different mildly acidic (pH 3.3 to 6.1) high-temperature (72°C to 82°C) environments and to identify metabolic pathways and/or protein-encoding genes that provide a detailed foundation of the potential functional role of these populations in situ. De novo assemblies of the highly similar Acidilobus-like populations (>99% 16S rRNA gene identity) represent near-complete consensus genomes based on an inventory of single-copy genes, deduced metabolic potential, and assembly statistics generated across sites. Functional analysis of coding sequences and confirmation of gene transcription by Acidilobus-like populations provide evidence that they are primarily chemoorganoheterotrophs, generating acetyl coenzyme A (acetyl-CoA) via the degradation of carbohydrates, lipids, and proteins, and auxotrophic with respect to several external vitamins, cofactors, and metabolites. No obvious pathways or protein-encoding genes responsible for the dissimilatory reduction of sulfur were identified. The presence of a formate dehydrogenase (Fdh) and other protein-encoding genes involved in mixed-acid fermentation supports the hypothesis that Acidilobus spp. function as degraders of complex organic constituents in high-temperature, mildly acidic, hypoxic geothermal systems. PMID:24162572
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taguchi, Takahiro; Testa, J.R.; Mitcham, J.L.
This report describes the localization of the TIL gene to human chromosome 4p14 using fluorescence in situ hybridization. This gene encodes a protein which is related to the Drosophila transmembrane receptor Toll and the mammalian interleukin-1 receptor, which share similarities in structure and function. The Drosophila gene is also important during embryonic development, which makes TIL a candidate locus for human congenital malformations that are genetically linked to human chromosome 4. 17 refs., 1 fig.
Meléndez, Giselle C.; Manteufel, Edward J.; Dehlin, Heather M.; Register, Thomas C.; Levick, Scott P.
2015-01-01
Background The sensory nerve neuropeptide substance P (SP) regulates cardiac fibrosis in rodents under pressure overload conditions. Interestingly, SP induces a transient increase in the expression of specific genes in isolated rat cardiac fibroblasts, without resultant changes in cell function. This suggests that SP ‘primes’ fibroblasts, but does not directly activate them. We investigated whether these unusual findings are specific to rodent fibroblasts or are translatable to a larger animal model more closely related to humans. Methods We compared the effects of SP on genes associated with extracellular matrix (ECM) regulation, cell-cell adhesion, cell-matrix adhesion and ECM in cardiac fibroblasts isolated from a non-human primate and Sprague-Dawley rats. Results We found that rodent and non-human primate cardiac fibroblasts showed similar ECM regulation and cell adhesion gene expression responses to SP. There were, however, large discrepancies in ECM genes which did not result in collagen or laminin synthesis in rat or non-human primate fibroblasts in response to SP. Conclusions This study further supports the notion that SP serves as a ‘primer’ for fibroblasts rather than initiating direct effects and suggests that rodent fibroblasts are a suitable model for studying gene and functional responses to SP in the absence of human or non-human primate fibroblasts. PMID:25550118
Peng, Chuanhua; Wang, Xiaoping; Li, Fei; Lin, Yongjun
2012-01-01
The rice stem borer, Chilo suppressalis (Walker) (Lepidoptera: Pyralidae), is one of the most detrimental pests affecting rice crops. The use of Bacillus thuringiensis (Bt) toxins has been explored as a means to control this pest, but the potential for C. suppressalis to develop resistance to Bt toxins makes this approach problematic. Few C. suppressalis gene sequences are known, which makes in-depth study of gene function difficult. Herein, we sequenced the midgut transcriptome of the rice stem borer. In total, 37,040 contigs were obtained, with a mean size of 497 bp. As expected, the transcripts of C. suppressalis shared high similarity with arthropod genes. Gene ontology and KEGG analysis were used to classify the gene functions in C. suppressalis. Using the midgut transcriptome data, we conducted a proteome analysis to identify proteins expressed abundantly in the brush border membrane vesicles (BBMV). Of the 100 top abundant proteins that were excised and subjected to mass spectrometry analysis, 74 share high similarity with known proteins. Among these proteins, Western blot analysis showed that Aminopeptidase N and EH domain-containing protein have the binding activities with Bt-toxin Cry1Ac. These data provide invaluable information about the gene sequences of C. suppressalis and the proteins that bind with Cry1Ac. PMID:22666467
Sinha, Amit; Langnick, Claudia; Sommer, Ralf J; Dieterich, Christoph
2014-09-01
Discovery of trans-splicing in multiple metazoan lineages led to the identification of operon-like gene organization in diverse organisms, including trypanosomes, tunicates, and nematodes, but the functional significance of such operons is not completely understood. To see whether the content or organization of operons serves similar roles across species, we experimentally defined operons in the nematode model Pristionchus pacificus. We performed affinity capture experiments on mRNA pools to specifically enrich for transcripts that are trans-spliced to either the SL1- or SL2-spliced leader, using spliced leader-specific probes. We obtained distinct trans-splicing patterns from the analysis of three mRNA pools (total mRNA, SL1 and SL2 fraction) by RNA-seq. This information was combined with a genome-wide analysis of gene orientation and spacing. We could confirm 2219 operons by RNA-seq data out of 6709 candidate operons, which were predicted by sequence information alone. Our gene order comparison of the Caenorhabditis elegans and P. pacificus genomes shows major changes in operon organization in the two species. Notably, only 128 out of 1288 operons in C. elegans are conserved in P. pacificus. However, analysis of gene-expression profiles identified conserved functions such as an enrichment of germline-expressed genes and higher expression levels of operonic genes during recovery from dauer arrest in both species. These results provide support for the model that a necessity for increased transcriptional efficiency in the context of certain developmental processes could be a selective constraint for operon evolution in metazoans. Our method is generally applicable to other metazoans to see if similar functional constraints regulate gene organization into operons. © 2014 Sinha et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Zhu, Chengsheng; Miller, Maximilian
2018-01-01
Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality. PMID:29112720
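The idea of encoding each microbe as a set of functions and connecting microbes through shared functions can be illustrated with a small set-based sketch. The abstract does not say which similarity score fusionDB uses; the Jaccard index below is an assumption chosen for illustration, and the organism names, function labels, and threshold are hypothetical.

```python
def jaccard(functions_a, functions_b):
    """Shared functions divided by all functions seen in either organism."""
    a, b = set(functions_a), set(functions_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_edges(organisms, threshold=0.5):
    """Return (name, name, score) for every pair scoring at or above threshold."""
    names = sorted(organisms)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(organisms[first], organisms[second])
            if score >= threshold:
                edges.append((first, second, score))
    return edges


# Hypothetical organisms, each described by a set of function labels.
microbes = {
    "microbe_A": {"f1", "f2", "f3", "f4"},
    "microbe_B": {"f2", "f3", "f4", "f5"},
    "microbe_C": {"f6", "f7"},
}

edges = similarity_edges(microbes)
print(edges)
```

Here microbe_A and microbe_B share three of five distinct functions and are linked, while microbe_C shares none and stays unconnected. The resulting edge list is the kind of input a similarity network view would draw from.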
Wang, Xu; Werren, John H.; Clark, Andrew G.
2015-01-01
There is extraordinary diversity in sexual dimorphism (SD) among animals, but little is known about its epigenetic basis. To study the epigenetic architecture of SD in a haplodiploid system, we performed RNA-seq and whole-genome bisulfite sequencing of adult females and males from two closely related parasitoid wasps, Nasonia vitripennis and Nasonia giraulti. More than 75% of expressed genes displayed significantly sex-biased expression. As a consequence, expression profiles are more similar between species within each sex than between sexes within each species. Furthermore, extremely male- and female-biased genes are enriched for totally different functional categories: male-biased genes for key enzymes in sex-pheromone synthesis and female-biased genes for genes involved in epigenetic regulation of gene expression. Remarkably, just 70 highly expressed, extremely male-biased genes account for 10% of all transcripts in adult males. Unlike expression profiles, DNA methylomes are highly similar between sexes within species, with no consistent sex differences in methylation found. Therefore, methylation changes cannot explain the extensive level of sex-biased gene expression observed. Female-biased genes have smaller sequence divergence between species, higher conservation to other hymenopterans, and a broader expression range across development. Overall, female-biased genes have been recruited from genes with more conserved and broadly expressing “house-keeping” functions, whereas male-biased genes are more recently evolved and are predominately testis specific. In summary, Nasonia accomplish a striking degree of sex-biased expression without sex chromosomes or epigenetic differences in methylation. We propose that methylation provides a general signal for constitutive gene expression, whereas other sex-specific signals cause sex-biased gene expression. PMID:26100871
USDA-ARS's Scientific Manuscript database
The secreted proteins encoded by “parasitism genes” expressed within the esophageal glands cells of cyst nematodes play important roles in plant parasitism. Homologous transcripts and encoded proteins of the Heterodera glycines pioneer parasitism genes Hgsyv46, Hg4e02 and Hg5d08 were identified and ...
Ichige, A; Walker, G C
1997-01-01
The Rhizobium meliloti bacA gene encodes a function that is essential for bacterial differentiation into bacteroids within plant cells in the symbiosis between R. meliloti and alfalfa. An Escherichia coli homolog of BacA, SbmA, is implicated in the uptake of microcin B17, microcin J25 (formerly microcin 25), and bleomycin. When expressed in E. coli with the lacZ promoter, the R. meliloti bacA gene was found to suppress all the known defects of E. coli sbmA mutants, namely, increased resistance to microcin B17, microcin J25, and bleomycin, demonstrating the functional similarity between the two proteins. The R. meliloti bacA386::Tn(pho)A mutant, as well as a newly constructed bacA deletion mutant, was found to show increased resistance to bleomycin. However, it also showed increased resistance to certain aminoglycosides and increased sensitivity to ethanol and detergents, suggesting that the loss of bacA function causes some defect in membrane integrity. The E. coli sbmA gene suppressed all these bacA mutant phenotypes as well as the Fix- phenotype when placed under control of the bacA promoter. Taken together, these results strongly suggest that the BacA and SbmA proteins are functionally similar and thus provide support for our previous hypothesis that BacA may be required for uptake of some compound that plays an important role in bacteroid development. However, the additional phenotypes of bacA mutants identified in this study suggest the alternative possibility that BacA may be needed for membrane integrity, which is likely to be critically important during the early stages of bacterial differentiation within plant cells. PMID:8982000
Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster
Wang, Wen; Brunet, Frédéric G.; Nevo, Eviatar; Long, Manyuan
2002-01-01
Non-protein-coding RNA genes play an important role in various biological processes. How new RNA genes originated and whether this process is controlled by similar evolutionary mechanisms for the origin of protein-coding genes remains unclear. A young chimeric RNA gene that we term sphinx (spx) provides the first insight into the early stage of evolution of RNA genes. spx originated as an insertion of a retroposed sequence of the ATP synthase chain F gene at the cytological region 60DB since the divergence of Drosophila melanogaster from its sibling species 2–3 million years ago. This retrosequence, which is located at 102F on the fourth chromosome, recruited a nearby exon and intron, thereby evolving a chimeric gene structure. This molecular process suggests that the mechanism of exon shuffling, which can generate protein-coding genes, also plays a role in the origin of RNA genes. The subsequent evolutionary process of spx has been associated with a high nucleotide substitution rate, possibly driven by a continuous positive Darwinian selection for a novel function, as is shown in its sex- and development-specific alternative splicing. To test whether spx has adapted to different environments, we investigated its population genetic structure in the unique “Evolution Canyon” in Israel, revealing a similar haplotype structure in spx, and thus similar evolutionary forces operating on spx between environments. PMID:11904380
Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples
Peterson, Thomas A.; Park, Junyong
2017-01-01
The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are ‘gene-centric’ in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new ‘domain-centric’ method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families altering pathways known to be involved with cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots’ unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods. PMID:28426665
Tornow, J; Santangelo, G M
1994-06-01
A duplicate copy of the RPL37A gene (encoding ribosomal protein L37) was cloned and sequenced. The coding region of RPL37B is very similar to that of RPL37A, with only one conservative amino-acid difference. However, the intron and flanking sequences of the two genes are extremely dissimilar. Disruption experiments indicate that the two loci are not functionally equivalent: disruption of RPL37B was insignificant, but disruption of RPL37A severely impaired the growth rate of the cell. When both RPL37 loci are disrupted, the cell is unable to grow at all, indicating that rpL37 is an essential protein. The functional disparity between the two RPL37 loci could be explained by differential gene expression. The results of two experiments support this idea: gene fusion of RPL37A to a reporter gene resulted in six-fold higher mRNA levels than was generated by the same reporter gene fused to RPL37B, and a modest increase in gene dosage of RPL37B overcame the lack of a functional RPL37A gene.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Ronald C.; Sanfilippo, Antonio P.; McDermott, Jason E.
2011-02-18
Transcriptional regulatory networks are being determined using “reverse engineering” methods that infer connections based on correlations in gene state. Corroboration of such networks through independent means such as evidence from the biomedical literature is desirable. Here, we explore a novel approach, a bootstrapping version of our previous Cross-Ontological Analytic method (XOA) that can be used for semi-automated annotation and verification of inferred regulatory connections, as well as for discovery of additional functional relationships between the genes. First, we use our annotation and network expansion method on a biological network learned entirely from the literature. We show how new relevant links between genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. Second, we apply our method to annotation, verification, and expansion of a set of regulatory connections found by the Context Likelihood of Relatedness algorithm.
Calduch-Giner, Josep A.; Sitjà-Bobadilla, Ariadna; Pérez-Sánchez, Jaume
2016-01-01
High-quality sequencing reads from the intestine of European sea bass were assembled, annotated by similarity against protein reference databases and combined with nucleotide sequences from public and private databases. After redundancy filtering, 24,906 non-redundant annotated sequences encoding 15,367 different gene descriptions were obtained. These annotated sequences were used to design a custom, high-density oligo-microarray (8 × 15 K) for the transcriptomic profiling of anterior (AI), middle (MI), and posterior (PI) intestinal segments. Similar molecular signatures were found for AI and MI segments, which were combined in a single group (AI-MI), whereas the PI stood out separately, with more than 1900 differentially expressed genes with a fold-change cutoff of 2. Functional analysis revealed that molecular and cellular functions related to feed digestion and nutrient absorption and transport were over-represented in AI-MI segments. By contrast, the initiation and establishment of immune defense mechanisms became especially relevant in PI, although the microarray expression profiling validated by qPCR indicated that these functional changes are gradual from anterior to posterior intestinal segments. This functional divergence occurred in association with spatial transcriptional changes in nutrient transporters and the mucosal chemosensing system via G protein-coupled receptors. These findings help to identify key indicators of gut functions and to compare different fish feeding strategies and immune defense mechanisms acquired along the evolution of teleosts. PMID:27610085
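A fold-change cutoff of 2, as mentioned in the abstract above, can be illustrated with a minimal filter on log2 ratios. This is a sketch only: the gene names and expression values are hypothetical, and a real microarray analysis would also apply a statistical test, which the abstract does not detail and which is not shown here.

```python
import math


def log2_fold_change(expr_a, expr_b):
    """log2 of the expression ratio between two conditions."""
    return math.log2(expr_a / expr_b)


def differentially_expressed(profiles, cutoff=2.0):
    """Keep genes whose fold change is at least `cutoff` in either direction."""
    threshold = math.log2(cutoff)
    selected = {}
    for gene, (posterior, anterior_middle) in profiles.items():
        lfc = log2_fold_change(posterior, anterior_middle)
        if abs(lfc) >= threshold:
            selected[gene] = lfc
    return selected


# Hypothetical mean intensities as (PI, AI-MI) pairs.
profiles = {
    "gene_1": (400.0, 100.0),  # 4-fold higher in PI
    "gene_2": (120.0, 100.0),  # 1.2-fold, below the cutoff
    "gene_3": (50.0, 200.0),   # 4-fold lower in PI
}

hits = differentially_expressed(profiles)
print(sorted(hits))
```

Working on the log2 scale makes up- and downregulation symmetric, so a single absolute-value threshold covers both directions.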
Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya
2015-01-01
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Methodology for the inference of gene function from phenotype data.
Ascensao, Joao A; Dolan, Mary E; Hill, David P; Blake, Judith A
2014-12-12
Biomedical ontologies are increasingly instrumental in the advancement of biological research primarily through their use to efficiently consolidate large amounts of data into structured, accessible sets. However, ontology development and usage can be hampered by the segregation of knowledge by domain that occurs due to independent development and use of the ontologies. The ability to infer data associated with one ontology to data associated with another ontology would prove useful in expanding information content and scope. We here focus on relating two ontologies: the Gene Ontology (GO), which encodes canonical gene function, and the Mammalian Phenotype Ontology (MP), which describes non-canonical phenotypes, using statistical methods to suggest GO functional annotations from existing MP phenotype annotations. This work is in contrast to previous studies that have focused on inferring gene function from phenotype primarily through lexical or semantic similarity measures. We have designed and tested a set of algorithms that represents a novel methodology to define rules for predicting gene function by examining the emergent structure and relationships between the gene functions and phenotypes rather than inspecting the terms semantically. The algorithms inspect relationships among multiple phenotype terms to deduce if there are cases where they all arise from a single gene function. We apply this methodology to data about genes in the laboratory mouse that are formally represented in the Mouse Genome Informatics (MGI) resource. From the data, 7444 rule instances were generated from five generalized rules, resulting in 4818 unique GO functional predictions for 1796 genes. We show that our method is capable of inferring high-quality functional annotations from curated phenotype data. 
As well as creating inferred annotations, our method has the potential to allow for the elucidation of unforeseen, biologically significant associations between gene function and phenotypes that would be overlooked by a semantics-based approach. Future work will include the implementation of the described algorithms for a variety of other model organism databases, taking full advantage of the abundance of available high quality curated data.
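The idea of deducing a gene function from a combination of phenotype terms can be sketched as follows. This is only an illustration under stated assumptions, not the five generalized rules of the abstract above: it considers pairs of phenotype terms, proposes a rule when every gene carrying both terms shares a GO term, and then applies the rule to genes that lack that GO term. The function names, the thresholds `min_support` and `min_confidence`, and all identifiers in the example data are hypothetical.

```python
from itertools import combinations

def infer_rules(mp_annotations, go_annotations, min_support=2, min_confidence=1.0):
    """Propose rules (MP term pair) -> GO terms from genes that have both
    phenotype and function annotations."""
    rules = {}
    all_mp_terms = sorted(set().union(*mp_annotations.values()))
    for pair in combinations(all_mp_terms, 2):
        carriers = [g for g, mp in mp_annotations.items()
                    if set(pair) <= mp and g in go_annotations]
        if len(carriers) < min_support:
            continue  # too few genes to trust the pattern
        go_counts = {}
        for g in carriers:
            for term in go_annotations[g]:
                go_counts[term] = go_counts.get(term, 0) + 1
        for term, n in go_counts.items():
            if n / len(carriers) >= min_confidence:
                rules.setdefault(pair, set()).add(term)
    return rules

def predict(rules, mp_annotations, go_annotations):
    """Apply the rules to every gene and keep only GO terms it does not have yet."""
    predictions = {}
    for gene, mp in mp_annotations.items():
        known = go_annotations.get(gene, set())
        for pair, terms in rules.items():
            if set(pair) <= mp:
                new_terms = terms - known
                if new_terms:
                    predictions.setdefault(gene, set()).update(new_terms)
    return predictions

# Hypothetical annotations; the identifiers are placeholders, not real MP or GO IDs.
mp_annotations = {
    "gene1": {"MP:A", "MP:B"},
    "gene2": {"MP:A", "MP:B", "MP:C"},
    "gene3": {"MP:A", "MP:B"},
    "gene4": {"MP:A"},
}
go_annotations = {
    "gene1": {"GO:X", "GO:Y"},
    "gene2": {"GO:X"},
    "gene4": {"GO:Z"},
}

rules = infer_rules(mp_annotations, go_annotations)
predictions = predict(rules, mp_annotations, go_annotations)
```

In this toy data the only rule found is that `MP:A` together with `MP:B` implies `GO:X`, which yields one new prediction for `gene3`.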
Strategies and Challenges in Identifying Function for Thousands of sORF-Encoded Peptides in Meiosis.
Hollerer, Ina; Higdon, Andrea; Brar, Gloria A
2017-09-20
Recent genomic analyses have revealed pervasive translation from formerly unrecognized short open reading frames (sORFs) during yeast meiosis. Despite their short length, which has caused these regions to be systematically overlooked by traditional gene annotation approaches, meiotic sORFs share many features with classical genes, implying the potential for similar types of cellular functions. We found that sORF expression accounts for approximately 10-20% of the cellular translation capacity in yeast during meiotic differentiation and occurs within well-defined time windows, suggesting the production of relatively abundant peptides with stage-specific meiotic roles from these regions. Here, we provide arguments supporting this hypothesis and discuss sORF similarities and differences, as a group, to traditional protein coding regions, as well as challenges in defining their specific functions. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An auxin responsive CLE gene regulates shoot apical meristem development in Arabidopsis
Guo, Hongyan; Zhang, Wei; Tian, Hainan; Zheng, Kaijie; Dai, Xuemei; Liu, Shanda; Hu, Qingnan; Wang, Xianling; Liu, Bao; Wang, Shucai
2015-01-01
The plant hormone auxin regulates most, if not all, aspects of plant growth and development, including lateral root formation, organ patterning, apical dominance, and tropisms. Peptide hormones are peptides with hormone activities. Some of the functions of peptide hormones in regulating plant growth and development are similar to those of auxin; however, the relationship between auxin and peptide hormones remains largely unknown. Here we report the identification of OsCLE48, a rice (Oryza sativa) CLE (CLAVATA3/ENDOSPERM SURROUNDING REGION) gene, as an auxin response gene, and the functional characterization of OsCLE48 in Arabidopsis and rice. OsCLE48 encodes a CLE peptide hormone that is similar to Arabidopsis CLEs. RT-PCR analysis showed that OsCLE48 was induced by exogenous application of IAA (indole-3-acetic acid), a naturally occurring auxin. Expression of an integrated OsCLE48p:GUS reporter gene in transgenic Arabidopsis plants was also induced by exogenous IAA treatment. These results indicate that OsCLE48 is an auxin responsive gene. Histochemical staining showed that GUS activity was detected in all tissues and organs of the OsCLE48p:GUS transgenic Arabidopsis plants. Expression of OsCLE48 under the control of the 35S promoter in Arabidopsis inhibited shoot apical meristem development. Expression of OsCLE48 under the control of the CLV3 native regulatory elements almost completely complemented clv3-2 mutant phenotypes, suggesting that OsCLE48 is functionally similar to CLV3. On the other hand, expression of OsCLE48 under the control of the 35S promoter in Arabidopsis has little, if any, effect on root apical meristem development, and transgenic rice plants overexpressing OsCLE48 are morphologically indistinguishable from wild type plants, suggesting that the functions of some CLE peptides may not be fully conserved in Arabidopsis and rice. 
Taken together, our results showed that OsCLE48 is an auxin responsive peptide hormone gene, and it regulates shoot apical meristem development when expressed in Arabidopsis. PMID:25983737
Hsieh, J; Liu, J; Kostas, S A; Chang, C; Sternberg, P W; Fire, A
1999-11-15
Context-dependent gene silencing is used by many organisms to stably modulate gene activity for large chromosomal regions. We have used tandem array transgenes as a model substrate in a screen for Caenorhabditis elegans mutants that affect context-dependent gene silencing in somatic tissues. This screen yielded multiple alleles of a previously uncharacterized gene, designated tam-1 (for tandem-array-modifier). Loss-of-function mutations in tam-1 led to a dramatic reduction in the activity of numerous highly repeated transgenes. These effects were apparently context dependent, as nonrepetitive transgenes retained activity in a tam-1 mutant background. In addition to the dramatic alterations in transgene activity, tam-1 mutants showed modest alterations in expression of a subset of endogenous cellular genes. These effects include genetic interactions that place tam-1 into a group called the class B synMuv genes (for a Synthetic Multivulva phenotype); this family plays a negative role in the regulation of RAS pathway activity in C. elegans. Loss-of-function mutants in other members of the class-B synMuv family, including lin-35, which encodes a protein similar to the tumor suppressor Rb, exhibit a hypersilencing in somatic transgenes similar to that of tam-1 mutants. Molecular analysis reveals that tam-1 encodes a broadly expressed nuclear protein with RING finger and B-box motifs.
Comprehensive gene expression analysis of canine invasive urothelial bladder carcinoma by RNA-Seq.
Maeda, Shingo; Tomiyasu, Hirotaka; Tsuboi, Masaya; Inoue, Akiko; Ishihara, Genki; Uchikai, Takao; Chambers, James K; Uchida, Kazuyuki; Yonezawa, Tomohiro; Matsuki, Naoaki
2018-04-27
Invasive urothelial carcinoma (iUC) is a major cause of death in humans, and approximately 165,000 individuals succumb to this cancer annually worldwide. Comparative oncology using relevant animal models is necessary to improve our understanding of progression, diagnosis, and treatment of iUC. Companion canines are a preferred animal model of iUC due to spontaneous tumor development and similarity to human disease in terms of histopathology, metastatic behavior, and treatment response. However, the comprehensive molecular characterization of canine iUC is not well documented. In this study, we performed transcriptome analysis of tissue samples from canine iUC and normal bladders using an RNA sequencing (RNA-Seq) approach to identify key molecular pathways in canine iUC. Total RNA was extracted from bladder tissues of 11 dogs with iUC and five healthy dogs, and RNA-Seq was conducted. Ingenuity Pathway Analysis (IPA) was used to assign differentially expressed genes to known upstream regulators and functional networks. Differential gene expression analysis of the RNA-Seq data revealed 2531 differentially expressed genes, comprising 1007 upregulated and 1524 downregulated genes, in canine iUC. IPA revealed that the most activated upstream regulator was PTGER2 (encoding the prostaglandin E2 receptor EP2), which is consistent with the therapeutic efficacy of cyclooxygenase inhibitors in canine iUC. Similar to human iUC, canine iUC exhibited upregulated ERBB2 and downregulated TP53 pathways. Biological functions associated with cancer, cell proliferation, and leukocyte migration were predicted to be activated, while muscle functions were predicted to be inhibited, indicating a muscle-invasive tumor property. 
Our data confirmed similarities in gene expression patterns between canine and human iUC and identified potential therapeutic targets (PTGER2, ERBB2, CCND1, Vegf, and EGFR), suggesting the value of naturally occurring canine iUC as a relevant animal model for human iUC.
Guo, Chun Yu; Yin, Hui Jun; Jiang, Yue Rong; Xue, Mei; Zhang, Lu; Shi, Da Zhuo
2008-06-18
To construct the differential gene expression profile of ischemic myocardial tissue resulting from acute myocardial infarction (AMI), and to determine the biological functions of target genes. An AMI model was generated by ligation of the left anterior descending coronary artery in Wistar rats. Total RNA was extracted from normal heart tissue and from ischemic heart tissue below the ligation point 7 days after the operation. Differential gene expression profiles of the two samples were constructed using Long Serial Analysis of Gene Expression (LongSAGE). Real-time fluorescence quantitative PCR was used to verify the gene expression profile and to quantify the expression of 2 functional genes. The activities of the enzymes encoded by these functional genes were determined by histochemistry. A total of 15,966 tags were screened from the normal and the ischemic LongSAGE maps. Sequence similarities were compared using the BLAST algorithm at NCBI, and 7,665 novel tags were found. In the ischemic tissue, 142 genes were significantly changed compared with the normal tissue (P<0.05). These differentially expressed genes encode proteins that might play important roles in the pathways of oxidation and phosphorylation, ATP synthesis and glycolysis. Some of the genes identified by LongSAGE were confirmed using real-time fluorescence quantitative PCR. Two genes related to energy metabolism, COX5a and ATP5e, were screened and quantified. Both genes were down-regulated at the mRNA level, and the activities of the corresponding enzymes were decreased compared with those in the normal tissue. AMI causes a series of changes in gene expression, in which the abnormal expression of genes related to energy metabolism could be one of the molecular mechanisms of AMI. Intervening in the expression of COX5a and ATP5e may be a new approach to AMI therapy.
Castaneda, Francisco; Rosin-Steiner, Sigrid; Jung, Klaus
2006-12-21
We previously found that ethanol at millimolar level (1 mM) activates the expression of transcription factors with subsequent regulation of apoptotic genes in human hepatocellular carcinoma (HCC) HepG2 cells. However, the effect of ethanol on the expression of genes implicated in transcriptional and translational processes remains unknown. Therefore, the aim of this study was to characterize the effect of a low concentration of ethanol on gene expression profiling in HepG2 cells using cDNA microarrays, with special interest in genes with transcriptional and translational function. The gene expression pattern observed in the ethanol-treated HepG2 cells was relatively similar to that found in the untreated control cells. The pairwise comparison analysis demonstrated four significantly up-regulated genes (COBRA1, ITGB4, STAU2, and HMGN3) and one down-regulated gene (ANK3). All these genes exert their function on transcriptional and translational processes, and until now none of them had been associated with ethanol. This functional genomic analysis demonstrates the reported interaction between ethanol and ethanol-regulated genes. Moreover, it confirms the relationship between ethanol-regulated genes and various signaling pathways associated with ethanol-induced apoptosis. The data presented in this study represent an important contribution toward the understanding of the molecular mechanisms of ethanol at low concentration in HepG2 cells, an HCC-derived cell line. PMID:17211498
Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.
2013-01-01
The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. Such a test requires hundreds of networks and hence a tool to construct these networks rapidly. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae) networks. We demonstrate a dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071
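The pipeline described in the abstract above (correlate expression profiles, threshold the correlations, group connected genes into modules) can be sketched in a few lines. Note the swap: this sketch takes a fixed, user-supplied threshold, whereas RMT derives the threshold from the data, and it uses connected components as a stand-in for module detection. The use of absolute Pearson correlation, the function names and the expression profiles are all assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def coexpression_modules(profiles, threshold):
    """Link genes with |r| >= threshold and return the connected components."""
    genes = sorted(profiles)
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= threshold:
                neighbors[g].add(h)
                neighbors[h].add(g)
    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        stack, component = [g], set()
        while stack:  # depth-first walk over the thresholded network
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        modules.append(component)
    return modules

# Hypothetical expression profiles across four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
modules = coexpression_modules(profiles, threshold=0.9)
```

With these values `g1` and `g2` are almost perfectly correlated and form one module, while `g3` remains on its own.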
Zhang, Jianzhi; Dyer, Kimberly D.; Rosenberg, Helene F.
2000-01-01
The mammalian RNase A superfamily comprises a diverse array of ribonucleolytic proteins that have a variety of biochemical activities and physiological functions. Two rapidly evolving RNases of higher primates are of particular interest as they are major secretory proteins of eosinophilic leukocytes and have been found to possess anti-pathogen activities in vitro. To understand how these RNases acquired this function during evolution and to develop animal models for the study of their functions in vivo, it is necessary to investigate these genes in many species. Here, we report the sequences of 38 functional genes and 23 pseudogenes of the eosinophil-associated RNase (EAR) family from 5 rodent species. Our phylogenetic analysis of these genes showed a clear pattern of evolution by a rapid birth-and-death process and gene sorting, a process characterized by rapid gene duplication and deactivation occurring differentially among lineages. This process ultimately generates distinct or only partially overlapping inventories of the genes, even in closely related species. Positive Darwinian selection also contributed to the diversification of these EAR genes. The striking similarity between the evolutionary patterns of the EAR genes and those of the major histocompatibility complex, immunoglobulin, and T cell receptor genes stands in strong support of the hypothesis that host-defense and generation of diversity are among the primary physiological function of the rodent EARs. The discovery of a large number of divergent EARs suggests the intriguing possibility that these proteins have been specifically tailored to fight against distinct rodent pathogens. PMID:10758160
Three new members of the RNP protein family in Xenopus.
Good, P J; Rebbert, M L; Dawid, I B
1993-01-01
Many RNP proteins contain one or more copies of the RNA recognition motif (RRM) and are thought to be involved in cellular RNA metabolism. We have previously characterized in Xenopus a nervous system specific gene, nrp1, that is more similar to the hnRNP A/B proteins than to other known proteins (K. Richter, P. J. Good, and I. B. Dawid (1990), New Biol. 2, 556-565). PCR amplification with degenerate primers was used to identify additional cDNAs encoding two RRMs in Xenopus. Three previously uncharacterized genes were identified. Two genes encode hnRNP A/B proteins with two RRMs and a glycine-rich domain. One of these is the Xenopus homolog of the human A2/B1 gene; the other, named hnRNP A3, is similar to both the A1 and A2 hnRNP genes. The Xenopus hnRNP A1, A2 and A3 genes are expressed throughout development and in all adult tissues. Multiple protein isoforms for the hnRNP A2 gene are predicted that differ by the insertion of short peptide sequences in the glycine-rich domain. The third newly isolated gene, named xrp1, encodes a protein that is related by sequence to the nrp1 protein but is expressed ubiquitously. Despite the similarity to nuclear RNP proteins, both the nrp1 and xrp1 proteins are localized to the cytoplasm in the Xenopus oocyte. The xrp1 gene may have a function in all cells that is similar to that executed by nrp1 specifically within the nervous system. PMID:8451200
Studying Functions of All Yeast Genes Simultaneously
NASA Technical Reports Server (NTRS)
Stolc, Viktor; Eason, Robert G.; Poumand, Nader; Herman, Zelek S.; Davis, Ronald W.; Anthony, Kevin; Jejelowo, Olufisayo
2006-01-01
A method of studying the functions of all the genes of a given species of microorganism simultaneously has been developed in experiments on Saccharomyces cerevisiae (commonly known as baker's or brewer's yeast). It is already known that many yeast genes perform functions similar to those of corresponding human genes; therefore, by facilitating understanding of yeast genes, the method may ultimately also contribute to the knowledge needed to treat some diseases in humans. Because of the complexity of the method and the highly specialized nature of the underlying knowledge, it is possible to give only a brief and sketchy summary here. The method involves the use of unique synthetic deoxyribonucleic acid (DNA) sequences that are denoted as DNA bar codes because of their utility as molecular labels. The method also involves the disruption of gene functions through deletion of genes. Saccharomyces cerevisiae is a particularly powerful experimental system in that multiple deletion strains easily can be pooled for parallel growth assays. Individual deletion strains recently have been created for 5,918 open reading frames, representing nearly all of the estimated 6,000 genetic loci of Saccharomyces cerevisiae. Tagging of each deletion strain with one or two unique 20-nucleotide sequences enables identification of genes affected by specific growth conditions, without prior knowledge of gene functions. Hybridization of bar-code DNA to oligonucleotide arrays can be used to measure the growth rate of each strain over several cell-division generations. The growth rate thus measured serves as an index of the fitness of the strain.
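One way to turn bar-code hybridization signals into a growth-rate index, as described in the abstract above, is sketched here. It assumes that the array signal for a strain's bar code is proportional to that strain's relative abundance in the pool, and it takes the least-squares slope of log2 signal against generation number as the fitness index. The function name and the signal values are hypothetical, and normalization between arrays is omitted.

```python
import math

def fitness_slope(generations, signals):
    """Least-squares slope of log2(signal) versus generation number.
    A slope near 0 means the strain keeps pace with the pool; a negative
    slope means it is being depleted."""
    logs = [math.log2(s) for s in signals]
    n = len(generations)
    mean_g = sum(generations) / n
    mean_l = sum(logs) / n
    numerator = sum((g - mean_g) * (l - mean_l) for g, l in zip(generations, logs))
    denominator = sum((g - mean_g) ** 2 for g in generations)
    return numerator / denominator

# Hypothetical bar-code signals sampled every five generations.
generations = [0, 5, 10, 15]
neutral_strain = [1000.0, 1000.0, 1000.0, 1000.0]
sensitive_strain = [1000.0, 500.0, 250.0, 125.0]  # halves every five generations

neutral_slope = fitness_slope(generations, neutral_strain)
sensitive_slope = fitness_slope(generations, sensitive_strain)
```

The sensitive strain loses one log2 unit per five generations, so its slope is -0.2 per generation, while the neutral strain has a slope of 0.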
The what, where, how and why of gene ontology—a primer for bioinformaticians
du Plessis, Louis; Škunca, Nives
2011-01-01
With high-throughput technologies providing vast amounts of data, it has become more important to provide systematic, quality annotations. The Gene Ontology (GO) project is the largest resource for cataloguing gene function. Nonetheless, its use is not yet ubiquitous and is still fraught with pitfalls. In this review, we provide a short primer to the GO for bioinformaticians. We summarize important aspects of the structure of the ontology, describe sources and types of functional annotations, survey measures of GO annotation similarity, review typical uses of GO and discuss other important considerations pertaining to the use of GO in bioinformatics applications. PMID:21330331
González-Escalona, Narjol; Blackstone, George M.; DePaola, Angelo
2006-01-01
A Vibrio strain isolated from Alaskan oysters and classified by its biochemical characteristics as Vibrio alginolyticus possessed a thermostable direct hemolysin-related hemolysin (trh) gene previously reported only in Vibrio parahaemolyticus. This trh-like gene was cloned and sequenced and was 98% identical to the trh2 gene of V. parahaemolyticus. This gene seems to be functional since it was transcriptionally active in early-stationary-phase growing cells. To our knowledge, this is the first report of V. alginolyticus possessing a trh gene. PMID:17056701
A curated catalog of canine and equine keratin genes
Pujar, Shashikant; McGarvey, Kelly M.; Welle, Monika; Galichet, Arnaud; Müller, Eliane J.; Pruitt, Kim D.; Leeb, Tosso
2017-01-01
Keratins represent a large protein family with essential structural and functional roles in epithelial cells of skin, hair follicles, and other organs. During evolution the genes encoding keratins have undergone multiple rounds of duplication and humans have two clusters with a total of 55 functional keratin genes in their genomes. Due to the high similarity between different keratin paralogs and species-specific differences in gene content, the currently available keratin gene annotation in species with draft genome assemblies such as dog and horse is still imperfect. We compared the National Center for Biotechnology Information (NCBI) (dog annotation release 103, horse annotation release 101) and Ensembl (release 87) gene predictions for the canine and equine keratin gene clusters to RNA-seq data that were generated from adult skin of five dogs and two horses and from adult hair follicle tissue of one dog. Taking into consideration the knowledge on the conserved exon/intron structure of keratin genes, we annotated 61 putatively functional keratin genes in both the dog and horse, respectively. Subsequently, curators in the RefSeq group at NCBI reviewed their annotation of keratin genes in the dog and horse genomes (Annotation Release 104 and Annotation Release 102, respectively) and updated annotation and gene nomenclature of several keratin genes. The updates are now available in the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene). PMID:28846680
Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology
2010-01-01
Background In the literature, there are many algorithmic approaches for identifying functional modules in protein-protein interaction (PPI) networks. Because large-scale interaction data on multiple organisms keep accumulating and many interactions are not recorded in the existing PPI databases, there is still an urgent need to design novel computational techniques that can analyze interaction data sets correctly and scalably. Indeed, there are a number of large-scale biological data sets providing indirect evidence for protein-protein interaction relationships. Results The main aim of this paper is to present a prior knowledge based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology. A higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study (i) how to encode the functional pairs into the existing PPI networks; and (ii) how to use these functional pairs as pairwise constraints to supervise the existing functional module identification algorithms. A topology-based modularity metric and the complex annotations in MIPS are used to evaluate the functional modules identified by these two approaches. Conclusions The experimental results on Yeast PPI networks and GO have shown that the prior knowledge based learning methods perform better than the existing algorithms. PMID:21172053
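Approach (i) in the abstract above, encoding functionally similar pairs into an existing PPI network, can be sketched as follows. The similarity measure used here (Jaccard overlap of each protein's GO terms plus all their ancestors) is an assumption chosen for simplicity; the abstract does not say which GO similarity measure was used. The toy ontology, protein names, function names and the `min_similarity` threshold are all hypothetical.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology DAG."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen

def go_similarity(terms_a, terms_b, parents):
    """Jaccard overlap of the ancestor closures of two annotation sets."""
    closure_a = set().union(*(ancestors(t, parents) for t in terms_a))
    closure_b = set().union(*(ancestors(t, parents) for t in terms_b))
    union = closure_a | closure_b
    return len(closure_a & closure_b) / len(union) if union else 0.0

def add_functional_pairs(ppi_edges, annotations, parents, min_similarity):
    """Return the PPI edge set augmented with protein pairs that are similar in GO."""
    augmented = {frozenset(edge) for edge in ppi_edges}
    proteins = sorted(annotations)
    for i, p in enumerate(proteins):
        for q in proteins[i + 1:]:
            if go_similarity(annotations[p], annotations[q], parents) >= min_similarity:
                augmented.add(frozenset((p, q)))
    return augmented

# Toy ontology (child -> parents) and annotations; none of these are real GO IDs.
parents = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}
annotations = {"P1": {"dna_binding"}, "P2": {"rna_binding"}, "P3": {"kinase"}}
ppi_edges = [("P2", "P3")]

augmented = add_functional_pairs(ppi_edges, annotations, parents, min_similarity=0.5)
```

In this toy case `P1` and `P2` share two of four terms in their closures (similarity 0.5), so the pair is added to the network alongside the observed `P2`-`P3` interaction. The same pairs could instead be handed to a clustering algorithm as must-link constraints, which corresponds to approach (ii).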
The human oxytocin gene promoter is regulated by estrogens.
Richard, S; Zingg, H H
1990-04-15
Gonadal steroids affect brain function primarily by altering the expression of specific genes, yet the specific mechanisms by which neuronal target genes undergo such regulation are unknown. Recent evidence suggests that the expression of the neuropeptide gene for oxytocin (OT) is modulated by estrogens. We therefore examined the possibility that this regulation occurred via a direct interaction of the estrogen-receptor complex with cis-acting elements flanking the OT gene. DNA-mediated gene transfer experiments were performed using Neuro-2a neuroblastoma cells and chimeric plasmids containing portions of the human OT gene 5'-flanking region linked to the chloramphenicol acetyltransferase gene. We identified a 19-base pair region located at -164 to -146 upstream of the transcription start site which is capable of conferring estrogen responsiveness to the homologous as well as to a heterologous promoter. The hormonal response is strictly dependent on the presence of intracellular estrogen receptors, since estrogen-induced stimulation occurred only in Neuro-2a cells co-transfected with an expression vector for the human estrogen receptor. The identified region contains a novel imperfect palindrome (GGTGACCTTGACC) with sequence similarity to other estrogen response elements (EREs). To define cis-acting elements that function in synergism with the ERE, sequences 3' to the ERE were deleted, including the CCAAT box, two additional motifs corresponding to the right half of the ERE palindrome (TGACC), as well as a CTGCTAA heptamer similar to the "elegans box" found in Caenorhabditis elegans. Interestingly, optimal function of the identified ERE was fully independent of these elements and only required a short promoter region (-49 to +36). Our studies define a molecular mechanism by which estrogens can directly modulate OT gene expression. 
However, only a subset of OT neurons are capable of binding estrogens; therefore, direct action of estrogens on the OT gene may be restricted to a subpopulation of OT neurons.
McClelland, Michael; Sanderson, Kenneth E; Clifton, Sandra W; Latreille, Phil; Porwollik, Steffen; Sabo, Aniko; Meyer, Rekha; Bieri, Tamberlyn; Ozersky, Phil; McLellan, Michael; Harkins, C Richard; Wang, Chunyan; Nguyen, Christine; Berghoff, Amy; Elliott, Glendoria; Kohlberg, Sara; Strong, Cindy; Du, Feiyu; Carter, Jason; Kremizki, Colin; Layman, Dan; Leonard, Shawn; Sun, Hui; Fulton, Lucinda; Nash, William; Miner, Tracie; Minx, Patrick; Delehaunty, Kim; Fronick, Catrina; Magrini, Vincent; Nhan, Michael; Warren, Wesley; Florea, Liliana; Spieth, John; Wilson, Richard K
2004-12-01
Salmonella enterica serovars often have a broad host range, and some cause both gastrointestinal and systemic disease. But the serovars Paratyphi A and Typhi are restricted to humans and cause only systemic disease. It has been estimated that Typhi arose in the last few thousand years. The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin. Both genomes have independently accumulated many pseudogenes among their approximately 4,400 protein coding sequences: 173 in Paratyphi A and approximately 210 in Typhi. The recent convergence of these two similar genomes on a similar phenotype is subtly reflected in their genotypes: only 30 genes are degraded in both serovars. Nevertheless, these 30 genes include three known to be important in gastroenteritis, which does not occur in these serovars, and four for Salmonella-translocated effectors, which are normally secreted into host cells to subvert host functions. Loss of function also occurs by mutation in different genes in the same pathway (e.g., in chemotaxis and in the production of fimbriae).
Dong, Tao; He, Jing; Wang, Shiqing; Wang, Lianzhang; Cheng, Yuqi; Zhong, Yi
2016-01-01
The etiology of autism is complicated because it involves the effects of variants of several hundred risk genes along with the contribution of environmental factors. Therefore, it has been challenging to identify the causal paths that lead to the core autistic symptoms such as social deficits, repetitive behaviors, and behavioral inflexibility. As an alternative approach, extensive efforts have been devoted to identifying the convergence of the targets and functions of the autism-risk genes to facilitate mapping out causal paths. In this study, we used a reversal-learning task to measure behavioral flexibility in Drosophila and determined the effects of loss-of-function mutations in multiple autism-risk gene homologs in flies. Mutations of five autism-risk genes with diversified molecular functions all led to a similar phenotype of behavioral inflexibility, indicated by impaired reversal-learning. These reversal-learning defects resulted from the inability to forget or, more specifically, the inability to activate Rac1 (Ras-related C3 botulinum toxin substrate 1)-dependent forgetting. Thus, behavior-evoked activation of Rac1-dependent forgetting has a converging function for autism-risk genes. PMID:27335463
Wang, Jia-Hong; Zhao, Ling-Feng; Lin, Pei; Su, Xiao-Rong; Chen, Shi-Jun; Huang, Li-Qiang; Wang, Hua-Feng; Zhang, Hai; Hu, Zhen-Fu; Yao, Kai-Tai; Huang, Zhong-Xi
2014-09-01
Identifying biological functions and molecular networks in a gene list and how the genes may relate to various topics is of considerable value to biomedical researchers. Here, we present a web-based text-mining server, GenCLiP 2.0, which can analyze human genes with enriched keywords and molecular interactions. Compared with other similar tools, GenCLiP 2.0 offers two unique features: (i) analysis of gene functions with free terms (i.e. any terms in the literature) generated by literature mining or provided by the user and (ii) accurate identification and integration of comprehensive molecular interactions from Medline abstracts, to construct molecular networks and subnetworks related to the free terms. GenCLiP 2.0 is available at http://ci.smu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Haselier, André; Akbari, Hana; Weth, Agnes; Baumgartner, Werner; Frentzen, Margrit
2010-01-01
Cytidinediphosphate diacylglycerol synthase (CDS) catalyzes the formation of cytidinediphosphate diacylglycerol, an essential precursor of anionic phosphoglycerolipids like phosphatidylglycerol or -inositol. In plant cells, CDS isozymes are located in plastids, mitochondria, and microsomes. Here, we show that these isozymes are encoded by five genes in Arabidopsis (Arabidopsis thaliana). Alternative translation initiation or alternative splicing of CDS2 and CDS4 transcripts can result in up to 10 isoforms. Most of the cDNAs encoding the various plant isoforms were functionally expressed in yeast and rescued the nonviable phenotype of the mutant strain lacking CDS activity. The closely related genes CDS4 and CDS5 were found to encode plastidial isozymes with similar catalytic properties. Inactivation of both genes was required to obtain Arabidopsis mutant lines with a visible phenotype, suggesting that the genes have redundant functions. Analysis of these Arabidopsis mutants provided further independent evidence for the importance of plastidial phosphatidylglycerol for structure and function of thylakoid membranes and, hence, for photoautotrophic growth. PMID:20442275
Senthivel, Vivek Raj; Sturrock, Marc; Piedrafita, Gabriel; Isalan, Mark
2016-12-16
Nonlinear responses to signals are widespread natural phenomena that affect various cellular processes. Nonlinearity can be a desirable characteristic for engineering living organisms because it can lead to more switch-like responses, similar to those underlying the wiring in electronics. Steeper functions are described as ultrasensitive, and can be applied in synthetic biology by using various techniques including receptor decoys, multiple co-operative binding sites, and sequential positive feedbacks. Here, we explore the inherent non-linearity of a biological signaling system to identify functions that can potentially be exploited using cell genome engineering. For this, we performed genome-wide transcription profiling to identify genes with ultrasensitive response functions to Hepatocyte Growth Factor (HGF). We identified 3,527 genes that react to increasing concentrations of HGF, in Madin-Darby canine kidney (MDCK) cells, grown as cysts in 3D collagen cell culture. By fitting a generic Hill function to the dose-responses of these genes we obtained a measure of the ultrasensitivity of HGF-responsive genes, identifying a subset with higher apparent Hill coefficients (e.g. MMP1, TIMP1, SNORD75, SNORD86 and ERRFI1). The regulatory regions of these genes are potential candidates for future engineering of synthetic mammalian gene circuits requiring nonlinear responses to HGF signalling.
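The Hill-function fitting mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a normalized response of the form y = x^n / (k^n + x^n) and estimates the parameters from the linearized Hill plot; the function names, the synthetic doses, and the least-squares approach are assumptions made for illustration and are not taken from the study.

```python
import math


def hill(x, k, n):
    """Normalized Hill response for dose x, half-maximal dose k, coefficient n."""
    return x**n / (k**n + x**n)


def fit_hill(doses, responses):
    """Estimate (k, n) by least squares on log(y / (1 - y)) = n*log(x) - n*log(k).

    Responses must lie strictly between 0 and 1 (already normalized).
    """
    xs = [math.log(x) for x in doses]
    ys = [math.log(y / (1.0 - y)) for y in responses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope of the Hill plot
    intercept = mean_y - n * mean_x    # equals -n * log(k)
    k = math.exp(-intercept / n)
    return k, n


# Synthetic, noise-free dose-response data (hypothetical values).
doses = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
responses = [hill(x, k=10.0, n=3.0) for x in doses]
k_hat, n_hat = fit_hill(doses, responses)
```

A coefficient n well above 1 corresponds to the steeper, more switch-like responses that the abstract calls ultrasensitive; with real, noisy data a nonlinear fit would usually be preferable to this linearization.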
FamNet: A Framework to Identify Multiplied Modules Driving Pathway Expansion in Plants
Tohge, Takayuki; Klie, Sebastian; Fernie, Alisdair R.
2016-01-01
Gene duplications generate new genes that can acquire similar but often diversified functions. Recent studies of gene coexpression networks have indicated that, not only genes, but also pathways can be multiplied and diversified to perform related functions in different parts of an organism. Identification of such diversified pathways, or modules, is needed to expand our knowledge of biological processes in plants and to understand how biological functions evolve. However, systematic explorations of modules remain scarce, and no user-friendly platform to identify them exists. We have established a statistical framework to identify modules and show that approximately one-third of the genes of a plant’s genome participate in hundreds of multiplied modules. Using this framework as a basis, we implemented a platform that can explore and visualize multiplied modules in coexpression networks of eight plant species. To validate the usefulness of the platform, we identified and functionally characterized pollen- and root-specific cell wall modules that multiplied to confer tip growth in pollen tubes and root hairs, respectively. Furthermore, we identified multiplied modules involved in secondary metabolite synthesis and corroborated them by metabolite profiling of tobacco (Nicotiana tabacum) tissues. The interactive platform, referred to as FamNet, is available at http://www.gene2function.de/famnet.html. PMID:26754669
The Goddard and Saturn Genes Are Essential for Drosophila Male Fertility and May Have Arisen De Novo
Gubala, Anna M.; Schmitz, Jonathan F.; Kearns, Michael J.; Vinh, Tery T.; Bornberg-Bauer, Erich; Wolfner, Mariana F.
2017-01-01
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. PMID:28104747
Zhu, Jia-Hong; Xu, Jing; Chang, Wen-Jun; Zhang, Zhi-Li
2015-01-01
Ethylene is an important factor that stimulates Hevea brasiliensis to produce natural rubber. 1-Aminocyclopropane-1-carboxylic acid synthase (ACS) is a rate-limiting enzyme in ethylene biosynthesis. However, knowledge of the ACS gene family of H. brasiliensis is limited. In this study, nine ACS-like genes were identified in H. brasiliensis. Sequence and phylogenetic analysis results confirmed that seven isozymes (HbACS1–7) of these nine ACS-like genes were similar to ACS isozymes with ACS activity in other plants. Expression analysis results showed that seven ACS genes were differentially expressed in roots, bark, flowers, and leaves of H. brasiliensis. However, no or low ACS gene expression was detected in the latex of H. brasiliensis. Moreover, seven genes were differentially up-regulated by ethylene treatment. These results provided relevant information to help determine the functions of the ACS gene in H. brasiliensis, particularly the functions in regulating ethylene stimulation of latex production. PMID:25690030
Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843
Kaneko, Takakazu; Nakajima, Nobuyoshi; Okamoto, Shinobu; Suzuki, Iwane; Tanabe, Yuuhiko; Tamaoki, Masanori; Nakamura, Yasukazu; Kasai, Fumie; Watanabe, Akiko; Kawashima, Kumiko; Kishida, Yoshie; Ono, Akiko; Shimizu, Yoshimi; Takahashi, Chika; Minami, Chiharu; Fujishiro, Tsunakazu; Kohara, Mitsuyo; Katoh, Midori; Nakazaki, Naomi; Nakayama, Shinobu; Yamada, Manabu; Tabata, Satoshi; Watanabe, Makoto M.
2007-01-01
Abstract The nucleotide sequence of the complete genome of a cyanobacterium, Microcystis aeruginosa NIES-843, was determined. The genome of M. aeruginosa is a single, circular chromosome of 5 842 795 base pairs (bp) in length, with an average GC content of 42.3%. The chromosome comprises 6312 putative protein-encoding genes, two sets of rRNA genes, 42 tRNA genes representing 41 tRNA species, and genes for tmRNA, the B subunit of RNase P, SRP RNA, and 6Sa RNA. Forty-five percent of the putative protein-encoding sequences showed sequence similarity to genes of known function, 32% were similar to hypothetical genes, and the remaining 23% had no apparent similarity to reported genes. A total of 688 kb of the genome, equivalent to 11.8% of the entire genome, were composed of both insertion sequences and miniature inverted-repeat transposable elements. This is indicative of a plasticity of the M. aeruginosa genome, through a mechanism that involves homologous recombination mediated by repetitive DNA elements. In addition to known gene clusters related to the synthesis of microcystin and cyanopeptolin, novel gene clusters that may be involved in the synthesis and modification of toxic small polypeptides were identified. Compared with other cyanobacteria, a relatively small number of genes for two component systems and a large number of genes for restriction-modification systems were notable characteristics of the M. aeruginosa genome. PMID:18192279
Martiny, Adam C.; Martiny, Jennifer B. H.; Weihe, Claudia; Field, Andrew; Ellis, Julie C.
2011-01-01
Wildlife may facilitate the spread of antibiotic resistance (AR) between human-dominated habitats and the surrounding environment. Here, we use functional metagenomics to survey the diversity and genomic context of AR genes in gulls. Using this approach, we found a variety of AR genes not previously detected in gulls and wildlife, including class A and C β-lactamases as well as six tetracycline resistance gene types. An analysis of the flanking sequences indicates that most of these genes are present in Enterobacteriaceae and various Gram-positive bacteria. In addition to finding known gene types, we detected 31 previously undescribed AR genes. These undescribed genes include one most similar to an uncharacterized gene in Verrucomicrobium and another to a putative DNA repair protein in Lactobacillus. Overall, the study more than doubled the number of clinically relevant AR gene types known to be carried by gulls or by wildlife in general. Together with the propensity of gulls to visit human-dominated habitats, this high diversity of AR gene types suggests that gulls could facilitate the spread of AR. PMID:22347872
Lavore, Andrés; Pagola, Lucía; Esponda-Behrens, Natalia; Rivera-Pomar, Rolando
2012-01-01
The segmentation process in insects depends on a hierarchical cascade of gene activity. The first effectors downstream of the maternal activation are the gap genes, which divide the embryo into broad fields. We discovered a sequence corresponding to the leucine-zipper domain of the orthologue of the gene giant (Rp-gt) in traces from the genome of Rhodnius prolixus, a hemipteran with intermediate germ-band development. We cloned the Rp-gt gene from a normalized cDNA library and characterized its expression and function. Bioinformatic analysis of 12.5 kbp of genomic sequence containing the Rp-gt transcriptional unit shows a cluster of bona fide regulatory binding sites, which is similar in location and structure to the predicted posterior expression domain of the Drosophila orthologue. Rp-gt is expressed in ovaries and maternally supplied in the early embryo. The maternal contribution forms a gradient of scattered patches of mRNA in the preblastoderm embryo. Zygotic Rp-gt is expressed in two domains that after germ band extension are restricted to the head and the posterior growth zone. Parental RNAi shows that Rp-gt is required for proper head and abdomen formation. The head lacks mandibulary and maxillary appendages and shows reduced clypeus-labrum, while the abdomen lacks anterior segments. We conclude that Rp-gt is a gap gene on the head and abdomen and, in addition, has a function in patterning the anterior head capsule, suggesting that the function of gt in hemipterans is more similar to that in dipterans than expected. Copyright © 2011. Published by Elsevier Inc.
Recombineering using RecET from Pseudomonas syringae
USDA-ARS's Scientific Manuscript database
Here we report the identification of functions that promote genomic recombination of linear DNA introduced into Pseudomonas cells by electroporation. The genes encoding these functions were identified in Pseudomonas syringae pv. syringae B728a based on similarity to the lambda Red Exo/Beta and RecE...
Neufeld, Stanley J.; Wang, Fan; Cobb, John
2014-01-01
The growth and development of the vertebrate limb relies on homeobox genes of the Hox and Shox families, with their independent mutation often giving dose-dependent effects. Here we investigate whether Shox2 and Hox genes function together during mouse limb development by modulating their relative dosage and examining the limb for nonadditive effects on growth. Using double mRNA fluorescence in situ hybridization (FISH) in single embryos, we first show that Shox2 and Hox genes have associated spatial expression dynamics, with Shox2 expression restricted to the proximal limb along with Hoxd9 and Hoxa11 expression, juxtaposing the distal expression of Hoxa13 and Hoxd13. By generating mice with all possible dosage combinations of mutant Shox2 alleles and HoxA/D cluster deletions, we then show that their coordinated proximal limb expression is critical to generate normally proportioned limb segments. These epistatic interactions tune limb length, where Shox2 underexpression enhances, and Shox2 overexpression suppresses, Hox-mutant phenotypes. Disruption of either Shox2 or Hox genes leads to a similar reduction in Runx2 expression in the developing humerus, suggesting their concerted action drives cartilage maturation during normal development. While we furthermore provide evidence that Hox gene function influences Shox2 expression, this regulation is limited in extent and is unlikely on its own to be a major explanation for their genetic interaction. Given the similar effect of human SHOX mutations on regional limb growth, Shox and Hox genes may generally function as genetic interaction partners during the growth and development of the proximal vertebrate limb. PMID:25217052
Neufeld, Stanley J; Wang, Fan; Cobb, John
2014-11-01
The growth and development of the vertebrate limb relies on homeobox genes of the Hox and Shox families, with their independent mutation often giving dose-dependent effects. Here we investigate whether Shox2 and Hox genes function together during mouse limb development by modulating their relative dosage and examining the limb for nonadditive effects on growth. Using double mRNA fluorescence in situ hybridization (FISH) in single embryos, we first show that Shox2 and Hox genes have associated spatial expression dynamics, with Shox2 expression restricted to the proximal limb along with Hoxd9 and Hoxa11 expression, juxtaposing the distal expression of Hoxa13 and Hoxd13. By generating mice with all possible dosage combinations of mutant Shox2 alleles and HoxA/D cluster deletions, we then show that their coordinated proximal limb expression is critical to generate normally proportioned limb segments. These epistatic interactions tune limb length, where Shox2 underexpression enhances, and Shox2 overexpression suppresses, Hox-mutant phenotypes. Disruption of either Shox2 or Hox genes leads to a similar reduction in Runx2 expression in the developing humerus, suggesting their concerted action drives cartilage maturation during normal development. While we furthermore provide evidence that Hox gene function influences Shox2 expression, this regulation is limited in extent and is unlikely on its own to be a major explanation for their genetic interaction. Given the similar effect of human SHOX mutations on regional limb growth, Shox and Hox genes may generally function as genetic interaction partners during the growth and development of the proximal vertebrate limb. Copyright © 2014 by the Genetics Society of America.
Moodley, Yoshan; Uhr, Markus; Stamer, Christiana; Vauterin, Marc; Suerbaum, Sebastian; Achtman, Mark
2010-01-01
The Helicobacter pylori cag pathogenicity island (cagPAI) encodes a type IV secretion system. Humans infected with cagPAI–carrying H. pylori are at increased risk for sequelae such as gastric cancer. Housekeeping genes in H. pylori show considerable genetic diversity; but the diversity of virulence factors such as the cagPAI, which transports the bacterial oncogene CagA into host cells, has not been systematically investigated. Here we compared the complete cagPAI sequences for 38 representative isolates from all known H. pylori biogeographic populations. Their gene content and gene order were highly conserved. The phylogeny of most cagPAI genes was similar to that of housekeeping genes, indicating that the cagPAI was probably acquired only once by H. pylori, and its genetic diversity reflects the isolation by distance that has shaped this bacterial species since modern humans migrated out of Africa. Most isolates induced IL-8 release in gastric epithelial cells, indicating that the function of the Cag secretion system has been conserved despite some genetic rearrangements. More than one third of cagPAI genes, in particular those encoding cell-surface exposed proteins, showed signatures of diversifying (Darwinian) selection at more than 5% of codons. Several unknown gene products predicted to be under Darwinian selection are also likely to be secreted proteins (e.g. HP0522, HP0535). One of these, HP0535, is predicted to code for either a new secreted candidate effector protein or a protein which interacts with CagA because it contains two genetic lineages, similar to cagA. Our study provides a resource that can guide future research on the biological roles and host interactions of cagPAI proteins, including several whose function is still unknown. PMID:20808891
Spiegel, S; Chiu, A; James, A S; Jentsch, J D; Karlsgodt, K H
2015-11-01
Numerous studies have implicated DTNBP1, the gene encoding dystrobrevin-binding protein or dysbindin, as a candidate risk gene for schizophrenia, though this relationship remains somewhat controversial. Variation in dysbindin, and its location on chromosome 6p, has been associated with cognitive processes, including those relying on a complex system of glutamatergic and dopaminergic interactions. Dysbindin is one of the seven protein subunits that comprise the biogenesis of lysosome-related organelles complex 1 (BLOC-1). Dysbindin protein levels are lower in mice with null mutations in pallidin, another gene in the BLOC-1, and pallidin levels are lower in mice with null mutations in the dysbindin gene, suggesting that multiple subunit proteins must be present to form a functional oligomeric complex. Furthermore, pallidin and dysbindin have similar distribution patterns in a mouse and human brain. Here, we investigated whether the apparent correspondence of pallid and dysbindin at the level of gene expression is also found at the level of behavior. Hypothesizing a mutation leading to underexpression of either of these proteins should show similar phenotypic effects, we studied recognition memory in both strains using the novel object recognition task (NORT) and social novelty recognition task (SNRT). We found that mice with a null mutation in either gene are impaired on SNRT and NORT when compared with wild-type controls. These results support the conclusion that deficits consistent with recognition memory impairment, a cognitive function that is impaired in schizophrenia, result from either pallidin or dysbindin mutations, possibly through degradation of BLOC-1 expression and/or function. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Olbermann, Patrick; Josenhans, Christine; Moodley, Yoshan; Uhr, Markus; Stamer, Christiana; Vauterin, Marc; Suerbaum, Sebastian; Achtman, Mark; Linz, Bodo
2010-08-19
The Helicobacter pylori cag pathogenicity island (cagPAI) encodes a type IV secretion system. Humans infected with cagPAI-carrying H. pylori are at increased risk for sequelae such as gastric cancer. Housekeeping genes in H. pylori show considerable genetic diversity; but the diversity of virulence factors such as the cagPAI, which transports the bacterial oncogene CagA into host cells, has not been systematically investigated. Here we compared the complete cagPAI sequences for 38 representative isolates from all known H. pylori biogeographic populations. Their gene content and gene order were highly conserved. The phylogeny of most cagPAI genes was similar to that of housekeeping genes, indicating that the cagPAI was probably acquired only once by H. pylori, and its genetic diversity reflects the isolation by distance that has shaped this bacterial species since modern humans migrated out of Africa. Most isolates induced IL-8 release in gastric epithelial cells, indicating that the function of the Cag secretion system has been conserved despite some genetic rearrangements. More than one third of cagPAI genes, in particular those encoding cell-surface exposed proteins, showed signatures of diversifying (Darwinian) selection at more than 5% of codons. Several unknown gene products predicted to be under Darwinian selection are also likely to be secreted proteins (e.g. HP0522, HP0535). One of these, HP0535, is predicted to code for either a new secreted candidate effector protein or a protein which interacts with CagA because it contains two genetic lineages, similar to cagA. Our study provides a resource that can guide future research on the biological roles and host interactions of cagPAI proteins, including several whose function is still unknown.
oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes
Ho Sui, Shannan J.; Mortimer, James R.; Arenillas, David J.; Brumm, Jochen; Walsh, Christopher J.; Kennedy, Brian P.; Wasserman, Wyeth W.
2005-01-01
Targeted transcript profiling studies can identify sets of co-expressed genes; however, identification of the underlying functional mechanism(s) is a significant challenge. Established methods for the analysis of gene annotations, particularly those based on the Gene Ontology, can identify functional linkages between genes. Similar methods for the identification of over-represented transcription factor binding sites (TFBSs) have been successful in yeast, but extension to human genomics has largely proved ineffective. Creation of a system for the efficient identification of common regulatory mechanisms in a subset of co-expressed human genes promises to break a roadblock in functional genomics research. We have developed an integrated system that searches for evidence of co-regulation by one or more transcription factors (TFs). oPOSSUM combines a pre-computed database of conserved TFBSs in human and mouse promoters with statistical methods for identification of sites over-represented in a set of co-expressed genes. The algorithm successfully identified mediating TFs in control sets of tissue-specific genes and in sets of co-expressed genes from three transcript profiling studies. Simulation studies indicate that oPOSSUM produces few false positives using empirically defined thresholds and can tolerate up to 50% noise in a set of co-expressed genes. PMID:15933209
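The idea of testing whether a binding site is over-represented in a co-expressed gene set can be sketched with a one-sided hypergeometric tail probability. This is a generic illustration under stated assumptions, not a description of oPOSSUM's own statistics; the function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(hits, sample_size, carriers, population):
    """P(X >= hits) when sample_size genes are drawn from population genes,
    of which carriers contain the binding site (one-sided over-representation)."""
    upper = min(sample_size, carriers)
    total = comb(population, sample_size)
    favorable = sum(
        comb(carriers, i) * comb(population - carriers, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return favorable / total


# Hypothetical counts: 20 background genes, 5 carry the site,
# and 3 of the 4 co-expressed genes carry it.
p_value = hypergeom_tail(hits=3, sample_size=4, carriers=5, population=20)
```

A small tail probability suggests that the site occurs in the co-expressed set more often than chance alone would predict; in practice the result would also need correction for testing many binding-site profiles.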
Presence of a novel exon 2E encoding a putative transmembrane protein in human IL-33 gene.
Tominaga, Shin-ichi; Hayakawa, Morisada; Tsuda, Hidetoshi; Ohta, Satoshi; Yanagisawa, Ken
2013-01-18
Interleukin-33 (IL-33) is a dual-function molecule that regulates gene expression in nuclei and, as a cytokine, conveys proinflammatory signals from outside of cells via its specific receptor ST2L. There are still a lot of questions about localization and processing of IL-33 gene products. In the course of re-evaluating human IL-33 gene, we found distinct promoter usage depending on the cell type, similar to the case in the ST2 gene. Furthermore, we found a novel exon 2E in the conventional intron 2 whose open reading frame corresponded to a transmembrane protein of 131 amino acids. Dependence of exon 2E expression on differentiation of HUVEC cells is of great interest in relation to human IL-33 function. Copyright © 2012 Elsevier Inc. All rights reserved.
The clc Element of Pseudomonas sp. Strain B13, a Genomic Island with Various Catabolic Properties
Gaillard, Muriel; Vallaeys, Tatiana; Vorhölter, Frank Jörg; Minoia, Marco; Werlen, Christoph; Sentchilo, Vladimir; Pühler, Alfred; van der Meer, Jan Roelof
2006-01-01
Pseudomonas sp. strain B13 is a bacterium known to degrade chloroaromatic compounds. The properties to use 3- and 4-chlorocatechol are determined by a self-transferable DNA element, the clc element, which normally resides at two locations in the cell's chromosome. Here we report the complete nucleotide sequence of the clc element, demonstrating the unique catabolic properties while showing its relatedness to genomic islands and integrative and conjugative elements rather than to other known catabolic plasmids. As far as catabolic functions, the clc element harbored, in addition to the genes for chlorocatechol degradation, a complete functional operon for 2-aminophenol degradation and genes for a putative aromatic compound transport protein and for a multicomponent aromatic ring dioxygenase similar to anthranilate hydroxylase. The genes for catabolic functions were inducible under various conditions, suggesting a network of catabolic pathway induction. For about half of the open reading frames (ORFs) on the clc element, no clear functional prediction could be given, although some indications were found for functions that were similar to plasmid conjugation. The region in which these ORFs were situated displayed a high overall conservation of nucleotide sequence and gene order to genomic regions in other recently completed bacterial genomes or to other genomic islands. Most notably, except for two discrete regions, the clc element was almost 100% identical over the whole length to a chromosomal region in Burkholderia xenovorans LB400. This indicates the dynamic evolution of this type of element and the continued transition between elements with a more pathogenic character and those with catabolic properties. PMID:16484212
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria.
Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song
2012-11-16
Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified in mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains limited. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Overall, 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot springs, but scarce in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members, and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical thioredoxin motif (CXXC) was detected in protein sequences from the PRX-like subfamily. The phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of subfamilies such as PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt catalytic mechanisms similar to those of eukaryotes. All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria
2012-01-01
Background: Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified in mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains limited. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results: Overall, 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot springs, but scarce in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members, and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical thioredoxin motif (CXXC) was detected in protein sequences from the PRX-like subfamily. The phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions: The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of subfamilies such as PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt catalytic mechanisms similar to those of eukaryotes. All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
Mapping Gene Associations in Human Mitochondria using Clinical Disease Phenotypes
Scharfe, Curt; Lu, Henry Horng-Shing; Neuenburg, Jutta K.; Allen, Edward A.; Li, Guan-Cheng; Klopstock, Thomas; Cowan, Tina M.; Enns, Gregory M.; Davis, Ronald W.
2009-01-01
Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes. PMID:19390613
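The abstract above describes a quantitative association value based on the phenotypic features two genes share, but does not state the formula. As a minimal sketch, assuming a Jaccard index over sets of phenotype terms stands in for that value (the gene names, the terms, and the choice of Jaccard are illustrative assumptions, not the authors' method):

```python
from itertools import combinations


def phenotype_similarity(features_a, features_b):
    """Jaccard index of two phenotype term sets (0 = disjoint, 1 = identical)."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # two unannotated genes carry no evidence of similarity
    return len(a & b) / len(a | b)


# Hypothetical annotations: gene names and MeSH-style terms are made up.
gene_phenotypes = {
    "GENE_A": {"Ataxia", "Optic Atrophy", "Muscle Weakness"},
    "GENE_B": {"Optic Atrophy", "Muscle Weakness", "Hearing Loss"},
    "GENE_C": {"Cardiomyopathy"},
}


def pairwise_similarities(annotations):
    """Score every unordered gene pair, keyed by (gene1, gene2) in sorted order."""
    return {
        (g1, g2): phenotype_similarity(annotations[g1], annotations[g2])
        for g1, g2 in combinations(sorted(annotations), 2)
    }


scores = pairwise_similarities(gene_phenotypes)
```

A hierarchical ontology such as MeSH would normally also credit partially matching terms (shared ancestors); this flat set comparison ignores that structure.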
Identification of hub subnetwork based on topological features of genes in breast cancer
ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO
2015-01-01
The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded and analyzed from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, and that most hub nodes were distributed within them. The most important hub nodes, with top-ranked centrality, were also similar to the genes common to the intersection of these three subnetworks, which was viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to the same biological process, which is essential for cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
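Degree centrality, the first topological measure named in the abstract above, can be illustrated with a short sketch. It assumes an undirected PPI network given as an edge list; the protein identifiers and helper names are hypothetical, and the study itself used Cytoscape and MCODE rather than code like this:

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree of each node: number of neighbors divided by (n - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_hubs(edges, k=1):
    """Return the k nodes with the highest degree centrality (ties broken by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Hypothetical toy network: P1 interacts with every other protein.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
centrality = degree_centrality(toy_edges)
hubs = top_hubs(toy_edges, k=1)
```

Other centralities (betweenness, closeness) need shortest-path computations and are usually taken from a graph library instead.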
Gaji, Rajshekhar Y; Howe, Daniel K
2009-07-01
The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
Genetic ablation of P65 subunit of NF-κB in mdx mice to improve muscle physiological function.
Yin, Xi; Tang, Ying; Li, Jian; Dzuricky, Anna T; Pu, Chuanqiang; Fu, Freddie; Wang, Bing
2017-10-01
Duchenne muscular dystrophy (DMD) is a genetic muscle disease characterized by dystrophin deficiency. Beyond gene replacement, it is unknown whether ablation of the p65 gene of nuclear factor-kappa B (NF-κB) in DMD can improve muscle physiological function. In this study, we investigated muscle physiological improvement in mdx mice (DMD model) with a genetic reduction of NF-κB. Muscle physiological function and histology were studied in 2-month-old mdx/p65 +/-, wild-type, mdx, and human minidystrophin gene transgenic mdx (TghΔDys/mdx) mice. Improved muscle physiological function was found in mdx/p65 +/- mice when compared with mdx mice; however, it was similar to that of TghΔDys/mdx mice. The results indicate that genetic reduction of p65 levels diminished chronic inflammation in dystrophic muscle, thus leading to amelioration of muscle pathology and improved muscle physiological function. The results show that inhibition of NF-κB may be a promising therapy when combined with gene therapy for DMD. Muscle Nerve 56: 759-767, 2017. © 2016 Wiley Periodicals, Inc.
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.
Anitha, P; Anbarasu, Anand; Ramaiah, Sudha
2014-05-01
Multidrug resistance (MDR) in Acinetobacter baumannii is one of the major threats associated with emerging nosocomial infections in the hospital environment. Multidrug resistance in A. baumannii may be due to a combination of resistance mechanisms, such as β-lactamase synthesis, changes in penicillin-binding proteins (PBPs), and alterations in porin proteins and in efflux pumps, that act against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions on the antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis of various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting genes/proteins (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated proteins we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach to the resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance. Copyright © 2014 Elsevier Ltd. All rights reserved.
Integrative and conjugative elements and their hosts: composition, distribution and organization.
Cury, Jean; Touchon, Marie; Rocha, Eduardo P C
2017-09-06
Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and has been widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICEs), even though they are more abundant, have been studied in only a few model systems. Comparative genomics of ICEs has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species' pan-genome. We delimited 200 ICEs and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even though the gene repertoires of ICEs can be grouped according to conjugation type. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, most of which are of unknown function. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
NASA Astrophysics Data System (ADS)
To, Cuong; Pham, Tuan D.
2010-01-01
In machine learning, pattern recognition may be the most popular task. Identification of "similar" patterns is also very important in biology: first, it is useful for predicting patterns associated with disease, for example classifying tissue as normal or tumor; second, similarity or dissimilarity of kinetic patterns is used to identify coordinately controlled genes or proteins involved in the same regulatory process; third, similar genes (proteins) share similar functions. In this paper, we present an algorithm that uses genetic programming to create a decision tree for binary classification problems. The algorithm was applied to five real biological databases. Based on comparisons with well-known methods, we find that the algorithm performs very well in most cases.
Evaluating Functional Annotations of Enzymes Using the Gene Ontology.
Holliday, Gemma L; Davidson, Rebecca; Akiva, Eyal; Babbitt, Patricia C
2017-01-01
The Gene Ontology (GO) (Ashburner et al., Nat Genet 25(1):25-29, 2000) is a powerful tool in the informatics arsenal of methods for evaluating annotations in a protein dataset. Its uses range from identifying the nearest well-annotated homologue of a protein of interest, to predicting where misannotation has occurred, to knowing how confident you can be in the annotations assigned to those proteins, all of which are critical. In this chapter we explore what makes an enzyme unique and how we can use GO to infer aspects of protein function based on sequence similarity. These can range from identification of misannotation or other errors in a predicted function to accurate function prediction for an enzyme of entirely unknown function. Although GO annotation applies to any gene product, we focus here on describing our approach for hierarchical classification of enzymes in the Structure-Function Linkage Database (SFLD) (Akiva et al., Nucleic Acids Res 42(Database issue):D521-530, 2014) as a guide for informed utilisation of annotation transfer based on GO terms.
Gene Expression in Human Accessory Lacrimal Glands of Wolfring
Ubels, John L.; Gipson, Ilene K.; Spurr-Michaud, Sandra J.; Tisdale, Ann S.; Van Dyken, Rachel E.; Hatton, Mark P.
2012-01-01
Purpose. The accessory lacrimal glands are assumed to contribute to the production of tear fluid, but little is known about their function. The goal of this study was to conduct an analysis of gene expression by glands of Wolfring that would provide a more complete picture of the function of these glands. Methods. Glands of Wolfring were isolated from frozen sections of human eyelids by laser microdissection. RNA was extracted from the cells and hybridized to gene expression arrays. The expression of several of the major genes was confirmed by immunohistochemistry. Results. Of the 24 most highly expressed genes, 9 were of direct relevance to lacrimal function. These included lysozyme, lactoferrin, tear lipocalin, and lacritin. The glands of Wolfring are enriched in genes related to protein synthesis, targeting, and secretion, and a large number of genes for proteins with antimicrobial activity were detected. Ion channels and transporters, carbonic anhydrase, and aquaporins were abundantly expressed. Genes for control of lacrimal function, including cholinergic, adrenergic, vasoactive intestinal polypeptide, purinergic, androgen, and prolactin receptors, were also expressed in glands of Wolfring. Conclusions. The data suggest that the function of glands of Wolfring is similar to that of main lacrimal glands and are consistent with secretion of electrolytes, fluid, and protein under nervous and hormonal control. Since these glands secrete directly onto the ocular surface, their location may allow rapid response to exogenous stimuli and makes them readily accessible to topical drugs. PMID:22956620
Keshri, Jitendra; Mishra, Avinash; Jha, Bhavanath
2013-03-30
Population indices of bacteria and archaea in saline-alkaline soil were investigated, and a possible microbe-environment pattern was established, using gene-targeted metagenomics. Clone libraries were constructed using 16S rRNA and functional gene(s) involved in carbon fixation (cbbL), nitrogen fixation (nifH), ammonia oxidation (amoA) and sulfur metabolism (apsA). Molecular phylogeny revealed the dominance of Actinobacteria, Firmicutes and Proteobacteria along with archaeal members of Halobacteraceae. The library consisted of novel bacterial (20%) and archaeal (38%) genera showing ≤95% similarity to previously retrieved sequences. Phylogenetic analysis indicated the ability of the inhabitants to survive under stress conditions. The 16S rRNA gene libraries contained novel gene sequences that were distantly homologous with cultured bacteria. Functional gene libraries were found to be unique, and most of the clones were distantly related to Proteobacteria, while clones of the nifH gene library also showed homology with Cyanobacteria and Firmicutes. Quantitative real-time PCR showed that bacterial abundance was two orders of magnitude higher than archaeal abundance. The gene(s) quantification indicated the size of the functional guilds harboring relevant key genes. The study provides insights into the microbial ecology and different metabolic interactions occurring in saline-alkaline soil, which possesses phylogenetically diverse groups of bacteria and archaea that may be explored further for gene cataloging and metabolic profiling. Copyright © 2012 Elsevier GmbH. All rights reserved.
COGNAT: a web server for comparative analysis of genomic neighborhoods.
Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y
2017-11-22
In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
Mouse Dux is myotoxic and shares partial functional homology with its human paralog DUX4
Eidahl, Jocelyn O.; Giesige, Carlee R.; Domire, Jacqueline S.; Wallace, Lindsay M.; Fowler, Allison M.; Guckes, Susan M.; Garwick-Coppens, Sara E.; Labhart, Paul
2016-01-01
D4Z4 repeats are present in at least 11 different mammalian species, including humans and mice. Each repeat contains an open reading frame encoding a double homeodomain (DUX) family transcription factor. Aberrant expression of the D4Z4 ORF called DUX4 is associated with the pathogenesis of facioscapulohumeral muscular dystrophy (FSHD). DUX4 is toxic to numerous cell types of different species, and over-expression caused dysmorphism and developmental arrest in frogs and zebrafish, embryonic lethality in transgenic mice, and lesions in mouse muscle. Because DUX4 is a primate-specific gene, questions have been raised about the biological relevance of over-expressing it in non-primate models, as DUX4 toxicity could be related to non-specific cellular stress induced by over-expressing a DUX family transcription factor in organisms that did not co-evolve its regulated transcriptional networks. We assessed toxic phenotypes of DUX family genes, including DUX4, DUX1, DUX5, DUXA, DUX4-s, Dux-bl and mouse Dux. We found that DUX proteins were not universally toxic, and only the mouse Dux gene caused similar toxic phenotypes as human DUX4. Using RNA-seq, we found that 80% of genes upregulated by Dux were similarly increased in DUX4-expressing cells. Moreover, 43% of Dux-responsive genes contained ChIP-seq binding sites for both Dux and DUX4, and both proteins had similar consensus binding site sequences. These results suggested DUX4 and Dux may regulate some common pathways, and despite diverging from a common progenitor under different selective pressures for millions of years, the two genes maintain partial functional homology. PMID:28173143
NASA Technical Reports Server (NTRS)
Guan, Changhui; Rosen, Elizabeth S.; Boonsirichai, Kanokporn; Poff, Kenneth L.; Masson, Patrick H.
2003-01-01
The arl2 mutants of Arabidopsis display altered root and hypocotyl gravitropism, whereas their inflorescence stems are fully gravitropic. Interestingly, mutant roots respond like the wild type to phytohormones and an inhibitor of polar auxin transport. Also, their cap columella cells accumulate starch similarly to wild-type cells, and mutant hypocotyls display strong phototropic responses to lateral light stimulation. The ARL2 gene encodes a DnaJ-like protein similar to ARG1, another protein previously implicated in gravity signal transduction in Arabidopsis seedlings. ARL2 is expressed at low levels in all organs of seedlings and plants. arl2-1 arg1-2 double mutant roots display kinetics of gravitropism similar to those of single mutants. However, double mutants carrying both arl2-1 and pgm-1 (a mutation in the starch-biosynthetic gene PHOSPHOGLUCOMUTASE) at the homozygous state display a more pronounced root gravitropic defect than the single mutants. On the other hand, seedlings with a null mutation in ARL1, a paralog of ARG1 and ARL2, behave similarly to the wild type in gravitropism and other related assays. Taken together, the results suggest that ARG1 and ARL2 function in the same gravity signal transduction pathway in the hypocotyl and root of Arabidopsis seedlings, distinct from the pathway involving PGM.
Li, Jingtao; Sun, Xinhua; Yu, Gang; Jia, Chengguo; Liu, Jinliang; Pan, Hongyu
2014-01-01
Little information is available on gene expression profiling of the halophyte A. canescens. To elucidate the molecular mechanism of stress tolerance in A. canescens, a full-length complementary DNA library was generated from A. canescens exposed to 400 mM NaCl, which provided 343 high-quality ESTs. In an evaluation of the 343 valid EST sequences in the cDNA library, 197 unigenes were assembled, among which 190 unigenes (83.1% of ESTs) were identified according to their significant similarities with proteins of known function. All 343 EST sequences have been deposited in the dbEST GenBank under accession numbers JZ535802 to JZ536144. According to the Arabidopsis MIPS functional categories and GO classifications, we identified 193 unigenes among the 311 annotated ESTs, representing 72 non-redundant unigenes sharing similarities with genes related to the defense response. The sets of ESTs obtained provide a rich genetic resource, and 17 up-regulated genes related to salt stress resistance were identified by qRT-PCR. Six of these genes may contribute crucially to earlier and later stage salt stress resistance. Additionally, among the 343 unigene sequences, 22 simple sequence repeats (SSRs) were also identified, contributing to the study of A. canescens resources. PMID:24960361
Hadjithomas, Michalis; Chen, I-Min A; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C; Ivanova, Natalia N
2017-01-04
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Nahas, John V; Iosue, Christine L; Shaik, Noor F; Selhorst, Kathleen; He, Bin Z; Wykoff, Dennis D
2018-05-10
Convergent evolution is often due to selective pressures generating a similar phenotype. We observe relatively recent duplications in a spectrum of Saccharomycetaceae yeast species resulting in multiple phosphatases that are regulated by different nutrient conditions - thiamine and phosphate starvation. This specialization is both transcriptional and at the level of phosphatase substrate specificity. In Candida glabrata, loss of the ancestral phosphatase family was compensated by the co-option of a different histidine phosphatase family with three paralogs. Using RNA-seq and functional assays, we identify one of these paralogs, CgPMU3, as a thiamine phosphatase. We further determine that the 81% identical paralog CgPMU2 does not encode thiamine phosphatase activity; however, both are capable of cleaving the phosphatase substrate, 1-naphthyl-phosphate. We functionally demonstrate that members of this family evolved novel enzymatic functions for phosphate and thiamine starvation, and are regulated transcriptionally by either nutrient condition, and observe similar trends in other yeast species. This independent, parallel evolution involving two different families of histidine phosphatases suggests that there were likely similar selective pressures on multiple yeast species to recycle thiamine and phosphate. In this work, we focused on duplication and specialization, but there is also repeated loss of phosphatases, indicating that the expansion and contraction of the phosphatase family is dynamic in many Ascomycetes. The dynamic evolution of the phosphatase gene families is perhaps just one example of how gene duplication, co-option, and transcriptional and functional specialization together allow species to adapt to their environment with existing genetic resources. Copyright © 2018, G3: Genes, Genomes, Genetics.
McEvoy, Brian; Beleza, Sandra; Shriver, Mark D
2006-10-15
Skin pigmentation varies substantially across human populations in a manner largely coincident with ultraviolet radiation intensity. This observation suggests that natural selection in response to sunlight is a major force in accounting for pigmentation variability. We review recent progress in identifying the genes controlling this variation with a particular focus on the trait's evolutionary past and the potential role of testing for signatures of selection in aiding the discovery of functionally important genes. We have analyzed SNP data from the International HapMap project in 77 pigmentation candidate genes for such signatures. On the basis of these results and other similar work, we provide a tentative three-population model (West Africa, East Asia and North Europe) of the evolutionary-genetic architecture of human pigmentation. These results suggest a complex evolutionary history, with selection acting on different gene targets at different times and places in the human past. Some candidate genes may have been selected in the ancestral human population, others in the 'out of Africa' proto European-Asian population, whereas most appear to have selectively evolved solely in either Europeans or East Asians separately despite the pigmentation similarities between these two populations. Selection signatures can provide important clues to aid gene discovery. However, these should be viewed as complements, rather than replacements of, functional studies including linkage and association analyses, which can directly refine our understanding of the trait.
Interhemispheric gene expression differences in the cerebral cortex of humans and macaque monkeys.
Muntané, Gerard; Santpere, Gabriel; Verendeev, Andrey; Seeley, William W; Jacobs, Bob; Hopkins, William D; Navarro, Arcadi; Sherwood, Chet C
2017-09-01
Handedness and language are two well-studied examples of asymmetrical brain function in humans. Approximately 90% of humans exhibit a right-hand preference, and the vast majority shows left-hemisphere dominance for language function. Although genetic models of human handedness and language have been proposed, the actual gene expression differences between cerebral hemispheres in humans remain to be fully defined. In the present study, gene expression profiles were examined in both hemispheres of three cortical regions involved in handedness and language in humans and their homologues in rhesus macaques: ventrolateral prefrontal cortex, posterior superior temporal cortex (STC), and primary motor cortex. Although the overall pattern of gene expression was very similar between hemispheres in both humans and macaques, weighted gene correlation network analysis revealed gene co-expression modules associated with hemisphere, which are different among the three cortical regions examined. Notably, a receptor-enriched gene module in STC was particularly associated with hemisphere and showed different expression levels between hemispheres only in humans.
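The weighted gene correlation network analysis mentioned above starts from a soft-thresholded co-expression adjacency. A minimal sketch of that first step, assuming the common unsigned form a_ij = |cor(x_i, x_j)|^β with β = 6 (the exponent, gene names and expression values are illustrative assumptions; the abstract does not report its settings):

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency |cor|**beta for every unordered gene pair."""
    genes = sorted(expression)
    return {
        (g1, g2): abs(pearson(expression[g1], expression[g2])) ** beta
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
    }


# Hypothetical expression profiles across four samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with g1
    "g3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with g1
    "g4": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with g1
}
adjacency = soft_threshold_adjacency(expression)
```

Raising the correlation to a power suppresses weak correlations while keeping the network weighted; module detection (clustering on a topological overlap measure) would follow and is not shown here.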
Pan, Feng; Wang, Yue; Liu, Huanglong; Wu, Min; Chu, Wenyuan; Chen, Danmei; Xiang, Yan
2017-06-27
The SQUAMOSA promoter binding protein-like (SPL) proteins are plant-specific transcription factors (TFs) that function in a variety of developmental processes including growth, flower development, and signal transduction. SPL proteins are encoded by a gene family, and these genes have been characterized in two model grass species, Zea mays and Oryza sativa. The SPL gene family has not been well studied in moso bamboo (Phyllostachys edulis), a woody grass species. We identified 32 putative PeSPL genes in the P. edulis genome. Phylogenetic analysis arranged the PeSPL protein sequences in eight groups. Similarly, phylogenetic analysis of the SBP-like and SBP proteins from rice and maize clustered them into eight groups analogous to those from P. edulis. Furthermore, the deduced PeSPL proteins in each group contained very similar conserved sequence motifs. Our analyses indicate that the PeSPL genes experienced a large-scale duplication event ~15 million years ago (MYA), and that divergence between the PeSPL and OsSPL genes occurred 34 MYA. The stress-response expression profiles and tissue-specificity of the putative PeSPL gene promoter regions showed that SPL genes in moso bamboo have potential biological functions in stress resistance as well as in growth and development. We therefore examined PeSPL gene expression in response to different plant hormone and drought (polyethylene glycol-6000; PEG) treatments to mimic biotic and abiotic stresses. Expression of three (PeSPL10, -12, -17), six (PeSPL1, -10, -12, -17, -20, -31), and nine (PeSPL5, -8, -9, -14, -15, -19, -20, -31, -32) genes remained relatively stable after treating with salicylic acid (SA), gibberellic acid (GA), and PEG, respectively, while the expression patterns of other genes changed. 
In addition, analysis of tissue-specific expression of the moso bamboo SPL genes during development showed differences in their spatiotemporal expression patterns, and many were expressed at high levels in flowers and leaves. The PeSPL genes play important roles in plant growth and development, including responses to stresses, and most of the genes are expressed in different tissues. Our study provides a comprehensive understanding of the PeSPL gene family and may enable future studies on the function and evolution of SPL genes in moso bamboo.
The immune signaling pathways of Manduca sexta
Cao, Xiaolong; He, Yan; Hu, Yingxia; Wang, Yang; Chen, Yun-Ru; Bryant, Bart; Clem, Rollie J.; Schwartz, Lawrence M.; Blissard, Gary; Jiang, Haobo
2015-01-01
Signal transduction pathways and their coordination are critically important for proper functioning of animal immune systems. Our knowledge of the constituents of the intracellular signaling network in insects mainly comes from genetic analyses in Drosophila melanogaster. To facilitate future studies of similar systems in the tobacco hornworm and other lepidopteran insects, we have identified and examined the homologous genes in the genome of Manduca sexta. Based on 1:1 orthologous relationships in most cases, we hypothesize that the Toll, Imd, MAPK-JNK-p38 and JAK-STAT pathways are intact and operative in this species, as are most of the regulatory mechanisms. Similarly, cellular processes such as autophagy, apoptosis and RNA interference probably function in similar ways, because their mediators and modulators are mostly conserved in this lepidopteran species. We have annotated a total of 186 genes encoding 199 proteins, studied their domain structures and evolution, and examined their mRNA levels in tissues at different life stages. Such information provides a genomic perspective of the intricate signaling system in a non-drosophiline insect. PMID:25858029
Benoit, Joshua B.; Attardo, Geoffrey M.; Michalkova, Veronika; Krause, Tyler B.; Bohova, Jana; Zhang, Qirui; Baumann, Aaron A.; Mireji, Paul O.; Takáč, Peter; Denlinger, David L.; Ribeiro, Jose M.; Aksoy, Serap
2014-01-01
In tsetse flies, nutrients for intrauterine larval development are synthesized by the modified accessory gland (milk gland) and provided in mother's milk during lactation. Interference with at least two milk proteins has been shown to extend larval development and reduce fecundity. The goal of this study was to perform a comprehensive characterization of tsetse milk proteins using lactation-specific transcriptome/milk proteome analyses and to define functional role(s) for the milk proteins during lactation. Differential analysis of RNA-seq data from lactating and dry (non-lactating) females revealed enrichment of transcripts coding for protein synthesis machinery, lipid metabolism and secretory proteins during lactation. Among the genes induced during lactation were those encoding the previously identified milk proteins (milk gland proteins 1–3, transferrin and acid sphingomyelinase 1) and seven new genes (mgp4–10). The genes encoding mgp2–10 are organized on a 40 kb syntenic block in the tsetse genome, have similar exon-intron arrangements, and share regions of amino acid sequence similarity. Expression of mgp2–10 is female-specific and high during milk secretion. While knockdown of a single mgp failed to reduce fecundity, simultaneous knockdown of multiple variants reduced milk protein levels and lowered fecundity. The genomic localization, gene structure similarities, and functional redundancy of MGP2–10 suggest that they constitute a novel highly divergent protein family. Our data indicates that MGP2–10 function both as the primary amino acid resource for the developing larva and in the maintenance of milk homeostasis, similar to the function of the mammalian casein family of milk proteins. This study underscores the dynamic nature of the lactation cycle and identifies a novel family of lactation-specific proteins, unique to Glossina sp., that are essential to larval development. 
The specificity of MGP2–10 to tsetse and their critical role during lactation suggest that these proteins may be an excellent target for tsetse-specific population control approaches. PMID:24763277
Pathway Distiller - multisource biological pathway consolidation
2012-01-01
Background One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. Methods After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. Results We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to give researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene lists with our consolidation methods. Conclusions By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments. PMID:23134636
Pathway Distiller - multisource biological pathway consolidation.
Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong
2012-01-01
One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to give researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene lists with our consolidation methods. By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments.
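The idea of merging redundant pathways into pathway concepts can be illustrated with a minimal sketch. It is not the authors' Enrichment Consolidation algorithm: the Jaccard measure, the greedy merge order, the 0.5 threshold, and all pathway and gene names below are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two gene sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(pathways, threshold=0.5):
    """Greedily merge pathways whose gene sets overlap strongly.

    pathways: dict mapping pathway name -> set of signature genes.
    threshold: minimum Jaccard similarity to merge (an assumed value).
    Returns a list of (member_names, merged_gene_set) concepts.
    """
    concepts = [([name], set(genes)) for name, genes in pathways.items()]
    merged = True
    while merged:
        merged = False
        for i in range(len(concepts)):
            for j in range(i + 1, len(concepts)):
                if jaccard(concepts[i][1], concepts[j][1]) >= threshold:
                    # Fold concept j into concept i, then rescan from the start.
                    names = concepts[i][0] + concepts[j][0]
                    genes = concepts[i][1] | concepts[j][1]
                    concepts[i] = (names, genes)
                    del concepts[j]
                    merged = True
                    break
            if merged:
                break
    return concepts


# Hypothetical pathways and genes, for illustration only.
example = {
    "pathway_A": {"g1", "g2", "g3"},
    "pathway_B": {"g2", "g3", "g4"},
    "pathway_C": {"g8", "g9"},
}
concepts = consolidate(example)
```

In this toy input, pathway_A and pathway_B share two of four genes and are merged into one concept, while pathway_C stays separate.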
Masuno, Kiriko; Haldar, Saptarsi M.; Jeyaraj, Darwin; Mailloux, Christina M.; Huang, Xiaozhu; Panettieri, Rey A.; Jain, Mukesh K.
2011-01-01
Glucocorticoids (GCs), which activate GC receptor (GR) signaling and thus modulate gene expression, are widely used to treat asthma. GCs exert their therapeutic effects in part through modulating airway smooth muscle (ASM) structure and function. However, the effects of genes that are regulated by GCs on airway function are not fully understood. We therefore used transcription profiling to study the effects of a potent GC, dexamethasone, on human ASM (HASM) gene expression at 4 and 24 hours. After 24 hours of dexamethasone treatment, nearly 7,500 genes had statistically distinguishable changes in expression; quantitative PCR validation of a 40-gene subset of putative GR-regulated genes in 6 HASM cell lines suggested that the early transcriptional targets of GR signaling are similar in independent HASM lines. Gene ontology analysis implicated GR targets in controlling multiple aspects of ASM function. One GR-regulated gene, the transcription factor, Kruppel-like factor 15 (Klf15), was already known to modulate vascular smooth and cardiac muscle function, but had no known role in the lung. We therefore analyzed the pulmonary phenotype of Klf15−/− mice after ovalbumin sensitization and challenge. We found diminished airway responses to acetylcholine in ovalbumin-challenged Klf15−/− mice without a significant change in the induction of asthmatic inflammation. In cultured cells, overexpression of Klf15 reduced proliferation of HASM cells, whereas apoptosis in Klf15−/− murine ASM cells was increased. Together, these results further characterize the GR-regulated gene network in ASM and establish a novel role for the GR target, Klf15, in modulating airway function. PMID:21257922
γ-PGA Hydrolases of Phage Origin in Bacillus subtilis and Other Microbial Genomes.
Mamberti, Stefania; Prati, Paola; Cremaschi, Paolo; Seppi, Claudio; Morelli, Carlo F; Galizzi, Alessandro; Fabbi, Massimo; Calvio, Cinzia
2015-01-01
Poly-γ-glutamate (γ-PGA) is an industrially interesting polymer secreted mainly by members of the class Bacilli which forms a shield able to protect bacteria from phagocytosis and phages. Few enzymes are known to degrade γ-PGA; among them is a phage-encoded γ-PGA hydrolase, PghP. The supposed role of PghP in phages is to ensure access to the surface of bacterial cells by dismantling the γ-PGA barrier. We identified four unannotated B. subtilis genes through similarity of their encoded products to PghP; in fact, these genes reside in prophage elements of the B. subtilis genome. The recombinant products of two of them demonstrate efficient polymer degradation, confirming that sequence similarity reflects functional homology. Genes encoding similar γ-PGA hydrolases were identified in phages specific for the order Bacillales and in numerous microbial genomes, not only belonging to that order. The distribution of the γ-PGA biosynthesis operon was also investigated with a bioinformatics approach; it was found that the list of organisms endowed with γ-PGA biosynthetic functions is larger than expected and includes several pathogenic species. Moreover, in non-Bacillales bacteria the predicted γ-PGA hydrolase genes are preferentially found in species that do not have the genetic asset for polymer production. Our findings suggest that γ-PGA hydrolase genes might have spread across microbial genomes via horizontal exchanges rather than via phage infection. We hypothesize that, in natural habitats rich in γ-PGA supplied by producer organisms, the availability of hydrolases that release glutamate oligomers from γ-PGA might be a beneficial trait under positive selection.
Analysis of expressed sequence tags for Frankliniella occidentalis, the western flower thrips.
Rotenberg, D; Whitfield, A E
2010-08-01
Thrips are members of the insect order Thysanoptera, and Frankliniella occidentalis (the western flower thrips) is the most economically important pest within this order. F. occidentalis is both a direct pest of crops and an efficient vector of plant viruses, including Tomato spotted wilt virus (TSWV). Despite the world-wide importance of thrips in agriculture, there is little knowledge of the F. occidentalis genome or gene functions at this time. A normalized cDNA library was constructed from first instar thrips and 13 839 expressed sequence tags (ESTs) were obtained. Our EST data assembled into 894 contigs and 11 806 singletons (12 700 nonredundant sequences). We found that 31% of these sequences had significant similarity (E ≤ 10^-10) to protein sequences in the National Center for Biotechnology Information nonredundant (nr) protein database, and 25% were functionally annotated using Blast2GO. We identified 74 sequences with putative homology to proteins associated with insect innate immunity. Sixteen sequences had significant similarity to proteins associated with small RNA-mediated gene silencing pathways (RNA interference; RNAi), including the antiviral pathway (short interfering RNA-mediated pathway). Our EST collection provides new sequence resources for characterizing gene functions in F. occidentalis and other thrips species with regard to vital biological processes, studying the mechanism of interactions with the viruses harboured and transmitted by the vector, and identifying new insect gene-centred targets for plant disease and insect control.
Manijak, Mieszko P; Nielsen, Henrik B
2011-06-11
Although systematic analysis of gene annotation is a powerful tool for interpreting gene expression data, it is sometimes blurred by incomplete gene annotation, missing expression responses of key genes, and secondary gene expression responses. These shortcomings may be partially circumvented by instead matching gene expression signatures to signatures of other experiments. To facilitate this, we present the Functional Association Response by Overlap (FARO) server, which matches input signatures to a compendium of 242 gene expression signatures extracted from more than 1700 Arabidopsis microarray experiments. It is a publicly available tool for robust characterization of Arabidopsis gene expression experiments that can point to similar experimental factors in other experiments. The server is available at http://www.cbs.dtu.dk/services/faro/.
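Matching a signature against a compendium by overlap can be sketched as follows. The abstract does not say how the FARO server scores overlap, so the plain shared-gene count used here is an assumption, and the experiment names and gene identifiers are hypothetical.

```python
def rank_by_overlap(query, compendium):
    """Rank compendium signatures by the number of genes shared with the query.

    query: set of gene identifiers from the input signature.
    compendium: dict mapping experiment name -> set of gene identifiers.
    Returns (name, overlap_size) pairs, largest overlap first; ties by name.
    """
    scores = [(name, len(query & genes)) for name, genes in compendium.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical signatures, for illustration only.
query = {"g1", "g2", "g3"}
compendium = {
    "experiment_heat": {"g1", "g2", "g9"},
    "experiment_cold": {"g3"},
    "experiment_salt": {"g7"},
}
ranking = rank_by_overlap(query, compendium)
```

A raw count favors large signatures, so a real tool would probably normalize or test the overlap statistically; that step is left out here.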
Applying gene regulatory network logic to the evolution of social behavior.
Baran, Nicole M; McGrath, Patrick T; Streelman, J Todd
2017-06-06
Animal behavior is ultimately the product of gene regulatory networks (GRNs) for brain development and neural networks for brain function. The GRN approach has advanced the fields of genomics and development, and we identify organizational similarities between networks of genes that build the brain and networks of neurons that encode brain function. In this perspective, we engage the analogy between developmental networks and neural networks, exploring the advantages of using GRN logic to study behavior. Applying the GRN approach to the brain and behavior provides a quantitative and manipulative framework for discovery. We illustrate features of this framework using the example of social behavior and the neural circuitry of aggression.
The Zn finger protein Iguana impacts Hedgehog signaling by promoting ciliogenesis.
Glazer, Andrew M; Wilkinson, Alex W; Backer, Chelsea B; Lapan, Sylvain W; Gutzman, Jennifer H; Cheeseman, Iain M; Reddien, Peter W
2010-01-01
Hedgehog signaling is critical for metazoan development and requires cilia for pathway activity. The gene iguana was discovered in zebrafish as required for Hedgehog signaling, and encodes a novel Zn finger protein. Planarians are flatworms with robust regenerative capacities and utilize epidermal cilia for locomotion. RNA interference of Smed-iguana in the planarian Schmidtea mediterranea caused cilia loss and failure to regenerate new cilia, but did not cause defects similar to those observed in hedgehog(RNAi) animals. Smed-iguana gene expression was also similar in pattern to the expression of multiple other ciliogenesis genes, but was not required for expression of these ciliogenesis genes. iguana-defective zebrafish had too few motile cilia in pronephric ducts and in Kupffer's vesicle. Kupffer's vesicle promotes left-right asymmetry and iguana mutant embryos had left-right asymmetry defects. Finally, human Iguana proteins (dZIP1 and dZIP1L) localize to the basal bodies of primary cilia and, together, are required for primary cilia formation. Our results indicate that a critical and broadly conserved function for Iguana is in ciliogenesis and that this function has come to be required for Hedgehog signaling in vertebrates.
Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; ...
2016-11-29
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery.
Evolutionary origins of the endocannabinoid system.
McPartland, John M; Matias, Isabel; Di Marzo, Vincenzo; Glass, Michelle
2006-03-29
Endocannabinoid system evolution was estimated by searching for functional orthologs in the genomes of twelve phylogenetically diverse organisms: Homo sapiens, Mus musculus, Takifugu rubripes, Ciona intestinalis, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae, Arabidopsis thaliana, Plasmodium falciparum, Tetrahymena thermophila, Archaeoglobus fulgidus, and Mycobacterium tuberculosis. Sequences similar to human endocannabinoid exon sequences were derived from filtered BLAST searches, and subjected to phylogenetic testing with ClustalX and tree building programs. Monophyletic clades that agreed with broader phylogenetic evidence (i.e., gene trees displaying topographical congruence with species trees) were considered orthologs. The capacity of orthologs to function as endocannabinoid proteins was predicted with pattern profilers (Pfam, Prosite, TMHMM, and pSORT), and by examining queried sequences for amino acid motifs known to serve critical roles in endocannabinoid protein function (obtained from a database of site-directed mutagenesis studies). This novel transfer of functional information onto gene trees enabled us to better predict the functional origins of the endocannabinoid system. Within this limited number of twelve organisms, the endocannabinoid genes exhibited heterogeneous evolutionary trajectories, with functional orthologs limited to mammals (TRPV1 and GPR55), or vertebrates (CB2 and DAGLbeta), or chordates (MAGL and COX2), or animals (DAGLalpha and CB1-like receptors), or opisthokonta (animals and fungi, NAPE-PLD), or eukaryotes (FAAH). Our methods identified fewer orthologs than did automated annotation systems, such as HomoloGene. Phylogenetic profiles, nonorthologous gene displacement, functional convergence, and coevolution are discussed.
Consistency of gene starts among Burkholderia genomes
2011-01-01
Background Evolutionary divergence in the position of the translational start site among orthologous genes can have significant functional impacts. Divergence can alter the translation rate, degradation rate, subcellular location, and function of the encoded proteins. Results Existing Genbank gene maps for Burkholderia genomes suggest that extensive divergence has occurred--53% of ortholog sets based on Genbank gene maps had inconsistent gene start sites. However, most of these inconsistencies appear to be gene-calling errors. Evolutionary divergence was the most plausible explanation for only 17% of the ortholog sets. Correcting probable errors in the Genbank gene maps decreased the percentage of ortholog sets with inconsistent starts by 68%, increased the percentage of ortholog sets with extractable upstream intergenic regions by 32%, increased the sequence similarity of intergenic regions and predicted proteins, and increased the number of proteins with identifiable signal peptides. Conclusions Our findings highlight an emerging problem in comparative genomics: single-digit percent errors in gene predictions can lead to double-digit percentages of inconsistent ortholog sets. The work demonstrates a simple approach to evaluate and improve the quality of gene maps. PMID:21342528
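The percentages in this abstract can be cross-checked with a short calculation. It assumes that the 68% decrease is a relative reduction applied to the initial 53%; the abstract does not state this explicitly, although an absolute drop of 68 percentage points from 53% would be impossible.

```python
# Share of ortholog sets with inconsistent starts in the Genbank gene maps (from the abstract).
inconsistent_before = 53.0
# 68% decrease after correcting probable errors, read as a relative reduction (assumption).
relative_reduction = 0.68

inconsistent_after = inconsistent_before * (1 - relative_reduction)
print(round(inconsistent_after, 2))
```

Under that reading the remaining share is about 17%, which agrees with the fraction of ortholog sets for which evolutionary divergence was the most plausible explanation.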
Gubala, Anna M; Schmitz, Jonathan F; Kearns, Michael J; Vinh, Tery T; Bornberg-Bauer, Erich; Wolfner, Mariana F; Findlay, Geoffrey D
2017-05-01
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Jin, Qijiang; Hu, Xin; Li, Xin; Wang, Bei; Wang, Yanjie; Jiang, Hongwei; Mattson, Neil; Xu, Yingchun
2016-01-01
Trehalose-6-phosphate synthase (TPS) plays a key role in plant carbohydrate metabolism and the perception of carbohydrate availability. In the present work, the publicly available Nelumbo nucifera (lotus) genome sequence database was analyzed, which led to the identification of nine lotus TPS genes (NnTPS). It was found that at least two introns are included in the coding sequences of NnTPS genes. When the motif compositions were analyzed, we found that NnTPS genes generally shared similar motifs, implying that they have similar functions. The dN/dS ratios were always less than 1 for different domains and regions outside domains, suggesting purifying selection on the lotus TPS gene family. The regions outside the TPS domain evolved relatively faster than NnTPS domains. A phylogenetic tree was constructed using all predicted coding sequences of lotus TPS genes, together with those from Arabidopsis, poplar, soybean, and rice. The result indicated that those TPS genes could be clearly divided into two main subfamilies (I-II), where each subfamily could be further divided into 2 (I) and 5 (II) subgroups. Analyses of divergence and adaptive evolution show that purifying selection may have been the main force driving evolution of plant TPS genes. Some of the critical sites that contributed to divergence may have been under positive selection. Transcriptome data analysis revealed that most NnTPS genes were predominantly expressed in sink tissues. Expression pattern of NnTPS genes under copper and submergence stress indicated that NNU_014679 and NNU_022788 might play important roles in lotus energy metabolism and participate in stress response. Our results can facilitate further functional studies of TPS genes in lotus. PMID:27746792
Similarities and Differences between Porcine Mandibular and Limb Bone Marrow Mesenchymal Stem Cells
Lloyd, Brandon; Tee, Boon Ching; Headley, Colwyn; Emam, Hany; Mallery, Susan; Sun, Zongyang
2017-01-01
Objective Research has shown promise of using bone marrow mesenchymal stem cells (BMSCs) for craniofacial bone regeneration; yet little is known about the differences of BMSCs from limb and craniofacial bones. This study compared pig mandibular and tibia BMSCs for their in vitro proliferation, osteogenic differentiation properties and gene expression. Design Bone marrow was aspirated from the tibia and mandible of 3–4 month-old pigs (n=4), followed by BMSC isolation, culture-expansion and characterization by flow cytometry. Proliferation rates were assessed using population doubling times. Osteogenic differentiation was evaluated by alkaline phosphatase activity. Affymetrix porcine microarray was used to compare gene expressions of tibial and mandibular BMSCs, followed by real-time RT-PCR evaluation of certain genes. Results Our results showed that BMSCs from both locations expressed MSC markers but not hematopoietic markers. The proliferation and osteogenic differentiation potential of mandibular BMSCs were significantly stronger than those of tibial BMSCs. Microarray analysis identified 404 highly abundant genes, out of which 334 genes were matched between the two locations and annotated into the same functional groups including osteogenesis and angiogenesis, while 70 genes were mismatched and annotated into different functional groups. In addition, 48 genes were differentially expressed by at least 1.5-fold difference between the two locations, including higher expression of cranial neural crest-related gene BMP-4 in mandibular BMSCs, which was confirmed by real-time RT-PCR. Conclusions Altogether, these data indicate that despite strong similarities in gene expression between mandibular and tibial BMSCs, mandibular BMSCs express some genes differently than tibial BMSCs and have a phenotypic profile that may make them advantageous for craniofacial bone regeneration. PMID:28135571
Clustering of change patterns using Fourier coefficients.
Kim, Jaehee; Kim, Haseong
2008-01-15
To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions, there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. 
The results showed that, as the method clusters probability-neighboring data, model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.
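The feature-extraction step can be sketched in a few lines. This is a minimal illustration rather than the authors' R program: it assumes equally spaced samples on [0, 1), a fixed number of harmonics, and no smoothing, and it stops before the model-based (multivariate normal mixture) clustering step. All function names are hypothetical.

```python
import math


def fourier_coefficients(y, n_harmonics):
    """Sample Fourier coefficients (a_k, b_k), k = 1..n_harmonics, of y.

    y is assumed to be sampled at n equally spaced points t_i = i / n on [0, 1).
    """
    n = len(y)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(y))
        b_k = 2.0 / n * sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(y))
        coeffs.append((a_k, b_k))
    return coeffs


def derivative_features(y, n_harmonics=2):
    """Fourier coefficients of the derivative, flattened into a feature vector.

    Differentiating a_k cos(2 pi k t) + b_k sin(2 pi k t) term by term gives
    cosine coefficient 2 pi k b_k and sine coefficient -2 pi k a_k.
    """
    features = []
    for k, (a_k, b_k) in enumerate(fourier_coefficients(y, n_harmonics), start=1):
        features.extend([2 * math.pi * k * b_k, -2 * math.pi * k * a_k])
    return features


# Two synthetic profiles with the same change pattern but different baseline levels.
n = 16
low = [math.sin(2 * math.pi * i / n) for i in range(n)]
high = [v + 5.0 for v in low]
f_low = derivative_features(low)
f_high = derivative_features(high)
```

Because the constant term drops out under differentiation, the two profiles receive the same feature vector, which is the property that makes derivative coefficients suitable for grouping by change pattern rather than by expression level.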
Groten, Karin; Pahari, Nabin T; Xu, Shuqing; Miloradovic van Doorn, Maja; Baldwin, Ian T
2015-01-01
Most land plants live in a symbiotic association with arbuscular mycorrhizal fungi (AMF) that belong to the phylum Glomeromycota. Although a number of plant genes involved in the plant-AMF interactions have been identified by analyzing mutants, the ability to rapidly manipulate gene expression to study the potential functions of new candidate genes remains unrealized. We analyzed changes in gene expression of wild tobacco roots (Nicotiana attenuata) after infection with mycorrhizal fungi (Rhizophagus irregularis) by serial analysis of gene expression (SuperSAGE) combined with next generation sequencing, and established a virus-induced gene-silencing protocol to study the function of candidate genes in the interaction. From 92,434 SuperSAGE Tag sequences, 32,808 (35%) matched with our in-house Nicotiana attenuata transcriptome database and 3,698 (4%) matched to Rhizophagus genes. In total, 11,194 Tags showed a significant change in expression (p<0.05, >2-fold change) after infection. When comparing the functions of highly up-regulated annotated Tags in this study with those of two previous large-scale gene expression studies, 18 gene functions were found to be up-regulated in all three studies mainly playing roles related to phytohormone metabolism, catabolism and defense. To validate the function of identified candidate genes, we used the technique of virus-induced gene silencing (VIGS) to silence the expression of three putative N. attenuata genes: germin-like protein, indole-3-acetic acid-amido synthetase GH3.9 and, as a proof-of-principle, calcium and calmodulin-dependent protein kinase (CCaMK). The silencing of the three plant genes in roots was successful, but only CCaMK silencing had a significant effect on the interaction with R. irregularis. Interestingly, when a highly activated inoculum was used for plant inoculation, the effect of CCaMK silencing on fungal colonization was masked, probably due to trans-complementation. 
This study demonstrates that large-scale gene expression studies across different species induce a core set of genes of similar functions. However, additional factors seem to influence the overall pattern of gene expression, resulting in high variability among independent studies with different hosts. We conclude that VIGS is a powerful tool with which to investigate the function of genes involved in plant-AMF interactions but that inoculum strength can strongly influence the outcome of the interaction.
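The tag statistics reported in this abstract can be reproduced with a short check, followed by a sketch of the significance filter. The counts come from the abstract; treating ">2-fold change" as covering both up- and down-regulation is an assumption, and the function name is hypothetical.

```python
import math

# Counts reported in the abstract.
total_tags = 92434
matched_plant = 32808    # tags matching the N. attenuata transcriptome
matched_fungus = 3698    # tags matching Rhizophagus genes

pct_plant = round(100 * matched_plant / total_tags)
pct_fungus = round(100 * matched_fungus / total_tags)


def is_significant(p_value, fold_change, alpha=0.05, min_fold=2.0):
    """True if p < alpha and expression changed more than min_fold in either direction.

    fold_change is the positive ratio infected / control.
    """
    return p_value < alpha and abs(math.log2(fold_change)) > math.log2(min_fold)
```

The rounded percentages come out at 35 and 4, matching the values given in the text.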
Asaf, Sajjad; Khan, Abdul Latif; Khan, Abdur Rahim; Waqas, Muhammad; Kang, Sang-Mo; Khan, Muhammad Aaqil; Shahzad, Raheem; Seo, Chang-Woo; Shin, Jae-Ho; Lee, In-Jung
2016-01-01
Oryza minuta (Poaceae family) is a tetraploid wild relative of cultivated rice with a BBCC genome. O. minuta has the potential to resist various diseases and pests such as bacterial blight (BB), white-backed planthopper (WBPH) and brown planthopper (BPH). Here, we sequenced and annotated the complete mitochondrial genome of O. minuta. The mtDNA genome is 515,022 bp, containing 60 protein coding genes, 31 tRNA genes and two rRNA genes. The mitochondrial genome organization and the gene content at the nucleotide level are highly similar (89%) to that of O. rufipogon. Comparison with other related species revealed that most of the genes with known function are conserved among the Poaceae members. Similarly, the O. minuta mt genome shared 24 protein-coding genes, 15 tRNA genes and 1 ribosomal RNA gene with other rice species (indica and japonica). The evolutionary relationship and phylogenetic analysis revealed that O. minuta is more closely related to O. rufipogon than to any other related species. Such studies are essential to understand the evolutionary divergence among species and analyze common gene pools to combat risks in the current scenario of a changing environment.
Unsupervised text mining for assessing and augmenting GWAS results.
Ailem, Melissa; Role, François; Nadif, Mohamed; Demenais, Florence
2016-04-01
Text mining can assist in the analysis and interpretation of large-scale biomedical data, helping biologists to quickly and cheaply gain confirmation of hypothesized relationships between biological entities. We set this question in the context of genome-wide association studies (GWAS), an actively emerging field that has contributed to the identification of many genes associated with multifactorial diseases. These studies make it possible to identify groups of genes associated with the same phenotype, but provide no information about the relationships between these genes. Therefore, our objective is to leverage unsupervised text mining techniques using text-based cosine similarity comparisons and clustering applied to candidate and random gene vectors, in order to augment the GWAS results. We propose a generic framework which we used to characterize the relationships between 10 genes reported as associated with asthma by a previous GWAS. The results of this experiment showed that the similarities between these 10 genes were significantly stronger than would be expected by chance (one-sided p-value<0.01). The clustering of observed and randomly selected genes also made it possible to generate hypotheses about potential functional relationships between these genes and thus contributed to the discovery of new candidate genes for asthma. Copyright © 2016 Elsevier Inc. All rights reserved.
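The comparison described in this abstract, text-based cosine similarity among candidate genes judged against randomly drawn gene sets, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the gene names, the term weights, the number of random draws, and the use of mean pairwise similarity as the test statistic are hypothetical and are not taken from the paper.

```python
import math
import random
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_pairwise_similarity(genes, vectors):
    """Average cosine similarity over all unordered pairs of genes."""
    pairs = list(combinations(genes, 2))
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


def one_sided_p_value(candidates, vectors, n_draws=1000, seed=0):
    """Empirical one-sided p-value: how often a random gene set of the
    same size is at least as mutually similar as the candidate set."""
    rng = random.Random(seed)
    observed = mean_pairwise_similarity(candidates, vectors)
    pool = sorted(vectors)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(pool, len(candidates))
        if mean_pairwise_similarity(sample, vectors) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_draws + 1)


# Hypothetical toy corpus: three candidate genes share vocabulary,
# seven background genes each use one unrelated term.
vectors = {
    "geneA": {"asthma": 3.0, "airway": 2.0},
    "geneB": {"asthma": 2.0, "airway": 1.0, "cytokine": 1.0},
    "geneC": {"asthma": 1.0, "airway": 1.0},
}
vectors.update({f"bg{i}": {f"term{i}": 1.0} for i in range(7)})

candidates = ["geneA", "geneB", "geneC"]
observed, p_value = one_sided_p_value(candidates, vectors)
print(f"mean similarity = {observed:.3f}, empirical p = {p_value:.4f}")
```

In practice the vectors would come from a weighting scheme such as TF-IDF over the literature associated with each gene; the sketch only shows the shape of the randomization test.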
Dorman, J B; Albinder, B; Shroyer, T; Kenyon, C
1995-12-01
Recessive mutations in two genes, daf-2 and age-1, extend the lifespan of Caenorhabditis elegans significantly. The daf-2 gene also regulates formation of an alternative developmental state called the dauer. Here we asked whether these two genes function in the same or different lifespan pathways. We found that the longevity of both age-1 and daf-2 mutants requires the activities of the same two genes, daf-16 and daf-18. In addition, the daf-2(e1370); age-1(hx546) double mutant did not live significantly longer than the daf-2 single mutant. We also found that, like daf-2 mutations, the age-1(hx546) mutation affects certain aspects of dauer formation. These findings suggest that age-1 and daf-2 mutations do act in the same lifespan pathway and extend lifespan by triggering similar if not identical processes.
Dorman, J. B.; Albinder, B.; Shroyer, T.; Kenyon, C.
1995-01-01
Recessive mutations in two genes, daf-2 and age-1, extend the lifespan of Caenorhabditis elegans significantly. The daf-2 gene also regulates formation of an alternative developmental state called the dauer. Here we asked whether these two genes function in the same or different lifespan pathways. We found that the longevity of both age-1 and daf-2 mutants requires the activities of the same two genes, daf-16 and daf-18. In addition, the daf-2(e1370); age-1(hx546) double mutant did not live significantly longer than the daf-2 single mutant. We also found that, like daf-2 mutations, the age-1(hx546) mutation affects certain aspects of dauer formation. These findings suggest that age-1 and daf-2 mutations do act in the same lifespan pathway and extend lifespan by triggering similar if not identical processes. PMID:8601482
Genomic analysis reveals extensive gene duplication within the bovine TRB locus
Connelley, Timothy; Aerts, Jan; Law, Andy; Morrison, W Ivan
2009-01-01
Background Diverse TR and IG repertoires are generated by V(D)J somatic recombination. Genomic studies have been pivotal in cataloguing the V, D, J and C genes present in the various TR/IG loci and describing how duplication events have expanded the number of these genes. Such studies have also provided insights into the evolution of these loci and the complex mechanisms that regulate TR/IG expression. In this study we analyze the sequence of the third bovine genome assembly to characterize the germline repertoire of bovine TRB genes and compare the organization, evolution and regulatory structure of the bovine TRB locus with that of humans and mice. Results The TRB locus in the third bovine genome assembly is distributed over 5 scaffolds, extending to ~730 Kb. The available sequence contains 134 TRBV genes, assigned to 24 subgroups, and 3 clusters of DJC genes, each comprising a single TRBD gene, 5–7 TRBJ genes and a single TRBC gene. Seventy-nine of the TRBV genes are predicted to be functional. Comparison with the human and murine TRB loci shows that the gene order, as well as the sequences of non-coding elements that regulate TRB expression, are highly conserved in the bovine. Dot-plot analyses demonstrate that expansion of the genomic TRBV repertoire has occurred via a complex and extensive series of duplications, predominantly involving DNA blocks containing multiple genes. These duplication events have resulted in massive expansion of several TRBV subgroups, most notably TRBV6, 9 and 21, which contain 40, 35 and 16 members respectively. Similarly, duplication has led to the generation of a third DJC cluster. Analyses of cDNA data confirm the diversity of the TRBV genes and, in addition, identify a substantial number of TRBV genes, predominantly from the larger subgroups, which are still absent from the genome assembly. 
The observed gene duplication within the bovine TRB locus has created a repertoire of phylogenetically diverse functional TRBV genes, which is substantially larger than that described for humans and mice. Conclusion The analyses completed in this study reveal that, although the gene content and organization of the bovine TRB locus are broadly similar to that of humans and mice, multiple duplication events have led to a marked expansion in the number of TRB genes. Similar expansions in other ruminant TR loci suggest strong evolutionary pressures in this lineage have selected for the development of enlarged sets of TR genes that can contribute to diverse TR repertoires. PMID:19393068
hSMR3A as a Marker for Patients With Erectile Dysfunction
Tong, Yuehong; Tar, Moses; Monrose, Val; DiSanto, Michael; Melman, Arnold; Davies, Kelvin P.
2007-01-01
Purpose We recently reported that Vcsa1 is one of the most down-regulated genes in the corpora of rats in 3 distinct models of erectile dysfunction. Since gene transfer of plasmids expressing Vcsa1 or intracorporeal injection of its mature peptide product sialorphin into the corpora of aging rats was shown to restore erectile function, we proposed that the Vcsa1 gene has a direct role in erectile function. To determine if similar changes in gene expression occur in the corpora of human subjects with erectile dysfunction we identified a human homologue of Vcsa1 (hSMR3A) and determined the level of expression of hSMR3A in patients. Materials and Methods hSMR3A was identified as a homologue of Vcsa1 by searching protein databases for proteins with similarity. hSMR3A cDNA was generated and subcloned into the plasmid pVAX to generate pVAX-hSMR3A. pVAX-hSMR3A (25 or 100 μg) was intracorporeally injected into aging rats. The effect on erectile physiology was compared histologically and by measuring intracorporeal pressure/blood pressure with controls treated with the empty plasmid pVAX. Total RNA was extracted from human corporeal tissue obtained from patients undergoing previously scheduled penile surgery. Patients were grouped according to normal erectile function (3), erectile dysfunction and diabetes (5) and patients without diabetes but with erectile dysfunction (5). Quantitative reverse-transcriptase polymerase chain reaction was used to determine the hSMR3A expression level. Results Intracorporeal injection of 25 μg pVAX-hSMR3A was able to significantly increase the intracorporeal pressure-to-blood pressure ratio in aging rats compared to age matched controls. Higher amounts (100 μg) of gene transfer of the plasmid caused less of an improvement in the intracorporeal pressure-to-blood pressure ratio compared to controls, although there was histological and visual evidence that the animals were post-priapitic. 
These physiological effects were similar to previously reported effects of intracorporeal injection of pVAX-Vcsa1 into the corpora of aging rats, establishing hSMR3A as a functional homologue of Vcsa1. More than 10-fold down-regulation in hSMR3A transcript expression was observed in the corpora of patients with vs without erectile dysfunction. In patients with diabetes associated and nondiabetes associated erectile dysfunction hSMR3A expression was found to be down-regulated. Conclusions These results suggest that hSMR3A can act as a marker for erectile dysfunction associated with diabetic and nondiabetic etiologies. Given that our previous studies demonstrated that gene transfer of the Vcsa1 gene and intracorporeal injection of its protein product in rats can restore erectile function, these results suggest that therapies that increase the hSMR3A gene and product expression could potentially have a positive impact on erectile function. PMID:17512016
hSMR3A as a marker for patients with erectile dysfunction.
Tong, Yuehong; Tar, Moses; Monrose, Val; DiSanto, Michael; Melman, Arnold; Davies, Kelvin P
2007-07-01
We recently reported that Vcsa1 is one of the most down-regulated genes in the corpora of rats in 3 distinct models of erectile dysfunction. Since gene transfer of plasmids expressing Vcsa1 or intracorporeal injection of its mature peptide product sialorphin into the corpora of aging rats was shown to restore erectile function, we proposed that the Vcsa1 gene has a direct role in erectile function. To determine if similar changes in gene expression occur in the corpora of human subjects with erectile dysfunction we identified a human homologue of Vcsa1 (hSMR3A) and determined the level of expression of hSMR3A in patients. hSMR3A was identified as a homologue of Vcsa1 by searching protein databases for proteins with similarity. hSMR3A cDNA was generated and subcloned into the plasmid pVAX to generate pVAX-hSMR3A. pVAX-hSMR3A (25 or 100 microg) was intracorporeally injected into aging rats. The effect on erectile physiology was compared histologically and by measuring intracorporeal pressure/blood pressure with controls treated with the empty plasmid pVAX. Total RNA was extracted from human corporeal tissue obtained from patients undergoing previously scheduled penile surgery. Patients were grouped according to normal erectile function (3), erectile dysfunction and diabetes (5) and patients without diabetes but with erectile dysfunction (5). Quantitative reverse-transcriptase polymerase chain reaction was used to determine the hSMR3A expression level. Intracorporeal injection of 25 microg pVAX-hSMR3A was able to significantly increase the intracorporeal pressure-to-blood pressure ratio in aging rats compared to age matched controls. Higher amounts (100 microg) of gene transfer of the plasmid caused less of an improvement in the intracorporeal pressure-to-blood pressure ratio compared to controls, although there was histological and visual evidence that the animals were post-priapitic. 
These physiological effects were similar to previously reported effects of intracorporeal injection of pVAX-Vcsa1 into the corpora of aging rats, establishing hSMR3A as a functional homologue of Vcsa1. More than 10-fold down-regulation in hSMR3A transcript expression was observed in the corpora of patients with vs without erectile dysfunction. In patients with diabetes associated and nondiabetes associated erectile dysfunction hSMR3A expression was found to be down-regulated. These results suggest that hSMR3A can act as a marker for erectile dysfunction associated with diabetic and nondiabetic etiologies. Given that our previous studies demonstrated that gene transfer of the Vcsa1 gene and intracorporeal injection of its protein product in rats can restore erectile function, these results suggest that therapies that increase the hSMR3A gene and product expression could potentially have a positive impact on erectile function.
Construction of ontology augmented networks for protein complex prediction.
Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian
2013-01-01
Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
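One way to picture a distance that unifies network topology with gene ontology annotation similarity is sketched below in Python. This is not the measure defined by the authors: the Jaccard overlap of neighborhoods, the Jaccard overlap of annotation sets, the mixing weight `alpha`, and all protein and term identifiers are assumptions chosen only to make the idea concrete.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def augmented_distance(p, q, neighbors, go_terms, alpha=0.5):
    """Distance in [0, 1] mixing topological and annotation similarity.

    alpha weights the topology term; (1 - alpha) weights the GO term.
    Both the mixing rule and the default weight are illustrative.
    """
    # Closed neighborhoods (the protein plus its interaction partners).
    topo = jaccard(neighbors[p] | {p}, neighbors[q] | {q})
    onto = jaccard(go_terms[p], go_terms[q])
    return 1.0 - (alpha * topo + (1.0 - alpha) * onto)


# Hypothetical interaction network: P1-P3 form a triangle, P4-P5 a pair.
neighbors = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P5"},
    "P5": {"P4"},
}
# Placeholder annotation labels, not real GO identifiers.
go_terms = {
    "P1": {"term_a", "term_b"},
    "P2": {"term_a", "term_b"},
    "P3": {"term_a"},
    "P4": {"term_c"},
    "P5": {"term_c"},
}

for pair in [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]:
    print(pair, augmented_distance(*pair, neighbors, go_terms))
```

A clustering step could then group proteins whose pairwise distances fall below a chosen cutoff; the choice of clustering algorithm is left open here.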
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kozbial, Piotr; Xu, Qingping; Chiu, Hsiu-Ju
2009-08-28
To extend the structural coverage of proteins with unknown functions, we targeted a novel protein family (Pfam accession number PF08807, DUF1798) for which we proposed and determined the structures of two representative members. The MW1337R gene of Staphylococcus aureus subsp. aureus Rosenbach (Wood 46) encodes a protein with a molecular weight of 13.8 kDa (residues 1-116) and a calculated isoelectric point of 5.15. The lin2004 gene of the nonspore-forming bacterium Listeria innocua Clip11262 encodes a protein with a molecular weight of 14.6 kDa (residues 1-121) and a calculated isoelectric point of 5.45. MW1337R and lin2004, as well as their homologs, which, so far, have been found only in Bacillus, Staphylococcus, Listeria, and related genera (Geobacillus, Exiguobacterium, and Oceanobacillus), have unknown functions and are annotated as hypothetical proteins. The genomic contexts of MW1337R and lin2004 are similar and conserved in related species. In prokaryotic genomes, most often, functionally interacting proteins are coded by genes, which are colocated in conserved operons. Proteins from the same operon as MW1337R and lin2004 either have unknown functions (i.e., belong to DUF1273, Pfam accession number PF06908) or are similar to ypsB from Bacillus subtilis. The function of ypsB is unclear, although it has a strong similarity to the N-terminal region of DivIVA, which was characterized as a bifunctional protein with distinct roles during vegetative growth and sporulation. In addition, members of the DUF1273 family display distant sequence similarity with the DprA/Smf protein, which acts downstream of the DNA uptake machinery, possibly in conjunction with RecA. The RecA activities in Bacillus subtilis are modulated by RecU Holliday-junction resolvase. In all analyzed cases, the gene coding for RecU is in the vicinity of MW1337R, lin2004, or their orthologs, but on a different operon located in the complementary DNA strand. 
Here, we report the crystal structures of MW1337R and lin2004, which were determined using the semiautomated, high-throughput pipeline of the Joint Center for Structural Genomics (JCSG), part of the National Institute of General Medical Sciences Protein Structure Initiative.
Nallapareddy, Sreedhar R; Weinstock, George M; Murray, Barbara E
2003-03-01
A collagen-binding adhesin of Enterococcus faecium, Acm, was identified. Acm shows 62% similarity to the Staphylococcus aureus collagen adhesin Cna over the entire protein and is more similar to Cna (60% and 75% similarity with Cna A and B domains respectively) than to the Enterococcus faecalis collagen-binding adhesin, Ace, which shares homology with Acm only in the A domain. Despite the detection of acm in 32 out of 32 E. faecium isolates, only 11 of these (all clinical isolates, including four vancomycin-resistant endocarditis isolates and seven other isolates) exhibited binding to collagen type I (CI). Although acm from three CI-binding vancomycin-resistant E. faecium clinical isolates showed 100% identity, analysis of acm genes and their promoter regions from six non-CI-binding strains identified deletions or mutations that introduced stop codons and/or IS elements within the gene or the promoter region in five out of six strains, suggesting that the presence of an intact functional acm gene is necessary for binding of E. faecium strains to CI. Recombinant Acm A domain showed specific and concentration-dependent binding to collagen, and this protein competed with E. faecium binding to immobilized CI. Consistent with the adherence phenotype and sequence data, probing with Acm-specific IgGs purified from anti-recombinant Acm A polyclonal rabbit serum confirmed the surface expression of Acm in three out of three collagen-binding clinical isolates of E. faecium tested, but in none of the strains with a non-functional pseudo acm gene. Introduction of a functional acm gene into two non-CI-binding natural acm mutant strains conferred a CI-binding phenotype, further confirming that native Acm is sufficient for the binding of E. faecium to CI. These results demonstrate that acm, which encodes a potential virulence factor, is functional only in certain infection-derived clinical isolates of E. faecium, and suggest that Acm is the primary adhesin responsible for the ability of E. faecium to bind collagen.
Kawai, Mikihiko; Futagami, Taiki; Toyoda, Atsushi; Takaki, Yoshihiro; Nishi, Shinro; Hori, Sayaka; Arai, Wataru; Tsubouchi, Taishi; Morono, Yuki; Uchiyama, Ikuo; Ito, Takehiko; Fujiyama, Asao; Inagaki, Fumio; Takami, Hideto
2014-01-01
Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundred meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA remains to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5, and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined the frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB genes were also widely observed in the metagenomic pools, although their frequency was generally lower than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere. PMID:24624126
Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko
2012-07-15
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or by neighboring positions within the immediate vicinity of a chromosomal region. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. 
Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
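The core comparison in this abstract, Pearson correlation between chromosomal neighbors versus randomly chosen gene pairs, can be sketched in a few lines of Python. The sketch is a simplification under stated assumptions: the gene names and expression values are invented, and the FDR-based significance test used in the study is replaced by a fixed correlation cutoff that a real analysis would derive from the random-pair distribution.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def random_pair_correlations(order, expr, n_pairs, seed=0):
    """Null distribution: correlations of randomly drawn gene pairs."""
    rng = random.Random(seed)
    pairs = (rng.sample(order, 2) for _ in range(n_pairs))
    return [pearson(expr[a], expr[b]) for a, b in pairs]


def coexpressed_runs(order, expr, cutoff, min_genes=3):
    """Runs of adjacent genes whose neighbor correlation meets the cutoff."""
    runs, current = [], [order[0]]
    for prev, gene in zip(order, order[1:]):
        if pearson(expr[prev], expr[gene]) >= cutoff:
            current.append(gene)
        else:
            if len(current) >= min_genes:
                runs.append(current)
            current = [gene]
    if len(current) >= min_genes:
        runs.append(current)
    return runs


# Hypothetical genes in chromosomal order with toy expression profiles.
order = ["g1", "g2", "g3", "g4", "g5", "g6"]
expr = {
    "g1": [1, 5, 2, 8, 3],
    "g2": [1, 2, 3, 4, 5],
    "g3": [2, 4, 6, 8, 10],
    "g4": [1.1, 2.0, 3.2, 3.9, 5.0],
    "g5": [5, 1, 4, 2, 3],
    "g6": [3, 3, 1, 5, 2],
}

null = random_pair_correlations(order, expr, n_pairs=200)
clusters = coexpressed_runs(order, expr, cutoff=0.9)
print("candidate clusters:", clusters)
print("mean random-pair correlation:", sum(null) / len(null))
```

The minimum run length of three genes mirrors the smallest cluster size reported in the abstract; the cutoff of 0.9 is arbitrary.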
Kervadec, Anaïs; Bellamy, Valérie; El Harane, Nadia; Arakélian, Lousineh; Vanneaux, Valérie; Cacciapuoti, Isabelle; Nemetalla, Hany; Périer, Marie-Cécile; Toeg, Hadi D; Richart, Adèle; Lemitre, Mathilde; Yin, Min; Loyer, Xavier; Larghero, Jérôme; Hagège, Albert; Ruel, Marc; Boulanger, Chantal M; Silvestre, Jean-Sébastien; Menasché, Philippe; Renault, Nisa K E
2016-06-01
Cell-based therapies are being explored as a therapeutic option for patients with chronic heart failure following myocardial infarction. Extracellular vesicles (EV), including exosomes and microparticles, secreted by transplanted cells may orchestrate their paracrine therapeutic effects. We assessed whether post-infarction administration of EV released by human embryonic stem cell-derived cardiovascular progenitors (hESC-Pg) can provide equivalent benefits to administered hESC-Pg and whether hESC-Pg and EV treatments activate similar endogenous pathways. Mice underwent surgical occlusion of their left coronary arteries. After 2-3 weeks, 95 mice included in the study were treated with hESC-Pg, EV, or Minimal Essential Medium Alpha Medium (alpha-MEM; vehicle control) delivered by percutaneous injections under echocardiographic guidance into the peri-infarct myocardium. Functional and histologic end-points were blindly assessed 6 weeks later, and hearts were processed for gene profiling. Genes differentially expressed between control hearts and hESC-Pg-treated and EV-treated hearts were clustered into functionally relevant pathways. At 6 weeks after hESC-Pg administration, treated mice had significantly reduced left ventricular end-systolic (-4.20 ± 0.96 µl or -7.5%, p = 0.0007) and end-diastolic (-4.48 ± 1.47 µl or -4.4%, p = 0.009) volumes compared with baseline values despite the absence of any transplanted hESC-Pg or human embryonic stem cell-derived cardiomyocytes in the treated mouse hearts. Equal benefits were seen with the injection of hESC-Pg-derived EV, whereas animals injected with alpha-MEM (vehicle control) did not improve significantly. Histologic examination suggested a slight reduction in infarct size in hESC-Pg-treated animals and EV-treated animals compared with alpha-MEM-treated control animals. In the hESC-Pg-treated and EV-treated groups, heart gene profiling identified 927 genes that were similarly upregulated compared with the control group. 
Among the 49 enriched pathways associated with these up-regulated genes that could be related to cardiac function or regeneration, 78% were predicted to improve cardiac function through increased cell survival and/or proliferation or DNA repair as well as pathways related to decreased fibrosis and heart failure. In this post-infarct heart failure model, either hESC-Pg or their secreted EV enhance recovery of cardiac function and similarly affect cardiac gene expression patterns that could be related to this recovery. Although the mechanisms by which EV improve cardiac function remain to be determined, these results support the idea that a paracrine mechanism is sufficient to effect functional recovery in cell-based therapies for post-infarction-related chronic heart failure. Copyright © 2016 International Society for Heart and Lung Transplantation. Published by Elsevier Inc. All rights reserved.
Hilson, Pierre; Allemeersch, Joke; Altmann, Thomas; Aubourg, Sébastien; Avon, Alexandra; Beynon, Jim; Bhalerao, Rishikesh P.; Bitton, Frédérique; Caboche, Michel; Cannoot, Bernard; Chardakov, Vasil; Cognet-Holliger, Cécile; Colot, Vincent; Crowe, Mark; Darimont, Caroline; Durinck, Steffen; Eickhoff, Holger; de Longevialle, Andéol Falcon; Farmer, Edward E.; Grant, Murray; Kuiper, Martin T.R.; Lehrach, Hans; Léon, Céline; Leyva, Antonio; Lundeberg, Joakim; Lurin, Claire; Moreau, Yves; Nietfeld, Wilfried; Paz-Ares, Javier; Reymond, Philippe; Rouzé, Pierre; Sandberg, Goran; Segura, Maria Dolores; Serizet, Carine; Tabrett, Alexandra; Taconnat, Ludivine; Thareau, Vincent; Van Hummelen, Paul; Vercruysse, Steven; Vuylsteke, Marnik; Weingartner, Magdalena; Weisbeek, Peter J.; Wirta, Valtteri; Wittink, Floyd R.A.; Zabeau, Marc; Small, Ian
2004-01-01
Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, which prompted us to create a collection of gene-specific sequence tags (GSTs) that represent at least 21,500 Arabidopsis genes and are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics. PMID:15489341
NASA Astrophysics Data System (ADS)
Boulter, Jim; Connolly, John; Deneris, Evan; Goldman, Dan; Heinemann, Steven; Patrick, Jim
1987-11-01
A family of genes coding for proteins homologous to the α subunit of the muscle nicotinic acetylcholine receptor has been identified in the rat genome. These genes are transcribed in the central and peripheral nervous systems in areas known to contain functional nicotinic receptors. In this paper, we demonstrate that three of these genes, which we call alpha3, alpha4, and beta2, encode proteins that form functional nicotinic acetylcholine receptors when expressed in Xenopus oocytes. Oocytes expressing either alpha3 or alpha4 protein in combination with the beta2 protein produced a strong response to acetylcholine. Oocytes expressing only the alpha4 protein gave a weak response to acetylcholine. These receptors are activated by acetylcholine and nicotine and are blocked by Bungarus toxin 3.1. They are not blocked by α-bungarotoxin, which blocks the muscle nicotinic acetylcholine receptor. Thus, the receptors formed by the alpha3, alpha4, and beta2 subunits are pharmacologically similar to the ganglionic-type neuronal nicotinic acetylcholine receptor. These results indicate that the alpha3, alpha4, and beta2 genes encode functional nicotinic acetylcholine receptor subunits that are expressed in the brain and peripheral nervous system.
Donakonda, Sainitin; Sinha, Swati; Dighe, Shrinivas Nivrutti; Rao, Manchanahalli R Satyanarayana
2017-07-25
ASCL1 is a basic Helix-Loop-Helix transcription factor (TF), which is involved in various cellular processes like neuronal development and signaling pathways. Transcriptome profiling has shown that ASCL1 overexpression plays an important role in the development of glioma and Small Cell Lung Carcinoma (SCLC), but distinct and common molecular mechanisms regulated by ASCL1 in these cancers are unknown. In order to understand how it drives the cellular functional network in these two tumors, we generated a gene expression profile in a glioma cell line (U87MG) to identify ASCL1 gene targets by an siRNA silencing approach and then compared this with a publicly available dataset of similarly silenced SCLC (NCI-H1618 cells). We constructed TF-TF and gene-gene interactions, as well as protein interaction networks of ASCL1-regulated genes in glioma and SCLC cells. Detailed network analysis uncovered various biological processes governed by ASCL1 target genes in these two tumor cell lines. We find that novel ASCL1 functions related to mitosis and signaling pathways influencing development and tumor growth are affected in both glioma and SCLC cells. In addition, we also observed ASCL1-governed functional networks that are distinct to glioma and SCLC.
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation
Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...
2016-11-24
Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.
González-Plaza, Juan J.; Šimatović, Ana; Milaković, Milena; Bielen, Ana; Wichmann, Fabienne; Udiković-Kolić, Nikolina
2018-01-01
Environments polluted by direct discharges of effluents from antibiotic manufacturing are important reservoirs for antibiotic resistance genes (ARGs), which could potentially be transferred to human pathogens. However, our knowledge about the identity and diversity of ARGs in such polluted environments remains limited. We applied functional metagenomics to explore the resistome of two Croatian antibiotic manufacturing effluents and sediments collected upstream of and at the effluent discharge sites. Metagenomic libraries built from an azithromycin-production site were screened for resistance to macrolide antibiotics, whereas the libraries from a site producing veterinary antibiotics were screened for resistance to sulfonamides, tetracyclines, trimethoprim, and beta-lactams. Functional analysis of eight libraries identified a total of 82 unique, often clinically relevant ARGs, which were frequently found in clusters and flanked by mobile genetic elements. The majority of macrolide resistance genes identified from matrices exposed to high levels of macrolides were similar to known genes encoding ribosomal protection proteins, macrolide phosphotransferases, and transporters. Potentially novel macrolide resistance genes included one most similar to a 23S rRNA methyltransferase from Clostridium and another, derived from upstream unpolluted sediment, to a GTPase HflX from Emergencia. In libraries deriving from sediments exposed to lower levels of veterinary antibiotics, we found 8 potentially novel ARGs, including dihydrofolate reductases and beta-lactamases from classes A, B, and D. In addition, we detected 7 potentially novel ARGs in upstream sediment, including thymidylate synthases, dihydrofolate reductases, and class D beta-lactamase. 
Taken together, in addition to finding known gene types, we report the discovery of novel and diverse ARGs in antibiotic-polluted industrial effluents and sediments, providing a qualitative basis for monitoring the dispersal of ARGs from environmental hotspots such as discharge sites of pharmaceutical effluents. PMID:29387045
Heisig, Julia; Weber, David; Englberger, Eva; Winkler, Anja; Kneitz, Susanne; Sung, Wing-Kin; Wolf, Elmar; Eilers, Martin; Wei, Chia-Lin; Gessler, Manfred
2012-01-01
HEY bHLH transcription factors have been shown to regulate multiple key steps in cardiovascular development. They can be induced by activated NOTCH receptors, but other upstream stimuli mediated by TGFβ and BMP receptors may elicit a similar response. While the basic and helix-loop-helix domains exhibit strong similarity, large parts of the proteins are still unique and may serve divergent functions. The striking overlap of cardiac defects in HEY2 and combined HEY1/HEYL knockout mice suggested that all three HEY genes fulfill overlapping function in target cells. We therefore sought to identify target genes for HEY proteins by microarray expression and ChIPseq analyses in HEK293 cells, cardiomyocytes, and murine hearts. HEY proteins were found to modulate expression of their target gene to a rather limited extent, but with striking functional interchangeability between HEY factors. Chromatin immunoprecipitation revealed a much greater number of potential binding sites that again largely overlap between HEY factors. Binding sites are clustered in the proximal promoter region especially of transcriptional regulators or developmental control genes. Multiple lines of evidence suggest that HEY proteins primarily act as direct transcriptional repressors, while gene activation seems to be due to secondary or indirect effects. Mutagenesis of putative DNA binding residues supports the notion of direct DNA binding. While class B E-box sequences (CACGYG) clearly represent preferred target sequences, there must be additional and more loosely defined modes of DNA binding since many of the target promoters that are efficiently bound by HEY proteins do not contain an E-box motif. These data clearly establish the three HEY bHLH factors as highly redundant transcriptional repressors in vitro and in vivo, which explains the combinatorial action observed in different tissues with overlapping expression. PMID:22615585
Morrow, James M; Lazic, Savo; Dixon Fox, Monica; Kuo, Claire; Schott, Ryan K; de A Gutierrez, Eduardo; Santini, Francesco; Tropepe, Vincent; Chang, Belinda S W
2017-01-15
Rhodopsin (rh1) is the visual pigment expressed in rod photoreceptors of vertebrates that is responsible for initiating the critical first step of dim-light vision. Rhodopsin is usually a single copy gene; however, we previously discovered a novel rhodopsin-like gene expressed in the zebrafish retina, rh1-2, which we identified as a functional photosensitive pigment that binds 11-cis retinal and activates in response to light. Here, we localized expression of rh1-2 in the zebrafish retina to a subset of peripheral photoreceptor cells, which indicates a partially overlapping expression pattern with rh1. We also expressed, purified and characterized Rh1-2, including investigation of the stability of the biologically active intermediate. Using fluorescence spectroscopy, we found the half-life of the rate of retinal release of Rh1-2 following photoactivation to be more similar to that of the visual pigment rhodopsin than to the non-visual pigment exo-rhodopsin (exorh), which releases retinal around 5 times faster. Phylogenetic and molecular evolutionary analyses show that rh1-2 has ancient origins within teleost fishes, is under similar selective pressure to rh1, and likely experienced a burst of positive selection following its duplication and divergence from rh1. These findings indicate that rh1-2 is another functional visual rhodopsin gene, which contradicts the prevailing notion that visual rhodopsin is primarily found as a single copy gene within ray-finned fishes. The reasons for retention of this duplicate gene, as well as possible functional consequences for the visual system, are discussed. © 2017. Published by The Company of Biologists Ltd.
2004-01-01
Flagellar genes Presentb Presentc Presentc Tagatose utilization genes Absent Present Partiald Functional PlcR Absente Presente Presente Mobile genetic...closely related and one that is divergent (Supplementary Fig. S3). dThere are similar tagatose utilization genes in B. cereus ATCC 14579; however, they...replacement responsible for the transport and utilization of the carbohydrate tagatose (BCE1896–BCE1912). The corresponding 5.0 kb region in
Discriminating the reaction types of plant type III polyketide synthases
Shimizu, Yugo; Ogata, Hiroyuki; Goto, Susumu
2017-01-01
Abstract Motivation: Functional prediction of paralogs is challenging in bioinformatics because of rapid functional diversification after gene duplication events combined with parallel acquisitions of similar functions by different paralogs. Plant type III polyketide synthases (PKSs), producing various secondary metabolites, represent a paralogous family that has undergone gene duplication and functional alteration. Currently, there is no computational method available for the functional prediction of type III PKSs. Results: We developed a plant type III PKS reaction predictor, pPAP, based on the recently proposed classification of type III PKSs. pPAP combines two kinds of similarity measures: one calculated by profile hidden Markov models (pHMMs) built from functionally and structurally important partial sequence regions, and the other based on mutual information between residue positions. pPAP targets PKSs acting on ring-type starter substrates, and classifies their functions into four reaction types. The pHMM approach discriminated two reaction types with high accuracy (97.5%, 39/40), but its accuracy decreased when discriminating three reaction types (87.8%, 43/49). When combined with a correlation-based approach, all 49 PKSs were correctly discriminated, and pPAP was still highly accurate (91.4%, 64/70) even after adding other reaction types. These results suggest pPAP, which is based on linear discriminant analyses of similarity measures, is effective for plant type III PKS function prediction. Availability and Implementation: pPAP is freely available at ftp://ftp.genome.jp/pub/tools/ppap/ Contact: goto@kuicr.kyoto-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28334262
Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis
2016-09-02
Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal and could be useful in guiding the choice of phylogenetic markers. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Azarbad, Hamed; Niklińska, Maria; Laskowski, Ryszard; van Straalen, Nico M; van Gestel, Cornelis A M; Zhou, Jizhong; He, Zhili; Wen, Chongqing; Röling, Wilfred F M
2015-01-01
Despite the global importance of forests, it is virtually unknown how their soil microbial communities adapt at the phylogenetic and functional level to long-term metal pollution. Studying 12 sites located along two distinct gradients of metal pollution in Southern Poland revealed that functional potential and diversity (assessed using GeoChip 4.2) were highly similar across the gradients despite drastically diverging metal contamination levels. Metal pollution level did, however, significantly impact bacterial community structure (as shown by MiSeq Illumina sequencing of 16S rRNA genes), but not bacterial taxon richness and community composition. Metal pollution caused changes in the relative abundance of specific bacterial taxa, including Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Firmicutes, Planctomycetes and Proteobacteria. Also, a group of metal-resistance genes showed significant correlations with metal concentrations in soil. Our study showed that microbial communities are resilient to metal pollution; despite differences in community structure, no clear impact of metal pollution levels on overall functional diversity was observed. While screens of phylogenetic marker genes, such as 16S rRNA genes, provide only limited insight into resilience mechanisms, analysis of specific functional genes, e.g. involved in metal resistance, appears to be a more promising strategy. © FEMS 2014. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Accelerated recruitment of new brain development genes into the human genome.
Zhang, Yong E; Landback, Patrick; Vibranovski, Maria D; Long, Manyuan
2011-10-01
How the human brain evolved has attracted tremendous interest for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before the primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain.
Genes under positive selection in a model plant pathogenic fungus, Botrytis.
Aguileta, Gabriela; Lengelle, Juliette; Chiapello, Hélène; Giraud, Tatiana; Viaud, Muriel; Fournier, Elisabeth; Rodolphe, François; Marthey, Sylvain; Ducasse, Aurélie; Gendrault, Annie; Poulain, Julie; Wincker, Patrick; Gout, Lilian
2012-07-01
The rapid evolution of particular genes is essential for the adaptation of pathogens to new hosts and new environments. Powerful methods have been developed for detecting targets of selection in the genome. Here we used divergence data to compare genes among four closely related fungal pathogens adapted to different hosts to elucidate the functions putatively involved in adaptive processes. For this goal, ESTs were sequenced in the specialist fungal pathogens Botrytis tulipae and Botrytis ficariarum, and compared with genome sequences of Botrytis cinerea and Sclerotinia sclerotiorum, responsible for diseases on over 200 plant species. A maximum likelihood-based analysis of 642 predicted orthologs detected 21 genes showing footprints of positive selection. These results were validated by resequencing nine of these genes in additional Botrytis species, showing they have also been rapidly evolving in other related species. Twenty of the 21 genes had not previously been identified as pathogenicity factors in B. cinerea, but some had functions related to plant-fungus interactions. The putative functions were involved in respiratory and energy metabolism, protein and RNA metabolism, signal transduction or virulence, similarly to what was detected in previous studies using the same approach in other pathogens. Mutants of B. cinerea were generated for four of these genes as a first attempt to elucidate their functions. Copyright © 2012 Elsevier B.V. All rights reserved.
Li, Tao; Hu, Ya-Jun; Hao, Zhi-Peng; Li, Hong; Chen, Bao-Dong
2013-05-01
Arbuscular mycorrhizal (AM) symbiosis, established between AM fungi (AMF) and roots of higher plants, occurs in most terrestrial ecosystems. It has been well demonstrated that AM symbiosis can improve plant performance under various environmental stresses, including drought stress. However, the molecular basis for the direct involvement of AMF in plant drought tolerance has not yet been established. Most recently, we cloned two functional aquaporin genes, GintAQPF1 and GintAQPF2, from the AM fungus Glomus intraradices. By heterologous gene expression in yeast, aquaporin localization, activities and water permeability were examined. Gene expression during symbiosis under drought stress was also analyzed. Our data strongly supported potential water transport via AMF to host plants. As a complement, here we adopted the monoxenic culture system for AMF, in which carrot roots transformed by Ri-T DNA were cultured with Glomus intraradices in two-compartment Petri dishes, to verify the aquaporin gene functions in assisting AMF survival under polyethylene glycol (PEG) treatment. Our results showed that 25% PEG significantly upregulated the expression of two aquaporin genes, which was in line with the gene functions examined in yeast. We therefore concluded that the aquaporins function similarly in AMF as in yeast subjected to osmotic stress. The study provided further evidence for the direct involvement of AMF in improving plant water relations under drought stress.
Pseudogenization of a Sweet-Receptor Gene Accounts for Cats' Indifference toward Sugar
Li, Xia; Li, Weihua; Wang, Hong; Cao, Jie; Maehashi, Kenji; Huang, Liquan; Bachmanov, Alexander A; Reed, Danielle R; Legrand-Defretin, Véronique; Beauchamp, Gary K; Brand, Joseph G
2005-01-01
Although domestic cats (Felis silvestris catus) possess an otherwise functional sense of taste, they, unlike most mammals, do not prefer and may be unable to detect the sweetness of sugars. One possible explanation for this behavior is that cats lack the sensory system to taste sugars and therefore are indifferent to them. Drawing on work in mice, demonstrating that alleles of sweet-receptor genes predict low sugar intake, we examined the possibility that genes involved in the initial transduction of sweet perception might account for the indifference to sweet-tasting foods by cats. We characterized the sweet-receptor genes of domestic cats as well as those of other members of the Felidae family of obligate carnivores, tiger and cheetah. Because the mammalian sweet-taste receptor is formed by the dimerization of two proteins (T1R2 and T1R3; gene symbols Tas1r2 and Tas1r3), we identified and sequenced both genes in the cat by screening a feline genomic BAC library and by performing PCR with degenerate primers on cat genomic DNA. Gene expression was assessed by RT-PCR of taste tissue, in situ hybridization, and immunohistochemistry. The cat Tas1r3 gene shows high sequence similarity with functional Tas1r3 genes of other species. Message from Tas1r3 was detected by RT-PCR of taste tissue. In situ hybridization and immunohistochemical studies demonstrate that Tas1r3 is expressed, as expected, in taste buds. However, the cat Tas1r2 gene shows a 247-base pair microdeletion in exon 3 and stop codons in exons 4 and 6. There was no evidence of detectable mRNA from cat Tas1r2 by RT-PCR or in situ hybridization, and no evidence of protein expression by immunohistochemistry. Tas1r2 in tiger and cheetah and in six healthy adult domestic cats all show the similar deletion and stop codons. We conclude that cat Tas1r3 is an apparently functional and expressed receptor but that cat Tas1r2 is an unexpressed pseudogene. 
A functional sweet-taste receptor heteromer cannot form, and thus the cat lacks the receptor likely necessary for detection of sweet stimuli. This molecular change was very likely an important event in the evolution of the cat's carnivorous behavior. PMID:16103917
DOSE: an R/Bioconductor package for disease ontology semantic and enrichment analysis.
Yu, Guangchuang; Wang, Li-Gen; Yan, Guang-Rong; He, Qing-Yu
2015-02-15
Disease ontology (DO) annotates human genes in the context of disease. DO is an important annotation in translating molecular findings from high-throughput data to clinical relevance. DOSE is an R package providing semantic similarity computations among DO terms and genes, which allows biologists to explore the similarities of diseases and of gene functions from a disease perspective. Enrichment analyses including the hypergeometric model and gene set enrichment analysis are also implemented to support discovering disease associations of high-throughput biological data. This allows biologists to verify disease relevance in a biological experiment and identify unexpected disease associations. Comparison among gene clusters is also supported. DOSE is released under Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/DOSE.html). Supplementary data are available at Bioinformatics online. Contact: gcyu@connect.hku.hk or tqyhe@jnu.edu.cn. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
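The hypergeometric enrichment model mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the DOSE implementation or its API (DOSE is an R package); the function name and parameters below are hypothetical and chosen only for illustration. The sketch assumes the usual over-representation setup: a background of `population` genes, of which `annotated` carry a given term, and a gene list of size `sample` containing `overlap` annotated genes.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                sample: int, overlap: int) -> float:
    """Upper-tail probability P(X >= overlap) for a hypergeometric X.

    Hypothetical helper, not part of any published package:
      population: number of genes in the background set
      annotated:  background genes annotated with the term of interest
      sample:     size of the gene list being tested
      overlap:    genes in the list that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    lowest = max(overlap, 0)          # the overlap cannot be negative
    highest = min(annotated, sample)  # largest overlap that is possible
    total = comb(population, sample)  # all equally likely gene lists
    # math.comb returns 0 when k > n, so impossible overlaps add nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lowest, highest + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 4 annotated, list of 3 with 2 hits.
    print(hypergeom_enrichment_pvalue(10, 4, 3, 2))
```

A small p-value suggests the term is over-represented in the gene list relative to the background. In practice such p-values are usually adjusted for multiple testing across terms, which this sketch leaves out.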
Two rapidly evolving genes contribute to male fitness in Drosophila
Reinhardt, Josephine A; Jones, Corbin D
2013-01-01
Purifying selection often results in conservation of gene sequence and function. The most functionally conserved genes are also thought to be among the most biologically essential. These observations have led to the use of sequence conservation as a proxy for functional conservation. Here we describe two genes that are exceptions to this pattern. We show that lack of sequence conservation among orthologs of CG15460 and CG15323 – herein named jean-baptiste (jb) and karr respectively – does not necessarily predict lack of functional conservation. These two Drosophila melanogaster genes are among the most rapidly evolving protein-coding genes in this species, being nearly as diverged from their D. yakuba orthologs as random sequences are. jb and karr are both expressed at an elevated level in larval males and adult testes, but they are not accessory gland proteins and their loss does not affect male fertility. Instead, knockdown of these genes in D. melanogaster via RNA interference caused male-biased viability defects. These viability effects occur prior to the third instar for jb and during late pupation for karr. We show that putative orthologs to jb and karr are also expressed strongly in the testes of other Drosophila species and have similar gene structure across species despite low levels of sequence conservation. While standard molecular evolution tests could not reject neutrality, other data hint at a role for natural selection. Together these data provide a clear case where a lack of sequence conservation does not imply a lack of conservation of expression or function. PMID:24221639
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong
2014-10-16
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile functions in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
Omics analysis of mouse brain models of human diseases.
Paban, Véronique; Loriod, Béatrice; Villard, Claude; Buee, Luc; Blum, David; Pietropaolo, Susanna; Cho, Yoon H; Gory-Faure, Sylvie; Mansour, Elodie; Gharbi, Ali; Alescio-Lautier, Béatrice
2017-02-05
The identification of common gene/protein profiles related to brain alterations, if they exist, may indicate the convergence of the pathogenic mechanisms driving brain disorders. Six genetically engineered mouse lines modelling neurodegenerative diseases and neuropsychiatric disorders were considered. Omics approaches, including transcriptomic and proteomic methods, were used. The gene/protein lists were used for inter-disease comparisons and further functional and network investigations. When the inter-disease comparison was performed using the gene symbol identifiers, the number of genes/proteins involved in multiple diseases decreased rapidly. Thus, no genes/proteins were shared by all 6 mouse models. Only one gene/protein (Gfap) was shared among 4 disorders, providing strong evidence that a common molecular signature does not exist among brain diseases. The inter-disease comparison of functional processes showed the involvement of a few major biological processes indicating that brain diseases of diverse aetiologies might utilize common biological pathways in the nervous system, without necessarily involving similar molecules.
Generation of transgenic mouse model using PTTG as an oncogene.
Kakar, Sham S; Kakar, Cohin
2015-01-01
The close physiological similarity between the mouse and human has provided tools for understanding the biological function of particular genes in vivo by introduction or deletion of a gene of interest. Using a mouse as a model has provided a wealth of resources, knowledge, and technology, helping scientists to understand the biological functions, translocation, trafficking, and interaction of a candidate gene with other intracellular molecules, transcriptional regulation, posttranslational modification, and discovery of novel signaling pathways for a particular gene. Most importantly, the generation of the mouse model for a specific human disease has provided a powerful tool to understand the etiology of a disease and to discover novel therapeutics. This chapter describes in detail the step-by-step generation of the transgenic mouse model, which can be helpful in guiding new investigators in developing successful models. For practical purposes, we will describe the generation of a mouse model using pituitary tumor transforming gene (PTTG) as the candidate gene of interest.
Systematic bacterialization of yeast genes identifies a near-universally swappable pathway
Kachroo, Aashiq H; Laurent, Jon M; Akhmetov, Azat; Szilagyi-Jones, Madelyn; McWhite, Claire D; Zhao, Alice; Marcotte, Edward M
2017-01-01
Eukaryotes and prokaryotes last shared a common ancestor ~2 billion years ago, and while many present-day genes in these lineages predate this divergence, the extent to which these genes still perform their ancestral functions is largely unknown. To test principles governing retention of ancient function, we asked if prokaryotic genes could replace their essential eukaryotic orthologs. We systematically replaced essential genes in yeast by their 1:1 orthologs from Escherichia coli. After accounting for mitochondrial localization and alternative start codons, 31 out of 51 bacterial genes tested (61%) could complement a lethal growth defect and replace their yeast orthologs with minimal effects on growth rate. Replaceability was determined on a pathway-by-pathway basis; codon usage, abundance, and sequence similarity contributed predictive power. The heme biosynthesis pathway was particularly amenable to inter-kingdom exchange, with each yeast enzyme replaceable by its bacterial, human, or plant ortholog, suggesting it as a near-universally swappable pathway. DOI: http://dx.doi.org/10.7554/eLife.25093.001 PMID:28661399
Matus, José Tomás; Aquea, Felipe; Arce-Johnson, Patricio
2008-01-01
Background The MYB superfamily constitutes the most abundant group of transcription factors described in plants. Members control processes such as epidermal cell differentiation, stomatal aperture, flavonoid synthesis, cold and drought tolerance and pathogen resistance. No genome-wide characterization of this family has been conducted in a woody species such as grapevine. In addition, previous analysis of the recently released grape genome sequence suggested expansion events of several gene families involved in wine quality. Results We describe and classify 108 members of the grape R2R3 MYB gene subfamily in terms of their genomic gene structures and similarity to their putative Arabidopsis thaliana orthologues. Seven gene models were derived and analyzed in terms of gene expression and their DNA binding domain structures. Despite low overall sequence homology in the C-terminus of all proteins, even in those with similar functions across Arabidopsis and Vitis, highly conserved motif sequences and exon lengths were found. The grape epidermal cell fate clade is expanded when compared with the Arabidopsis and rice MYB subfamilies. Two anthocyanin MYBA related clusters were identified in chromosomes 2 and 14, one of which includes the previously described grape colour locus. Tannin related loci were also detected with eight candidate homologues in chromosomes 4, 9 and 11. Conclusion This genome wide transcription factor analysis in Vitis suggests that clade-specific grape R2R3 MYB genes are expanded while other MYB genes could be well conserved compared to Arabidopsis. MYB gene abundance, homology and orientation within particular loci also suggests that expanded MYB clades conferring quality attributes of grapes and wines, such as colour and astringency, could possess redundant, overlapping and cooperative functions. PMID:18647406
Gonzales, Bianca; Yang, Hushan; Henning, Dale; Valdez, Benigno C
2005-10-10
Treacher Collins syndrome (TCS) is an autosomal dominant disorder of craniofacial development caused by mutations in the TCOF1 gene, which encodes the nucleolar phosphoprotein treacle. We previously reported a function for mammalian treacle in ribosomal DNA gene transcription by its interaction with upstream binding factor. As an initial step in the development of a frog model of TCS, the cDNA that encodes Xenopus laevis treacle was cloned. Although the derived amino acid sequence shows poor homology with its mammalian orthologues, Xenopus treacle has 11 highly homologous direct repeats near the center of the protein molecule similar to those present in its human, dog and mouse orthologues. Comparison of their amino acid compositions indicates conservation of predominant specific amino acid residues. Antisense-mediated down-regulation of treacle expression in X. laevis oocytes resulted in inhibition of rDNA gene transcription. The results suggest evolutionary conservation of the function of treacle in ribosomal RNA biogenesis in higher eukaryotes.
Ponce-Toledo, Rafael I; Moreira, David; López-García, Purificación; Deschamps, Philippe
2018-06-19
Endosymbiosis has been common all along eukaryotic evolution, providing opportunities for genomic and organellar innovation. Plastids are a prominent example. After the primary endosymbiosis of the cyanobacterial plastid ancestor, photosynthesis spread in many eukaryotic lineages via secondary endosymbioses involving red or green algal endosymbionts and diverse heterotrophic hosts. However, the number of secondary endosymbioses and how they occurred remain poorly understood. In particular, contrasting patterns of endosymbiotic gene transfer (EGT) have been detected and subjected to various interpretations. In this context, accurate detection of EGTs is essential to avoid wrong evolutionary conclusions. We have assembled a strictly selected set of markers that provides robust phylogenomic evidence suggesting that nuclear genes involved in the function and maintenance of green secondary plastids in chlorarachniophytes and euglenids have unexpected mixed red and green algal origins. This mixed ancestry contrasts with the clear red algal origin of most nuclear genes carrying similar functions in secondary algae with red plastids.
Evolving phenotypic networks in silico.
François, Paul
2014-11-01
Evolved gene networks are constrained by natural selection. Their structures and functions are consequently far from being random, as exemplified by the multiple instances of parallel/convergent evolution. One can thus ask if features of actual gene networks can be recovered from evolutionary first principles. I review a method for in silico evolution of small models of gene networks aiming at performing predefined biological functions. I summarize the current implementation of the algorithm, insisting on the construction of a proper "fitness" function. I illustrate the approach on three examples: biochemical adaptation, ligand discrimination and vertebrate segmentation (somitogenesis). While the structure of the evolved networks is variable, dynamics of our evolved networks are usually constrained and present many similar features to actual gene networks, including properties that were not explicitly selected for. In silico evolution can thus be used to predict biological behaviours without a detailed knowledge of the mapping between genotype and phenotype.
Functional gene diversity of soil microbial communities from five oil-contaminated fields in China.
Liang, Yuting; Van Nostrand, Joy D; Deng, Ye; He, Zhili; Wu, Liyou; Zhang, Xu; Li, Guanghe; Zhou, Jizhong
2011-03-01
To compare microbial functional diversity in different oil-contaminated fields and to determine the effects of oil contaminants and environmental factors, soil samples were taken from typical oil-contaminated fields located in five geographic regions of China. GeoChip, a high-throughput functional gene array, was used to evaluate the microbial functional genes involved in contaminant degradation and in other major biogeochemical/metabolic processes. Our results indicated that the overall microbial community structures were distinct in each oil-contaminated field, and samples were clustered by geographic locations. The organic contaminant degradation genes were most abundant in all samples and presented a similar pattern under oil contaminant stress among the five fields. In addition, alkane and aromatic hydrocarbon degradation genes such as monooxygenase and dioxygenase were detected in high abundance in the oil-contaminated fields. Canonical correspondence analysis indicated that the microbial functional patterns were highly correlated to the local environmental variables, such as oil contaminant concentration, nitrogen and phosphorus contents, salt and pH. Finally, a total of 59% of microbial community variation from GeoChip data can be explained by oil contamination, geographic location and soil geochemical parameters. This study provided insights into the in situ microbial functional structures in oil-contaminated fields and discerned the linkages between microbial communities and environmental variables, which is important to the application of bioremediation in oil-contaminated sites.
Functional gene diversity of soil microbial communities from five oil-contaminated fields in China
Liang, Yuting; Van Nostrand, Joy D; Deng, Ye; He, Zhili; Wu, Liyou; Zhang, Xu; Li, Guanghe; Zhou, Jizhong
2011-01-01
To compare microbial functional diversity in different oil-contaminated fields and to determine the effects of oil contaminants and environmental factors, soil samples were taken from typical oil-contaminated fields located in five geographic regions of China. GeoChip, a high-throughput functional gene array, was used to evaluate the microbial functional genes involved in contaminant degradation and in other major biogeochemical/metabolic processes. Our results indicated that the overall microbial community structures were distinct in each oil-contaminated field, and samples were clustered by geographic locations. The organic contaminant degradation genes were most abundant in all samples and presented a similar pattern under oil contaminant stress among the five fields. In addition, alkane and aromatic hydrocarbon degradation genes such as monooxygenase and dioxygenase were detected in high abundance in the oil-contaminated fields. Canonical correspondence analysis indicated that the microbial functional patterns were highly correlated to the local environmental variables, such as oil contaminant concentration, nitrogen and phosphorus contents, salt and pH. Finally, a total of 59% of microbial community variation from GeoChip data can be explained by oil contamination, geographic location and soil geochemical parameters. This study provided insights into the in situ microbial functional structures in oil-contaminated fields and discerned the linkages between microbial communities and environmental variables, which is important to the application of bioremediation in oil-contaminated sites. PMID:20861922
Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.
Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S
2017-11-25
Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Notably, in most cases such co-localization does not originate from local duplication events.
Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns. This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).
2014-01-01
Background Sho-saiko-to (SST) (also known as so-shi-ho-tang or xiao-chai-hu-tang) has been widely prescribed for chronic liver diseases in traditional Oriental medicine. Despite the substantial amount of clinical evidence for SST, its molecular mechanism has not been clearly identified at a genome-wide level. Methods By using a microarray, we analyzed the temporal changes of messenger RNA (mRNA) and microRNA expression in primary mouse hepatocytes after SST treatment. The pattern of genes regulated by SST was identified by using time-series microarray analysis. The biological function of genes was measured by pathway analysis. For the identification of the exact targets of the microRNAs, a permutation-based correlation method was implemented in which the temporal expression of mRNAs and microRNAs were integrated. The similarity of the promoter structure between temporally regulated genes was measured by analyzing the transcription factor binding sites in the promoter region. Results The SST-regulated gene expression had two major patterns: (1) a temporally up-regulated pattern (463 genes) and (2) a temporally down-regulated pattern (177 genes). The integration of the genes and microRNA demonstrated that 155 genes could be the targets of microRNAs from the temporally up-regulated pattern and 19 genes could be the targets of microRNAs from the temporally down-regulated pattern. The temporally up-regulated pattern by SST was associated with signaling pathways such as the cell cycle pathway, whereas the temporally down-regulated pattern included drug metabolism-related pathways and immune-related pathways. All these pathways could be possibly associated with liver regenerative activity of SST. Genes targeted by microRNA were moreover associated with different biological pathways from the genes not targeted by microRNA. 
An analysis of promoter similarity indicated that co-expressed genes after SST treatment were clustered into subgroups, depending on the temporal expression patterns. Conclusions We are the first to identify that SST regulates temporal gene expression by way of microRNA. MicroRNA targets and non-microRNA targets moreover have different biological roles. This functional segregation by microRNA would be critical for the elucidation of the molecular activities of SST. PMID:24410935
Song, Kwang Hoon; Kim, Yun Hee; Kim, Bu-Yeo
2014-01-11
Sho-saiko-to (SST) (also known as so-shi-ho-tang or xiao-chai-hu-tang) has been widely prescribed for chronic liver diseases in traditional Oriental medicine. Despite the substantial amount of clinical evidence for SST, its molecular mechanism has not been clearly identified at a genome-wide level. By using a microarray, we analyzed the temporal changes of messenger RNA (mRNA) and microRNA expression in primary mouse hepatocytes after SST treatment. The pattern of genes regulated by SST was identified by using time-series microarray analysis. The biological function of genes was measured by pathway analysis. For the identification of the exact targets of the microRNAs, a permutation-based correlation method was implemented in which the temporal expression of mRNAs and microRNAs were integrated. The similarity of the promoter structure between temporally regulated genes was measured by analyzing the transcription factor binding sites in the promoter region. The SST-regulated gene expression had two major patterns: (1) a temporally up-regulated pattern (463 genes) and (2) a temporally down-regulated pattern (177 genes). The integration of the genes and microRNA demonstrated that 155 genes could be the targets of microRNAs from the temporally up-regulated pattern and 19 genes could be the targets of microRNAs from the temporally down-regulated pattern. The temporally up-regulated pattern by SST was associated with signaling pathways such as the cell cycle pathway, whereas the temporally down-regulated pattern included drug metabolism-related pathways and immune-related pathways. All these pathways could be possibly associated with liver regenerative activity of SST. Genes targeted by microRNA were moreover associated with different biological pathways from the genes not targeted by microRNA. An analysis of promoter similarity indicated that co-expressed genes after SST treatment were clustered into subgroups, depending on the temporal expression patterns. 
We are the first to identify that SST regulates temporal gene expression by way of microRNA. MicroRNA targets and non-microRNA targets moreover have different biological roles. This functional segregation by microRNA would be critical for the elucidation of the molecular activities of SST.
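The permutation-based correlation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy profiles, the one-sided test for negative correlation (on the assumption that a microRNA represses its target), and the number of permutations are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(mrna, mirna, n_permutations=2000, seed=0):
    """Return (observed r, one-sided p-value) for a negative mRNA/microRNA correlation.

    Assumption: time points of the microRNA profile are shuffled to build
    the null distribution; the source abstract does not specify the scheme.
    """
    rng = random.Random(seed)
    observed = pearson(mrna, mirna)
    shuffled = list(mirna)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if pearson(mrna, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical time-series profiles: the mRNA rises while the microRNA falls.
r, p = permutation_p_value([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1])
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

In a genome-wide setting the same test would be repeated for every candidate mRNA/microRNA pair, which would normally call for a multiple-testing correction that this sketch omits.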
Maciejowski, John; Ahn, James Hyungsoo; Cipriani, Patricia Giselle; Killian, Darrell J.; Chaudhary, Aisha L.; Lee, Ji Inn; Voutev, Roumen; Johnsen, Robert C.; Baillie, David L.; Gunsalus, Kristin C.; Fitch, David H. A.; Hubbard, E. Jane Albert
2005-01-01
We report molecular genetic studies of three genes involved in early germ-line proliferation in Caenorhabditis elegans that lend unexpected insight into a germ-line/soma functional separation of autosomal/X-linked duplicated gene pairs. In a genetic screen for germ-line proliferation-defective mutants, we identified mutations in rpl-11.1 (L11 protein of the large ribosomal subunit), pab-1 [a poly(A)-binding protein], and glp-3/eft-3 (an elongation factor 1-α homolog). All three are members of autosome/X gene pairs. Consistent with a germ-line-restricted function of rpl-11.1 and pab-1, mutations in these genes extend life span and cause gigantism. We further examined the RNAi phenotypes of the three sets of rpl genes (rpl-11, rpl-24, and rpl-25) and found that for the two rpl genes with autosomal/X-linked pairs (rpl-11 and rpl-25), zygotic germ-line function is carried by the autosomal copy. Available RNAi results for highly conserved autosomal/X-linked gene pairs suggest that other duplicated genes may follow a similar trend. The three rpl and the pab-1/2 duplications predate the divergence between C. elegans and C. briggsae, while the eft-3/4 duplication appears to have occurred in the lineage to C. elegans after it diverged from C. briggsae. The duplicated C. briggsae orthologs of the three C. elegans autosomal/X-linked gene pairs also display functional differences between paralogs. We present hypotheses for evolutionary mechanisms that may underlie germ-line/soma subfunctionalization of duplicated genes, taking into account the role of X chromosome silencing in the germ line and analogous mammalian phenomena. PMID:15687263
Exploration of Uncharted Regions of the Protein Universe
Jaroszewski, Lukasz; Li, Zhanwen; Krishna, S. Sri; Bakolitsa, Constantina; Wooley, John; Deacon, Ashley M.; Wilson, Ian A.; Godzik, Adam
2009-01-01
The genome projects have unearthed an enormous diversity of genes of unknown function that are still awaiting biological and biochemical characterization. These genes, as most others, can be grouped into families based on sequence similarity. The PFAM database currently contains over 2,200 such families, referred to as domains of unknown function (DUF). In a coordinated effort, the four large-scale centers of the NIH Protein Structure Initiative have determined the first three-dimensional structures for more than 250 of these DUF families. Analysis of the first 248 reveals that about two thirds of the DUF families likely represent very divergent branches of already known and well-characterized families, which allows hypotheses to be formulated about their biological function. The remainder can be formally categorized as new folds, although about one third of these show significant substructure similarity to previously characterized folds. These results imply that, despite the enormous increase in the number and the diversity of new genes being uncovered, the fold space of the proteins they encode is gradually becoming saturated. The previously unexplored sectors of the protein universe appear to be primarily shaped by extreme diversification of known protein families, which then enables organisms to evolve new functions and adapt to particular niches and habitats. Notwithstanding, these DUF families still constitute the richest source for discovery of the remaining protein folds and topologies. PMID:19787035
[Cloning and functional characterization of phytoene desaturase in Andrographis paniculata].
Shen, Qin-qin; Li, Li-xia; Zhan, Peng-lin; Wang, Qiang
2015-10-01
A full-length cDNA of the phytoene desaturase (PDS) gene from Andrographis paniculata was obtained through RACE-PCR. The cDNA sequence consists of 2 224 bp with an intact ORF of 1 752 bp (GenBank: KP982892), encoding a polypeptide of 584 amino acids. Homology analysis showed that the deduced protein has extensive sequence similarities to PDS from other plants, and contains a conserved NAD(H)-binding domain of the plant dehydrase cofactor-binding domain at the N-terminus. Phylogenetic analysis demonstrated that ApPDS was more closely related to PDS of Sesamum indicum and Pogostemon cablin. The semi-quantitative RT-PCR analysis revealed that ApPDS was expressed in all aboveground tissues, with the highest expression in leaves. Virus induced gene silencing (VIGS) was performed to characterize the function of ApPDS in planta. Significant photobleaching was not observed in infiltrated leaves, although the PDS gene was significantly down-regulated in the yellowish area. To the best of our knowledge, this represents the first report of PDS gene cloning and functional characterization from A. paniculata, which lays the foundation for further investigation of new genes, especially those related to the andrographolide biosynthetic pathway.
Daub, Carsten O; Steuer, Ralf; Selbig, Joachim; Kloska, Sebastian
2004-01-01
Background The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. Results In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: The significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from kloska@scienion.de upon request. Conclusion The utilisation of mutual information as similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended. PMID:15339346
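For orientation, the binning procedure that this abstract describes as error-prone on small datasets can be sketched as follows. This is the naive equal-width binning estimator, not the authors' proposed algorithm (their C++ code is available only on request); the function names and the default number of bins are assumptions.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to an equal-width bin index in 0..bins-1."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # the maximum value falls into the last bin


def binned_mutual_information(x, y, bins=4):
    """Estimate I(X;Y) in bits from paired samples via a joint histogram."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    xb = [bin_index(v, min(x), max(x), bins) for v in x]
    yb = [bin_index(v, min(y), max(y), bins) for v in y]
    joint = Counter(zip(xb, yb))
    count_x = Counter(xb)
    count_y = Counter(yb)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[i] * count_y[j]))
    return mi


# A symmetric quadratic relation has zero linear correlation,
# yet the binned estimate of mutual information is positive.
x = [float(v) for v in range(-4, 5)]
y = [v * v for v in x]
print(binned_mutual_information(x, y, bins=3))
```

With so few samples per bin the estimate depends strongly on the bin count, which is the small-sample sensitivity the abstract points to as the motivation for a better estimator.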
Barad, Shiri; Sela, Noa; Kumar, Dilip; Kumar-Dubey, Amit; Glam-Matana, Nofar; Sherman, Amir; Prusky, Dov
2016-05-04
Penicillium expansum is a destructive phytopathogen that causes decay in deciduous fruits during postharvest handling and storage. During colonization the fungus secretes D-gluconic acid (GLA), which modulates environmental pH and regulates mycotoxin accumulation in colonized tissue. Until now, no transcriptomic analysis has addressed the specific contribution of the pathogen's pH regulation to the P. expansum colonization process. For this purpose, total RNA from the leading edge of P. expansum-colonized apple tissue of cv. 'Golden Delicious' and from fungal cultures grown under pH 4 or 7 was sequenced, and the gene expression patterns were compared. We present a large-scale analysis of the transcriptome data of P. expansum and apple response to fungal colonization. The fungal analysis revealed nine different clusters of gene expression patterns that were divided among three major groups in which the colonized tissue showed, respectively: (i) differing transcript expression patterns between mycelial growth at pH 4 and pH 7; (ii) similar transcript expression patterns of mycelial growth at pH 4; and (iii) similar transcript expression patterns of mycelial growth at pH 7. Each group was functionally characterized in order to decipher genes that are important for pH regulation and also for colonization of apple fruits by Penicillium. Furthermore, comparison of gene expression of healthy apple tissue with that of colonized tissue showed that differentially expressed genes revealed up-regulation of the jasmonic acid and mevalonate pathways, and also down-regulation of the glycogen and starch biosynthesis pathways. Overall, we identified important genes and functionalities of P. expansum that were controlled by the environmental pH.
Differential expression patterns of genes belonging to the same gene family suggest that genes were selectively activated according to their optimal environmental conditions (pH, in vitro or in vivo) to enable the fungus to cope with varying conditions and to make optimal use of available enzymes. Comparison between the activation of the colonized host's gene responses by alkalizing Colletotrichum gloeosporioides and acidifying P. expansum pathogens indicated similar gene response patterns, but stronger responses to P. expansum, suggesting the importance of acidification by P. expansum as a factor in its increased aggressiveness.
Discovery of new candidate genes related to brain development using protein interaction information.
Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Tao; Cai, Yu-Dong
2015-01-01
Human brain development is a dramatic process composed of a series of complex and fine-tuned spatiotemporal gene expressions. A good comprehension of this process can assist us in developing the potential of our brain. However, we have only limited knowledge about the genes and gene functions that are involved in this biological process. Therefore, a substantial demand remains to discover new brain development-related genes and identify their biological functions. In this study, we aimed to discover new brain-development related genes by building a computational method. We referred to a series of computational methods used to discover new disease-related genes and developed a similar method. In this method, the shortest path algorithm was executed on a weighted graph that was constructed using protein-protein interactions. New candidate genes fell on at least one of the shortest paths connecting two known genes that are related to brain development. A randomization test was then adopted to filter positive discoveries. Of the final identified genes, several have been reported to be associated with brain development, indicating the effectiveness of the method, whereas several of the others may have potential roles in brain development.
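The shortest-path step described in this abstract can be sketched on a toy network. The code below is a minimal illustration, not the authors' method: the gene names are hypothetical, edge weights are assumed to be costs (a lower weight standing for a more confident interaction), only one shortest path per pair is traced, and the randomization test used to filter discoveries is omitted.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph; returns a node list or []."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(heap, (new_dist, neighbor))
    if target not in dist:
        return []  # target unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known_genes):
    """Collect unknown genes lying on a shortest path between two known genes."""
    known = set(known_genes)
    candidates = set()
    for a, b in combinations(sorted(known), 2):
        for node in shortest_path(graph, a, b)[1:-1]:
            if node not in known:
                candidates.add(node)
    return candidates


# Hypothetical undirected interaction network (K* = known genes).
graph = {
    "K1": {"A": 1.0, "B": 5.0},
    "A": {"K1": 1.0, "K2": 1.0},
    "B": {"K1": 5.0, "K2": 5.0},
    "K2": {"A": 1.0, "B": 5.0, "C": 1.0},
    "C": {"K2": 1.0, "K3": 1.0},
    "K3": {"C": 1.0},
}
print(sorted(candidate_genes(graph, ["K1", "K2", "K3"])))
```

In this toy network the weakly connected gene B is never selected because every pair of known genes is joined by a cheaper route; a filtering step such as the randomization test in the abstract would then be applied to the surviving candidates.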
Tripathi, Binu M; Moroenyane, Itumeleng; Sherman, Chen; Lee, Yoo Kyung; Adams, Jonathan M; Steinberger, Yosef
2017-07-01
The soil microbiome is important for the functioning of terrestrial ecosystems. However, the impacts of climate on taxonomic and functional diversity of soil microbiome are not well understood. A precipitation gradient along regional scale transects may offer a model setting for understanding the effect of climate on the composition and function of the soil microbiome. Here, we compared taxonomic and functional attributes of soil microorganisms in arid, semiarid, Mediterranean, and humid Mediterranean climatic conditions of Israel using shotgun metagenomic sequencing. We hypothesized that there would be a distinct taxonomic and functional soil community for each precipitation zone, with arid environments having lower taxonomic and functional diversity, greater relative abundance of stress response and sporulation-related genes, and lower relative abundance of genes related to nutrient cycling and degradation of complex organic compounds. As hypothesized, our results showed a distinct taxonomic and functional community in each precipitation zone, revealing differences in soil taxonomic and functional selection in the different climates. Although the taxonomic diversity remained similar across all sites, the functional diversity was, as hypothesized, lower in the arid environments, suggesting that functionality is more constrained in "extreme" environments. Also, with increasing aridity, we found a significant increase in genes related to dormancy/sporulation and a decrease in those related to nutrient cycling (genes related to nitrogen, potassium, and sulfur metabolism). However, the relative abundance of genes related to stress response was lower in arid soils. Overall, these results indicate that climatic conditions play an important role in shaping taxonomic and functional attributes of soil microbiome.
These findings have important implications for understanding the impacts of climate change (e.g., precipitation change) on structure and function of the soil microbiome.
Reranking candidate gene models with cross-species comparison for improved gene prediction
Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S
2008-01-01
Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
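The reranking idea in this abstract can be sketched as a weighted re-scoring of the K best candidates for one locus. The sketch assumes each candidate carries the original gene-finder score plus a few cross-species comparison features; the feature names, weights, and values are hypothetical, and a fixed linear score stands in for whatever global scoring rules the authors actually used.

```python
def rerank(candidates, weights):
    """Sort candidate gene models by a weighted sum of their features (best first)."""
    def combined_score(candidate):
        features = candidate["features"]
        return sum(w * features.get(name, 0.0) for name, w in weights.items())
    return sorted(candidates, key=combined_score, reverse=True)


# Two hypothetical candidate models for one locus.
candidates = [
    {"id": "model_1",
     "features": {"finder_score": 0.90, "ortholog_identity": 0.40, "splice_site_match": 0.0}},
    {"id": "model_2",
     "features": {"finder_score": 0.85, "ortholog_identity": 0.95, "splice_site_match": 1.0}},
]
# Hypothetical weights; in practice these would be learned from training loci.
weights = {"finder_score": 1.0, "ortholog_identity": 0.5, "splice_site_match": 0.25}

best = rerank(candidates, weights)[0]
print(best["id"])
```

Here model_2 has the lower gene-finder score but overtakes model_1 once similarity to the putative ortholog is taken into account, which is the situation the abstract describes.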
Takashima, Eizo; Williams, Marni; Eiglmeier, Karin; Pain, Adrien; Guelbeogo, Wamdaogo M.; Gneme, Awa; Brito-Fravallo, Emma; Holm, Inge; Lavazec, Catherine; Sagnon, N’Fale; Baxter, Richard H.; Riehle, Michelle M.; Vernick, Kenneth D.
2015-01-01
Nucleotide variation patterns across species are shaped by the processes of natural selection, including exposure to environmental pathogens. We examined patterns of genetic variation in two sister species, Anopheles gambiae and Anopheles coluzzii, both efficient natural vectors of human malaria in West Africa. We used the differentiation signature displayed by a known coordinate selective sweep of immune genes APL1 and TEP1 in A. coluzzii to design a population genetic screen trained on the sweep, classified a panel of 26 potential immune genes for concordance with the signature, and functionally tested their immune phenotypes. The screen results were strongly predictive for genes with protective immune phenotypes: genes meeting the screen criteria were significantly more likely to display a functional phenotype against malaria infection than genes not meeting the criteria (p = 0.0005). Thus, an evolution-based screen can efficiently prioritize candidate genes for labor-intensive downstream functional testing, and safely allow the elimination of genes not meeting the screen criteria. The suite of immune genes with characteristics similar to the APL1-TEP1 selective sweep appears to be more widespread in the A. coluzzii genome than previously recognized. The immune gene differentiation may be a consequence of adaptation of A. coluzzii to new pathogens encountered in its niche expansion during the separation from A. gambiae, although the role, if any, of natural selection by Plasmodium is unknown. Application of the screen allowed identification of new functional immune factors, and assignment of new functions to known factors. We describe biochemical binding interactions between immune proteins that underlie functional activity for malaria infection, which highlights the interplay between pathogen specificity and the structure of immune complexes. 
We also find that most malaria-protective immune factors display phenotypes for either human or rodent malaria, with broad specificity a rarity. PMID:26633695
Dobias, S L; Ma, L; Wu, H; Bell, J R; Maxson, R
1997-01-01
Msx-class homeobox genes, characterized by a distinct and highly conserved homeodomain, have been identified in a wide variety of metazoans from vertebrates to coelenterates. Although there is evidence that they participate in inductive tissue interactions that underlie vertebrate organogenesis, including those that pattern the neural crest, there is little information about their function in simple deuterostomes. Both to learn more about the ancient function of Msx genes, and to shed light on the evolution of developmental mechanisms within the lineage that gave rise to vertebrates, we have isolated and characterized Msx genes from ascidians and echinoderms. Here we describe the sequence and expression of a sea urchin (Strongylocentrotus purpuratus) Msx gene whose homeodomain is very similar to that of vertebrate Msx2. This gene, designated SpMsx, is first expressed in blastula stage embryos, apparently in a non-localized manner. Subsequently, during the early phases of gastrulation, SpMsx transcripts are expressed intensely in the invaginating archenteron and secondary mesenchyme, and at reduced levels in the ectoderm. In the latter part of gastrulation, SpMsx transcripts are concentrated in the oral ectoderm and gut, and continue to be expressed at those sites through the remainder of embryonic development. That vertebrate Msx genes are regulated by inductive tissue interactions and growth factors suggested to us that the restriction of SpMsx gene expression to the oral ectoderm and derivatives of the vegetal plate might similarly be regulated by the series of signaling events that pattern these embryonic territories. As a first test of this hypothesis, we examined the influence of exogastrulation and cell-dissociation on SpMsx gene expression. In experimentally-induced exogastrulae, SpMsx transcripts were distributed normally in the oral ectoderm, evaginated gut, and secondary mesenchyme. 
However, when embryos were dissociated into their component cells, SpMsx transcripts failed to accumulate. These data show that the localization of SpMsx transcripts in gastrulae does not depend on interactions between germ layers, yet the activation and maintenance of SpMsx expression does require cell-cell or cell-matrix interactions.
Dong, Chongmei; Vincent, Kate; Sharp, Peter
2009-12-04
TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful tool for reverse genetics, combining traditional chemical mutagenesis with high-throughput PCR-based mutation detection to discover induced mutations that alter protein function. The most popular mutation detection method for TILLING is a mismatch cleavage assay using the endonuclease CelI. For this method, locus-specific PCR is essential. Most wheat genes are present as three similar sequences with high homology in exons and low homology in introns. Locus-specific primers can usually be designed in introns. However, it is sometimes difficult to design locus-specific PCR primers in a conserved region with high homology among the three homoeologous genes, or in a gene lacking introns, or if information on introns is not available. Here we describe a mutation detection method which combines High Resolution Melting (HRM) analysis of mixed PCR amplicons containing three homoeologous gene fragments and sequence analysis using Mutation Surveyor software, aimed at simultaneous detection of mutations in three homoeologous genes. We demonstrate that High Resolution Melting (HRM) analysis can be used in mutation scans in mixed PCR amplicons containing three homoeologous gene fragments. Combining HRM scanning with sequence analysis using Mutation Surveyor is sensitive enough to detect a single nucleotide mutation in the heterozygous state in a mixed PCR amplicon containing three homoeoloci. The method was tested and validated in an EMS (ethylmethane sulfonate)-treated wheat TILLING population, screening mutations in the carboxyl terminal domain of the Starch Synthase II (SSII) gene. Selected identified mutations of interest can be further analysed by cloning to confirm the mutation and determine the genomic origin of the mutation. Polyploidy is common in plants. Conserved regions of a gene often represent functional domains and have high sequence similarity between homoeologous loci. 
The method described here is a useful alternative to locus-specific based methods for screening mutations in conserved functional domains of homoeologous genes. This method can also be used for SNP (single nucleotide polymorphism) marker development and eco-TILLING in polyploid species.
Myotonic Dystrophy Type 2: An Update on Clinical Aspects, Genetic and Pathomolecular Mechanism
Meola, Giovanni; Cardani, Rosanna
2015-01-01
Myotonic dystrophy (DM) is the most common adult muscular dystrophy, characterized by autosomal dominant progressive myopathy, myotonia and multiorgan involvement. To date, two distinct forms caused by similar mutations have been identified. Myotonic dystrophy type 1 (DM1, Steinert’s disease) is caused by a (CTG)n expansion in DMPK, while myotonic dystrophy type 2 (DM2) is caused by a (CCTG)n expansion in CNBP. Despite clinical and genetic similarities, DM1 and DM2 are distinct disorders. The pathogenesis of DM is explained by a common RNA gain-of-function mechanism in which the CUG and CCUG repeats alter cellular function, including alternative splicing of various genes. However, additional pathogenic mechanisms, such as changes in gene expression, modifier genes, protein translation and micro-RNA metabolism, may also contribute to disease pathology and help to clarify the phenotypic differences between these two types of myotonic dystrophy. This review is an update on the latest findings specific to DM2, including explanations for the differences in clinical manifestations and pathophysiology between the two forms of myotonic dystrophy. PMID:27858759
Dynamic expression of the LAP family of genes during early development of Xenopus tropicalis.
Yang, Qiutan; Lv, Xiaoyan; Kong, Qinghua; Li, Chaocui; Zhou, Qin; Mao, Bingyu
2011-10-01
The leucine-rich repeats and PDZ (LAP) family of genes are crucial for the maintenance of cell polarity as well as for epithelial homeostasis and tumor suppression in both vertebrates and invertebrates. Four members of this gene family are known: densin, erbin, scribble and lano. Here, we identified the four members of the LAP gene family in Xenopus tropicalis and studied their expression patterns during embryonic development. The Xenopus LAP proteins show a conserved domain structure that is similar to their homologs in other vertebrates. In Xenopus embryos, these genes were detected in animal cap cells at the early gastrula stage. At later stages of development, they were widely expressed in epithelial tissues that are highly polar in nature, including the neural epithelia, optic and otic vesicles, and in the pronephros. These data suggest that the roles of the Xenopus LAP genes in the control of cell polarity and morphogenesis are conserved during early development. Erbin and lano show similar expression patterns in the developing head, suggesting potential functional interactions between the two molecules in vivo.
Conserved properties of Drosophila Insomniac link sleep regulation and synaptic function.
Li, Qiuling; Kellner, David A; Hatch, Hayden A M; Yumita, Tomohiro; Sanchez, Sandrine; Machold, Robert P; Frank, C Andrew; Stavropoulos, Nicholas
2017-05-01
Sleep is an ancient animal behavior that is regulated similarly in species ranging from flies to humans. Various genes that regulate sleep have been identified in invertebrates, but whether the functions of these genes are conserved in mammals remains poorly explored. Drosophila insomniac (inc) mutants exhibit severely shortened and fragmented sleep. Inc protein physically associates with the Cullin-3 (Cul3) ubiquitin ligase, and neuronal depletion of Inc or Cul3 strongly curtails sleep, suggesting that Inc is a Cul3 adaptor that directs the ubiquitination of neuronal substrates that impact sleep. Three proteins similar to Inc exist in vertebrates (KCTD2, KCTD5, and KCTD17) but are uncharacterized within the nervous system and their functional conservation with Inc has not been addressed. Here we show that Inc and its mouse orthologs exhibit striking biochemical and functional interchangeability within Cul3 complexes. Remarkably, KCTD2 and KCTD5 restore sleep to inc mutants, indicating that they can substitute for Inc in vivo and engage its neuronal targets relevant to sleep. Inc and its orthologs localize similarly within fly and mammalian neurons and can traffic to synapses, suggesting that their substrates may include synaptic proteins. Consistent with such a mechanism, inc mutants exhibit defects in synaptic structure and physiology, indicating that Inc is essential for both sleep and synaptic function. Our findings reveal that molecular functions of Inc are conserved through ~600 million years of evolution and support the hypothesis that Inc and its orthologs participate in an evolutionarily conserved ubiquitination pathway that links synaptic function and sleep regulation.
Identification of functional modules using network topology and high-throughput data.
Ulitsky, Igor; Shamir, Ron
2007-01-26
With the advent of systems biology, biological knowledge is often represented today by networks. These include regulatory and metabolic networks, protein-protein interaction networks, and many others. At the same time, high-throughput genomics and proteomics techniques generate very large data sets, which require sophisticated computational analysis. Usually, separate and different analysis methodologies are applied to each of the two data types. An integrated investigation of network and high-throughput information together can improve the quality of the analysis by accounting simultaneously for topological network properties alongside intrinsic features of the high-throughput data. We describe a novel algorithmic framework for this challenge. We first transform the high-throughput data into similarity values (e.g., by computing pairwise similarity of gene expression patterns from microarray data). Then, given a network of genes or proteins and similarity values between some of them, we seek connected sub-networks (or modules) that manifest high similarity. We develop algorithms for this problem and evaluate their performance on the osmotic shock response network in S. cerevisiae and on the human cell cycle network. We demonstrate that focused, biologically meaningful and relevant functional modules are obtained. In comparison with extant algorithms, our approach has higher sensitivity and higher specificity. We have demonstrated that our method can accurately identify functional modules. Hence, it carries the promise to be highly useful in the analysis of high-throughput data.
Boldogköi, Zsolt
2012-01-01
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too. 
PMID:22783276
2011-01-01
Background Nucleoside diphosphate kinases (NDPK) are evolutionarily conserved enzymes present in Bacteria, Archaea and Eukarya, with human Nme1 the most studied representative of the family and the first identified metastasis suppressor. Sponges (Porifera) are simple metazoans without tissues, closest to the common ancestor of all animals. They changed little during evolution and probably provide the best insight into the metazoan ancestor's genomic features. Recent studies show that sponges have a wide repertoire of genes, many of which are involved in diseases in more complex metazoans. The original function of those genes and the way it has evolved in the animal lineage are largely unknown. Here we report new results on the metastasis suppressor gene/protein homolog from the marine sponge Suberites domuncula, NmeGp1Sd. The purpose of this study was to investigate the properties of the sponge Group I Nme gene and protein, and compare it to its human homolog in order to elucidate the evolution of the structure and function of Nme. Results We found that sponge genes coding for Group I Nme protein are intron-rich. Furthermore, we discovered that the sponge NmeGp1Sd protein has a level of kinase activity similar to that of its human homolog Nme1, does not cleave negatively supercoiled DNA and shows nonspecific DNA-binding activity. The sponge NmeGp1Sd forms a hexamer, like human Nme1 and all other eukaryotic Nme proteins. NmeGp1Sd interacts with human Nme1 in human cells and exhibits the same subcellular localization. Stable clones expressing sponge NmeGp1Sd inhibited the migratory potential of CAL 27 cells, as already reported for human Nme1, which suggests that Nme's function in migratory processes was engaged long before the composition of true tissues. 
Conclusions This study suggests that the ancestor of all animals possessed a NmeGp1 protein with properties and functions similar to evolutionarily recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis. PMID:21457554
NASA Astrophysics Data System (ADS)
Kvitt, Hagit; Rosenfeld, Hanna; Tchernov, Dan
2016-07-01
Recent studies suggest that controlled apoptotic response provides an essential mechanism, enabling corals to respond to global warming and ocean acidification. However, the molecules involved and their functions are still unclear. To better characterize the apoptotic response in basal metazoans, we studied the expression profiles of selected genes that encode for putative pro- and anti-apoptotic mediators in the coral Stylophora pistillata under thermal stress and bleaching conditions. Upon thermal stress, as attested by the elevation of the heat-shock protein gene HSP70’s mRNA levels, the expression of all studied genes, including caspase, Bcl-2, Bax, APAF-1 and BI-1, peaked at 6-24 h of thermal stress (hts) and declined at 72 hts. Adversely, the expression levels of the survivin gene showed a shifted pattern, with elevation at 48-72 hts and a return to basal levels at 168 hts. Overall, we show the quantitative anti-apoptotic traits of the coral Bcl-2 protein, which resemble those of its mammalian counterpart. Altogether, our results highlight the similarities between apoptotic networks operating in simple metazoans and in higher animals and clearly demonstrate the activation of pro-cell survival regulators at early stages of the apoptotic response, contributing to the decline of apoptosis and the acclimation to chronic stress.
Preston, Jill C; Jorgensen, Stacy A; Jha, Suryatapa G
2014-01-01
Flowering time is strictly controlled by a combination of internal and external signals that match seed set with favorable environmental conditions. In the model plant species Arabidopsis thaliana (Brassicaceae), many of the genes underlying development and evolution of flowering have been discovered. However, much remains unknown about how conserved the flowering gene networks are in plants with different growth habits, gene duplication histories, and distributions. Here we functionally characterize three homologs of the flowering gene Suppressor Of Overexpression of Constans 1 (SOC1) in the short-lived perennial Petunia hybrida (petunia, Solanaceae). Similar to A. thaliana soc1 mutants, co-silencing of duplicated petunia SOC1-like genes results in late flowering. This phenotype is most severe when all three SOC1-like genes are silenced. Furthermore, expression levels of the SOC1-like genes Unshaven (UNS) and Floral Binding Protein 21 (FBP21), but not FBP28, are positively correlated with developmental age. In contrast to A. thaliana, petunia SOC1-like gene expression did not increase with longer photoperiods, and FBP28 transcripts were actually more abundant under short days. Despite evidence of functional redundancy, differential spatio-temporal expression data suggest that SOC1-like genes might fine-tune petunia flowering in response to photoperiod and developmental stage. This likely resulted from modification of SOC1-like gene regulatory elements following recent duplication, and is a possible mechanism to ensure flowering under both inductive and non-inductive photoperiods.
Preston, Jill C.; Jorgensen, Stacy A.; Jha, Suryatapa G.
2014-01-01
Flowering time is strictly controlled by a combination of internal and external signals that match seed set with favorable environmental conditions. In the model plant species Arabidopsis thaliana (Brassicaceae), many of the genes underlying development and evolution of flowering have been discovered. However, much remains unknown about how conserved the flowering gene networks are in plants with different growth habits, gene duplication histories, and distributions. Here we functionally characterize three homologs of the flowering gene SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) in the short-lived perennial Petunia hybrida (petunia, Solanaceae). Similar to A. thaliana soc1 mutants, co-silencing of duplicated petunia SOC1-like genes results in late flowering. This phenotype is most severe when all three SOC1-like genes are silenced. Furthermore, expression levels of the SOC1-like genes UNSHAVEN (UNS) and FLORAL BINDING PROTEIN 21 (FBP21), but not FBP28, are positively correlated with developmental age. In contrast to A. thaliana, petunia SOC1-like gene expression did not increase with longer photoperiods, and FBP28 transcripts were actually more abundant under short days. Despite evidence of functional redundancy, differential spatio-temporal expression data suggest that SOC1-like genes might fine-tune petunia flowering in response to photoperiod and developmental stage. This likely resulted from modification of SOC1-like gene regulatory elements following recent duplication, and is a possible mechanism to ensure flowering under both inductive and non-inductive photoperiods. PMID:24787903
Rund, Samuel S C; Yoo, Boyoung; Alam, Camille; Green, Taryn; Stephens, Melissa T; Zeng, Erliang; George, Gary F; Sheppard, Aaron D; Duffield, Giles E; Milenković, Tijana; Pfrender, Michael E
2016-08-18
Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. 
Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel functional annotation.
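The cosine-fitting step mentioned in this abstract can be illustrated with a minimal single-component fit at a fixed 24 h period. The sketch assumes the model y = mesor + a*cos(wt) + b*sin(wt), solved by ordinary least squares through the normal equations; the abstract does not name the specific algorithm or the sampling interval, so the 4 h spacing, the function names, and the synthetic expression values are all hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_cosine(times, values, period=24.0):
    """Least-squares fit of mesor + a*cos(wt) + b*sin(wt) at a fixed period.

    Returns (mesor, amplitude, peak_time), with peak_time in the units of `times`.
    """
    w = 2.0 * math.pi / period
    X = [[1.0, math.cos(w * t), math.sin(w * t)] for t in times]
    # Normal equations: (X^T X) beta = X^T y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * y for row, y in zip(X, values)) for i in range(3)]
    mesor, a, b = solve_linear(XtX, Xty)
    amplitude = math.hypot(a, b)
    peak_time = (math.atan2(b, a) / w) % period
    return mesor, amplitude, peak_time


# Synthetic transcript: mean 10, amplitude 3, peak at hour 6, sampled over 44 h.
times = list(range(0, 44, 4))
values = [10.0 + 3.0 * math.cos(2.0 * math.pi * (t - 6.0) / 24.0) for t in times]
mesor, amplitude, peak_time = fit_cosine(times, values)
print(round(mesor, 3), round(amplitude, 3), round(peak_time, 3))
```

A gene would typically be called rhythmic only if the fitted amplitude is large relative to the residual noise, which requires a significance test that this sketch leaves out.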
Pereira, Joana; Johnson, Warren E.; O’Brien, Stephen J.; Jarvis, Erich D.; Zhang, Guojie; Gilbert, M. Thomas P.; Vasconcelos, Vitor; Antunes, Agostinho
2014-01-01
The Hedgehog (Hh) gene family codes for a class of secreted proteins composed of two active domains that act as signalling molecules during embryo development, namely for the development of the nervous and skeletal systems and the formation of the testis cord. While only one Hh gene is typically found in invertebrate genomes, most vertebrate species have three (Sonic hedgehog – Shh; Indian hedgehog – Ihh; and Desert hedgehog – Dhh), each with different expression patterns and functions, which likely helped promote the increasing complexity of vertebrates and their successful diversification. In this study, we used comparative genomic and adaptive evolutionary analyses to characterize the evolution of the Hh genes in vertebrates following the two major whole genome duplication (WGD) events. To overcome the lack of Hh-coding sequences in publicly available avian databases, we used an extensive dataset of 45 avian and three non-avian reptilian genomes to show that birds have all three Hh paralogs. We find suggestions that following the WGD events, vertebrate Hh paralogous genes evolved independently within similar linkage groups and under different evolutionary rates, especially within the catalytic domain. The structural regions around the ion-binding site were identified to be under positive selection in the signaling domain. These findings contrast with those observed in invertebrates, where different lineages that experienced gene duplication retained similar selective constraints in the Hh orthologs. Our results provide new insights on the evolutionary history of the Hh gene family, the functional roles of these paralogs in vertebrate species, and on the location of mutational hotspots. PMID:25549322
Molecular cloning of doublesex genes of four cladocera (water flea) species.
Toyota, Kenji; Kato, Yasuhiko; Sato, Masaru; Sugiura, Naomi; Miyagawa, Shinichi; Miyakawa, Hitoshi; Watanabe, Hajime; Oda, Shigeto; Ogino, Yukiko; Hiruta, Chizue; Mizutani, Takeshi; Tatarazako, Norihisa; Paland, Susanne; Jackson, Craig; Colbourne, John K; Iguchi, Taisen
2013-04-10
The gene doublesex (dsx) is known as a key factor regulating genetic sex determination in many organisms. We previously identified two dsx genes (DapmaDsx1 and DapmaDsx2) from a freshwater branchiopod crustacean, Daphnia magna, which are expressed in males but not in females. D. magna produces males by parthenogenesis in response to environmental cues (environmental sex determination) and we showed that DapmaDsx1 expression during embryonic stages is responsible for the male trait development. The D. magna dsx genes are thought to have arisen by a cladoceran-specific duplication; therefore, to investigate evolutionary conservation of sex-specific expression of dsx genes and to further assess their functions in environmental sex determination, we searched for dsx homologs in four closely related cladoceran species. We identified homologs of both dsx genes from D. pulex, D. galeata, and Ceriodaphnia dubia, yet only a single dsx gene was found from Moina macrocopa. The deduced amino acid sequences of all 9 dsx homologs contained the DM and oligomerization domains, which are characteristic for all arthropod DSX family members. Molecular phylogenetic analysis suggested that the dsx gene duplication likely occurred prior to the divergence of these cladoceran species, because that of the giant tiger prawn Penaeus monodon is rooted ancestrally to both DSX1 and DSX2 of cladocerans. Therefore, this result also suggested that M. macrocopa lost the dsx2 gene secondarily. Furthermore, all dsx genes identified in this study showed male-biased expression levels, yet only half of the putative 5' upstream regulatory elements are preserved in D. magna and D. pulex. All dsx genes of the five cladoceran species examined had similar amino acid structure containing highly conserved DM and oligomerization domains, and exhibited sexually dimorphic expression patterns, suggesting that these genes may have similar functions for environmental sex determination in cladocerans.
Molecular cloning of doublesex genes of four cladocera (water flea) species
2013-01-01
Background The gene doublesex (dsx) is known as a key factor regulating genetic sex determination in many organisms. We previously identified two dsx genes (DapmaDsx1 and DapmaDsx2) from a freshwater branchiopod crustacean, Daphnia magna, which are expressed in males but not in females. D. magna produces males by parthenogenesis in response to environmental cues (environmental sex determination) and we showed that DapmaDsx1 expression during embryonic stages is responsible for the male trait development. The D. magna dsx genes are thought to have arisen by a cladoceran-specific duplication; therefore, to investigate evolutionary conservation of sex specific expression of dsx genes and to further assess their functions in the environmental sex determination, we searched for dsx homologs in four closely related cladoceran species. Results We identified homologs of both dsx genes from D. pulex, D. galeata, and Ceriodaphnia dubia, yet only a single dsx gene was found from Moina macrocopa. The deduced amino acid sequences of all 9 dsx homologs contained the DM and oligomerization domains, which are characteristic for all arthropod DSX family members. Molecular phylogenetic analysis suggested that the dsx gene duplication likely occurred prior to the divergence of these cladoceran species, because that of the giant tiger prawn Penaeus monodon is rooted ancestrally to both DSX1 and DSX2 of cladocerans. Therefore, this result also suggested that M. macrocopa lost the dsx2 gene secondarily. Furthermore, all dsx genes identified in this study showed male-biased expression levels, yet only half of the putative 5’ upstream regulatory elements are preserved in D. magna and D. pulex.
Conclusions All dsx genes of the five cladoceran species examined had similar amino acid structure containing highly conserved DM and oligomerization domains, and exhibited sexually dimorphic expression patterns, suggesting that these genes may have similar functions for environmental sex determination in cladocerans. PMID:23575357
Li, Xiang; Bi, Zhenghong; Di, Rong; Liang, Peng; He, Qiguang; Liu, Wenbo; Miao, Weiguo; Zheng, Fucong
2016-01-01
Powdery mildew is an important disease of rubber trees caused by Oidium heveae B. A. Steinmann. As far as we know, none of the resistance genes related to powdery mildew have been isolated from the rubber tree. There is little information available at the molecular level regarding how a rubber tree develops defense mechanisms against this pathogen. We have studied rubber tree mRNA transcripts from the resistant RRIC52 cultivar by differential display analysis. Leaves inoculated with the spores of O. heveae were collected from 0 to 120 hpi in order to identify pathogen-regulated genes at different infection stages. We identified 78 rubber tree genes that were differentially expressed during the plant–pathogen interaction. BLAST analysis for these 78 ESTs classified them into seven functional groups: cell wall and membrane pathways, transcription factor and regulatory proteins, transporters, signal transduction, phytoalexin biosynthesis, other metabolism functions, and unknown functions. The gene expression for eight of these genes was validated by qRT-PCR in both RRIC52 and the partially susceptible Reyan 7-33-97 cultivars, revealing the similar or differential changes of gene expressions between these two cultivars. This study has improved our overall understanding of the molecular mechanisms of rubber tree resistance to powdery mildew. PMID:26840302
Gu, Lijiao; Li, Libei; Wei, Hengling; Wang, Hantao; Su, Junji; Guo, Yaning; Yu, Shuxun
2018-01-01
WRKY transcription factors play important roles in plant defense, stress response, leaf senescence, and plant growth and development. Previous studies have revealed the important roles of the group IIa GhWRKY genes in cotton. To comprehensively analyze the group IIa GhWRKY genes in upland cotton, we identified 15 candidate group IIa GhWRKY genes in the Gossypium hirsutum genome. The phylogenetic tree, intron-exon structure, motif prediction and Ka/Ks analyses indicated that most group IIa GhWRKY genes shared high similarity and conservation and underwent purifying selection during evolution. In addition, we detected the expression patterns of several group IIa GhWRKY genes in individual tissues as well as during leaf senescence using public RNA sequencing data and real-time quantitative PCR. To better understand the functions of group IIa GhWRKYs in cotton, GhWRKY17 (KF669857) was isolated from upland cotton, and its sequence alignment, promoter cis-acting elements and subcellular localization were characterized. Moreover, the over-expression of GhWRKY17 in Arabidopsis up-regulated the senescence-associated genes AtWRKY53, AtSAG12 and AtSAG13, enhancing the plant's susceptibility to leaf senescence. These findings lay the foundation for further analysis and study of the functions of WRKY genes in cotton.
Yang, Tianquan; Xu, Ronghua; Chen, Jianghua; Liu, Aizhong
2016-01-01
Fatty acids serve many functions in plants, but the effects of some key genes involved in fatty acid biosynthesis on plant growth and development are not yet well understood. To understand the functions of 3-ketoacyl-acyl-carrier protein synthase I (KASI) in tobacco, we isolated two KASI homologs, which we have designated NtKASI-1 and NtKASI-2. Expression analysis showed that these two KASI genes were transcribed constitutively in all tissues examined. Over-expression of NtKASI-1 in tobacco changed the fatty acid content in leaves, whereas over-expressed lines of NtKASI-2 exhibited distinct phenotypic features such as slightly variegated leaves and reduction of the fatty acid content in leaves, similar to plants in which the NtKASI-1 gene was silenced. Interestingly, silencing of the NtKASI-2 gene produced no discernibly altered phenotypes compared to wild type. Plants in which both genes were silenced showed enhanced phenotypic changes during vegetative and reproductive growth compared to wild type. These results revealed that these two KASI genes had partial functional redundancy, and that the KASI genes played a key role in regulating fatty acid synthesis and in mediating plant growth and development in tobacco. PMID:27509494
Araripe, Luciana O; Montenegro, Horácio; Lemos, Bernardo; Hartl, Daniel L
2010-12-14
Hybrid male sterility (HMS) is a usual outcome of hybridization between closely related animal species. It arises because interactions between alleles that are functional within one species may be disrupted in hybrids. The identification of genes leading to hybrid sterility is of great interest for understanding the evolutionary process of speciation. In the current work we used marked P-element insertions as dominant markers to efficiently locate one genetic factor causing a severe reduction in fertility in hybrid males of Drosophila simulans and D. mauritiana. Our mapping effort identified a region of 9 kb on chromosome 3, containing three complete and one partial coding sequences. Within this region, two annotated genes are suggested as candidates for the HMS factor, based on the comparative molecular characterization and public-source information. Gene Taf1 is partially contained in the region, but yet shows high polymorphism with four fixed non-synonymous substitutions between the two species. Its molecular functions involve sequence-specific DNA binding and transcription factor activity. Gene agt is a small, intronless gene, whose molecular function is annotated as methylated-DNA-protein-cysteine S-methyltransferase activity. High polymorphism and one fixed non-synonymous substitution suggest this is a fast evolving gene. The gene trees of both genes perfectly separate D. simulans and D. mauritiana into monophyletic groups. Analysis of gene expression using microarray revealed trends that were similar to those previously found in comparisons between whole-genome hybrids and parental species. The identification following confirmation of the HMS candidate gene will add another case study leading to understanding the evolutionary process of hybrid incompatibility.
Mutwil, Marek; Klie, Sebastian; Tohge, Takayuki; Giorgi, Federico M.; Wilkins, Olivia; Campbell, Malcolm M.; Fernie, Alisdair R.; Usadel, Björn; Nikoloski, Zoran; Persson, Staffan
2011-01-01
The model organism Arabidopsis thaliana is readily used in basic research due to resource availability and relative speed of data acquisition. A major goal is to transfer acquired knowledge from Arabidopsis to crop species. However, the identification of functional equivalents of well-characterized Arabidopsis genes in other plants is a nontrivial task. It is well documented that transcriptionally coordinated genes tend to be functionally related and that such relationships may be conserved across different species and even kingdoms. To exploit such relationships, we constructed whole-genome coexpression networks for Arabidopsis and six important plant crop species. The interactive networks, clustered using the HCCA algorithm, are provided under the banner PlaNet (http://aranet.mpimp-golm.mpg.de). We implemented a comparative network algorithm that estimates similarities between network structures. Thus, the platform can be used to swiftly infer similar coexpressed network vicinities within and across species and can predict the identity of functional homologs. We exemplify this using the PSA-D and chalcone synthase-related gene networks. Finally, we assessed how ontology terms are transcriptionally connected in the seven species and provide the corresponding MapMan term coexpression networks. The data support the contention that this platform will considerably improve transfer of knowledge generated in Arabidopsis to valuable crop species. PMID:21441431
Yeast Genetics for Delineating Bax/Bcl Pathway of Cell Death Regulation.
1998-07-01
differences in the copy number of the episomal plasmid from which...cytosol. The cytosol also became electron dense ("cytosolic condensation"), similar to...Cell Death & Differ. 3, 229-236. (1993). The C. elegans cell death gene ced-3 encodes a protein similar...White, K., Tahaoglu, E., and Steller, H. (1996...components may be used in different functional contexts. Similar modules might exist in metazoan apoptotic pathways. Even though yeast does not contain
Lineage-specific expansion of IFIT gene family: an insight into coevolution with IFN gene family.
Liu, Ying; Zhang, Yi-Bing; Liu, Ting-Kai; Gui, Jian-Fang
2013-01-01
In mammals, IFIT (Interferon [IFN]-induced proteins with Tetratricopeptide Repeat [TPR] motifs) family genes are involved in many cellular and viral processes, which are tightly related to the mammalian IFN response. However, little is known about non-mammalian IFIT genes. In the present study, IFIT genes are identified in the genome databases from the jawed vertebrates including the cartilaginous elephant shark but not from non-vertebrates such as lancelet, sea squirt and acorn worm, suggesting that the IFIT gene family originated from a vertebrate ancestor about 450 million years ago. IFIT family genes show conserved gene structure and gene arrangements. Phylogenetic analyses reveal that this gene family has expanded through lineage-specific and species-specific gene duplication. Interestingly, the IFN gene family seems to share a common ancestor and a similar evolutionary mechanism; the functional link of IFIT genes to the IFN response has been present since the origin of both gene families, as evidenced by the finding that zebrafish IFIT genes are upregulated by fish IFNs, poly(I:C) and two transcription factors IRF3/IRF7, likely via the IFN-stimulated response elements (ISRE) within the promoters of vertebrate IFIT family genes. These coevolution features create a functional association of both gene families to fulfill a common biological process, which is likely selected by viral infection during evolution of vertebrates. Our results are helpful for understanding the evolution of the vertebrate IFN system.
Bidard, Frédérique; Coppin, Evelyne; Silar, Philippe
2012-08-01
Transcription pattern during mycelium growth of Podospora anserina was assayed by microarray analysis in wild type and in mutants affected in the MAP kinase genes PaMpk1 and PaMpk2 and in the NADPH oxidase gene PaNox1. 15% of the genes have their expression modified by a factor of two or more as growth proceeds in wild type. The genes whose expression is modified during growth in P. anserina are either not conserved or differently regulated in Neurospora crassa and Aspergillus niger, two fungi for which transcriptome data during growth are available. The P. anserina mutants display a similar alteration of their transcriptome profile, with nearly 1000 genes affected similarly in the three mutants, accounting for their similar growth phenotypes. Yet, each mutant has its specific set of modified transcripts, in line with particular phenotypes exhibited by each mutant. Again, there is limited conservation during evolution of the genes regulated at the transcription level by MAP kinases, as indicated by the comparison of the P. anserina data with those of Aspergillus fumigatus and N. crassa, two fungi for which gene expression data are available for mutants of the MAPK pathways. Among the genes regulated in wild type and affected in the mutants, those involved in carbohydrate and secondary metabolisms appear prominent. The vast majority of the genes differentially expressed are of unknown function. Availability of their transcription profile at various stages of development should help to decipher their role in fungal physiology and development. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Xue, Zhuang; Li, Hui; Liu, Yang; Zhou, Wei; Sun, Jing; Wang, Xiuli
2017-12-01
As a 'living fossil' of species origin and 'rich treasure' of food and nutrition development, sea cucumber has received a lot of attention from researchers. The cDNA library construction and EST sequencing of blood had been conducted previously in our lab. The bioinformatic analysis provided a gene fragment which is highly homologous with the genes of the lectin family, named AjL (Apostichopus japonicus lectin). To characterize and determine the phylogeny of AjL genes in early evolution, we isolated a full-length cDNA of the lectin gene from the body wall of A. japonicus. The open reading frame of this gene contained 489 bp and encoded a 163 amino acids secretory protein homologous to lectins of mammals and aquatic organisms. The deduced protein included a lectin-like domain. SDS-PAGE analysis showed that AjL migrated as a specific band (about 36.09 kDa under reducing conditions), and agglutinated rabbit red blood cells. AjL was similar to chain A of CEL-IV in spatial structure. We predicted that AjL may play the same role as CEL-IV. Our results suggested that more than one lectin gene functioned in sea cucumber and most other species, which was fused by uncertain sequences during evolution and encoded different proteins with diverse functions. Our findings provided insights into the function and characteristics of lectin genes in invertebrates. The results will also be helpful for the identification and structural, functional, and evolutionary analyses of lectin genes.
A human functional protein interaction network and its application to cancer data analysis
2010-01-01
Background One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
The evolution of CHROMOMETHYLASES and gene body DNA methylation in plants.
Bewick, Adam J; Niederhuth, Chad E; Ji, Lexiang; Rohr, Nicholas A; Griffin, Patrick T; Leebens-Mack, Jim; Schmitz, Robert J
2017-05-01
The evolution of gene body methylation (gbM), its origins, and its functional consequences are poorly understood. By pairing the largest collection of transcriptomes (>1000) and methylomes (77) across Viridiplantae, we provide novel insights into the evolution of gbM and its relationship to CHROMOMETHYLASE (CMT) proteins. CMTs are evolutionarily conserved DNA methyltransferases in Viridiplantae. Duplication events gave rise to what are now referred to as CMT1, 2 and 3. Independent losses of CMT1, 2, and 3 in eudicots, CMT2 and ZMET in monocots and monocots/commelinids, variation in copy number, and non-neutral evolution suggest overlapping or fluid functional evolution of this gene family. DNA methylation within genes is widespread and is found in all major taxonomic groups of Viridiplantae investigated. Genes enriched with methylated CGs (mCG) were also identified in species sister to angiosperms. The proportion of genes and DNA methylation patterns associated with gbM are restricted to angiosperms with a functional CMT3 or ortholog. However, mCG-enriched genes in the gymnosperm Pinus taeda shared some similarities with gbM genes in Amborella trichopoda. Additionally, gymnosperms and ferns share a CMT homolog closely related to CMT2 and 3. Hence, the dependency of gbM on a CMT most likely extends to all angiosperms and possibly gymnosperms and ferns. The resulting gene family phylogeny of CMT transcripts from the most diverse sampling of plants to date redefines our understanding of CMT evolution and its evolutionary consequences on DNA methylation. Future functional tests of homologous and paralogous CMTs will uncover novel roles and consequences to the epigenome.
Rossmassler, Karen; Dietrich, Carsten; Thompson, Claire; Mikaelyan, Aram; Nonoh, James O; Scheffrahn, Rudolf H; Sillam-Dussès, David; Brune, Andreas
2015-11-26
Termites are important contributors to carbon and nitrogen cycling in tropical ecosystems. Higher termites digest lignocellulose in various stages of humification with the help of an entirely prokaryotic microbiota housed in their compartmented intestinal tract. Previous studies revealed fundamental differences in community structure between compartments, but the functional roles of individual lineages in symbiotic digestion are mostly unknown. Here, we conducted a highly resolved analysis of the gut microbiota in six species of higher termites that feed on plant material at different levels of humification. Combining amplicon sequencing and metagenomics, we assessed similarities in community structure and functional potential between the major hindgut compartments (P1, P3, and P4). Cluster analysis of the relative abundances of orthologous gene clusters (COGs) revealed high similarities among wood- and litter-feeding termites and strong differences to humivorous species. However, abundance estimates of bacterial phyla based on 16S rRNA genes greatly differed from those based on protein-coding genes. Community structure and functional potential of the microbiota in individual gut compartments are clearly driven by the digestive strategy of the host. The metagenomics libraries obtained in this study provide the basis for future studies that elucidate the fundamental differences in the symbiont-mediated breakdown of lignocellulose and humus by termites of different feeding groups. The high proportion of uncultured bacterial lineages in all samples calls for a reference-independent approach for the correct taxonomic assignment of protein-coding genes.
Kakinuma, Makoto; Inoue, Miho; Morita, Teruwo; Tominaga, Hiroshi; Maegawa, Miyuki; Coury, Daniel A; Amano, Hideomi
2012-05-01
In flowering plants, floral homeotic MADS-box genes, which constitute a large multigene family, play important roles in the specification of floral organs as defined by the ABCDE model. In this study, a MADS-box gene, ZjMADS1, was isolated and characterized from the marine angiosperm Zostera japonica. The predicted length of the ZjMADS1 protein was 246 amino acids (AA), and the AA sequence was most similar to those of the SEPALLATA (SEP) subfamily, corresponding to E-function genes. Southern blot analysis suggested the presence of two SEP3-like genes in the Z. japonica genome. ZjMADS1 mRNA levels were extremely high in the spadices, regardless of the developmental stage, compared to other organs from the reproductive and vegetative shoots. These results suggest that the ZjMADS1 gene may be involved in spadix development in Z. japonica and act as an E-function gene in floral organ development in marine angiosperms. Copyright © 2011 Elsevier Ltd. All rights reserved.
Synnergren, Jane; Améen, Caroline; Jansson, Andreas; Sartipy, Peter
2012-02-27
It is now well documented that human embryonic stem cells (hESCs) can differentiate into functional cardiomyocytes. These cells constitute a promising source of material for use in drug development, toxicity testing, and regenerative medicine. To assess their utility as replacement or complement to existing models, extensive phenotypic characterization of the cells is required. In the present study, we used microarrays and analyzed the global transcription of hESC-derived cardiomyocyte clusters (CMCs) and determined similarities as well as differences compared with reference samples from fetal and adult heart tissue. In addition, we performed a focused analysis of the expression of cardiac ion channels and genes involved in the Ca(2+)-handling machinery, which in previous studies have been shown to be immature in stem cell-derived cardiomyocytes. Our results show that hESC-derived CMCs, on a global level, have a highly similar gene expression profile compared with human heart tissue, and their transcriptional phenotype was more similar to fetal than to adult heart. Despite the high similarity to heart tissue, a number of significantly differentially expressed genes were identified, providing some clues toward understanding the molecular difference between in vivo sourced tissue and stem cell derivatives generated in vitro. Interestingly, some of the cardiac-related ion channels and Ca(2+)-handling genes showed differential expression between the CMCs and heart tissues. These genes may represent candidates for future genetic engineering to create hESC-derived CMCs that better mimic the phenotype of the cardiomyocytes present in the adult human heart.
Genetic similarities between tobacco use disorder and related comorbidities: an exploratory study
2014-01-01
Background Tobacco use disorder (TUD), defined as the use of tobacco to the detriment of a person’s health or social functioning, is associated with various disorders. We hypothesized that mutual variation in genes may partly explain this link. The aims of this study were to make a non-exhaustive inventory of the disorders using (partially) the same genetic pathways as TUD, and to describe the genetic similarities between TUD and the selected disorders. Methods We developed a three-stage approach: (i) selection of genes influencing TUD using Gene2Mesh and Ingenuity Pathway Analysis (IPA), (ii) selection of disorders associated with the selected genes using IPA and (iii) genetic similarities between disorders associated with TUD using Jaccard distance and cluster analyses. Results Fourteen disorders and thirty-two genes met our inclusion criteria. The Jaccard distance between pairs of disorders ranged from 0.00 (e.g. oesophageal cancer and malignant hypertension) to 0.45 (e.g. bladder cancer and addiction). A lower number in the Jaccard distance indicates a higher similarity between the two disorders. Two main clusters of genetically similar disorders were observed, one including coexisting disorders (e.g. addiction and alcoholism) and the other one with the side-effects of smoking (e.g. gastric cancer and malignant hypertension). Conclusions This exploratory study partly explains the potential genetic components linking TUD to other disorders. Two principal clusters of disorders were observed: (i) coexisting disorders of TUD and (ii) side-effects of TUD disorders. A further deepening of this observation in a real life study should allow strengthening this hypothesis. PMID:25060307
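The Jaccard distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each disorder is represented simply as a set of associated gene symbols; the disorder names, gene names, and the helper `pairwise_distances` are hypothetical placeholders, not data or code from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of gene symbols."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical disorder -> gene-set mapping, for illustration only.
disorder_genes = {
    "disorder_A": {"GENE1", "GENE2", "GENE3"},
    "disorder_B": {"GENE1", "GENE2", "GENE3"},
    "disorder_C": {"GENE3", "GENE4"},
}


def pairwise_distances(mapping):
    """Compute the Jaccard distance for every unordered pair of disorders."""
    return {
        (x, y): jaccard_distance(mapping[x], mapping[y])
        for x, y in combinations(sorted(mapping), 2)
    }


distances = pairwise_distances(disorder_genes)
for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

In this toy input, two disorders with identical gene sets have distance 0, which matches the abstract's statement that a lower value means higher similarity. A distance matrix of this kind could then be handed to a clustering routine; the abstract does not say which one was used.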
Long noncoding RNAs as enhancers of gene expression.
Ørom, U A; Derrien, T; Guigo, R; Shiekhattar, R
2010-01-01
The human genome contains thousands of long noncoding RNAs (ncRNAs) transcribed from diverse genomic locations. A large set of long ncRNAs is transcribed independent of protein-coding genes. We have used the GENCODE annotation of the human genome to identify 3019 long ncRNAs expressed in various human cell lines and tissue. This set of long ncRNAs responds to differentiation signals in primary human keratinocytes and is coexpressed with important regulators of keratinocyte development. Depletion of a number of these long ncRNAs leads to the repression of specific genes in their surrounding locus, supportive of an activating function for ncRNAs. Using reporter assays, we confirmed such activating function and show that such transcriptional enhancement is mediated through the long ncRNA transcripts. Our studies show that long ncRNAs exhibit functions similar to classically defined enhancers, through an RNA-dependent mechanism.
Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich
2017-01-01
Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S. antibioticus. PMID:28435299
Zhang, Minlu; Zhu, Cheng; Jacomy, Alexis; Lu, Long J.; Jegga, Anil G.
2011-01-01
The low prevalence rate of orphan diseases (OD) requires special combined efforts to improve diagnosis, prevention, and discovery of novel therapeutic strategies. To identify and investigate relationships based on shared genes or shared functional features, we have conducted a bioinformatic-based global analysis of all orphan diseases with known disease-causing mutant genes. Starting with a bipartite network of known OD and OD-causing mutant genes and using the human protein interactome, we first construct and topologically analyze three networks: the orphan disease network, the orphan disease-causing mutant gene network, and the orphan disease-causing mutant gene interactome. Our results demonstrate that in contrast to the common disease-causing mutant genes that are predominantly nonessential, a majority of orphan disease-causing mutant genes are essential. In confirmation of this finding, we found that OD-causing mutant genes are topologically important in the protein interactome and are ubiquitously expressed. Additionally, functional enrichment analysis of those genes in which mutations cause ODs shows that a majority result in premature death or are lethal in the orthologous mouse gene knockout models. To address the limitations of traditional gene-based disease networks, we also construct and analyze OD networks on the basis of shared enriched features (biological processes, cellular components, pathways, phenotypes, and literature citations). Analyzing these functionally-linked OD networks, we identified several additional OD-OD relations that are both phenotypically similar and phenotypically diverse. Surprisingly, we observed that the wiring of the gene-based and other feature-based OD networks are largely different; this suggests that the relationship between ODs cannot be fully captured by the gene-based network alone. PMID:21664998
Immunoglobulin Genomics in the Guinea Pig (Cavia porcellus)
Guo, Yongchen; Bao, Yonghua; Meng, Qingwen; Hu, Xiaoxiang; Meng, Qingyong; Ren, Liming; Li, Ning; Zhao, Yaofeng
2012-01-01
In science, the guinea pig is known as one of the gold standards for modeling human disease. It is especially important as a molecular and cellular biology model for studying the human immune system, as its immunological genes are more similar to human genes than are those of mice. The utility of the guinea pig as a model organism can be enhanced by further characterization of the genes encoding components of the immune system. Here, we report the genomic organization of the guinea pig immunoglobulin (Ig) heavy and light chain genes. The guinea pig IgH locus is located in genomic scaffolds 54 and 75, and spans approximately 6,480 kb. 507 VH segments (94 potentially functional genes and 413 pseudogenes), 41 DH segments, six JH segments, four constant region genes (μ, γ, ε, and α), and one reverse δ remnant fragment were identified within the two scaffolds. Many VH pseudogenes were found within the guinea pig, and these likely constituted a potential donor pool for gene conversion during evolution. The Igκ locus mapped to a 4,029 kb region of scaffolds 37 and 24 and is composed of 349 Vκ (111 potentially functional genes and 238 pseudogenes), three Jκ and one Cκ genes. The Igλ locus spans 1,642 kb in scaffold 4 and consists of 142 Vλ (58 potentially functional genes and 84 pseudogenes) and 11 Jλ-Cλ clusters. Phylogenetic analysis suggested that the guinea pig’s large germline VH gene segments form limited gene families. Therefore, this species may generate antibody diversity via a gene conversion-like mechanism associated with its pseudogene reserves. PMID:22761756
Orthology detection combining clustering and synteny for very large datasets.
Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K; Prohaska, Sonja J; Stadler, Peter F
2014-01-01
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. PMID:25137074
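As a rough illustration of what an adjacency-based gene order measure looks like, the sketch below counts the neighbouring gene pairs shared by two linear gene orders. It is not the FFAdj-MCS heuristic, nor any part of PoFF or Proteinortho. The function names, the normalisation by the larger adjacency set, and the toy gene orders are assumptions made for this example, and the sketch ignores gene orientation, gene families, and multiple chromosomes.

```python
def adjacencies(order):
    """Return the set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def adjacency_similarity(order_a, order_b):
    """Fraction of adjacencies shared by two gene orders (hypothetical helper).

    Normalised by the larger adjacency set, so the value lies in [0.0, 1.0].
    """
    adj_a, adj_b = adjacencies(order_a), adjacencies(order_b)
    if not adj_a or not adj_b:
        return 0.0  # fewer than two genes: no adjacency to compare
    return len(adj_a & adj_b) / max(len(adj_a), len(adj_b))


# Toy gene orders (invented): genome_b swaps g3 and g4 relative to genome_a.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g2", "g4", "g3", "g5"]

score = adjacency_similarity(genome_a, genome_b)
print(score)
```

In this toy case only the g1-g2 and g3-g4 adjacencies survive the swap, so half of the adjacencies are conserved. A synteny-aware method can use this kind of signal to separate candidates that sequence similarity alone cannot tell apart.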
Cushion, Melanie T; Smulian, A George; Slaven, Bradley E; Sesterhenn, Tom; Arnold, Jonathan; Staben, Chuck; Porollo, Aleksey; Adamczak, Rafal; Meller, Jarek
2007-05-09
Members of the genus Pneumocystis are fungal pathogens that cause pneumonia in a wide variety of mammals with debilitated immune systems. Little is known about their basic biological functions, including life cycle, since no species can be cultured continuously outside the mammalian lung. To better understand the pathological process, about 4500 ESTs derived from sequencing of the poly(A) tail ends of P. carinii mRNAs during fulminate infection were annotated and functionally characterized as unassembled reads, and then clustered and reduced to a unigene set with 1042 members. Because of the presence of sequences from other microbial genomes and the rat host, the analysis and compression to a unigene set was necessarily an iterative process. BLASTx analysis of the unassembled reads (UR) vs. the Uni-Prot and TREMBL databases revealed 56% had similarities to existing polypeptides at E values of ≤10^-6, with the remainder lacking any significant homology. The most abundant transcripts in the UR were associated with stress responses, energy production, transcription and translation. Most (70%) of the UR had similarities to proteins from filamentous fungi (e.g., Aspergillus, Neurospora) and existing P. carinii gene products. In contrast, similarities to proteins of the yeast-like fungi, Schizosaccharomyces pombe and Saccharomyces cerevisiae, predominated in the unigene set. Gene Ontology analysis using BLAST2GO revealed that P. carinii dedicated most of its transcripts to cellular and physiological processes (∼80%) and to molecular binding and catalytic activities (∼70%), and that they were primarily derived from cell and organellar compartments (∼80%). KEGG Pathway mapping showed the putative P. carinii genes represented most standard metabolic pathways and cellular processes, including the tricarboxylic acid cycle, glycolysis, amino acid biosynthesis, cell cycle and mitochondrial function.
Several gene homologs associated with mating, meiosis, and sterol biosynthesis in fungi were identified. Genes encoding the major surface glycoprotein family (MSG), heat shock (HSP70), and proteases (PROT/KEX) were the most abundantly expressed of known P. carinii genes. The apparent presence of many metabolic pathways in P. carinii, sexual reproduction within the host, and lack of an invasive infection process in the immunologically intact host suggest members of the genus Pneumocystis may be adapted parasites and have a compatible relationship with their mammalian hosts. This study represents the first characterization of the expressed genes of a non-culturable fungal pathogen of mammals during the infective process. PMID:17487271
Krisinger, J; Jeung, E B; Simmen, R C; Leung, P C
1995-01-01
The expression of Calbindin-D9k (CaBP-9k) in the pig uterus and placenta was measured by Northern blot analysis and reverse transcription polymerase chain reaction (PCR), respectively. Progesterone (P4) administration to ovariectomized pigs decreased CaBP-9k mRNA levels. Expression of endometrial CaBP-9k mRNA was high on pregnancy Days 10-12 and below the detection limit on Days 15 and 18. On Day 60, expression could be detected at low levels. In myometrium and placenta, CaBP-9k mRNA expression was not detectable by Northern analysis using total RNA. Reverse-transcribed RNA from both tissues demonstrated the presence of CaBP-9k transcripts by means of PCR. The partial CaBP-9k gene was amplified by PCR and cloned to determine the sequence of intron A. In contrast to the rat CaBP-9k gene, the pig gene does not contain a functional estrogen response element (ERE) within this region. A similar ERE-like sequence located at the identical location was examined by gel retardation analysis and failed to bind the estradiol receptor. A similar disruption of this ERE-like sequence has been described in the human CaBP-9k gene, which is not expressed at any level in placenta, myometrium, or endometrium. It is concluded that the pig CaBP-9k gene is regulated in these reproductive tissues in a manner distinct from that in rat and human tissues. The regulation is probably due to a regulatory region outside of intron A, which in the rat gene contains the key cis element for uterine expression of the CaBP-9k gene.
Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir
2018-01-01
Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (functional) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods and in protein function prediction. In this research, we investigate the relationship between the two similarity measures. The results suggest the absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in the proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings are significant for future work on protein comparison and will help to better understand the semantic similarity between proteins.
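To make the comparison of the two measures concrete, the following minimal sketch computes Pearson's correlation coefficient over paired similarity scores and separately counts pairs with low sequence similarity but high semantic similarity. The scores, the thresholds (0.3 and 0.8), and the function name are invented for illustration and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical protein pairs: (sequence similarity, semantic similarity), both in [0, 1].
pairs = [(0.95, 1.00), (0.90, 0.90), (0.20, 0.95),
         (0.15, 0.10), (0.10, 0.85), (0.25, 0.20)]
seq = [s for s, _ in pairs]
sem = [m for _, m in pairs]

r = pearson(seq, sem)

# Pairs that a single correlation coefficient tends to hide:
# dissimilar sequences that nevertheless share function annotations.
low_seq_high_sem = [(s, m) for s, m in pairs if s < 0.3 and m > 0.8]

print(round(r, 3), len(low_seq_high_sem))
```

Reporting the count of such off-diagonal pairs alongside the coefficient is one simple way to see why a single correlation value may not describe the relationship well.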
Comparative Analysis of Vertebrate Dystrophin Loci Indicate Intron Gigantism as a Common Feature
Pozzoli, Uberto; Elgar, Greg; Cagliani, Rachele; Riva, Laura; Comi, Giacomo P.; Bresolin, Nereo; Bardoni, Alessandra; Sironi, Manuela
2003-01-01
The human DMD gene is the largest known to date, spanning > 2000 kb on the X chromosome. The gene size is mainly accounted for by huge intronic regions. We sequenced 190 kb of Fugu rubripes (pufferfish) genomic DNA corresponding to the complete dystrophin gene (FrDMD) and provide the first report of gene structure and sequence comparison among dystrophin genomic sequences from different vertebrate organisms. Almost all intron positions and phases are conserved between FrDMD and its mammalian counterparts, and the predicted protein product of the Fugu gene displays 55% identity and 71% similarity to human dystrophin. In analogy to the human gene, FrDMD presents several-fold longer than average intronic regions. Analysis of intron sequences of the human and murine genes revealed that they are extremely conserved in size and that a similar fraction of total intron length is represented by repetitive elements; moreover, our data indicate that intron expansion through repeat accumulation in the two orthologs is the result of independent insertional events. The hypothesis that intron length might be functionally relevant to the DMD gene regulation is proposed and substantiated by the finding that dystrophin intron gigantism is common to the three vertebrate genes. [Supplemental material is available online at www.genome.org.] PMID:12727896
Pereyra, Luciana P; Hiibel, Sage R; Perrault, Elizabeth M; Reardon, Kenneth F; Pruden, Amy
2012-10-01
Sulfate-reducing permeable reactive zones (SR-PRZs) depend upon a complex microbial community to utilize a lignocellulosic substrate and produce sulfides, which remediate mine drainage by binding heavy metals. To gain insight into the impact of the microbial community composition on the startup time and pseudo-steady-state performance, functional genes corresponding to cellulose-degrading (CD), fermentative, sulfate-reducing, and methanogenic microorganisms were characterized in columns simulating SR-PRZs using quantitative polymerase chain reaction (qPCR) and denaturing gradient gel electrophoresis (DGGE). Duplicate columns were bioaugmented with sulfate-reducing or CD bacteria or biostimulated with ethanol or carboxymethyl cellulose and compared with baseline dairy manure inoculum and uninoculated controls. Sulfate removal began after ~ 15 days for all columns and pseudo-steady state was achieved by Day 30. Despite similar performance, DGGE profiles of 16S rRNA gene and functional genes at pseudo-steady state were distinct among the column treatments, suggesting the potential to control ultimate microbial community composition via bioaugmentation and biostimulation. qPCR revealed enrichment of functional genes in all columns between the initial and pseudo-steady-state time points. This is the first functional gene-based study of CD, fermentative and sulfate-reducing bacteria and methanogenic archaea in a lignocellulose-based environment and provides new qualitative and quantitative insight into startup of a complex microbial system. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
The Center for Regenerative Biology and Medicine at Mount Desert Island Biological Laboratory
2013-06-01
system through in vivo disruption of gene function. 15. SUBJECT TERMS limb regeneration Positional Memory Code Axolotl ...another selection factor to identify those genes that are similarly controlled in both Polypterus and axolotl samples. These comparisons revealed a...sequence IDs among Axolotl and Polypterus contigs that were up-regulated and down regulated greater than 2-fold between 0 and 7 dpa. (Left) The
Wuest, Samuel E; Vijverberg, Kitty; Schmidt, Anja; Weiss, Manuel; Gheyselinck, Jacqueline; Lohr, Miriam; Wellmer, Frank; Rahnenführer, Jörg; von Mering, Christian; Grossniklaus, Ueli
2010-03-23
The development of multicellular organisms is controlled by differential gene expression whereby cells adopt distinct fates. A spatially resolved view of gene expression allows the elucidation of transcriptional networks that are linked to cellular identity and function. The haploid female gametophyte of flowering plants is a highly reduced organism: at maturity, it often consists of as few as three cell types derived from a common precursor [1, 2]. However, because of its inaccessibility and small size, we know little about the molecular basis of cell specification and differentiation in the female gametophyte. Here we report expression profiles of all cell types in the mature Arabidopsis female gametophyte. Differentially expressed posttranscriptional regulatory modules and metabolic pathways characterize the distinct cell types. Several transcription factor families are overrepresented in the female gametophyte in comparison to other plant tissues, e.g., type I MADS domain, RWP-RK, and reproductive meristem transcription factors. PAZ/Piwi-domain encoding genes are upregulated in the egg, indicating a role of epigenetic regulation through small RNA pathways, a feature paralleled in the germline of animals [3]. A comparison of human and Arabidopsis egg cells for enrichment of functional groups identified several similarities that may represent a consequence of coevolution or ancestral gametic features. 2010 Elsevier Ltd. All rights reserved.
Faithful transcription initiation from a mitochondrial promoter in transgenic plastids
Bohne, Alexandra-Viola; Ruf, Stephanie; Börner, Thomas; Bock, Ralph
2007-01-01
The transcriptional machineries of plastids and mitochondria in higher plants exhibit striking similarities. All mitochondrial genes and part of the plastid genes are transcribed by related phage-type RNA polymerases. Furthermore, the majority of mitochondrial promoters and a subset of plastid promoters show a similar structural organization. We show here that the plant mitochondrial atpA promoter is recognized by plastid RNA polymerases in vitro and in vivo. The Arabidopsis phage-type RNA polymerase RpoTp, an enzyme localized exclusively to plastids, was found to recognize the mitochondrial atpA promoter in in vitro assays suggesting the possibility that mitochondrial promoters might function as well in plastids. We have, therefore, generated transplastomic tobacco plants harboring in their chloroplast genome the atpA promoter fused to the coding region of the bacterial nptII gene. The chimeric nptII gene was found to be efficiently transcribed in chloroplasts. Mapping of the 5′ ends of the nptII transcripts revealed accurate recognition of the atpA promoter by the chloroplast transcription machinery. We show further that the 5′ untranslated region (UTR) of the mitochondrial atpA transcript is capable of mediating translation in chloroplasts. The functional and evolutionary implications of these findings as well as possible applications in chloroplast genome engineering are discussed. PMID:17959651
Melanocortin-1 receptor gene variants affect pain and µ-opioid analgesia in mice and humans
Mogil, J; Ritchie, J; Smith, S; Strasburg, K; Kaplan, L; Wallace, M; Romberg, R; Bijl, H; Sarton, E; Fillingim, R; Dahan, A
2005-01-01
Background: A recent genetic study in mice and humans revealed the modulatory effect of MC1R (melanocortin-1 receptor) gene variants on κ-opioid receptor mediated analgesia. It is unclear whether this gene affects basal pain sensitivity or the efficacy of analgesics acting at the more clinically relevant µ-opioid receptor. Objective: To characterise sensitivity to pain and µ-opioid analgesia in mice and humans with non-functional melanocortin-1 receptors. Methods: Comparisons of spontaneous mutant C57BL/6-Mc1r^e/e mice to C57BL/6 wildtype mice, followed by a gene dosage study of pain and morphine-6-glucuronide (M6G) analgesia in humans with MC1R variants. Results: C57BL/6-Mc1r^e/e mutant mice and human redheads, both with non-functional MC1Rs, display reduced sensitivity to noxious stimuli and increased analgesic responsiveness to the µ-opioid selective morphine metabolite, M6G. In both species the differential analgesia is likely due to pharmacodynamic factors, as plasma levels of M6G are similar across genotypes. Conclusions: Genotype at MC1R similarly affects pain sensitivity and M6G analgesia in mice and humans. These findings confirm the utility of cross species translational strategies in pharmacogenetics. PMID:15994880
Genomic and proteomic characterization of a thermophilic Geobacillus bacteriophage GBSV1.
Liu, Bin; Zhou, Fengfeng; Wu, Suijie; Xu, Ying; Zhang, Xiaobo
2009-03-01
Phages are present wherever life is found, and play roles in many biogeochemical and ecological processes. The thermophilic bacteriophages, however, have not been well studied. In this study, phage GBSV1 was obtained from a thermophilic bacterium, Geobacillus sp. 6k51, isolated from a hot spring. GBSV1 contains a double-stranded linear DNA of 34,683 bp, which encodes 54 putative open reading frames (ORFs). Thirty-three of these 54 ORFs exhibit sequence similarities to genes from 7 species of Geobacillus or Bacillus bacteria, as well as to genes of bacteriophages infecting these bacteria. Twenty-two ORFs have been functionally annotated based on both their sequence similarities to known genes and predicted Pfam protein domains. Five structural proteins of the purified GBSV1 virion have been identified by proteomic analyses. Surprisingly, 7 of the GBSV1 ORFs share sequence similarities with genes from bacteria relevant to human diseases. This is the first report that genes of human disease-inducing bacteria are found in a thermophilic phage. It is suggested that thermophilic phages may be a potential evolutionary link between thermophiles and human pathogens. The characterization of GBSV1 may lead to new insights into virus-host interactions and to a better understanding of gene transfers and evolution of life on earth in general.
D'Auria, Giuseppe; Jiménez, Núria; Peris-Bondia, Francesc; Pelaz, Carmen; Latorre, Amparo; Moya, Andrés
2008-01-14
The repeats in toxin (Rtx) are an important pathogenicity factor involved in host cell invasion by Legionella pneumophila and other pathogenic bacteria. Its role in escaping the host immune system and its cytotoxic activity are well known. Its repeated motifs and modularity make Rtx a multifunctional factor in pathogenicity. A comparative analysis of the rtx gene among 6 strains of L. pneumophila showed modularity in their structures. Among the compared genomes, the N-terminal region of the protein presents highly dissimilar repeats with functionally similar domains. In contrast, the C-terminal region is maintained with a fashionable modular configuration, which gives support to its proposed role in adhesion and pore formation. Despite the variability of rtx among the considered strains, the flanking genes are maintained in synteny and similarity. In contrast to the extracellular bacterium Vibrio cholerae, in which the rtx gene is highly conserved and flanking genes have lost synteny and similarity, the gene region coding for the Rtx toxin in the intracellular pathogen L. pneumophila shows rapid evolution. Changes in rtx could play a role in pathogenicity. The interplay of the Rtx toxin with host membranes might lead to the evolution of new variants that are able to escape host cell defences.
History of a prolific family: the Hes/Hey-related genes of the annelid Platynereis.
Gazave, Eve; Guillou, Aurélien; Balavoine, Guillaume
2014-01-01
The Hes superfamily, or Hes/Hey-related genes, encompasses a variety of metazoan-specific bHLH genes with somewhat fuzzy phylogenetic relationships. Hes superfamily members are involved in a variety of major developmental mechanisms in metazoans, notably in neurogenesis and segmentation processes, in which they often act as direct effector genes of the Notch signaling pathway. We have investigated the molecular and functional evolution of the Hes superfamily in metazoans using the lophotrochozoan Platynereis dumerilii as a model. Our phylogenetic analyses of more than 200 metazoan Hes/Hey-related genes revealed the presence of five families, three of them (Hes, Hey and Helt) being pan-metazoan. Those families were likely composed of a unique representative in the last common metazoan ancestor. The evolution of the Hes family was shaped by many independent lineage-specific tandem duplication events. The expression patterns of 13 of the 15 Hes/Hey-related genes in Platynereis indicate a broad functional diversification. Nevertheless, a majority of these genes are involved in two crucial developmental processes in annelids, neurogenesis and segmentation, resembling functions highlighted in other animal models. Combining phylogenetic and expression data, our study suggests an unusual evolutionary history for the Hes superfamily. An ancestral multifunctional annelid Hes gene may have undergone multiple rounds of duplication-degeneration-complementation processes in the lineage leading to Platynereis, each gene copy ensuring its maintenance in the genome by subfunctionalisation. Similar but independent waves of duplications are at the origin of the multiplicity of Hes genes in other metazoan lineages.
Family-specific scaling laws in bacterial genomes.
De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco
2017-07-27
Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
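A power-law scaling of family size with genome size, count ≈ a · size^b, becomes a straight line in log-log space, so the exponent b can be estimated by ordinary least squares on the logarithms. The sketch below shows only that generic procedure; it is not the authors' fitting method, and the function name and the synthetic genome sizes and counts are assumptions for illustration.

```python
import math


def fit_power_law(genome_sizes, family_counts):
    """Fit count = a * size**b by least squares on log-transformed data.

    Returns (a, b). All inputs must be strictly positive, since logs are taken.
    """
    xs = [math.log(s) for s in genome_sizes]
    ys = [math.log(c) for c in family_counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


# Synthetic data: a hypothetical family whose size grows quadratically
# with the number of protein-coding genes.
genome_sizes = [1000, 2000, 4000, 8000]
family_counts = [2e-5 * s ** 2 for s in genome_sizes]

prefactor, exponent = fit_power_law(genome_sizes, family_counts)
print(prefactor, exponent)
```

Families with zero members in some genomes would need separate handling before a fit like this, because the logarithm of zero is undefined.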
Piscopo, Sara-Pier; Drouin, Guy
2014-05-01
Gene conversions are nonreciprocal sequence exchanges between genes. They are relatively common in Saccharomyces cerevisiae, but few studies have investigated the evolutionary fate of gene conversions or their functional impacts. Here, we analyze the evolution and impact of gene conversions between the two genes encoding 2-deoxyglucose-6-phosphate phosphatase in S. cerevisiae, Saccharomyces paradoxus and Saccharomyces mikatae. Our results demonstrate that the last half of each of these genes is subject to gene conversions in these three species. The greater similarity and the greater percentage of GC nucleotides in the converted regions, as well as the absence of long regions of adjacent common converted sites, suggest that these gene conversions are frequent and occur independently in all three species. The high frequency of these conversions probably results from the fact that they have little impact on the protein sequences encoded by these genes.
Gene expression variability in human hepatic drug metabolizing enzymes and transporters.
Yang, Lun; Price, Elvin T; Chang, Ching-Wei; Li, Yan; Huang, Ying; Guo, Li-Wu; Guo, Yongli; Kaput, Jim; Shi, Leming; Ning, Baitang
2013-01-01
Interindividual variability in the expression of drug-metabolizing enzymes and transporters (DMETs) in human liver may contribute to interindividual differences in drug efficacy and adverse reactions. Published studies that analyzed variability in the expression of DMET genes were limited by sample sizes and the number of genes profiled. We systematically analyzed the expression of 374 DMETs from a microarray data set consisting of gene expression profiles derived from 427 human liver samples. The standard deviation of interindividual expression for DMET genes was much higher than that for non-DMET genes. The 20 DMET genes with the largest variability in expression provided examples of the interindividual variation. Gene expression data were also analyzed using network analysis methods, which delineate the similarities of biological functionalities and regulation mechanisms for these highly variable DMET genes. Expression variability of human hepatic DMET genes may affect drug-gene interactions and disease susceptibility, with concomitant clinical implications.
Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji
2007-01-01
We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932
The AtNFXL1 gene functions as a signaling component of the type A trichothecene-dependent response
Asano, Tomoya; Yasuda, Michiko; Nakashita, Hideo; Kimura, Makoto; Yamaguchi, Kazuo
2008-01-01
Phytopathogenic Fusarium species produce the trichothecene family of phytotoxins, which function as virulence factors during infection of plants. Trichothecenes are classifiable into four major groups by their chemical structures. Recently, the AtNFXL1 gene was reported as a type A trichothecene T-2 toxin-inducible gene. The AtNFXL1 gene encodes a putative transcription factor with similarity to the human transcription repressor NF-X1. When Arabidopsis thaliana was grown on agar medium containing trichothecenes, the atnfxl1 mutant exhibited a hypersensitivity phenotype to T-2 toxin, but not to the type B trichothecene deoxynivalenol (DON), in comparison with the wild type. The absence or presence of a carbonyl group at the C8 position distinguishes type A and type B. The growth defect caused by another type A trichothecene, diacetoxyscirpenol (DAS), was weakly enhanced in the atnfxl1 mutant. Diacetoxyscirpenol is distinguishable from T-2 toxin only by the absence of an isovaleryl group at the C8 position. Correspondingly, the AtNFXL1 promoter activity was apparently induced in T-2 toxin-treated and DAS-treated plants. In contrast, DON failed to induce the AtNFXL1 promoter activity. Consequently, the AtNFXL1 gene functions as a signaling component of the type A trichothecene-dependent response in Arabidopsis. In addition, the C8 position of trichothecenes might be closely related to the function of the AtNFXL1 gene. PMID:19704430
Senthilkumar, Palanisamy; Thirugnanasambantham, Krishnaraj; Mandal, Abul Kalam Azad
2012-12-01
Tea (Camellia sinensis (L.) O. Kuntze) is an economically important plant cultivated for its leaves. Infection of leaves by Pestalotiopsis theae causes gray blight disease and enormous losses to the tea industry. We used the suppressive subtractive hybridization (SSH) technique to unravel the differential gene expression pattern during gray blight disease development in tea. Complementary DNA from P. theae-infected and uninfected leaves of the disease-tolerant cultivar UPASI-10 was used as the tester and driver populations, respectively. Subtraction efficiency was confirmed by comparing the abundance of the β-actin gene. A total of 377 and 720 clones with insert size >250 bp from the forward and reverse libraries, respectively, were sequenced and analyzed. Basic Local Alignment Search Tool analysis revealed that, in the forward SSH library, 17 sequences have a high degree of similarity with disease and hypersensitive response-related genes and 20 sequences with hypothetical proteins, while in the reverse SSH library, 23 sequences have a high degree of similarity with disease and stress response-related genes and 15 sequences with hypothetical proteins. Functional analysis indicated unknown (61 and 59 %) or hypothetical functions (23 and 18 %) for most of the differentially regulated genes in the forward and reverse SSH libraries, respectively, while others have important roles in different cellular activities. The majority of the upregulated genes are related to the hypersensitive response and reactive oxygen species production. Based on these expressed sequence tag data, the putative roles of differentially expressed genes are discussed in relation to disease. We also demonstrated the efficiency of SSH as a tool for enriching gray blight disease-related up- and downregulated genes in tea. The present study revealed that many genes related to disease resistance were suppressed during P. theae infection and that enhancing these genes by the application of inducers may impart better disease tolerance to the plants.
Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing
2006-01-01
Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics.
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500
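The abstract above does not name the statistical test it applied to EST counts. Purely as an illustrative stand-in, the sketch below uses a one-sided Fisher's exact test (a hypergeometric tail probability) to ask whether a gene's ESTs are over-represented in one tissue category. The function name, its arguments, and the counts are hypothetical and are not taken from the study.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    N: total ESTs across all tissue categories (hypothetical count)
    K: ESTs in total that map to the gene of interest
    n: ESTs that come from the tissue category being tested
    k: ESTs that map to the gene AND come from that tissue
    Returns P(X >= k) under random assignment of ESTs to tissues.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy example: 10 ESTs overall, 3 from the gene, 4 from the tissue,
# and all 3 gene ESTs fall in that tissue.
p = enrichment_p_value(k=3, n=4, K=3, N=10)
print(f"p = {p:.4f}")
```

A gene would be called tissue-specific when this p value falls below a chosen threshold. The abstract mentions that several thresholds were tried but does not state their values.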
Sineokiĭ, S P; Pogosov, V Z; Iankovskiĭ, N K; Krylov, V N
1976-01-01
A total of 123 amber mutants of the lambdoid bacteriophage phi81 were isolated and assigned to 19 complementation groups. Deletion mapping made it possible to locate 5 gene groups on the genetic map of bacteriophage phi81 and to determine the region where the mm' sticky ends may lie on the prophage genetic map. A gene of phage phi81 that controls adsorption specificity was localized, and its functional similarity to the corresponding gene of phage phi80 was demonstrated.
Age-related regulation of genes: slow homeostatic changes and age-dimension technology
NASA Astrophysics Data System (ADS)
Kurachi, Kotoku; Zhang, Kezhong; Huo, Jeffrey; Ameri, Afshin; Kuwahara, Mitsuhiro; Fontaine, Jean-Marc; Yamamoto, Kei; Kurachi, Sumiko
2002-11-01
Through systematic studies of pro- and anti-blood coagulation factors, we have determined molecular mechanisms involving two genetic elements: the age-related stability element (ASE), GAGGAAG, and the age-related increase element (AIE), a unique stretch of dinucleotide repeats. ASE and AIE are essential for age-related patterns of stable and increased gene expression, respectively. Such age-related gene regulatory mechanisms are also critical for explaining homeostasis in various physiological reactions as well as slow homeostatic changes in them. The age-related increase in expression of the human factor IX (hFIX) gene requires the presence of both ASE and AIE, which apparently function additively. The anti-coagulant factor protein C (hPC) gene uses an ASE (CAGGAG) to produce age-related stable expression. Both ASE sequences (G/CAGAAG) share the consensus sequence of the transcriptional factor PEA-3 element. No other similar sequences, including another PEA-3 consensus sequence, GAGGATG, function in conferring age-related gene regulation. The age-regulatory mechanisms involving ASE and AIE apparently function universally with different genes and across different animal species. These findings have led us to develop a new field of research and applications, which we named “age-dimension technology (ADT)”. ADT has exciting potential for modifying age-related expression of genes as well as associated physiological processes, and for developing novel, more effective prophylaxis or treatments for age-related diseases.
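As a small illustration of how short elements such as the ASE motifs quoted above (GAGGAAG, CAGGAG) might be located in a DNA sequence, the sketch below scans both strands for exact matches. The helper names and the test sequence are hypothetical. The abstract does not describe how the elements were originally identified, so this is not the authors' procedure.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def find_motif(seq, motif):
    """Return sorted (start, strand) pairs for exact motif matches.

    Positions are 0-based and refer to the forward strand; a "-" hit
    means the reverse complement of the motif occurs at that position.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

# Made-up sequence containing one forward and one reverse-strand copy.
example = "TTGAGGAAGCCCTTCCTCAA"
print(find_motif(example, "GAGGAAG"))
```

Exact matching is the simplest choice here. A consensus with alternative bases would need a regular expression or a position weight matrix instead.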
Chemical-genetic profile analysis of five inhibitory compounds in yeast.
Alamgir, Md; Erukova, Veronika; Jessulat, Matthew; Azizi, Ali; Golshani, Ashkan
2010-08-06
Chemical-genetic profiling of inhibitory compounds can lead to identification of their modes of action. These profiles can help elucidate the complex interactions between small bioactive compounds and the cell machinery, and explain putative gene function(s). Colony size reduction was used to investigate the chemical-genetic profile of cycloheximide, 3-amino-1,2,4-triazole, paromomycin, streptomycin and neomycin in the yeast Saccharomyces cerevisiae. These compounds target the process of protein biosynthesis. More than 70,000 strains were analyzed from the array of gene deletion mutant yeast strains. As expected, the overall profiles of the tested compounds were similar, with deletions for genes involved in protein biosynthesis being the major category followed by metabolism. This implies that novel genes involved in protein biosynthesis could be identified from these profiles. Further investigations were carried out to assess the activity of three profiled genes in the process of protein biosynthesis using relative fitness of double mutants and other genetic assays. Chemical-genetic profiles provide insight into the molecular mechanism(s) of the examined compounds by elucidating their potential primary and secondary cellular target sites. Our follow-up investigations into the activity of three profiled genes in the process of protein biosynthesis provided further evidence concerning the usefulness of chemical-genetic analyses for annotating gene functions. We termed these genes TAE2, TAE3 and TAE4 for translation associated elements 2-4.
Lears, Kimberly A.; Parry, Jesse J.; Andrews, Rebecca; Nguyen, Kim; Wadas, Thaddeus J.; Rogers, Buck E.
2015-01-01
Suicide gene therapy is a process by which cells are administered a gene that encodes a protein capable of converting a nontoxic prodrug into an active toxin. Cytosine deaminase (CD) has been widely investigated as a means of suicide gene therapy due to the enzyme’s ability to convert the prodrug 5-fluorocytosine (5-FC) into the toxic compound 5-fluorouracil (5-FU). However, the extent of gene transfer is a limiting factor in predicting therapeutic outcome. The ability to monitor gene transfer, non-invasively, would strengthen the efficiency of therapy. In this regard, we have constructed and evaluated a replication-deficient adenovirus (Ad) containing the human somatostatin receptor subtype 2 (SSTR2) fused with a C-terminal yeast CD gene for the non-invasive monitoring of gene transfer and therapy. The resulting Ad (AdSSTR2-yCD) was evaluated in vitro in breast cancer cells to determine the function of the fusion protein. These studies demonstrated that both SSTR2 and yCD were functional in binding assays, conversion assays, and cytotoxicity assays. In vivo studies similarly demonstrated the functionality using conversion assays, biodistribution studies, and small animal positron-emission tomography (PET) imaging studies. In conclusion, the fusion protein has been validated as useful for the non-invasive imaging of yCD expression and will be evaluated in the future for monitoring yCD-based therapy. PMID:25837665
CRISPR-Cas9 and CRISPR-Cpf1 mediated targeting of a stomatal developmental gene EPFL9 in rice.
Yin, Xiaojia; Biswal, Akshaya K; Dionora, Jacqueline; Perdigon, Kristel M; Balahadia, Christian P; Mazumdar, Shamik; Chater, Caspar; Lin, Hsiang-Chun; Coe, Robert A; Kretzschmar, Tobias; Gray, Julie E; Quick, Paul W; Bandyopadhyay, Anindya
2017-05-01
The CRISPR-Cas9/Cpf1 system, with its unique gene targeting efficiency, could be an important tool for functional study of early developmental genes through the generation of successful knockout plants. The introduction and utilization of systems biology approaches have identified several genes that are involved in early development of a plant, and with such knowledge a robust tool is required for the functional validation of the putative candidate genes thus obtained. The development of the CRISPR-Cas9/Cpf1 genome editing system has provided a convenient tool for creating loss-of-function mutants for genes of interest. The present study utilized CRISPR-Cas9 and CRISPR-Cpf1 technology to knock out the rice orthologue of an early developmental gene, EPFL9 (Epidermal Patterning Factor like-9, a positive regulator of stomatal development in Arabidopsis). The germ-line mutants that were generated showed edits that were carried forward into the T2 generation, when Cas9-free homozygous mutants were obtained. The homozygous mutant plants showed more than an eightfold reduction in stomatal density on the abaxial leaf surface. Potential off-target analysis showed no significant off-target effects. This study also utilized CRISPR-LbCpf1 (Lachnospiraceae bacterium Cpf1) to target the same OsEPFL9 gene to test the activity of this class-2 CRISPR system in rice and found that Cpf1 is also capable of genome editing and that its edits are transmitted through generations with phenotypic changes similar to those seen with CRISPR-Cas9. This study demonstrates the application of CRISPR-Cas9/Cpf1 to precisely target genomic locations and develop transgene-free homozygous heritable gene edits. It also confirms that loss-of-function analysis of the candidate genes emerging from different systems biology based approaches can be performed, and that this system therefore adds value in the validation of gene function.
A literature-driven method to calculate similarities among diseases.
Kim, Hyunjin; Yoon, Youngmi; Ahn, Jaegyoon; Park, Sanghyun
2015-11-01
"Our lives are connected by a thousand invisible threads and along these sympathetic fibers, our actions run as causes and return to us as results". This famous quote from Herman Melville describes connections among human lives. To paraphrase Melville, diseases are connected by many functional threads, and along these sympathetic fibers diseases run as causes and return as results. The quote explains the reason for researching disease-disease similarity and disease networks. Measuring similarities between diseases and constructing a disease network can play an important role in disease function research and in disease treatment. To estimate disease-disease similarities, we proposed a novel literature-based method. The proposed method extracted disease-gene relations and disease-drug relations from the literature and used the frequencies of occurrence of the relations as features to calculate similarities among diseases. We also constructed a disease network with top-ranking disease pairs from our method. The proposed method discovered a larger number of answer disease pairs than other comparable methods and showed the lowest p-value. We presume that our method showed good results because it uses literature data, uses all possible gene symbols and drug names as features of a disease, and determines feature values of diseases from the frequencies of co-occurrence of two entities. The disease-disease similarities from the proposed method can be used in computational biology research that relies on similarities among diseases.
Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne
2015-01-01
Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026
TAL effectors and the executor R genes
Zhang, Junli; Yin, Zhongchao; White, Frank
2015-01-01
Transcription activator-like (TAL) effectors are bacterial type III secretion proteins that function as transcription factors in plants during Xanthomonas/plant interactions, conditioning either host susceptibility and/or host resistance. Three types of TAL effector associated resistance (R) genes have been characterized—recessive, dominant non-transcriptional, and dominant TAL effector-dependent transcriptional based resistance. Here, we discuss the last type of R genes, whose functions are dependent on direct TAL effector binding to discrete effector binding elements in the promoters. Only five of the so-called executor R genes have been cloned, and commonalities are not clear. We have placed the protein products in two groups for conceptual purposes. Group 1 consists solely of the protein from pepper, BS3, which is predicted to have catalytic function on the basis of homology to a large conserved protein family. Group 2 consists of BS4C-R, XA27, XA10, and XA23, all of which are relatively short proteins from pepper or rice with multiple potential transmembrane domains. Group 2 members have low sequence similarity to proteins of unknown function in closely related species. Firm predictions await further experimentation on these interesting new members to the R gene repertoire, which have potential broad application in new strategies for disease resistance. PMID:26347759
De Novo Protein Structure Prediction
NASA Astrophysics Data System (ADS)
Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram
An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
Modeling Fragile X Syndrome in Drosophila
Drozd, Małgorzata; Bardoni, Barbara; Capovilla, Maria
2018-01-01
Intellectual disability (ID) and autism are hallmarks of Fragile X Syndrome (FXS), a hereditary neurodevelopmental disorder. The gene responsible for FXS is Fragile X Mental Retardation gene 1 (FMR1) encoding the Fragile X Mental Retardation Protein (FMRP), an RNA-binding protein involved in RNA metabolism and modulating the expression level of many targets. Most cases of FXS are caused by silencing of FMR1 due to CGG expansions in the 5′-UTR of the gene. Humans also carry the FXR1 and FXR2 paralogs of FMR1 while flies have only one FMR1 gene, here called dFMR1, sharing the same level of sequence homology with all three human genes, but functionally most similar to FMR1. This enables a much easier approach for FMR1 genetic studies. Drosophila has been widely used to investigate FMR1 functions at genetic, cellular, and molecular levels since dFMR1 mutants have many phenotypes in common with the wide spectrum of FMR1 functions that underlie the disease. In this review, we present very recent Drosophila studies investigating FMRP functions at genetic, cellular, molecular, and electrophysiological levels in addition to research on pharmacological treatments in the fly model. These studies have the potential to aid the discovery of pharmacological therapies for FXS. PMID:29713264
Petersen-Jones, Simon M.; Occelli, Laurence M.; Winkler, Paige A.; Lee, Winston; Sparrow, Janet R.; Tsukikawa, Mai; Boye, Sanford L.; Chiodo, Vince; Capasso, Jenina E.; Becirovic, Elvir; Schön, Christian; Seeliger, Mathias W.; Levin, Alex V.; Hauswirth, William W.
2017-01-01
Retinitis pigmentosa (RP) is a major cause of blindness that affects 1.5 million people worldwide. Mutations in cyclic nucleotide-gated channel β 1 (CNGB1) cause approximately 4% of autosomal recessive RP. Gene augmentation therapy shows promise for treating inherited retinal degenerations; however, relevant animal models and biomarkers of progression in patients with RP are needed to assess therapeutic outcomes. Here, we evaluated RP patients with CNGB1 mutations for potential biomarkers of progression and compared human phenotypes with those of mouse and dog models of the disease. Additionally, we used gene augmentation therapy in a CNGβ1-deficient dog model to evaluate potential translation to patients. CNGB1-deficient RP patients and mouse and dog models had a similar phenotype characterized by early loss of rod function and slow rod photoreceptor loss with a secondary decline in cone function. Advanced imaging showed promise for evaluating RP progression in human patients, and gene augmentation using adeno-associated virus vectors robustly sustained the rescue of rod function and preserved retinal structure in the dog model. Together, our results reveal an early loss of rod function in CNGB1-deficient patients and a wide window for therapeutic intervention. Moreover, the identification of potential biomarkers of outcome measures, availability of relevant animal models, and robust functional rescue from gene augmentation therapy support future work to move CNGB1-RP therapies toward clinical trials. PMID:29202463
Abi Rached, L; McDermott, M F; Pontarotti, P
1999-02-01
The human Major Histocompatibility Complex (MHC) shares similarities with three other chromosome regions in human. This could be the vestige of ancestral large scale duplications. We discuss here the possibility i) that these duplications occurred during two rounds of tetraploidization supposed to have taken place during chordate evolution before the jawed vertebrate radiation, and ii) that one of the quadruplicate regions, relaxed of functional constraints, gave rise to the vertebrate MHC by a quick round of gene cis-duplication and cis-exon shuffling. These different rounds of cis-duplications and exon shufflings allowed the emergence of new genes participating in novel biological functions i.e. adaptive immune responses. Cis-duplications and cis-exon shufflings are ongoing processes in the evolution of some of these genes in this region as they have occurred and were fixed at different times and in different lineages during vertebrate evolution. In contrast, other genes within the MHC have remained stable since the emergence of jawed vertebrates.
A systematic approach to infer biological relevance and biases of gene network structures.
Antonov, Alexey V; Tetko, Igor V; Mewes, Hans W
2006-01-10
The development of high-throughput technologies has generated the need for bioinformatics approaches to assess the biological relevance of gene networks. Although several tools have been proposed for analysing the enrichment of functional categories in a set of genes, none of them is suitable for evaluating the biological relevance of the gene network. We propose a procedure and develop a web-based resource (BIOREL) to estimate the functional bias (biological relevance) of any given genetic network by integrating different sources of biological information. The weights of the edges in the network may be either binary or continuous. These essential features make our web tool unique among many similar services. BIOREL provides standardized estimations of the network biases extracted from independent data. By the analyses of real data we demonstrate that the potential application of BIOREL ranges from various benchmarking purposes to systematic analysis of the network biology.
A Seven-Gene Locus for Synthesis of Phenazine-1-Carboxylic Acid by Pseudomonas fluorescens 2-79
Mavrodi, Dmitri V.; Ksenzenko, Vladimir N.; Bonsall, Robert F.; Cook, R. James; Boronin, Alexander M.; Thomashow, Linda S.
1998-01-01
Pseudomonas fluorescens 2-79 produces the broad-spectrum antibiotic phenazine-1-carboxylic acid (PCA), which is active against a variety of fungal root pathogens. In this study, seven genes designated phzABCDEFG that are sufficient for synthesis of PCA were localized within a 6.8-kb BglII-XbaI fragment from the phenazine biosynthesis locus of strain 2-79. Polypeptides corresponding to all phz genes were identified by analysis of recombinant plasmids in a T7 promoter/polymerase expression system. Products of the phzC, phzD, and phzE genes have similarities to enzymes of shikimic acid and chorismic acid metabolism and, together with PhzF, are absolutely necessary for PCA production. PhzG is similar to pyridoxamine-5′-phosphate oxidases and probably is a source of cofactor for the PCA-synthesizing enzyme(s). Products of the phzA and phzB genes are highly homologous to each other and may be involved in stabilization of a putative PCA-synthesizing multienzyme complex. Two new genes, phzX and phzY, that are homologous to phzA and phzB, respectively, were cloned and sequenced from P. aureofaciens 30-84, which produces PCA, 2-hydroxyphenazine-1-carboxylic acid, and 2-hydroxyphenazine. Based on functional analysis of the phz genes from strains 2-79 and 30-84, we postulate that different species of fluorescent pseudomonads have similar genetic systems that confer the ability to synthesize PCA. PMID:9573209
Xu, Yingchun; Wang, Yanjie; Mattson, Neil; Yang, Liu; Jin, Qijiang
2017-12-01
Trehalose-6-phosphate synthase (TPS) serves important functions in plant desiccation tolerance and response to environmental stimuli. At present, a comprehensive analysis of this gene family, i.e. its functional classification, molecular evolution, and expression patterns, is still lacking in Solanum tuberosum (potato). In this study, a comprehensive analysis of the TPS gene family was conducted in potato. A total of eight putative potato TPS genes (StTPSs) were identified by searching the latest potato genome sequence. The amino acid identity among the eight StTPSs varied from 59.91 to 89.54%. Analysis of dN/dS ratios suggested that regions in the TPP (trehalose-6-phosphate phosphatase) domains evolved faster than the TPS domains. Although the sequences of the eight StTPSs are similar in length (2571-2796 bp), their gene lengths differ widely (3189-8406 bp). Many regulatory elements possibly related to phytohormones, abiotic stress and development were identified in different TPS genes. Based on the phylogenetic tree constructed using TPS genes of potato and four other Solanaceae plants, TPS genes could be categorized into 6 distinct groups. Analysis revealed that purifying selection most likely played a major role during the evolution of this family. Amino acid changes detected in specific branches of the phylogenetic tree suggest that relaxed constraints might have contributed to functional divergence among groups. Moreover, analysis of transcriptome data and qRT-PCR showed that StTPSs exhibit tissue- and treatment-specific expression patterns. This study provides a reference for genome-wide identification of the potato TPS gene family and sets a framework for further functional studies of this important gene family in development and stress response.
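The amino acid identity figures quoted above come from pairwise sequence comparison. A minimal sketch of one common way to compute percent identity from two already-aligned sequences follows. The function name, the gap handling (gapped columns are skipped), and the toy sequences are assumptions for illustration; the abstract does not say which alignment tool or identity definition was used, and other definitions divide by alignment length or by the shorter sequence instead.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment.

    Both inputs must be rows of the same alignment (equal length).
    Columns where either sequence has a gap are excluded from the count.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("MKT-LV", "MKTALI"))
```

Applied to every pair among eight sequences, this would give the 28 pairwise values whose minimum and maximum define a range like the one reported.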
Modularity and evolutionary constraints in a baculovirus gene regulatory network
2013-01-01
Background The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of AgMNPV gene expression. Results We observed an earlier onset of expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated functionally related key genes into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization.
Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates that modularity may be a general feature of biological gene regulatory networks. PMID:24006890
NASA Technical Reports Server (NTRS)
Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.
2003-01-01
BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.
Lxr regulates lipid metabolic and visual perception pathways during zebrafish development.
Pinto, Caroline Lucia; Kalasekar, Sharanya Maanasi; McCollum, Catherine W; Riu, Anne; Jonsson, Philip; Lopez, Justin; Swindell, Eric C; Bouhlatouf, Abdel; Balaguer, Patrick; Bondesson, Maria; Gustafsson, Jan-Åke
2016-01-05
The Liver X Receptors (LXRs) play important roles in multiple metabolic pathways, including fatty acid, cholesterol, carbohydrate and energy metabolism. To expand the knowledge of the functions of LXR signaling during embryonic development, we performed a whole-genome microarray analysis of Lxr target genes in zebrafish larvae treated with either one of the synthetic LXR ligands T0901317 or GW3965. Assessment of the biological processes enriched by differentially expressed genes revealed a prime role for Lxr in regulating lipid metabolic processes, similarly to the function of LXR in mammals. In addition, exposure to the Lxr ligands induced changes in expression of genes in the neural retina and lens of the zebrafish eye, including the photoreceptor guanylate cyclase activators and lens gamma crystallins, suggesting a potential novel role for Lxr in modulating the transcription of genes associated with visual function in zebrafish. The regulation of expression of metabolic genes was phenotypically reflected in an increased absorption of yolk in the zebrafish larvae, and changes in the expression of genes involved in visual perception were associated with morphological alterations in the retina and lens of the developing zebrafish eye. The regulation of expression of both lipid metabolic and eye specific genes was sustained in 1 month old fish. The transcriptional networks demonstrated several conserved effects of LXR activation between zebrafish and mammals, and also identified potential novel functions of Lxr, supporting zebrafish as a promising model for investigating the role of Lxr during development. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Lxr regulates lipid metabolic and visual perception pathways during zebrafish development
Pinto, Caroline Lucia; Kalasekar, Sharanya Maanasi; McCollum, Catherine W.; Riu, Anne; Jonsson, Philip; Lopez, Justin; Swindell, Eric; Bouhlatouf, Abdel; Balaguer, Patrick; Bondesson, Maria; Gustafsson, Jan-Åke
2015-01-01
The Liver X Receptors (LXRs) play important roles in multiple metabolic pathways, including fatty acid, cholesterol, carbohydrate and energy metabolism. To expand the knowledge of the functions of LXR signaling during embryonic development, we performed a whole-genome microarray analysis of Lxr target genes in zebrafish larvae treated with either one of the synthetic LXR ligands T0901317 or GW3965. Assessment of the biological processes enriched by differentially expressed genes revealed a prime role for Lxr in regulating lipid metabolic processes, similarly to the function of LXR in mammals. In addition, exposure to the Lxr ligands induced changes in expression of genes in the neural retina and lens of the zebrafish eye, including the photoreceptor guanylate cyclase activators and lens gamma crystallins, suggesting a potential novel role for Lxr in modulating the transcription of genes associated with visual function in zebrafish. The regulation of expression of metabolic genes was phenotypically reflected in an increased absorption of yolk in the zebrafish larvae, and changes in the expression of genes involved in visual perception were associated with morphological alterations in the retina and lens of the developing zebrafish eye. The regulation of expression of both lipid metabolic and eye specific genes was sustained in 1 month old fish. The transcriptional networks demonstrated several conserved effects of LXR activation between zebrafish and mammals, and also identified potential novel functions of Lxr, supporting zebrafish as a promising model for investigating the role of Lxr during development. PMID:26427652
Chao, Yuanqing; Ma, Liping; Yang, Ying; Ju, Feng; Zhang, Xu-Xiang; Wu, Wei-Min; Zhang, Tong
2013-12-19
A metagenomic approach was applied to characterize variations in microbial structure and function in raw water (RW) and treated water (TW) at a drinking water treatment plant (DWTP) in the Pearl River Delta, China. Microbial structure was significantly influenced by the treatment processes, shifting from Gammaproteobacteria and Betaproteobacteria in RW to Alphaproteobacteria in TW. Further functional analysis indicated that the basic metabolic functions of microorganisms in TW did not vary considerably. However, protective functions, i.e. glutathione synthesis genes in the 'oxidative stress' and 'detoxification' subsystems, significantly increased, revealing that the surviving bacteria may have higher chlorine resistance. Similar results were found for the glutathione metabolism pathway, which identified the major reaction for glutathione synthesis and indicated that more genes for glutathione metabolism were present in TW. This metagenomic study enhanced our knowledge of the influence of treatment processes, especially chlorination, on bacterial community structure and protective functions (e.g. glutathione metabolism) in DWTP ecosystems.
Muhammad, Izhar; Jing, Xiu-Qing; Shalmani, Abdullah; Ali, Muhammad; Yi, Shi; Gan, Peng-Fei; Li, Wen-Qiang; Liu, Wen-Ting; Chen, Kun-Ming
2018-05-12
The ferric reduction oxidase (FRO) gene family is widely found in plants, is involved in various biological processes, and may play an essential role in metal homeostasis, tolerance and intricate signaling networks in response to a number of abiotic stresses. Our study describes the identification, characterization and evolutionary relationships of FRO gene families. In total, 50 FRO genes in Plantae and 15 'FRO-like' genes in non-Plantae were retrieved from 16 different species. The FRO genes were divided into seven clades according to close similarity in biological and functional behavior. Three conserved domains were common to the FRO genes, while the FROs of two sub genomes have an extra NADPH-Ox domain, separating the function of plant FROs. OsFRO1 and OsFRO7 were expressed constitutively in rice plants. Real-time RT-PCR analysis demonstrated that the expression of OsFRO1 was high in the flag leaf, and that OsFRO7 expression was highest in the leaf blade and flag leaf. Both genes showed strong expression in response to different abiotic and hormone treatments. Moreover, the expression of both genes was also substantial under heavy metal stress. OsFRO1 expression was triggered after 6 h under Zn, Pb, Co and Ni treatments, whereas OsFRO7 expression was triggered under Fe, Pb and Ni after 12 h, Zn and Cr after 6 h, and Mn and Co after 3 h. These findings suggest the possible involvement of both genes in abiotic and metal stress responses and in the regulation of phytohormones. Therefore, our current work may provide the foundation for further functional characterization of the rice FRO gene family.
Discovering functional modules by topic modeling RNA-Seq based toxicogenomic data.
Yu, Ke; Gong, Binsheng; Lee, Mikyung; Liu, Zhichao; Xu, Joshua; Perkins, Roger; Tong, Weida
2014-09-15
Toxicogenomics (TGx) endeavors to elucidate the underlying molecular mechanisms through exploring gene expression profiles in response to toxic substances. Recently, RNA-Seq has increasingly been regarded as a more powerful alternative to microarrays in TGx studies. However, realizing RNA-Seq's full potential requires novel approaches to extracting information from the complex TGx data. Considering read counts as the number of times a word occurs in a document, gene expression profiles from RNA-Seq are analogous to a word-by-document matrix used in text mining. Topic modeling, which aims to discover the latent structures in text corpora, would be helpful for exploring RNA-Seq based TGx data. In this study, topic modeling was applied to a typical RNA-Seq based TGx data set to discover hidden functional modules. The RNA-Seq based gene expression profiles were transformed into "documents", on which latent Dirichlet allocation (LDA) was used to build a topic model. We found that samples treated with compounds with the same modes of action (MoAs) could be clustered based on topic similarities. The topic most relevant to each cluster was identified as a "marker" topic, which was interpreted by gene enrichment analysis with MoAs and then confirmed by compound and pathway associations mined from the literature. To further validate the "marker" topics, we tested topic transferability from RNA-Seq to microarrays. The RNA-Seq based gene expression profile of a topic specifically associated with the peroxisome proliferator-activated receptor (PPAR) signaling pathway was used to query samples with similar expression profiles in two different microarray data sets, yielding an accuracy of about 85%. This proof-of-concept study demonstrates the applicability of topic modeling to discovering functional modules in RNA-Seq data and suggests a valuable computational tool for leveraging information within TGx data in the RNA-Seq era.
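The analogy between read counts and word counts in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the sample names, gene symbols and counts are invented toy values, and the helper `profile_to_document` is a hypothetical name introduced only for this illustration. It shows one plausible way to turn a gene-to-read-count profile into a "document" whose word frequencies equal the read counts, which is the kind of input a topic model such as LDA could then consume.

```python
from collections import Counter

def profile_to_document(read_counts):
    """Turn one sample's gene -> read count mapping into a list of gene 'words'.

    Each gene symbol is repeated once per read, so word frequency in the
    resulting 'document' equals the read count (hypothetical helper).
    """
    words = []
    for gene, count in sorted(read_counts.items()):
        words.extend([gene] * count)
    return words

# Toy expression profiles (invented values, for illustration only).
samples = {
    "sample_A": {"Acox1": 3, "Cyp4a1": 2, "Gapdh": 1},
    "sample_B": {"Acox1": 0, "Cyp4a1": 1, "Gapdh": 4},
}

# One 'document' per sample.
documents = {name: profile_to_document(counts) for name, counts in samples.items()}

# Counting words per document recovers the word-by-document matrix.
term_counts = {name: Counter(doc) for name, doc in documents.items()}
```

In practice raw read counts are large, so some scaling or binning step would probably be needed before building documents; that step is left out here because the abstract does not describe it.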
Draeger, Christian; Ndinyanka Fabrice, Tohnyui; Gineau, Emilie; Mouille, Grégory; Kuhn, Benjamin M; Moller, Isabel; Abdou, Marie-Therese; Frey, Beat; Pauly, Markus; Bacic, Antony; Ringli, Christoph
2015-06-24
Leucine-rich repeat extensins (LRXs) are extracellular proteins consisting of an N-terminal leucine-rich repeat (LRR) domain and a C-terminal extensin domain containing the typical features of this class of structural hydroxyproline-rich glycoproteins (HRGPs). The LRR domain is likely to bind an interaction partner, whereas the extensin domain has an anchoring function to insolubilize the protein in the cell wall. Based on the analysis of the root hair-expressed LRX1 and LRX2 of Arabidopsis thaliana, LRX proteins are important for cell wall development. The importance of LRX proteins in non-root hair cells and the structural changes induced by mutations in LRX genes remain elusive. The LRX gene family of Arabidopsis consists of eleven members, of which LRX3, LRX4, and LRX5 are expressed in aerial organs, such as leaves and stem. The importance of these LRX genes for plant development and particularly cell wall formation was investigated. Synergistic effects of mutations with gradually more severe growth retardation phenotypes in double and triple mutants suggest a similar function of the three genes. Analysis of cell wall composition revealed a number of changes to cell wall polysaccharides in the mutants. LRX3, LRX4, and LRX5, and most likely LRX proteins in general, are important for cell wall development. Due to the complexity of changes in cell wall structures in the lrx mutants, the exact function of LRX proteins remains to be determined. The increasingly strong growth-defect phenotypes in double and triple mutants suggest that the LRX proteins have similar functions and that they are important for proper plant development.
Identification, Expression, and Functional Analysis of the Fructokinase Gene Family in Cassava.
Yao, Yuan; Geng, Meng-Ting; Wu, Xiao-Hui; Sun, Chong; Wang, Yun-Lin; Chen, Xia; Shang, Lu; Lu, Xiao-Hua; Li, Zhan; Li, Rui-Mei; Fu, Shao-Ping; Duan, Rui-Jun; Liu, Jiao; Hu, Xin-Wen; Guo, Jian-Chun
2017-11-12
Fructokinase (FRK) proteins play important roles in catalyzing fructose phosphorylation and participate in the carbohydrate metabolism of storage organs in plants. To investigate the roles of FRKs in cassava tuber root development, seven FRK genes (MeFRK1-7) were identified, and MeFRK1-6 were isolated. Phylogenetic analysis revealed that the MeFRK family genes can be divided into α (MeFRK1, 2, 6, 7) and β (MeFRK3, 4, 5) groups. All the MeFRK proteins have typical conserved regions and substrate binding residues similar to those of the FRKs. The overall predicted three-dimensional structures of MeFRK1-6 were similar, folding into a catalytic domain and a β-sheet "lid" region, forming a substrate binding cleft, which contains many residues involved in the binding to fructose. The gene and the predicted three-dimensional structures of MeFRK3 and MeFRK4 were the most similar. MeFRK1-6 displayed different expression patterns across different tissues, including leaves, stems, tuber roots, flowers, and fruits. In tuber roots, the expressions of MeFRK3 and MeFRK4 were much higher compared to those of the other genes. Notably, the expression of MeFRK3 and MeFRK4 as well as the enzymatic activity of FRK were higher at the initial and early expanding tuber stages and were lower at the later expanding and mature tuber stages. The FRK activity of MeFRK3 and MeFRK4 was identified by the functional complementation of triple mutant yeast cells that were unable to phosphorylate either glucose or fructose. The gene expression and enzymatic activity of MeFRK3 and MeFRK4 suggest that they might be the main enzymes in fructose phosphorylation for regulating the formation of tuber roots and starch accumulation at the tuber root initial and expanding stages.
Das, Suresh Chandra; Ramamurthy, Thandavanaryanalu; Ghosh, Santanu; Pazhani, Gururaja Perumal; Sen, Tista; Singh, Raghubir
2017-01-01
Background & objectives: Shigatoxic Escherichia coli (STEC) recovered from dairy animals of Kolkata, India, harboured the putative virulence genes; however, the animals did not exhibit clinical symptoms. Similarly, human isolates in this locality also showed variations in degree of symptoms. Hence, this study was designed to determine the presence of recognized gene(s) in the locus of enterocyte effacement (LEE) pathogenicity island in these STEC isolates and the functional status of the cardinal gene (eae) related to pathogenicity. Methods: Genes were characterized using polymerase chain reaction (PCR) assays, and the functional status of the cardinal gene (eae) was evaluated by fluorescent actin staining (FAS) assay. Variation in the eae gene was determined by intimin PCR. Results: Cattle STEC isolates carried 22 genes in the LEE pathogenicity island at different frequencies ranging from 5.63 to 47.88 per cent of the isolates. In human isolates, the genes ler, escRSTU, orf2, escC, escV, orf3 and tir, which are associated with secretory function, were found to be absent, and the rest of the genes were present at lower frequency. Further, the cardinal gene (eae) responsible for initiation of pathogenesis was found at a very low frequency in human (n=2; 10.5%) and cattle (n=11; 15.5%) isolates. None of these eae+ STEC isolates from humans and cattle was positive in the FAS assay. Interpretation & conclusions: The majority of human STEC isolates lacked the cardinal virulence gene (eae) and the genes for secretory function that are essential for facilitating pathogenesis. This may partially be attributed to the low occurrence of STEC in human clinical diarrhoea in this area. Although a few isolates (11 of 71) from cattle had the eae gene, they did not express it phenotypically. This could be one of the reasons that clinical symptoms did not appear in the hosts. PMID:29205193
Zhang, Yufan; Maximova, Siela N; Guiltinan, Mark J
2015-01-01
In plants, the conversion of stearoyl-ACP to oleoyl-ACP is catalyzed by a plastid-localized soluble stearoyl-acyl carrier protein (ACP) desaturase (SAD). The activity of SAD significantly impacts the ratio of saturated and unsaturated fatty acids, and is thus a major determinant of fatty acid composition. The cacao genome contains eight putative SAD isoforms with high amino acid sequence similarities and functional domain conservation with SAD genes from other species. Sequence variation in known functional domains between different SAD family members suggested that these eight SAD isoforms might have distinct functions in plant development, a hypothesis supported by their diverse expression patterns in various cacao tissues. Notably, TcSAD1 is universally expressed across all the tissues, and its expression pattern in seeds is highly correlated with the dramatic change in fatty acid composition during seed maturation. Interestingly, TcSAD3 and TcSAD4 appear to be exclusively and highly expressed in flowers, the functions of which remain unknown. To test the function of TcSAD1 in vivo, transgenic complementation of the Arabidopsis ssi2 mutant was performed, demonstrating that TcSAD1 successfully rescued all AtSSI2-related phenotypes, further supporting the functional orthology between these two genes. The identification of the major SAD gene responsible for cocoa butter biosynthesis provides new strategies for screening for novel genotypes with desirable fatty acid compositions, and for use in breeding programs to help pyramid genes for quality and other traits such as disease resistance.
Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A
2009-01-01
Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386
Dominant resistance against plant viruses
de Ronde, Dryas; Butterbach, Patrick; Kormelink, Richard
2014-01-01
To establish a successful infection, plant viruses have to overcome a defense system composed of several layers. This review gives an overview of the various strategies plants employ to combat viral infections, with the main emphasis on the current status of single dominant resistance (R) genes identified against plant viruses and the corresponding avirulence (Avr) genes identified so far. The most common models to explain the mode of action of dominant R genes will be presented. Finally, the hypersensitive response (HR) and extreme resistance (ER), and the functional and structural similarity of R genes to sensors of innate immunity in mammalian cell systems, will be described in brief. PMID:25018765
Yang, Shuang; Zhang, Guoqing; Liu, Wan; Wang, Zhen; Zhang, Jifeng; Yang, Dongshan; Chen, Y Eugene; Sun, Hong; Li, Yixue
2017-05-20
Animal models are increasingly gaining value through cross-comparisons of response or resistance to clinical agents used for patients. However, many disease mechanisms and drug effects generated from animal models are not transferable to humans. To address these issues, we developed SysFinder (http://lifecenter.sgst.cn/SysFinder), a platform for scientists to find appropriate animal models for translational research. SysFinder offers a "topic-centered" approach for systematic comparisons of human genes, whose functions are involved in a specific scientific topic, to the corresponding homologous genes of animal models. A scientific topic can be a certain disease, drug, gene function or biological pathway. SysFinder calculates multi-level similarity indexes to evaluate the similarities between human and animal models in specified scientific topics. Meanwhile, SysFinder offers species-specific information to investigate the differences in molecular mechanisms between humans and animal models. Furthermore, SysFinder provides a user-friendly platform for determination of short guide RNAs (sgRNAs) and homology arms to design a new animal model. Case studies illustrate the ability of SysFinder to help experimental scientists. SysFinder is a useful platform for experimental scientists to carry out their research on human molecular mechanisms. Copyright © 2017 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Topology-function conservation in protein-protein interaction networks.
Davis, Darren; Yaveroğlu, Ömer Nebil; Malod-Dognin, Noël; Stojmirovic, Aleksandar; Pržulj, Nataša
2015-05-15
Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms. © The Author 2015. Published by Oxford University Press.
Unprecedented genomic diversity of AhR1 and AhR2 genes in Atlantic salmon (Salmo salar L.).
Hansson, Maria C; Wittzell, Håkan; Persson, Kerstin; von Schantz, Torbjörn
2004-06-24
Aryl hydrocarbon receptor (AhR) genes encode proteins involved in mediating the toxic responses induced by several environmental pollutants. Here, we describe the identification of the first two AhR1 (alpha and beta) genes and two additional AhR2 (alpha and beta) genes in the tetraploid species Atlantic salmon (Salmo salar L.) from a cosmid library screening. Cosmid clones containing genomic salmon AhR sequences were isolated using a cDNA clone containing the coding region of the Atlantic salmon AhR2gamma as a probe. Screening revealed 14 positive clones, from which four were chosen for further analyses. One of the cosmids contained genomic AhR sequences that were highly similar to the rainbow trout (Oncorhynchus mykiss) AhR2alpha and beta genes. SMART RACE amplified two complete, highly similar but not identical AhR type 2 sequences from salmon cDNA, which from phylogenetic analyses were determined as the rainbow trout AhR2alpha and beta orthologs. The salmon AhR2alpha and beta encode proteins of 1071 and 1058 residues, respectively, and encompass characteristic AhR sequence elements like a basic-helix-loop-helix (bHLH) and two PER-ARNT-SIM (PAS) domains. Both genes are transcribed in liver, spleen and muscle tissues of adult salmon. A second cosmid contained partial sequences, which were identical to the previously characterized AhR2gamma gene. The last two cosmids contained partial genomic AhR sequences, which were more similar to other AhR type 1 fish genes than the four characterized salmon AhR2 genes. However, attempts to amplify the corresponding complete cDNA sequences of the inserts proved very difficult, suggesting that these genes are non-functional or very weakly transcribed in the examined tissues. Phylogenetic analyses of the conserved regions did, however, clearly indicate that these two AhRs belong to the AhR type 1 clade and have been assigned as the Atlantic salmon AhR1alpha and AhR1beta genes. 
Taken together, these findings demonstrate that multiple AhR genes are present in the Atlantic salmon genome, which is likely a consequence of previous genome duplications in the evolutionary past of salmonids. Plausible explanations for the high incidence of AhR genes in fish and more specifically in salmonids, such as rapid divergence in specialized functions, are discussed.
NASA Technical Reports Server (NTRS)
Wang, Vincent Y.; Hassan, Bassem A.; Bellen, Hugo J.; Zoghbi, Huda Y.
2002-01-01
Many genes share sequence similarity between species, but their properties often change significantly during evolution. For example, the Drosophila genes engrailed and orthodenticle and the onychophoran gene Ultrabithorax only partially substitute for their mouse or Drosophila homologs. We have been analyzing the relationship between atonal (ato) in the fruit fly and its mouse homolog, Math1. In flies, ato acts as a proneural gene that governs the development of chordotonal organs (CHOs), which serve as stretch receptors in the body wall and joints and as auditory organs in the antennae. In the fly CNS, ato is important not for specification but for axonal arborization. Math1, in contrast, is required for the specification of cells in both the CNS and the PNS. Furthermore, Math1 serves a role in the development of secretory lineage cells in the gut, a function that does not parallel any known to be served by ato. We wondered whether ato and Math1 might be more functionally homologous than they appear, so we expressed Math1 in ato mutant flies and ato in Math1 null mice. To our surprise, the two proteins are functionally interchangeable.
Zhao, M; Wang, T; Adamson, K J; Storey, K B; Cummins, S F
2016-02-08
The land snail Theba pisana is native to the Mediterranean region but has become one of the most abundant invasive species worldwide. Here, we present three transcriptomes of this agricultural pest derived from three tissues: the central nervous system, hepatopancreas (digestive gland), and foot muscle. Sequencing of the three tissues produced 339,479,092 high quality reads, and a global de novo assembly generated a total of 250,848 unique transcripts (unigenes). BLAST analysis mapped 52,590 unigenes to NCBI non-redundant protein databases, and further functional analysis annotated 21,849 unigenes with gene ontology. We report that T. pisana transcripts have representatives in all functional classes, and a comparison of differentially expressed transcripts amongst all three tissues demonstrates enormous differences in their potential metabolic activities. The differentially expressed genes include those with sequence similarity to genes associated with multiple bacterial diseases and neurological diseases. To provide a valuable resource that will assist functional genomics studies, we have implemented a user-friendly web interface, ThebaDB (http://thebadb.bioinfo-minzhao.org/). This online database allows for complex text queries, sequence searches, and data browsing by enriched functional terms and KEGG mapping.
PAX6 maintains β cell identity by repressing genes of alternative islet cell types.
Swisa, Avital; Avrahami, Dana; Eden, Noa; Zhang, Jia; Feleke, Eseye; Dahan, Tehila; Cohen-Tayar, Yamit; Stolovich-Rain, Miri; Kaestner, Klaus H; Glaser, Benjamin; Ashery-Padan, Ruth; Dor, Yuval
2017-01-03
Type 2 diabetes is thought to involve a compromised β cell differentiation state, but the mechanisms underlying this dysfunction remain unclear. Here, we report a key role for the TF PAX6 in the maintenance of adult β cell identity and function. PAX6 was downregulated in β cells of diabetic db/db mice and in WT mice treated with an insulin receptor antagonist, revealing metabolic control of expression. Deletion of Pax6 in β cells of adult mice led to lethal hyperglycemia and ketosis that were attributed to loss of β cell function and expansion of α cells. Lineage-tracing, transcriptome, and chromatin analyses showed that PAX6 is a direct activator of β cell genes, thus maintaining mature β cell function and identity. In parallel, we found that PAX6 binds promoters and enhancers to repress alternative islet cell genes including ghrelin, glucagon, and somatostatin. Chromatin analysis and shRNA-mediated gene suppression experiments indicated a similar function of PAX6 in human β cells. We conclude that reduced expression of PAX6 in metabolically stressed β cells may contribute to β cell failure and α cell dysfunction in diabetes.
PAX6 maintains β cell identity by repressing genes of alternative islet cell types
Swisa, Avital; Avrahami, Dana; Eden, Noa; Zhang, Jia; Feleke, Eseye; Dahan, Tehila; Cohen-Tayar, Yamit; Stolovich-Rain, Miri; Kaestner, Klaus H.; Glaser, Benjamin; Ashery-Padan, Ruth
2016-01-01
Type 2 diabetes is thought to involve a compromised β cell differentiation state, but the mechanisms underlying this dysfunction remain unclear. Here, we report a key role for the TF PAX6 in the maintenance of adult β cell identity and function. PAX6 was downregulated in β cells of diabetic db/db mice and in WT mice treated with an insulin receptor antagonist, revealing metabolic control of expression. Deletion of Pax6 in β cells of adult mice led to lethal hyperglycemia and ketosis that were attributed to loss of β cell function and expansion of α cells. Lineage-tracing, transcriptome, and chromatin analyses showed that PAX6 is a direct activator of β cell genes, thus maintaining mature β cell function and identity. In parallel, we found that PAX6 binds promoters and enhancers to repress alternative islet cell genes including ghrelin, glucagon, and somatostatin. Chromatin analysis and shRNA-mediated gene suppression experiments indicated a similar function of PAX6 in human β cells. We conclude that reduced expression of PAX6 in metabolically stressed β cells may contribute to β cell failure and α cell dysfunction in diabetes. PMID:27941241
Zhao, Yang; Zhou, Yuqiong; Jiang, Haiyang; Li, Xiaoyu; Gan, Defang; Peng, Xiaojian; Zhu, Suwen; Cheng, Beijiu
2011-01-01
Background: Members of the homeodomain-leucine zipper (HD-Zip) gene family encode transcription factors that are unique to plants and have diverse functions in plant growth and development such as various stress responses, organ formation and vascular development. Although systematic characterization of this family has been carried out in Arabidopsis and rice, little is known about HD-Zip genes in maize (Zea mays L.). Methods and Findings: In this study, we described the identification and structural characterization of HD-Zip genes in the maize genome. A complete set of 55 HD-Zip genes (Zmhdz1-55) was identified in the maize genome using BLAST search tools and categorized into four classes (HD-Zip I-IV) based on phylogeny. Chromosomal location of these genes revealed that they are distributed unevenly across all 10 chromosomes. Segmental duplication contributed largely to the expansion of the maize HD-Zip gene family, while tandem duplication was only responsible for the amplification of the HD-Zip II genes. Furthermore, most of the maize HD-Zip I genes were found to contain an overabundance of stress-related cis-elements in their promoter sequences. The expression levels of the 17 HD-Zip I genes under drought stress were also investigated by quantitative real-time PCR (qRT-PCR). All of the 17 maize HD-Zip I genes were found to be regulated by drought stress, and the duplicated genes within a sister pair exhibited similar expression patterns, suggesting their conserved functions during the process of evolution. Conclusions: Our results reveal a comprehensive overview of the maize HD-Zip gene family and provide the first step towards the selection of Zmhdz genes for cloning and functional research to uncover their roles in maize growth and development. PMID:22164299
Emergence of the self-similar property in gene expression dynamics
NASA Astrophysics Data System (ADS)
Ochiai, T.; Nacher, J. C.; Akutsu, T.
2007-08-01
Many theoretical models have recently been proposed to understand the structure of cellular systems composed of various types of elements (e.g., proteins, metabolites and genes) and their interactions. However, the cell is a highly dynamic system with thousands of functional elements fluctuating across temporal states. Therefore, structural analysis alone is not sufficient to reproduce the cell's observed behavior. In this article, we analyze the gene expression dynamics (i.e., how the number of mRNA molecules in a cell fluctuates in time) by using a new constructive approach, which reveals a symmetry embedded in gene expression fluctuations and characterizes the dynamical equation of gene expression (i.e., a specific stochastic differential equation). First, by using experimental data of human and yeast gene expression time series, we found a symmetry in the short-time transition probability from time t to time t+1. We call it self-similarity symmetry (i.e., the gene expression short-time fluctuations contain a repeating pattern of smaller and smaller parts that are like the whole, but different in size). Second, we reconstruct the global behavior of the observed distribution of gene expression (i.e., the scaling law) and the local behavior of the power-law tail of this distribution. This approach may represent a step forward toward an integrated image of the basic elements of the whole cell.
Holliday, Jason A; Ralph, Steven G; White, Richard; Bohlmann, Jörg; Aitken, Sally N
2008-01-01
Cold acclimation in conifers is a complex process, the timing and extent of which reflects local adaptation and varies widely along latitudinal gradients for many temperate and boreal tree species. Despite their ecological and economic importance, little is known about the global changes in gene expression that accompany autumn cold acclimation in conifers. Using three populations of Sitka spruce (Picea sitchensis) spanning the species range, and a Picea cDNA microarray with 21,840 unique elements, within- and among-population gene expression was monitored during the autumn. Microarray data were validated for selected genes using real-time PCR. Similar numbers of genes were significantly twofold upregulated (1257) and downregulated (967) between late summer and early winter. Among those upregulated were dehydrins, pathogenesis-related/antifreeze genes, carbohydrate and lipid metabolism genes, and genes involved in signal transduction and transcriptional regulation. Among-population microarray hybridizations at early and late autumn time points revealed substantial variation in the autumn transcriptome, some of which may reflect local adaptation. These results demonstrate the complexity of cold acclimation in conifers, highlight similarities and differences to cold tolerance in annual plants, and provide a solid foundation for functional and genetic studies of this important adaptive process.
Non-random mate choice in humans: insights from a genome scan.
Laurent, R; Toupance, B; Chaix, R
2012-02-01
Little is known about the genetic factors influencing mate choice in humans. Still, there is evidence for non-random mate choice with respect to physical traits. In addition, some studies suggest that the Major Histocompatibility Complex may affect pair formation. Nowadays, the availability of high density genomic data sets gives the opportunity to scan the genome for signatures of non-random mate choice without prior assumptions on which genes may be involved, while taking into account socio-demographic factors. Here, we performed a genome scan to detect extreme patterns of similarity or dissimilarity among spouses throughout the genome in three populations of African, European American, and Mexican origins from the HapMap 3 database. Our analyses identified genes and biological functions that may affect pair formation in humans, including genes involved in skin appearance, morphogenesis, immunity and behaviour. We found little overlap between the three populations, suggesting that the biological functions potentially influencing mate choice are population specific, in other words are culturally driven. Moreover, whenever the same functional category of genes showed a significant signal in two populations, different genes were actually involved, which suggests the possibility of evolutionary convergences. © 2011 Blackwell Publishing Ltd.
Barik, Suvakanta; SarkarDas, Shabari; Singh, Archita; Gautam, Vibhav; Kumar, Pramod; Majee, Manoj; Sarkar, Ananda K
2014-01-01
Similar to the majority of the microRNAs, mature miR166s are derived from multiple members of MIR166 genes (precursors) and regulate various aspects of plant development by negatively regulating their target genes (Class III HD-ZIP). The evolutionary conservation or functional diversification of miRNA166 family members remains elusive. Here, we show the phylogenetic relationships among MIR166 precursor and mature sequences from three diverse model plant species. Despite strong conservation, some mature miR166 sequences, such as ppt-miR166m, have undergone sequence variation. Critical sequence variation in ppt-miR166m has led to functional diversification, as it targets non-HD-ZIPIII gene transcript(s). MIR166 precursor sequences have diverged in a lineage-specific manner, and both precursors and mature osa-miR166i/j are highly conserved. Interestingly, polycistronic MIR166s were present in Physcomitrella and Oryza but not in Arabidopsis. The nature of cis-regulatory motifs on the upstream promoter sequences of MIR166 genes indicates their possible contribution to the functional variation observed among miR166 species. Copyright © 2013 Elsevier Inc. All rights reserved.
González-Pedrajo, Bertha; de la Mora, Javier; Ballado, Teresa; Camarena, Laura; Dreyfus, Georges
2002-11-13
In this work, we show evidence regarding the functionality of a large cluster of flagellar genes in Rhodobacter sphaeroides. The genes of this cluster, flgGHIJKL and orf-1, are mainly involved in the formation of the basal body, and flgK and flgL encode the hook-associated proteins HAP1 and HAP3. In general, these genes showed good similarity to those reported for Salmonella enterica. However, flgJ and flgK showed particular features that make them unique among the flagellar sequences already reported. flgJ is only a third of the size reported for flgJ from Salmonella, whereas flgK is about three times larger than any other flgK sequence previously known. Our results indicate that both genes are functional, and their products are essential for flagellar assembly. In contrast, the interruption of orf-1 did not affect motility, suggesting that this sequence, if functional, is not indispensable for flagellar assembly. Finally, we present genetic evidence suggesting that the flgGHIJKL genes are expressed as a single transcriptional unit depending on the sigma-54 factor.
Functional metagenomics reveals novel β-galactosidases not predictable from gene sequences.
Cheng, Jiujun; Romantsov, Tatyana; Engel, Katja; Doxey, Andrew C; Rose, David R; Neufeld, Josh D; Charles, Trevor C
2017-01-01
The techniques of metagenomics have allowed researchers to access the genomic potential of uncultivated microbes, but there remain significant barriers to determination of gene function based on DNA sequence alone. Functional metagenomics, in which DNA is cloned and expressed in surrogate hosts, can overcome these barriers, and make important contributions to the discovery of novel enzymes. In this study, a soil metagenomic library carried in an IncP cosmid was used for functional complementation for β-galactosidase activity in both Sinorhizobium meliloti (α-Proteobacteria) and Escherichia coli (γ-Proteobacteria) backgrounds. One β-galactosidase, encoded by six overlapping clones that were selected in both hosts, was identified as a member of glycoside hydrolase family 2. We could not identify ORFs obviously encoding possible β-galactosidases in 19 other sequenced clones that were only able to complement S. meliloti. Based on low sequence identity to other known glycoside hydrolases, yet not β-galactosidases, three of these ORFs were examined further. Biochemical analysis confirmed that all three encoded β-galactosidase activity. Lac36W_ORF11 and Lac161_ORF7 had conserved domains, but lacked similarities to known glycoside hydrolases. Lac161_ORF10 had neither conserved domains nor similarity to known glycoside hydrolases. Bioinformatic and structural modeling implied that Lac161_ORF10 protein represented a novel enzyme family with a five-bladed propeller glycoside hydrolase domain. By discovering founding members of three novel β-galactosidase families, we have reinforced the value of functional metagenomics for isolating novel genes that could not have been predicted from DNA sequence analysis alone.
Mueller, Kristina M.; Themanns, Madeleine; Friedbichler, Katrin; Kornfeld, Jan-Wilhelm; Esterbauer, Harald; Tuckermann, Jan P.; Moriggl, Richard
2012-01-01
Growth hormone (GH) and glucocorticoids (GCs) are involved in the control of processes that are essential for the maintenance of vital body functions including energy supply and growth control. GH and GCs have been well characterized to regulate systemic energy homeostasis, particularly during certain conditions of physical stress. However, dysfunctional signaling in both pathways is linked to various metabolic disorders associated with aberrant carbohydrate and lipid metabolism. In liver, GH-dependent activation of the transcription factor signal transducer and activator of transcription (STAT) 5 controls a variety of physiologic functions within hepatocytes. Similarly, GCs, through activation of the glucocorticoid receptor (GR), influence many important liver functions such as gluconeogenesis. Studies in hepatic Stat5 or GR knockout mice have revealed that they similarly control liver function on their target gene level and indeed, the GR often functions as a cofactor of STAT5 for GH-induced genes. Gene sets, which require physical STAT5–GR interaction, include those controlling body growth and maturation. More recently, it has become evident that impairment of GH-STAT5 signaling in different experimental models correlates with metabolic liver disease, ranging from hepatic steatosis to hepatocellular carcinoma (HCC). While GH-activated STAT5 has a protective role in chronic liver disease, experimental disruption of GC-GR signaling rather seems to ameliorate metabolic disorders under metabolic challenge. In this review, we focus on the current knowledge about hepatic GH-STAT5 and GC-GR signaling in body growth, metabolism, and protection from fatty liver disease and HCC development. PMID:22564914
Cortés-Romero, Celso; Martínez-Hernández, Aída; Mellado-Mojica, Erika; López, Mercedes G.; Simpson, June
2012-01-01
Fructans are the main storage polysaccharides found in Agave species. The synthesis of these complex carbohydrates relies on the activities of specific fructosyltransferase enzymes closely related to the hydrolytic invertases. Analysis of Agave tequilana transcriptome data led to the identification of ESTs encoding putative fructosyltransferases and invertases. Based on sequence alignments and structure/function relationships, two different genes were predicted to encode 1-SST and 6G-FFT type fructosyltransferases; in addition, 4 genes encoding putative cell wall invertases and 4 genes encoding putative vacuolar invertases were identified. Probable functions for each gene were assigned based on conserved amino acid sequences and confirmed for 2 fructosyltransferases and one invertase by analyzing the enzymatic activity of recombinant Agave proteins expressed and purified from Pichia pastoris. The genome organization of the fructosyltransferase/invertase genes, for which the corresponding cDNA contained the complete open reading frame, was found to be well conserved, since all genes were shown to carry a 9 bp mini-exon and all showed a similar structure of 8 exons/7 introns, with the exception of a cell wall invertase gene which has 7 exons and 6 introns. Fructosyltransferase genes were strongly expressed in the storage organs of the plants, especially in vegetative stages of development, and at lower levels in photosynthetic tissues, in contrast to the invertase genes, where higher levels of expression were observed in leaf tissues and in mature plants. PMID:22558253