Science.gov

Sample records for duplicated gene family

  1. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    NASA Astrophysics Data System (ADS)

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-09-01

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications.
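
    The kind of model this abstract describes can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes, for simplicity, that each new duplicate either stays in its parent's family or has diverged far enough to found a new family, with a fixed probability `p_diverge`. The function name and all parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_family_sizes(n_precursors=50, target_genes=2000,
                          p_diverge=0.1, seed=0):
    """Toy duplication-and-divergence model (illustrative assumptions only).

    Returns a Counter mapping family size -> number of families of that size.
    """
    rng = random.Random(seed)
    # family[i] is the family label of gene i; every precursor gene
    # starts out as the sole member of its own family.
    family = list(range(n_precursors))
    next_label = n_precursors
    while len(family) < target_genes:
        parent = rng.randrange(len(family))  # duplicate a random gene
        if rng.random() < p_diverge:
            # Assumption: the copy has mutated beyond detectable homology,
            # so it is counted as the founder of a new family.
            family.append(next_label)
            next_label += 1
        else:
            family.append(family[parent])
    members_per_family = Counter(family)
    return Counter(members_per_family.values())


if __name__ == "__main__":
    distribution = simulate_family_sizes()
    for size in sorted(distribution)[:10]:
        print(size, distribution[size])
```

    Because larger families are more likely to contain the gene picked for duplication, the simulated distribution is skewed toward many small families and a few large ones, which is the qualitative pattern a duplication-driven model is meant to reproduce.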

  2. Genome-wide analysis of homeobox gene family in legumes: identification, gene duplication and expression profiling.

    PubMed

    Bhattacharjee, Annapurna; Ghangal, Rajesh; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Homeobox genes encode transcription factors that are known to play a major role in different aspects of plant growth and development. In the present study, we identified homeobox genes belonging to 14 different classes in five legume species, including chickpea, soybean, Medicago, Lotus and pigeonpea. The characteristic differences within homeodomain sequences among various classes of the homeobox gene family were quite evident. Genome-wide expression analysis using publicly available datasets (RNA-seq and microarray) indicated that homeobox genes are differentially expressed in various tissues/developmental stages and under stress conditions in different legumes. We validated the differential expression of selected chickpea homeobox genes via quantitative reverse transcription polymerase chain reaction. Genome duplication analysis in soybean indicated that segmental duplication has significantly contributed to the expansion of the homeobox gene family. The Ka/Ks ratio of duplicated homeobox genes in soybean showed that several members of this family have undergone purifying selection. Moreover, expression profiling indicated that duplicated genes might have been retained due to sub-functionalization. The genome-wide identification and comprehensive gene expression profiling of homeobox gene family members in legumes will provide opportunities for functional analysis to unravel their exact role in plant growth and development.
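
    The Ka/Ks reasoning mentioned in this abstract follows a standard convention: a ratio well below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. A minimal sketch of that interpretation step is shown here, assuming Ka and Ks have already been estimated elsewhere. The function name and the `tolerance` band around 1 are hypothetical choices, not values from the study.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Interpret a Ka/Ks ratio for one duplicated gene pair.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    tolerance: hypothetical half-width of the band treated as "neutral"
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        label = "purifying"
    elif ratio > 1 + tolerance:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


if __name__ == "__main__":
    # Made-up example values, not data from the abstract.
    print(classify_selection(0.02, 0.10))
```

    In practice the ratio is usually accompanied by a statistical test, so a hard threshold like this is only a first-pass filter.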

  3. Genome-Wide Analysis of Homeobox Gene Family in Legumes: Identification, Gene Duplication and Expression Profiling

    PubMed Central

    Garg, Rohini; Jain, Mukesh

    2015-01-01

    Homeobox genes encode transcription factors that are known to play a major role in different aspects of plant growth and development. In the present study, we identified homeobox genes belonging to 14 different classes in five legume species, including chickpea, soybean, Medicago, Lotus and pigeonpea. The characteristic differences within homeodomain sequences among various classes of the homeobox gene family were quite evident. Genome-wide expression analysis using publicly available datasets (RNA-seq and microarray) indicated that homeobox genes are differentially expressed in various tissues/developmental stages and under stress conditions in different legumes. We validated the differential expression of selected chickpea homeobox genes via quantitative reverse transcription polymerase chain reaction. Genome duplication analysis in soybean indicated that segmental duplication has significantly contributed to the expansion of the homeobox gene family. The Ka/Ks ratio of duplicated homeobox genes in soybean showed that several members of this family have undergone purifying selection. Moreover, expression profiling indicated that duplicated genes might have been retained due to sub-functionalization. The genome-wide identification and comprehensive gene expression profiling of homeobox gene family members in legumes will provide opportunities for functional analysis to unravel their exact role in plant growth and development. PMID:25745864

  4. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    SciTech Connect

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-09-18

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society.

  5. Duplication of OsHAP family genes and their association with heading date in rice.

    PubMed

    Li, Qiuping; Yan, Wenhao; Chen, Huaxia; Tan, Cong; Han, Zhongmin; Yao, Wen; Li, Guangwei; Yuan, Mengqi; Xing, Yongzhong

    2016-03-01

    Heterotrimeric Heme Activator Protein (HAP) family genes are involved in the regulation of flowering in plants. It is not clear how many HAP genes regulate heading date in rice. In this study, we identified 35 HAP genes, including seven newly identified genes, and performed gene duplication and candidate gene-based association analyses. Analyses showed that segmental duplication and tandem duplication are the main mechanisms of HAP gene duplication. Expression profiling and functional identification indicated that duplication probably diversifies the functions of HAP genes. A nucleotide diversity analysis revealed that 13 HAP genes underwent selection. A candidate gene-based association analysis detected four HAP genes related to heading date. An investigation of transgenic plants or mutants of 23 HAP genes confirmed that overexpression of at least four genes delayed heading date under long-day conditions, including the previously cloned Ghd8/OsHAP3H. Our results indicate that the large number of HAP genes in rice was mainly produced by gene duplication, and a few HAP genes function to regulate heading date. Selection of HAP genes is probably caused by their diverse functions rather than regulation of heading.

  6. Extensive and continuous duplication facilitates rapid evolution and diversification of gene families.

    PubMed

    Chang, Dan; Duda, Thomas F

    2012-08-01

    The origin of novel gene functions through gene duplication, mutation, and natural selection represents one of the mechanisms by which organisms diversify and one of the possible paths leading to adaptation. Nonetheless, the extent, role, and consequences of duplications in the origins of ecological adaptations, especially in the context of species interactions, remain unclear. To explore the evolution of a gene family that is likely linked to species associations, we investigated the evolutionary history of the A-superfamily of conotoxin genes of predatory marine cone snails (Conus species). Members of this gene family are expressed in the venoms of Conus species and are presumably involved in predator-prey associations because of their utility in prey capture. We recovered sequences of this gene family from genomic DNA of four closely related species of Conus and reconstructed the evolutionary history of these genes. Our study is the first to directly recover conotoxin genes from Conus genomes to investigate the evolution of conotoxin gene families. Our results revealed a phenomenon of rapid and continuous gene turnover that is coupled with heightened rates of evolution. This continuous duplication pattern has not been observed previously, and the rate of gene turnover is at least two times higher than estimates from other multigene families. Conotoxin genes are among the most rapidly evolving protein-coding genes in metazoans, a phenomenon that may be facilitated by extensive gene duplications and have driven changes in conotoxin functions through neofunctionalization. Together these mechanisms led to dramatically divergent arrangements of A-superfamily conotoxin genes among closely related species of Conus. Our findings suggest that extensive and continuous gene duplication facilitates rapid evolution and drastic divergence in venom compositions among species, processes that may be associated with evolutionary responses to predator-prey interactions.

  7. Duplication, divergence and persistence in the Phytochrome photoreceptor gene family of cottons (Gossypium spp.)

    PubMed Central

    2010-01-01

    Background Phytochromes are a family of red/far-red photoreceptors that regulate a number of important developmental traits in cotton (Gossypium spp.), including plant architecture, fiber development, and photoperiodic flowering. Little is known about the composition and evolution of the phytochrome gene family in diploid (G. herbaceum, G. raimondii) or allotetraploid (G. hirsutum, G. barbadense) cotton species. The objective of this study was to obtain a preliminary inventory and molecular-evolutionary characterization of the phytochrome gene family in cotton. Results We used comparative sequence resources to design low-degeneracy PCR primers that amplify genomic sequence tags (GSTs) for members of the PHYA, PHYB/D, PHYC and PHYE gene sub-families from A- and D-genome diploid and AD-genome allotetraploid Gossypium species. We identified two paralogous PHYA genes (designated PHYA1 and PHYA2) in diploid cottons, the result of a Malvaceae-specific PHYA gene duplication that occurred approximately 14 million years ago (MYA), before the divergence of the A- and D-genome ancestors. We identified a single gene copy of PHYB, PHYC, and PHYE in diploid cottons. The allotetraploid genomes have largely retained the complete gene complements inherited from both of the diploid genome ancestors, with at least four PHYA genes and two genes encoding PHYB, PHYC and PHYE in the AD-genomes. We did not identify a PHYD gene in any cotton genomes examined. Conclusions Detailed sequence analysis suggests that phytochrome genes retained after duplication by segmental duplication and allopolyploidy appear to be evolving independently under a birth-and-death-process with strong purifying selection. Our study provides a preliminary phytochrome gene inventory that is necessary and sufficient for further characterization of the biological functions of each of the cotton phytochrome genes, and for the development of 'candidate gene' markers that are potentially useful for cotton improvement via

  8. Gene duplications and losses within the cyclooxygenase family of teleosts and other chordates.

    PubMed

    Havird, Justin C; Miyamoto, Michael M; Choe, Keith P; Evans, David H

    2008-11-01

    Cyclooxygenase (COX) produces prostaglandins in animals via the oxidation and reduction of arachidonic acid. Different types and numbers of COX genes have been found in corals, sea squirts, fishes, and tetrapods, but no study has used a comparative phylogenetic approach to investigate the evolutionary history of this complex gene family. Therefore, to examine COX evolution in the teleosts and chordates, 9 novel COX sequences (possessing residues and domains critical to COX function) were acquired from the euryhaline killifish, longhorn sculpin, sea lamprey, Atlantic hagfish, and amphioxus using standard polymerase chain reaction (PCR) and cloning methods. Phylogenetic analyses of these and other COX sequences show a complicated history of COX duplications and losses. There are three main lineages of COX in the chordates corresponding to the three subphyla in the phylum Chordata, with each lineage representing an independent COX duplication. Hagfish and lamprey most likely have traditional COX-1/2 genes, suggesting that these genes originated with the first round of genome duplication in the vertebrates according to the 2R hypothesis and are not exclusively present in the gnathostomes. All teleosts examined have three COX genes due to a teleost-specific genome duplication followed by variable loss of a COX-1 (in the zebrafish and rainbow trout) or COX-2 gene (in the derived teleosts). Future studies should examine the functional ramifications of these differential gene losses.

  9. Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants.

    PubMed

    De Smet, Riet; Adams, Keith L; Vandepoele, Klaas; Van Montagu, Marc C E; Maere, Steven; Van de Peer, Yves

    2013-02-19

    The importance of gene gain through duplication has long been appreciated. In contrast, the importance of gene loss has only recently attracted attention. Indeed, studies in organisms ranging from plants to worms and humans suggest that duplication of some genes might be better tolerated than that of others. Here we have undertaken a large-scale study to investigate the existence of duplication-resistant genes in the sequenced genomes of 20 flowering plants. We demonstrate that there is a large set of genes that is convergently restored to single-copy status following multiple genome-wide and smaller scale duplication events. We rule out the possibility that such a pattern could be explained by random gene loss only and therefore propose that there is selection pressure to preserve such genes as singletons. This is further substantiated by the observation that angiosperm single-copy genes do not comprise a random fraction of the genome, but instead are often involved in essential housekeeping functions that are highly conserved across all eukaryotes. Furthermore, single-copy genes are generally expressed more highly and in more tissues than non-single-copy genes, and they exhibit higher sequence conservation. Finally, we propose different hypotheses to explain their resistance against duplication.

  10. Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants

    PubMed Central

    De Smet, Riet; Adams, Keith L.; Vandepoele, Klaas; Van Montagu, Marc C. E.; Maere, Steven; Van de Peer, Yves

    2013-01-01

    The importance of gene gain through duplication has long been appreciated. In contrast, the importance of gene loss has only recently attracted attention. Indeed, studies in organisms ranging from plants to worms and humans suggest that duplication of some genes might be better tolerated than that of others. Here we have undertaken a large-scale study to investigate the existence of duplication-resistant genes in the sequenced genomes of 20 flowering plants. We demonstrate that there is a large set of genes that is convergently restored to single-copy status following multiple genome-wide and smaller scale duplication events. We rule out the possibility that such a pattern could be explained by random gene loss only and therefore propose that there is selection pressure to preserve such genes as singletons. This is further substantiated by the observation that angiosperm single-copy genes do not comprise a random fraction of the genome, but instead are often involved in essential housekeeping functions that are highly conserved across all eukaryotes. Furthermore, single-copy genes are generally expressed more highly and in more tissues than non–single-copy genes, and they exhibit higher sequence conservation. Finally, we propose different hypotheses to explain their resistance against duplication. PMID:23382190

  11. Duplications and losses in gene families of rust pathogens highlight putative effectors

    PubMed Central

    Pendleton, Amanda L.; Smith, Katherine E.; Feau, Nicolas; Martin, Francis M.; Grigoriev, Igor V.; Hamelin, Richard; Nelson, C. Dana; Burleigh, J. Gordon; Davis, John M.

    2014-01-01

    Rust fungi are a group of fungal pathogens that cause some of the world's most destructive diseases of trees and crops. A shared characteristic among rust fungi is obligate biotrophy, the inability to complete a lifecycle without a host. This dependence on a host species likely affects patterns of gene expansion, contraction, and innovation within rust pathogen genomes. The establishment of disease by biotrophic pathogens is reliant upon effector proteins that are encoded in the fungal genome and secreted from the pathogen into the host's cell apoplast or within the cells. This study uses a comparative genomic approach to elucidate putative effectors and determine their evolutionary histories. We used OrthoMCL to identify nearly 20,000 gene families in proteomes of 16 diverse fungal species, which include 15 basidiomycetes and one ascomycete. We inferred patterns of duplication and loss for each gene family and identified families with distinctive patterns of expansion/contraction associated with the evolution of rust fungal genomes. To recognize potential contributors for the unique features of rust pathogens, we identified families harboring secreted proteins that: (i) arose or expanded in rust pathogens relative to other fungi, or (ii) contracted or were lost in rust fungal genomes. While the origin of rust fungi appears to be associated with considerable gene loss, there are many gene duplications associated with each sampled rust fungal genome. We also highlight two putative effector gene families that have expanded in Cronartium quercuum f. sp. fusiforme (Cqf) and that we hypothesize have roles in pathogenicity. PMID:25018762

  12. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution.

    PubMed

    Gu, Xun; Wang, Yufeng; Gu, Jianying

    2002-06-01

    The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.

  13. Expansion and diversification of the SET domain gene family following whole-genome duplications in Populus trichocarpa.

    PubMed

    Lei, Li; Zhou, Shi-Liang; Ma, Hong; Zhang, Liang-Sheng

    2012-04-12

    Histone lysine methylation modifies chromatin structure and regulates eukaryotic gene transcription and a variety of developmental and physiological processes. SET domain proteins are lysine methyltransferases containing the evolutionarily-conserved SET domain, which is known to be the catalytic domain. We identified 59 SET genes in the Populus genome. Phylogenetic analyses of 106 SET genes from Populus and Arabidopsis supported the clustering of SET genes into six distinct subfamilies and identified 19 duplicated gene pairs in Populus. The chromosome locations of these gene pairs and the distribution of synonymous substitution rates showed that the expansion of the SET gene family might be caused by large-scale duplications in Populus. Comparison of gene structures and domain architectures of each duplicate pair indicated that divergence took place at the 3'- and 5'-terminal transcribed regions and at the N- and C-termini of the predicted proteins, respectively. Expression profile analysis of Populus SET genes suggested that most Populus SET genes were expressed widely, many with the highest expression in young leaves. In particular, the expression profiles of 12 of the 19 duplicated gene pairs fell into two types of expression patterns. The 19 duplicated SET genes could have originated from whole genome duplication events. The differences in SET gene structure, domain architecture, and expression profiles in various tissues of Populus suggest that members of the SET gene family have a variety of developmental and physiological functions. Our study provides clues about the evolution of epigenetic regulation of chromatin structure and gene expression.
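
    Synonymous substitution rates (Ks), as used in this abstract, are commonly converted into approximate duplication ages with the relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below shows that arithmetic only. The function name is hypothetical, and the rate in the example is a placeholder, not a value taken from the study or a calibrated rate for Populus.

```python
def duplication_age_mya(ks, rate_per_site_per_year):
    """Estimate the age of a duplication from Ks, in millions of years.

    Uses T = Ks / (2 * r): both copies accumulate synonymous substitutions
    independently after the duplication, hence the factor of 2.
    """
    if ks < 0:
        raise ValueError("Ks cannot be negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the calculation.
    print(duplication_age_mya(ks=0.2, rate_per_site_per_year=1e-8))
```

    Gene pairs whose Ks values cluster around a common peak are usually read as products of the same large-scale duplication event, which is why the distribution of Ks, rather than any single value, carries the evidence.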

  14. Importance of gene duplication in the evolution of genomic imprinting revealed by molecular evolutionary analysis of the type I MADS-box gene family in Arabidopsis species.

    PubMed

    Yoshida, Takanori; Kawabe, Akira

    2013-01-01

    The pattern of molecular evolution of imprinted genes is controversial and the entire picture is still to be unveiled. Recently, a relationship between the formation of imprinted genes and gene duplication was reported in a genome-wide survey of imprinted genes in Arabidopsis thaliana. Because gene duplications influence the molecular evolution of the duplicated gene family, it is necessary to investigate both the pattern of molecular evolution and the possible relationship between gene duplication and genomic imprinting for a better understanding of evolutionary aspects of imprinted genes. In this study, we investigated the evolutionary changes of type I MADS-box genes that include imprinted genes by using relatives of Arabidopsis thaliana (two subspecies of A. lyrata and three subspecies of A. halleri). A duplicated gene family enables us to compare DNA sequences between imprinted genes and their homologs. We found an increased number of gene duplications within species in clades containing the imprinted genes, further supporting the hypothesis that local gene duplication is one of the driving forces for the formation of imprinted genes. Moreover, data obtained by phylogenetic analysis suggested "rapid evolution" of not only imprinted genes but also their closely related orthologous genes, which implies the effect of gene duplication on molecular evolution of imprinted genes.

  15. bMERB domains are bivalent Rab8 family effectors evolved by gene duplication.

    PubMed

    Rai, Amrita; Oprisko, Anastasia; Campos, Jeremy; Fu, Yangxue; Friese, Timon; Itzen, Aymelt; Goody, Roger S; Gazdag, Emerich Mihai; Müller, Matthias P

    2016-08-23

    In their active GTP-bound form, Rab proteins interact with proteins termed effector molecules. In this study, we have thoroughly characterized a Rab effector domain that is present in proteins of the Mical and EHBP families, both known to act in endosomal trafficking. Within our study, we show that these effectors display a preference for Rab8 family proteins (Rab8, 10, 13 and 15) and that some of the effector domains can bind two Rab proteins via separate binding sites. Structural analysis allowed us to explain the specificity towards Rab8 family members and the presence of two similar Rab binding sites that must have evolved via gene duplication. This study is the first to thoroughly characterize a Rab effector protein that contains two separate Rab binding sites within a single domain, allowing Micals and EHBPs to bind two Rabs simultaneously, thus suggesting previously unknown functions of these effector molecules in endosomal trafficking.

  16. Evolution of the KCS gene family in plants: the history of gene duplication, sub/neofunctionalization and redundancy.

    PubMed

    Guo, Hai-Song; Zhang, Yan-Mei; Sun, Xiao-Qin; Li, Mi-Mi; Hang, Yue-Yu; Xue, Jia-Yu

    2016-04-01

    Very long-chain fatty acids (VLCFAs) play an important role in the survival and development of plants, and VLCFA synthesis is regulated by β-ketoacyl-CoA synthases (KCSs), which catalyze the condensation of an acyl-CoA with malonyl-CoA. Here, we present a genome-wide survey of the genes encoding these enzymes, KCS genes, in 28 species (26 genomes and two transcriptomes), which represents a large phylogenetic scale, and also reconstruct the evolutionary history of this gene family. KCS genes were initially single-copy genes in the green plant lineage; duplication resulted in five ancestral copies in land plants, forming five fundamental monophyletic groups in the phylogenetic tree. Subsequently, KCS genes duplicated to generate 11 genes of angiosperm origin, expanding up to 20-30 members in further-diverged angiosperm species. During this process, tandem duplications had only a small contribution, whereas polyploidy events and large-scale segmental duplications appear to be the main driving force. Accompanying this expansion were variations that led to the sub- and neofunctionalization of different members, resulting in specificity that is likely determined by the 3-D protein structure. Novel functions involved in other physiological processes emerged as well, though redundancy is also observed, largely among recent duplications. Conserved sites and variable sites of KCS proteins are also identified by statistical analysis. The variable sites are likely to be involved in the emergence of product specificity and catalytic power, and conserved sites are possibly responsible for the preservation of fundamental function.

  17. Expansion and diversification of the SET domain gene family following whole-genome duplications in Populus trichocarpa

    PubMed Central

    2012-01-01

    Background Histone lysine methylation modifies chromatin structure and regulates eukaryotic gene transcription and a variety of developmental and physiological processes. SET domain proteins are lysine methyltransferases containing the evolutionarily-conserved SET domain, which is known to be the catalytic domain. Results We identified 59 SET genes in the Populus genome. Phylogenetic analyses of 106 SET genes from Populus and Arabidopsis supported the clustering of SET genes into six distinct subfamilies and identified 19 duplicated gene pairs in Populus. The chromosome locations of these gene pairs and the distribution of synonymous substitution rates showed that the expansion of the SET gene family might be caused by large-scale duplications in Populus. Comparison of gene structures and domain architectures of each duplicate pair indicated that divergence took place at the 3'- and 5'-terminal transcribed regions and at the N- and C-termini of the predicted proteins, respectively. Expression profile analysis of Populus SET genes suggested that most Populus SET genes were expressed widely, many with the highest expression in young leaves. In particular, the expression profiles of 12 of the 19 duplicated gene pairs fell into two types of expression patterns. Conclusions The 19 duplicated SET genes could have originated from whole genome duplication events. The differences in SET gene structure, domain architecture, and expression profiles in various tissues of Populus suggest that members of the SET gene family have a variety of developmental and physiological functions. Our study provides clues about the evolution of epigenetic regulation of chromatin structure and gene expression. PMID:22497662

  18. Extensive lineage-specific gene duplication and evolution of the spiggin multi-gene family in stickleback

    PubMed Central

    Kawahara, Ryouka; Nishida, Mutsumi

    2007-01-01

    Background The threespine stickleback (Gasterosteus aculeatus) has a characteristic reproductive mode; mature males build nests using a secreted glue-like protein called spiggin. Although recent studies reported multiple occurrences of genes that encode this glue-like protein spiggin in threespine and ninespine sticklebacks, it is still unclear how many genes compose the spiggin multi-gene family. Results Genome sequence analysis of threespine stickleback showed that there are at least five spiggin genes and two pseudogenes, whereas a single spiggin homolog occurs in the genomes of other fishes. Comparative genome sequence analysis demonstrated that Muc19, a single-copy mucous gene in human and mouse, is an ortholog of spiggin. Phylogenetic and molecular evolutionary analyses of these sequences suggested that an ancestral spiggin gene originated from a member of the mucin gene family as a single gene in the common ancestor of teleosts, and gene duplications of spiggin have occurred in the stickleback lineage. There was inter-population variation in the copy number of spiggin genes and positive selection on some codons, indicating that additional gene duplication/deletion events and adaptive evolution at some amino acid sites may have occurred in each stickleback population. Conclusion A number of spiggin genes exist in the threespine stickleback genome. Our results provide insight into the origin and dynamic evolutionary process of the spiggin multi-gene family in the threespine stickleback lineage. The dramatic evolution of genes for mucous substrates may have contributed to the generation of distinct characteristics such as "bio-glue" in vertebrates. PMID:17980047

  19. Structure of Mycobacterium tuberculosis Rv2714, a representative of a duplicated gene family in Actinobacteria

    PubMed Central

    Graña, Martin; Bellinzoni, Marco; Miras, Isabelle; Fiez-Vandal, Cedric; Haouz, Ahmed; Shepard, William; Buschiazzo, Alejandro; Alzari, Pedro M.

    2009-01-01

    The gene Rv2714 from Mycobacterium tuberculosis, which codes for a hypothetical protein of unknown function, is a representative member of a gene family that is largely confined to the order Actinomycetales of Actinobacteria. Sequence analysis indicates the presence of two paralogous genes in most mycobacterial genomes and suggests that gene duplication was an ancient event in bacterial evolution. The crystal structure of Rv2714 has been determined at 2.6 Å resolution, revealing a trimer in which the topology of the protomer core is similar to that observed in a functionally diverse set of enzymes, including purine nucleoside phosphorylases, some carboxypeptidases, bacterial peptidyl-tRNA hydrolases and even the plastidic form of an intron splicing factor. However, some structural elements, such as a β-hairpin insertion involved in protein oligomerization and a C-terminal α-helical domain that serves as a lid to the putative substrate-binding (or ligand-binding) site, are only found in Rv2714 bacterial homologues and represent specific signatures of this protein family. PMID:19851001

  20. bMERB domains are bivalent Rab8 family effectors evolved by gene duplication

    PubMed Central

    Rai, Amrita; Oprisko, Anastasia; Campos, Jeremy; Fu, Yangxue; Friese, Timon; Itzen, Aymelt; Goody, Roger S; Gazdag, Emerich Mihai; Müller, Matthias P

    2016-01-01

    In their active GTP-bound form, Rab proteins interact with proteins termed effector molecules. In this study, we have thoroughly characterized a Rab effector domain that is present in proteins of the Mical and EHBP families, both known to act in endosomal trafficking. Within our study, we show that these effectors display a preference for Rab8 family proteins (Rab8, 10, 13 and 15) and that some of the effector domains can bind two Rab proteins via separate binding sites. Structural analysis allowed us to explain the specificity towards Rab8 family members and the presence of two similar Rab binding sites that must have evolved via gene duplication. This study is the first to thoroughly characterize a Rab effector protein that contains two separate Rab binding sites within a single domain, allowing Micals and EHBPs to bind two Rabs simultaneously, thus suggesting previously unknown functions of these effector molecules in endosomal trafficking. DOI: http://dx.doi.org/10.7554/eLife.18675.001 PMID:27552051

  1. Genome-wide analysis of the Dof transcription factor gene family reveals soybean-specific duplicable and functional characteristics.

    PubMed

    Guo, Yong; Qiu, Li-Juan

    2013-01-01

    The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
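
    The 97.4% figure quoted above can be checked against the two counts given in the same abstract. The sketch below is only a reader's sanity check; it assumes that each of the 38 pairs contributes two distinct genes, which the abstract does not state explicitly.

```python
# Counts taken from the abstract above.
total_genes = 78
duplicate_pairs = 38

# Assumption: no gene is shared between two pairs, so each pair adds two genes.
genes_in_pairs = 2 * duplicate_pairs
percent_in_pairs = round(100 * genes_in_pairs / total_genes, 1)

print(f"{genes_in_pairs} of {total_genes} genes ({percent_in_pairs}%) lie in duplicate pairs")
```

    Under that assumption the two counts reproduce the reported percentage, leaving two genes outside any retained pair.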

  2. Divergence of the Dof gene families in poplar, Arabidopsis, and rice suggests multiple modes of gene evolution after duplication.

    PubMed

    Yang, Xiaohan; Tuskan, Gerald A; Cheng, Max Zong-Ming

    2006-11-01

    It is widely accepted that gene duplication is a primary source of genetic novelty. However, the evolutionary fate of duplicated genes remains largely unresolved. Ohno's classical Duplication-Retention-Non/Neofunctionalization theory, and the recently proposed alternatives such as subfunctionalization or duplication-degeneration-complementation, and subneofunctionalization, each can explain one or more aspects of gene fate after duplication. Duplicated genes are also affected by epigenetic changes. We constructed a phylogenetic tree using Dof (DNA binding with one finger) protein sequences from poplar (Populus trichocarpa Torr. & Gray ex Brayshaw), Arabidopsis (Arabidopsis thaliana), and rice (Oryza sativa). From the phylogenetic tree, we identified 27 pairs of paralogous Dof genes in the terminal nodes. Analysis of protein motif structure of the Dof paralogs and their ancestors revealed six different gene fates after gene duplication. Differential protein methylation was revealed between a pair of duplicated poplar Dof genes, which have identical motif structure and similar expression pattern, indicating that epigenetics is involved in evolution. Analysis of reverse transcription-PCR, massively parallel signature sequencing, and microarray data revealed that the paralogs differ in expression pattern. Furthermore, analysis of nonsynonymous and synonymous substitution rates indicated that divergence of the duplicated genes was driven by positive selection. About one-half of the motifs in Dof proteins were shared by non-Dof proteins in the three plant species, indicating that motif co-option may be one of the forces driving gene diversification. We provide evidence that Ohno's Duplication-Retention-Non/Neofunctionalization, subfunctionalization/duplication-degeneration-complementation, and subneofunctionalization hypotheses are complementary with, not alternative to, each other.
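
    Several records in this list infer the mode of selection from the ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates. The following is a minimal sketch of the conventional reading of that ratio, not the method of any particular study; the function name, the tolerance band around 1 and the example rates are all hypothetical.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.05) -> str:
    """Label a gene pair by its Ka/Ks ratio.

    tolerance is an illustrative band around 1 treated as 'neutral';
    real analyses use statistical tests rather than a fixed cut-off.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    omega = ka / ks
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical rates, not taken from any abstract in this list.
print(classify_selection(0.30, 0.15))  # ratio above 1
print(classify_selection(0.02, 0.20))  # ratio below 1
```

    A ratio well below 1 is usually read as purifying (negative) selection and a ratio above 1 as positive selection, which is the sense in which the abstracts use these terms.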

  3. Tubulin evolution in insects: gene duplication and subfunctionalization provide specialized isoforms in a functionally constrained gene family

    PubMed Central

    2010-01-01

    Background: The completion of 19 insect genome sequencing projects spanning six insect orders provides the opportunity to investigate the evolution of important gene families, here tubulins. Tubulins are a family of eukaryotic structural genes that form microtubules, fundamental components of the cytoskeleton that mediate cell division, shape, motility, and intracellular trafficking. Previous in vivo studies in Drosophila find a stringent relationship between tubulin structure and function; small, biochemically similar changes in the major alpha 1 or testis-specific beta 2 tubulin protein render each unable to generate a motile spermtail axoneme. This has evolutionary implications: not a single non-synonymous substitution is found in beta 2 among 17 species of Drosophila and Hirtodrosophila flies spanning 60 Myr of evolution. This raises an important question: how do tubulins evolve while maintaining their function? To answer it, we use molecular evolutionary analyses to characterize the evolution of insect tubulins. Results: Sixty-six alpha tubulin and eighty-six beta tubulin gene copies were retrieved and subjected to molecular evolutionary analyses. Four ancient clades of alpha and beta tubulins are found in insects: a major isoform clade (alpha 1, beta 1) and three minor, tissue-specific clades (alpha 2-4, beta 2-4). Based on a Homarus americanus (lobster) outgroup, these were generated through gene duplication events on major beta and alpha tubulin ancestors, followed by subfunctionalization in expression domain. Strong purifying selection acts on all tubulins, yet maximum pairwise amino acid distances between tubulin paralogs are large (0.464 substitutions/site for beta tubulins, 0.707 for alpha tubulins). Conversely, orthologs, with the exception of reproductive tissue isoforms, show little sequence variation except in the last 15 carboxy terminus tail (CTT) residues, which serve as sites for post-translational modifications (PTMs) and interactions with microtubule

  4. Diversification of the light-harvesting complex gene family via intra- and intergenic duplications in the coral symbiotic alga Symbiodinium.

    PubMed

    Maruyama, Shinichiro; Shoguchi, Eiichi; Satoh, Nori; Minagawa, Jun

    2015-01-01

    The light-harvesting complex (LHC) is an essential component in light energy capture and transduction to facilitate downstream photosynthetic reactions in plant and algal chloroplasts. The unicellular dinoflagellate alga Symbiodinium is an endosymbiont of cnidarian animals, including corals and sea anemones, and provides carbohydrates generated through photosynthesis to host animals. Although Symbiodinium possesses a unique LHC gene family, called chlorophyll a-chlorophyll c2-peridinin protein complex (acpPC), its genome-level diversity and evolutionary trajectories have not been investigated. Here, we describe a phylogenetic analysis revealing that many of the LHCs are encoded by highly duplicated genes with multi-subunit polyprotein structures in the nuclear genome of Symbiodinium minutum. This analysis provides an extended list of the LHC gene family in a single organism, including 80 loci encoding polyproteins composed of 145 LHC subunits recovered in the phylogenetic tree. In S. minutum, 5 phylogenetic groups of the Lhcf-type gene family, which is exclusively conserved in algae harboring secondary plastids of red algal origin, were identified. Moreover, 5 groups of the Lhcr-type gene family, of which members are known to be associated with PSI in red algal plastids and secondary plastids of red algal origin, were identified. Notably, members classified within a phylogenetic group of the Lhcf-type (group F1) are highly duplicated, which may explain the presence of an unusually large number of LHC genes in this species. Some gene units were homologous to other units within single loci of the polyprotein genes, whereas intergenic homologies between separate loci were conspicuous in other cases, implying that gene unit 'shuffling' by gene conversion and/or genome rearrangement might have been a driving force for diversification. These results suggest that vigorous intra- and intergenic gene duplication events have resulted in the genomic framework of

  5. The role of gene duplication and unconstrained selective pressures in the melanopsin gene family evolution and vertebrate circadian rhythm regulation.

    PubMed

    Borges, Rui; Johnson, Warren E; O'Brien, Stephen J; Vasconcelos, Vitor; Antunes, Agostinho

    2012-01-01

    Melanopsin is a photosensitive cell protein involved in regulating circadian rhythms and other non-visual responses to light. The melanopsin gene family is represented by two paralogs, OPN4x and OPN4m, which originated through gene duplication early in the emergence of vertebrates. Here we studied the melanopsin gene family using an integrated gene/protein evolutionary approach, which revealed that the rhabdomeric urbilaterian ancestor had the same amino acid patterns (DRY motif and the Y and E counterions) as extant vertebrate species, suggesting that the mechanism for light detection and regulation is similar to rhabdomeric rhodopsins. Both OPN4m and OPN4x paralogs are found in vertebrate genomic paralogons, suggesting that they diverged following this duplication event about 600 million years ago, when the complex eye emerged in the vertebrate ancestor. Melanopsins generally evolved under negative selection (ω = 0.171) with some minor episodes of positive selection (proportion of sites = 25%) and functional divergence (θ(I) = 0.349 and θ(II) = 0.126). The OPN4m and OPN4x melanopsin paralogs show evidence of spectral divergence at sites likely involved in melanopsin light absorbance (200F, 273S and 276A). Also, following the teleost lineage-specific whole genome duplication (3R) that prompted the teleost fish radiation, type I divergence (θ(I) = 0.181) and positive selection (affecting 11% of sites) contributed to amino acid variability that we related with the photo-activation stability of melanopsin. The melanopsin intracellular regions had unexpectedly high variability in their coupling specificity of G-proteins and we propose that Gq/11 and Gi/o are the two G-proteins most likely to mediate the melanopsin phototransduction pathway. The selection signatures were mainly observed on retinal-related sites and the third and second intracellular loops, demonstrating the physiological plasticity of the melanopsin protein group. Our results provide new insights on

  6. Tandem Duplication Events in the Expansion of the Small Heat Shock Protein Gene Family in Solanum lycopersicum (cv. Heinz 1706).

    PubMed

    Krsticevic, Flavia J; Arce, Débora P; Ezpeleta, Joaquín; Tapia, Elizabeth

    2016-10-13

    In plants, fruit maturation and oxidative stress can induce small heat shock protein (sHSP) synthesis to maintain cellular homeostasis. Although the tomato reference genome was published in 2012, the actual number and functionality of sHSP genes remain unknown. Using a transcriptomic (RNA-seq) and evolutionary genomic approach, putative sHSP genes in the Solanum lycopersicum (cv. Heinz 1706) genome were investigated. A sHSP gene family of 33 members was established. Remarkably, roughly half of the members of this family can be explained by nine independent tandem duplication events that determined, evolutionarily, their functional fates. Within a mitochondrial class subfamily, only one duplicated member, Solyc08g078700, retained its ancestral chaperone function, while the others, Solyc08g078710 and Solyc08g078720, likely degenerated under neutrality and lack ancestral chaperone function. Functional conservation occurred within a cytosolic class I subfamily, whose four members, Solyc06g076570, Solyc06g076560, Solyc06g076540, and Solyc06g076520, support ∼57% of the total sHSP mRNA in the red ripe fruit. Subfunctionalization occurred within a new subfamily, whose two members, Solyc04g082720 and Solyc04g082740, show heterogeneous differential expression profiles during fruit ripening. These findings, involving the birth/death of some genes or the preferential/plastic expression of some others during fruit ripening, highlight the importance of tandem duplication events in the expansion of the sHSP gene family in the tomato genome. Despite its evolutionary diversity, the sHSP gene family in the tomato genome seems to be endowed with a core set of four homeostasis genes: Solyc05g014280, Solyc03g082420, Solyc11g020330, and Solyc06g076560, which appear to provide a baseline protection during both fruit ripening and heat shock stress in different tomato tissues.
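
    The gene identifiers quoted in this abstract already hint at the tandem arrangement, because members of each subfamily carry the same chromosome prefix and closely spaced locus numbers. The sketch below groups the listed IDs on that basis. It is an illustration only, not the authors' procedure: it assumes that the two digits after "Solyc" encode the chromosome and that the six digits after "g" increase along the chromosome, and the `max_gap` threshold and function name are hypothetical.

```python
import re
from collections import defaultdict

# Assumed ID layout: "Solyc" + 2-digit chromosome + "g" + 6-digit locus number.
SOLYC_ID = re.compile(r"^Solyc(\d{2})g(\d{6})$")


def neighbouring_groups(gene_ids, max_gap=50):
    """Chain IDs on one chromosome whose consecutive locus numbers differ by at most max_gap.

    max_gap is a hypothetical threshold chosen for illustration only.
    """
    by_chrom = defaultdict(set)  # a set drops IDs that are listed twice
    for gene_id in gene_ids:
        match = SOLYC_ID.match(gene_id)
        if match is None:
            raise ValueError(f"unrecognised gene ID: {gene_id}")
        by_chrom[match.group(1)].add((int(match.group(2)), gene_id))

    groups = []
    for chrom in sorted(by_chrom):
        run = []
        for number, gene_id in sorted(by_chrom[chrom]):
            if run and number - run[-1][0] > max_gap:
                if len(run) > 1:
                    groups.append([g for _, g in run])
                run = []
            run.append((number, gene_id))
        if len(run) > 1:
            groups.append([g for _, g in run])
    return groups


# IDs exactly as they appear in the abstract above.
abstract_ids = [
    "Solyc08g078700", "Solyc08g078710", "Solyc08g078720",
    "Solyc06g076570", "Solyc06g076560", "Solyc06g076540", "Solyc06g076520",
    "Solyc04g082720", "Solyc04g082740",
    "Solyc05g014280", "Solyc03g082420", "Solyc11g020330", "Solyc06g076560",
]
groups = neighbouring_groups(abstract_ids)
for group in groups:
    print(group)
```

    Closeness of identifiers is only a proxy; a real analysis would use genomic coordinates and sequence similarity to call tandem duplicates.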

  7. Tandem Duplication Events in the Expansion of the Small Heat Shock Protein Gene Family in Solanum lycopersicum (cv. Heinz 1706)

    PubMed Central

    Krsticevic, Flavia J.; Arce, Débora P.; Ezpeleta, Joaquín; Tapia, Elizabeth

    2016-01-01

    In plants, fruit maturation and oxidative stress can induce small heat shock protein (sHSP) synthesis to maintain cellular homeostasis. Although the tomato reference genome was published in 2012, the actual number and functionality of sHSP genes remain unknown. Using a transcriptomic (RNA-seq) and evolutionary genomic approach, putative sHSP genes in the Solanum lycopersicum (cv. Heinz 1706) genome were investigated. A sHSP gene family of 33 members was established. Remarkably, roughly half of the members of this family can be explained by nine independent tandem duplication events that determined, evolutionarily, their functional fates. Within a mitochondrial class subfamily, only one duplicated member, Solyc08g078700, retained its ancestral chaperone function, while the others, Solyc08g078710 and Solyc08g078720, likely degenerated under neutrality and lack ancestral chaperone function. Functional conservation occurred within a cytosolic class I subfamily, whose four members, Solyc06g076570, Solyc06g076560, Solyc06g076540, and Solyc06g076520, support ∼57% of the total sHSP mRNA in the red ripe fruit. Subfunctionalization occurred within a new subfamily, whose two members, Solyc04g082720 and Solyc04g082740, show heterogeneous differential expression profiles during fruit ripening. These findings, involving the birth/death of some genes or the preferential/plastic expression of some others during fruit ripening, highlight the importance of tandem duplication events in the expansion of the sHSP gene family in the tomato genome. Despite its evolutionary diversity, the sHSP gene family in the tomato genome seems to be endowed with a core set of four homeostasis genes: Solyc05g014280, Solyc03g082420, Solyc11g020330, and Solyc06g076560, which appear to provide a baseline protection during both fruit ripening and heat shock stress in different tomato tissues. PMID:27565886

  8. The fate of tandemly duplicated genes assessed by the expression analysis of a group of Arabidopsis thaliana RING-H2 ubiquitin ligase genes of the ATL family.

    PubMed

    Aguilar-Hernández, Victor; Guzmán, Plinio

    2014-03-01

    Gene duplication events play key roles in gene innovation during the evolution of eukaryotic genomes. A large portion of the total gene content in plants arose from tandem duplication events, which often result in paralogous genes with high sequence identity. Ubiquitin ligases or E3 enzymes are components of the ubiquitin proteasome system that function during the transfer of the ubiquitin molecule to the substrate. In plants, several E3s have expanded in their genomes as multigene families. To gain insight into the consequences of gene duplications on the expansion and diversification of E3s, we examined the evolutionary basis of a cluster of six genes, duplC-ATLs, which arose from segmental and tandem duplication events in Brassicaceae. The assessment of the expression suggested two patterns that are supported by lineage. While retention of expression domains was observed, an apparent absence or reduction of expression was also inferred. We found that two duplC-ATL genes underwent pseudogenization and that, in one case, gene expression is probably regained. Our findings provide insights into the evolution of gene families in plants, defining key events in the expansion of the Arabidopsis Tóxicos en Levadura family of E3 ligases.

  9. A dynamic history of gene duplications and losses characterizes the evolution of the SPARC family in eumetazoans.

    PubMed

    Bertrand, Stephanie; Fuentealba, Jaime; Aze, Antoine; Hudson, Clare; Yasuo, Hitoyoshi; Torrejon, Marcela; Escriva, Hector; Marcellini, Sylvain

    2013-04-22

    The vertebrates share the ability to produce a skeleton made of mineralized extracellular matrix. However, our understanding of the molecular changes that accompanied their emergence remains scarce. Here, we describe the evolutionary history of the SPARC (secreted protein acidic and rich in cysteine) family, because its vertebrate orthologues are expressed in cartilage, bones and teeth where they have been proposed to bind calcium and act as extracellular collagen chaperones, and because further duplications of specific SPARC members produced the small calcium-binding phosphoproteins (SCPP) family that is crucial for skeletal mineralization to occur. Both phylogeny and synteny conservation analyses reveal that, in the eumetazoan ancestor, a unique ancestral gene duplicated to give rise to SPARC and SPARCB described here for the first time. Independent losses have eliminated one of the two paralogues in cnidarians, protostomes and tetrapods. Hence, only non-tetrapod deuterostomes have conserved both genes. Remarkably, SPARC and SPARCB paralogues are still linked in the amphioxus genome. To shed light on the evolution of the SPARC family members in chordates, we performed a comprehensive analysis of their embryonic expression patterns in amphioxus, tunicates, teleosts, amphibians and mammals. Our results show that in the chordate lineage SPARC and SPARCB family members were recurrently recruited in a variety of unrelated tissues expressing collagen genes. We propose that one of the earliest steps of skeletal evolution involved the co-expression of SPARC paralogues with collagenous proteins.

  10. Massive Gene Duplication Event among Clinical Isolates of the Mycobacterium tuberculosis W/Beijing Family

    PubMed Central

    Domenech, Pilar; Kolly, Gaëlle S.; Leon-Solis, Lizbel; Fallow, Ashley; Reed, Michael B.

    2010-01-01

    As part of our effort to uncover the molecular basis for the phenotypic variation among clinical Mycobacterium tuberculosis isolates, we have previously reported that isolates belonging to the W/Beijing lineage constitutively overexpress the DosR-regulated transcriptional program. While generating dosR knockouts in two independent W/Beijing sublineages, we were surprised to discover that they possess two copies of dosR. This dosR amplification is part of a massive genomic duplication spanning 350 kb and encompassing >300 genes. In total, this equates to 8% of the genome being present as two copies. The presence of IS6110 elements at both ends of the region of duplication, and in the novel junction region, suggests that it arose through unequal homologous recombination of sister chromatids at the IS6110 sequences. Analysis of isolates representing the major M. tuberculosis lineages has revealed that the 350-kb duplication is restricted to the most recently evolved sublineages of the W/Beijing family. Within these isolates, the duplication is partly responsible for the constitutive dosR overexpression phenotype. Although the nature of the selection event giving rise to the duplication remains unresolved, its evolution is almost certainly the result of specific selective pressure(s) encountered inside the host. A preliminary in vitro screen has failed to reveal a role of the duplication in conferring resistance to common antitubercular drugs, a trait frequently associated with W/Beijing isolates. Nevertheless, this first description of a genetic remodeling event of this nature for M. tuberculosis further highlights the potential for the evolution of diversity in this important global pathogen. PMID:20639330
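
    The statement that a 350-kb duplication equates to about 8% of the genome can be checked with one division. The abstract does not give a genome size, so the sketch below assumes the 4,411,532 bp length of the M. tuberculosis H37Rv reference genome; W/Beijing isolates may differ slightly.

```python
# Duplication size as stated in the abstract.
duplication_bp = 350_000

# Assumption: H37Rv reference genome length; not stated in the abstract.
genome_bp = 4_411_532

percent_duplicated = 100 * duplication_bp / genome_bp
print(f"{percent_duplicated:.1f}% of the genome is present in two copies")
```

    With that reference length the result rounds to the 8% quoted in the abstract.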

  11. Duplications and Positive Selection Drive the Evolution of Parasitism-Associated Gene Families in the Nematode Strongyloides papillosus

    PubMed Central

    Baskaran, Praveen; Jaleta, Tegegn G.; Streit, Adrian; Rödelsperger, Christian

    2017-01-01

    Gene duplication is a major mechanism playing a role in the evolution of phenotypic complexity and in the generation of novel traits. By comparing parasitic and nonparasitic nematodes, a recent study found that the evolution of parasitism in Strongyloididae is associated with a large expansion in the Astacin and CAP gene families. To gain novel insights into the developmental processes in the sheep parasite Strongyloides papillosus, we sequenced transcriptomes of different developmental stages and sexes. Overall, we found that the majority of genes are developmentally regulated and have one-to-one orthologs in the diverged S. ratti genome. Together with the finding of similar expression profiles between S. papillosus and S. ratti, these results indicate a strong evolutionary constraint acting against change at sequence and expression levels. However, the comparison between parasitic and free-living females demonstrates a quite divergent pattern that is mostly due to the previously mentioned expansion in the Astacin and CAP gene families. More detailed phylogenetic analysis of both gene families shows that most members date back to single expansion events early in the Strongyloides lineage and have undergone subfunctionalization resulting in clusters that are highly expressed either in infective larvae or in parasitic females. Finally, we found increased evidence for positive selection in both gene families relative to the genome-wide expectation. In summary, our study reveals first insights into the developmental transcriptomes of S. papillosus and provides a detailed analysis of sequence and expression evolution in parasitism-associated gene families. PMID:28338804

  12. Duplications and positive selection drive the evolution of parasitism associated gene families in the nematode Strongyloides papillosus.

    PubMed

    Baskaran, Praveen; Jaleta, Tegegn G; Streit, Adrian; Rödelsperger, Christian

    2017-03-02

    Gene duplication is one major mechanism playing a role in the evolution of phenotypic complexity and in the generation of novel traits. By comparing parasitic and nonparasitic nematodes, a recent study found that the evolution of parasitism in Strongyloididae is associated with a large expansion in the Astacin and CAP gene families. To gain novel insights into the developmental processes in the sheep parasite Strongyloides papillosus, we sequenced transcriptomes of different developmental stages and sexes. Overall, we found that the majority of genes are developmentally regulated and have one-to-one orthologs in the diverged S. ratti genome. Together with the finding of similar expression profiles between S. papillosus and S. ratti, these results indicate a strong evolutionary constraint acting against change at sequence and expression levels. However, the comparison between parasitic and free-living females demonstrates a quite divergent pattern that is mostly due to the previously mentioned expansion in the Astacin and CAP gene families. More detailed phylogenetic analysis of both gene families shows that most members date back to single expansion events early in the Strongyloides lineage and have undergone subfunctionalization resulting in clusters that are highly expressed either in infective larvae or in parasitic females. Finally, we found increased evidence for positive selection in both gene families relative to the genome-wide expectation. In summary, our study reveals first insights into the developmental transcriptomes of S. papillosus and provides a detailed analysis of sequence and expression evolution in parasitism associated gene families.

  13. Gene duplication event in family 12 glycosyl hydrolase from Phytophthora spp.

    PubMed

    Costanzo, Stefano; Ospina-Giraldo, M D; Deahl, K L; Baker, C J; Jones, Richard W

    2006-10-01

    A total of 18 paralogs of xyloglucan-specific endoglucanases (EGLs) from the glycosyl hydrolase family 12 were identified and characterized in Phytophthora sojae and Phytophthora ramorum. These genes encode predicted extracellular enzymes, with sizes ranging from 189 to 435 amino acid residues, that would be capable of hydrolyzing the xyloglucan component of the host cell wall. In two cases, four and six functional copies of these genes were found in tight succession within a region of 5 and 18 kb, respectively. The overall gene copy number and relative organization appeared well conserved between P. sojae and P. ramorum, with apparent synteny in this region of their respective genomes. Phylogenetic analyses of Phytophthora endoglucanases of family 12 and other known members of EGL 12, revealed a close relatedness with a fairly conserved gene sub-family containing, among others, sequences from the fungi Emericella desertorum and Aspergillus aculeatus. This is the first report of family 12 EGLs present in plant pathogenic eukaryotes.

  14. Evolution of non-specific lipid transfer protein (nsLTP) genes in the Poaceae family: their duplication and diversity.

    PubMed

    Jang, Cheol Seong; Yim, Won Cheol; Moon, Jun-Cheol; Hung, Je Hyeong; Lee, Tong Geon; Lim, Sung Don; Cho, Seon Hae; Lee, Kwang Kook; Kim, Wook; Seo, Yong Weon; Lee, Byung-Moo

    2008-05-01

    Previously, the genes encoding non-specific lipid transfer proteins (nsLTPs) of the Poaceae family appear to evidence different genomic distribution and somewhat different shares of EST clones, which is suggestive of independent duplication(s) followed by functional diversity. To further evaluate the evolutionary fate of the Poaceae nsLTP genes, we have identified Ka/Ks values, conserved, mutated or lost cis-regulatory elements, responses to several elicitors, genome-wide expression profiles, and nsLTP gene-coexpression networks of both (or either) wheat and rice. The Ka/Ks values within each group and between groups appeared to be similar, but not identical, in both species. The conserved cis-regulatory elements, e.g. the RY repeat (CATGCA) element related to ABA regulation in group A, might be reflected in some degree of long-term conservation in transcriptional regulation post-dating speciation. In group A, wheat nsLTP genes, with the exception of TaLTP4, evidenced responses similar to those of plant elicitors; however, the rice nsLTP genes evidenced differences in expression profiles, even though the genes of both species have undergone purifying selection, thereby suggesting their independent functional diversity. The expression profiles of rice nsLTP genes with a microarray dataset of 155 gene expression omnibus sample (GSM) plates suggest that subfunctionalization is not the sole mechanism inherent to the evolutionary history of nsLTP genes but may, rather, function in concert with other mechanism(s). As inferred by the nsLTP gene-coexpression networks, the functional diversity of nsLTP genes appears not to be randomized, but rather to be specialized in the direction of specific biological processes over evolutionary time.

  15. Phylogenetic relationships among Perissodactyla: secretoglobin 1A1 gene duplication and triplication in the Equidae family.

    PubMed

    Côté, Olivier; Viel, Laurent; Bienzle, Dorothee

    2013-12-01

    Secretoglobin family 1A member 1 (SCGB 1A1) is a small anti-inflammatory and immunomodulatory protein that is abundantly secreted in airway surface fluids. We recently reported the existence of three distinct SCGB1A1 genes in the domestic horse genome as opposed to the single gene copy consensus present in other mammals. The origin of SCGB1A1 gene triplication and the evolutionary relationship of the three genes amongst Equidae family members are unknown. For this study, SCGB1A1 genomic data were collected from various Equus individuals including E. caballus, E. przewalskii, E. asinus, E. grevyi, and E. quagga. Three SCGB1A1 genes in E. przewalskii, two SCGB1A1 genes in E. asinus, and a single SCGB1A1 gene in E. grevyi and E. quagga were identified. Sequence analysis revealed that the non-synonymous nucleotide substitutions between the different equid genes coded for 17 amino acid changes. Most of these changes localized to the SCGB 1A1 central cavity that binds hydrophobic ligands, suggesting that this area of SCGB 1A1 evolved to accommodate diverse molecular interactions. Three-dimensional modeling of the proteins revealed that the size of the SCGB 1A1 central cavity is larger than that of SCGB 1A1A. Altogether, these findings suggest that evolution of the SCGB1A1 gene may parallel the separation of caballine and non-caballine species amongst Equidae, and may indicate an expansion of function for SCGB1A1 gene products.

  16. Discrimination of Deletion and Duplication Subtypes of the Deleted in Azoospermia Gene Family in the Context of Frequent Interloci Gene Conversion

    PubMed Central

    Vaszkó, Tibor; Papp, János; Krausz, Csilla; Casamonti, Elena; Géczi, Lajos; Olah, Edith

    2016-01-01

    Due to its palindromic setup, the AZFc (Azoospermia Factor c) region of chromosome Y is one of the most unstable regions of the human genome. It contains eight gene families expressed mainly in the testes. Several types of rearrangement resulting in changes in the cumulative copy number of the gene families were reported to be associated with diseases such as male infertility and testicular germ cell tumors. The best studied AZFc rearrangement is gr/gr deletion. Its carriers show widespread phenotypic variation from azoospermia to normospermia. This phenomenon was initially attributed to different gr/gr subtypes that would eliminate distinct members of the affected gene families. However, studies conducted to confirm this hypothesis have brought controversial results, perhaps, in part, due to the shortcomings of the utilized subtyping methodology. This proof-of-concept paper introduces a novel method aimed at subtyping AZFc rearrangements. It is able to differentiate the partial deletion and partial duplication subtypes of the Deleted in Azoospermia (DAZ) gene family. The keystone of the method is the determination of the copy number of the gene family member-specific variant(s) in a series of sequence family variant (SFV) positions. Most importantly, we present a novel approach for the correct interpretation of the variant copy number data to determine the copy number of the individual DAZ family members in the context of frequent interloci gene conversion. Besides DAZ1/DAZ2 and DAZ3/DAZ4 deletions, not yet described rearrangements such as DAZ2/DAZ4 deletion and three duplication subtypes were also found by the utilization of the novel approach. A striking feature is the extremely high concordance among the individual data pointing to a certain type of rearrangement. In addition to being able to identify DAZ deletion subtypes more reliably than the methods used previously, this approach is the first that can discriminate DAZ duplication subtypes as well.

  17. Evolution of Gene Duplication in Plants.

    PubMed

    Panchy, Nicholas; Lehti-Shiu, Melissa; Shiu, Shin-Han

    2016-08-01

    Ancient duplication events and a high rate of retention of extant pairs of duplicate genes have contributed to an abundance of duplicate genes in plant genomes. These duplicates have contributed to the evolution of novel functions, such as the production of floral structures, induction of disease resistance, and adaptation to stress. Additionally, recent whole-genome duplications that have occurred in the lineages of several domesticated crop species, including wheat (Triticum aestivum), cotton (Gossypium hirsutum), and soybean (Glycine max), have contributed to important agronomic traits, such as grain quality, fruit shape, and flowering time. Therefore, understanding the mechanisms and impacts of gene duplication will be important to future studies of plants in general and of agronomically important crops in particular. In this review, we survey the current knowledge about gene duplication, including gene duplication mechanisms, the potential fates of duplicate genes, models explaining duplicate gene retention, the properties that distinguish duplicate from singleton genes, and the evolutionary impact of gene duplication.

  18. The vertebrate makorin ubiquitin ligase gene family has been shaped by large-scale duplication and retroposition from an ancestral gonad-specific, maternal-effect gene

    PubMed Central

    2010-01-01

    Background: Members of the makorin (mkrn) gene family encode RING/C3H zinc finger proteins with E3 ubiquitin ligase activity. Although these proteins have been described in a variety of eukaryotes such as plants, fungi, invertebrates and vertebrates including human, almost nothing is known about their structural and functional evolution. Results: Via partial sequencing of a testis cDNA library from the poeciliid fish Xiphophorus maculatus, we have identified a new member of the makorin gene family, which we called mkrn4. In addition to the already described mkrn1 and mkrn2, mkrn4 is the third example of a makorin gene present in both tetrapods and ray-finned fish. However, this gene was not detected in mouse and rat, suggesting its loss in the lineage leading to rodent murids. Mkrn2 and mkrn4 are located in large ancient duplicated regions in tetrapod and fish genomes, suggesting the possible involvement of ancestral vertebrate-specific genome duplication in the formation of these genes. Intriguingly, many mkrn1 and mkrn2 intronless retrocopies have been detected in mammals but not in other vertebrates, most of them corresponding to pseudogenes. The nature and number of zinc fingers were found to be conserved in Mkrn1 and Mkrn2 but much more variable in Mkrn4, with lineage-specific differences. RT-qPCR analysis demonstrated a highly gonad-biased expression pattern for makorin genes in medaka and zebrafish (ray-finned fishes) and amphibians, but a strong relaxation of this specificity in birds and mammals. All three mkrn genes were maternally expressed before zygotic genome activation in both medaka and zebrafish early embryos. Conclusion: Our analysis demonstrates that the makorin gene family has evolved through large-scale duplication and subsequent lineage-specific retroposition-mediated duplications in vertebrates. Of the three major vertebrate mkrn genes, mkrn4 shows the highest evolutionary dynamics, with lineage-specific loss of zinc fingers and even complete

  19. Evolution by leaps: gene duplication in bacteria

    PubMed Central

    2009-01-01

    Background: Sequence-related families of genes and proteins are common in bacterial genomes. In Escherichia coli they constitute over half of the genome. The presence of families and superfamilies of proteins suggests a history of gene duplication and divergence during evolution. Genome-encoded protein families, their size and functional composition, reflect metabolic potentials of the organisms they are found in. Comparing protein families of different organisms gives insight into functional differences and similarities. Results: Equivalent enzyme families with metabolic functions were selected from the genomes of four experimentally characterized bacteria belonging to separate genera. Both similarities and differences were detected in the protein family memberships, with more similarities being detected among the more closely related organisms. Protein family memberships reflected known metabolic characteristics of the organisms. Differences in divergence of functionally characterized enzyme family members accounted for characteristics of taxa known to differ in those biochemical properties and capabilities. While some members of the gene families will have been acquired by lateral exchange and other former family members will have been lost over time, duplication and divergence of genes and functions appear to have been a significant contributor to the functional diversity of today’s microbes. Conclusions: Protein families seem likely to have arisen during evolution by gene duplication and divergence where the gene copies that have been retained are the variants that have led to distinct bacterial physiologies and taxa. Thus divergence of the duplicate enzymes has been a major process in the generation of different kinds of bacteria. Reviewers: This article was reviewed by Drs. Iyer Aravind, Arcady Mushegian, and Pierre Pontarotti. PMID:19930658

  20. A novel exon duplication event leading to a truncating germ-line mutation of the APC gene in a familial adenomatous polyposis family.

    PubMed

    McCart, Amy; Latchford, Andrew; Volikos, Emmanouil; Rowan, Andrew; Tomlinson, Ian; Silver, Andrew

    2006-01-01

    Familial Adenomatous Polyposis (FAP) is an autosomal dominant condition predisposing to multiple adenomatous polyps of the colon. FAP patients frequently carry heterozygous mutations of the APC tumour suppressor gene. Affected individuals from a cohort of FAP families (n=22), where no germ-line APC mutation was detected by direct sequencing, were analysed by Multiplex Ligation-dependent Probe Amplification (MLPA). MLPA identified a previously unreported APC mutation involving duplication of exon 4. Subsequent analysis of cDNA from affected family members revealed expression of mutant mRNA species containing two copies of exon 4, resulting in a frameshift and premature stop codon. Bioinformatic analysis of the relevant APC genomic segment predicted a role for homologous recombination possibly involving Alu repeats in the generation of this genotype. Our results highlight the importance of MLPA as an adjunct to exon-by-exon sequencing in identifying infrequent mutational events in cancer predisposing genes.

  1. A complex interplay of tandem- and whole-genome duplication drives expansion of the L-type lectin receptor kinase gene family in the Brassicaceae.

    PubMed

    Hofberger, Johannes A; Nsibo, David L; Govers, Francine; Bouwmeester, Klaas; Schranz, M Eric

    2015-01-28

    The comparative analysis of plant gene families in a phylogenetic framework has greatly accelerated due to advances in next generation sequencing. In this study, we provide an evolutionary analysis of the L-type lectin receptor kinase and L-type lectin domain proteins (L-type LecRKs and LLPs) that are considered as components in plant immunity, in the plant family Brassicaceae and related outgroups. We combine several lines of evidence provided by sequence homology, HMM-driven protein domain annotation, phylogenetic analysis, and gene synteny for large-scale identification of L-type LecRK and LLP genes within nine core-eudicot genomes. We show that both polyploidy and local duplication events (tandem duplication and gene transposition duplication) have played a major role in L-type LecRK and LLP gene family expansion in the Brassicaceae. We also find significant differences in rates of molecular evolution based on the mode of duplication. Additionally, we show that LLPs share a common evolutionary origin with L-type LecRKs and provide a consistent gene family nomenclature. Finally, we demonstrate that the largest and most diverse L-type LecRK clades are lineage-specific. Our evolutionary analyses of these plant immune components provide a framework to support future plant resistance breeding.

  2. A Complex Interplay of Tandem- and Whole-Genome Duplication Drives Expansion of the L-Type Lectin Receptor Kinase Gene Family in the Brassicaceae

    PubMed Central

    Govers, Francine; Bouwmeester, Klaas; Schranz, M. Eric

    2015-01-01

    The comparative analysis of plant gene families in a phylogenetic framework has greatly accelerated due to advances in next generation sequencing. In this study, we provide an evolutionary analysis of the L-type lectin receptor kinase and L-type lectin domain proteins (L-type LecRKs and LLPs) that are considered as components in plant immunity, in the plant family Brassicaceae and related outgroups. We combine several lines of evidence provided by sequence homology, HMM-driven protein domain annotation, phylogenetic analysis, and gene synteny for large-scale identification of L-type LecRK and LLP genes within nine core-eudicot genomes. We show that both polyploidy and local duplication events (tandem duplication and gene transposition duplication) have played a major role in L-type LecRK and LLP gene family expansion in the Brassicaceae. We also find significant differences in rates of molecular evolution based on the mode of duplication. Additionally, we show that LLPs share a common evolutionary origin with L-type LecRKs and provide a consistent gene family nomenclature. Finally, we demonstrate that the largest and most diverse L-type LecRK clades are lineage-specific. Our evolutionary analyses of these plant immune components provide a framework to support future plant resistance breeding. PMID:25635042

  3. Characterization of duplicated Dunaliella viridis SPT1 genes provides insights into early gene divergence after duplication.

    PubMed

    Guan, Zhenwei; Meng, Xiangzong; Sun, Zhenhua; Xu, Zhengkai; Song, Rentao

    2008-10-15

    The sodium-dependent phosphate transporter gene from the unicellular green alga Dunaliella viridis, DvSPT1, shares similarity with members of the Pi transporter family. Sequencing analysis of a D. viridis BAC clone containing the DvSPT1 gene revealed two inverted duplicated copies of this gene (DvSPT1 and DvSPT1-2, respectively). The duplication covered most of both genes except for their 3' downstream region. The duplicated genomic sequences exhibited 97.9% identity with a synonymous divergence of Ks=0.0126 in the coding region. These data indicated a very recent gene duplication in the D. viridis genome, providing an excellent opportunity to investigate sequence and expression divergence of duplicated genes at an early stage. Scattered point mutations and length polymorphism of simple sequence repeats (SSRs) were predominant among the sequence divergence soon after gene duplication. Due to sequence divergence in the 5' regulatory regions and a swap of the entire 3' downstream regions (3'-UTR), DvSPT1 and DvSPT1-2 showed expression divergence in response to extra-cellular NaCl concentration changes. According to their expression patterns, the two diverged gene copies would provide better adaptation to a broader range of extra-cellular NaCl concentration. Furthermore, Southern blot analysis indicated that there might be a large phosphate transporter gene family in D. viridis.
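
    As a rough illustration of how a figure such as the 97.9% identity above can be obtained, the following minimal Python sketch counts matching positions in a pairwise alignment. It is not the authors' procedure: the function name `percent_identity`, the `-` gap symbol, and the choice to ignore gapped columns are all assumptions made for this example, and other identity definitions (for instance, counting gaps as mismatches) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length) and columns
    where either sequence has a gap are skipped rather than penalised.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns where both residues agree
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x.upper() == y.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not DvSPT1 data).
identity = percent_identity("ACGTACGT", "ACGTACGA")
```

    The toy call compares eight ungapped columns, seven of which match.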

  4. Evolution and expansion of the Mycobacterium tuberculosis PE and PPE multigene families and their association with the duplication of the ESAT-6 (esx) gene cluster regions

    PubMed Central

    Gey van Pittius, Nicolaas C; Sampson, Samantha L; Lee, Hyeyoung; Kim, Yeun; van Helden, Paul D; Warren, Robin M

    2006-01-01

    Background: The PE and PPE multigene families of Mycobacterium tuberculosis comprise about 10% of the coding potential of the genome. The function of the proteins encoded by these large gene families remains unknown, although they have been proposed to be involved in antigenic variation and disease pathogenesis. Interestingly, some members of the PE and PPE families are associated with the ESAT-6 (esx) gene cluster regions, which are regions of immunopathogenic importance, and encode a system dedicated to the secretion of members of the potent T-cell antigen ESAT-6 family. This study investigates the duplication characteristics of the PE and PPE gene families and their association with the ESAT-6 gene clusters, using a combination of phylogenetic analyses, DNA hybridization, and comparative genomics, in order to gain insight into their evolutionary history and distribution in the genus Mycobacterium. Results: The results showed that the expansion of the PE and PPE gene families is linked to the duplications of the ESAT-6 gene clusters, and that members situated in and associated with the clusters represent the most ancestral copies of the two gene families. Furthermore, the emergence of the repeat protein PGRS and MPTR subfamilies is a recent evolutionary event, occurring at defined branching points in the evolution of the genus Mycobacterium. These gene subfamilies are thus present in multiple copies only in the members of the M. tuberculosis complex and close relatives. The study provides a complete analysis of all the PE and PPE genes found in the sequenced genomes of members of the genus Mycobacterium such as M. smegmatis, M. avium paratuberculosis, M. leprae, M. ulcerans, and M. tuberculosis. Conclusion: This work provides insight into the evolutionary history for the PE and PPE gene families of the mycobacteria, linking the expansion of these families to the duplications of the ESAT-6 (esx) gene cluster regions, and showing that they are composed of subgroups

  5. Evolution, organization, and expression of alpha-tubulin genes in the antarctic fish Notothenia coriiceps. Adaptive expansion of a gene family by recent gene duplication, inversion, and divergence.

    PubMed

    Parker, S K; Detrich, H W

    1998-12-18

    To assess the organization and expression of tubulin genes in ectothermic vertebrates, we have chosen the Antarctic yellowbelly rockcod, Notothenia coriiceps, as a model system. The genome of N. coriiceps contains approximately 15 distinct DNA fragments complementary to alpha-tubulin cDNA probes, which suggests that the alpha-tubulins of this cold-adapted fish are encoded by a substantial multigene family. From an N. coriiceps testicular DNA library, we isolated a 13.8-kilobase pair genomic clone that contains a tightly linked cluster of three alpha-tubulin genes, designated NcGTbalphaa, NcGTbalphab, and NcGTbalphac. Two of these genes, NcGTbalphaa and NcGTbalphab, are linked in head-to-head (5' to 5') orientation with approximately 500 bp separating their start codons, whereas NcGTbalphaa and NcGTbalphac are linked tail-to-tail (3' to 3') with approximately 2.5 kilobase pairs between their stop codons. The exons, introns, and untranslated regions of the three alpha-tubulin genes are strikingly similar in sequence, and the intergenic region between the alphaa and alphab genes is significantly palindromic. Thus, this cluster probably evolved by duplication, inversion, and divergence of a common ancestral alpha-tubulin gene. Expression of the NcGTbalphac gene is cosmopolitan, with its mRNA most abundant in hematopoietic, neural, and testicular tissues, whereas NcGTbalphaa and NcGTbalphab transcripts accumulate primarily in brain. The differential expression of the three genes is consistent with distinct suites of putative promoter and enhancer elements. We propose that cold adaptation of the microtubule system of Antarctic fishes is based in part on expansion of the alpha- and beta-tubulin gene families to ensure efficient synthesis of tubulin polypeptides.

  6. Functional requirements driving the gene duplication in 12 Drosophila species

    PubMed Central

    2013-01-01

    Background: Gene duplication supplies the raw materials for novel gene functions, and many gene families that have arisen from duplication experience adaptive evolution. Most studies of young duplicates have focused on mammals, especially humans, whereas reports describing their genome-wide evolutionary patterns across the closely related Drosophila species are rare. The sequenced 12 Drosophila genomes provide the opportunity to address this issue. Results: In our study, 3,647 young duplicate gene families were identified across the 12 Drosophila species and three types of expansions, species-specific, lineage-specific and complex expansions, were detected in these gene families. Our data showed that the species-specific young duplicate genes predominated (86.6%) over the other two types. Interestingly, many independent species-specific expansions in the same gene family have been observed in many species, even including 11 or 12 Drosophila species. Our data also showed that the functional bias observed in these young duplicate genes was mainly related to responses to environmental stimuli and biotic stresses. Conclusions: This study reveals the evolutionary patterns of young duplicates across 12 Drosophila species on a genomic scale. Our results suggest that convergent evolution acts on young duplicate genes after the species differentiation and adaptive evolution may play an important role in duplicate genes for adaptation to ecological factors and environmental changes in Drosophila. PMID:23945147

  7. Evolutionary analysis of multidrug resistance genes in fungi - impact of gene duplication and family conservation.

    PubMed

    Gossani, Cristiani; Bellieny-Rabelo, Daniel; Venancio, Thiago M

    2014-11-01

    Although the emergence of bacterial drug resistance is of great concern to the scientific community, few studies have evaluated this phenomenon systematically in fungi by using genome-wide datasets. In the present study, we assembled a large compendium of Saccharomyces cerevisiae chemical genetic data to study the evolution of multidrug resistance genes (MDRs) in the fungal lineage. We found that MDRs typically emerge in widely conserved families, most of which contain homologs from pathogenic fungi, such as Candida albicans and Coccidioides immitis, which could favor the evolution of drug resistance in those species. By integrating data from chemical genetics with protein family conservation, genetic and protein interactions, we found that gene families rarely have more than one MDR, indicating that paralogs evolve asymmetrically with regard to multidrug resistance roles. Furthermore, MDRs have more genetic and protein interaction partners than non-MDRs, supporting their participation in complex biochemical systems underlying the tolerance to multiple bioactive molecules. MDRs share more chemical genetic interactions with other MDRs than with non-MDRs, regardless of their evolutionary affinity. These results suggest the existence of an intricate system involved in the global drug tolerance phenotypes. Finally, MDRs are more likely to be hit repeatedly by mutations in laboratory evolution experiments, indicating that they have great adaptive potential. The results presented here not only reveal the main genomic features underlying the evolution of MDRs, but also shed light on the gene families from which drug resistance is more likely to emerge in fungi.

  8. Duplications and losses in gene families of rust pathogens highlight putative effectors

    Treesearch

    Amanda L. Pendleton; Katherine E. Smith; Nicolas Feau; Francis M. Martin; Igor V. Grigoriev; Richard Hamelin; C. Dana Nelson; J. Gordon Burleigh; John M. Davis

    2014-01-01

    Rust fungi are a group of fungal pathogens that cause some of the world’s most destructive diseases of trees and crops. A shared characteristic among rust fungi is obligate biotrophy, the inability to complete a lifecycle without a host. This dependence on a host species likely affects patterns of gene expansion, contraction, and innovation within rust pathogen...

  9. PR-1 gene family of grapevine: a uniquely duplicated PR-1 gene from a Vitis interspecific hybrid confers high level resistance to bacterial disease in transgenic tobacco.

    PubMed

    Li, Zhijian T; Dhekney, Sadanand A; Gray, Dennis J

    2011-01-01

    A functional contribution of pathogenesis-related 1 (PR-1) proteins to host defense has been established. However, systematic investigation of the PR-1 gene family in grapevine (Vitis spp.) has not been conducted previously. Through mining genomic databases, we identified 21 PR-1 genes from the Vitis vinifera genome. Polypeptides encoded by putative PR-1 genes had a signal sequence of about 25 residues and a mature protein of 10.9-29 kDa in size. PR-1 mature proteins contained a highly conserved six-cysteine motif and pI values ranging from 4.6 to 9. A major cluster with 14 PR-1 genes was mapped to a 280-kb region on chromosome 3. One particular PR-1 gene within the cluster encoding a basic-type isoform (pI 7.77), herein named VvPR1b1, was isolated from various genotypes of grapevine (Vitis spp.) for functional studies. Sequence analysis of PCR-amplified DNA revealed that all genotypes contained a single VvPR1b1 gene except for a broad-spectrum bacterial and fungal disease resistant Florida bunch grape hybrid, 'BN5-4', from which seven different homologues were identified. Duplication of VvPR1b1-related genes encoding acidic-type PR-1 isoforms was also observed among several genotypes. However, transgenic expression analysis of grapevine PR-1 genes under strong constitutive promoters in transgenic tobacco revealed that only the basic-type VvPR1b1 gene duplicated in 'BN5-4' was capable of conferring high level resistance to bacterial disease caused by Pseudomonas syringae pv. tabaci.

  10. The roles of gene duplication, gene conversion and positive selection in rodent Esp and Mup pheromone gene families with comparison to the Abp family.

    PubMed

    Karn, Robert C; Laukaitis, Christina M

    2012-01-01

    Three proteinaceous pheromone families, the androgen-binding proteins (ABPs), the exocrine-gland secreting peptides (ESPs) and the major urinary proteins (MUPs) are encoded by large gene families in the genomes of Mus musculus and Rattus norvegicus. We studied the evolutionary histories of the Mup and Esp genes and compared them with what is known about the Abp genes. Apparently gene conversion has played little if any role in the expansion of the mouse Class A and Class B Mup genes and pseudogenes, and the rat Mups. By contrast, we found evidence of extensive gene conversion in many Esp genes although not in all of them. Our studies of selection identified at least two amino acid sites in β-sheets as having evolved under positive selection in the mouse Class A and Class B MUPs and in rat MUPs. We show that selection may have acted on the ESPs by determining K(a)/K(s) for Exon 3 sequences with and without the converted sequence segment. While it appears that purifying selection acted on the ESP signal peptides, the secreted portions of the ESPs probably have undergone much more rapid evolution. When the inner gene converted fragment sequences were removed, eleven Esp paralogs were present in two or more pairs with K(a)/K(s) >1.0 and thus we propose that positive selection is detectable by this means in at least some mouse Esp paralogs. We compare and contrast the evolutionary histories of all three mouse pheromone gene families in light of their proposed functions in mouse communication.
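
    The K(a)/K(s) criterion mentioned above can be sketched in a few lines of Python. This is only an illustration of the conventional reading of the ratio (above 1 suggests positive selection, below 1 suggests purifying selection), not the authors' pipeline. The function names, the `tolerance` parameter, and the example values are assumptions introduced here, and K(a) and K(s) are taken as already estimated by some external method.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def classify_selection(ka: float, ks: float, tolerance: float = 0.0) -> str:
    """Label a paralog pair by its Ka/Ks ratio.

    Assumption: `tolerance` is a hypothetical band around 1.0 inside which
    the pair is reported as "neutral"; with the default of 0.0 only a ratio
    of exactly 1.0 falls in that class.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio is None:
        return "undefined"
    if ratio > 1.0 + tolerance:
        return "positive"
    if ratio < 1.0 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical paralog pairs as (Ka, Ks); not values from the study.
pairs = {"pair_1": (0.30, 0.10), "pair_2": (0.02, 0.10)}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in pairs.items()}
```

    A ratio above 1 is only suggestive on its own; in practice it is usually accompanied by a statistical test, which this sketch omits.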

  11. Independent gene duplications of the YidC/Oxa/Alb3 family enabled a specialized cotranslational function

    PubMed Central

    Funes, Soledad; Hasona, Adnan; Bauerschmitt, Heike; Grubbauer, Caroline; Kauff, Frank; Collins, Ryan; Crowley, Paula J.; Palmer, Sara R.; Brady, L. Jeannine; Herrmann, Johannes M.

    2009-01-01

    YidC/Oxa/Alb3 family proteins catalyze the insertion of integral membrane proteins in bacteria, mitochondria, and chloroplasts, respectively. Unlike gram-negative organisms, gram-positive bacteria express 2 paralogs of this family, YidC1/SpoIIIJ and YidC2/YgjG. In Streptococcus mutans, deletion of yidC2 results in a stress-sensitive phenotype similar to that of mutants lacking the signal recognition particle (SRP) protein translocation pathway, while deletion of yidC1 has a less severe phenotype. In contrast to eukaryotes and gram-negative bacteria, SRP-deficient mutants are viable in S. mutans; however, double SRP-yidC2 mutants are severely compromised. Thus, YidC2 may enable loss of the SRP by playing an independent but overlapping role in cotranslational protein insertion into the membrane. This is reminiscent of the situation in mitochondria that lack an SRP pathway and where Oxa1 facilitates cotranslational membrane protein insertion by binding directly to translation-active ribosomes. Here, we show that OXA1 complements a lack of yidC2 in S. mutans. YidC2 also functions reciprocally in oxa1-deficient Saccharomyces cerevisiae mutants and mediates the cotranslational insertion of mitochondrial translation products into the inner membrane. YidC2, like Oxa1, contains a positively charged C-terminal extension and associates with translating ribosomes. Our results are consistent with a gene-duplication event in gram-positive bacteria that enabled the specialization of a YidC isoform that mediates cotranslational activity independent of an SRP pathway. PMID:19366667

  12. Enzyme evolution beyond gene duplication

    PubMed Central

    Noda-García, Lianet; Barona-Gómez, Francisco

    2013-01-01

    Understanding the evolution of enzyme function after gene duplication has been a major goal of molecular biologists, biochemists and evolutionary biologists alike, for almost half a century. In contrast, the impact that horizontal gene transfer (HGT) has had on the evolution of enzyme specialization and the assembly of metabolic networks has just started to be investigated. Traditionally, evolutionary studies of enzymes have been limited to either the function of enzymes in vitro, or to sequence variability at the population level, where in almost all cases the starting conceptual framework embraces gene duplication as the mechanism responsible for the appearance of genetic redundancy. Very recently, we merged comparative phylogenomics, detection of selection signals, enzyme kinetics, X-ray crystallography and computational molecular dynamics, to characterize the sub-functionalization process of an amino acid biosynthetic enzyme prompted by an episode of HGT in bacteria. Some of the evolutionary implications of these functional studies, including a proposed model of enzyme specialization independent of gene duplication, are developed in this commentary. PMID:24251070

  13. Subcellular Relocalization and Positive Selection Play Key Roles in the Retention of Duplicate Genes of Populus Class III Peroxidase Family

    PubMed Central

    Ren, Lin-Ling; Liu, Yan-Jing; Liu, Hai-Jing; Qian, Ting-Ting; Qi, Li-Wang; Wang, Xiao-Ru; Zeng, Qing-Yin

    2014-01-01

    Gene duplication is the primary source of new genes and novel functions. Over the course of evolution, many duplicate genes lose their function and are eventually removed by deletion. However, some duplicates have persisted and evolved diverse functions. A particular challenge is to understand how this diversity arises and whether positive selection plays a role. In this study, we reconstructed the evolutionary history of the class III peroxidase (PRX) genes from the Populus trichocarpa genome. PRXs are plant-specific enzymes that play important roles in cell wall metabolism and in response to biotic and abiotic stresses. We found that two large tandem-arrayed clusters of PRXs evolved from an ancestral cell wall type PRX to vacuole type, followed by tandem duplications and subsequent functional specification. Substitution models identified seven positively selected sites in the vacuole PRXs. These positively selected sites showed significant effects on the biochemical functions of the enzymes. We also found that positive selection acts more frequently on residues adjacent to, rather than directly at, a critical active site of the enzyme, and on flexible regions rather than on rigid structural elements of the protein. Our study provides new insights into the adaptive molecular evolution of plant enzyme families. PMID:24934172

  14. Genomic evidence for adaptation by gene duplication.

    PubMed

    Qian, Wenfeng; Zhang, Jianzhi

    2014-08-01

    Gene duplication is widely believed to facilitate adaptation, but unambiguous evidence for this hypothesis has been found in only a small number of cases. Although gene duplication may increase the fitness of the involved organisms by doubling gene dosage or neofunctionalization, it may also result in a simple division of ancestral functions into daughter genes, which need not promote adaptation. Hence, the general validity of the adaptation by gene duplication hypothesis remains uncertain. Indeed, a genome-scale experiment found similar fitness effects of deleting pairs of duplicate genes and deleting individual singleton genes from the yeast genome, leading to the conclusion that duplication rarely results in adaptation. Here we contend that the above comparison is unfair because of a known duplication bias among genes with different fitness contributions. To rectify this problem, we compare homologous genes from the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. We discover that simultaneously deleting a duplicate gene pair in S. cerevisiae reduces fitness significantly more than deleting their singleton counterpart in S. pombe, revealing post-duplication adaptation. The duplicates-singleton difference in fitness effect is not attributable to a potential increase in gene dose after duplication, suggesting that the adaptation is owing to neofunctionalization, which we find to be explicable by acquisitions of binary protein-protein interactions rather than gene expression changes. These results provide genomic evidence for the role of gene duplication in organismal adaptation and are important for understanding the genetic mechanisms of evolutionary innovation.

  15. The reduced expression of endogenous duplications (REED) in the maize R gene family is mediated by DNA methylation.

    PubMed Central

    Ronchi, A; Petroni, K; Tonelli, C

    1995-01-01

    The duplicated R and Sn genes regulate the maize anthocyanin biosynthetic pathway and encode tissue-specific products that are homologous to helix-loop-helix transcriptional activators. As a consequence of their coupling in the genome, Sn is partially silenced. Genomic restriction analysis failed to reveal gross structural DNA alterations between the strong original phenotype and the weak derivatives. However, the differences in pigmentation were inversely correlated with differences in the methylation of the Sn promoter. Accordingly, treatment with 5-azacytidine (AZA), a demethylating agent, restored a strong pigmentation pattern that was transmitted to the progeny and that was correlated with differential expression of the Sn transcript. Genomic sequencing confirmed that methylation of the Sn promoter was more apparent in the less pigmented seedlings and was greatly reduced in the AZA revertants. In addition, some methylcytosines were located in non-symmetrical C sequences. These findings provide an insight into Sn and R interaction, a process that we have termed Reduced Expression of Endogenous Duplications (REED). We propose that increasing the copy number of regulatory genes by endogenous duplication leads to such epigenetic mechanisms of silencing. Further understanding of the REED process may have broader implications for gene regulation and may identify new levels of regulation within eukaryotic genomes. PMID:7489721

  16. Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms.

    PubMed

    Li, Zhen; Defoort, Jonas; Tasdighian, Setareh; Maere, Steven; Van de Peer, Yves; De Smet, Riet

    2016-02-01

    Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes.
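
    The three-way split described above (single-copy, multicopy, intermediate) can be pictured with a small Python sketch that labels a gene family from its per-species copy counts. This is not the classification procedure used in the study: the function name `classify_family`, the 0.8 threshold, and the species labels and counts are all hypothetical choices made for illustration.

```python
from typing import Dict


def classify_family(copies_by_species: Dict[str, int], threshold: float = 0.8) -> str:
    """Label a gene family from per-species copy numbers.

    Assumption: a family is "single-copy" when at least `threshold` of the
    species carrying it have exactly one copy, "multicopy" when at most
    (1 - threshold) of them do, and "intermediate" otherwise.
    """
    present = [n for n in copies_by_species.values() if n > 0]
    if not present:
        return "absent"

    single_fraction = sum(1 for n in present if n == 1) / len(present)
    if single_fraction >= threshold:
        return "single-copy"
    if single_fraction <= 1.0 - threshold:
        return "multicopy"
    return "intermediate"


# Hypothetical copy-number table: family -> {species: copies}.
families = {
    "fam_A": {"sp1": 1, "sp2": 1, "sp3": 1, "sp4": 1},
    "fam_B": {"sp1": 3, "sp2": 2, "sp3": 4, "sp4": 2},
    "fam_C": {"sp1": 1, "sp2": 2, "sp3": 1, "sp4": 2},
}
labels = {name: classify_family(counts) for name, counts in families.items()}
```

    A fixed threshold ignores the timing of duplication, which the abstract identifies as important for the intermediate class, so the sketch should be read as a simplification.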

  17. Duplications on human chromosome 22 reveal a novel Ret Finger Protein-like gene family with sense and endogenous antisense transcripts.

    PubMed

    Seroussi, E; Kedra, D; Pan, H Q; Peyrard, M; Schwartz, C; Scambler, P; Donnai, D; Roe, B A; Dumanski, J P

    1999-09-01

    Analysis of 600 kb of sequence encompassing the beta-prime adaptin (BAM22) gene on human chromosome 22 revealed intrachromosomal duplications within 22q12-13 resulting in three active RFPL genes, two RFPL pseudogenes, and two pseudogenes of BAM22. The genomic sequence of BAM22vartheta1 shows a remarkable similarity to that of BAM22. The cDNA sequence comparison of RFPL1, RFPL2, and RFPL3 showed 95%-96% identity between the genes, which were most similar to the Ret Finger Protein gene from human chromosome 6. The sense RFPL transcripts encode proteins with the tripartite structure, composed of RING finger, coiled-coil, and B30-2 domains, which are characteristic of the RING-B30 family. Each of these domains is thought to mediate protein-protein interactions by promoting homo- or heterodimerization. The MID1 gene on Xp22 is also a member of the RING-B30 family and is mutated in Opitz syndrome (OS). The autosomal dominant form of OS shows linkage to 22q11-q12. We detected a polymorphic protein-truncating allele of RFPL1 in 8% of the population, which was not associated with the OS phenotype. We identified 6-kb and 1.2-kb noncoding antisense mRNAs of RFPL1S and RFPL3S antisense genes, respectively. The RFPL1S and RFPL3S genes cover substantial portions of their sense counterparts, which suggests that the function of RFPL1S and RFPL3S is a post-transcriptional regulation of the sense RFPL genes. We illustrate the role of intrachromosomal duplications in the generation of RFPL genes, which were created by a series of duplications and share an ancestor with the RING-B30 domain containing genes from the major histocompatibility complex region on human chromosome 6.

  18. Expansion of banana (Musa acuminata) gene families involved in ethylene biosynthesis and signalling after lineage-specific whole-genome duplications.

    PubMed

    Jourda, Cyril; Cardi, Céline; Mbéguié-A-Mbéguié, Didier; Bocs, Stéphanie; Garsmeur, Olivier; D'Hont, Angélique; Yahiaoui, Nabila

    2014-05-01

    Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening. Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed. Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them. We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling. © 2014 CIRAD New Phytologist © 2014 New Phytologist Trust.

  19. Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms

    PubMed Central

    Li, Zhen; Van de Peer, Yves; De Smet, Riet

    2016-01-01

    Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of “gene duplicability” is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes. PMID:26744215
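
    The three-way split described in this abstract (primarily single-copy, primarily multicopy, and an intermediate class) can be illustrated with a minimal sketch. The function name, the thresholds, and the toy copy-number table below are all assumptions made for illustration; they are not the classification procedure or the data used in the study.

```python
from typing import Dict, List


def classify_family(copy_numbers: List[int],
                    single_max_frac_dup: float = 0.2,
                    multi_min_frac_dup: float = 0.8) -> str:
    """Label one gene family from its per-species copy numbers.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    if not copy_numbers:
        raise ValueError("need copy numbers for at least one species")
    # Fraction of species in which the family is present in more than one copy.
    frac_dup = sum(1 for n in copy_numbers if n > 1) / len(copy_numbers)
    if frac_dup <= single_max_frac_dup:
        return "single-copy"
    if frac_dup >= multi_min_frac_dup:
        return "multicopy"
    return "intermediate"


# Hypothetical toy table: family name -> copy number in each of five species.
families: Dict[str, List[int]] = {
    "famA": [1, 1, 1, 1, 1],
    "famB": [3, 2, 4, 2, 5],
    "famC": [1, 2, 1, 2, 1],
}

labels = {name: classify_family(counts) for name, counts in families.items()}
for name, label in labels.items():
    print(name, label)
```

    A fixed fraction-of-species threshold is only one possible design choice; it ignores the timing of duplication and genomic context, both of which the abstract names as relevant.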

  20. Identification of a novel duplication mutation in the VHL gene in a large Chinese family with Von Hippel-Lindau (VHL) syndrome.

    PubMed

    Cao, L H; Kuang, B H; Chen, C; Hu, C; Sun, Z; Chen, H; Wang, S S; Luo, Y

    2014-12-04

    Von Hippel-Lindau (VHL) syndrome is characterized by hemangioblastomas of the brain, spinal cord, and retina, renal cysts, clear cell renal cell carcinoma, and pheochromocytoma. VHL is caused by mutations in the VHL tumor suppressor gene. We attempted to detect mutation in the VHL gene in a 5-generation Chinese family with VHL. We identified a novel small duplication that altered the reading frame downstream and created a premature TGA stop signal, resulting in severely truncated pVHL30 (p.Gly114Serfs*50) and pVHL19 (p.Gly61Serfs*50). This change was predicted to be an elongin-binding domain deletion.

  1. Gene duplication in the evolution of sexual dimorphism.

    PubMed

    Wyman, Minyoung J; Cutter, Asher D; Rowe, Locke

    2012-05-01

    Males and females share most of the same genes, so selection in one sex will typically produce a correlated response in the other sex. Yet, the sexes have evolved to differ in a multitude of behavioral, morphological, and physiological traits. How did this sexual dimorphism evolve despite the presence of a common underlying genome? We investigated the potential role of gene duplication in the evolution of sexual dimorphism. Because duplication events provide extra genetic material, the sexes each might use this redundancy to facilitate sex-specific gene expression, permitting the evolution of dimorphism. We investigated this hypothesis at the genome-wide level in Drosophila melanogaster, using the presence of sex-biased expression as a proxy for the sex-specific specialization of gene function. We expected that if sexually antagonistic selection is a potent force acting upon individual genes, duplication will result in paralog families whose members differ in sex-biased expression. Gene members of the same duplicate family can have different expression patterns in males versus females. In particular, duplicate pairs containing a male-biased gene are found more frequently than expected, in agreement with previous studies. Furthermore, when the singleton ortholog is unbiased, duplication appears to allow one of the paralog copies to acquire male-biased expression. Conversely, female-biased expression is not common among duplicates; fewer duplicate genes are expressed in the female-soma and ovaries than in the male-soma and testes. Expression divergence exists more in older than in younger duplicate pairs, but expression divergence does not correlate with protein sequence divergence. Finally, genomic proximity may have an effect on whether paralogs differ in sex-biased expression. We conclude that the data are consistent with a role of gene duplication in fostering male-biased, but not female-biased, gene expression, thereby aiding the evolution of sexual dimorphism.

  2. Tandem duplication within a neurofibromatosis type 1 (NF1) gene exon in a family with features of Watson syndrome and Noonan syndrome.

    PubMed Central

    Tassabehji, M; Strachan, T; Sharland, M; Colley, A; Donnai, D; Harris, R; Thakker, N

    1993-01-01

    Type 1 neurofibromatosis (NF1), Watson syndrome (WS), and Noonan syndrome (NS) show some overlap in clinical manifestations. In addition, WS has been shown to be linked to markers flanking the NF1 locus and a deletion at the NF1 locus demonstrated in a WS patient. This suggests either that WS and NF1 are allelic or that phenotypes arise from mutations in very closely linked genes. Here we provide evidence for the former by demonstrating a mutation in the NF1 gene in a family with features of both WS and NS. The mutation is an almost perfect in-frame tandem duplication of 42 bases in exon 28 of the NF1 gene. Unlike the mutations previously described in classical NF1, which show a preponderance of null alleles, the mutation in this family would be expected to result in a mutant neurofibromin product. PMID:8317503

  3. Tandem duplication within a Neurofibromatosis type I (NFI) gene exon in a family with features of Watson syndrome and Noonan syndrome

    SciTech Connect

    Tassabehji, M.; Strachan, T.; Colley, A.; Donnai, D.; Harris, R.; Thakker, N.; Sharland, M.

    1993-07-01

    Type 1 neurofibromatosis (NF1), Watson syndrome (WS), and Noonan syndrome (NS) show some overlap in clinical manifestations. In addition, WS has been shown to be linked to markers flanking the NF1 locus and a deletion at the NF1 locus demonstrated in a WS patient. This suggests either that WS and NF1 are allelic or the phenotypes arise from mutations in very closely linked genes. Here the authors provide evidence for the former by demonstrating a mutation in the NF1 gene in a family with features of both WS and NS. The mutation is an almost perfect in-frame tandem duplication of 42 bases in exon 28 of the NF1 gene. Unlike the mutations previously described in classical NF1, which show a preponderance of null alleles, the mutation in this family would be expected to result in a mutant neurofibromin product. 31 refs., 2 figs.

  4. Altered patterns of gene duplication and differential gene gain and loss in fungal pathogens

    PubMed Central

    Powell, Amy J; Conant, Gavin C; Brown, Douglas E; Carbone, Ignazio; Dean, Ralph A

    2008-01-01

    Background: Duplication, followed by fixation or random loss of novel genes, contributes to genome evolution. Particular outcomes of duplication events are possibly associated with pathogenic life histories in fungi. To date, differential gene gain and loss have not been studied at genomic scales in fungal pathogens, despite this phenomenon's known importance in virulence in bacteria and viruses. Results: To determine if patterns of gene duplication differed between pathogens and non-pathogens, we identified gene families across nine euascomycete and two basidiomycete species. Gene family size distributions were fit to power laws to compare gene duplication trends in pathogens versus non-pathogens. Fungal phytopathogens showed globally altered patterns of gene duplication, as indicated by differences in gene family size distribution. We also identified sixteen examples of gene family expansion and five instances of gene family contraction in pathogenic lineages. Expanded gene families included those predicted to be important in melanin biosynthesis, host cell wall degradation and transport functions. Contracted families included those encoding genes involved in toxin production, genes with oxidoreductase activity, as well as subunits of the vacuolar ATPase complex. Surveys of the functional distribution of gene duplicates indicated that pathogens show enrichment for gene duplicates associated with receptor and hydrolase activities, while euascomycete pathogens appeared to have not only these differences, but also significantly more duplicates associated with regulatory and carbohydrate binding functions. Conclusion: Differences in the overall levels of gene duplication in phytopathogenic species versus non-pathogenic relatives implicate gene inventory flux as an important virulence-associated process in fungi. We hypothesize that the observed patterns of gene duplicate enrichment, gene family expansion and contraction reflect adaptation within pathogenic life

  5. Selection for Higher Gene Copy Number after Different Types of Plant Gene Duplications

    PubMed Central

    Hudson, Corey M.; Puckett, Emily E.; Bekaert, Michaël; Pires, J. Chris; Conant, Gavin C.

    2011-01-01

    The evolutionary origins of the multitude of duplicate genes in the plant genomes are still incompletely understood. To gain an appreciation of the potential selective forces acting on these duplicates, we phylogenetically inferred the set of metabolic gene families from 10 flowering plant (angiosperm) genomes. We then compared the metabolic fluxes for these families, predicted using the Arabidopsis thaliana and Sorghum bicolor metabolic networks, with the families' duplication propensities. For duplications produced by both small-scale duplication and whole-genome duplication, there is a significant association between the flux and the tendency to duplicate. Following this global analysis, we made a more fine-scale study of the selective constraints observed on plant sodium and phosphate transporters. We find that the different duplication mechanisms give rise to differing selective constraints. However, the exact nature of this pattern varies between the gene families, and we argue that the duplication mechanism alone does not define a duplicated gene's subsequent evolutionary trajectory. Collectively, our results argue for the interplay of history, function, and selection in shaping the duplicate gene evolution in plants. PMID:22056313

  6. Identification of an interstitial 18p11.32-p11.31 duplication including the EMILIN2 gene in a family with porokeratosis of Mibelli.

    PubMed

    Occella, Corrado; Bleidl, Dario; Nozza, Paolo; Mascelli, Samantha; Raso, Alessandro; Gimelli, Giorgio; Gimelli, Stefania; Tassano, Elisa

    2013-01-01

    Porokeratosis is a rare disease of epidermal keratinization characterized by the histopathological feature of the cornoid lamella, a column of tightly fitted parakeratocytic cells, whose etiology is still unclear. Porokeratosis of Mibelli is a subtype of porokeratosis presenting a single plaque or a small number of plaques of variable size located unilaterally on limbs. It frequently appears in childhood and occurs with a higher incidence in males. Cytogenetic analyses were performed in all members of the family on lesioned and uninvolved skin. An array-CGH analysis was also performed utilizing the Human Genome CGH Microarray Kit G3 400 with 5.3 kb overall median probe spacing. Gene expression analysis was performed on skin fibroblasts. In this study, we describe a healthy 4-year-old Caucasian child and his father showing features of porokeratosis of Mibelli. Array-CGH analysis revealed an interstitial 429.5 kb duplication of chromosome 18p11.32-p11.31 containing four genes, namely SMCHD1, EMILIN2, LPIN2, and MYOM1, in both the patient and his father. EMILIN2 was overexpressed in skin fibroblasts. Other members of this family, without evident signs of porokeratosis, also carried the same duplication. Among these genes, we focused our attention on the elastin microfibril interfacer 2 (EMILIN2) gene. Apoptosis plays a fundamental role in maintaining epidermal homeostasis, balancing keratinocyte proliferation, and forming the stratum corneum. EMILIN2 is known to trigger the apoptosis of different cell lines, negatively affecting cell survival. It is expressed in the skin. We could speculate that the duplication and overexpression of EMILIN2 cause an abnormal apoptosis of epidermal keratinocytes and alter the process of keratinization, even if other epigenetic and genetic factors could also be involved. Our results could contribute to a better understanding of the pathogenesis of porokeratosis of Mibelli.

  7. Identification of an Interstitial 18p11.32-p11.31 Duplication Including the EMILIN2 Gene in a Family with Porokeratosis of Mibelli

    PubMed Central

    2013-01-01

    Porokeratosis is a rare disease of epidermal keratinization characterized by the histopathological feature of the cornoid lamella, a column of tightly fitted parakeratocytic cells, whose etiology is still unclear. Porokeratosis of Mibelli is a subtype of porokeratosis presenting a single plaque or a small number of plaques of variable size located unilaterally on limbs. It frequently appears in childhood and occurs with a higher incidence in males. Cytogenetic analyses were performed in all members of the family on lesioned and uninvolved skin. An array-CGH analysis was also performed utilizing the Human Genome CGH Microarray Kit G3 400 with 5.3 kb overall median probe spacing. Gene expression analysis was performed on skin fibroblasts. In this study, we describe a healthy 4-year-old Caucasian child and his father showing features of porokeratosis of Mibelli. Array-CGH analysis revealed an interstitial 429.5 kb duplication of chromosome 18p11.32-p11.31 containing four genes, namely SMCHD1, EMILIN2, LPIN2, and MYOM1, in both the patient and his father. EMILIN2 was overexpressed in skin fibroblasts. Other members of this family, without evident signs of porokeratosis, also carried the same duplication. Among these genes, we focused our attention on the elastin microfibril interfacer 2 (EMILIN2) gene. Apoptosis plays a fundamental role in maintaining epidermal homeostasis, balancing keratinocyte proliferation, and forming the stratum corneum. EMILIN2 is known to trigger the apoptosis of different cell lines, negatively affecting cell survival. It is expressed in the skin. We could speculate that the duplication and overexpression of EMILIN2 cause an abnormal apoptosis of epidermal keratinocytes and alter the process of keratinization, even if other epigenetic and genetic factors could also be involved. Our results could contribute to a better understanding of the pathogenesis of porokeratosis of Mibelli. PMID:23593459

  8. Clinical characterization and identification of duplication breakpoints in a Japanese family with Xq28 duplication syndrome including MECP2.

    PubMed

    Fukushi, Daisuke; Yamada, Kenichiro; Nomura, Noriko; Naiki, Misako; Kimura, Reiko; Yamada, Yasukazu; Kumagai, Toshiyuki; Yamaguchi, Kumiko; Miyake, Yoshishige; Wakamatsu, Nobuaki

    2014-04-01

    Xq28 duplication syndrome including MECP2 is a neurodevelopmental disorder characterized by axial hypotonia at infancy, severe intellectual disability, developmental delay, mild characteristic facial appearance, epilepsy, regression, and recurrent infections in males. We identified a Japanese family of Xq28 duplications, in which the patients presented with cerebellar ataxia, severe constipation, and small feet, in addition to the common clinical features. The 488-kb duplication spanned from L1CAM to EMD and contained 17 genes, two pseudo genes, and three microRNA-coding genes. FISH and nucleotide sequence analyses demonstrated that the duplication was tandem and in a forward orientation, and the duplication breakpoints were located in AluSc at the EMD side, with a 32-bp deletion, and LTR50 at the L1CAM side, with "tc" and "gc" microhomologies at the duplication breakpoints, respectively. The duplicated segment was completely segregated from the grandmother to the patients. These results suggest that the duplication was generated by fork-stalling and template-switching at the AluSc and LTR50 sites. This is the first report to determine the size and nucleotide sequences of the duplicated segments at Xq28 of three generations of a family and provides the genotype-phenotype correlation of the patients harboring the specific duplicated segment.

  9. Sequencing of Pax6 Loci from the Elephant Shark Reveals a Family of Pax6 Genes in Vertebrate Genomes, Forged by Ancient Duplications and Divergences

    PubMed Central

    Gautier, Philippe; Loosli, Felix; Tay, Boon-Hui; Tay, Alice; Murdoch, Emma; Coutinho, Pedro; van Heyningen, Veronica; Brenner, Sydney; Venkatesh, Byrappa; Kleinjan, Dirk A.

    2013-01-01

    family of Pax6 genes, forged by ancient duplication events and by independent, lineage-specific gene losses. PMID:23359656

  10. Duplicated gelsolin family genes in zebrafish: a novel scinderin-like gene (scinla) encodes the major corneal crystallin.

    PubMed

    Jia, Sujuan; Omelchenko, Marina; Garland, Donita; Vasiliou, Vasilis; Kanungo, Jyotshnabala; Spencer, Michael; Wolf, Yuri; Koonin, Eugene; Piatigorsky, Joram

    2007-10-01

    We have previously identified a gelsolin-like protein (C/L-gelsolin) as a corneal crystallin in zebrafish. Here we show by phylogenetic analysis that there are at least six genes encoding gelsolin-like proteins based on their gelsolin domains in zebrafish: gsna and gsnb group with the vertebrate gelsolin gene, scina and scinb group with the scinderin (adseverin) gene, and scinla (C/L-gelsolin) and scinlb are novel scinderin-like genes. RT-PCR showed that scinla, scinlb, and gsnb are preferentially expressed in the adult cornea whereas gsna is expressed to a similar extent in cornea, lens, brain, and heart; scina and scinb expression were detectable only in whole zebrafish and not in these adult tissues. Quantitative RT-PCR and 2-dimensional polyacrylamide gel electrophoresis followed by MALDI/TOF mass spectroscopy confirmed high expression of beta-actin and scinla, moderate expression of scinlb, and very low expression of gsna and gsnb in the cornea. Finally, transgenic zebrafish carrying a green fluorescent protein reporter transgene driven by a 4 kb scinla promoter fragment showed expression in the cornea, snout, dorsal fin, and tail fin of 3-day-old zebrafish larvae. Our data suggest that scinla and scinlb are diverged paralogs of the vertebrate scinderin gene and show that scinla encodes the zebrafish corneal crystallin previously called C/L-gelsolin.

  11. Mechanisms of Gene Duplication and Amplification

    PubMed Central

    Reams, Andrew B.; Roth, John R.

    2015-01-01

    Changes in gene copy number are among the most frequent mutational events in all genomes and were among the mutations for which a physical basis was first known. Yet mechanisms of gene duplication remain uncertain because formation rates are difficult to measure and mechanisms may vary with position in a genome. Duplications are compared here to deletions, which seem formally similar but can arise at very different rates by distinct mechanisms. Methods of assessing duplication rates and dependencies are described with several proposed formation mechanisms. Emphasis is placed on duplications formed in extensively studied experimental situations. Duplications studied in microbes are compared with those observed in metazoan cells, specifically those in genomes of cancer cells. Duplications, and especially their derived amplifications, are suggested to form by multistep processes often under positive selection for increased copy number. PMID:25646380

  12. Duplication of the EFNB1 Gene in Familial Hypertelorism: Imbalance in Ephrin-B1 Expression and Abnormal Phenotypes in Humans and Mice

    PubMed Central

    Babbs, Christian; Stewart, Helen S; Williams, Louise J; Connell, Lyndsey; Goriely, Anne; Twigg, Stephen RF; Smith, Kim; Lester, Tracy; Wilkie, Andrew OM

    2011-01-01

    Familial hypertelorism, characterized by widely spaced eyes, classically shows autosomal dominant inheritance (Teebi type), but some pedigrees are compatible with X-linkage. No mechanism has been described previously, but clinical similarity has been noted to craniofrontonasal syndrome (CFNS), which is caused by mutations in the X-linked EFNB1 gene. Here we report a family in which females in three generations presented with hypertelorism, but lacked either craniosynostosis or a grooved nasal tip, excluding CFNS. DNA sequencing of EFNB1 was normal, but further analysis revealed a duplication of 937 kb including EFNB1 and two flanking genes: PJA1 and STARD8. We found that the X chromosome bearing the duplication produces ∼1.6-fold more EFNB1 transcript than the normal X chromosome and propose that, in the context of X-inactivation, this difference in expression level of EFNB1 results in abnormal cell sorting leading to hypertelorism. To support this hypothesis, we provide evidence from a mouse model carrying a targeted human EFNB1 cDNA, that abnormal cell sorting occurs in the cranial region. Hence, we propose that X-linked cases resembling Teebi hypertelorism may have a similar mechanism to CFNS, and that cellular mosaicism for different levels of ephrin-B1 (as well as simple presence/absence) leads to craniofacial abnormalities. Hum Mutat 32:1–9, 2011. © 2011 Wiley-Liss, Inc. PMID:21542058

  13. Duplicated genes evolve independently in allopolyploid cotton.

    Treesearch

    Richard C. Cronn; Randall L. Small; Jonathan F. Wendel

    1999-01-01

    Of the many processes that generate gene duplications, polyploidy is unique in that entire genomes are duplicated. This process has been important in the evolution of many eukaryotic groups, and it occurs with high frequency in plants. Recent evidence suggests that polyploidization may be accompanied by rapid genomic changes, but the evolutionary fate of discrete loci...

  14. Divergent evolutionary fates of major photosynthetic gene networks following gene and whole genome duplications.

    PubMed

    Coate, Jeremy E; Doyle, Jeff J

    2011-04-01

    Gene and genome duplication are recurring processes in flowering plants, and elucidating the mechanisms by which duplicated genes are lost or deployed is a key component of understanding plant evolution. Using gene ontologies (GO) or protein family (PFAM) domains, distinct patterns of duplicate retention and loss have been identified depending on gene functional properties and duplication mechanism, but little is known about how gene networks encoding interacting proteins (protein complexes or signaling cascades) evolve in response to duplication. We examined patterns of duplicate retention within four major gene networks involved in photosynthesis (the Calvin cycle, photosystem I, photosystem II, and the light harvesting complex) across three species and four whole genome duplications, as well as small-scale duplications, and showed that photosystem gene family evolution is governed largely by dosage sensitivity. In contrast, Calvin cycle gene families are not dosage sensitive, but exhibit a greater capacity for functional differentiation. Here we review these findings, highlight how this study, by analyzing defined gene networks, is complementary to global studies using functional annotations such as GO and PFAM, and elaborate on one example of functional differentiation in the Calvin cycle gene family, transketolase.

  15. Genome-wide identification and comparative expression analysis reveal a rapid expansion and functional divergence of duplicated genes in the WRKY gene family of cabbage, Brassica oleracea var. capitata.

    PubMed

    Yao, Qiu-Yang; Xia, En-Hua; Liu, Fei-Hu; Gao, Li-Zhi

    2015-02-15

    WRKY transcription factors (TFs), one of the ten largest TF families in higher plants, play important roles in regulating plant development and resistance. To date, little is known about the WRKY TF family in Brassica oleracea. Recently, the completed genome sequence of cabbage (B. oleracea var. capitata) allows us to systematically analyze WRKY genes in this species. A total of 148 WRKY genes were characterized and classified into seven subgroups that belong to three major groups. Phylogenetic and synteny analyses revealed that the repertoire of cabbage WRKY genes was derived from a common ancestor shared with Arabidopsis thaliana. The B. oleracea WRKY genes were found to be preferentially retained after the whole-genome triplication (WGT) event in its recent ancestor, suggesting that the WGT event had largely contributed to a rapid expansion of the WRKY gene family in B. oleracea. The analysis of RNA-Seq data from various tissues (i.e., roots, stems, leaves, buds, flowers and siliques) revealed that most of the identified WRKY genes were positively expressed in cabbage, and a large portion of them exhibited patterns of differential and tissue-specific expression, demonstrating that these gene members might play essential roles in plant developmental processes. Comparative analysis of the expression level among duplicated genes showed that gene expression divergence was evidently presented among cabbage WRKY paralogs, indicating functional divergence of these duplicated WRKY genes.

  16. Evolution of Gene Duplication in Plants

    PubMed Central

    2016-01-01

    Ancient duplication events and a high rate of retention of extant pairs of duplicate genes have contributed to an abundance of duplicate genes in plant genomes. These duplicates have contributed to the evolution of novel functions, such as the production of floral structures, induction of disease resistance, and adaptation to stress. Additionally, recent whole-genome duplications that have occurred in the lineages of several domesticated crop species, including wheat (Triticum aestivum), cotton (Gossypium hirsutum), and soybean (Glycine max), have contributed to important agronomic traits, such as grain quality, fruit shape, and flowering time. Therefore, understanding the mechanisms and impacts of gene duplication will be important to future studies of plants in general and of agronomically important crops in particular. In this review, we survey the current knowledge about gene duplication, including gene duplication mechanisms, the potential fates of duplicate genes, models explaining duplicate gene retention, the properties that distinguish duplicate from singleton genes, and the evolutionary impact of gene duplication. PMID:27288366

  17. Opsin gene duplication and divergence in ray-finned fish.

    PubMed

    Rennison, Diana J; Owens, Gregory L; Taylor, John S

    2012-03-01

    Opsin gene sequences were first reported in the 1980s. The goal of that research was to test the hypothesis that human opsins were members of a single gene family and that variation in human color vision was mediated by mutations in these genes. While the new data supported both hypotheses, the greatest contribution of this work was, arguably, that it provided the data necessary for PCR-based surveys in a diversity of other species. Such studies, and recent whole genome sequencing projects, have uncovered exceptionally large opsin gene repertoires in ray-finned fishes (taxon, Actinopterygii). Guppies and zebrafish, for example, have 10 visual opsin genes each. Here we review the duplication and divergence events that have generated these gene collections. Phylogenetic analyses revealed that large opsin gene repertoires in fish have been generated by gene duplication and divergence events that span the age of the ray-finned fishes. Data from whole genome sequencing projects and from large-insert clones show that tandem duplication is the primary mode of opsin gene family expansion in fishes. In some instances gene conversion between tandem duplicates has obscured evolutionary relationships among genes and generated unique key-site haplotypes. We mapped amino acid substitutions at so-called key-sites onto phylogenies and this exposed many examples of convergence. We found that dN/dS values were higher on the branches of our trees that followed gene duplication than on branches that followed speciation events, suggesting that duplication relaxes constraints on opsin sequence evolution. Though the focus of the review is opsin sequence evolution, we also note that there are few clear connections between opsin gene repertoires and variation in spectral environment, morphological traits, or life history traits.
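
    The comparison of dN/dS between branches that follow duplication and branches that follow speciation can be sketched as a simple two-group test. The example below is a minimal illustration only: the abstract does not say which statistical test the authors used, so an exact one-sided permutation test on the difference in means is swapped in here as an assumption, and the dN/dS values are invented toy numbers, not data from the review.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """One-sided exact permutation test for mean(group_a) > mean(group_b).

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts the splits whose mean difference is at least
    as large as the observed one.
    """
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    total = 0
    at_least_as_large = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if mean(a) - mean(b) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Hypothetical per-branch dN/dS estimates (toy values for illustration).
post_duplication = [0.42, 0.35, 0.51, 0.38]
post_speciation = [0.12, 0.20, 0.15, 0.18]

p_value = exact_permutation_pvalue(post_duplication, post_speciation)
print(f"one-sided permutation p-value: {p_value:.4f}")
```

    Exhaustive enumeration is only practical for small numbers of branches; for larger trees a random subsample of permutations would be the usual substitute.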

  18. Duplication and maintenance of the Myb genes of vertebrate animals.

    PubMed

    Davidson, Colin J; Guthrie, Erin E; Lipsick, Joseph S

    2013-02-15

    Gene duplication is an important means of generating new genes. The major mechanisms by which duplicated genes are preserved in the face of purifying selection are thought to be neofunctionalization, subfunctionalization, and increased gene dosage. However, very few duplicated gene families in vertebrate species have been analyzed by functional tests in vivo. We have therefore examined the three vertebrate Myb genes (c-Myb, A-Myb, and B-Myb) by cytogenetic map analysis, by sequence analysis, and by ectopic expression in Drosophila. We provide evidence that the vertebrate Myb genes arose by two rounds of regional genomic duplication. We found that ubiquitous expression of c-Myb and A-Myb, but not of B-Myb or Drosophila Myb, was lethal in Drosophila. Expression of any of these genes during early larval eye development was well tolerated. However, expression of c-Myb and A-Myb, but not of B-Myb or Drosophila Myb, during late larval eye development caused drastic alterations in adult eye morphology. Mosaic analysis implied that this eye phenotype was cell-autonomous. Interestingly, some of the eye phenotypes caused by the retroviral v-Myb oncogene and the normal c-Myb proto-oncogene from which v-Myb arose were quite distinct. Finally, we found that post-translational modifications of c-Myb by the GSK-3 protein kinase and by the Ubc9 SUMO-conjugating enzyme that normally occur in vertebrate cells can modify the eye phenotype caused by c-Myb in Drosophila. These results support a model in which the three Myb genes of vertebrates arose by two sequential duplications. The first duplication was followed by a subfunctionalization of gene expression, then neofunctionalization of protein function to yield a c/A-Myb progenitor. The duplication of this progenitor was followed by subfunctionalization of gene expression to give rise to tissue-specific c-Myb and A-Myb genes.

  19. Evolution of the duplicated intracellular lipid-binding protein genes of teleost fishes.

    PubMed

    Venkatachalam, Ananda B; Parmar, Manoj B; Wright, Jonathan M

    2017-08-01

    Increasing organismal complexity during the evolution of life has been attributed to the duplication of genes and entire genomes. More recently, theoretical models have been proposed that postulate the fate of duplicated genes, among them the duplication-degeneration-complementation (DDC) model. In the DDC model, the common fate of a duplicated gene is loss from the genome owing to nonfunctionalization. Duplicated genes are retained in the genome either by subfunctionalization, where the functions of the ancestral gene are sub-divided between the sister duplicate genes, or by neofunctionalization, where one of the duplicate genes acquires a new function. Both processes occur either by loss or gain of regulatory elements in the promoters of duplicated genes. Here, we review the genomic organization, evolution, and transcriptional regulation of the multigene family of intracellular lipid-binding protein (iLBP) genes from teleost fishes. Teleost fishes possess many copies of iLBP genes owing to a whole genome duplication (WGD) early in the teleost fish radiation. Moreover, the retention of duplicated iLBP genes is substantially higher than the retention of all other genes duplicated in the teleost genome. The fatty acid-binding protein genes, a subfamily of the iLBP multigene family in zebrafish, are differentially regulated by peroxisome proliferator-activated receptor (PPAR) isoforms, which may account for the retention of iLBP genes in the zebrafish genome by the process of subfunctionalization of cis-acting regulatory elements in iLBP gene promoters.

  20. Gene duplication and the properties of biological networks.

    PubMed

    Hughes, Austin L; Friedman, Robert

    2005-12-01

    Patterns of network connection of members of multigene families were examined for two biological networks: a genetic network from the yeast Saccharomyces cerevisiae and a protein-protein interaction network from Caenorhabditis elegans. In both networks, genes belonging to gene families represented by a single member in the genome ("singletons") were disproportionately represented among the nodes having large numbers of connections. Of 68 single-member yeast families with 25 or more network connections, 28 (44.4%) were located in duplicated genomic segments believed to have originated from an ancient polyploidization event; thus, each of these 28 loci was presumably duplicated along with the genomic segment to which it belongs, but one of the two duplicates has subsequently been deleted. Nodes connected to major "hubs" with a large number of connections tended to be relatively sparsely interconnected among themselves. Furthermore, duplicated genes, even those arising from recent duplication, rarely shared many network connections, suggesting that network connections are remarkably labile over evolutionary time. These factors serve to explain well-known general properties of biological networks, including their scale-free and modular nature.

  1. Pervasive and Persistent Redundancy among Duplicated Genes in Yeast

    PubMed Central

    Dean, E. Jedediah; Davis, Jerel C.; Davis, Ronald W.; Petrov, Dmitri A.

    2008-01-01

    The loss of functional redundancy is the key process in the evolution of duplicated genes. Here we systematically assess the extent of functional redundancy among a large set of duplicated genes in Saccharomyces cerevisiae. We quantify growth rate in rich medium for a large number of S. cerevisiae strains that carry single and double deletions of duplicated and singleton genes. We demonstrate that duplicated genes can maintain substantial redundancy for extensive periods of time following duplication (∼100 million years). We find high levels of redundancy among genes duplicated both via the whole genome duplication and via smaller scale duplications. Further, we see no evidence that two duplicated genes together contribute to fitness in rich medium substantially beyond that of their ancestral progenitor gene. We argue that duplicate genes do not often evolve to behave like singleton genes even after very long periods of time. PMID:18604285

  2. Drosophila melanogaster metallothionein genes: Selection for duplications

    SciTech Connect

    Lange, B.W.

    1989-01-01

    The metallothionein genes of Drosophila melanogaster, Mtn and Mto, may play an important role in heavy-metal detoxification. In order to investigate the possibility of increased selection for duplications of these genes in natural populations exposed to high levels of heavy metals, I compared the frequencies of such duplications among flies collected from metal-contaminated and non-contaminated orchards in Pennsylvania, Tennessee, and Georgia. Contamination of collection sites and of local flies was confirmed by atomic absorption spectrophotometry. Six-nucleotide-recognizing restriction enzyme analysis was used to screen 1666 wild third chromosomes for Mtn duplications. A subset (327) of these lines was screened for Mto duplications: none were found. Cadmium tolerance tests performed on F2 progeny of wild females failed to detect a difference in tolerance levels between flies from contaminated orchards and flies from control orchards. Estimates of sequence diversity among a subsample (92) of the chromosomes used in the duplication survey, including all 27 Mtn duplication chromosomes, were obtained using four-nucleotide-recognizing restriction enzyme analysis.

  3. A Family Harboring CMT1A Duplication and HNPP Deletion.

    PubMed

    Lee, Jung Hwa; Kang, Hee Jin; Song, Hyunseok; Hwang, Su Jin; Cho, Sun-Young; Kim, Sang-Beom; Kim, Joonki; Chung, Ki Wha; Choi, Byung-Ok

    2007-06-01

    Charcot-Marie-Tooth disease type 1A (CMT1A) is associated with duplication of chromosome 17p11.2-p12, whereas hereditary neuropathy with liability to pressure palsies (HNPP), which is an autosomal dominant neuropathy showing characteristics of recurrent pressure palsies, is associated with 17p11.2-p12 deletion. An altered gene dosage of PMP22 is believed to be the main cause underlying the CMT1A and HNPP phenotypes. Although CMT1A and HNPP are associated with the same locus, there has been no report of these two mutations within a single family. We report a rare family harboring CMT1A duplication and HNPP deletion.

  4. Genome duplication and gene loss affect the evolution of heat shock transcription factor genes in legumes.

    PubMed

    Lin, Yongxiang; Cheng, Ying; Jin, Jing; Jin, Xiaolei; Jiang, Haiyang; Yan, Hanwei; Cheng, Beijiu

    2014-01-01

    Whole-genome duplication events (polyploidy events) and gene loss events have played important roles in the evolution of legumes. Here we show that the vast majority of Hsf gene duplications resulted from whole genome duplication events rather than tandem duplication, and significant differences in gene retention exist between species. By searching for intraspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that genome duplications accounted for 42 of 46 Hsf-containing segments in Glycine max, while paired segments were rarely identified in Lotus japonicus, Medicago truncatula and Cajanus cajan. However, by comparing interspecies microsynteny, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions shared ancient genome duplication with Hsf genes in Glycine max, but more than half of the copies of these genes were lost. On the other hand, the Glycine max Hsf gene family retained approximately 75% and 84% of duplicated genes produced from the ancient genome duplication and recent Glycine-specific genome duplication, respectively. Continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max. Expression analysis of the Hsf genes in Lotus japonicus revealed their putative involvement in multiple tissue-/developmental stages and responses to various abiotic stimuli. This study traces the evolution of Hsf genes in legume species and demonstrates that the rates of gene gain and loss are far from equilibrium in different species.

  5. Genome Duplication and Gene Loss Affect the Evolution of Heat Shock Transcription Factor Genes in Legumes

    PubMed Central

    Jin, Jing; Jin, Xiaolei; Jiang, Haiyang; Yan, Hanwei; Cheng, Beijiu

    2014-01-01

    Whole-genome duplication events (polyploidy events) and gene loss events have played important roles in the evolution of legumes. Here we show that the vast majority of Hsf gene duplications resulted from whole genome duplication events rather than tandem duplication, and significant differences in gene retention exist between species. By searching for intraspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that genome duplications accounted for 42 of 46 Hsf-containing segments in Glycine max, while paired segments were rarely identified in Lotus japonicus, Medicago truncatula and Cajanus cajan. However, by comparing interspecies microsynteny, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions shared ancient genome duplication with Hsf genes in Glycine max, but more than half of the copies of these genes were lost. On the other hand, the Glycine max Hsf gene family retained approximately 75% and 84% of duplicated genes produced from the ancient genome duplication and recent Glycine-specific genome duplication, respectively. Continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max. Expression analysis of the Hsf genes in Lotus japonicus revealed their putative involvement in multiple tissue-/developmental stages and responses to various abiotic stimuli. This study traces the evolution of Hsf genes in legume species and demonstrates that the rates of gene gain and loss are far from equilibrium in different species. PMID:25047803

  6. Whole ARX gene duplication is compatible with normal intellectual development.

    PubMed

    Popovici, Cornel; Busa, Tiffany; Boute, Odile; Thuresson, Ann-Charlotte; Perret, Odile; Sigaudy, Sabine; Södergren, Tommy; Andrieux, Joris; Moncla, Anne; Philip, Nicole

    2014-09-01

    We report here on four males from three families carrying de novo or inherited small Xp22.13 duplications including the ARX gene detected by chromosomal microarray analysis (CMA). Two of these males had normal intelligence. Our report suggests that, unlike other XLMR genes like MECP2 and FMR1, the presence of an extra copy of the ARX gene may not be sufficient to perturb its developmental functions. ARX duplication does not inevitably have detrimental effects on brain development, in contrast with the effects of ARX haploinsufficiency. The abnormal phenotype ascribed to the presence of an extra copy in some male patients may have resulted from the effect of another, not yet identified, chromosomal or molecular anomaly, alone or in association with ARX duplication. © 2014 Wiley Periodicals, Inc.

  7. Inferring angiosperm phylogeny from EST data with widespread gene duplication.

    PubMed

    Sanderson, Michael J; McMahon, Michelle M

    2007-02-08

    Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants. A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead. Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.

  8. Inferring angiosperm phylogeny from EST data with widespread gene duplication

    PubMed Central

    Sanderson, Michael J; McMahon, Michelle M

    2007-01-01

    Background: Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants. Results: A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead. Conclusion: Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication. PMID:17288576

  9. Tempo and mode of gene duplication in mammalian ribosomal protein evolution.

    PubMed

    Dharia, Asav P; Obla, Ajay; Gajdosik, Matthew D; Simon, Amanda; Nelson, Craig E

    2014-01-01

    Gene duplication has been widely recognized as a major driver of evolutionary change and organismal complexity through the generation of multi-gene families. Therefore, understanding the forces that govern the evolution of gene families through the retention or loss of duplicated genes is fundamentally important in our efforts to study genome evolution. Previous work from our lab has shown that ribosomal protein (RP) genes constitute one of the largest classes of conserved duplicated genes in mammals. This result was surprising due to the fact that ribosomal protein genes evolve slowly and transcript levels are very tightly regulated. In our present study, we identified and characterized all RP duplicates in eight mammalian genomes in order to investigate the tempo and mode of ribosomal protein family evolution. We show that a sizable number of duplicates are transcriptionally active and are very highly conserved. Furthermore, we conclude that existing gene duplication models do not readily account for the preservation of a very large number of intact retroduplicated ribosomal protein (RT-RP) genes observed in mammalian genomes. We suggest that selection against dominant-negative mutations may underlie the unexpected retention and conservation of duplicated RP genes, and may shape the fate of newly duplicated genes, regardless of duplication mechanism.

  10. Tempo and Mode of Gene Duplication in Mammalian Ribosomal Protein Evolution

    PubMed Central

    Gajdosik, Matthew D.; Simon, Amanda; Nelson, Craig E.

    2014-01-01

    Gene duplication has been widely recognized as a major driver of evolutionary change and organismal complexity through the generation of multi-gene families. Therefore, understanding the forces that govern the evolution of gene families through the retention or loss of duplicated genes is fundamentally important in our efforts to study genome evolution. Previous work from our lab has shown that ribosomal protein (RP) genes constitute one of the largest classes of conserved duplicated genes in mammals. This result was surprising due to the fact that ribosomal protein genes evolve slowly and transcript levels are very tightly regulated. In our present study, we identified and characterized all RP duplicates in eight mammalian genomes in order to investigate the tempo and mode of ribosomal protein family evolution. We show that a sizable number of duplicates are transcriptionally active and are very highly conserved. Furthermore, we conclude that existing gene duplication models do not readily account for the preservation of a very large number of intact retroduplicated ribosomal protein (RT-RP) genes observed in mammalian genomes. We suggest that selection against dominant-negative mutations may underlie the unexpected retention and conservation of duplicated RP genes, and may shape the fate of newly duplicated genes, regardless of duplication mechanism. PMID:25369106

  11. Horizontal Transfer, Not Duplication, Drives the Expansion of Protein Families in Prokaryotes

    PubMed Central

    Treangen, Todd J.; Rocha, Eduardo P. C.

    2011-01-01

    Gene duplication followed by neo- or sub-functionalization deeply impacts the evolution of protein families and is regarded as the main source of adaptive functional novelty in eukaryotes. While there is ample evidence of adaptive gene duplication in prokaryotes, it is not clear whether duplication outweighs the contribution of horizontal gene transfer in the expansion of protein families. We analyzed closely related prokaryote strains or species with small genomes (Helicobacter, Neisseria, Streptococcus, Sulfolobus), average-sized genomes (Bacillus, Enterobacteriaceae), and large genomes (Pseudomonas, Bradyrhizobiaceae) to untangle the effects of duplication and horizontal transfer. After removing the effects of transposable elements and phages, we show that the vast majority of expansions of protein families are due to transfer, even among large genomes. Transferred genes—xenologs—persist longer in prokaryotic lineages possibly due to a higher/longer adaptive role. On the other hand, duplicated genes—paralogs—are expressed more, and, when persistent, they evolve slower. This suggests that gene transfer and gene duplication have very different roles in shaping the evolution of biological systems: transfer allows the acquisition of new functions and duplication leads to higher gene dosage. Accordingly, we show that paralogs share most protein–protein interactions and genetic regulators, whereas xenologs share very few of them. Prokaryotes invented most of life's biochemical diversity. Therefore, the study of the evolution of biological systems should explicitly account for the predominant role of horizontal gene transfer in the diversification of protein families. PMID:21298028

  12. Neuropeptide Y-family peptides and receptors in the elephant shark, Callorhinchus milii confirm gene duplications before the gnathostome radiation.

    PubMed

    Larsson, Tomas A; Tay, Boon-Hui; Sundström, Görel; Fredriksson, Robert; Brenner, Sydney; Larhammar, Dan; Venkatesh, Byrappa

    2009-03-01

    We describe here the repertoire of neuropeptide Y (NPY) peptides and receptors in the elephant shark Callorhinchus milii, belonging to the chondrichthyans that diverged from the rest of the gnathostome (jawed vertebrate) lineage about 450 million years ago and the first chondrichthyan with a genome project. We have identified two peptide genes that are orthologous to NPY and PYY (peptide YY) in other vertebrates, and seven receptor genes orthologous to the Y1, Y2, Y4, Y5, Y6, Y7 and Y8 subtypes found in tetrapods and teleost fishes. The repertoire of peptides and receptors seems to reflect the ancestral configuration in the predecessor of all gnathostomes, whereas other lineages such as mammals and teleosts have lost one or more receptor genes or have acquired 1-2 additional peptide genes. Both the peptides and receptors showed broad and overlapping mRNA expression which may explain why some receptor gene losses could take place in some lineages, but leaves open the question why all the known ancestral receptors have been retained in the elephant shark.

  13. Phylogenetics of lophotrochozoan bHLH genes and the evolution of lineage-specific gene duplicates.

    PubMed

    Bao, Yongbo; Xu, Fei; Shimeld, Sebastian M

    2017-03-11

    The gain and loss of genes encoding transcription factors is of importance to understanding the evolution of gene regulatory complexity. The basic helix-loop-helix (bHLH) genes encode a large superfamily of transcription factors. We systematically classify the bHLH genes from five mollusc, two annelid and one brachiopod genomes, tracing the pattern of bHLH gene evolution across these poorly-studied phyla. 56 to 88 bHLH genes were identified in each genome, with most identifiable as members of previously described bilaterian families, or of new families we define. Of such families only one, Mesp, appears lost by all these species. Additional duplications have also played a role in the evolution of the bHLH gene repertoire, with many new lophotrochozoan-, mollusc-, bivalve- or gastropod-specific genes defined. Using a combination of transcriptome mining, RT-PCR and in situ hybridization we compared the expression of several of these novel genes in tissues and embryos of the molluscs Crassostrea gigas and Patella vulgata, finding both conserved expression and evidence for neofunctionalisation. We also map the positions of the genes across these genomes, identifying numerous gene linkages. Some reflect recent paralogue divergence by tandem duplication, others are remnants of ancient tandem duplications dating to the lophotrochozoan or bilaterian common ancestors. These data are built into a model of the evolution of bHLH genes in molluscs, showing formidable evolutionary stasis at the family level but considerable within-family diversification by tandem gene duplication.

  14. Evolutionary Analysis of Sequence Divergence and Diversity of Duplicate Genes in Aspergillus fumigatus

    PubMed Central

    Yang, Ence; Hulse, Amanda M.; Cai, James J.

    2012-01-01

    Gene duplication as a major source of novel genetic material plays an important role in evolution. In this study, we focus on duplicate genes in Aspergillus fumigatus, a ubiquitous filamentous fungus causing life-threatening human infections. We characterize the extent and evolutionary patterns of the duplicate genes in the genome of A. fumigatus. Our results show that A. fumigatus contains a large number of duplicate genes with pronounced sequence divergence between two copies, and approximately 10% of them diverge asymmetrically, i.e. two copies of a duplicate gene pair diverge at significantly different rates. We use a Bayesian approach of the McDonald-Kreitman test to infer distributions of selective coefficients γ (= 2Nes) and find that (1) the values of γ for two copies of duplicate genes co-vary positively and (2) the average γ for the two copies differs between genes from different gene families. This analysis highlights the usefulness of combining divergence and diversity data in studying the evolution of duplicate genes. Taken together, our results provide further support and refinement to the theories of gene duplication. Through characterizing the duplicate genes in the genome of A. fumigatus, we establish a computational framework, including parameter settings and methods, for comparative study of genetic redundancy and gene duplication between different fungal species. PMID:23225993

  15. Death receptor 3 (DR3) gene duplication in a chromosome region 1p36.3: gene duplication is more prevalent in rheumatoid arthritis.

    PubMed

    Osawa, K; Takami, N; Shiozawa, K; Hashiramoto, A; Shiozawa, S

    2004-09-01

    The death receptor 3 (DR3) gene is a member of the apoptosis-inducing Fas gene family. In the current study, fluorescence in situ hybridization (FISH) and Fiber-FISH revealed the existence of a second DR3 gene approximately 200 kb upstream of the original DR3 gene. The existence of the duplicated DR3 gene was confirmed by sequencing the corresponding human artificial chromosome clones as well as with quantitative PCR that measured the ratio of the DR3 gene mutation (Rm), intrinsic to rheumatoid arthritis (RA) patients, by simultaneous amplification of the normal and mutated DR3 sequences. The DR3 gene duplication measured by FISH was found to be more frequent in patients with RA as compared to healthy individuals. We therefore surmise that the human DR3 gene can be duplicated and that this gene duplication is more prevalent in patients with RA.

  16. Comparative study of human mitochondrial proteome reveals extensive protein subcellular relocalization after gene duplications

    PubMed Central

    2009-01-01

    Background: Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families. Results: The 456 human mitochondrial proteins selected for this study were clustered into 305 gene families including 92 multigene families. Among the multigene families, 59 (64%) consisted of both mitochondrial and cytosolic (non-mitochondrial) proteins (mt-cy families) while the remaining 33 (36%) were composed of mitochondrial proteins (mt-mt families). Phylogenetic analyses of mt-cy families revealed three different scenarios of their neolocalization following gene duplication: 1) relocalization from mitochondria to cytosol, 2) from cytosol to mitochondria and 3) multiple subcellular relocalizations. The neolocalizations were most commonly enabled by the gain or loss of N-terminal mitochondrial targeting signals. The majority of detected subcellular relocalization events occurred early in animal evolution, preceding the evolution of tetrapods. Mt-mt protein families showed a somewhat different pattern, where gene duplication occurred more evenly in time. However, for both types of protein families, most duplication events appear to roughly coincide with two rounds of genome duplications early in vertebrate evolution.
Finally, we evaluated the effects of inaccurate and incomplete annotation of mitochondrial proteins and found that our conclusion of the importance of subcellular relocalization after gene duplication on the genomic scale was

  17. Gene duplication followed by exon structure divergence substitutes for alternative splicing in zebrafish.

    PubMed

    Lambert, Matthew J; Olsen, Kyle G; Cooper, Cynthia D

    2014-08-10

    In this study we report novel findings regarding the evolutionary relationship between gene duplication and alternative splicing, two processes that increase proteomic diversity. By studying teleost fish, we find that gene duplication followed by exon structure divergence between paralogs, but not gene duplication alone, leads to a significant reduction in alternative splicing, as measured by both the proportion of genes that undergo alternative splicing as well as mean number of transcripts per gene. Additionally, we show that this effect is independent of gene family size and gene function. Furthermore, we provide evidence that the reduction in alternative splicing may be due to the partitioning of ancestral splice forms among the duplicate genes - a form of subfunctionalization. Taken together these results indicate that exon structure evolution subsequent to gene duplication may be a common substitute for alternative splicing.

  18. The birth of new genes by RNA- and DNA-mediated duplication during mammalian evolution.

    PubMed

    Jun, Jin; Ryvkin, Paul; Hemphill, Edward; Mandoiu, Ion; Nelson, Craig

    2009-10-01

    Gene duplication has long been recognized as a major force in genome evolution and has recently been recognized as an important source of individual variation. For many years, the origin of functional gene duplicates was assumed to be whole or partial genome duplication events, but recently retrotransposition has also been shown to contribute new functional protein coding genes and siRNAs. In this study, we utilize pseudogenes to recreate more complete gene family histories, and compare the rates of RNA and DNA-mediated duplication and new functional gene formation in five mammalian genomes. We find that RNA-mediated duplication occurs at a much higher and more variable rate than DNA-mediated duplication, and gives rise to many more duplicated sequences over time. We show that, while the chance of RNA-mediated duplicates becoming functional is much lower than that of their DNA-mediated counterparts, the higher rate of retrotransposition leads to nearly equal contributions of new genes by each mechanism. We also find that functional RNA-mediated duplicates are closer to neighboring genes than non-functional RNA-mediated copies, consistent with co-option of regulatory elements at the site of insertion. Overall, new genes derived from DNA and RNA-mediated duplication mechanisms are under similar levels of purifying selective pressure, but have broadly different functions. RNA-mediated duplication gives rise to a diversity of genes but is dominated by the highly expressed genes of RNA metabolic pathways. DNA-mediated duplication can copy regulatory material along with the protein coding region of the gene and often gives rise to classes of genes whose functions are dependent on complex regulatory information. This mechanistic difference may in part explain why we find that mammalian protein families tend to evolve by either one mechanism or the other, but rarely by both.
Supplementary Material has been provided (see online Supplementary Material at www.liebertonline.com ).
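
    The "nearly equal contributions" result above comes down to a product: how often duplicates arise, multiplied by how likely each one is to become functional. The Python sketch below illustrates that arithmetic only. The rates and probabilities are made-up placeholder values, not figures from the study, and the function name is hypothetical.

```python
def expected_new_genes(duplication_rate, p_functional, elapsed_time):
    """Expected count of new functional genes: rate * time * P(functional)."""
    return duplication_rate * elapsed_time * p_functional


# Hypothetical inputs chosen only to illustrate the trade-off
# (duplications per unit time, probability of becoming functional).
rna_mediated = expected_new_genes(duplication_rate=100.0, p_functional=0.01, elapsed_time=10.0)
dna_mediated = expected_new_genes(duplication_rate=10.0, p_functional=0.1, elapsed_time=10.0)

# A tenfold higher rate offsets a tenfold lower success probability.
print(rna_mediated, dna_mediated)
```

    Under these assumed inputs the two mechanisms yield the same expected number of new genes, even though one produces ten times as many raw copies.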

  19. Evolutionary history of the alpha2,8-sialyltransferase (ST8Sia) gene family: tandem duplications in early deuterostomes explain most of the diversity found in the vertebrate ST8Sia genes.

    PubMed

    Harduin-Lepers, Anne; Petit, Daniel; Mollicone, Rosella; Delannoy, Philippe; Petit, Jean-Michel; Oriol, Rafael

    2008-09-23

    The animal sialyltransferases, which catalyze the transfer of sialic acid to the glycan moiety of glycoconjugates, are subdivided into four families: ST3Gal, ST6Gal, ST6GalNAc and ST8Sia, based on acceptor sugar specificity and glycosidic linkage formed. Despite low overall sequence identity between each sialyltransferase family, all sialyltransferases share four conserved peptide motifs (L, S, III and VS) that serve as hallmarks for the identification of the sialyltransferases. Currently, twenty subfamilies have been described in mammals and birds. Examples of the four sialyltransferase families have also been found in invertebrates. Focusing on the ST8Sia family, we investigated the origin of the three groups of alpha2,8-sialyltransferases demonstrated in vertebrates to carry out poly-, oligo- and mono-alpha2,8-sialylation. We identified in the genome of invertebrate deuterostomes, orthologs to the common ancestor for each of the three vertebrate ST8Sia groups and a set of novel genes named ST8Sia EX, not found in vertebrates. All these ST8Sia sequences share a new conserved family-motif, named "C-term" that is involved in protein folding, via an intramolecular disulfide bridge. Interestingly, sequences from Branchiostoma floridae orthologous to the common ancestor of polysialyltransferases possess a polysialyltransferase domain (PSTD) and those orthologous to the common ancestor of oligosialyltransferases possess a new ST8Sia III-specific motif similar to the PSTD. In osteichthyans, we have identified two new subfamilies. In addition, we describe the expression profile of ST8Sia genes in Danio rerio. Polysialylation appeared early in the deuterostome lineage. The recent release of several deuterostome genome databases and paralogons combined with synteny analysis allowed us to obtain insight into events at the gene level that led to the diversification of the ST8Sia genes, with their corresponding enzymatic activities, in both invertebrates and vertebrates. 

  20. Evolutionary history of the alpha2,8-sialyltransferase (ST8Sia) gene family: Tandem duplications in early deuterostomes explain most of the diversity found in the vertebrate ST8Sia genes

    PubMed Central

    2008-01-01

    Background The animal sialyltransferases, which catalyze the transfer of sialic acid to the glycan moiety of glycoconjugates, are subdivided into four families: ST3Gal, ST6Gal, ST6GalNAc and ST8Sia, based on acceptor sugar specificity and glycosidic linkage formed. Despite low overall sequence identity between each sialyltransferase family, all sialyltransferases share four conserved peptide motifs (L, S, III and VS) that serve as hallmarks for the identification of the sialyltransferases. Currently, twenty subfamilies have been described in mammals and birds. Examples of the four sialyltransferase families have also been found in invertebrates. Focusing on the ST8Sia family, we investigated the origin of the three groups of α2,8-sialyltransferases demonstrated in vertebrates to carry out poly-, oligo- and mono-α2,8-sialylation. Results We identified in the genome of invertebrate deuterostomes, orthologs to the common ancestor for each of the three vertebrate ST8Sia groups and a set of novel genes named ST8Sia EX, not found in vertebrates. All these ST8Sia sequences share a new conserved family-motif, named "C-term" that is involved in protein folding, via an intramolecular disulfide bridge. Interestingly, sequences from Branchiostoma floridae orthologous to the common ancestor of polysialyltransferases possess a polysialyltransferase domain (PSTD) and those orthologous to the common ancestor of oligosialyltransferases possess a new ST8Sia III-specific motif similar to the PSTD. In osteichthyans, we have identified two new subfamilies. In addition, we describe the expression profile of ST8Sia genes in Danio rerio. Conclusion Polysialylation appeared early in the deuterostome lineage. The recent release of several deuterostome genome databases and paralogons combined with synteny analysis allowed us to obtain insight into events at the gene level that led to the diversification of the ST8Sia genes, with their corresponding enzymatic activities, in both

  1. Gene duplication within the Green Lineage: the case of TEL genes.

    PubMed

    Charon, Céline; Bruggeman, Quentin; Thareau, Vincent; Henry, Yves

    2012-09-01

    Recent years have witnessed a breathtaking increase in the availability of genome sequence data, providing evidence of the highly duplicate nature of eukaryotic genomes. Plants are exceptional among eukaryotic organisms in that duplicate loci compose a large fraction of their genomes, partly because of the frequent occurrence of polyploidy (or whole-genome duplication) events. Tandem gene duplication and transposition have also contributed to the large number of duplicated genes in plant genomes. Evolutionary analyses allowed the dynamics of duplicate gene evolution to be studied and several models were proposed. It seems that, over time, many duplicated genes were lost and some of those that were retained gained new functions and/or expression patterns (neofunctionalization) or subdivided their functions and/or expression patterns between them (subfunctionalization). Recent studies have provided examples of genes that originated by duplication with successive diversification within plants. In this review, we focused on the TEL (TERMINAL EAR1-like) genes to illustrate such mechanisms. Having emerged from the mei2 gene family, these TEL genes are likely to be land plant-specific. Phylogenetic analyses revealed one or two TEL copies per diploid genome. TEL gene degeneration and loss seem to have occurred in several angiosperm species, such as poplar and maize. In Arabidopsis thaliana, whose genome experienced at least three polyploidy events followed by massive gene loss and genomic reorganization, two TEL genes were retained and two new shorter TEL-like (MCT) genes emerged. Molecular and expression analyses suggest sub- and neofunctionalization events for these genes, but confirmation will come from their functional characterization.

  2. Genome changes after gene duplication: haploidy vs. diploidy.

    PubMed

    Xue, Cheng; Huang, Ren; Maxwell, Taylor J; Fu, Yun-Xin

    2010-09-01

    Since genome size and the number of duplicate genes observed in genomes increase from haploid to diploid organisms, diploidy might provide more evolutionary probabilities through gene duplication. It is still unclear how diploidy promotes genomic evolution in detail. In this study, we explored the evolution of segmental gene duplication in haploid and diploid populations by analytical and simulation approaches. Results show that (1) under the double null recessive (DNR) selective model, given the same recombination rate, the evolutionary trajectories and consequences are very similar between the same-size gene-pool haploid vs. diploid populations; (2) recombination enlarges the probability of preservation of duplicate genes in either haploid or diploid large populations, and haplo-insufficiency reinforces this effect; and (3) the loss of duplicate genes at the ancestor locus is limited under recombination while under complete linkage the loss of duplicate genes is always random at the ancestor and newly duplicated loci. Therefore, we propose a model to explain the advantage of diploidy: diploidy might facilitate the increase of recombination rate, especially under sexual reproduction; more duplicate genes are preserved under more recombination by originalization (by which duplicate genes are preserved intact at a special quasi-mutation-selection balance under the DNR or haplo-insufficient selective model), so genome sizes and the number of duplicate genes in diploid organisms become larger. Additionally, it is suggested that small genomic rearrangements due to the random loss of duplicate genes might be limited under recombination.

  3. Phylogenetic detection of numerous gene duplications shared by animals, fungi and plants

    PubMed Central

    2010-01-01

    Background Gene duplication is considered a major driving force for evolution of genetic novelty, thereby facilitating functional divergence and organismal diversity, including the process of speciation. Animals, fungi and plants are major eukaryotic kingdoms and the divergences between them are some of the most significant evolutionary events. Although gene duplications in each lineage have been studied extensively in various contexts, the extent of gene duplication prior to the split of plants and animals/fungi is not clear. Results Here, we have studied gene duplications in early eukaryotes by phylogenetic relative dating. We have reconstructed gene families (with one or more orthogroups) with members from both animals/fungi and plants by using two different clustering strategies. Extensive phylogenetic analyses of the gene families show that, among nearly 2,600 orthogroups identified, at least 300 of them still retain duplication that occurred before the divergence of the three kingdoms. We further found evidence that such duplications were also detected in some highly divergent protists, suggesting that these duplication events occurred in the ancestors of most major extant eukaryotic groups. Conclusions Our phylogenetic analyses show that numerous gene duplications happened at the early stage of eukaryotic evolution, probably before the separation of known major eukaryotic lineages. We discuss the implication of our results in the contexts of different models of eukaryotic phylogeny. One possible explanation for the large number of gene duplication events is one or more large-scale duplications, possibly whole genome or segmental duplication(s), which provides a genomic basis for the successful radiation of early eukaryotes. PMID:20370904

  4. Molecular characterisation of six patients with prolidase deficiency: identification of the first small duplication in the prolidase gene and of a mutation generating symptomatic and asymptomatic outcomes within the same family

    PubMed Central

    Lupi, A; Rossi, A; Campari, E; Pecora, F; Lund, A M; Elcioglu, N H; Gultepe, M; Rocco, M Di; Cetta, G; Forlino, A

    2006-01-01

    Prolidase deficiency (PD) is a rare autosomal recessive connective tissue disorder caused by mutations in the prolidase gene. The PD patients show a wide range of clinical outcomes characterised mainly by intractable skin ulcers, mental retardation and recurrent respiratory infections. Here we describe five different PEPD mutations in six European patients. We identified two new PEPD mutant alleles: a 13 bp duplication in exon 8, which is the first reported duplication in the prolidase gene and a point mutation resulting in a change in amino acid E412, a highly conserved residue among different species. The E412K substitution is responsible for the first reported phenotypic variability within a family with severe and asymptomatic outcomes. PMID:17142620

  5. PTGBase: an integrated database to study tandem duplicated genes in plants.

    PubMed

    Yu, Jingyin; Ke, Tao; Tehrim, Sadia; Sun, Fengming; Liao, Boshou; Hua, Wei

    2015-01-01

    Tandem duplication is a widespread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions lead to the expansion of gene families and increase gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically integrate and present the large amount of publicly deposited data on tandem duplicated genes across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available. © The Author(s) 2015. Published by Oxford University Press.
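
    To make the notion of a tandem array concrete, the following Python sketch groups same-family genes that sit next to each other on a chromosome. It is an illustration under stated assumptions, not PTGBase's actual pipeline: the function name, the input layout (gene identifier plus a family label, in chromosomal order) and the `max_spacers` tolerance are all hypothetical.

```python
def tandem_arrays(genes, max_spacers=1):
    """Group neighbouring genes of the same family into tandem arrays.

    genes: list of (gene_id, family) tuples in chromosomal order for one
           chromosome; family is None for genes without an assigned family.
    max_spacers: how many intervening genes (of any kind) are tolerated
                 between two members of the same array (assumed parameter).
    Returns a list of arrays, each a list of at least two gene ids.
    """
    arrays = []
    open_arrays = {}  # family -> (index of last member, list of member ids)
    for idx, (gene_id, family) in enumerate(genes):
        if family is None:
            continue
        last = open_arrays.get(family)
        if last is not None and idx - last[0] - 1 <= max_spacers:
            last[1].append(gene_id)           # extend the current array
            open_arrays[family] = (idx, last[1])
        else:
            members = [gene_id]               # start a new candidate array
            arrays.append(members)
            open_arrays[family] = (idx, members)
    return [a for a in arrays if len(a) >= 2]


# Toy chromosome with hypothetical gene ids and family labels.
chromosome = [("g1", "A"), ("g2", "A"), ("g3", "B"), ("g4", "A"),
              ("g5", "C"), ("g6", "C"), ("g7", None), ("g8", None), ("g9", "C")]
arrays = tandem_arrays(chromosome, max_spacers=1)
print(arrays)

# Average array size implied by the totals quoted in the abstract.
mean_genes_per_cluster = 129652 / 54130
print(round(mean_genes_per_cluster, 2))
```

    In the toy input, `g4` joins the first array because only one unrelated gene separates it from `g2`, whereas `g9` is left out because two genes separate it from `g6`. The totals quoted in the abstract work out to roughly 2.4 genes per cluster, so most arrays are small.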

  6. Gene duplication and transfer events in plant mitochondria genome

    SciTech Connect

    Xiong Aisheng; Peng Rihe; Zhuang Jing; Gao Feng; Zhu Bo; Fu Xiaoyan; Xue Yong; Jin Xiaofen; Tian Yongsheng; Zhao Wei; Yao Quanhong

    2008-11-07

    Gene or genome duplication events increase the amount of genetic material available to increase the genomic, and thereby phenotypic, complexity of organisms during evolution. Gene duplication and transfer events have been important to molecular evolution in all three domains of life, and may be the first step in the emergence of new gene functions. Gene transfer events have been proposed as another accelerator of evolution. The duplicated gene or genome, mainly nuclear, has been the subject of several recent reviews. In addition to the nuclear genome, organisms have organelle genomes, including mitochondrial genome. In this review, we briefly summarize gene duplication and transfer events in the plant mitochondrial genome.

  7. Gene duplication as a major force in evolution.

    PubMed

    Magadum, Santoshkumar; Banerjee, Urbi; Murugan, Priyadharshini; Gangapur, Doddabhimappa; Ravikesavan, Rajasekar

    2013-04-01

    Gene duplication is an important mechanism for acquiring new genes and creating genetic novelty in organisms. Many new gene functions have evolved through gene duplication and it has contributed tremendously to the evolution of developmental programmes in various organisms. Gene duplication can result from unequal crossing over, retroposition or chromosomal (or genome) duplication. Understanding the mechanisms that generate duplicate gene copies and the subsequent dynamics among gene duplicates is vital because these investigations shed light on localized and genomewide aspects of evolutionary forces shaping intra-specific and inter-specific genome contents, evolutionary relationships, and interactions. Based on whole-genome analysis of Arabidopsis thaliana, there is compelling evidence that angiosperms underwent two whole-genome duplication events early during their evolutionary history. Recent studies have shown that these events were crucial for creation of many important developmental and regulatory genes found in extant angiosperm genomes. Recent studies also provide strong indications that even yeast (Saccharomyces cerevisiae), with its compact genome, is in fact an ancient tetraploid. Gene duplication can provide new genetic material for mutation, drift and selection to act upon, the result of which is specialized or new gene functions. Without gene duplication the plasticity of a genome or species in adapting to changing environments would be severely limited. Whether a duplicate is retained depends upon its function, its mode of duplication (i.e. whether it was duplicated during a whole-genome duplication event), the species in which it occurs, and its expression rate. The exaptation of preexisting secondary functions is an important feature in gene evolution, just as it is in morphological evolution.

  8. Preservation of duplicate genes by complementary, degenerative mutations.

    PubMed Central

    Force, A; Lynch, M; Pickett, F B; Amores, A; Yan, Y L; Postlethwait, J

    1999-01-01

    The origin of organismal complexity is generally thought to be tightly coupled to the evolution of new gene functions arising subsequent to gene duplication. Under the classical model for the evolution of duplicate genes, one member of the duplicated pair usually degenerates within a few million years by accumulating deleterious mutations, while the other duplicate retains the original function. This model further predicts that on rare occasions, one duplicate may acquire a new adaptive function, resulting in the preservation of both members of the pair, one with the new function and the other retaining the old. However, empirical data suggest that a much greater proportion of gene duplicates is preserved than predicted by the classical model. Here we present a new conceptual framework for understanding the evolution of duplicate genes that may help explain this conundrum. Focusing on the regulatory complexity of eukaryotic genes, we show how complementary degenerative mutations in different regulatory elements of duplicated genes can facilitate the preservation of both duplicates, thereby increasing long-term opportunities for the evolution of new gene functions. The duplication-degeneration-complementation (DDC) model predicts that (1) degenerative mutations in regulatory elements can increase rather than reduce the probability of duplicate gene preservation and (2) the usual mechanism of duplicate gene preservation is the partitioning of ancestral functions rather than the evolution of new functions. We present several examples (including analysis of a new engrailed gene in zebrafish) that appear to be consistent with the DDC model, and we suggest several analytical and experimental approaches for determining whether the complementary loss of gene subfunctions or the acquisition of novel functions are likely to be the primary mechanisms for the preservation of gene duplicates. For a newly duplicated paralog, survival depends on the outcome of the race between

  9. Evolution of vertebrate central nervous system is accompanied by novel expression changes of duplicate genes.

    PubMed

    Chen, Yuan; Ding, Yun; Zhang, Zuming; Wang, Wen; Chen, Jun-Yuan; Ueno, Naoto; Mao, Bingyu

    2011-12-20

    The evolution of the central nervous system (CNS) is one of the most striking changes during the transition from invertebrates to vertebrates. As a major source of genetic novelties, gene duplication might play an important role in the functional innovation of vertebrate CNS. In this study, we focused on a group of CNS-biased genes that duplicated during early vertebrate evolution. We investigated the tempo-spatial expression patterns of 33 duplicate gene families and their orthologs during the embryonic development of the vertebrate Xenopus laevis and the cephalochordate Branchiostoma belcheri. Almost all the identified duplicate genes are differentially expressed in the CNS in Xenopus embryos, and more than 50% and 30% of duplicate genes are expressed in the telencephalon and mid-hindbrain boundary, respectively, which are mostly considered as two innovations in the vertebrate CNS. Interestingly, more than 50% of the amphioxus orthologs do not show apparent expression in the CNS in amphioxus embryos as detected by in situ hybridization, indicating that some of the vertebrate CNS-biased duplicate genes might arise from non-CNS genes in invertebrates. Our data accentuate the functional contribution of gene duplication to the CNS evolution of vertebrates and uncover an invertebrate non-CNS history for some vertebrate CNS-biased duplicate genes. Copyright © 2011. Published by Elsevier Ltd.

  10. Recurrent Gene Duplication Diversifies Genome Defense Repertoire in Drosophila.

    PubMed

    Levine, Mia T; Vander Wende, Helen M; Hsieh, Emily; Baker, EmilyClare P; Malik, Harmit S

    2016-07-01

    Transposable elements (TEs) comprise large fractions of many eukaryotic genomes and imperil host genome integrity. The host genome combats these challenges by encoding proteins that silence TE activity. Both the introduction of new TEs via horizontal transfer and TE sequence evolution require constant innovation of host-encoded TE silencing machinery to keep pace with TEs. One form of host innovation is the adaptation of existing, single-copy host genes. Indeed, host suppressors of TE replication often harbor signatures of positive selection. Such signatures are especially evident in genes encoding the piwi-interacting-RNA pathway of gene silencing, for example, the female germline-restricted TE silencer, HP1D/Rhino. Host genomes can also innovate via gene duplication and divergence. However, the importance of gene family expansions, contractions, and gene turnover to host genome defense has been largely unexplored. Here, we functionally characterize Oxpecker, a young, tandem duplicate gene of HP1D/rhino. We demonstrate that Oxpecker supports female fertility in Drosophila melanogaster and silences several TE families that are incompletely silenced by HP1D/Rhino in the female germline. We further show that, like Oxpecker, at least ten additional, structurally diverse, HP1D/rhino-derived daughter and "granddaughter" genes emerged during a short 15-million-year period of Drosophila evolution. These young paralogs are transcribed primarily in germline tissues, where the genetic conflict between host genomes and TEs plays out. Our findings suggest that gene family expansion is an underappreciated yet potent evolutionary mechanism of genome defense diversification. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. Phylogenetics of Lophotrochozoan bHLH Genes and the Evolution of Lineage-Specific Gene Duplicates

    PubMed Central

    Bao, Yongbo

    2017-01-01

    The gain and loss of genes encoding transcription factors is of importance to understanding the evolution of gene regulatory complexity. The basic helix–loop–helix (bHLH) genes encode a large superfamily of transcription factors. We systematically classify the bHLH genes from five mollusc, two annelid and one brachiopod genomes, tracing the pattern of bHLH gene evolution across these poorly studied Phyla. In total, 56–88 bHLH genes were identified in each genome, with most identifiable as members of previously described bilaterian families, or of new families we define. Of such families only one, Mesp, appears lost by all these species. Additional duplications have also played a role in the evolution of the bHLH gene repertoire, with many new lophotrochozoan-, mollusc-, bivalve-, or gastropod-specific genes defined. Using a combination of transcriptome mining, RT-PCR, and in situ hybridization we compared the expression of several of these novel genes in tissues and embryos of the molluscs Crassostrea gigas and Patella vulgata, finding both conserved expression and evidence for neofunctionalization. We also map the positions of the genes across these genomes, identifying numerous gene linkages. Some reflect recent paralog divergence by tandem duplication, others are remnants of ancient tandem duplications dating to the lophotrochozoan or bilaterian common ancestors. These data are built into a model of the evolution of bHLH genes in molluscs, showing formidable evolutionary stasis at the family level but considerable within-family diversification by tandem gene duplication. PMID:28338988

  12. Cdx ParaHox genes acquired distinct developmental roles after gene duplication in vertebrate evolution.

    PubMed

    Marlétaz, Ferdinand; Maeso, Ignacio; Faas, Laura; Isaacs, Harry V; Holland, Peter W H

    2015-08-01

    The functional consequences of whole genome duplications in vertebrate evolution are not fully understood. It remains unclear, for instance, why paralogues were retained in some gene families but extensively lost in others. Cdx homeobox genes encode conserved transcription factors controlling posterior development across diverse bilaterians. These genes are part of the ParaHox gene cluster. Multiple Cdx copies were retained after genome duplication, raising questions about how functional divergence, overlap, and redundancy respectively contributed to their retention and evolutionary fate. We examined the degree of regulatory and functional overlap between the three vertebrate Cdx genes using single and triple morpholino knock-down in Xenopus tropicalis followed by RNA-seq. We found that one paralogue, Cdx4, has a much stronger effect on gene expression than the others, including a strong regulatory effect on FGF and Wnt genes. Functional annotation revealed distinct and overlapping roles and subtly different temporal windows of action for each gene. The data also reveal a colinear-like effect of Cdx genes on Hox genes, with repression of Hox paralogy groups 1 and 2, and activation increasing from Hox group 5 to 11. We also highlight cases in which duplicated genes regulate distinct paralogous targets revealing pathway elaboration after whole genome duplication. Despite shared core pathways, Cdx paralogues have acquired distinct regulatory roles during development. This implies that the degree of functional overlap between paralogues is relatively low and that gene expression pattern alone should be used with caution when investigating the functional evolution of duplicated genes. We therefore suggest that developmental programmes were extensively rewired after whole genome duplication in the early evolution of vertebrates.

  13. Early evolutionary history and genomic features of gene duplicates in the human genome.

    PubMed

    Bu, Lijing; Katju, Vaishali

    2015-08-20

    Human gene duplicates have been the focus of intense research since the development of array-based and targeted next-generation sequencing approaches in the last decade. These studies have primarily concentrated on determining the extant copy-number variation from a population-genomic perspective but lack a robust evolutionary framework to elucidate the early structural and genomic characteristics of gene duplicates at emergence and their subsequent evolution with increasing age. We analyzed 184 gene duplicate pairs comprising small gene families in the draft human genome with 10% or less synonymous sequence divergence. Human gene duplicates primarily originate from DNA-mediated events, taking up genomic residence as intrachromosomal copies in direct or inverse orientation. The distribution of paralogs on autosomes follows random expectations in contrast to their significant enrichment on the sex chromosomes. Furthermore, human gene duplicates exhibit a skewed gradient of distribution along the chromosomal length with significant clustering in pericentromeric regions. Surprisingly, despite the large average length of human genes, the majority of extant duplicates (83%) are complete duplicates, wherein the entire ORF of the ancestral copy was duplicated. The preponderance of complete duplicates is in accord with an extremely large median duplication span of 36 kb, which enhances the probability of capturing ancestral ORFs in their entirety. With increasing evolutionary age, human paralogs exhibit declines in (i) the frequency of intrachromosomal paralogs, and (ii) the proportion of complete duplicates. These changes may reflect lower survival rates of certain classes of duplicates and/or the role of purifying selection. Duplications arising from RNA-mediated events comprise a small fraction (11.4%) of all human paralogs and are more numerous in older evolutionary cohorts of duplicates. The degree of structural resemblance, genomic location and duplication span

  14. Duplicate Gene Divergence by Changes in MicroRNA Binding Sites in Arabidopsis and Brassica

    PubMed Central

    Wang, Sishuo; Adams, Keith L.

    2015-01-01

    Gene duplication provides large numbers of new genes that can lead to the evolution of new functions. Duplicated genes can diverge by changes in sequences, expression patterns, and functions. MicroRNAs play an important role in the regulation of gene expression in many eukaryotes. After duplication, two paralogs may diverge in their microRNA binding sites, which might impact their expression and function. Little is known about conservation and divergence of microRNA binding sites in duplicated genes in plants. We analyzed microRNA binding sites in duplicated genes in Arabidopsis thaliana and Brassica rapa. We found that duplicates are more often targeted by microRNAs than singletons. The vast majority of duplicated genes in A. thaliana with microRNA binding sites show divergence in those sites between paralogs. Analysis of microRNA binding sites in genes derived from the ancient whole-genome triplication in B. rapa also revealed extensive divergence. Paralog pairs with divergent microRNA binding sites show more divergence in expression patterns compared with paralog pairs with the same microRNA binding sites in Arabidopsis. Close to half of the cases of binding site divergence are caused by microRNAs that are specific to the Arabidopsis genus, indicating evolutionarily recent gain of binding sites after target gene duplication. We also show rapid evolution of microRNA binding sites in a jacalin gene family. Our analyses reveal a dynamic process of changes in microRNA binding sites after gene duplication in Arabidopsis and highlight the role of microRNA regulation in the divergence and contrasting evolutionary fates of duplicated genes. PMID:25644246

  15. Duplicate gene divergence by changes in microRNA binding sites in Arabidopsis and Brassica.

    PubMed

    Wang, Sishuo; Adams, Keith L

    2015-02-02

    Gene duplication provides large numbers of new genes that can lead to the evolution of new functions. Duplicated genes can diverge by changes in sequences, expression patterns, and functions. MicroRNAs play an important role in the regulation of gene expression in many eukaryotes. After duplication, two paralogs may diverge in their microRNA binding sites, which might impact their expression and function. Little is known about conservation and divergence of microRNA binding sites in duplicated genes in plants. We analyzed microRNA binding sites in duplicated genes in Arabidopsis thaliana and Brassica rapa. We found that duplicates are more often targeted by microRNAs than singletons. The vast majority of duplicated genes in A. thaliana with microRNA binding sites show divergence in those sites between paralogs. Analysis of microRNA binding sites in genes derived from the ancient whole-genome triplication in B. rapa also revealed extensive divergence. Paralog pairs with divergent microRNA binding sites show more divergence in expression patterns compared with paralog pairs with the same microRNA binding sites in Arabidopsis. Close to half of the cases of binding site divergence are caused by microRNAs that are specific to the Arabidopsis genus, indicating evolutionarily recent gain of binding sites after target gene duplication. We also show rapid evolution of microRNA binding sites in a jacalin gene family. Our analyses reveal a dynamic process of changes in microRNA binding sites after gene duplication in Arabidopsis and highlight the role of microRNA regulation in the divergence and contrasting evolutionary fates of duplicated genes.

  16. Impact of recurrent gene duplication on adaptation of plant genomes

    PubMed Central

    2014-01-01

    Background Recurrent gene duplication and retention played an important role in angiosperm genome evolution. It has been hypothesized that these processes contribute significantly to plant adaptation but so far this hypothesis has not been tested at the genome scale. Results We studied available sequenced angiosperm genomes to assess the frequency of positive selection footprints in lineage-specific expanded (LSE) gene families compared to single-copy genes using a dN/dS-based test in a phylogenetic framework. We found 5.38% of alignments in LSE genes with codons under positive selection. In contrast, we found no evidence for codons under positive selection in the single-copy reference set. An analysis at the branch level shows that purifying selection acted more strongly on single-copy genes than on LSE gene clusters. Moreover, we detect significantly more branches indicating evolution under positive selection and/or relaxed constraint in LSE genes than in single-copy genes. Conclusions In this study, to our knowledge the first at the genome scale, we provide strong empirical support for the hypothesis that LSE genes fuel adaptation in angiosperms. Our conservative approach for detecting selection footprints as well as our results can be of interest for further studies on (plant) gene family evolution. PMID:24884640

  17. Effect of Incomplete Lineage Sorting On Tree-Reconciliation-Based Inference of Gene Duplication.

    PubMed

    Zheng, Yu; Zhang, Louxin

    2014-01-01

    In the tree reconciliation approach to infer the duplication history of a gene family, the gene (family) tree is compared to the corresponding species tree. Incomplete lineage sorting (ILS) gives rise to stochastic variation in the topology of a gene tree and hence likely introduces false duplication events when a tree reconciliation method is used. We quantify the effect of ILS on gene duplication inference in a species tree in terms of the expected number of false duplication events inferred from reconciling a random gene tree, which occurs with a probability predicted in coalescent theory, and the species tree. We computationally examine the relationship between the effect of ILS on duplication inference in a species tree and its topological parameters. Our findings suggest that ILS may cause non-negligible bias on duplication inference, particularly on an asymmetric species tree. Hence, when gene duplication is inferred via tree reconciliation or any other approach that takes gene tree topology into account, the ILS-induced bias should be examined cautiously.
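
The reconciliation step described in this record can be sketched in a few lines. The code below is a minimal illustration of the common LCA-mapping rule (a gene-tree node is called a duplication when it maps to the same species-tree node as one of its children), not the authors' implementation. The nested-tuple tree encoding, the function names, and the leaf naming convention "<species>_<copy>" are all assumptions made for the example.

```python
# Trees are nested tuples; leaves are strings.
# Gene-tree leaves are named "<species>_<copy>", e.g. "A_1" (hypothetical convention).

def species_clades(tree, clades):
    """Collect the leaf set of every node of the species tree into `clades`."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(species_clades(child, clades) for child in tree))
    clades.append(leaves)
    return leaves

def lca(species_set, clades):
    """Smallest species-tree clade containing `species_set` (the LCA mapping)."""
    return min((c for c in clades if species_set <= c), key=len)

def count_duplications(gene_tree, clades):
    """Return (species-tree clade this node maps to, inferred duplications below it)."""
    if isinstance(gene_tree, str):
        return lca(frozenset([gene_tree.split("_")[0]]), clades), 0
    mapped = [count_duplications(child, clades) for child in gene_tree]
    here = lca(frozenset().union(*(m for m, _ in mapped)), clades)
    dups = sum(d for _, d in mapped)
    # LCA rule: a node mapping to the same clade as a child is a duplication.
    if any(m == here for m, _ in mapped):
        dups += 1
    return here, dups

clades = []
species_clades((("A", "B"), "C"), clades)          # species tree ((A,B),C)
concordant = (("A_1", "B_1"), "C_1")               # gene tree matches the species tree
discordant = (("A_1", "C_1"), "B_1")               # topology that ILS alone can produce
print(count_duplications(concordant, clades)[1])   # 0
print(count_duplications(discordant, clades)[1])   # 1
```

The discordant single-copy gene tree involves no real duplication, yet reconciliation infers one at its root. This is the kind of false duplication event the abstract attributes to ILS.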

  18. Duplication of chicken defensin7 gene generated by gene conversion and homologous recombination.

    PubMed

    Lee, Mi Ok; Bornelöv, Susanne; Andersson, Leif; Lamont, Susan J; Chen, Junfeng; Womack, James E

    2016-11-29

    Defensins constitute an evolutionary conserved family of cationic antimicrobial peptides that play a key role in host innate immune responses to infection. Defensin genes generally reside in complex genomic regions that are prone to structural variation, and defensin genes exhibit extensive copy number variation in humans and in other species. Copy number variation of defensin genes was examined in inbred lines of Leghorn and Fayoumi chickens, and a duplication of defensin7 was discovered in the Fayoumi breed. Analysis of junction sequences confirmed the occurrence of a simple tandem duplication of defensin7 with sequence identity at the junction, suggesting nonallelic homologous recombination between defensin7 and defensin6. The duplication event generated two chimeric promoters that are best explained by gene conversion followed by homologous recombination. Expression of defensin7 was not elevated in animals with two genes despite both genes being transcribed in the tissues examined. Computational prediction of promoter regions revealed the presence of several putative transcription factor binding sites generated by the duplication event. These data provide insight into the evolution and possible function of large gene families and specifically, the defensins.

  19. Duplication of chicken defensin7 gene generated by gene conversion and homologous recombination

    PubMed Central

    Lee, Mi Ok; Bornelöv, Susanne; Andersson, Leif; Lamont, Susan J.; Chen, Junfeng; Womack, James E.

    2016-01-01

    Defensins constitute an evolutionary conserved family of cationic antimicrobial peptides that play a key role in host innate immune responses to infection. Defensin genes generally reside in complex genomic regions that are prone to structural variation, and defensin genes exhibit extensive copy number variation in humans and in other species. Copy number variation of defensin genes was examined in inbred lines of Leghorn and Fayoumi chickens, and a duplication of defensin7 was discovered in the Fayoumi breed. Analysis of junction sequences confirmed the occurrence of a simple tandem duplication of defensin7 with sequence identity at the junction, suggesting nonallelic homologous recombination between defensin7 and defensin6. The duplication event generated two chimeric promoters that are best explained by gene conversion followed by homologous recombination. Expression of defensin7 was not elevated in animals with two genes despite both genes being transcribed in the tissues examined. Computational prediction of promoter regions revealed the presence of several putative transcription factor binding sites generated by the duplication event. These data provide insight into the evolution and possible function of large gene families and specifically, the defensins. PMID:27849592

  20. Matrix Gla protein and osteocalcin: from gene duplication to neofunctionalization.

    PubMed

    Cancela, M Leonor; Laizé, Vincent; Conceição, Natércia

    2014-11-01

    Osteocalcin (OC or bone Gla protein, BGP) and matrix Gla protein (MGP) are two members of the growing family of vitamin K-dependent (VKD) proteins. They were the first VKD proteins found not to be involved in coagulation and synthesized outside the liver. Both proteins were isolated from bone although it is now known that only OC is synthesized by bone cells under normal physiological conditions, but since both proteins can bind calcium and hydroxyapatite, they can also accumulate in bone. Both OC and MGP share similar structural features, both in terms of protein domains and gene organization. OC gene is likely to have appeared from MGP through a tandem gene duplication that occurred concomitantly with the appearance of the bony vertebrates. Despite their relatively close relationship and the fact that both can bind calcium and affect mineralization, their functions are not redundant and they also have other unrelated functions. Interestingly, these two proteins appear to have followed quite different evolutionary strategies in order to acquire novel functionalities, with OC following a gene duplication strategy while MGP variability was obtained mostly by the use of multiple promoters and alternative splicing, leading to proteins with additional functional characteristics and alternative gene regulatory pathways. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Differential retention and divergent resolution of duplicate genes following whole-genome duplication

    PubMed Central

    McGrath, Casey L.; Gout, Jean-Francois; Johri, Parul; Doak, Thomas G.

    2014-01-01

    The Paramecium aurelia complex is a group of 15 species that share at least three past whole-genome duplications (WGDs). The macronuclear genome sequences of P. biaurelia and P. sexaurelia are presented and compared to the published sequence of P. tetraurelia. Levels of duplicate-gene retention from the recent WGD differ by >10% across species, with P. sexaurelia losing significantly more genes than P. biaurelia or P. tetraurelia. In addition, historically high rates of gene conversion have homogenized WGD paralogs, probably extending the paralogs’ lifetimes. The probability of duplicate retention is positively correlated with GC content and expression level; ribosomal proteins, transcription factors, and intracellular signaling proteins are overrepresented among maintained duplicates. Finally, multiple sources of evidence indicate that P. sexaurelia diverged from the two other lineages immediately following, or perhaps concurrent with, the recent WGD, with approximately half of gene losses between P. tetraurelia and P. sexaurelia representing divergent gene resolutions (i.e., silencing of alternative paralogs), as expected for random duplicate loss between these species. Additionally, though P. biaurelia and P. tetraurelia diverged from each other much later, there are still more than 100 cases of divergent resolution between these two species. Taken together, these results indicate that divergent resolution of duplicate genes between lineages acts to reinforce reproductive isolation between species in the Paramecium aurelia complex. PMID:25085612

  2. Chromosomal Duplication Involving the Forkhead Transcription Factor Gene FOXC1 Causes Iris Hypoplasia and Glaucoma

    PubMed Central

    Lehmann, Ordan J.; Ebenezer, Neil D.; Jordan, Tim; Fox, Margaret; Ocaka, Louise; Payne, Annette; Leroy, Bart P.; Clark, Brian J.; Hitchings, Roger A.; Povey, Sue; Khaw, Peng T.; Bhattacharya, Shomi S.

    2000-01-01

    The forkhead transcription factor gene FOXC1 (formerly FKHL7) is responsible for a number of glaucoma phenotypes in families in which the disease maps to 6p25, although mutations have not been found in all families in which the disease maps to this region. In a large pedigree with iris hypoplasia and glaucoma mapping to 6p25 (peak LOD score 6.20 [recombination fraction 0] at D6S967), no FOXC1 mutations were detected by direct sequencing. However, genotyping with microsatellite repeat markers suggested the presence of a chromosomal duplication that segregated with the disease phenotype. The duplication was confirmed in affected individuals by FISH with markers encompassing FOXC1. These results provide evidence of gene duplication causing developmental disease in humans, with increased gene dosage of either FOXC1 or other, as yet unknown genes within the duplicated segment being the probable mechanism responsible for the phenotype. PMID:11007653
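
For readers unfamiliar with the statistic quoted in this record, a LOD score is the base-10 logarithm of the likelihood ratio between linkage at recombination fraction θ and free recombination (θ = 1/2). The worked form below is the standard textbook definition, not taken from the record. The symbols n and r, and the simplification to phase-known, fully informative meioses, are assumptions made for illustration.

```latex
\mathrm{LOD}(\theta) = \log_{10}\frac{L(\theta)}{L(1/2)}, \qquad
L(\theta) = \theta^{r}\,(1-\theta)^{n-r}
% r recombinants among n phase-known informative meioses (assumed idealization)
% At \theta = 0 with r = 0:  \mathrm{LOD}(0) = n \log_{10} 2 \approx 0.301\,n
% A peak LOD of 6.20 corresponds to odds of 10^{6.20} \approx 1.6 \times 10^{6} : 1 in favor of linkage
```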

  3. On the retention of gene duplicates prone to dominant deleterious mutations.

    PubMed

    Malaguti, Giulia; Singh, Param Priya; Isambert, Hervé

    2014-05-01

    Recent studies have shown that gene families from different functional categories have been preferentially expanded either by small scale duplication (SSD) or by whole-genome duplication (WGD). In particular, gene families prone to dominant deleterious mutations and implicated in cancers and other genetic diseases in human have been greatly expanded through two rounds of WGD dating back from early vertebrates. Here, we strengthen this intriguing observation, showing that human oncogenes involved in different primary tumors have retained many WGD duplicates compared to other human genes. In order to rationalize this evolutionary outcome, we propose a consistent population genetics model to analyze the retention of SSD and WGD duplicates taking into account their propensity to acquire dominant deleterious mutations. We solve a deterministic haploid model including initial duplicated loci, their retention through sub-functionalization or their neutral loss-of-function or deleterious gain-of-function at one locus. Extensions to diploid genotypes are presented and population size effects are analyzed using stochastic simulations. The only difference between the SSD and WGD scenarios is the initial number of individuals with duplicated loci. While SSD duplicates need to spread through the entire population from a single individual to reach fixation, WGD duplicates are de facto fixed in the small initial post-WGD population arising through the ploidy incompatibility between post-WGD individuals and the rest of the pre-WGD population. WGD duplicates prone to dominant deleterious mutations are then shown to be indirectly selected through purifying selection in post-WGD species, whereas SSD duplicates typically require positive selection. These results highlight the long-term evolution mechanisms behind the surprising accumulation of WGD duplicates prone to dominant deleterious mutations and are shown to be consistent with cancer genome data on the prevalence of human

  4. The probability of duplicate gene preservation by subfunctionalization.

    PubMed Central

    Lynch, M; Force, A

    2000-01-01

    It has often been argued that gene-duplication events are most commonly followed by a mutational event that silences one member of the pair, while on rare occasions both members of the pair are preserved as one acquires a mutation with a beneficial function and the other retains the original function. However, empirical evidence from genome duplication events suggests that gene duplicates are preserved in genomes far more commonly and for periods far in excess of the expectations under this model, and whereas some gene duplicates clearly evolve new functions, there is little evidence that this is the most common mechanism of duplicate-gene preservation. An alternative hypothesis is that gene duplicates are frequently preserved by subfunctionalization, whereby both members of a pair experience degenerative mutations that reduce their joint levels and patterns of activity to that of the single ancestral gene. We consider the ways in which the probability of duplicate-gene preservation by such complementary mutations is modified by aspects of gene structure, degree of linkage, mutation rates and effects, and population size. Even if most mutations cause complete loss-of-subfunction, the probability of duplicate-gene preservation can be appreciable if the long-term effective population size is on the order of 10^5 or smaller, especially if there are more than two independently mutable subfunctions per locus. Even a moderate incidence of partial loss-of-function mutations greatly elevates the probability of preservation. The model proposed herein leads to quantitative predictions that are consistent with observations on the frequency of long-term duplicate gene preservation and with observations that indicate that a common fate of the members of duplicate-gene pairs is the partitioning of tissue-specific patterns of expression of the ancestral gene. PMID:10629003
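
The idea of preservation by complementary degenerative mutations can be illustrated with a toy Monte Carlo simulation. This is a deliberately simplified sketch, not the model analyzed in the record. It assumes mutations fix one at a time, ignores population size, linkage, and partial loss-of-function, and discards any mutation that would leave a subfunction with no intact copy. The names `simulate_pair` and `preservation_probability`, and the parameters `z` (subfunctions per locus) and `p_null` (chance that a mutation knocks out a whole copy), are hypothetical.

```python
import random

def simulate_pair(z, p_null, rng):
    """Follow one duplicate pair; return True if both copies become indispensable."""
    intact = [set(range(z)), set(range(z))]   # subfunctions still carried by each copy
    while True:
        copy = rng.randrange(2)
        other = 1 - copy
        if rng.random() < p_null:
            lost = set(intact[copy])          # null mutation: the whole copy is silenced
        else:
            lost = {rng.randrange(z)} & intact[copy]  # one subfunction (may already be gone)
        if not lost <= intact[other]:
            continue                          # would abolish a subfunction: removed by selection
        intact[copy] -= lost
        if not intact[copy]:
            return False                      # one copy fully silenced (nonfunctionalization)
        if (intact[0] - intact[1]) and (intact[1] - intact[0]):
            return True                       # each copy uniquely carries a subfunction

def preservation_probability(z, p_null, trials=5000, seed=1):
    """Fraction of simulated pairs preserved by complementary losses."""
    rng = random.Random(seed)
    return sum(simulate_pair(z, p_null, rng) for _ in range(trials)) / trials

for z in (1, 2, 4, 8):
    print(z, preservation_probability(z, p_null=0.2))
```

Under these assumptions a single-subfunction gene can never be preserved this way, and the preserved fraction rises as `z` grows, in line with the abstract's remark about loci with more than two independently mutable subfunctions.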

  5. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana

    PubMed Central

    Casneuf, Tineke; De Bodt, Stefanie; Raes, Jeroen; Maere, Steven; Van de Peer, Yves

    2006-01-01

    Background Genome analyses have revealed that gene duplication in plants is rampant. Furthermore, many of the duplicated genes seem to have been created through ancient genome-wide duplication events. Recently, we have shown that gene loss is strikingly different for large- and small-scale duplication events and highly biased towards the functional class to which a gene belongs. Here, we study the expression divergence of genes that were created during large- and small-scale gene duplication events by means of microarray data and investigate both the influence of the origin (mode of duplication) and the function of the duplicated genes on expression divergence. Results Duplicates that have been created by large-scale duplication events and that can still be found in duplicated segments have expression patterns that are more correlated than those that were created by small-scale duplications or those that no longer lie in duplicated segments. Moreover, the former tend to have highly redundant or overlapping expression patterns and are mostly expressed in the same tissues, while the latter show asymmetric divergence. In addition, a strong bias in divergence of gene expression was observed towards gene function and the biological process genes are involved in. Conclusion By using microarray expression data for Arabidopsis thaliana, we show that the mode of duplication, the function of the genes involved, and the time since duplication play important roles in the divergence of gene expression and, therefore, in the functional divergence of genes after duplication. PMID:16507168

  6. Recurrent tandem gene duplication gave rise to functionally divergent genes in Drosophila.

    PubMed

    Fan, Chuanzhu; Chen, Ying; Long, Manyuan

    2008-07-01

    Tandem gene duplication is one of the major gene duplication mechanisms in eukaryotes, as illustrated by the prevalence of gene family clusters. Tandem duplicated paralogs usually share the same regulatory element, and as a consequence, they are likely to perform similar biological functions. Here, we provide an example of a newly evolved tandem duplicate acquiring novel functions, which were driven by positive selection. CG32708, CG32706, and CG6999 are 3 clustered genes residing in the X chromosome of Drosophila melanogaster. CG6999 and CG32708 have been examined for their molecular population genetic properties (Thornton and Long 2005). We further investigated the evolutionary forces acting on these genes with greater sample sizes and a broader approach that incorporates between-species divergence, using a wider variety of statistical methods. We explored the possible functional implications by characterizing the tissue-specific and developmental expression patterns of these genes. Sequence comparison of species within the D. melanogaster subgroup reveals that this 3-gene cluster was created by 2 rounds of tandem gene duplication in the last 5 Myr. Based on phylogenetic analysis, CG32708 is clearly the parental copy that is shared by all species. CG32706 appears to have originated in the ancestor of Drosophila simulans and D. melanogaster about 5 Mya, and CG6999 is the newest duplicate that is unique to D. melanogaster. All 3 genes have different expression profiles, and CG6999 has in addition acquired a novel transcript. Biased polymorphism frequency spectrum, linkage disequilibrium, nucleotide substitution, and McDonald-Kreitman analyses suggested that the evolution of CG6999 and CG32706 was driven by positive Darwinian selection.

  7. Gene duplication models for directed networks with limits on growth

    NASA Astrophysics Data System (ADS)

    Enemark, Jakob; Sneppen, Kim

    2007-11-01

    Background: Duplication of genes is important for evolution of molecular networks. Many authors have therefore considered gene duplication as a driving force in shaping the topology of molecular networks. In particular it has been noted that growth via duplication would act as an implicit means of preferential attachment, and thereby provide the observed broad degree distributions of molecular networks. Results: We extend current models of gene duplication and rewiring by including directions and the fact that molecular networks are not a result of unidirectional growth. We introduce upstream sites and downstream shapes to quantify potential links during duplication and rewiring. We find that this in itself generates the observed scaling of transcription factors for genome sites in prokaryotes. The dynamical model can generate a scale-free degree distribution, p(k) ∝ 1/k^γ, with exponent γ = 1 in the non-growing case, and with γ > 1 when the network is growing. Conclusions: We find that duplication of genes followed by substantial recombination of upstream regions could generate features of genetic regulatory networks. Our steady state degree distribution is however too broad to be consistent with data, thereby suggesting that selective pruning acts as a main additional constraint on duplicated genes. Our analysis shows that gene duplication can only be a main cause for the observed broad degree distributions if there are also substantial recombinations between upstream regions of genes.
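
The remark that growth via duplication acts as implicit preferential attachment can be illustrated with a generic duplication-divergence sketch. This is not the directed model with upstream sites and downstream shapes introduced in the record. It is an undirected toy in which a randomly chosen node is copied and each inherited link is kept with probability `keep_prob`; the function name and both parameters are hypothetical.

```python
import random
from collections import Counter

def grow_by_duplication(n_nodes, keep_prob, seed=0):
    """Grow an undirected network by node duplication with partial link inheritance."""
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}                  # start from a single linked pair
    while len(neighbors) < n_nodes:
        parent = rng.randrange(len(neighbors))    # gene chosen for duplication
        child = len(neighbors)                    # label of the new copy
        # The copy inherits each of the parent's links with probability keep_prob.
        inherited = {v for v in neighbors[parent] if rng.random() < keep_prob}
        neighbors[child] = inherited
        for v in inherited:
            neighbors[v].add(child)
    return neighbors

network = grow_by_duplication(n_nodes=2000, keep_prob=0.5)
degree_counts = Counter(len(links) for links in network.values())
for k in sorted(degree_counts)[:10]:
    print(k, degree_counts[k])
```

A node of degree k is a neighbor of k possible parents, so its chance of gaining a link at each step is proportional to k. That is why duplication alone behaves like preferential attachment and yields a broad degree distribution.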

  8. Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data.

    PubMed

    Sanzol, Javier

    2010-05-14

    Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young

  9. Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data

    PubMed Central

    2010-01-01

    Background Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. Results This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to γ-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. Finally, gene expression analysis indicated that 82% of duplicated genes

  10. An Approximation Algorithm for Computing a Parsimonious First Speciation in the Gene Duplication Model

    NASA Astrophysics Data System (ADS)

    Ouangraoua, Aïda; Swenson, Krister M.; Chauve, Cedric

    We consider the following problem: given a forest of gene family trees on a set of genomes, find a first speciation which splits these genomes into two subsets and minimizes the number of gene duplications that happened before this speciation. We call this problem the Minimum Duplication Bipartition Problem. Using a generalization of the Minimum Edge-Cut Problem, known as Submodular Function Minimization, we propose a polynomial time and space 2-approximation algorithm for the Minimum Duplication Bipartition Problem. We illustrate the potential of this algorithm on both synthetic and real data.

  11. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

    PubMed Central

    Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

    2015-01-01

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392

  12. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    PubMed

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Evolution in action: following function in duplicated floral homeotic genes.

    PubMed

    Causier, Barry; Castillo, Rosa; Zhou, Junli; Ingram, Richard; Xue, Yongbiao; Schwarz-Sommer, Zsuzsanna; Davies, Brendan

    2005-08-23

    Gene duplication plays a fundamental role in evolution by providing the genetic material from which novel functions can arise. Newly duplicated genes can be maintained by subfunctionalization (the duplicated genes perform different aspects of the original gene's function) and/or neofunctionalization (one of the genes acquires a novel function). PLENA in Antirrhinum and AGAMOUS in Arabidopsis are the canonical C-function genes that are essential for the specification of reproductive organs. These functionally equivalent genes encode closely related homeotic MADS-box transcription factors. Using genome synteny, we confirm phylogenetic analyses showing that PLENA and AGAMOUS are nonorthologous genes derived from a duplication in a common ancestor. Their respective orthologs, SHATTERPROOF in Arabidopsis and FARINELLI in Antirrhinum, have undergone independent subfunctionalization via changes in regulation and protein function. Surprisingly, the functional divergence between PLENA and FARINELLI, is morphologically manifest in both transgenic Antirrhinum and Arabidopsis. This provides a clear illustration of a random evolutionary trajectory for gene functions after a duplication event. Different members of a duplicated gene pair have retained the primary homeotic functions in different lineages, illustrating the role of chance in evolution. The differential ability of the Antirrhinum genes to promote male or female development provides a striking example of subfunctionalization at the protein level.

  14. Evolution of pigment synthesis pathways by gene and genome duplication in fish

    PubMed Central

    Braasch, Ingo; Schartl, Manfred; Volff, Jean-Nicolas

    2007-01-01

    Background Coloration and color patterning belong to the most diverse phenotypic traits in animals. Particularly, teleost fishes possess more pigment cell types than any other group of vertebrates. As the result of an ancient fish-specific genome duplication (FSGD), teleost genomes might contain more copies of genes involved in pigment cell development than tetrapods. No systematic genomic inventory allowing to test this hypothesis has been drawn up so far for pigmentation genes in fish, and almost nothing is known about the evolution of these genes in different fish lineages. Results Using a comparative genomic approach including phylogenetic reconstructions and synteny analyses, we have studied two major pigment synthesis pathways in teleost fish, the melanin and the pteridine pathways, with respect to different types of gene duplication. Genes encoding three of the four enzymes involved in the synthesis of melanin from tyrosine have been retained as duplicates after the FSGD. In the pteridine pathway, two cases of duplicated genes originating from the FSGD as well as several lineage-specific gene duplications were observed. In both pathways, genes encoding the rate-limiting enzymes, tyrosinase and GTP-cyclohydrolase I (GchI), have additional paralogs in teleosts compared to tetrapods, which have been generated by different modes of duplication. We have also observed a previously unrecognized diversity of gchI genes in vertebrates. In addition, we have found evidence for divergent resolution of duplicated pigmentation genes, i.e., differential gene loss in divergent teleost lineages, particularly in the tyrosinase gene family. Conclusion Mainly due to the FSGD, teleost fishes apparently have a greater repertoire of pigment synthesis genes than any other vertebrate group. Our results support an important role of the FSGD and other types of duplication in the evolution of pigmentation in fish. PMID:17498288

  15. Whole Genome and Tandem Duplicate Retention Facilitated Glucosinolate Pathway Diversification in the Mustard Family

    PubMed Central

    Hofberger, Johannes A.; Lyons, Eric; Edger, Patrick P.; Chris Pires, J.; Eric Schranz, M.

    2013-01-01

    Plants share a common history of successive whole-genome duplication (WGD) events retaining genomic patterns of duplicate gene copies (ohnologs) organized in conserved syntenic blocks. Duplication was often proposed to affect the origin of novel traits during evolution. However, genetic evidence linking WGD to pathway diversification is scarce. We show that WGD and tandem duplication (TD) accelerated genetic versatility of plant secondary metabolism, exemplified with the glucosinolate (GS) pathway in the mustard family. GS biosynthesis is a well-studied trait, employing at least 52 biosynthetic and regulatory genes in the model plant Arabidopsis. In a phylogenomics approach, we identified 67 GS loci in Aethionema arabicum of the tribe Aethionemae, sister group to all mustard family members. All but one of the Arabidopsis GS gene families evolved orthologs in Aethionema and all but one of the orthologous sequence pairs exhibit synteny. The 45% fraction of duplicates among all protein-coding genes in Arabidopsis was increased to 95% and 97% for Arabidopsis and Aethionema GS pathway inventory, respectively. Compared with the 22% average for all protein-coding genes in Arabidopsis, 52% and 56% of Aethionema and Arabidopsis GS loci align to ohnolog copies dating back to the last common WGD event. Although 15% of all Arabidopsis genes are organized in tandem arrays, 45% and 48% of GS loci in Arabidopsis and Aethionema descend from TD, respectively. We describe a sequential combination of TD and WGD events driving gene family extension, thereby expanding the evolutionary playground for functional diversification and thus potential novelty and success. PMID:24171911

  16. Familial lipoprotein lipase deficiency: a case of compound heterozygosity of a novel duplication (R44Kfs*4) and a common mutation (N291S) in the lipoprotein lipase gene.

    PubMed

    Overgaard, Martin; Brasen, Claus Lohman; Svaneby, Dea; Feddersen, Søren; Nybo, Mads

    2013-07-01

    Familial lipoprotein lipase (LPL) deficiency (FLLD) is a rare autosomal recessive genetic disorder caused by homozygous or compound heterozygous mutations in the LPL gene. FLLD individuals usually express an impaired or non-functional LPL enzyme with low or absent triglyceride (TG) hydrolysis activity causing severe hypertriglyceridaemia. Here we report a case of FLLD in a 29-year-old man, who initially presented with eruptive cutaneous xanthomata, elevated plasma TG concentration but no other co-morbidities. Subsequent genetic testing of the patient revealed compound heterozygosity of a novel duplication (p.R44Kfs*4) leading to a premature stop codon in exon 2 and a known mutation (N291S) in exon 5 of the LPL gene. Further biochemical analysis of the patient's postheparin plasma confirmed a reduction of total lipase activity compared with his heterozygous father carrying the common N291S mutation and to a healthy control. Also the patient showed increased (1.85-fold) activity of hepatic lipase (HL), indicating a functional link between HL and LPL. In summary, we report a case of FLLD caused by compound heterozygosity of a new duplication and a common mutation in the LPL gene, resulting in residual LPL activity. With such mutations, individuals may not receive a diagnosis before classical FLLD symptoms appear later in adulthood. Nevertheless, early diagnosis and lipid-lowering treatment may favour a reduced risk of premature cardiovascular disease or acute pancreatitis in such individuals.

  17. Gene duplication is infrequent in the recent evolutionary history of RNA viruses.

    PubMed

    Simon-Loriere, Etienne; Holmes, Edward C

    2013-06-01

    Gene duplication generates genetic novelty and redundancy and is a major mechanism of evolutionary change in bacteria and eukaryotes. To date, however, gene duplication has been reported only rarely in RNA viruses. Using a conservative BLAST approach we systematically screened for the presence of duplicated (i.e., paralogous) proteins in all RNA viruses for which full genome sequences are publicly available. Strikingly, we found only nine significantly supported cases of gene duplication, two of which are newly described here--in the 25 and 26 kDa proteins of Beet necrotic yellow vein virus (genus Benyvirus) and in the U1 and U2 proteins of Wongabel virus (family Rhabdoviridae). Hence, gene duplication has occurred at a far lower frequency in the recent evolutionary history of RNA viruses than in other organisms. Although the rapidity of RNA virus evolution means that older gene duplication events will be difficult to detect through sequence-based analyses alone, it is likely that specific features of RNA virus biology, and particularly intrinsic constraints on genome size, reduce the likelihood of the fixation and maintenance of duplicated genes.

  18. The probability of preservation of a newly arisen gene duplicate.

    PubMed

    Lynch, M; O'Hely, M; Walsh, B; Force, A

    2001-12-01

    Newly emerging data from genome sequencing projects suggest that gene duplication, often accompanied by genetic map changes, is a common and ongoing feature of all genomes. This raises the possibility that differential expansion/contraction of various genomic sequences may be just as important a mechanism of phenotypic evolution as changes at the nucleotide level. However, the population-genetic mechanisms responsible for the success vs. failure of newly arisen gene duplicates are poorly understood. We examine the influence of various aspects of gene structure, mutation rates, degree of linkage, and population size (N) on the joint fate of a newly arisen duplicate gene and its ancestral locus. Unless there is active selection against duplicate genes, the probability of permanent establishment of such genes is usually no less than 1/(4N) (half of the neutral expectation), and it can be orders of magnitude greater if neofunctionalizing mutations are common. The probability of a map change (reassignment of a key function of an ancestral locus to a new chromosomal location) induced by a newly arisen duplicate is also generally >1/(4N) for unlinked duplicates, suggesting that recurrent gene duplication and alternative silencing may be a common mechanism for generating microchromosomal rearrangements responsible for postreproductive isolating barriers among species. Relative to subfunctionalization, neofunctionalization is expected to become a progressively more important mechanism of duplicate-gene preservation in populations with increasing size. However, even in large populations, the probability of neofunctionalization scales only with the square of the selective advantage. Tight linkage also influences the probability of duplicate-gene preservation, increasing the probability of subfunctionalization but decreasing the probability of neofunctionalization.
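
    The 1/(4N) bound quoted above can be put next to the neutral expectation it is compared with. The sketch below is only an illustration: it assumes a diploid population, so that a newly arisen neutral variant fixes with probability 1/(2N), and the function names are hypothetical rather than taken from the paper.

```python
def neutral_fixation_probability(n: int) -> float:
    """Fixation probability of a new neutral variant, assuming a diploid population of size n."""
    return 1.0 / (2 * n)


def duplicate_preservation_lower_bound(n: int) -> float:
    """Lower bound 1/(4N) on duplicate establishment quoted in the abstract."""
    return 1.0 / (4 * n)


if __name__ == "__main__":
    # Population sizes chosen only for illustration.
    for n in (100, 10_000, 1_000_000):
        neutral = neutral_fixation_probability(n)
        bound = duplicate_preservation_lower_bound(n)
        print(f"N={n}: neutral={neutral:.3e}, bound={bound:.3e}, ratio={bound / neutral:.2f}")
```

    For every N the bound is half of the neutral value, which is the "half of the neutral expectation" relation stated in the abstract.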

  19. The probability of preservation of a newly arisen gene duplicate.

    PubMed Central

    Lynch, M; O'Hely, M; Walsh, B; Force, A

    2001-01-01

    Newly emerging data from genome sequencing projects suggest that gene duplication, often accompanied by genetic map changes, is a common and ongoing feature of all genomes. This raises the possibility that differential expansion/contraction of various genomic sequences may be just as important a mechanism of phenotypic evolution as changes at the nucleotide level. However, the population-genetic mechanisms responsible for the success vs. failure of newly arisen gene duplicates are poorly understood. We examine the influence of various aspects of gene structure, mutation rates, degree of linkage, and population size (N) on the joint fate of a newly arisen duplicate gene and its ancestral locus. Unless there is active selection against duplicate genes, the probability of permanent establishment of such genes is usually no less than 1/(4N) (half of the neutral expectation), and it can be orders of magnitude greater if neofunctionalizing mutations are common. The probability of a map change (reassignment of a key function of an ancestral locus to a new chromosomal location) induced by a newly arisen duplicate is also generally >1/(4N) for unlinked duplicates, suggesting that recurrent gene duplication and alternative silencing may be a common mechanism for generating microchromosomal rearrangements responsible for postreproductive isolating barriers among species. Relative to subfunctionalization, neofunctionalization is expected to become a progressively more important mechanism of duplicate-gene preservation in populations with increasing size. However, even in large populations, the probability of neofunctionalization scales only with the square of the selective advantage. Tight linkage also influences the probability of duplicate-gene preservation, increasing the probability of subfunctionalization but decreasing the probability of neofunctionalization. PMID:11779815

  20. Processes of fungal proteome evolution and gain of function: gene duplication and domain rearrangement

    NASA Astrophysics Data System (ADS)

    Cohen-Gihon, Inbar; Sharan, Roded; Nussinov, Ruth

    2011-06-01

    During evolution, organisms have gained functional complexity mainly by modifying and improving existing functioning systems rather than creating new ones ab initio. Here we explore the interplay between two processes which during evolution have had major roles in the acquisition of new functions: gene duplication and protein domain rearrangements. We consider four possible evolutionary scenarios: gene families that have undergone none of these event types; only gene duplication; only domain rearrangement, or both events. We characterize each of the four evolutionary scenarios by functional attributes. Our analysis of ten fungal genomes indicates that at least for the fungi clade, species significantly appear to gain complexity by gene duplication accompanied by the expansion of existing domain architectures via rearrangements. We show that paralogs gaining new domain architectures via duplication tend to adopt new functions compared to paralogs that preserve their domain architectures. We conclude that evolution of protein families through gene duplication and domain rearrangement is correlated with their functional properties. We suggest that in general, new functions are acquired via the integration of gene duplication and domain rearrangements rather than each process acting independently.

  1. Phylogenomics: Gene Duplication, Unrecognized Paralogy and Outgroup Choice

    PubMed Central

    Roy, Scott William

    2009-01-01

    Comparative genomics has revealed the ubiquity of gene and genome duplication and subsequent gene loss. In the case of gene duplication and subsequent loss, gene trees can differ from species trees, thus frequent gene duplication poses a challenge for reconstruction of species relationships. Here I address the case of multi-gene sets of putative orthologs that include some unrecognized paralogs due to ancestral gene duplication, and ask how outgroups should best be chosen to reduce the degree of non-species tree (NST) signal. Consideration of expected internal branch lengths supports several conclusions: (i) when a single outgroup is used, the degree of NST signal arising from gene duplication is either independent of outgroup choice, or is minimized by use of a maximally closely related post-duplication (MCRPD) outgroup; (ii) when two outgroups are used, NST signal is minimized by using one MCRPD outgroup, while the position of the second outgroup is of lesser importance; and (iii) when two outgroups are used, the ability to detect gene trees that are inconsistent with known aspects of the species tree is maximized by use of one MCRPD, and is either independent of the position of the second outgroup, or is maximized for a more distantly related second outgroup. Overall, these results generalize the utility of closely-related outgroups for phylogenetic analysis. PMID:19234600

  2. Three neuropeptide Y receptor genes in the spiny dogfish, Squalus acanthias, support en bloc duplications in early vertebrate evolution.

    PubMed

    Salaneck, Erik; Ardell, David H; Larson, Earl T; Larhammar, Dan

    2003-08-01

    It has been debated whether the increase in gene number during early vertebrate evolution was due to multiple independent gene duplications or synchronous duplications of many genes. We describe here the cloning of three neuropeptide Y (NPY) receptor genes belonging to the Y1 subfamily in the spiny dogfish, Squalus acanthias, a cartilaginous fish. The three genes are orthologs of the mammalian subtypes Y1, Y4, and Y6, which are located in paralogous gene regions on different chromosomes in mammals. Thus, these genes arose by duplications of a chromosome region before the radiation of gnathostomes (jawed vertebrates). Estimates of duplication times from linearized trees, together with evidence from other gene families, support two rounds of chromosome duplications or tetraploidizations early in vertebrate evolution. The anatomical distribution of mRNA was determined by reverse-transcriptase PCR and was found to differ from mammals, suggesting differential functional diversification of the new gene copies during the radiation of the vertebrate classes.

  3. Gene duplication, genome duplication, and the functional diversification of vertebrate globins

    PubMed Central

    Storz, Jay F.; Opazo, Juan C.; Hoffmann, Federico G.

    2015-01-01

    The functional diversification of the vertebrate globin gene superfamily provides an especially vivid illustration of the role of gene duplication and whole-genome duplication in promoting evolutionary innovation. For example, key globin proteins that evolved specialized functions in various aspects of oxidative metabolism and oxygen signaling pathways (hemoglobin [Hb], myoglobin [Mb], and cytoglobin [Cygb]) trace their origins to two whole-genome duplication events in the stem lineage of vertebrates. The retention of the proto-Hb and Mb genes in the ancestor of jawed vertebrates permitted a physiological division of labor between the oxygen-carrier function of Hb and the oxygen-storage function of Mb. In the Hb gene lineage, a subsequent tandem gene duplication gave rise to the proto α- and β-globin genes, which permitted the formation of multimeric Hbs composed of unlike subunits (α2β2). The evolution of this heteromeric quaternary structure was central to the emergence of Hb as a specialized oxygen-transport protein because it provided a mechanism for cooperative oxygen-binding and allosteric regulatory control. Subsequent rounds of duplication and divergence have produced diverse repertoires of α- and β-like globin genes that are ontogenetically regulated such that functionally distinct Hb isoforms are expressed during different stages of prenatal development and postnatal life. In the ancestor of jawless fishes, the proto Mb and Hb genes appear to have been secondarily lost, and the Cygb homolog evolved a specialized respiratory function in blood-oxygen transport. Phylogenetic and comparative genomic analyses of the vertebrate globin gene superfamily have revealed numerous instances in which paralogous globins have convergently evolved similar expression patterns and/or similar functional specializations in different organismal lineages. PMID:22846683

  4. Gene duplication, genome duplication, and the functional diversification of vertebrate globins.

    PubMed

    Storz, Jay F; Opazo, Juan C; Hoffmann, Federico G

    2013-02-01

    The functional diversification of the vertebrate globin gene superfamily provides an especially vivid illustration of the role of gene duplication and whole-genome duplication in promoting evolutionary innovation. For example, key globin proteins that evolved specialized functions in various aspects of oxidative metabolism and oxygen signaling pathways (hemoglobin [Hb], myoglobin [Mb], and cytoglobin [Cygb]) trace their origins to two whole-genome duplication events in the stem lineage of vertebrates. The retention of the proto-Hb and Mb genes in the ancestor of jawed vertebrates permitted a physiological division of labor between the oxygen-carrier function of Hb and the oxygen-storage function of Mb. In the Hb gene lineage, a subsequent tandem gene duplication gave rise to the proto α- and β-globin genes, which permitted the formation of multimeric Hbs composed of unlike subunits (α2β2). The evolution of this heteromeric quaternary structure was central to the emergence of Hb as a specialized oxygen-transport protein because it provided a mechanism for cooperative oxygen-binding and allosteric regulatory control. Subsequent rounds of duplication and divergence have produced diverse repertoires of α- and β-like globin genes that are ontogenetically regulated such that functionally distinct Hb isoforms are expressed during different stages of prenatal development and postnatal life. In the ancestor of jawless fishes, the proto Mb and Hb genes appear to have been secondarily lost, and the Cygb homolog evolved a specialized respiratory function in blood-oxygen transport. Phylogenetic and comparative genomic analyses of the vertebrate globin gene superfamily have revealed numerous instances in which paralogous globins have convergently evolved similar expression patterns and/or similar functional specializations in different organismal lineages.

  5. Gene duplication and the origins of morphological complexity in pancrustacean eyes, a genomic approach

    PubMed Central

    2010-01-01

    Background Duplication and divergence of genes and genetic networks is hypothesized to be a major driver of the evolution of complexity and novel features. Here, we examine the history of genes and genetic networks in the context of eye evolution by using new approaches to understand patterns of gene duplication during the evolution of metazoan genomes. We hypothesize that 1) genes involved in eye development and phototransduction have duplicated and are retained at higher rates in animal clades that possess more distinct types of optical design; and 2) genes with functional relationships were duplicated and lost together, thereby preserving genetic networks. To test these hypotheses, we examine the rates and patterns of gene duplication and loss evident in 19 metazoan genomes, including that of Daphnia pulex - the first completely sequenced crustacean genome. This is of particular interest because the pancrustaceans (hexapods+crustaceans) have more optical designs than any other major clade of animals, allowing us to test specifically whether the high amount of disparity in pancrustacean eyes is correlated with a higher rate of duplication and retention of vision genes. Results Using protein predictions from 19 metazoan whole-genome projects, we found all members of 23 gene families known to be involved in eye development or phototransduction and deduced their phylogenetic relationships. This allowed us to estimate the number and timing of gene duplication and loss events in these gene families during animal evolution. When comparing duplication/retention rates of these genes, we found that the rate was significantly higher in pancrustaceans than in either vertebrates or non-pancrustacean protostomes. Comparing patterns of co-duplication across Metazoa showed that while these eye-genes co-duplicate at a significantly higher rate than those within a randomly shuffled matrix, many genes with known functional relationships in model organisms did not co-duplicate more

  6. Maintenance and Loss of Duplicated Genes by Dosage Subfunctionalization

    PubMed Central

    Gout, Jean-Francois; Lynch, Michael

    2015-01-01

    Whole-genome duplications (WGDs) have contributed to gene-repertoire enrichment in many eukaryotic lineages. However, most duplicated genes are eventually lost and it is still unclear why some duplicated genes are evolutionarily successful whereas others quickly turn into pseudogenes. Here, we show that dosage constraints are major factors opposing post-WGD gene loss in several Paramecium species that share a common ancestral WGD. We propose a model where a majority of WGD-derived duplicates preserve their ancestral function and are retained to produce enough of the proteins performing this same ancestral function. Under this model, the expression level of individual duplicated genes can evolve neutrally as long as they maintain a roughly constant summed expression, and this allows random genetic drift toward uneven contributions of the two copies to total expression. Our analysis suggests that once a high level of imbalance is reached, which can require substantial lengths of time, the copy with the lowest expression level contributes a small enough fraction of the total expression that selection no longer opposes its loss. Extension of our analysis to yeast species sharing a common ancestral WGD yields similar results, suggesting that duplicated-gene retention for dosage constraints followed by divergence in expression level and eventual deterministic gene loss might be a universal feature of post-WGD evolution. PMID:25908670
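
    The drift step of the model described above can be sketched as a bounded random walk: the summed expression of the two copies is held constant, one copy's share of it drifts, and the walk stops once the weaker copy contributes less than some threshold. This is a toy illustration, not the authors' analysis; the Gaussian step size, the threshold and the function name are all hypothetical choices.

```python
import random
from typing import Optional


def steps_until_loss_tolerated(
    step_sd: float = 0.05,         # hypothetical size of one neutral change in expression share
    loss_threshold: float = 0.10,  # hypothetical share below which losing the weaker copy is tolerated
    max_steps: int = 1_000_000,
    seed: int = 0,
) -> Optional[int]:
    """Count drift steps until the weaker copy's share of total expression drops below the threshold.

    Returns None if the threshold is not reached within max_steps.
    """
    rng = random.Random(seed)
    share = 0.5  # copy A's share; copy B has 1 - share, so the sum stays constant
    for step in range(1, max_steps + 1):
        # Neutral change in the split between copies, clamped to the valid range [0, 1].
        share = min(1.0, max(0.0, share + rng.gauss(0.0, step_sd)))
        if min(share, 1.0 - share) < loss_threshold:
            return step
    return None


if __name__ == "__main__":
    waits = [steps_until_loss_tolerated(seed=s) for s in range(5)]
    print("steps before loss becomes tolerable:", waits)
```

    Smaller steps or a stricter threshold lengthen the wait, in line with the abstract's remark that reaching a high level of imbalance can take a substantial length of time.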

  7. Differential regulation of the duplicated fabp7, fabp10 and fabp11 genes of zebrafish by peroxisome proliferator activated receptors.

    PubMed

    Laprairie, Robert B; Denovan-Wright, Eileen M; Wright, Jonathan M

    2017-11-01

    In the duplication-degeneration-complementation model, duplicated gene-pairs undergo nonfunctionalization (loss from the genome), subfunctionalization (the functions of the ancestral gene are sub-divided between duplicate genes), or neofunctionalization (one of the duplicate genes acquires a new function). These processes occur by loss or gain of regulatory elements in gene promoters. Fatty acid-binding proteins (Fabp) belong to a multigene family composed of orthologous proteins that are highly conserved in sequence and function, but differ in their gene regulation. We previously reported that the zebrafish fabp1a, fabp1b.1, and fabp1b.2 promoters underwent subfunctionalization of PPAR responsiveness. Here, we describe the regulation at the duplicated zebrafish fabp7a/fabp7b, fabp10a/fabp10b and fabp11a/fabp11b gene promoters. Differential control at the duplicated fabp promoters was assessed by DNA sequence analysis, responsiveness to PPAR-isoform specific agonists and NF-κB p50 antagonists in zebrafish liver and intestine explant tissue, and in HEK293A cells transfected with fabp promoter-reporter constructs. Each zebrafish fabp gene displayed unique transcriptional regulation compared to its paralogous duplicate. This work provides a framework to account for the evolutionary trajectories that led to the high retention (57%) of duplicated fabp genes in the zebrafish genome compared to only ~3% of all duplicated genes in the zebrafish genome.

  8. A Young Drosophila Duplicate Gene Plays Essential Roles in Spermatogenesis by Regulating Several Y-Linked Male Fertility Genes

    PubMed Central

    Yang, Shuang; Jiang, Yu; Chen, Yuan; Zhao, Ruoping; Zhang, Yue; Zhang, Guojie; Dong, Yang; Yu, Haijing; Zhou, Qi; Wang, Wen

    2010-01-01

    Gene duplication is supposed to be the major source for genetic innovations. However, how a new duplicate gene acquires functions by integrating into a pathway and results in adaptively important phenotypes has remained largely unknown. Here, we investigated the biological roles and the underlying molecular mechanism of the young kep1 gene family in the Drosophila melanogaster species subgroup to understand the origin and evolution of new genes with new functions. Sequence and expression analysis demonstrates that one of the new duplicates, nsr (novel spermatogenesis regulator), exhibits positive selection signals and novel subcellular localization pattern. Targeted mutagenesis and whole-transcriptome sequencing analysis provide evidence that nsr is required for male reproduction associated with sperm individualization, coiling, and structural integrity of the sperm axoneme via regulation of several Y chromosome fertility genes post-transcriptionally. The absence of nsr-like expression pattern and the presence of the corresponding cis-regulatory elements of the parental gene kep1 in the pre-duplication species Drosophila yakuba indicate that kep1 might not be ancestrally required for male functions and that nsr possibly has experienced the neofunctionalization process, facilitated by changes of trans-regulatory repertories. These findings not only present a comprehensive picture about the evolution of a new duplicate gene but also show that recently originated duplicate genes can acquire multiple biological roles and establish novel functional pathways by regulating essential genes. PMID:21203494

  9. Divergence of recently duplicated Mγ-type MADS-box genes in Petunia.

    PubMed

    Bemer, Marian; Gordon, Jonathan; Weterings, Koen; Angenent, Gerco C

    2010-02-01

    The MADS-box transcription factor family has expanded considerably in plants via gene and genome duplications and can be subdivided into type I and MIKC-type genes. The two gene classes show a different evolutionary history. Whereas the MIKC-type genes originated during ancient genome duplications, as well as during more recent events, the type I loci appear to experience high turnover with many recent duplications. This different mode of origin also suggests a different fate for the type I duplicates, which are thought to have a higher chance to become silenced or lost from the genome. To get more insight into the evolution of the type I MADS-box genes, we isolated nine type I genes from Petunia, which belong to the Mγ subclass, and investigated the divergence of their coding and regulatory regions. The isolated genes could be subdivided into two categories: two genes were highly similar to Arabidopsis Mγ-type genes, whereas the other seven genes showed less similarity to Arabidopsis genes and originated more recently. Two of the recently duplicated genes were found to contain deleterious mutations in their coding regions, and expression analysis revealed that a third paralog was silenced by mutations in its regulatory region. However, in addition to the three genes that were subjected to nonfunctionalization, we also found evidence for neofunctionalization of one of the Petunia Mγ-type genes. Our study shows a rapid divergence of recently duplicated Mγ-type MADS-box genes and suggests that redundancy among type I paralogs may be less common than expected.

  10. Comparative Evolution of Duplicated Ddx3 Genes in Teleosts: Insights from Japanese Flounder, Paralichthys olivaceus

    PubMed Central

    Wang, Zhongkai; Liu, Wei; Song, Huayu; Wang, Huizhen; Liu, Jinxiang; Zhao, Haitao; Du, Xinxin; Zhang, Quanqi

    2015-01-01

    Following the two rounds of whole-genome duplication that occurred during deuterostome evolution, a third genome duplication event occurred in the stem lineage of ray-finned fishes. This teleost-specific genome duplication is thought to be responsible for the biological diversification of ray-finned fishes. DEAD-box polypeptide 3 (DDX3) belongs to the DEAD-box RNA helicase family. Although their functions in humans have been well studied, limited information is available regarding their function in teleosts. In this study, two teleost Ddx3 genes were first identified in the transcriptome of Japanese flounder (Paralichthys olivaceus). We confirmed that the two genes originated from teleost-specific genome duplication through synteny and phylogenetic analysis. Additionally, comparative analysis of genome structure, molecular evolution rate, and expression pattern of the two genes in Japanese flounder revealed evidence of subfunctionalization of the duplicated Ddx3 genes in teleosts. Thus, the results of this study reveal novel insights into the evolution of the teleost Ddx3 genes and constitute important groundwork for further research on this gene family. PMID:26109358

  11. High fitness costs and instability of gene duplications reduce rates of evolution of new genes by duplication-divergence mechanisms.

    PubMed

    Adler, Marlen; Anjum, Mehreen; Berg, Otto G; Andersson, Dan I; Sandegren, Linus

    2014-06-01

    An important mechanism for the generation of new genes is duplication-divergence of existing genes. Duplication-divergence includes several different submodels, such as subfunctionalization, where after accumulation of neutral mutations the original function is distributed between two partially functional and complementary genes, and neofunctionalization, where a new function evolves in one of the duplicated copies while the old function is maintained in another copy. The likelihood of these mechanisms depends on the longevity of the duplicated state, which in turn depends on the fitness cost and genetic stability of the duplications. Here, we determined the fitness cost and stability of defined gene duplications/amplifications on a low copy number plasmid. Our experimental results show that the costs of carrying extra gene copies are substantial and that each additional kilobase pair of DNA reduces fitness by approximately 0.15%. Furthermore, gene amplifications are highly unstable and rapidly segregate to lower copy numbers in the absence of selection. Mathematical modeling shows that the fitness costs and instability strongly reduce the likelihood of both sub- and neofunctionalization, but that these effects can be offset by positive selection for novel beneficial functions.
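
    To give a feel for the reported cost of roughly 0.15% per additional kilobase pair, the sketch below converts extra DNA into a relative fitness and follows carriers competing against non-carriers. It rests on assumptions that are not in the abstract: costs compound multiplicatively per kbp, competition is a simple two-type asexual model, and segregational loss is ignored. The function names are hypothetical.

```python
COST_PER_KBP = 0.0015  # ~0.15% fitness reduction per extra kbp, as reported in the abstract


def relative_fitness(extra_kbp: float, cost_per_kbp: float = COST_PER_KBP) -> float:
    """Fitness relative to a non-carrier, assuming the per-kbp cost compounds multiplicatively."""
    return (1.0 - cost_per_kbp) ** extra_kbp


def carrier_frequency(start_freq: float, extra_kbp: float, generations: int) -> float:
    """Carrier frequency after the given generations of competition with non-carriers.

    Assumes two asexual types, no selection for the duplicate and no segregational loss.
    """
    advantage = relative_fitness(extra_kbp) ** generations
    return start_freq * advantage / (start_freq * advantage + (1.0 - start_freq))


if __name__ == "__main__":
    # Duplication sizes chosen only for illustration.
    for kbp in (1, 10, 50):
        w = relative_fitness(kbp)
        freq = carrier_frequency(0.5, kbp, 100)
        print(f"{kbp} kbp extra: relative fitness {w:.4f}, carrier frequency after 100 generations {freq:.3f}")
```

    Under these assumptions even a modest duplication is steadily outcompeted unless selection favours it, which matches the abstract's point that costs shorten the lifetime of the duplicated state.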

  12. Modes of Gene Duplication Contribute Differently to Genetic Novelty and Redundancy, but Show Parallels across Divergent Angiosperms

    PubMed Central

    Wang, Yupeng; Wang, Xiyin; Tang, Haibao; Tan, Xu; Ficklin, Stephen P.; Feltus, F. Alex; Paterson, Andrew H.

    2011-01-01

    Background Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. Results In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. Conclusion Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution. PMID:22164235

  13. Modes of gene duplication contribute differently to genetic novelty and redundancy, but show parallels across divergent angiosperms.

    PubMed

    Wang, Yupeng; Wang, Xiyin; Tang, Haibao; Tan, Xu; Ficklin, Stephen P; Feltus, F Alex; Paterson, Andrew H

    2011-01-01

    Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution.

  14. Explosive Tandem and Segmental Duplications of Multigenic Families in Eucalyptus grandis

    PubMed Central

    Li, Qiang; Yu, Hong; Cao, Phi Bang; Fawal, Nizar; Mathé, Catherine; Azar, Sahar; Cassan-Wang, Hua; Myburg, Alexander A.; Grima-Pettenati, Jacqueline; Marque, Christiane; Teulières, Chantal; Dunand, Christophe

    2015-01-01

    Plants contain a large number of genes belonging to numerous multigenic families whose size evolution reflects some functional constraints. Sequences from eight multigenic families involved in biotic and abiotic responses were analyzed in Eucalyptus grandis and compared with Arabidopsis thaliana. Two transcription factor families (APETALA 2 (AP2)/ethylene responsive factor and GRAS), two auxin transporter families (PIN-FORMED and AUX/LAX), two oxidoreductase families (ascorbate peroxidases [APx] and Class III peroxidases [CIII Prx]), and two families of protective molecules (late embryogenesis abundant [LEA] and DNAj) were annotated in an expert and exhaustive manner. Many recent tandem duplications, leading to the emergence of species-specific gene clusters and an explosion in gene numbers, were observed for the AP2, GRAS, LEA, PIN, and CIII Prx families in E. grandis, while the APx, AUX/LAX, and DNAj families are conserved between species. Although no direct evidence has yet demonstrated the roles of these recently duplicated genes in E. grandis, their expansion could indicate an involvement in the morphological and physiological characteristics of E. grandis and may be a key factor in the survival of this nondormant species. Global analysis of key families would be a good criterion for evaluating the capacity of organisms to adapt to environmental variation. PMID:25769696

  15. Dosage, duplication, and diploidization: clarifying the interplay of multiple models for duplicate gene evolution over time.

    PubMed

    Conant, Gavin C; Birchler, James A; Pires, J Chris

    2014-06-01

    Requirements to maintain dosage balance shape many genome-scale patterns in organisms, including the resolution of whole genome duplications (WGD), as well as the varied effects of aneuploidy, segmental duplications, tandem duplications, gene copy number variations (CNV), and epigenetic marks. Like neofunctionalization and subfunctionalization, the impact of absolute and relative dosage varies over time. These variations are of particular importance in understanding the role of dosage in the evolution of polyploid organisms. Numerous investigations have found the consequences of polyploidy remain distinct from small-scale duplications (SSD). This observation is significant as all flowering plants have experienced at least two ancient polyploid events, and many angiosperm lineages have undergone additional rounds of polyploidy. Intriguingly, recent studies indicate a link between how epigenetic marks in recent allopolyploids may induce immediate changes in gene expression and the longer-term patterns of biased fractionation and chromosomal evolution. We argue that dosage effects represent one aspect of an emerging pluralistic framework, a framework that will use biophysics, genomic technologies, and systems-level models of cells to broaden our view of how genomes evolve. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Concomitant Duplications of Opioid Peptide and Receptor Genes before the Origin of Jawed Vertebrates

    PubMed Central

    Sundström, Görel; Dreborg, Susanne; Larhammar, Dan

    2010-01-01

    Background: The opioid system is involved in reward and pain mechanisms and in mammals consists of four receptors and several peptides. The peptides are derived from four prepropeptide genes, PENK, PDYN, PNOC and POMC, encoding enkephalins, dynorphins, orphanin/nociceptin and beta-endorphin, respectively. Previously we have described how two rounds of genome doubling (2R) before the origin of jawed vertebrates formed the receptor family. Methodology/Principal Findings: Opioid peptide gene family members were investigated using a combination of sequence-based phylogeny and chromosomal locations of the peptide genes in various vertebrates. Several adjacent gene families were investigated similarly. The results show that the ancestral peptide gene gave rise to two additional copies in the genome doublings. The fourth member was generated by a local gene duplication, as the genes encoding POMC and PNOC are located on the same chromosome in the chicken genome and all three teleost genomes that we have studied. A translocation has disrupted this synteny in mammals. The PDYN gene seems to have been lost in chicken, but not in zebra finch. Duplicates of some peptide genes have arisen in the teleost fishes. Within the prepropeptide precursors, peptides have been lost or gained in different lineages. Conclusions/Significance: The ancestral peptide and receptor genes were located on the same chromosome and were thus duplicated concomitantly. However, genetic linkage was subsequently lost. In conclusion, the system of opioid peptides and receptors was largely formed by the genome doublings that took place early in vertebrate evolution. PMID:20463905

  17. Assessment and Reconstruction of Novel HSP90 Genes: Duplications, Gains and Losses in Fungal and Animal Lineages

    PubMed Central

    Pantzartzi, Chrysoula N.; Drosopoulou, Elena; Scouras, Zacharias G.

    2013-01-01

    Hsp90s, members of the Heat Shock Protein class, protect the structure and function of proteins and play a significant role in cellular homeostasis and signal transduction. To determine the number of hsp90 gene copies and encoded proteins in fungal and animal lineages and, through that, the key duplication events that this family has undergone, we collected and evaluated Hsp90 protein sequences and corresponding Expressed Sequence Tags and analyzed available genomes from various taxa. We provide evidence for duplication events affecting either single species or wider taxonomic groups. With regard to Fungi, duplicated genes have been detected in several lineages. In invertebrates, we demonstrate key duplication events in certain clades of Arthropoda and Mollusca, and a possible gene loss event in a hymenopteran family. Finally, we infer that the duplication event responsible for the two (a and b) isoforms in vertebrates probably occurred shortly after the split of Hyperoartia and Gnathostomata. PMID:24066039

  18. A salmonid EST genomic study: genes, duplications, phylogeny and microarrays

    USDA-ARS's Scientific Manuscript database

    Background: Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most wide...

  19. Molecular evolution accompanying functional divergence of duplicated genes along the plant starch biosynthesis pathway.

    PubMed

    Nougué, Odrade; Corbi, Jonathan; Ball, Steven G; Manicacci, Domenica; Tenaillon, Maud I

    2014-05-15

    Starch is the main source of carbon storage in the Archaeplastida. The starch biosynthesis pathway (SBP) emerged from cytosolic glycogen metabolism shortly after plastid endosymbiosis and was redirected to the plastid stroma during the green lineage divergence. The SBP is a complex network of genes, most of which are members of large multigene families. While some gene duplications occurred in the Archaeplastida ancestor, most were generated during the SBP redirection process, and the remaining few paralogs were generated through compartmentalization or tissue specialization during the evolution of the land plants. In the present study, we tested models of duplicated gene evolution in order to understand the evolutionary forces that have led to the development of the SBP in angiosperms. We combined phylogenetic analyses and tests on the rates of evolution along branches emerging from major duplication events in six gene families encoding SBP enzymes. We found evidence of positive selection along branches following cytosolic or plastidial specialization in two starch phosphorylases and identified numerous residues that exhibited changes in volume, polarity or charge. Functional specializations of starch synthases and of branching and debranching enzymes were also accompanied by accelerated evolution. However, none of the sites targeted by selection corresponded to known functional domains, catalytic or regulatory. Interestingly, among the 13 duplications tested, 7 exhibited evidence of positive selection in both branches emerging from the duplication, 2 in only one branch, and 4 in none of the branches. The majority of duplications were followed by accelerated evolution targeting specific residues along both branches. This pattern was consistent with the optimization of the two sub-functions originally fulfilled by the ancestral gene before duplication. Our results thereby provide strong support to the so-called "Escape from Adaptive Conflict" (EAC) model. Because none of the

  20. On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.

    PubMed

    Kordi, Misagh; Bansal, Mukul S

    2017-01-01

    Duplication-Transfer-Loss (DTL) reconciliation has emerged as a powerful technique for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation takes as input a gene family phylogeny and the corresponding species phylogeny, and reconciles the two by postulating speciation, gene duplication, horizontal gene transfer, and gene loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. However, gene trees are frequently non-binary. With such non-binary gene trees, the reconciliation problem seeks to find a binary resolution of the gene tree that minimizes the reconciliation cost. Given the prevalence of non-binary gene trees, many efficient algorithms have been developed for this problem in the context of the simpler Duplication-Loss (DL) reconciliation model. Yet, no efficient algorithms exist for DTL reconciliation with non-binary gene trees and the complexity of the problem remains unknown. In this work, we resolve this open question by showing that the problem is, in fact, NP-hard. Our reduction applies to both the dated and undated formulations of DTL reconciliation. By resolving this long-standing open problem, this work will spur the development of both exact and heuristic algorithms for this important problem.
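
    The reconciliation idea described above can be illustrated with a minimal sketch. It covers only the simpler Duplication-Loss (DL) setting with binary trees, using the classic lowest-common-ancestor (LCA) mapping; it is not the DTL algorithm or the NP-hardness reduction from the paper, and it does not handle transfers or non-binary gene trees. The tree encoding (nested tuples) and all function names are assumptions made for illustration.

```python
# LCA-mapping sketch for the Duplication-Loss (DL) model on binary trees.
# Trees are nested tuples with string leaves; all names are hypothetical.

def species_parents(tree, parent=None, out=None):
    """Map every species-tree node to its parent (the root maps to None)."""
    if out is None:
        out = {}
    out[tree] = parent
    if isinstance(tree, tuple):
        for child in tree:
            species_parents(child, tree, out)
    return out

def ancestors(node, parents):
    """Return the path from a node up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path

def lca(a, b, parents):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(a, parents))
    for node in ancestors(b, parents):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def count_duplications(gene_tree, parents, gene_to_species):
    """Return (species node the gene subtree maps to, duplication count)."""
    if not isinstance(gene_tree, tuple):  # leaf gene
        return gene_to_species[gene_tree], 0
    left, right = gene_tree
    m_left, d_left = count_duplications(left, parents, gene_to_species)
    m_right, d_right = count_duplications(right, parents, gene_to_species)
    m_here = lca(m_left, m_right, parents)
    # A node is a duplication if it maps to the same species node as a child.
    is_dup = m_here in (m_left, m_right)
    return m_here, d_left + d_right + int(is_dup)

species_tree = (("A", "B"), "C")
parents = species_parents(species_tree)
genes = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
_, dups_conflicting = count_duplications((("a1", "b1"), ("a2", "c1")), parents, genes)
_, dups_congruent = count_duplications((("a1", "b1"), "c1"), parents, genes)
```

    In this toy input, the gene tree that places a second copy from species A next to the gene from C needs one duplication at its root, while the gene tree that matches the species tree needs none.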

  1. Evidence of duplicated Hox genes in the most recent common ancestor of extant scorpions.

    PubMed

    Sharma, Prashant P; Santiago, Marc A; González-Santillán, Edmundo; Monod, Lionel; Wheeler, Ward C

    2015-01-01

    Scorpions (order Scorpiones) are unusual among arthropods, both for the extreme heteronomy of their bauplan and for the high gene family turnover exhibited in their genomes. These phenomena appear to be correlated, as two scorpion species have been shown to possess nearly twice the number of Hox genes present in most arthropods. Segmentally offset anterior expression boundaries of a subset of Hox paralogs have been shown to correspond to transitions in segmental identities in the scorpion posterior tagmata, suggesting that posterior heteronomy in scorpions may have been achieved by neofunctionalization of Hox paralogs. However, both the first scorpion genome sequenced and the developmental genetic data are based on exemplars of Buthidae, one of 19 families of scorpions. It is therefore not known whether Hox paralogy is limited to Buthidae or widespread among scorpions. We surveyed 24 high throughput transcriptomes and the single whole genome available for scorpions, in order to test the prediction that Hox gene duplications are common to the order. We used gene tree parsimony to infer whether the paralogy was consistent with a duplication event in the scorpion common ancestor. Here we show that duplicated Hox genes in non-buthid scorpions occur in six of the ten Hox classes. Gene tree topologies and parsimony-based reconciliation of the gene trees are consistent with a duplication event in the most recent common ancestor of scorpions. These results suggest that a Hox paralogy, and by extension the model of posterior patterning established in a buthid, can be extended to non-Buthidae scorpions.

  2. Paralogue Interference Affects the Dynamics after Gene Duplication.

    PubMed

    Kaltenegger, Elisabeth; Ober, Dietrich

    2015-12-01

    Proteins tend to form homomeric complexes of identical subunits, which act as functional units. By definition, the subunits are encoded from a single genetic locus. When such a gene is duplicated, the gene products are suggested initially to cross-interact when coexpressed, thus resulting in the phenomenon of paralogue interference. In this opinion article, we explore how paralogue interference can shape the fate of a duplicated gene. One important outcome is a prolonged time window in which both copies remain under selection, increasing the chance to accumulate mutations and to develop new properties. Thereby, paralogue interference can mediate the coevolution of duplicates, and here we illustrate the potential of this phenomenon in light of recent studies.

  3. Gene duplication and divergence of long wavelength-sensitive opsin genes in the guppy, Poecilia reticulata.

    PubMed

    Watson, Corey T; Gray, Suzanne M; Hoffmann, Margarete; Lubieniecki, Krzysztof P; Joy, Jeffrey B; Sandkam, Ben A; Weigel, Detlef; Loew, Ellis; Dreyer, Christine; Davidson, William S; Breden, Felix

    2011-02-01

    Female preference for male orange coloration in the genus Poecilia suggests a role for duplicated long wavelength-sensitive (LWS) opsin genes in facilitating behaviors related to mate choice in these species. Previous work has shown that LWS gene duplication in this genus has resulted in expansion of long wavelength visual capacity as determined by microspectrophotometry (MSP). However, the relationship between LWS genomic repertoires and expression of LWS retinal cone classes within a given species is unclear. Our previous study in the related species, Xiphophorus helleri, was the first characterization of the complete LWS opsin genomic repertoire in conjunction with MSP expression data in the family Poeciliidae, and revealed the presence of four LWS loci and two distinct LWS cone classes. In this study we characterized the genomic organization of LWS opsin genes by BAC clone sequencing, and described the full range of cone cell types in the retina of the colorful Cumaná guppy, Poecilia reticulata. In contrast to X. helleri, MSP data from the Cumaná guppy revealed three LWS cone classes. Comparisons of LWS genomic organization described here for Cumaná to that of X. helleri indicate that gene divergence and not duplication was responsible for the evolution of a novel LWS haplotype in the Cumaná guppy. This lineage-specific divergence is likely responsible for a third additional retinal cone class not present in X. helleri, and may have facilitated the strong sexual selection driven by female preference for orange color patterns associated with the genus Poecilia.

  4. Prevertebrate Local Gene Duplication Facilitated Expansion of the Neuropeptide GPCR Superfamily.

    PubMed

    Yun, Seongsik; Furlong, Michael; Sim, Mikang; Cho, Minah; Park, Sumi; Cho, Eun Bee; Reyes-Alcaraz, Arfaxad; Hwang, Jong-Ik; Kim, Jaebum; Seong, Jae Young

    2015-11-01

    In humans, numerous genes encode neuropeptides that comprise a superfamily of more than 70 genes in approximately 30 families and act mainly through rhodopsin-like G protein-coupled receptors (GPCRs). Two rounds of whole-genome duplication (2R WGD) during early vertebrate evolution greatly contributed to proliferation within gene families; however, the mechanisms underlying the initial emergence and diversification of these gene families before 2R WGD are largely unknown. In this study, we analyzed 25 vertebrate rhodopsin-like neuropeptide GPCR families and their cognate peptides using phylogeny, synteny, and localization of these genes on reconstructed vertebrate ancestral chromosomes (VACs). Based on phylogeny, these GPCR families can be divided into five distinct clades, and members of each clade tend to be located on the same VACs. Similarly, their neuropeptide gene families also tend to reside on distinct VACs. Comparison of these GPCR genes with those of invertebrates including Drosophila melanogaster, Caenorhabditis elegans, Branchiostoma floridae, and Ciona intestinalis indicates that these GPCR families emerged through tandem local duplication during metazoan evolution prior to 2R WGD. Our study describes a presumptive evolutionary mechanism and development pathway of the vertebrate rhodopsin-like GPCR and cognate neuropeptide families from the urbilaterian ancestor to modern vertebrates.

  5. Prevalent role of gene features in determining evolutionary fates of whole-genome duplication duplicated genes in flowering plants.

    PubMed

    Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi

    2013-04-01

    The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs.

  6. Multiple bursts of pancreatic ribonuclease gene duplication in insect-eating bats.

    PubMed

    Xu, Huihui; Liu, Yang; Meng, Fanxing; He, Beibei; Han, Naijian; Li, Gang; Rossiter, Stephen J; Zhang, Shuyi

    2013-09-10

    Pancreatic ribonuclease gene (RNASE1) was previously shown to have undergone duplication and adaptive evolution related to digestive efficiency in several mammalian groups that have evolved foregut fermentation, including ruminants and some primates. RNASE1 gene duplications thought to be linked to diet have also been recorded in some carnivores. Of all mammals, bats have evolved the most diverse dietary specializations, mainly including frugivory and insectivory. Here we cloned, sequenced and analyzed RNASE1 gene sequences from a range of bat species to determine whether their dietary adaptation is mirrored by molecular adaptation. We found that seven insect-eating members of the families Vespertilionidae and Molossidae possessed two or more duplicates, and we also detected three pseudogenes. Reconstructed RNASE1 gene trees based on both Bayesian and maximum likelihood methods supported independent duplication events in these two families. Selection tests revealed that RNASE1 gene duplicates have undergone episodes of positive selection indicative of functional modification, and lineage-specific tests revealed strong adaptive evolution in the Tadarida β clade. However, unlike the RNASE1 duplicates that function in digestion in some mammals, the bat RNASE1 sequences were found to be characterized by relatively high isoelectric points, a feature previously suggested to promote defense against viruses via the breakdown of double-stranded RNA. Taken together, our findings point to an adaptive diversification of RNASE1 in these two bat families, although we find no clear evidence that this was driven by diet. Future experimental assays are needed to resolve the functions of these enzymes in bats.

  7. α-Synuclein gene duplication impairs reward learning.

    PubMed

    Kéri, Szabolcs; Moustafa, Ahmed A; Myers, Catherine E; Benedek, György; Gluck, Mark A

    2010-09-07

    α-Synuclein (SNCA) plays an important role in the regulation of dopaminergic neurotransmission and neurodegeneration in Parkinson disease. We investigated reward and punishment learning in asymptomatic carriers of a rare SNCA gene duplication who were healthy siblings of patients with Parkinson disease. Results revealed that healthy SNCA duplication carriers displayed impaired reward and intact punishment learning compared with noncarriers. These results demonstrate that a copy number variation of the SNCA gene is associated with selective impairments on reinforcement learning in asymptomatic carriers without the motor symptoms of Parkinson disease.

  8. α-Synuclein gene duplication impairs reward learning

    PubMed Central

    Kéri, Szabolcs; Moustafa, Ahmed A.; Myers, Catherine E.; Benedek, György; Gluck, Mark A.

    2010-01-01

    α-Synuclein (SNCA) plays an important role in the regulation of dopaminergic neurotransmission and neurodegeneration in Parkinson disease. We investigated reward and punishment learning in asymptomatic carriers of a rare SNCA gene duplication who were healthy siblings of patients with Parkinson disease. Results revealed that healthy SNCA duplication carriers displayed impaired reward and intact punishment learning compared with noncarriers. These results demonstrate that a copy number variation of the SNCA gene is associated with selective impairments on reinforcement learning in asymptomatic carriers without the motor symptoms of Parkinson disease. PMID:20733075

  9. The evolution of introns in human duplicated genes.

    PubMed

    Rayko, Edda; Jabbari, Kamel; Bernardi, Giorgio

    2006-01-03

    In previous work [Jabbari, K., Rayko, E., Bernardi, G., 2003. The major shifts of human duplicated genes. Gene 317, 203-208], we investigated the fate of ancient duplicated genes after the compositional transitions that occurred between the genomes of cold- and warm-blooded vertebrates. We found that the majority of duplicated copies were transposed to the "ancestral genome core", the gene-dense genome compartment that underwent a GC enrichment at the compositional transitions. Here, we studied the consequences of the events just outlined on the introns of duplicated genes. We found that, while intron number was highly conserved, total intron size (the sum of intron sizes within any given gene) was smaller in the GC-rich copies compared to the GC-poor copies, especially in dispersed copies (i.e., copies located on different chromosomes or chromosome arms). GC-rich copies also showed higher densities of CpG islands and Alus, whereas GC-poor copies were characterized by higher densities of LINEs. The features of the copies that underwent the compositional transition and became GC-richer are suggestive of, or related to, functional changes.

  10. Alternative Transposition Generates New Chimeric Genes and Segmental Duplications at the Maize p1 Locus

    PubMed Central

    Wang, Dafang; Yu, Chuanhe; Zuo, Tao; Zhang, Jianbo; Weber, David F.; Peterson, Thomas

    2015-01-01

    The maize Ac/Ds transposon family was the first transposable element system identified and characterized by Barbara McClintock. Ac/Ds transposons belong to the hAT family of class II DNA transposons. We and others have shown that Ac/Ds elements can undergo a process of alternative transposition in which the Ac/Ds transposase acts on the termini of two separate, nearby transposons. Because these termini are present in different elements, alternative transposition can generate a variety of genome alterations such as inversions, duplications, deletions, and translocations. Moreover, Ac/Ds elements transpose preferentially into genic regions, suggesting that structural changes arising from alternative transposition may potentially generate chimeric genes at the rearrangement breakpoints. Here we identified and characterized 11 independent cases of gene fusion induced by Ac alternative transposition. In each case, a functional chimeric gene was created by fusion of two linked, paralogous genes; moreover, each event was associated with duplication of the ∼70-kb segment located between the two paralogs. An extant gene in the maize B73 genome that contains an internal duplication apparently generated by an alternative transposition event was also identified. Our study demonstrates that alternative transposition-induced duplications may be a source for spontaneous creation of diverse genome structures and novel genes in maize. PMID:26434719

  11. Effects of Gene Duplication, Positive Selection, and Shifts in Gene Expression on the Evolution of the Venom Gland Transcriptome in Widow Spiders.

    PubMed

    Haney, Robert A; Clarke, Thomas H; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y; Ayoub, Nadia A; Garb, Jessica E

    2016-01-05

    Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland-specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species.

  12. Effects of Gene Duplication, Positive Selection, and Shifts in Gene Expression on the Evolution of the Venom Gland Transcriptome in Widow Spiders

    PubMed Central

    Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.

    2016-01-01

    Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576

  13. Compensatory Drift and the Evolutionary Dynamics of Dosage-Sensitive Duplicate Genes.

    PubMed

    Thompson, Ammon; Zakon, Harold H; Kirkpatrick, Mark

    2016-02-01

    Dosage-balance selection preserves functionally redundant duplicates (paralogs) at the optimum for their combined expression. Here we present a model of the dynamics of duplicate genes coevolving under dosage-balance selection. We call this the compensatory drift model. Results show that even when strong dosage-balance selection constrains total expression to the optimum, expression of each duplicate can diverge by drift from its original level. The rate of divergence slows as the strength of stabilizing selection, the size of the mutation effect, and/or the size of the population increases. We show that dosage-balance selection impedes neofunctionalization early after duplication but can later facilitate it. We fit this model to data from sodium channel duplicates in 10 families of teleost fish; these include two convergent lineages of electric fish in which one of the duplicates neofunctionalized. Using the model, we estimated the strength of dosage-balance selection for these genes. The results indicate that functionally redundant paralogs still may undergo radical functional changes after a prolonged period of compensatory drift.
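
    The core intuition of the abstract above, that selection on total expression leaves the split between the two copies free to drift, can be sketched with a toy fitness function. The Gaussian form, the parameter names, and the default values below are assumptions chosen for illustration; the abstract does not give the functional form the authors used.

```python
import math

def dosage_fitness(x1, x2, optimum=2.0, strength=1.0):
    """Toy stabilizing selection on the summed expression of two paralogs.

    Assumed form (not taken from the paper): w = exp(-strength * (x1 + x2 - optimum)^2).
    """
    total = x1 + x2
    return math.exp(-strength * (total - optimum) ** 2)

balanced = dosage_fitness(1.0, 1.0)      # both copies at the original level
compensated = dosage_fitness(1.5, 0.5)   # one copy up, the other down
unbalanced = dosage_fitness(1.5, 1.0)    # total expression above the optimum
```

    Under this assumed form, `balanced` and `compensated` have the same fitness because only the sum matters, so compensating changes are neutral and can accumulate by drift, while `unbalanced` is penalized.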

  14. Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals

    PubMed Central

    2008-01-01

    Background: The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results: Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion: We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification. PMID:18269759

  15. PGDD: a database of gene and genome duplication in plants

    PubMed Central

    Lee, Tae-Ho; Tang, Haibao; Wang, Xiyin; Paterson, Andrew H.

    2013-01-01

    Genome duplication (GD) has permanently shaped the architecture and function of many higher eukaryotic genomes. The angiosperms (flowering plants) are outstanding models in which to elucidate consequences of GD for higher eukaryotes, owing to their propensity for chromosomal duplication or even triplication in a few cases. Duplicated genome structures often require both intra- and inter-genome alignments to unravel their evolutionary history, also providing the means to deduce both obvious and otherwise-cryptic orthology, paralogy and other relationships among genes. The burgeoning sets of angiosperm genome sequences provide the foundation for a host of investigations into the functional and evolutionary consequences of gene duplication and GD. To provide genome alignments from a single resource based on uniform standards that have been validated by empirical studies, we built the Plant Genome Duplication Database (PGDD; freely available at http://chibba.agtec.uga.edu/duplication/), a web service providing synteny information in terms of colinearity between chromosomes. At present, PGDD contains data for 26 plants including bryophytes and chlorophyta, as well as angiosperms with draft genome sequences. In addition to the inclusion of new genomes as they become available, we are preparing new functions to enhance PGDD. PMID:23180799

  16. Functional divergence in tandemly duplicated Arabidopsis thaliana trypsin inhibitor genes.

    PubMed Central

    Clauss, M J; Mitchell-Olds, T

    2004-01-01

    In multigene families, variation among loci and alleles can contribute to trait evolution. We explored patterns of functional and genetic variation in six duplicated Arabidopsis thaliana trypsin inhibitor (ATTI) loci. We demonstrate significant variation in constitutive and herbivore-induced transcription among ATTI loci that show, on average, 65% sequence divergence. Significant variation in ATTI expression was also found between two molecularly defined haplotype classes. Population genetic analyses for 17 accessions of A. thaliana showed that six ATTI loci arranged in tandem within 10 kb varied 10-fold in nucleotide diversity, from 0.0009 to 0.0110, and identified a minimum of six recombination events throughout the tandem array. We observed a significant peak in nucleotide and indel polymorphism spanning ATTI loci in the interior of the array, due primarily to divergence between the two haplotype classes. Significant deviation from the neutral equilibrium model for individual genes was interpreted within the context of intergene linkage disequilibrium and correlated patterns of functional differentiation. In contrast to the outcrosser Arabidopsis lyrata for which recombination is observed even within ATTI loci, our data suggest that response to selection was slowed in the inbreeding, annual A. thaliana because of interference among functionally divergent ATTI loci. PMID:15082560
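
    Nucleotide diversity, the statistic quoted in this abstract, is conventionally computed as the average number of pairwise differences per site among aligned sequences. The sketch below shows that calculation on a toy alignment; the sequences are hypothetical, the helper name `nucleotide_diversity` is an assumption, and gaps or missing data are not handled. It also checks the ratio between the two reported extremes.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi) for aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment, not data from the study.
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
pi = nucleotide_diversity(toy)

# Ratio of the highest to the lowest diversity value quoted in the abstract.
fold = 0.0110 / 0.0009
print(f"toy pi = {pi:.4f}; reported range spans about {fold:.1f}-fold")
```

    The quoted extremes differ by roughly a factor of 12, which the abstract summarizes as a 10-fold range.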

  17. Duplication of the ZIC2 gene is not associated with holoprosencephaly.

    PubMed

    Jobanputra, Vaidehi; Burke, Alanna; Kwame, Anyane-Yeboa; Shanmugham, Anita; Shirazi, Maryam; Brown, Stephen; Warburton, Peter E; Levy, Brynn; Warburton, Dorothy

    2012-01-01

    Cytogenetic testing using genomic microarrays presents a clinical challenge when data regarding the phenotypic consequences of the genomic alteration are not available. We describe a chromosome 13q32.3 duplication discovered by microarray testing in a fetus with a prenatally detected apparently balanced de novo translocation 46,XY,t(2;13)(q37;q32). Microarray analysis on the fetal DNA showed duplications of 384 and 564 kb at the breakpoint regions on chromosomes 2q37.3 and 13q32.3, respectively. There were no disease-associated genes in the duplicated region on chromosome 2q37. The duplicated region on chromosome 13q contains the ZIC2 gene. Haploinsufficiency of ZIC2 is known to cause holoprosencephaly and other brain malformations. Studies in mouse models have suggested that overexpression of ZIC2 may also lead to brain malformations. Fetal MRI of the brain was normal and the family elected to continue the pregnancy. An apparently normal baby was born at term. At 3 months of age a physical exam showed no abnormalities and no developmental delay. This report shows that duplication of ZIC2 is not necessarily associated with brain malformations. We also describe the phenotype from four additional patients with duplications of the region of chromosome 13 containing ZIC2 and three previously described patients with supernumerary marker chromosomes derived from distal chromosome 13. None of the eight patients had holoprosencephaly or brain malformations, indicating that duplication of ZIC2 is not associated with brain anomalies. This information will be useful for counseling in other occurrences of this duplication identified by microarray.

  18. Evaluating and Characterizing Ancient Whole-Genome Duplications in Plants with Gene Count Data.

    PubMed

    Tiley, George P; Ané, Cécile; Burleigh, J Gordon

    2016-04-11

    Whole-genome duplications (WGDs) have helped shape the genomes of land plants, and recent evidence suggests that the genomes of all angiosperms have experienced at least two ancient WGDs. In plants, WGDs often are followed by rapid fractionation, in which many homeologous gene copies are lost. Thus, it can be extremely difficult to identify, let alone characterize, ancient WGDs. In this study, we use a new maximum likelihood estimator to test for evidence of ancient WGDs in land plants and estimate the fraction of new gene copies that are retained following a WGD using gene count data, the number of gene copies in gene families. We identified evidence of many putative ancient WGDs in land plants and found that the genome fractionation rates vary tremendously among ancient WGDs. Analyses of WGDs within Brassicales also indicate that background gene duplication and loss rates vary across land plants, and different gene families have different probabilities of being retained following a WGD. Although our analyses are largely robust to errors in duplication and loss rates and the choice of priors, simulations indicate that this method can have trouble detecting multiple WGDs that occur on the same branch, especially when the gene retention rates for ancient WGDs are very low. They also suggest that we should carefully evaluate evidence for some ancient plant WGD hypotheses.

  19. Evaluating and Characterizing Ancient Whole-Genome Duplications in Plants with Gene Count Data

    PubMed Central

    Tiley, George P.; Ané, Cécile; Burleigh, J. Gordon

    2016-01-01

    Whole-genome duplications (WGDs) have helped shape the genomes of land plants, and recent evidence suggests that the genomes of all angiosperms have experienced at least two ancient WGDs. In plants, WGDs often are followed by rapid fractionation, in which many homeologous gene copies are lost. Thus, it can be extremely difficult to identify, let alone characterize, ancient WGDs. In this study, we use a new maximum likelihood estimator to test for evidence of ancient WGDs in land plants and estimate the fraction of new gene copies that are retained following a WGD using gene count data, the number of gene copies in gene families. We identified evidence of many putative ancient WGDs in land plants and found that the genome fractionation rates vary tremendously among ancient WGDs. Analyses of WGDs within Brassicales also indicate that background gene duplication and loss rates vary across land plants, and different gene families have different probabilities of being retained following a WGD. Although our analyses are largely robust to errors in duplication and loss rates and the choice of priors, simulations indicate that this method can have trouble detecting multiple WGDs that occur on the same branch, especially when the gene retention rates for ancient WGDs are very low. They also suggest that we should carefully evaluate evidence for some ancient plant WGD hypotheses. PMID:26988251

  20. Gene duplication, silencing and expression alteration govern the molecular evolution of PRC2 genes in plants.

    PubMed

    Furihata, Hazuka Y; Suenaga, Kazuya; Kawanabe, Takahiro; Yoshida, Takanori; Kawabe, Akira

    2016-10-13

    PRC2 genes were analyzed for their number of gene duplications, dN/dS ratios and expression patterns among Brassicaceae and Gramineae species. Although both amino acid sequences and copy number of the PRC2 genes were generally well conserved in both Brassicaceae and Gramineae species, we observed that some rapidly evolving genes experienced duplications and expression pattern changes. After multiple duplication events, all but one or two of the duplicated copies tend to be silenced. Silenced copies were reactivated in the endosperm and showed ectopic expression in developing seeds. The results indicated that rapid evolution of some PRC2 genes is initially caused by a relaxation of selective constraint following the gene duplication events. Several loci could become maternally expressed imprinted genes and acquire functional roles in the endosperm.

  1. On the origins of Mendelian disease genes in man: the impact of gene duplication.

    PubMed

    Dickerson, Jonathan E; Robertson, David L

    2012-01-01

    Over 3,000 human diseases are known to be linked to heritable genetic variation, mapping to over 1,700 unique genes. Dating of the evolutionary age of these disease-associated genes has suggested that they have a tendency to be ancient, specifically coming into existence with early metazoa. The approach taken by past studies, however, assumes that the age of a disease is the same as the age of its common ancestor, ignoring the fundamental contribution of duplication events in the evolution of new genes and function. Here, we date both the common ancestor and the duplication history of known human disease-associated genes. We find that the majority of disease genes (80%) are genes that have been duplicated in their evolutionary history. Periods for which there are more disease-associated genes, for example, at the origins of bony vertebrates, are explained by the emergence of more genes at that time, and the majority of these are duplicates inferred to have arisen by whole-genome duplication. These relationships are similar for different disease types and the disease-associated gene's cellular function. This indicates that the emergence of duplication-associated diseases has been ongoing and approximately constant (relative to the retention of duplicate genes) throughout the evolution of life. This continued until approximately 390 Ma from which time relatively fewer novel genes came into existence on the human lineage, let alone disease genes. For single-copy genes associated with disease, we find that the number of disease genes decreases with recency. For the majority of duplicates, the disease-associated mutation is associated with just one of the duplicate copies. A universal explanation for heritable disease is, thus, that it is merely a by-product of the evolutionary process; the evolution of new genes (de novo or by duplication) results in the potential for new diseases to emerge.

  2. Insight into transcription factor gene duplication from Caenorhabditis elegans Promoterome-driven expression patterns

    PubMed Central

    Reece-Hoyes, John S; Shingles, Jane; Dupuy, Denis; Grove, Christian A; Walhout, Albertha JM; Vidal, Marc; Hope, Ian A

    2007-01-01

    Background: The C. elegans Promoterome is a powerful resource for revealing the regulatory mechanisms by which transcription is controlled pan-genomically. Transcription factors will form the core of any systems biology model of genome control and therefore the promoter activity of Promoterome inserts for C. elegans transcription factor genes was examined, in vivo, with a reporter gene approach. Results: Transgenic C. elegans strains were generated for 366 transcription factor promoter/gfp reporter gene fusions. GFP distributions were determined, and then summarized with reference to developmental stage and cell type. Reliability of these data was demonstrated by comparison to previously described gene product distributions. A detailed consideration of the results for one C. elegans transcription factor gene family, the Six family, comprising ceh-32, ceh-33, ceh-34 and unc-39 illustrates the value of these analyses. The high proportion of Promoterome reporter fusions that drove GFP expression, compared to previous studies, led to the hypothesis that transcription factor genes might be involved in local gene duplication events less frequently than other genes. Comparison of transcription factor genes of C. elegans and Caenorhabditis briggsae was therefore carried out and revealed very few examples of functional gene duplication since the divergence of these species for most, but not all, transcription factor gene families. Conclusion: Examining reporter expression patterns for hundreds of promoters informs, and thereby improves, interpretation of this data type. Genes encoding transcription factors involved in intrinsic developmental control processes appear acutely sensitive to changes in gene dosage through local gene duplication, on an evolutionary time scale. PMID:17244357

  3. The Phenotypic Plasticity of Duplicated Genes in Saccharomyces cerevisiae and the Origin of Adaptations.

    PubMed

    Mattenberger, Florian; Sabater-Muñoz, Beatriz; Toft, Christina; Fares, Mario A

    2017-01-05

    Gene and genome duplication are the major sources of biological innovations in plants and animals. Functional and transcriptional divergence between the copies after gene duplication has been considered the main driver of innovations. However, here we show that increased phenotypic plasticity after duplication plays a larger role than previously thought in the origin of adaptations. We perform an exhaustive analysis of the transcriptional alterations of duplicated genes in the unicellular eukaryote Saccharomyces cerevisiae when challenged with five different environmental stresses. Analysis of the transcriptomes of yeast shows that gene duplication increases the transcriptional response to environmental changes, with duplicated genes exhibiting signatures of adaptive transcriptional patterns in response to stress. The mechanism of duplication matters, with whole-genome duplicates being more transcriptionally altered than small-scale duplicates. The predominant transcriptional pattern follows the classic theory of evolution by gene duplication, with one gene copy remaining unaltered under stress while its sister copy presents large transcriptional plasticity and a prominent role in adaptation. Moreover, we find additional transcriptional profiles that are suggestive of neo- and subfunctionalization of duplicate gene copies. These patterns are strongly correlated with the functional dependencies and sequence divergence profiles of gene copies. We show that, unlike singletons, duplicates respond more specifically to stress, supporting the role of natural selection in the transcriptional plasticity of duplicates. Our results reveal the underlying transcriptional complexity of duplicated genes and its role in the origin of adaptations.

  4. The Phenotypic Plasticity of Duplicated Genes in Saccharomyces cerevisiae and the Origin of Adaptations

    PubMed Central

    Mattenberger, Florian; Sabater-Muñoz, Beatriz; Toft, Christina; Fares, Mario A.

    2016-01-01

    Gene and genome duplication are the major sources of biological innovations in plants and animals. Functional and transcriptional divergence between the copies after gene duplication has been considered the main driver of innovations. However, here we show that increased phenotypic plasticity after duplication plays a larger role than previously thought in the origin of adaptations. We perform an exhaustive analysis of the transcriptional alterations of duplicated genes in the unicellular eukaryote Saccharomyces cerevisiae when challenged with five different environmental stresses. Analysis of the transcriptomes of yeast shows that gene duplication increases the transcriptional response to environmental changes, with duplicated genes exhibiting signatures of adaptive transcriptional patterns in response to stress. The mechanism of duplication matters, with whole-genome duplicates being more transcriptionally altered than small-scale duplicates. The predominant transcriptional pattern follows the classic theory of evolution by gene duplication, with one gene copy remaining unaltered under stress while its sister copy presents large transcriptional plasticity and a prominent role in adaptation. Moreover, we find additional transcriptional profiles that are suggestive of neo- and subfunctionalization of duplicate gene copies. These patterns are strongly correlated with the functional dependencies and sequence divergence profiles of gene copies. We show that, unlike singletons, duplicates respond more specifically to stress, supporting the role of natural selection in the transcriptional plasticity of duplicates. Our results reveal the underlying transcriptional complexity of duplicated genes and its role in the origin of adaptations. PMID:27799339

  5. Gene duplication, tissue-specific gene expression and sexual conflict in stalk-eyed flies (Diopsidae).

    PubMed

    Baker, Richard H; Narechania, Apurva; Johns, Philip M; Wilkinson, Gerald S

    2012-08-19

    Gene duplication provides an essential source of novel genetic material to facilitate rapid morphological evolution. Traits involved in reproduction and sexual dimorphism represent some of the fastest evolving traits in nature, and gene duplication is intricately involved in the origin and evolution of these traits. Here, we review genomic research on stalk-eyed flies (Diopsidae) that has been used to examine the extent of gene duplication and its role in the genetic architecture of sexual dimorphism. Stalk-eyed flies are remarkable because of the elongation of the head into long stalks, with the eyes and antennae laterally displaced at the ends of these stalks. Many species are strongly sexually dimorphic for eyespan, and these flies have become a model system for studying sexual selection. Using both expressed sequence tag and next-generation sequencing, we have established an extensive database of gene expression in the developing eye-antennal imaginal disc, the adult head and testes. Duplicated genes exhibit narrower expression patterns than non-duplicated genes, and the testes, in particular, provide an abundant source of gene duplication. Within somatic tissue, duplicated genes are more likely to be differentially expressed between the sexes, suggesting gene duplication may provide a mechanism for resolving sexual conflict.

  6. Species-specific duplications of NBS-encoding genes in Chinese chestnut (Castanea mollissima)

    PubMed Central

    Zhong, Yan; Li, Yingjun; Huang, Kaihui; Cheng, Zong-Ming

    2015-01-01

    The disease resistance (R) genes play an important role in protecting plants from infection by diverse pathogens in the environment. The nucleotide-binding site (NBS)-leucine-rich repeat (LRR) class of genes is one of the largest R gene families. Chinese chestnut (Castanea mollissima) is resistant to Chestnut Blight Disease, but relatively little is known about the resistance mechanism. We identified 519 NBS-encoding genes, including 374 NBS-LRR genes and 145 NBS-only genes. The majority of Ka/Ks ratios were less than 1, suggesting that purifying selection operated during the evolutionary history of NBS-encoding genes. A minority (4/34) of Ka/Ks ratios in non-TIR gene families were greater than 1, showing that some genes were under positive selection pressure. Furthermore, Ks peaked at a range of 0.4 to 0.5, indicating that ancient duplications arose during the evolution of these genes. The relationship between Ka/Ks and Ks indicated greater selective pressure on the newer and older genes with the critical value of Ks = 0.4–0.5. Notably, species-specific duplications were detected in NBS-encoding genes. In addition, the group of RPW8-NBS-encoding genes clustered together as an independent clade located at a relatively basal position in the phylogenetic tree. Many cis-acting elements related to plant defense responses were detected in promoters of NBS-encoding genes. PMID:26559332
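
    The Ka/Ks interpretation used in this abstract follows a standard rule of thumb: a ratio below 1 suggests purifying selection, a ratio above 1 suggests positive selection, and a ratio near 1 suggests neutral evolution. The sketch below applies that rule; the helper `classify_selection`, its `tolerance` parameter and the example (Ka, Ks) pairs are hypothetical and are not values from the study. Only the 4/34 proportion is taken from the abstract.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Label a gene pair by its Ka/Ks ratio (hypothetical helper, toy tolerance)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical (Ka, Ks) pairs, not values from the study.
examples = [(0.10, 0.45), (0.60, 0.40), (0.50, 0.50)]
labels = [classify_selection(ka, ks) for ka, ks in examples]

# Share of non-TIR Ka/Ks values reported as greater than 1 (4 of 34).
share_positive = 4 / 34
print(labels, f"{share_positive:.1%}")
```

    A real analysis would estimate Ka and Ks from codon alignments with a dedicated tool and would test whether a ratio differs significantly from 1 rather than applying a fixed tolerance.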

  7. Chimeric Genes in Deletions and Duplications Associated with Intellectual Disability

    PubMed Central

    Monfort, Sandra; Roselló, Mónica; Oltra, Silvestre; Caro-Llopis, Alfonso

    2017-01-01

    We report on three nonrelated patients with intellectual disability and CNVs that give rise to three new chimeric genes. All the genes forming these fusion transcripts may have an important role in central nervous system development and/or in gene expression regulation, and therefore not only their deletion or duplication but also the resulting chimeric gene may contribute to the phenotype of the patients. Deletions and duplications are usually pathogenic when affecting dose-sensitive genes. Alternatively, a chimeric gene may also be pathogenic by different gain-of-function mechanisms that are not restricted to dose-sensitive genes: the emergence of a new polypeptide that combines functional domains from two different genes, the deregulated expression of any coding sequence by the promoter region of a neighboring gene, and/or a putative dominant-negative effect due to the preservation of functional domains of partially truncated proteins. Fusion oncogenes are well known, but in other pathologies, the search for chimeric genes is disregarded. According to our findings, we hypothesize that the frequency of fusion transcripts may be much higher than suspected, and it should be taken into account in the array-CGH analyses of patients with intellectual disability. PMID:28630856

  8. A rare case of plastid protein-coding gene duplication in the chloroplast genome of Euglena archaeoplastidiata (Euglenophyta).

    PubMed

    Bennett, Matthew S; Shiu, Shin-Han; Triemer, Richard E

    2017-03-12

    Gene duplication is an important evolutionary process that allows duplicate functions to diverge, or, in some cases, allows for new functional gains. However, in contrast to the nuclear genome, gene duplications within the chloroplast are extremely rare. Here, we present the chloroplast genome of the photosynthetic protist Euglena archaeoplastidiata. Upon annotation, it was found that the chloroplast genome contained a novel tandem direct duplication that encoded a portion of RuBisCO large subunit (rbcL) followed by a complete copy of ribosomal protein L32 (rpl32), as well as the associated intergenic sequences. Analyses of the duplicated rpl32 were inconclusive regarding selective pressures, although it was found that substitutions in the duplicated region, all non-synonymous, likely had a neutral functional effect. The duplicated region did not exhibit patterns consistent with previously described mechanisms for tandem direct duplications, and demonstrated an unknown mechanism of duplication. In addition, a comparison of this chloroplast genome to other previously characterized chloroplast genomes from the same family revealed characteristics that indicated E. archaeoplastidiata was probably more closely related to taxa in the genera Monomorphina, Cryptoglena, and Euglenaria than it was to other Euglena taxa. Taken together, the chloroplast genome of E. archaeoplastidiata demonstrated multiple characteristics unique to the euglenoid world, and has justified the longstanding curiosity regarding this enigmatic taxon.

  9. Gene duplication and complex circadian clocks in mammals.

    PubMed

    Looby, Paul; Loudon, Andrew S I

    2005-01-01

    The circadian clock arose early in the evolution of life to enable organisms to adapt to the cycle of day and night. Recently, the extent and importance of circadian regulation of behaviour and physiology has come to be more fully realized. Core molecular cogs of circadian oscillators appear to have been largely conserved between such diverse organisms as Drosophila melanogaster and mammals. However, gene duplication events have produced multiple copies of many clock genes in mammals. Recent studies suggest that genome duplication has led to increased circadian complexity and local tissue regulation. This has important implications for temporal regulation of behaviour via multiple clocks in the central nervous system, and also extends to the local physiology of major body organs and tissues.

  10. A duplication/paracentric inversion associated with familial X-linked deafness (DFN3) suggests the presence of a regulatory element more than 400 kb upstream of the POU3F4 gene.

    PubMed

    de Kok, Y J; Merkx, G F; van der Maarel, S M; Huber, I; Malcolm, S; Ropers, H H; Cremers, F P

    1995-11-01

    X-linked deafness with stapes fixation (DFN3) is caused by mutations in the POU3F4 gene at Xq21.1. By employing pulsed field gel electrophoresis (PFGE) we identified a chromosomal aberration in the DNA of a DFN3 patient who did not show alterations in the open reading frame (ORF) of POU3F4. Southern blot analysis indicated that a DNA segment of 150 kb, located 170 kb proximal to the POU3F4 gene, was duplicated. Fluorescence in situ hybridization (FISH) analysis, PFGE, and detailed Southern analysis revealed that this duplication is part of a more complex rearrangement including a paracentric inversion involving the Xq21.1 region, and presumably the Xq21.3 region. Since at least two DFN3-associated minideletions are situated proximal to the duplicated segment, the inversion most likely disconnects the POU3F4 gene from a regulatory element which is located at a distance of at least 400 kb upstream of the POU3F4 gene.

  11. Recurrent duplications of the annexin A1 gene (ANXA1) in autism spectrum disorders.

    PubMed

    Correia, Catarina T; Conceição, Inês C; Oliveira, Bárbara; Coelho, Joana; Sousa, Inês; Sequeira, Ana F; Almeida, Joana; Café, Cátia; Duque, Frederico; Mouga, Susana; Roberts, Wendy; Gao, Kun; Lowe, Jennifer K; Thiruvahindrapuram, Bhooma; Walker, Susan; Marshall, Christian R; Pinto, Dalila; Nurnberger, John I; Scherer, Stephen W; Geschwind, Daniel H; Oliveira, Guiomar; Vicente, Astrid M

    2014-04-10

    Validating the potential pathogenicity of copy number variants (CNVs) identified in genome-wide studies of autism spectrum disorders (ASD) requires detailed assessment of case/control frequencies, inheritance patterns, clinical correlations, and functional impact. Here, we characterize a small recurrent duplication in the annexin A1 (ANXA1) gene, identified by the Autism Genome Project (AGP) study. From the AGP CNV genomic screen in 2,147 ASD individuals, we selected for characterization an ANXA1 gene duplication that was absent in 4,964 population-based controls. We further screened the duplication in a follow-up sample including 1,496 patients and 410 controls, and evaluated clinical correlations and family segregation. Sequencing of exonic/downstream ANXA1 regions was performed in 490 ASD patients for identification of additional variants. The ANXA1 duplication, overlapping the last four exons and 3'UTR region, had an overall prevalence of 11/3,643 (0.30%) in unrelated ASD patients but was not identified in 5,374 controls. Duplication carriers presented no distinctive clinical phenotype. Family analysis showed neuropsychiatric deficits and ASD traits in multiple relatives carrying the duplication, suggestive of a complex genetic inheritance. Sequencing of exonic regions and the 3'UTR identified 11 novel changes, but no obvious variants with clinical significance. We provide multilevel evidence for a role of ANXA1 in ASD etiology. Given its important role as mediator of glucocorticoid function in a wide variety of brain processes, including neuroprotection, apoptosis, and control of the neuroendocrine system, the results add ANXA1 to the growing list of rare candidate genetic etiological factors for ASD.
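
    The sample sizes and prevalence quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below recomputes the pooled case and control totals and the carrier prevalence from the reported counts. The final step, a one-sided Fisher exact probability, is an illustration added here and is not reported in the abstract; it assumes every individual is an independent observation. Variable names are hypothetical.

```python
from math import comb

# Counts as reported in the abstract.
cases_screen, cases_followup = 2147, 1496
controls_screen, controls_followup = 4964, 410
carriers_cases, carriers_controls = 11, 0

cases = cases_screen + cases_followup            # pooled unrelated ASD patients
controls = controls_screen + controls_followup   # pooled controls
prevalence = carriers_cases / cases

# Illustrative one-sided Fisher exact probability (not from the abstract):
# the chance that all 11 carriers fall among cases if carrier status were
# independent of diagnosis, given the pooled sample sizes.
p_all_in_cases = comb(cases, carriers_cases) / comb(cases + controls, carriers_cases)

print(cases, controls, f"{prevalence:.2%}", f"{p_all_in_cases:.1e}")
```

    The pooled totals reproduce the 3,643 patients and 5,374 controls given in the abstract, and the prevalence rounds to the reported 0.30%.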

  12. Recurrent duplications of the annexin A1 gene (ANXA1) in autism spectrum disorders

    PubMed Central

    2014-01-01

    Background: Validating the potential pathogenicity of copy number variants (CNVs) identified in genome-wide studies of autism spectrum disorders (ASD) requires detailed assessment of case/control frequencies, inheritance patterns, clinical correlations, and functional impact. Here, we characterize a small recurrent duplication in the annexin A1 (ANXA1) gene, identified by the Autism Genome Project (AGP) study. Methods: From the AGP CNV genomic screen in 2,147 ASD individuals, we selected for characterization an ANXA1 gene duplication that was absent in 4,964 population-based controls. We further screened the duplication in a follow-up sample including 1,496 patients and 410 controls, and evaluated clinical correlations and family segregation. Sequencing of exonic/downstream ANXA1 regions was performed in 490 ASD patients for identification of additional variants. Results: The ANXA1 duplication, overlapping the last four exons and 3'UTR region, had an overall prevalence of 11/3,643 (0.30%) in unrelated ASD patients but was not identified in 5,374 controls. Duplication carriers presented no distinctive clinical phenotype. Family analysis showed neuropsychiatric deficits and ASD traits in multiple relatives carrying the duplication, suggestive of a complex genetic inheritance. Sequencing of exonic regions and the 3'UTR identified 11 novel changes, but no obvious variants with clinical significance. Conclusions: We provide multilevel evidence for a role of ANXA1 in ASD etiology. Given its important role as mediator of glucocorticoid function in a wide variety of brain processes, including neuroprotection, apoptosis, and control of the neuroendocrine system, the results add ANXA1 to the growing list of rare candidate genetic etiological factors for ASD. PMID:24720851

  13. Divergence in Enzymatic Activities in the Soybean GST Supergene Family Provides New Insight into the Evolutionary Dynamics of Whole-Genome Duplicates

    PubMed Central

    Liu, Hai-Jing; Tang, Zhen-Xin; Han, Xue-Min; Yang, Zhi-Ling; Zhang, Fu-Min; Yang, Hai-Ling; Liu, Yan-Jing; Zeng, Qing-Yin

    2015-01-01

    Whole-genome duplication (WGD), or polyploidy, is a major force in plant genome evolution. A duplicate of all genes is present in the genome immediately following a WGD event. However, the evolutionary mechanisms responsible for the loss of, or retention and subsequent functional divergence of, polyploidy-derived duplicates remain largely unknown. In this study we reconstructed the evolutionary history of the glutathione S-transferase (GST) gene family from the soybean genome, and identified 72 GST duplicated gene pairs formed by a recent Glycine-specific WGD event occurring approximately 13 Ma. We found that 72% of duplicated GST gene pairs experienced gene losses or pseudogenization, whereas 28% of GST gene pairs have been retained in the soybean genome. The GST pseudogenes were under relaxed selective constraints, whereas functional GSTs were subject to strong purifying selection. Plant GST genes play important roles in stress tolerance and detoxification metabolism. By examining the gene expression responses to abiotic stresses and enzymatic properties of the ancestral and current proteins, we found that polyploidy-derived GST duplicates show divergence in enzymatic activities. Through site-directed mutagenesis of ancestral proteins, this study revealed that nonsynonymous substitutions of key amino acid sites play an important role in the divergence of enzymatic functions of polyploidy-derived GST duplicates. These findings provide new insights into the evolutionary and functional dynamics of polyploidy-derived duplicate genes. PMID:26219583

  14. Divergence in Enzymatic Activities in the Soybean GST Supergene Family Provides New Insight into the Evolutionary Dynamics of Whole-Genome Duplicates.

    PubMed

    Liu, Hai-Jing; Tang, Zhen-Xin; Han, Xue-Min; Yang, Zhi-Ling; Zhang, Fu-Min; Yang, Hai-Ling; Liu, Yan-Jing; Zeng, Qing-Yin

    2015-11-01

    Whole-genome duplication (WGD), or polyploidy, is a major force in plant genome evolution. A duplicate of all genes is present in the genome immediately following a WGD event. However, the evolutionary mechanisms responsible for the loss of, or retention and subsequent functional divergence of polyploidy-derived duplicates remain largely unknown. In this study we reconstructed the evolutionary history of the glutathione S-transferase (GST) gene family from the soybean genome, and identified 72 GST duplicated gene pairs formed by a recent Glycine-specific WGD event occurring approximately 13 Ma. We found that 72% of duplicated GST gene pairs experienced gene losses or pseudogenization, whereas 28% of GST gene pairs have been retained in the soybean genome. The GST pseudogenes were under relaxed selective constraints, whereas functional GSTs were subject to strong purifying selection. Plant GST genes play important roles in stress tolerance and detoxification metabolism. By examining the gene expression responses to abiotic stresses and enzymatic properties of the ancestral and current proteins, we found that polyploidy-derived GST duplicates show the divergence in enzymatic activities. Through site-directed mutagenesis of ancestral proteins, this study revealed that nonsynonymous substitutions of key amino acid sites play an important role in the divergence of enzymatic functions of polyploidy-derived GST duplicates. These findings provide new insights into the evolutionary and functional dynamics of polyploidy-derived duplicate genes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. Exact Algorithms for Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.

    PubMed

    Kordi, Misagh; Bansal, Mukul S

    2017-06-01

    Duplication-Transfer-Loss (DTL) reconciliation is a powerful method for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation seeks to reconcile gene trees with species trees by postulating speciation, duplication, transfer, and loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. In practice, however, gene trees are often non-binary due to uncertainty in the gene tree topologies, and DTL reconciliation with non-binary gene trees is known to be NP-hard. In this paper, we present the first exact algorithms for DTL reconciliation with non-binary gene trees. Specifically, we (i) show that the DTL reconciliation problem for non-binary gene trees is fixed-parameter tractable in the maximum degree of the gene tree, (ii) present an exponential-time, but in-practice efficient, algorithm to track and enumerate all optimal binary resolutions of a non-binary input gene tree, and (iii) apply our algorithms to a large empirical data set of over 4700 gene trees from 100 species to study the impact of gene tree uncertainty on DTL-reconciliation and to demonstrate the applicability and utility of our algorithms. The new techniques and algorithms introduced in this paper will help biologists avoid incorrect evolutionary inferences caused by gene tree uncertainty.
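
    To see why resolving non-binary gene trees is costly, it helps to count the candidate binary resolutions. The sketch below is not the authors' algorithm; it only applies the standard result that an unresolved node with k children has (2k-3)!! rooted binary resolutions, and that resolutions at separate nodes multiply. The function names are hypothetical.

```python
def binary_resolutions(k):
    """Number of rooted binary trees on k labelled children: (2k-3)!!."""
    if k < 2:
        raise ValueError("a node needs at least two children")
    count = 1
    for odd in range(3, 2 * k - 2, 2):  # 3 * 5 * ... * (2k-3)
        count *= odd
    return count


def tree_resolutions(node_degrees):
    """Total binary resolutions of a gene tree, given each internal node's
    number of children (choices at separate nodes are independent)."""
    total = 1
    for k in node_degrees:
        total *= binary_resolutions(k)
    return total


for k in (2, 3, 4, 5, 10):
    print(k, binary_resolutions(k))
```

    The count grows super-exponentially with node degree, which is consistent with the abstract's choice of the maximum degree of the gene tree as the parameter for fixed-parameter tractability.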

  16. Six Subgroups and Extensive Recent Duplications Characterize the Evolution of the Eukaryotic Tubulin Protein Family

    PubMed Central

    Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin

    2014-01-01

    Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog–paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. PMID:25169981

  17. Evidence for the fixation of gene duplications by positive selection in Drosophila

    PubMed Central

    Cardoso-Moreira, Margarida; Arguello, J. Roman; Gottipati, Srikanth; Harshman, L.G.; Grenier, Jennifer K.; Clark, Andrew G.

    2016-01-01

    Gene duplications play a key role in the emergence of novel traits and in adaptation. But despite their centrality to evolutionary processes, it is still largely unknown how new gene duplicates are initially fixed within populations and later maintained in genomes. Long-standing debates on the evolution of gene duplications could be settled by determining the relative importance of genetic drift vs. positive selection in the fixation of new gene duplicates. Using the Drosophila Global Diversity Lines (GDL), we have combined genome-wide SNP polymorphism data with a novel set of copy number variant calls and gene expression profiles to characterize the polymorphic phase of new genes. We found that approximately half of the roughly 500 new complete gene duplications segregating in the GDL lead to significant increases in the expression levels of the duplicated genes and that these duplications are more likely to be found at lower frequencies, suggesting a negative impact on fitness. However, we also found that six of the nine gene duplications that are fixed or close to fixation in at least one of the five populations in our study show signs of being under positive selection, and that these duplications are likely beneficial because of dosage effects, with a possible role for additional mutations in two duplications. Our work suggests that in Drosophila, theoretical models that posit that gene duplications are immediately beneficial and fixed by positive selection are most relevant to explain the long-term evolution of gene duplications in this species. PMID:27197209

  18. Evaluation of whether accelerated protein evolution in chordates has occurred before, after, or simultaneously with gene duplication.

    PubMed

    Johnston, Catrióna R; O'Dushlaine, Colm; Fitzpatrick, David A; Edwards, Richard J; Shields, Denis C

    2007-01-01

    Gene duplication and loss are predicted to be at least of the order of the substitution rate and are key contributors to the development of novel gene function and overall genome evolution. Although it has been established that proteins evolve more rapidly after gene duplication, we were interested in testing to what extent this reflects causation or association. Therefore, we investigated the rate of evolution prior to gene duplication in chordates. Two patterns emerged; firstly, branches, which are both preceded by a duplication and followed by a duplication, display an elevated rate of amino acid replacement. This is reflected in the ratio of nonsynonymous to synonymous substitution (mean nonsynonymous to synonymous nucleotide substitution rate ratio [Ka:Ks]) of 0.44 compared with branches preceded by and followed by a speciation (mean Ka:Ks of 0.23). The observed patterns suggest that there can be simultaneous alteration in the selection pressures on both gene duplication and amino acid replacement, which may be consistent with co-occurring increases in positive selection, or alternatively with concurrent relaxation of purifying selection. The pattern is largely, but perhaps not completely, explained by the existence of certain families that have elevated rates of both gene duplication and amino acid replacement. Secondly, we observed accelerated amino acid replacement prior to duplication (mean Ka:Ks for postspeciation preduplication branches was 0.27). In some cases, this could reflect adaptive changes in protein function precipitating a gene duplication event. In conclusion, the circumstances surrounding the birth of new proteins may frequently involve a simultaneous change in selection pressures on both gene-copy number and amino acid replacement. More precise modeling of the relative importance of preduplication, postduplication, and simultaneous amino acid replacement will require larger and denser genomic data sets from multiple species, allowing

  19. Evidence showing duplication and recombination of cel genes in tandem from hyperthermophilic Thermotoga sp.

    PubMed

    Kim, Min Keun; Kang, Tae Ho; Kim, Jungho; Kim, Hoon; Yun, Han Dae

    2012-12-01

    This study was conducted to assess the gene duplication and diversification of tandem cellulase genes in thermophilic bacteria. The tandem cellulase genes cel5C and cel5D were cloned from Thermotoga maritima MSB8, and a database survey of thermophilic bacterial genomes for tandem cel genes was carried out. A clone carrying a 2.3 kb fragment from T. maritima MSB8 showed cellulase activity and contained two open reading frames in tandem (cel5C and cel5D). The cel5C gene is 954 bp long and encodes a protein of 317 amino acid residues with a signal peptide of 23 amino acids; the other gene, cel5D, is 990 bp long and encodes a protein of 329 amino acid residues. These two proteins are similar to the enzymes of glycosyl hydrolase family 5. The enzyme assay showed that Cel5C is an extracellular cellulase and Cel5D an intracellular one. Phylogenetic and homology matrix analyses of DNA and protein sequences revealed that the family 12 cellulase enzymes Cel12A and Cel12B displayed higher homology (>50 %), whereas the Cel5C and Cel5D enzymes, which belong to family 5, displayed lower homology (<30 %). In addition, repeated and mirror sequences in the tandem genes are taken as evidence of gene duplication and recombination.
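
    The gene and protein lengths quoted above are mutually consistent if each open reading frame length includes its stop codon. A minimal check follows, under that assumption; the helper name `protein_length` is hypothetical, and the mature Cel5C length further assumes that the 23-residue signal peptide is cleaved off.

```python
def protein_length(orf_bp, includes_stop=True):
    """Residues encoded by an ORF of `orf_bp` base pairs.

    Assumes the quoted length spans start codon through stop codon
    unless includes_stop is False.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


cel5c_residues = protein_length(954)   # abstract: 317 residues
cel5d_residues = protein_length(990)   # abstract: 329 residues

# Assumption: the 23-residue signal peptide is removed from secreted Cel5C.
cel5c_mature = cel5c_residues - 23

print(cel5c_residues, cel5d_residues, cel5c_mature)
```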

  20. Hox gene duplications correlate with posterior heteronomy in scorpions.

    PubMed

    Sharma, Prashant P; Schwager, Evelyn E; Extavour, Cassandra G; Wheeler, Ward C

    2014-10-07

    The evolutionary success of the largest animal phylum, Arthropoda, has been attributed to tagmatization, the coordinated evolution of adjacent metameres to form morphologically and functionally distinct segmental regions called tagmata. Specification of regional identity is regulated by the Hox genes, of which 10 are inferred to be present in the ancestor of arthropods. With six different posterior segmental identities divided into two tagmata, the bauplan of scorpions is the most heteronomous within Chelicerata. Expression domains of the anterior eight Hox genes are conserved in previously surveyed chelicerates, but it is unknown how Hox genes regionalize the three tagmata of scorpions. Here, we show that the scorpion Centruroides sculpturatus has two paralogues of all Hox genes except Hox3, suggesting cluster and/or whole genome duplication in this arachnid order. Embryonic anterior expression domain boundaries of each of the last four pairs of Hox genes (two paralogues each of Antp, Ubx, abd-A and Abd-B) are unique and distinguish segmental groups, such as pectines, book lungs and the characteristic tail, while maintaining spatial collinearity. These distinct expression domains suggest neofunctionalization of Hox gene paralogues subsequent to duplication. Our data reconcile previous understanding of Hox gene function across arthropods with the extreme heteronomy of scorpions.

  1. Hox gene duplications correlate with posterior heteronomy in scorpions

    PubMed Central

    Sharma, Prashant P.; Schwager, Evelyn E.; Extavour, Cassandra G.; Wheeler, Ward C.

    2014-01-01

    The evolutionary success of the largest animal phylum, Arthropoda, has been attributed to tagmatization, the coordinated evolution of adjacent metameres to form morphologically and functionally distinct segmental regions called tagmata. Specification of regional identity is regulated by the Hox genes, of which 10 are inferred to be present in the ancestor of arthropods. With six different posterior segmental identities divided into two tagmata, the bauplan of scorpions is the most heteronomous within Chelicerata. Expression domains of the anterior eight Hox genes are conserved in previously surveyed chelicerates, but it is unknown how Hox genes regionalize the three tagmata of scorpions. Here, we show that the scorpion Centruroides sculpturatus has two paralogues of all Hox genes except Hox3, suggesting cluster and/or whole genome duplication in this arachnid order. Embryonic anterior expression domain boundaries of each of the last four pairs of Hox genes (two paralogues each of Antp, Ubx, abd-A and Abd-B) are unique and distinguish segmental groups, such as pectines, book lungs and the characteristic tail, while maintaining spatial collinearity. These distinct expression domains suggest neofunctionalization of Hox gene paralogues subsequent to duplication. Our data reconcile previous understanding of Hox gene function across arthropods with the extreme heteronomy of scorpions. PMID:25122224

  2. Origin and evolution of eukaryotic chaperonins: phylogenetic evidence for ancient duplications in CCT genes.

    PubMed

    Archibald, J M; Logsdon, J M; Doolittle, W F

    2000-10-01

    Chaperonins are oligomeric protein-folding complexes which are divided into two distantly related structural classes. Group I chaperonins (called GroEL/cpn60/hsp60) are found in bacteria and eukaryotic organelles, while group II chaperonins are present in archaea and the cytoplasm of eukaryotes (called CCT/TriC). While archaea possess one to three chaperonin subunit-encoding genes, eight distinct CCT gene families (paralogs) have been characterized in eukaryotes. We are interested in determining when during eukaryotic evolution the multiple gene duplications producing the CCT subunits occurred. We describe the sequence and phylogenetic analysis of five CCT genes from Trichomonas vaginalis and seven from Giardia lamblia, representatives of amitochondriate protist lineages thought to have diverged early from other eukaryotes. Our data show that the gene duplications producing the eight CCT paralogs took place prior to the organismal divergence of Trichomonas and Giardia from other eukaryotes. Thus, these divergent protists likely possess completely hetero-oligomeric CCT complexes like those in yeast and mammalian cells. No close phylogenetic relationship between the archaeal chaperonins and specific CCT subunits was observed, suggesting that none of the CCT gene duplications predate the divergence of archaea and eukaryotes. The duplications producing the CCTdelta and CCTepsilon subunits, as well as CCTalpha, CCTbeta, and CCTeta, are the most recent in the CCT gene family. Our analyses show significant differences in the rates of evolution of archaeal chaperonins compared with the eukaryotic CCTs, as well as among the different CCT subunits themselves. We discuss these results in light of current views on the origin, evolution, and function of CCT complexes.

  3. The human VK locus. Characterization of a duplicated region encoding 28 different immunoglobulin genes.

    PubMed

    Straubinger, B; Huber, E; Lorenz, W; Osterholzer, E; Pargent, W; Pech, M; Pohlenz, H D; Zimmer, F J; Zachau, H G

    1988-01-05

    Two large regions of the human multigene family coding for the variable parts of the immunoglobulin light chains of the K type (VK) have been characterized on cosmid clones. The two germline regions, called Aa and Ab, span together 250,000 base-pairs and comprise 28 different VK gene segments, nine of which have been sequenced. There is a preponderance of VKII genes but genes belonging to subgroups I and III, and genes that cannot be easily assigned to one of the known subgroups, are interspersed within the VKII gene clusters. A number of pseudogenes have been identified. Within the Aa and Ab regions, all gene segments are organized in the same transcriptional orientation. The regions Aa and Ab, whose restriction maps are highly homologous, were shown not to be allelic structures; they must have arisen by a duplication event. Taken together with previous results, one can conclude that the major part of the VK locus exists in duplicated form. One individual has been found who has only one copy of some of the duplicated regions. By chromosomal walking, the A regions could be linked to the O regions, an analysis of which has been reported. The A regions contribute about one-third of the VK genes so far identified.

  4. New genes from old: asymmetric divergence of gene duplicates and the evolution of development.

    PubMed

    Holland, Peter W H; Marlétaz, Ferdinand; Maeso, Ignacio; Dunwell, Thomas L; Paps, Jordi

    2017-02-05

    Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'.

  5. A salmonid EST genomic study: genes, duplications, phylogeny and microarrays

    PubMed Central

    Koop, Ben F; von Schalburg, Kristian R; Leong, Jong; Walker, Neil; Lieph, Ryan; Cooper, Glenn A; Robb, Adrienne; Beetz-Sargent, Marianne; Holt, Robert A; Moore, Richard; Brahmbhatt, Sonal; Rosner, Jamie; Rexroad, Caird E; McGowan, Colin R; Davidson, William S

    2008-01-01

    Background Salmonids are of interest because of their relatively recent genome duplication, and their extensive use in wild fisheries and aquaculture. A comprehensive gene list and a comparison of genes in some of the different species provide valuable genomic information for one of the most widely studied groups of fish. Results 298,304 expressed sequence tags (ESTs) from Atlantic salmon (69% of the total), 11,664 chinook, 10,813 sockeye, 10,051 brook trout, 10,975 grayling, 8,630 lake whitefish, and 3,624 northern pike ESTs were obtained in this study and have been deposited into the public databases. Contigs were built and putative full-length Atlantic salmon clones have been identified. A database containing ESTs, assemblies, consensus sequences, open reading frames, gene predictions and putative annotation is available. The overall similarity between Atlantic salmon ESTs and those of rainbow trout, chinook, sockeye, brook trout, grayling, lake whitefish, northern pike and rainbow smelt is 93.4, 94.2, 94.6, 94.4, 92.5, 91.7, 89.6, and 86.2% respectively. An analysis of 78 transcript sets show Salmo as a sister group to Oncorhynchus and Salvelinus within Salmoninae, and Thymallinae as a sister group to Salmoninae and Coregoninae within Salmonidae. Extensive gene duplication is consistent with a genome duplication in the common ancestor of salmonids. Using all of the available EST data, a new expanded salmonid cDNA microarray of 32,000 features was created. Cross-species hybridizations to this cDNA microarray indicate that this resource will be useful for studies of all 68 salmonid species. Conclusion An extensive collection and analysis of salmonid RNA putative transcripts indicate that Pacific salmon, Atlantic salmon and charr are 94–96% similar while the more distant whitefish, grayling, pike and smelt are 93, 92, 89 and 86% similar to salmon. The salmonid transcriptome reveals a complex history of gene duplication that is consistent with an ancestral

  6. The large soybean (Glycine max) WRKY TF family expanded by segmental duplication events and subsequent divergent selection among subgroups

    PubMed Central

    2013-01-01

    Background WRKY genes encode one of the most abundant groups of transcription factors in higher plants, and their members regulate important biological processes such as growth, development, and responses to biotic and abiotic stresses. Although the soybean genome sequence has been published, functional studies on soybean genes still lag behind those of other species. Results We identified a total of 133 WRKY members in the soybean genome. According to structural features of their encoded proteins and to the phylogenetic tree, the soybean WRKY family could be classified into three groups (groups I, II, and III). A majority of WRKY genes (76.7%; 102 of 133) were segmentally duplicated and 13.5% (18 of 133) of the genes were tandemly duplicated. This pattern was not apparent in Arabidopsis or rice. The transcriptome atlas revealed notable differential expression in either transcript abundance or in expression patterns under normal growth conditions, which indicated wide functional divergence in this family. Furthermore, some critical amino acids were detected using DIVERGE v2.0 in specific comparisons, suggesting that these sites have contributed to functional divergence among groups or subgroups. In addition, site model and branch-site model analyses of positive Darwinian selection (PDS) showed that different selection regimes could have affected the evolution of these groups. Sites with high probabilities of having been under PDS were found in groups I, II c, II e, and III. Together, these results contribute to a detailed understanding of the molecular evolution of the WRKY gene family in soybean. Conclusions In this work, all the WRKY genes, which were generated mainly through segmental duplication, were identified in the soybean genome. Moreover, differential expression and functional divergence of the duplicated WRKY genes were two major features of this family throughout their evolutionary history. Positive selection analysis revealed that the different groups have

  7. Gene duplication and the evolution of phenotypic diversity in insect societies.

    PubMed

    Chau, Linh M; Goodisman, Michael A D

    2017-09-06

    Gene duplication is an important evolutionary process thought to facilitate the evolution of phenotypic diversity. We investigated if gene duplication was associated with the evolution of phenotypic differences in a highly social insect, the honeybee Apis mellifera. We hypothesized that the genetic redundancy provided by gene duplication could promote the evolution of social and sexual phenotypes associated with advanced societies. We found a positive correlation between sociality and rate of gene duplications across the Apoidea, indicating that gene duplication may be associated with sociality. We also discovered that genes showing biased expression between A. mellifera alternative phenotypes were found more frequently than expected among duplicated genes than among singletons. Moreover, duplicated genes had higher levels of caste-, sex-, behavior-, and tissue-biased expression compared to singletons, as expected if gene duplication facilitated phenotypic differentiation. We also found that duplicated genes were maintained in the A. mellifera genome through the processes of conservation, neofunctionalization, and specialization, but not subfunctionalization. Overall, we conclude that gene duplication may have facilitated the evolution of social and sexual phenotypes, as well as tissue differentiation. Thus this study further supports the idea that gene duplication allows species to evolve an increased range of phenotypic diversity. This article is protected by copyright. All rights reserved.

  8. The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome.

    PubMed

    Katju, Vaishali; Lynch, Michael

    2003-12-01

    The significance of gene duplication in provisioning raw materials for the evolution of genomic diversity is widely recognized, but the early evolutionary dynamics of duplicate genes remain obscure. To elucidate the structural characteristics of newly arisen gene duplicates at infancy and their subsequent evolutionary properties, we analyzed gene pairs with ≤10% divergence at synonymous sites within the genome of Caenorhabditis elegans. Structural heterogeneity between duplicate copies is present very early in their evolutionary history and is maintained over longer evolutionary timescales, suggesting that duplications across gene boundaries in conjunction with shuffling events have at least as much potential to contribute to long-term evolution as do fully redundant (complete) duplicates. The median duplication span of 1.4 kb falls short of the average gene length in C. elegans (2.5 kb), suggesting that partial gene duplications are frequent. Most gene duplicates reside close to the parent copy at inception, often as tandem inverted loci, and appear to disperse in the genome as they age, as a result of reduced survivorship of duplicates located in proximity to the ancestral copy. We propose that illegitimate recombination events leading to inverted duplications play a disproportionately large role in gene duplication within this genome in comparison with other mechanisms.

  9. The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome.

    PubMed Central

    Katju, Vaishali; Lynch, Michael

    2003-01-01

    The significance of gene duplication in provisioning raw materials for the evolution of genomic diversity is widely recognized, but the early evolutionary dynamics of duplicate genes remain obscure. To elucidate the structural characteristics of newly arisen gene duplicates at infancy and their subsequent evolutionary properties, we analyzed gene pairs with ≤10% divergence at synonymous sites within the genome of Caenorhabditis elegans. Structural heterogeneity between duplicate copies is present very early in their evolutionary history and is maintained over longer evolutionary timescales, suggesting that duplications across gene boundaries in conjunction with shuffling events have at least as much potential to contribute to long-term evolution as do fully redundant (complete) duplicates. The median duplication span of 1.4 kb falls short of the average gene length in C. elegans (2.5 kb), suggesting that partial gene duplications are frequent. Most gene duplicates reside close to the parent copy at inception, often as tandem inverted loci, and appear to disperse in the genome as they age, as a result of reduced survivorship of duplicates located in proximity to the ancestral copy. We propose that illegitimate recombination events leading to inverted duplications play a disproportionately large role in gene duplication within this genome in comparison with other mechanisms. PMID:14704166

  10. Multiple Paleopolyploidizations during the Evolution of the Compositae Reveal Parallel Patterns of Duplicate Gene Retention after Millions of Years

    PubMed Central

    Barker, Michael S.; Kane, Nolan C.; Matvienko, Marta; Kozik, Alexander; Michelmore, Richard W.; Knapp, Steven J.; Rieseberg, Loren H.

    2008-01-01

    Of the approximately 250,000 species of flowering plants, nearly one in ten are members of the Compositae (Asteraceae), a diverse family found in almost every habitat on all continents except Antarctica. With an origin in the mid Eocene, the Compositae is also a relatively young family with remarkable diversifications during the last 40 My. Previous cytologic and systematic investigations suggested that paleopolyploidy may have occurred in at least one Compositae lineage, but a recent analysis of genomic data was equivocal. We tested for evidence of paleopolyploidy in the evolutionary history of the family using recently available expressed sequence tag (EST) data from the Compositae Genome Project. Combined with data available on GenBank, we analyzed nearly 1 million ESTs from 18 species representing seven genera and four tribes. Our analyses revealed at least three ancient whole-genome duplications in the Compositae—a paleopolyploidization shared by all analyzed taxa and placed near the origin of the family just prior to the rapid radiation of its tribes and independent genome duplications near the base of the tribes Mutisieae and Heliantheae. These results are consistent with previous research implicating paleopolyploidy in the evolution and diversification of the Heliantheae. Further, we observed parallel retention of duplicate genes from the basal Compositae genome duplication across all tribes, despite divergence times of 33–38 My among these lineages. This pattern of retention was also repeated for the paleologs from the Heliantheae duplication. Intriguingly, the categories of genes retained in duplicate were substantially different from those in Arabidopsis. In particular, we found that genes annotated to structural components or cellular organization Gene Ontology categories were significantly enriched among paleologs, whereas genes associated with transcription and other regulatory functions were significantly underrepresented. Our results suggest

  11. Multiple paleopolyploidizations during the evolution of the Compositae reveal parallel patterns of duplicate gene retention after millions of years.

    PubMed

    Barker, Michael S; Kane, Nolan C; Matvienko, Marta; Kozik, Alexander; Michelmore, Richard W; Knapp, Steven J; Rieseberg, Loren H

    2008-11-01

    Of the approximately 250,000 species of flowering plants, nearly one in ten are members of the Compositae (Asteraceae), a diverse family found in almost every habitat on all continents except Antarctica. With an origin in the mid Eocene, the Compositae is also a relatively young family with remarkable diversifications during the last 40 My. Previous cytologic and systematic investigations suggested that paleopolyploidy may have occurred in at least one Compositae lineage, but a recent analysis of genomic data was equivocal. We tested for evidence of paleopolyploidy in the evolutionary history of the family using recently available expressed sequence tag (EST) data from the Compositae Genome Project. Combined with data available on GenBank, we analyzed nearly 1 million ESTs from 18 species representing seven genera and four tribes. Our analyses revealed at least three ancient whole-genome duplications in the Compositae: a paleopolyploidization shared by all analyzed taxa and placed near the origin of the family just prior to the rapid radiation of its tribes, and independent genome duplications near the base of the tribes Mutisieae and Heliantheae. These results are consistent with previous research implicating paleopolyploidy in the evolution and diversification of the Heliantheae. Further, we observed parallel retention of duplicate genes from the basal Compositae genome duplication across all tribes, despite divergence times of 33-38 My among these lineages. This pattern of retention was also repeated for the paleologs from the Heliantheae duplication. Intriguingly, the categories of genes retained in duplicate were substantially different from those in Arabidopsis. In particular, we found that genes annotated to structural components or cellular organization Gene Ontology categories were significantly enriched among paleologs, whereas genes associated with transcription and other regulatory functions were significantly underrepresented. Our results suggest that

  12. Evolution of tuf genes: ancient duplication, differential loss and gene conversion.

    PubMed

    Lathe, W C; Bork, P

    2001-08-03

    The tuf gene of eubacteria, encoding the EF-tu elongation factor, was duplicated early in the evolution of the taxon. Phylogenetic and genomic location analysis of 20 complete eubacterial genomes suggests that this ancient duplication has been differentially lost and maintained in eubacteria.

  13. Impact of whole-genome and tandem duplications in the expansion and functional diversification of the F-box family in legumes (Fabaceae).

    PubMed

    Bellieny-Rabelo, Daniel; Oliveira, Antônia Elenir Amâncio; Venancio, Thiago Motta

    2013-01-01

    F-box proteins constitute a large gene family that regulates processes from hormone signaling to stress response. F-box proteins are the substrate recognition modules of SCF E3 ubiquitin ligases. Here we report very distinct trends in family size, duplication, synteny and transcription of F-box genes in two nitrogen-fixing legumes, Glycine max (soybean) and Medicago truncatula (alfalfa). While the soybean FBX genes emerged mainly through segmental duplications (including whole-genome duplications), the M. truncatula genome is dominated by locally duplicated (tandem) F-box genes. Many of these young FBX genes evolved complex transcriptional patterns, including preferential transcription in different tissues, suggesting that they have probably been recruited to important biochemical pathways (e.g. nodulation and seed development).

  14. Mitochondrial genomes of praying mantises (Dictyoptera, Mantodea): rearrangement, duplication, and reassignment of tRNA genes

    PubMed Central

    Ye, Fei; Lan, Xu-e; Zhu, Wen-bo; You, Ping

    2016-01-01

    Insect mitochondrial genomes (mitogenomes) contain a conserved set of 37 genes for an extensive diversity of lineages. Previously reported dictyopteran mitogenomes share this conserved mitochondrial gene arrangement, although surprisingly little is known about the mitogenome of Mantodea. We sequenced eight mantodean mitogenomes including the first representatives of two families: Hymenopodidae and Liturgusidae. Only two of these genomes retain the typical insect gene arrangement. In three Liturgusidae species, the trnM genes have translocated. Four species of mantis (Creobroter gemmata, Mantis religiosa, Statilia sp., and Theopompa sp.-HN) have multiple identical tandem duplications of trnR, and Statilia sp. additionally includes five extra duplicate trnW. These extra trnR and trnW in Statilia sp. are erratically arranged and form another novel gene order. Interestingly, the extra trnW is converted from trnR by a point mutation at the anticodon, which is the first case of tRNA reassignment for an insect. Furthermore, no significant differences in codon usage were observed amongst mantodean mitogenomes with variable copies of tRNA. Combined with phylogenetic analysis, the characteristics of tRNA possess only limited phylogenetic information in this research. Nevertheless, these features of gene rearrangement, duplication, and reassignment provide valuable information toward understanding mitogenome evolution in insects. PMID:27157299

  15. Gene duplications in prokaryotes can be associated with environmental adaptation

    PubMed Central

    2010-01-01

    Background: Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results: Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with, for example, energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions: Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism

  16. [Duplication of DNA--a mechanism for the development of new functionality of genes].

    PubMed

    Maślanka, Roman; Zadrąg-Tęcza, Renata

    2015-01-01

    The amplification of DNA is considered a mechanism for the rapid evolution of organisms. Duplication can be especially advantageous under changing environmental conditions. Whole genome duplication (WGD) maintains the proper balance of gene expression. This seems to be the main reason why WGD is more favorable than duplication of DNA fragments. The polyploid state disappears as a result of the loss of the majority of duplicated genes. The preservation of duplicated genes is associated with the development of new functions. Polyploidization is often noted in plants. However, thanks to sequencing techniques, duplication episodes are increasingly reported for other taxa as well, including animals. An ancient genome duplication is also proposed for the yeast Saccharomyces cerevisiae. The existence of two active copies of ribosomal protein genes may be evidence of this process. Development of the fermentation process might be one of the probable causes of the yeast genome duplication.

  17. Domain loss facilitates accelerated evolution and neofunctionalization of duplicate snake venom metalloproteinase toxin genes.

    PubMed

    Casewell, Nicholas R; Wagstaff, Simon C; Harrison, Robert A; Renjifo, Camila; Wüster, Wolfgang

    2011-09-01

    Gene duplication is a key mechanism for the adaptive evolution and neofunctionalization of gene families. Large multigene families often exhibit complex evolutionary histories as a result of frequent gene duplication acting in concordance with positive selection pressures. Alterations in the domain structure of genes, causing changes in the molecular scaffold of proteins, can also result in a complex evolutionary history and has been observed in functionally diverse multigene toxin families. Here, we investigate the role alterations in domain structure have on the tempo of evolution and neofunctionalization of multigene families using the snake venom metalloproteinases (SVMPs) as a model system. Our results reveal that the evolutionary history of viperid (Serpentes: Viperidae) SVMPs is repeatedly punctuated by domain loss, with the single loss of the cysteine-rich domain, facilitating the formation of P-II class SVMPs, occurring prior to the convergent loss of the disintegrin domain to form multiple P-I SVMP structures. Notably, the majority of phylogenetic branches where domain loss was inferred to have occurred exhibited highly significant evidence of positive selection in surface-exposed amino acid residues, resulting in the neofunctionalization of P-II and P-I SVMP classes. These results provide a valuable insight into the mechanisms by which complex gene families evolve and detail how the loss of domain structures can catalyze the accelerated evolution of novel gene paralogues. The ensuing generation of differing molecular scaffolds encoded by the same multigene family facilitates gene neofunctionalization while presenting an evolutionary advantage through the retention of multiple genes capable of encoding functionally distinct proteins.

  18. CTDGFinder: A Novel Homology-Based Algorithm for Identifying Closely Spaced Clusters of Tandemly Duplicated Genes.

    PubMed

    Ortiz, Juan F; Rokas, Antonis

    2017-01-01

    Closely spaced clusters of tandemly duplicated genes (CTDGs) contribute to the diversity of many phenotypes, including chemosensation, snake venom, and animal body plans. CTDGs have traditionally been identified subjectively as genomic neighborhoods containing several gene duplicates in close proximity; however, CTDGs are often highly variable with respect to gene number, intergenic distance, and synteny. This lack of formal definition hampers the study of CTDG evolutionary dynamics and the discovery of novel CTDGs in the exponentially growing body of genomic data. To address this gap, we developed a novel homology-based algorithm, CTDGFinder, which formalizes and automates the identification of CTDGs by examining the physical distribution of individual members of families of duplicated genes across chromosomes. Application of CTDGFinder accurately identified CTDGs for many well-known gene clusters (e.g., Hox and beta-globin gene clusters) in the human, mouse and 20 other mammalian genomes. Differences between previously annotated gene clusters and our inferred CTDGs were due to the exclusion of nonhomologs that have historically been considered parts of specific gene clusters, the inclusion or absence of genes between the CTDGs and their corresponding gene clusters, and the splitting of certain gene clusters into distinct CTDGs. Examination of human genes showing tissue-specific enhancement of their expression by CTDGFinder identified members of several well-known gene clusters (e.g., cytochrome P450s and olfactory receptors) and revealed that they were unequally distributed across tissues. By formalizing and automating CTDG identification, CTDGFinder will facilitate understanding of CTDG evolutionary dynamics, their functional implications, and how they are associated with phenotypic diversity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.

  19. Insertion of the IL1RAPL1 gene into the duplication junction of the dystrophin gene.

    PubMed

    Zhang, Zhujun; Yagi, Mariko; Okizuka, Yo; Awano, Hiroyuki; Takeshima, Yasuhiro; Matsuo, Masafumi

    2009-08-01

    Duplications of one or more exons of the dystrophin gene are the second most common mutation in dystrophinopathies. Even though duplications are suggested to occur with greater complexity than thought earlier, they have been considered an intragenic event. Here, we report the insertion of a part of the IL1RAPL1 (interleukin-1 receptor accessory protein-like 1) gene into the duplication junction site. When the actual exon junction was examined in 15 duplication mutations in the dystrophin gene by analyzing dystrophin mRNA, one patient was found to have an unknown 621 bp insertion at the junction of duplication of exons from 56 to 62. Unexpectedly, the inserted sequence was found completely identical to sequences of exons 3-5 of the IL1RAPL1 gene that is nearly 100 kb distal from the dystrophin gene. Accordingly, the insertion of IL1RAPL1 exons 3-5 between dystrophin exons 62 and 56 was confirmed at the genomic sequence level. One junction between the IL1RAPL1 intron 5 and dystrophin intron 55 was localized within an Alu sequence. These results showed that a fragment of the IL1RAPL1 gene was inserted into the duplication junction of the dystrophin gene in the same direction as the dystrophin gene. This suggests the novel possibility of co-occurrence of complex genomic rearrangements in dystrophinopathy.

  20. A limited role for gene duplications in the evolution of platypus venom.

    PubMed

    Wong, Emily S W; Papenfuss, Anthony T; Whittington, Camilla M; Warren, Wesley C; Belov, Katherine

    2012-01-01

    Gene duplication followed by adaptive selection is believed to be the primary driver of venom evolution. However, to date, no studies have evaluated the importance of gene duplications for venom evolution using a genomic approach. The availability of a sequenced genome and a venom gland transcriptome for the enigmatic platypus provides a unique opportunity to explore the role that gene duplication plays in venom evolution. Here, we identify gene duplication events and correlate them with expressed transcripts in an in-season venom gland. Gene duplicates (1,508) were identified. These duplicated pairs (421), including genes that have undergone multiple rounds of gene duplications, were expressed in the venom gland. The majority of these genes are involved in metabolism and protein synthesis, not toxin functions. Twelve secretory genes, including serine proteases, metalloproteinases, and protease inhibitors likely to produce symptoms of envenomation such as vasodilation and pain, were detected. Only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication. Platypus venom C-type natriuretic peptides and nerve growth factor do not possess lineage-specific gene duplicates. Extensive duplications, believed to increase the potency of toxic content and promote toxin diversification, were not found. This is the first study to take a genome-wide approach in order to examine the impact of gene duplication on venom evolution. Our findings support the idea that adaptive selection acts on gene duplicates to drive the independent evolution and functional diversification of similar venom genes in venomous species. However, gene duplications alone do not explain the "venome" of the platypus. Other mechanisms, such as alternative splicing and mutation, may be important in venom innovation.

  1. Gene duplication and speciation in Drosophila: evidence from the Odysseus locus.

    PubMed

    Ting, Chau-Ti; Tsaur, Shun-Chern; Sun, Sha; Browne, William E; Chen, Yung-Chia; Patel, Nipam H; Wu, Chung-I

    2004-08-17

    The importance of gene duplication in evolution has long been recognized. Because duplicated genes are prone to diverge in function, gene duplication could plausibly play a role in species differentiation. However, experimental evidence linking gene duplication with speciation is scarce. Here, we show that a hybrid-male sterility gene, Odysseus (OdsH), arose by gene duplication in the Drosophila genome. OdsH has evolved at a very high rate, whereas its most immediate paralog, unc-4, is nearly identical among species in the Drosophila melanogaster subgroup. The disparity in their sequence evolution is echoed by the divergence in their expression patterns in both soma and reproductive tissues. We suggest that duplicated genes that have yet to evolve a stable function at the time of speciation may be candidates for "speciation genes," which are broadly defined as genes that contribute to differential adaptation between species.

  2. Gene Duplication and Multiplicity of Collagenases in Clostridium histolyticum

    PubMed Central

    Matsushita, Osamu; Jung, Chang-Min; Katayama, Seiichi; Minami, Junzaburo; Takahashi, Yukie; Okabe, Akinobu

    1999-01-01

    Clostridium histolyticum collagenase contains a number of different active components. Previously we have shown that colH encodes a 116-kDa collagenase (ColH) and a 98-kDa gelatinase. We purified a different 116-kDa collagenase (ColG) from the culture supernatant and sequenced its gene (colG). We also identified four other gelatinases (105, 82, 78, and 67 kDa) and determined their N-terminal amino acid sequences, all of which coincided with that of either ColG or ColH. Hybridization experiments showed that each gene is present in a single copy and each gene is transcribed into a single mRNA. These results suggest that all the gelatinases are produced from the respective full-length collagenase by the proteolytic removal of C-terminal fragments. The substrate specificities of the enzymes suggest that colG and colH encode class I and class II enzymes, respectively. Analysis of their DNA locations by pulsed-field gel electrophoresis and nucleotide sequencing of their surrounding regions revealed that the two genes are located in different sites on the chromosome. C. histolyticum colG is more similar to C. perfringens colA than to colH in terms of domain structure. Both colG and colA have a homologous gene, mscL, at their 3′ ends. These results suggest that gene duplication and segment duplication have occurred in an ancestor cell common to C. histolyticum and C. perfringens and that further divergence of the parent gene produced colG and colA. PMID:9922257

  3. Impact of duplicate gene copies on phylogenetic analysis and divergence time estimates in butterflies

    PubMed Central

    Pohl, Nélida; Sison-Mangus, Marilou P; Yee, Emily N; Liswi, Saif W; Briscoe, Adriana D

    2009-01-01

    Background: The increase in availability of genomic sequences for a wide range of organisms has revealed gene duplication to be a relatively common event. Encounters with duplicate gene copies have consequently become almost inevitable in the context of collecting gene sequences for inferring species trees. Here we examine the effect of incorporating duplicate gene copies evolving at different rates on tree reconstruction and time estimation of recent and deep divergences in butterflies. Results: Sequences from ultraviolet-sensitive (UVRh), blue-sensitive (BRh), and long-wavelength sensitive (LWRh) opsins, EF-1α and COI were obtained from 27 taxa representing the five major butterfly families (5535 bp total). Both BRh and LWRh are present in multiple copies in some butterfly lineages and the different copies evolve at different rates. Regardless of the phylogenetic reconstruction method used, we found that analyses of combined data sets using either slower or faster evolving copies of duplicate genes resulted in a single topology in agreement with our current understanding of butterfly family relationships based on morphology and molecules. Interestingly, individual analyses of BRh and LWRh sequences also recovered these family-level relationships. Two different relaxed clock methods resulted in similar divergence time estimates at the shallower nodes in the tree, regardless of whether faster or slower evolving copies were used, with larger discrepancies observed at deeper nodes in the phylogeny. The time of divergence between the monarch butterfly Danaus plexippus and the queen D. gilippus (15.3–35.6 Mya) was found to be much older than the time of divergence between monarch co-mimic Limenitis archippus and red-spotted purple L. arthemis (4.7–13.6 Mya), and overlapping with the time of divergence of the co-mimetic passionflower butterflies Heliconius erato and H. melpomene (13.5–26.1 Mya). Our family-level results are congruent with recent estimates found in

  4. Impact of duplicate gene copies on phylogenetic analysis and divergence time estimates in butterflies.

    PubMed

    Pohl, Nélida; Sison-Mangus, Marilou P; Yee, Emily N; Liswi, Saif W; Briscoe, Adriana D

    2009-05-13

    The increase in availability of genomic sequences for a wide range of organisms has revealed gene duplication to be a relatively common event. Encounters with duplicate gene copies have consequently become almost inevitable in the context of collecting gene sequences for inferring species trees. Here we examine the effect of incorporating duplicate gene copies evolving at different rates on tree reconstruction and time estimation of recent and deep divergences in butterflies. Sequences from ultraviolet-sensitive (UVRh), blue-sensitive (BRh), and long-wavelength sensitive (LWRh) opsins, EF-1α and COI were obtained from 27 taxa representing the five major butterfly families (5535 bp total). Both BRh and LWRh are present in multiple copies in some butterfly lineages and the different copies evolve at different rates. Regardless of the phylogenetic reconstruction method used, we found that analyses of combined data sets using either slower or faster evolving copies of duplicate genes resulted in a single topology in agreement with our current understanding of butterfly family relationships based on morphology and molecules. Interestingly, individual analyses of BRh and LWRh sequences also recovered these family-level relationships. Two different relaxed clock methods resulted in similar divergence time estimates at the shallower nodes in the tree, regardless of whether faster or slower evolving copies were used, with larger discrepancies observed at deeper nodes in the phylogeny. The time of divergence between the monarch butterfly Danaus plexippus and the queen D. gilippus (15.3-35.6 Mya) was found to be much older than the time of divergence between monarch co-mimic Limenitis archippus and red-spotted purple L. arthemis (4.7-13.6 Mya), and overlapping with the time of divergence of the co-mimetic passionflower butterflies Heliconius erato and H. melpomene (13.5-26.1 Mya). Our family-level results are congruent with recent estimates found in the literature and indicate

  5. Coregulation of tandem duplicate genes slows evolution of subfunctionalization in mammals

    PubMed Central

    Lan, Xun; Pritchard, Jonathan K.

    2016-01-01

    Gene duplication is a fundamental process in genome evolution. However, most young duplicates are degraded by loss-of-function mutations, and the factors that allow some duplicate pairs to survive long-term remain controversial. One class of models to explain duplicate retention invokes sub- or neofunctionalization, whereas others focus on sharing of gene dosage. RNA-sequencing data from 46 human and 26 mouse tissues indicate that subfunctionalization of expression evolves slowly and is rare among duplicates that arose within the placental mammals, possibly because tandem duplicates are coregulated by shared genomic elements. Instead, consistent with the dosage-sharing hypothesis, most young duplicates are down-regulated to match expression levels of single-copy genes. Thus, dosage sharing of expression allows for the initial survival of mammalian duplicates, followed by slower functional adaptation enabling long-term preservation. PMID:27199432

  6. Imbalanced positive selection maintains the functional divergence of duplicated DIHYDROKAEMPFEROL 4-REDUCTASE genes

    PubMed Central

    Huang, Bing-Hong; Chen, Yi-Wen; Huang, Chia-Lung; Gao, Jian; Liao, Pei-Chun

    2016-01-01

    Gene duplication could be beneficial by functional division but might increase the risk of genetic load. The dynamics of the number of duplicated paralogs could involve recombination, positive selection, and functional divergence. Duplication of DIHYDROFLAVONOL 4-REDUCTASE (DFR) has been reported in several organisms and may have been retained by escape from adaptive conflict (EAC). In this study, we screened the angiosperm DFR gene, focusing on the diversified genus Scutellaria, to investigate how these duplicated genes are retained. We deduced that gene duplication involved multiple independent events in angiosperms, but the duplication of DFR occurred before the divergence of Scutellaria. Asymmetric positive selective pressures resulted in different evolutionary rates between the duplicates. Different numbers of regulatory elements, differential codon usages, radical amino acid changes, and differential gene expression provide evidence of functional divergence between the two DFR duplicates in Scutellaria, implying adaptive subfunctionalization between duplicates. The discovery of pseudogenes accompanying a reduced replacement rate in one DFR paralogous gene suggested a possible “loss of function” due to dosage imbalance after the transient adaptive subfunctionalization in the early stage of duplication. Notwithstanding, episodic gene duplication and functional divergence may be relevant to the diversification of the ecological function of the DFR gene in Scutellaria. PMID:27966614

  7. Uncovering a gene duplication of the photoreceptive protein, opsin, in scallops (Bivalvia: Pectinidae).

    PubMed

    Serb, Jeanne M; Porath-Krause, Anita J; Pairett, Autum N

    2013-07-01

    Evolutionary biologists have long been interested in how expansions of the photosensory system might contribute to morphological differentiation of animals. Comparative studies in vertebrate and arthropod lineages have provided considerable insight into how the duplication of opsin, the first gene of the phototransduction pathway, has led to functional differentiation and new ecological opportunities; however, this relationship cannot be examined in many invertebrate groups as we have yet to characterize their opsin content. Scallops (Pectinidae) are a promising molluscan model to study the evolution of opsin and its potential role in speciation. Recently, we discovered a second Gq-coupled, or r-, opsin gene expressed in the eyes of two scallop species. To investigate the evolutionary origin of this opsin, we screened 12 bivalve species from 4 families, representing both mobile and sessile species, with and without eyes. Although only one ortholog was recovered from the genome of the eyeless, immobile oyster, we found both genes to have been retained in 3 families comprising the order Pectinoida. Within this clade, non-mobile species of scallops appear to have lost one gene. Phylogeny-based tests of selection indicate different degrees of purifying selection following duplication. These data, in conjunction with highly divergent gene sequences and ortholog-specific retention, suggest functional differences. Our results are congruent with a Gq-opsin gene duplication in an oyster-Pectinoida ancestor, approximately 470 Mya, and suggest that the likelihood of retaining both genes is associated with the presence of eyes and/or the degree of mobility. The identification of two highly divergent Gq-opsin genes in scallops is valuable for future functional investigations and provides a foundation for further study of a morphologically and ecologically diverse clade of bivalves that has been understudied with respect to visual ecology and diversification of opsin.

  8. Cloning and analysis of an HMG gene from the lamprey Lampetra fluviatilis: gene duplication in vertebrate evolution.

    PubMed

    Sharman, A C; Hay-Schmidt, A; Holland, P W

    1997-01-03

    Evolution has shaped the organisation of vertebrate genomes, including the human genome. To shed further light on genome history, we have cloned and analysed an HMG gene from lamprey, representing one of the earliest vertebrate lineages. Genes of the HMG1/2 family encode chromosomal proteins that bind DNA in a non-sequence-specific manner, and have been implicated in a variety of cellular processes dependent on chromatin structure. They are characterised by two copies of a conserved motif, the HMG box, followed by an acidic C-terminal region. We report here the cloning of a cDNA clone from the river lamprey Lampetra fluviatilis containing a gene with two HMG boxes and an acidic tail; we designate this gene LfHMG1. Molecular phylogenetic analysis shows that LfHMG1 is descended from a gene ancestral to mammalian HMG1 and HMG2. This implies that there was a duplication event in the HMG1/2 gene family, that occurred after the divergence of the jawed and jawless fishes, 450 million years ago. This conclusion supports and refines the hypothesis that there was a period of extensive gene duplication early in vertebrate evolution. We also show that the HMG1/2 family originated before the protostomes and deuterostomes diverged, over 525 million years ago.

  9. Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees.

    PubMed

    Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis

    2014-03-01

    Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popular reconciliation methods consider discordance only due to gene duplication and loss (and sometimes horizontal gene transfer). Methods that do model ILS are either highly parameterized or consider a restricted set of histories, thus limiting their applicability and accuracy. To address these challenges, we present a novel algorithm DLCpar for inferring a most parsimonious (MP) history of a gene family in the presence of duplications, losses, and ILS. Our algorithm relies on a new reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes coalescent and duplication-loss history. We show that the LCT representation enables an exhaustive and efficient search over the space of reconciliations, and, for most gene families, the least common ancestor (LCA) mapping is an optimal solution for the species mapping between the gene tree and species tree in an MP LCT. Applying our algorithm to a variety of clades, including flies, fungi, and primates, as well as to simulated phylogenies, we achieve high accuracy, comparable to sophisticated probabilistic reconciliation methods, at reduced run time and with far fewer parameters. These properties enable inferences of the complex evolution of gene families across a broad range of species and large data sets.

  10. Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees

    PubMed Central

    Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Kellis, Manolis

    2014-01-01

    Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popular reconciliation methods consider discordance only due to gene duplication and loss (and sometimes horizontal gene transfer). Methods that do model ILS are either highly parameterized or consider a restricted set of histories, thus limiting their applicability and accuracy. To address these challenges, we present a novel algorithm DLCpar for inferring a most parsimonious (MP) history of a gene family in the presence of duplications, losses, and ILS. Our algorithm relies on a new reconciliation structure, the labeled coalescent tree (LCT), that simultaneously describes coalescent and duplication-loss history. We show that the LCT representation enables an exhaustive and efficient search over the space of reconciliations, and, for most gene families, the least common ancestor (LCA) mapping is an optimal solution for the species mapping between the gene tree and species tree in an MP LCT. Applying our algorithm to a variety of clades, including flies, fungi, and primates, as well as to simulated phylogenies, we achieve high accuracy, comparable to sophisticated probabilistic reconciliation methods, at reduced run time and with far fewer parameters. These properties enable inferences of the complex evolution of gene families across a broad range of species and large data sets. PMID:24310000

  11. Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes

    PubMed Central

    Wang, Jun; Marowsky, Nicholas C.; Fan, Chuanzhu

    2014-01-01

    It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes. PMID:25310342

  12. Gene structure of murine Gna11 and Gna15: tandemly duplicated Gq class G protein alpha subunit genes.

    PubMed

    Davignon, I; Barnard, M; Gavrilova, O; Sweet, K; Wilkie, T M

    1996-02-01

    G protein alpha subunits are encoded by a multigene family of 16 genes that can be grouped into four classes, Gq, Gs, Gi, and G12. The Gq class is composed of four genes in mouse and human, and two of these genes, Gna11 and Gna15, cosegregate on mouse chromosome 10. We have characterized the gene structures of murine Gna11 and Gna15. The two genes are tandemly duplicated in a head-to-tail array. The upstream gene, Gna11, is ubiquitously expressed, whereas expression of the downstream gene, Gna15, is restricted to hematopoietic cells. The coding sequence of each gene is contained within seven exons, and the two genes together span 43 kb, separated by 6 kb of intergenic region. We have found no evidence for alternative splicing within the coding sequence of either gene. Sequence alignments show that the positions of the six intervening sequences are conserved in the two genes, consistent with Gna11 and Gna15 arising by tandem duplication from a common progenitor gene in vertebrates. Phylogenetic trees reveal unequal evolutionary rates among alpha subunits of the Gq class. The rate of change is approximately sixfold higher in Gna15 than in Gna11.

  13. Gene structure of murine Gna11 and Gna15: Tandemly duplicated Gq class G protein α subunit genes

    SciTech Connect

    Davignon, I.; Barnard, M.; Sweet, K.

    1996-02-01

    G protein α subunits are encoded by a multigene family of 16 genes that can be grouped into four classes, Gq, Gs, Gi, and G12. The Gq class is composed of four genes in mouse and human, and two of these genes, Gna11 and Gna15, cosegregate on mouse chromosome 10. We have characterized the gene structures of murine Gna11 and Gna15. The two genes are tandemly duplicated in a head-to-tail array. The upstream gene, Gna11, is ubiquitously expressed, whereas expression of the downstream gene, Gna15, is restricted to hematopoietic cells. The coding sequence of each gene is contained within seven exons, and the two genes together span 43 kb, separated by 6 kb of intergenic region. We have found no evidence for alternative splicing within the coding sequence of either gene. Sequence alignments show that the positions of the six intervening sequences are conserved in the two genes, consistent with Gna11 and Gna15 arising by tandem duplication from a common progenitor gene in vertebrates. Phylogenetic trees reveal unequal evolutionary rates among α subunits of the Gq class. The rate of change is approximately sixfold higher in Gna15 than in Gna11. 43 refs., 3 figs., 2 tabs.

  14. The HOPA Gene Dodecamer Duplication Is Not a Significant Etiological Factor in Autism.

    ERIC Educational Resources Information Center

    Michaelis, Ron C.; Copeland-Yates, Susan A.; Sossey-Alaoui, Khalid; Skinner, Cindy; Friez, Michael J.; Longshore, John W.; Simensen, Richard J.; Schroer, Richard J.; Stevenson, Roger E.

    2000-01-01

    A study of 202 patients with autism found the incidence of a dodecamer duplication in the HOPA gene was not significantly different between patients and controls. Three female patients inherited the duplication from nonautistic fathers. Also, there was no systematic skewing of X inactivation in female patients with the duplication. (Contains…

  15. Evolution by gene duplication of Medicago truncatula PISTILLATA-like transcription factors.

    PubMed

    Roque, Edelín; Fares, Mario A; Yenush, Lynne; Rochina, Mari Cruz; Wen, Jiangqi; Mysore, Kirankumar S; Gómez-Mena, Concepción; Beltrán, José Pío; Cañas, Luis A

    2016-03-01

    PISTILLATA (PI) is a member of the B-function MADS-box gene family, which controls the identity of both petals and stamens in Arabidopsis thaliana. In Medicago truncatula (Mt), there are two PI-like paralogs, known as MtPI and MtNGL9. These genes differ in their expression patterns, but it is not known whether their functions have also diverged. Describing the evolution of certain duplicated genes, such as transcription factors, remains a challenge owing to the complex expression patterns and functional divergence between the gene copies. Here, we report a number of functional studies, including analyses of gene expression, protein-protein interactions, and reverse genetic approaches designed to demonstrate the respective contributions of each M. truncatula PI-like paralog to the B-function in this species. Also, we have integrated molecular evolution approaches to determine the mode of evolution of Mt PI-like genes after duplication. Our results demonstrate that MtPI functions as a master regulator of B-function in M. truncatula, maintaining the overall ancestral function, while MtNGL9 does not seem to have a role in this regard, suggesting that the pseudogenization could be the functional evolutionary fate for this gene. However, we provide evidence that purifying selection is the primary evolutionary force acting on this paralog, pinpointing the conservation of its biochemical function and, alternatively, the acquisition of a new role for this gene.

  16. Duplication and Diversification of Dipteran Argonaute Genes, and the Evolutionary Divergence of Piwi and Aubergine.

    PubMed

    Lewis, Samuel H; Salmela, Heli; Obbard, Darren J

    2016-02-11

    Genetic studies of Drosophila melanogaster have provided a paradigm for RNA interference (RNAi) in arthropods, in which the microRNA and antiviral pathways are each mediated by a single Argonaute (Ago1 and Ago2) and germline suppression of transposable elements is mediated by a trio of Piwi-subfamily Argonaute proteins (Ago3, Aub, and Piwi). Without a suitable evolutionary context, deviations from this can be interpreted as derived or idiosyncratic. Here we analyze the evolution of Argonaute genes across the genomes and transcriptomes of 86 Dipteran species, showing that variation in copy number can occur rapidly, and that there is constant flux in some RNAi mechanisms. The lability of the RNAi pathways is illustrated by the divergence of Aub and Piwi (182-156 Ma), independent origins of multiple Piwi-family genes in Aedes mosquitoes (less than 25Ma), and the recent duplications of Ago2 and Ago3 in the tsetse fly Glossina morsitans. In each case the tissue specificity of these genes has altered, suggesting functional divergence or innovation, and consistent with the action of dynamic selection pressures across the Argonaute gene family. We find there are large differences in evolutionary rates and gene turnover between pathways, and that paralogs of Ago2, Ago3, and Piwi/Aub show contrasting rates of evolution after duplication. This suggests that Argonautes undergo frequent evolutionary expansions that facilitate functional divergence. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Duplication and Diversification of Dipteran Argonaute Genes, and the Evolutionary Divergence of Piwi and Aubergine

    PubMed Central

    Lewis, Samuel H.; Salmela, Heli; Obbard, Darren J.

    2016-01-01

    Genetic studies of Drosophila melanogaster have provided a paradigm for RNA interference (RNAi) in arthropods, in which the microRNA and antiviral pathways are each mediated by a single Argonaute (Ago1 and Ago2) and germline suppression of transposable elements is mediated by a trio of Piwi-subfamily Argonaute proteins (Ago3, Aub, and Piwi). Without a suitable evolutionary context, deviations from this can be interpreted as derived or idiosyncratic. Here we analyze the evolution of Argonaute genes across the genomes and transcriptomes of 86 Dipteran species, showing that variation in copy number can occur rapidly, and that there is constant flux in some RNAi mechanisms. The lability of the RNAi pathways is illustrated by the divergence of Aub and Piwi (182–156 Ma), independent origins of multiple Piwi-family genes in Aedes mosquitoes (less than 25Ma), and the recent duplications of Ago2 and Ago3 in the tsetse fly Glossina morsitans. In each case the tissue specificity of these genes has altered, suggesting functional divergence or innovation, and consistent with the action of dynamic selection pressures across the Argonaute gene family. We find there are large differences in evolutionary rates and gene turnover between pathways, and that paralogs of Ago2, Ago3, and Piwi/Aub show contrasting rates of evolution after duplication. This suggests that Argonautes undergo frequent evolutionary expansions that facilitate functional divergence. PMID:26868596

  18. Evolution history of duplicated smad3 genes in teleost: insights from Japanese flounder, Paralichthys olivaceus

    PubMed Central

    Du, Xinxin; Liu, Yuezhong; Liu, Jinxiang; Zhang, Quanqi

    2016-01-01

    Following the two rounds of whole-genome duplication (WGD) during deuterostome evolution, a third genome duplication occurred in the ray-finned fish lineage and is considered to be responsible for the teleost-specific lineage diversification and regulation mechanisms. SMAD3, a receptor-regulated SMAD (R-SMAD), has been widely studied in mammals. However, limited information on its role or putative paralogs is available in ray-finned fishes. In this study, two SMAD3 paralogs were first identified in the transcriptome and genome of Japanese flounder (Paralichthys olivaceus). We also explored SMAD3 duplication in other selected species. Following identification, genomic structure, phylogenetic reconstruction, and synteny analyses performed by MrBayes and online bioinformatic tools confirmed that smad3a/3b most likely originated from the teleost-specific WGD. Additionally, selection pressure analysis and expression pattern analysis of the two genes, performed by PAML and quantitative real-time PCR (qRT-PCR), revealed evidence of subfunctionalization of the two SMAD3 paralogs in teleosts. Our results indicate that the two SMAD3 genes originated from the teleost-specific WGD, remain transcriptionally active, and have likely undergone subfunctionalization. This study provides novel insights into the evolutionary fates of smad3a/3b and draws attention to future functional analysis of the SMAD3 gene family. PMID:27703851

  19. Modelling the evolution of multi-gene families.

    PubMed

    Nye, Tom M W

    2009-10-01

    A number of biological processes can lead to genes being copied within the genome of some given species. Duplicate genes of this form are called paralogs and such genes share a high degree of sequence similarity as well as often having closely related functions. Some genes have become widely duplicated to form multigene families in which the copies are distributed both within the genomes of individual species and across different species. Statistical modelling of gene duplication and the evolution of multi-gene families currently lags behind well-established models of DNA sequence evolution despite an increasing volume of available data, but the analysis of multi-gene families is important as part of a wider effort to understand evolution at the genomic level. This article reviews existing approaches to modelling multi-gene families and presents various challenges and possibilities for this exciting area of research.

  20. Accelerated evolution after gene duplication: a time-dependent process affecting just one copy.

    PubMed

    Pegueroles, Cinta; Laurie, Steve; Albà, M Mar

    2013-08-01

    Gene duplication is widely regarded as a major mechanism modeling genome evolution and function. However, the mechanisms that drive the evolution of the two, initially redundant, gene copies are still ill defined. Many gene duplicates experience evolutionary rate acceleration, but the relative contribution of positive selection and random drift to the retention and subsequent evolution of gene duplicates, and for how long the molecular clock may be distorted by these processes, remains unclear. Focusing on rodent genes that duplicated before and after the mouse and rat split, we find significantly increased sequence divergence after duplication in only one of the copies, which in nearly all cases corresponds to the novel daughter copy, independent of the mechanism of duplication. We observe that the evolutionary rate of the accelerated copy, measured as the ratio of nonsynonymous to synonymous substitutions, is on average 5-fold higher in the period spanning 4-12 My after the duplication than it was before the duplication. This increase can be explained, at least in part, by the action of positive selection according to the results of the maximum likelihood-based branch-site test. Subsequently, the rate decelerates until purifying selection completely returns to preduplication levels. Reversion to the original rates has already been accomplished 40.5 My after the duplication event, corresponding to a genetic distance of about 0.28 synonymous substitutions per site. Differences in tissue gene expression patterns parallel those of substitution rates, reinforcing the role of neofunctionalization in explaining the evolution of young gene duplicates.

  1. Gene duplication and the evolution of plant MADS-box transcription factors.

    PubMed

    Airoldi, Chiara A; Davies, Brendan

    2012-04-20

    Since the first MADS-box transcription factor genes were implicated in the establishment of floral organ identity in a couple of model plants, the size and scope of this gene family has begun to be appreciated in a much wider range of species. Over the course of millions of years the number of MADS-box genes in plants has increased to the point that the Arabidopsis genome contains more than 100. The understanding gained from studying the evolution, regulation and function of multiple MADS-box genes in an increasing set of species, makes this large plant transcription factor gene family an ideal subject to study the processes that lead to an increase in gene number and the selective birth, death and repurposing of its component members. Here we will use examples taken from the MADS-box gene family to review what is known about the factors that influence the loss and retention of genes duplicated in different ways and examine the varied fates of the retained genes and their associated biological outcomes. Copyright © 2012. Published by Elsevier Ltd.

  2. Correlated duplications and losses in the evolution of palmitoylation writer and eraser families.

    PubMed

    Wittouck, Stijn; van Noort, Vera

    2017-03-20

    Protein post-translational modifications (PTMs) change protein properties. Each PTM type is associated with domain families that apply the modification (writers), remove the modification (erasers) and bind to the modified sites (readers), together called toolkit domains. The evolutionary origin and diversification of these domains remain largely understudied, except for tyrosine phosphorylation. Protein palmitoylation entails the addition of a palmitoyl fatty acid to a cysteine residue. This PTM functions as a membrane anchor and is involved in a range of cellular processes. One writer family and two eraser families are known for protein palmitoylation. In this work we unravel the evolutionary history of these writer and eraser families. We constructed a high-quality profile hidden Markov model (HMM) of each family, searched for protein family members in fully sequenced genomes and subsequently constructed phylogenetic distributions of the families. We constructed Maximum Likelihood phylogenetic trees and, using gene tree rearrangement and tree reconciliation, inferred their evolutionary histories in terms of duplication and loss events. We identified lineages where the families expanded or contracted and found that the evolutionary histories of the families are correlated. The results show that the erasers were invented first, before the origin of the eukaryotes. The writers first arose in the eukaryotic ancestor. The writers and erasers show co-expansions in several eukaryotic ancestral lineages. These expansions often seem to be followed by contractions in some or all of the lineages further in evolution. A general pattern of correlated evolution appears between writer and eraser domains. These co-evolution patterns could be used in new methods for interaction prediction based on phylogenies.

  3. Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes.

    PubMed

    Vlasova, Anna; Capella-Gutiérrez, Salvador; Rendón-Anaya, Martha; Hernández-Oñate, Miguel; Minoche, André E; Erb, Ionas; Câmara, Francisco; Prieto-Barja, Pablo; Corvelo, André; Sanseverino, Walter; Westergaard, Gastón; Dohm, Juliane C; Pappas, Georgios J; Saburido-Alvarez, Soledad; Kedra, Darek; Gonzalez, Irene; Cozzuto, Luca; Gómez-Garrido, Jessica; Aguilar-Morón, María A; Andreu, Nuria; Aguilar, O Mario; Garcia-Mas, Jordi; Zehnsdorf, Maik; Vázquez, Martín P; Delgado-Salinas, Alfonso; Delaye, Luis; Lowy, Ernesto; Mentaberry, Alejandro; Vianello-Brondani, Rosana P; García, José Luís; Alioto, Tyler; Sánchez, Federico; Himmelbauer, Heinz; Santalla, Marta; Notredame, Cedric; Gabaldón, Toni; Herrera-Estrella, Alfredo; Guigó, Roderic

    2016-02-25

    Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. We report the genome and the transcription atlas of coding and non-coding genes of a Mesoamerican genotype of common bean (Phaseolus vulgaris L., BAT93). Using a comprehensive phylogenomics analysis, we assessed the past and recent evolution of common bean, and traced the diversification of patterns of gene expression following duplication. We find that successive rounds of gene duplications in legumes have shaped tissue and developmental expression, leading to increased levels of specialization in larger gene families. We also find that many long non-coding RNAs are preferentially expressed in germ-line-related tissues (pods and seeds), suggesting that they play a significant role in fruit development. Our results also suggest that most bean-specific gene family expansions, including resistance gene clusters, predate the split of the Mesoamerican and Andean gene pools. The genome and transcriptome data herein generated for a Mesoamerican genotype represent a counterpart to the genomic resources already available for the Andean gene pool. Altogether, this information will allow the genetic dissection of the characters involved in the domestication and adaptation of the crop, and their further implementation in breeding strategies for this important crop.

  4. Evolution of the ability to modulate host chemokine networks via gene duplication in human cytomegalovirus (HCMV).

    PubMed

    Scarborough, Jessica A; Paul, John R; Spencer, Juliet V

    2017-03-14

    Human cytomegalovirus (HCMV) is a widespread pathogen that is particularly skillful at evading immune detection and defense mechanisms, largely due to extensive co-evolution with its host. One aspect of this co-evolution involves the acquisition of virally encoded G protein-coupled receptors (GPCRs) with homology to the chemokine receptor family. GPCRs are the largest family of cell surface proteins, found in organisms from yeast to humans, and they regulate a variety of cellular processes including development, sensory perception, and immune cell trafficking. The US27 and US28 genes are encoded by human and primate CMVs, but homologs are not found in the genomes of viruses infecting rodents or other species. Phylogenetic analysis was used to investigate the US27 and US28 genes, which are adjacent in the unique short (US) region of the HCMV genome, and their relationship to one another and to human chemokine receptor genes. The results indicate that both US27 and US28 share the same common ancestor with human chemokine receptor CX3CR1, suggesting that a single host gene was captured and a subsequent viral gene duplication event occurred. The US28 gene product (pUS28) has maintained the function of the ancestral gene and has the ability to bind and signal in response to CX3CL1/fractalkine, the natural ligand for CX3CR1. In contrast, pUS27 does not bind to any known chemokine ligand, and the sequence has diverged significantly, highlighted by the fact that pUS27 currently exhibits greater sequence similarity to human CCR1. While the evolutionary advantage of the gene duplication and neofunctionalization event remains unclear, the US27 and US28 genes are highly conserved among different HCMV strains and retained even in laboratory strains that have lost many virulence genes, suggesting that US27 and US28 have each evolved distinct, important functions during virus infection.

  5. Structure and origin of a tandem duplication of a Drosophila metallothionein gene

    SciTech Connect

    Otto, E.; Maroni, G.

    1987-01-01

    A strain of cadmium-resistant Drosophila was isolated that contained a chromosomal duplication of the metallothionein gene, Mtn. This duplication was a direct, tandem repeat of 2.2 kilobases of DNA: 228 bases of 5' flanking DNA, the entire transcription unit, and 1.4 kilobases of 3' flanking DNA. The entire duplication was cloned and DNA sequences of the regions relevant to the duplication process were determined. Comparison of the sequences of the 5' and 3' boundaries revealed no extensive regions of similarity, thus indicating that this duplication was formed by nonhomologous breakage and reunion. Recently, results of similar analyses by other investigators have suggested that this process was involved in the origin of three other eukaryotic duplications. The authors have observed a chi-like sequence near one of the boundaries of each duplication, and therefore suggest that this sequence may be important in generating one of the breaks required for duplication formation.

  6. Did homeobox gene duplications contribute to the Cambrian explosion?

    PubMed

    Holland, Peter W H

    2015-01-01

    The Cambrian explosion describes an apparently rapid increase in the diversity of bilaterian animals around 540-515 million years ago. Bilaterian animals explore the world in three dimensions deploying forward-facing sense organs, a brain, and an anterior mouth; they possess muscle blocks enabling efficient crawling and burrowing in sediments, and they typically have an efficient 'through-gut' with separate mouth and anus to process bulk food and eject waste, even when burrowing in sediment. A variety of ecological, environmental, genetic, and developmental factors have been proposed as possible triggers and correlates of the Cambrian explosion, and it is likely that a combination of factors was involved. Here, I focus on a set of developmental genetic changes and propose these are part of the mix of permissive factors. I describe how ANTP-class homeobox genes, which encode transcription factors involved in body patterning, increased in number in the bilaterian stem lineage and earlier. These gene duplications generated a large array of ANTP-class genes, including three distinct gene clusters called NK, Hox, and ParaHox. Comparative data support the idea that NK genes were deployed primarily to pattern the bilaterian mesoderm, Hox genes coded position along the central nervous system, and ParaHox genes most likely originally specified the mouth, midgut, and anus of the newly evolved through-gut. It is proposed that diversification of ANTP-class genes played a role in the Cambrian explosion by contributing to the patterning systems used to build animal bodies capable of high-energy directed locomotion, including active burrowing.

  7. Are duplicated genes responsible for anthracnose resistance in common bean?

    PubMed Central

    2017-01-01

    Race 65 of Colletotrichum lindemuthianum, the etiologic agent of anthracnose in common bean, is distributed worldwide and is of great importance in breeding programs for anthracnose resistance. Several alleles conferring resistance to this race have been identified. However, the variability detected within the race has made it difficult to obtain cultivars with durable resistance, because cultivars may have different reactions to each strain of race 65. Thus, this work aimed to study the inheritance of resistance of common bean lines to different strains of C. lindemuthianum race 65. We used six C. lindemuthianum strains previously characterized as belonging to race 65 through the international set of differential cultivars for anthracnose, and nine commercial cultivars adapted to Brazilian growing conditions and with the potential to discriminate the variability within this race. To obtain information on the inheritance of resistance of the nine commercial cultivars to the six strains of race 65, the cultivars were crossed two by two in all possible combinations, resulting in 36 hybrids. Segregation in the F2 generations revealed that resistance to each strain is conditioned by two independent genes with the same function, suggesting that they are duplicated genes in which the dominant allele promotes resistance. These results indicate that the specificity between host resistance genes and pathogen avirulence genes is not limited to races; it also occurs among strains of the same race. Further research may be carried out to establish whether the alleles identified in these cultivars are different from those described in the literature. PMID:28296933

  8. Are duplicated genes responsible for anthracnose resistance in common bean?

    PubMed

    Costa, Larissa Carvalho; Nalin, Rafael Storto; Ramalho, Magno Antonio Patto; de Souza, Elaine Aparecida

    2017-01-01

    Race 65 of Colletotrichum lindemuthianum, the etiologic agent of anthracnose in common bean, is distributed worldwide and is of great importance in breeding programs for anthracnose resistance. Several alleles conferring resistance to this race have been identified. However, the variability detected within the race has made it difficult to obtain cultivars with durable resistance, because cultivars may have different reactions to each strain of race 65. Thus, this work aimed to study the inheritance of resistance of common bean lines to different strains of C. lindemuthianum race 65. We used six C. lindemuthianum strains previously characterized as belonging to race 65 through the international set of differential cultivars for anthracnose, and nine commercial cultivars adapted to Brazilian growing conditions and with the potential to discriminate the variability within this race. To obtain information on the inheritance of resistance of the nine commercial cultivars to the six strains of race 65, the cultivars were crossed two by two in all possible combinations, resulting in 36 hybrids. Segregation in the F2 generations revealed that resistance to each strain is conditioned by two independent genes with the same function, suggesting that they are duplicated genes in which the dominant allele promotes resistance. These results indicate that the specificity between host resistance genes and pathogen avirulence genes is not limited to races; it also occurs among strains of the same race. Further research may be carried out to establish whether the alleles identified in these cultivars are different from those described in the literature.
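    The arithmetic behind this design can be checked with a short Python sketch. Nine cultivars crossed two by two give C(9, 2) = 36 hybrids, as stated in the abstract. The abstract does not report segregation ratios, so the second part is a textbook expectation under stated assumptions: the F1 is heterozygous at two independent loci (hypothetically labelled A/a and B/b), and a dominant allele at either locus alone is enough for resistance. Under those assumptions the expected F2 segregation is 15 resistant : 1 susceptible.

```python
# Sketch of the cross design and the expected F2 ratio for duplicate dominant genes.
# Locus names A/a and B/b are hypothetical labels, not taken from the study.
from fractions import Fraction
from itertools import product
from math import comb


def n_pairwise_crosses(n_cultivars):
    """Number of distinct two-by-two crosses among n cultivars (no reciprocals)."""
    return comb(n_cultivars, 2)


def f2_resistant_fraction():
    """Expected fraction of resistant F2 plants from an AaBb x AaBb cross,
    assuming independent assortment and that one dominant allele at either
    locus is sufficient for resistance."""
    gametes = list(product("Aa", "Bb"))  # AB, Ab, aB, ab, each with probability 1/4
    resistant = 0
    total = 0
    for g1, g2 in product(gametes, repeat=2):
        genotype = g1 + g2  # four alleles, two per locus
        total += 1
        if "A" in genotype or "B" in genotype:
            resistant += 1
    return Fraction(resistant, total)


if __name__ == "__main__":
    print("hybrids from 9 cultivars:", n_pairwise_crosses(9))
    print("expected resistant fraction in F2:", f2_resistant_fraction())
```

    Only the double recessive class (aabb) is susceptible in this model, which is why a single pair of duplicated genes with the same function gives 15/16 resistant plants rather than the 3/4 expected for one gene.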

  9. The birth of a human-specific neural gene by incomplete duplication and gene fusion.

    PubMed

    Dougherty, Max L; Nuttle, Xander; Penn, Osnat; Nelson, Bradley J; Huddleston, John; Baker, Carl; Harshman, Lana; Duyzend, Michael H; Ventura, Mario; Antonacci, Francesca; Sandstrom, Richard; Dennis, Megan Y; Eichler, Evan E

    2017-03-09

    Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing. 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases. Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.

  10. Evolution of Chaperonin Gene Duplication in Stigonematalean Cyanobacteria (Subsection V)

    PubMed Central

    Weissenbach, Julia; Ilhan, Judith; Hülter, Nils; Stucken, Karina; Dagan, Tal

    2017-01-01

    Chaperonins promote protein folding and are known to play a role in the maintenance of cellular stability under stress conditions. The group I bacterial chaperonin complex comprises GroEL, that forms a barrel-like oligomer, and GroES that forms the lid. In most eubacteria the GroES/GroEL chaperonin is encoded by a single-copy bicistronic operon, whereas in cyanobacteria up to three groES/groEL paralogs have been documented. Here we study the evolution and functional diversification of chaperonin paralogs in the heterocystous, multi-seriate filament forming cyanobacterium Chlorogloeopsis fritschii PCC 6912. The genome of C. fritschii encodes two groES/groEL operons (groESL1, groESL1.2) and a monocistronic groEL gene (groEL2). A phylogenetic reconstruction reveals that the groEL2 duplication is as ancient as cyanobacteria, whereas the groESL1.2 duplication occurred at the ancestor of heterocystous cyanobacteria. A comparison of the groEL paralogs transcription levels under different growth conditions shows that they have adapted distinct transcriptional regulation. Our results reveal that groEL1 and groEL1.2 are upregulated during diazotrophic conditions and the localization of their promoter activity points towards a role in heterocyst differentiation. Furthermore, protein–protein interaction assays suggest that paralogs encoded in the two operons assemble into hybrid complexes. The monocistronic encoded GroEL2 is not forming oligomers nor does it interact with the co-chaperonins. Interaction between GroES1.2 and GroEL1.2 could not be documented, suggesting that the groESL1.2 operon does not encode a functional chaperonin complex. Functional complementation experiments in Escherichia coli show that only GroES1/GroEL1 and GroES1/GroEL1.2 can substitute the native operon. In summary, the evolutionary consequences of chaperonin duplication in cyanobacteria include the retention of groESL1 as a housekeeping gene, subfunctionalization of groESL1.2 and

  11. Evolution of Chaperonin Gene Duplication in Stigonematalean Cyanobacteria (Subsection V).

    PubMed

    Weissenbach, Julia; Ilhan, Judith; Bogumil, David; Hülter, Nils; Stucken, Karina; Dagan, Tal

    2017-01-12

    Chaperonins promote protein folding and are known to play a role in the maintenance of cellular stability under stress conditions. The group I bacterial chaperonin complex comprises GroEL, that forms a barrel-like oligomer, and GroES that forms the lid. In most eubacteria the GroES/GroEL chaperonin is encoded by a single-copy bicistronic operon, whereas in cyanobacteria up to three groES/groEL paralogs have been documented. Here we study the evolution and functional diversification of chaperonin paralogs in the heterocystous, multi-seriate filament forming cyanobacterium Chlorogloeopsis fritschii PCC 6912. The genome of C. fritschii encodes two groES/groEL operons (groESL1, groESL1.2) and a monocistronic groEL gene (groEL2). A phylogenetic reconstruction reveals that the groEL2 duplication is as ancient as cyanobacteria, whereas the groESL1.2 duplication occurred at the ancestor of heterocystous cyanobacteria. A comparison of the groEL paralogs transcription levels under different growth conditions shows that they have adapted distinct transcriptional regulation. Our results reveal that groEL1 and groEL1.2 are upregulated during diazotrophic conditions and the localization of their promoter activity points towards a role in heterocyst differentiation. Furthermore, protein-protein interaction assays suggest that paralogs encoded in the two operons assemble into hybrid complexes. The monocistronic GroEL2 is not forming oligomers nor does it interact with the co-chaperonins. Interaction between GroES1.2 and GroEL1.2 could not be documented, suggesting that the groESL1.2 operon does not encode a functional chaperonin complex. Functional complementation experiments in Escherichia coli show that only GroES1/GroEL1 and GroES1/GroEL1.2 can substitute the native operon. In summary, the evolutionary consequences of chaperonin duplication in cyanobacteria include the retention of groESL1 as a housekeeping gene, subfunctionalization of groESL1.2 and neofunctionalization of the

  12. Gene duplication, population genomics, and species-level differentiation within a tropical mountain shrub.

    PubMed

    Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C

    2014-09-14

    Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species.

  13. North Carolina macular dystrophy (MCDR1) caused by a novel tandem duplication of the PRDM13 gene

    PubMed Central

    Sullivan, Lori S.; Wheaton, Dianna K.; Locke, Kirsten G.; Jones, Kaylie D.; Koboldt, Daniel C.; Fulton, Robert S.; Wilson, Richard K.; Blanton, Susan H.; Birch, David G.; Daiger, Stephen P.

    2016-01-01

    Purpose: To identify the underlying cause of disease in a large family with North Carolina macular dystrophy (NCMD). Methods: A large four-generation family (RFS355) with an autosomal dominant form of NCMD was ascertained. Family members underwent comprehensive visual function evaluations. Blood or saliva from six affected family members and three unaffected spouses was collected and DNA tested for linkage to the MCDR1 locus on chromosome 6q12. Three affected family members and two unaffected spouses underwent whole exome sequencing (WES) and subsequently, custom capture of the linkage region followed by next-generation sequencing (NGS). Standard PCR and dideoxy sequencing were used to further characterize the mutation. Results: Of the 12 eyes examined in six affected individuals, all but two had Gass grade 3 macular degeneration features. Large central excavation of the retinal and choroid layers, referred to as a macular caldera, was seen in an age-independent manner in the grade 3 eyes. The calderas are unique to affected individuals with MCDR1. Genome-wide linkage mapping and haplotype analysis of markers from the chromosome 6q region were consistent with linkage to the MCDR1 locus. Whole exome sequencing and custom-capture NGS failed to reveal any rare coding variants segregating with the phenotype. Analysis of the custom-capture NGS sequencing data for copy number variants uncovered a tandem duplication of approximately 60 kb on chromosome 6q. This region contains two genes, CCNC and PRDM13. The duplication creates a partial copy of CCNC and a complete copy of PRDM13. The duplication was found in all affected members of the family and is not present in any unaffected members. The duplication was not seen in 200 ethnically matched normal chromosomes. Conclusions: The cause of disease in the original family with MCDR1 and several others has been recently reported to be dysregulation of the PRDM13 gene, caused by either single base substitutions in a DNase 1

  14. North Carolina macular dystrophy (MCDR1) caused by a novel tandem duplication of the PRDM13 gene.

    PubMed

    Bowne, Sara J; Sullivan, Lori S; Wheaton, Dianna K; Locke, Kirsten G; Jones, Kaylie D; Koboldt, Daniel C; Fulton, Robert S; Wilson, Richard K; Blanton, Susan H; Birch, David G; Daiger, Stephen P

    2016-01-01

    To identify the underlying cause of disease in a large family with North Carolina macular dystrophy (NCMD). A large four-generation family (RFS355) with an autosomal dominant form of NCMD was ascertained. Family members underwent comprehensive visual function evaluations. Blood or saliva from six affected family members and three unaffected spouses was collected and DNA tested for linkage to the MCDR1 locus on chromosome 6q12. Three affected family members and two unaffected spouses underwent whole exome sequencing (WES) and subsequently, custom capture of the linkage region followed by next-generation sequencing (NGS). Standard PCR and dideoxy sequencing were used to further characterize the mutation. Of the 12 eyes examined in six affected individuals, all but two had Gass grade 3 macular degeneration features. Large central excavation of the retinal and choroid layers, referred to as a macular caldera, was seen in an age-independent manner in the grade 3 eyes. The calderas are unique to affected individuals with MCDR1. Genome-wide linkage mapping and haplotype analysis of markers from the chromosome 6q region were consistent with linkage to the MCDR1 locus. Whole exome sequencing and custom-capture NGS failed to reveal any rare coding variants segregating with the phenotype. Analysis of the custom-capture NGS sequencing data for copy number variants uncovered a tandem duplication of approximately 60 kb on chromosome 6q. This region contains two genes, CCNC and PRDM13. The duplication creates a partial copy of CCNC and a complete copy of PRDM13. The duplication was found in all affected members of the family and is not present in any unaffected members. The duplication was not seen in 200 ethnically matched normal chromosomes. The cause of disease in the original family with MCDR1 and several others has been recently reported to be dysregulation of the PRDM13 gene, caused by either single base substitutions in a DNase 1 hypersensitive site upstream of the CCNC

  15. Adaptive evolution of genes duplicated from the Drosophila pseudoobscura neo-X chromosome.

    PubMed

    Meisel, Richard P; Hilldorfer, Benedict B; Koch, Jessica L; Lockton, Steven; Schaeffer, Stephen W

    2010-08-01

    Drosophila X chromosomes are disproportionate sources of duplicated genes, and these duplications are usually the result of retrotransposition of X-linked genes to the autosomes. The excess duplication is thought to be driven by natural selection for two reasons: X chromosomes are inactivated during spermatogenesis, and the derived copies of retroposed duplications tend to be testis expressed. Therefore, autosomal derived copies of retroposed genes provide a mechanism for their X-linked paralogs to "escape" X inactivation. Once these duplications have fixed, they may then be selected for male-specific functions. Throughout the evolution of the Drosophila genus, autosomes have fused with X chromosomes along multiple lineages giving rise to neo-X chromosomes. There has also been excess duplication from the two independent neo-X chromosomes that have been examined--one that occurred prior to the common ancestor of the willistoni species group and another that occurred along the lineage leading to Drosophila pseudoobscura. To determine what role natural selection plays in the evolution of genes duplicated from the D. pseudoobscura neo-X chromosome, we analyzed DNA sequence divergence between paralogs, polymorphism within each copy, and the expression profiles of these duplicated genes. We found that the derived copies of all duplicated genes have elevated nonsynonymous polymorphism, suggesting that they are under relaxed selective constraints. The derived copies also tend to have testis- or male-biased expression profiles regardless of their chromosome of origin. Genes duplicated from the neo-X chromosome appear to be under less constraints than those duplicated from other chromosome arms. We also find more evidence for historical adaptive evolution in genes duplicated from the neo-X chromosome, suggesting that they are under a unique selection regime in which elevated nonsynonymous polymorphism provides a large reservoir of functional variants, some of which are fixed

  16. Gene Duplication in Pseudomonas aeruginosa Improves Growth on Adenosine.

    PubMed

    Toussaint, Jean-Paul; Farrell-Sherman, Anna; Feldman, Tamar Perla; Smalley, Nicole E; Schaefer, Amy L; Greenberg, E Peter; Dandekar, Ajai A

    2017-11-01

    The laboratory strain of Pseudomonas aeruginosa, PAO1, activates genes for catabolism of adenosine using quorum sensing (QS). However, this strain is not well-adapted for growth on adenosine, with doubling times greater than 40 h. We previously showed that when PAO1 is grown on adenosine and casein, variants emerge that grow rapidly on adenosine. To understand the mechanism by which this adaptation occurs, we performed whole-genome sequencing of five isolates evolved for rapid growth on adenosine. All five genomes had a gene duplication-amplification (GDA) event covering several genes, including the quorum-regulated nucleoside hydrolase gene, nuh, and PA0148, encoding an adenine deaminase. In addition, two of the growth variants also exhibited a nuh promoter mutation. We recapitulated the rapid growth phenotype with a plasmid containing six genes common to all the GDA events. We also showed that nuh and PA0148, the two genes at either end of the common GDA, were sufficient to confer rapid growth on adenosine. Additionally, we demonstrated that the variant nuh promoter increased basal expression of nuh but maintained its QS regulation. Therefore, GDA in P. aeruginosa confers the ability to grow efficiently on adenosine while maintaining QS regulation of nucleoside catabolism. IMPORTANCE: Pseudomonas aeruginosa thrives in many habitats and is an opportunistic pathogen of humans. In these diverse environments, P. aeruginosa must adapt to use myriad potential carbon sources. P. aeruginosa PAO1 cannot grow efficiently on nucleosides, including adenosine; however, it can acquire this ability through genetic adaptation. We show that the mechanism of adaptation is by amplification of a specific region of the genome and that the amplification preserves the regulation of the adenosine catabolic pathway by quorum sensing. These results demonstrate an underexplored mechanism of adaptation by P. aeruginosa, with implications for phenotypes such as development of antibiotic

  17. The Evolutionary Relationship between Alternative Splicing and Gene Duplication.

    PubMed

    Iñiguez, Luis P; Hernández, Georgina

    2017-01-01

    The protein diversity that exists today has resulted from various evolutionary processes. It is well known that gene duplication (GD), along with the accumulation of mutations, is responsible, among other factors, for an increase in the number of different proteins. The gene structure in eukaryotes requires the removal of non-coding sequences, introns, to produce mature mRNAs. This process, known as cis-splicing and referred to here as splicing, is regulated by several factors which can lead to numerous splicing arrangements, commonly designated as alternative splicing (AS). AS, which produces several transcript isoforms from a single gene, also increases protein diversity. However, the evolution and manner of increasing protein variation differ between AS and GD. An important question is how patterns of AS are affected after a GD event. Here, we review the current knowledge of AS and GD, focusing on their evolutionary relationship. These two processes are now considered the main contributors to the increasing protein diversity and therefore their relationship is a relevant, yet understudied, area of evolutionary study.

  18. The Evolutionary Relationship between Alternative Splicing and Gene Duplication

    PubMed Central

    Iñiguez, Luis P.; Hernández, Georgina

    2017-01-01

    The protein diversity that exists today has resulted from various evolutionary processes. It is well known that gene duplication (GD), along with the accumulation of mutations, is responsible, among other factors, for an increase in the number of different proteins. The gene structure in eukaryotes requires the removal of non-coding sequences, introns, to produce mature mRNAs. This process, known as cis-splicing and referred to here as splicing, is regulated by several factors which can lead to numerous splicing arrangements, commonly designated as alternative splicing (AS). AS, which produces several transcript isoforms from a single gene, also increases protein diversity. However, the evolution and manner of increasing protein variation differ between AS and GD. An important question is how patterns of AS are affected after a GD event. Here, we review the current knowledge of AS and GD, focusing on their evolutionary relationship. These two processes are now considered the main contributors to the increasing protein diversity and therefore their relationship is a relevant, yet understudied, area of evolutionary study. PMID:28261262

  19. Expression divergence of cellulose synthase (CesA) genes after a recent whole genome duplication event in Populus.

    PubMed

    Takata, Naoki; Taniguchi, Toru

    2015-01-01

    Secondary cell wall-associated CesA genes in Populus have undergone a functional differentiation in expression pattern that may be attributable to evolutionary alteration of regulatory modules. Gene duplication is an important mechanism for functional divergence of genes. Secondary cell wall-associated cellulose synthase genes (CesA4, CesA7 and CesA8) are duplicated in Populus plants due to a recent whole genome duplication event. Here, we demonstrate that duplicate CesA genes show tissue-dependent expression divergence in Populus plants. Real-time PCR analysis of Populus CesA genes suggested that PtxtCesA8-B was more highly expressed than PtxtCesA8-A in phloem and secondary xylem tissue of mature stem. Histochemical and histological analyses of transformants expressing a GFP-GUS fusion gene driven by Populus CesA promoters revealed that the duplicate CesA genes showed different expression patterns in phloem fibers, secondary xylem, root cap and leaf trichomes. We predicted putative cis-regulatory motifs that regulate expression of secondary cell wall-associated CesA genes, and identified 19 motifs that are highly conserved in the CesA gene family of eudicotyledonous plants. Furthermore, a transient transactivation assay identified candidate transcription factors that affect levels and patterns of expression of Populus CesA genes. The present study reveals that secondary cell wall-associated CesA genes in Populus have undergone a functional differentiation in expression pattern that may be attributable to evolutionary alteration of regulatory modules.

  20. A structurally novel salt-regulated promoter of duplicated carbonic anhydrase gene 1 from Dunaliella salina.

    PubMed

    Li, Jie; Lu, Yumin; Xue, Lexun; Xie, Hua

    2010-02-01

    It has been demonstrated that duplicated carbonic anhydrase is induced by salt in Dunaliella salina (D. salina), and duplicated carbonic anhydrase 1 (DCA1) is a member of the carbonic anhydrase family. The purpose of this study was to identify whether both the DCA1 gene and its promoter from D. salina are salt-inducible. In this study, real-time RT-PCR showed that DCA1 transcripts were induced by a gradient of sodium chloride concentrations. Subsequently, a structurally novel promoter of the DCA1 gene containing highly repeated GT/AC sequences was isolated. It was able to drive stable expression of the foreign bar gene in transformed cells of D. salina, and the levels of both protein and mRNA of the bar gene driven by the DCA1 promoter paralleled the gradient of sodium chloride concentrations in the media. Furthermore, analysis of GUS activities revealed that salt-inducible expression of the foreign gus gene was regulated by promoter fragments containing the highly repeated GT sequences, but not by promoter fragments from which these sequences had been deleted. These findings suggest that the highly repeated GT sequence in the DCA1 promoter is involved in salt-inducible regulation in D. salina and may be a novel salt-inducible element.

  1. Multiple tandem duplication of the phenylalanine ammonia-lyase genes in Cucumis sativus L.

    PubMed

    Shang, Qing-Mao; Li, Liang; Dong, Chun-Juan

    2012-10-01

    Phenylalanine ammonia-lyase (PAL) is the first entry enzyme of the phenylpropanoid pathway, and therefore plays a key role in both plant development and stress defense. In many plants, PAL is encoded by a multi-gene family, and each member is differentially regulated in response to environmental stimuli. In the present study, we report that PAL in cucumber (Cucumis sativus L.) is encoded by a family of seven genes (designated as CsPAL1-7). All seven CsPALs are arranged in tandem in two duplication blocks, which are located on chromosomes 4 and 6, respectively. The cDNA and protein sequences of the CsPALs share an overall high identity with each other. Homology modeling reveals similarities in their protein structures, apart from several slight differences, implying different activities in the conversion of phenylalanine. Phylogenetic analysis places CsPAL1-7 in a separate cluster rather than clustering with other plant PALs. Analyses of expression profiles in different cucumber tissues or in response to various stress or plant hormone treatments indicate that CsPAL1-7 play redundant, but divergent roles in cucumber development and stress response. This is consistent with our finding that CsPALs possess overlapping but different cis-elements in their promoter regions. Finally, several duplication events are discussed to explain the evolution of the cucumber PAL genes.

  2. Definition of minimal duplicated region encompassing the XIAP and STAG2 genes in the Xq25 microduplication syndrome.

    PubMed

    Di Benedetto, Daniela; Musumeci, Sebastiano Antonino; Avola, Emanuela; Alberti, Antonino; Buono, Serafino; Scuderi, Carmela; Grillo, Lucia; Galesi, Ornella; Spalletta, Angela; Giudice, Mariangela Lo; Luciano, Daniela; Vinci, Mirella; Bianca, Sebastiano; Romano, Corrado; Fichera, Marco

    2014-08-01

    Typical Xq25 duplications are large and associated with heterogeneous phenotypes. Recently, small duplications involving this genomic region and encompassing the GRIA3 and STAG2 genes have been reported. These Xq25 microduplications are associated with a recognizable syndrome including intellectual disability and distinctive facial appearance. We report on Xq25 microduplications in two unrelated families identified by array comparative genomic hybridization. In both families, the genomic imbalances segregated with the disease in male individuals, while the phenotypes of the heterozygous females appeared to be modulated by their X-inactivation pattern. These rearrangements of about 600 kb involved only three genes: THOC2, XIAP, and STAG2. Further characterization by FISH analyses showed tandem duplication in the Xq25 locus of these genes. These data refine the Xq25 candidate region, identifying a minimal duplicated region of about 270 kb encompassing the XIAP and STAG2 genes. We discuss the function of the genes in the rearrangements and their involvement in the pathogenesis of this disorder.

  3. Afrobatrachian mitochondrial genomes: genome reorganization, gene rearrangement mechanisms, and evolutionary trends of duplicated and rearranged genes

    PubMed Central

    2013-01-01

    Background: Mitochondrial genomic (mitogenomic) reorganizations are rarely found in closely-related animals, yet drastic reorganizations have been found in the Ranoides frogs. The phylogenetic relationships of the three major ranoid taxa (Natatanura, Microhylidae, and Afrobatrachia) have been problematic, and mitogenomic information for afrobatrachians has not been available. Several molecular models for mitochondrial (mt) gene rearrangements have been proposed, but observational evidence has been insufficient to evaluate them. Furthermore, evolutionary trends in rearranged mt genes have not been well understood. To gain molecular and phylogenetic insights into these issues, we analyzed the mt genomes of four afrobatrachian species (Breviceps adspersus, Hemisus marmoratus, Hyperolius marmoratus, and Trichobatrachus robustus) and performed molecular phylogenetic analyses. Furthermore we searched for two evolutionary patterns expected in the rearranged mt genes of ranoids. Results: Extensively reorganized mt genomes having many duplicated and rearranged genes were found in three of the four afrobatrachians analyzed. In fact, Breviceps has the largest known mt genome among vertebrates. Although the kinds of duplicated and rearranged genes differed among these species, a remarkable gene rearrangement pattern of non-tandemly copied genes situated within tandemly-copied regions was commonly found. Furthermore, the existence of concerted evolution was observed between non-neighboring copies of triplicated 12S and 16S ribosomal RNA regions. Conclusions: Phylogenetic analyses based on mitogenomic data support a close relationship between Afrobatrachia and Microhylidae, with their estimated divergence 100 million years ago consistent with present-day endemism of afrobatrachians on the African continent. The afrobatrachian mt data supported the first tandem and second non-tandem duplication model for mt gene rearrangements and the recombination-based model for concerted

  4. Duplication and Retention Biases of Essential and Non-Essential Genes Revealed by Systematic Knockdown Analyses

    PubMed Central

    Rivers, David; Warnecke, Tobias; Jeffries, Sean J.; Kwon, Taejoon; Rogers, Anthony; Hurst, Laurence D.; Ahringer, Julie

    2013-01-01

    When a duplicate gene has no apparent loss-of-function phenotype, it is commonly considered that the phenotype has been masked as a result of functional redundancy with the remaining paralog. This is supported by indirect evidence showing that multi-copy genes show loss-of-function phenotypes less often than single-copy genes and by direct tests of phenotype masking using select gene sets. Here we take a systematic genome-wide RNA interference approach to assess phenotype masking in paralog pairs in the Caenorhabditis elegans genome. Remarkably, in contrast to expectations, we find that phenotype masking makes only a minor contribution to the low knockdown phenotype rate for duplicate genes. Instead, we find that non-essential genes are highly over-represented among duplicates, leading to a low observed loss-of-function phenotype rate. We further find that duplicate pairs derived from essential and non-essential genes have contrasting evolutionary dynamics: whereas non-essential genes are both more often successfully duplicated (fixed) and lost, essential genes are less often duplicated but upon successful duplication are maintained over longer periods. We expect the fundamental evolutionary duplication dynamics presented here to be broadly applicable. PMID:23675306

  5. Rice pollen hybrid incompatibility caused by reciprocal gene loss of duplicated genes.

    PubMed

    Mizuta, Yoko; Harushima, Yoshiaki; Kurata, Nori

    2010-11-23

    Genetic incompatibility is a barrier contributing to species isolation and is caused by genetic interactions. We made a whole genome survey of two-way interacting loci acting within the gametophyte or zygote using independence tests of marker segregations in an F2 population from an intersubspecific cross between O. sativa subspecies indica and japonica. We detected only one reproducible interaction, and identified paralogous hybrid incompatibility genes, DOPPELGANGER1 (DPL1) and DOPPELGANGER2 (DPL2), by positional cloning. Independent disruptions of DPL1 and DPL2 occurred in indica and japonica, respectively. DPLs encode highly conserved, plant-specific small proteins (∼10 kDa) and are highly expressed in mature anthers. Pollen carrying two defective DPL alleles became nonfunctional and did not germinate, suggesting an essential role for DPLs in pollen germination. Although rice has many duplicated genes resulting from ancient whole genome duplication, this gene duplication originated in a recent small-scale duplication event that occurred after Oryza-Brachypodium differentiation. Comparative analyses suggested the geographic and phylogenetic distribution of these two defective alleles, showing that loss-of-function mutations of DPL1 genes emerged multiple times in indica and its wild ancestor, O. rufipogon, and that the DPL2 gene defect is specific to japonica cultivars.

  6. Two secretory protein genes in Chironomus tentans have arisen by gene duplication and exhibit different developmental expression patterns.

    PubMed

    Galli, J; Wieslander, L

    1993-05-20

    The salivary gland cells in the dipteran Chironomus tentans produce approximately 15 different secretory proteins, with relative molecular masses ranging between 1 × 10^4 and 1 × 10^6. Together these proteins form two types of extracorporeal tubes, a larval protective housing and feeding tube or a pupation tube. The developmental change in tube formation is accompanied by a switch in production from one combination of secretory proteins to another. Here we characterize two genes, the sp38-40.A and B genes, which encode secretory proteins with relative molecular masses of 38,000 to 40,000. The two genes are located 346 base-pairs apart in the same orientation and have presumably arisen by gene duplication as the result of an illegitimate recombination event. Both genes contain two regions with cysteine codons, surrounded by regions with short repeats coding for proline and charged amino acid residues. The two genes and alleles of the genes differ in their number of repeats. This structure resembles the structure of the Balbiani ring (BR) genes, which encode the four largest salivary gland secretory proteins. The sp38-40.A and B genes are therefore likely to belong to a BR multigene family containing all or most of the 15 salivary gland secretory protein genes. The expression of the sp38-40.A and B genes is different: the A gene is expressed throughout the larval fourth instar but considerably less in the prepupal stage, while the B gene shows the opposite expression pattern. The developmental regulation of the expression of the two genes has therefore diverged after the gene duplication event.

  7. Analyses of nuclearly encoded mitochondrial genes suggest gene duplication as a mechanism for resolving intralocus sexually antagonistic conflict in Drosophila.

    PubMed

    Gallach, Miguel; Chandrasekaran, Chitra; Betrán, Esther

    2010-01-01

    Gene duplication is probably the most important mechanism for generating new gene functions. However, gene duplication has been overlooked as a potentially effective way to resolve genetic conflicts. Here, we analyze the entire set of Drosophila melanogaster nuclearly encoded mitochondrial duplicate genes and show that both RNA- and DNA-mediated mitochondrial gene duplications exhibit an unexpectedly high rate of relocation (change in location between parental and duplicated gene) as well as an extreme tendency to avoid the X chromosome. These trends are likely related to our observation that relocated genes tend to have testis-specific expression. We also infer that these trends hold across the entire Drosophila genus. Importantly, analyses of gene ontology and functional interaction networks show that there is an overrepresentation of energy production-related functions in these mitochondrial duplicates. We discuss different hypotheses to explain our results and conclude that our findings substantiate the hypothesis that gene duplication for male germline function is likely a mechanism to resolve intralocus sexually antagonistic conflicts that we propose are common in testis. In the case of nuclearly encoded mitochondrial duplicates, our hypothesis is that past sexually antagonistic conflict related to mitochondrial energy function in Drosophila was resolved by gene duplication.

  8. Phylogenetic dating and characterization of gene duplications in vertebrates: the cartilaginous fish reference.

    PubMed

    Robinson-Rechavi, Marc; Boussau, Bastien; Laudet, Vincent

    2004-03-01

    Vertebrates originated in the lower Cambrian. Their diversification and morphological innovations have been attributed to large-scale gene or genome duplications at the origin of the group. These duplications are predicted to have occurred in two rounds, the "2R" hypothesis, or they may have occurred in one genome duplication plus many segmental duplications, although these hypotheses are disputed. Under such models, most genes that are duplicated in all vertebrates should have originated during the same period. Previous work has shown that duplications indeed started after the speciation between vertebrates and the closest invertebrate, amphioxus, but has not set a clear ending. Consideration of chordate phylogeny immediately shows the key position of cartilaginous vertebrates (Chondrichthyes) to answer this question. Did gene duplications occur as frequently during the 45 Myr between the cartilaginous/bony vertebrate split and the fish/tetrapod split as in the previous approximately 100 Myr? Although the time interval is relatively short, it is crucial to understanding the events at the origin of vertebrates. By a systematic appraisal of gene phylogenies, we show that significantly more duplications occurred before than after the cartilaginous/bony vertebrate split. Our results support rounds of gene or genome duplications during a limited period of early vertebrate evolution and allow a better characterization of these events.

  9. CG gene body DNA methylation changes and evolution of duplicated genes in cassava

    PubMed Central

    Wang, Haifeng; Beyene, Getu; Zhai, Jixian; Feng, Suhua; Fahlgren, Noah; Taylor, Nigel J.; Bart, Rebecca; Carrington, James C.; Jacobsen, Steven E.; Ausin, Israel

    2015-01-01

    DNA methylation is important for the regulation of gene expression and the silencing of transposons in plants. Here we present genome-wide methylation patterns at single-base pair resolution for cassava (Manihot esculenta, cultivar TME 7), a crop with a substantial impact in the agriculture of subtropical and tropical regions. On average, DNA methylation levels were higher in all three DNA sequence contexts (CG, CHG, and CHH, where H equals A, T, or C) than those of the most well-studied model plant Arabidopsis thaliana. As in other plants, DNA methylation was found both on transposons and in the transcribed regions (bodies) of many genes. Consistent with these patterns, at least one cassava gene copy of all of the known components of Arabidopsis DNA methylation pathways was identified. Methylation of LTR transposons (GYPSY and COPIA) was found to be unusually high compared with other types of transposons, suggesting that the control of the activity of these two types of transposons may be especially important. Analysis of duplicated gene pairs resulting from whole-genome duplication showed that gene body DNA methylation and gene expression levels have coevolved over short evolutionary time scales, reinforcing the positive relationship between gene body methylation and high levels of gene expression. Duplicated genes with the most divergent gene body methylation and expression patterns were found to have distinct biological functions and may have been under natural or human selection for cassava traits. PMID:26483493

  10. CG gene body DNA methylation changes and evolution of duplicated genes in cassava.

    PubMed

    Wang, Haifeng; Beyene, Getu; Zhai, Jixian; Feng, Suhua; Fahlgren, Noah; Taylor, Nigel J; Bart, Rebecca; Carrington, James C; Jacobsen, Steven E; Ausin, Israel

    2015-11-03

    DNA methylation is important for the regulation of gene expression and the silencing of transposons in plants. Here we present genome-wide methylation patterns at single-base pair resolution for cassava (Manihot esculenta, cultivar TME 7), a crop with a substantial impact in the agriculture of subtropical and tropical regions. On average, DNA methylation levels were higher in all three DNA sequence contexts (CG, CHG, and CHH, where H equals A, T, or C) than those of the most well-studied model plant Arabidopsis thaliana. As in other plants, DNA methylation was found both on transposons and in the transcribed regions (bodies) of many genes. Consistent with these patterns, at least one cassava gene copy of all of the known components of Arabidopsis DNA methylation pathways was identified. Methylation of LTR transposons (GYPSY and COPIA) was found to be unusually high compared with other types of transposons, suggesting that the control of the activity of these two types of transposons may be especially important. Analysis of duplicated gene pairs resulting from whole-genome duplication showed that gene body DNA methylation and gene expression levels have coevolved over short evolutionary time scales, reinforcing the positive relationship between gene body methylation and high levels of gene expression. Duplicated genes with the most divergent gene body methylation and expression patterns were found to have distinct biological functions and may have been under natural or human selection for cassava traits.

  11. Global analysis of human duplicated genes reveals the relative importance of whole-genome duplicates originated in the early vertebrate evolution.

    PubMed

    Acharya, Debarun; Ghosh, Tapash C

    2016-01-22

    Gene duplication is a genetic mutation that creates functionally redundant gene copies that are initially relieved from selective pressures and may adapt to new functions with time. The scale of gene duplication may vary from small-scale duplication (SSD) to whole genome duplication (WGD). Studies with yeast revealed ample differences between these duplicates: yeast WGD pairs were functionally more similar, less divergent in subcellular localization and contained a smaller proportion of essential genes. In this study, we explored the differences in evolutionary genomic properties of human SSD and WGD genes, with the identifiable human duplicates coming from the two rounds of whole genome duplication that occurred early in vertebrate evolution. We observed that these two groups of duplicates were also dissimilar in terms of their evolutionary and genomic properties. Interestingly, however, the pattern differs from that observed in yeast. The human WGDs were found to be functionally less similar, to diverge more in subcellular localization and to contain a higher proportion of essential genes than the SSDs, all of which are the opposite of the yeast pattern. Additionally, we found that human WGDs were more divergent in their gene expression profiles, had higher multifunctionality, were more often associated with disease, and were evolutionarily more conserved than human SSDs. Our study suggests that human WGD duplicates are more divergent and that this entails the adaptation of WGDs to novel and important functions, which consequently leads to their evolutionary conservation.

  12. High time for a roll call: gene duplication and phylogenetic relationships of TCP-like genes in monocots

    PubMed Central

    Mondragón-Palomino, Mariana; Trontin, Charlotte

    2011-01-01

    Background and Aims: The TCP family is an ancient group of plant developmental transcription factors that regulate cell division in vegetative and reproductive structures and are essential in the establishment of flower zygomorphy. In-depth research on eudicot TCPs has documented their evolutionary and developmental role. This has not happened to the same extent in monocots, although zygomorphy has been critical for the diversification of Orchidaceae and Poaceae, the largest families of this group. Investigating the evolution and function of TCP-like genes in a wider group of monocots requires a detailed phylogenetic analysis of all available sequence information and a system that facilitates comparing genetic and functional information. Methods: The phylogenetic relationships of TCP-like genes in monocots were investigated by analysing sequences from the genomes of Zea mays, Brachypodium distachyon, Oryza sativa and Sorghum bicolor, as well as EST data from several other monocot species. Key Results: All available monocot TCP-like sequences are associated in 20 major groups with an average identity ≥64% and most correspond to well-supported clades of the phylogeny. Their sequence motifs and relationships of orthology were documented and it was found that 67% of the TCP-like genes of Sorghum, Oryza, Zea and Brachypodium are in microsyntenic regions. This analysis suggests that two rounds of whole genome duplication drove the expansion of TCP-like genes in these species. Conclusions: A system of classification is proposed where putative or recognized monocot TCP-like genes are assigned to a specific clade of PCF-, CIN- or CYC/tb1-like genes. Specific biases in sequence data of this family that must be tackled when studying its molecular evolution and phylogeny are documented. Finally, the significant retention of duplicated TCP genes from Zea mays is considered in the context of balanced gene drive. PMID:21444336

  13. Consensus properties and their large-scale applications for the gene duplication problem.

    PubMed

    Moon, Jucheol; Lin, Harris T; Eulenstein, Oliver

    2016-06-01

    Solving the gene duplication problem is a classical approach for species tree inference from gene trees that are confounded by gene duplications. This problem takes a collection of gene trees and seeks a species tree that implies the minimum number of gene duplications. Wilkinson et al. posed the conjecture that the gene duplication problem satisfies the desirable Pareto property for clusters. That is, for every instance of the problem, all clusters that are commonly present in the input gene trees of this instance, called strict consensus, will also be found in every solution to this instance. We prove that this conjecture does not generally hold. Despite this negative result we show that the gene duplication problem satisfies a weaker version of the Pareto property where the strict consensus is found in at least one solution (rather than all solutions). This weaker property contributes to our design of an efficient scalable algorithm for the gene duplication problem. We demonstrate the performance of our algorithm in analyzing large-scale empirical datasets. Finally, we utilize the algorithm to evaluate the accuracy of standard heuristics for the gene duplication problem using simulated datasets.
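    The duplication count that this problem minimizes can be illustrated with a minimal sketch. It assumes the standard LCA (least common ancestor) reconciliation, in which a gene-tree node counts as a duplication when it maps to the same species-tree node as one of its children. This is not the authors' algorithm: it only scores one gene tree against one fixed species tree, and the tuple-based tree encoding and all function names are hypothetical choices made for the example.

```python
from typing import FrozenSet, List, Tuple, Union

# A tree is either a leaf (species name) or a pair of subtrees.
# Gene-tree leaves are labelled by the species the gene was sampled from.
Tree = Union[str, Tuple["Tree", "Tree"]]


def clusters(species_tree: Tree) -> List[FrozenSet[str]]:
    """Return the leaf set (cluster) of every node in the species tree."""
    out: List[FrozenSet[str]] = []

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = walk(node[0]) | walk(node[1])
        out.append(leaves)
        return leaves

    walk(species_tree)
    return out


def lca(cl: List[FrozenSet[str]], species: FrozenSet[str]) -> FrozenSet[str]:
    """Smallest species-tree cluster containing all the given species."""
    return min((c for c in cl if species <= c), key=len)


def count_duplications(gene_tree: Tree, species_tree: Tree) -> int:
    """Count duplications implied by the LCA mapping (illustrative only)."""
    cl = clusters(species_tree)
    dups = 0

    def walk(node: Tree) -> FrozenSet[str]:
        nonlocal dups
        if isinstance(node, str):
            return lca(cl, frozenset([node]))
        left, right = walk(node[0]), walk(node[1])
        here = lca(cl, left | right)
        # Duplication: the node maps to the same place as one of its children.
        if here == left or here == right:
            dups += 1
        return here

    walk(gene_tree)
    return dups


species = (("A", "B"), "C")
print(count_duplications((("A", "B"), "C"), species))  # congruent gene tree
print(count_duplications((("A", "C"), "B"), species))  # incongruent gene tree
```

    Under these assumptions, the gene duplication problem would search over candidate species trees for the one minimizing the sum of this count across all input gene trees; that search is the hard part and is not attempted here.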

  14. Gene duplication and concerted evolution of mitochondrial DNA in crane species.

    PubMed

    Akiyama, Takuya; Nishida, Chizuko; Momose, Kunikazu; Onuma, Manabu; Takami, Kazutoshi; Masuda, Ryuichi

    2017-01-01

    Gene duplication in mitochondrial DNA (mtDNA) has been reported in diverse bird taxa. Although many phylogenetic and population genetic analyses of cranes have been carried out based on mtDNA diversity, whether crane mtDNA contains duplicated regions is unknown. To address the presence or absence of gene duplication in cranes and investigate the molecular evolutionary features of crane mtDNA, we analyzed the gene organization and the molecular phylogeny of mtDNA from 13 crane species. We found that the mtDNA in 13 crane species shared a tandem duplicated region, which consists of duplicated sequence sets including cytochrome b (Cytb), NADH6, control region (CR) and three tRNA genes. The gene order in the duplicated region was identical among all the 13 crane species, and the nucleotide sequences found within each individual showed high similarities. In addition, phylogenetic trees based on homologous sequences of CR and Cytb indicated the possibility of concerted evolution among the duplicated genes. The results suggested that the duplication event occurred in the common ancestor of crane species or some older ancestors.

  15. Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.

    PubMed

    Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J

    2016-11-04

    Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks: maize1 and maize2 exhibit similar levels of intra-subgenome correlations. Intriguingly, the level of inter-subgenome co-expression was similar to the level of intra-subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types

  16. First evidence of a large CHEK2 duplication involved in cancer predisposition in an Italian family with hereditary breast cancer

    PubMed Central

    2014-01-01

    Background: CHEK2 is a multi-cancer susceptibility gene whose common germline mutations are known to contribute to the risk of developing breast and prostate cancer. Case presentation: Here, we describe an Italian family with a high number of cases of breast cancer and other types of tumour subjected to the MLPA test to verify the presence of BRCA1, BRCA2 and CHEK2 deletions and duplications. We identified a new 23-kb duplication in the CHEK2 gene extending from intron 5 to 13 that was associated with breast cancer in the family. The presence and localisation of the alteration was confirmed by a second analysis by Next-Generation Sequencing. Conclusions: This finding suggests that CHEK2 mutations are heterogeneous and that techniques other than sequencing, such as MLPA, are needed to identify CHEK2 mutations. It also indicates that CHEK2 rare variants, such as duplications, can confer a high susceptibility to cancer development and should thus be studied in depth as most of our knowledge of CHEK2 concerns common mutations. PMID:24986639

  17. The glutamine synthetase gene family in Populus

    PubMed Central

    2011-01-01

    Background: Glutamine synthetase (GS; EC: 6.3.1.2, L-glutamate: ammonia ligase ADP-forming) is a key enzyme in ammonium assimilation and metabolism of higher plants. The current work was undertaken to develop a more comprehensive understanding of molecular and biochemical features of the GS gene family in poplar, and to characterize the developmental regulation of GS expression in various tissues and at various times during the poplar perennial growth. Results: The GS gene family consists of 8 different genes exhibiting all structural and regulatory elements consistent with their roles as functional genes. Our results indicate that the family members are organized in 4 groups of duplicated genes, 3 of which code for cytosolic GS isoforms (GS1) and 1 of which codes for the chloroplastic GS isoform (GS2). Our analysis shows that Populus trichocarpa is the first plant species in which the complete GS family has been observed to be duplicated. Detailed expression analyses have revealed specific spatial and seasonal patterns of GS expression in poplar. These data provide insights into the metabolic function of GS isoforms in poplar and pave the way for future functional studies. Conclusions: Our data suggest that GS duplicates could have been retained in order to increase the amount of enzyme in a particular cell type. This possibility could contribute to the homeostasis of nitrogen metabolism in functions associated with changes in glutamine-derived metabolic products. The presence of duplicated GS genes in poplar could also contribute to diversification of the enzymatic properties for a particular GS isoform through the assembly of GS polypeptides into homo-oligomeric and/or hetero-oligomeric holoenzymes in specific cell types. PMID:21867507

  18. The evolutionary fate of alternatively spliced homologous exons after gene duplication.

    PubMed

    Abascal, Federico; Tress, Michael L; Valencia, Alfonso

    2015-04-29

    Alternative splicing and gene duplication are the two main processes responsible for expanding protein functional diversity. Although gene duplication can generate new genes and alternative splicing can introduce variation through alternative gene products, the interplay between the two processes is complex and poorly understood. Here, we have carried out a study of the evolution of alternatively spliced exons after gene duplication to better understand the interaction between the two processes. We created a manually curated set of 97 human genes with mutually exclusively spliced homologous exons and analyzed the evolution of these exons across five distantly related vertebrates (lamprey, spotted gar, zebrafish, fugu, and coelacanth). Most of these exons had an ancient origin (more than 400 Ma). We found examples supporting two extreme evolutionary models for the behaviour of homologous exons after gene duplication. We observed 11 events in which gene duplication was accompanied by splice isoform separation, that is, each paralog specifically conserved just one distinct ancestral homologous exon. At the other extreme, we identified genes in which the homologous exons were always conserved within paralogs, suggesting that the alternative splicing event cannot easily be separated from the function in these genes. That many homologous exons fall in between these two extremes highlights the diversity of biological systems and suggests that the subtle balance between alternative splicing and gene duplication is adjusted to the specific cellular context of each gene.

  19. Circular DNA Intermediate in the Duplication of Nile Tilapia vasa Genes

    PubMed Central

    Fujimura, Koji; Conte, Matthew A.; Kocher, Thomas D.

    2011-01-01

    vasa is a highly conserved RNA helicase involved in animal germ cell development. Among vertebrate species, it is typically present as a single copy per genome. Here we report the isolation and sequencing of BAC clones for Nile tilapia vasa genes. Contrary to a previous report that Nile tilapia have a single copy of the vasa gene, we find evidence for at least three vasa gene loci. The vasa gene locus was duplicated from the original site and integrated into two distant novel sites. For one of these insertions we find evidence that the duplication was mediated by a circular DNA intermediate. This mechanism of gene duplication may explain the origin of isolated gene duplicates during the evolution of fish genomes. These data provide a foundation for studying the role of multiple vasa genes in the development of tilapia gonads, and will contribute to investigations of the molecular mechanisms of sex determination and evolution in cichlid fishes. PMID:22216289

  20. Gene duplication and divergence affecting drug content in Cannabis sativa.

    PubMed

    Weiblen, George D; Wenger, Jonathan P; Craft, Kathleen J; ElSohly, Mahmoud A; Mehmedic, Zlatko; Treiber, Erin L; Marks, M David

    2015-12-01

    Cannabis sativa is an economically important source of durable fibers, nutritious seeds, and psychoactive drugs but few economic plants are so poorly understood genetically. Marijuana and hemp were crossed to evaluate competing models of cannabinoid inheritance and to explain the predominance of tetrahydrocannabinolic acid (THCA) in marijuana compared with cannabidiolic acid (CBDA) in hemp. Individuals in the resulting F2 population were assessed for differential expression of cannabinoid synthase genes and were used in linkage mapping. Genetic markers associated with divergent cannabinoid phenotypes were identified. Although phenotypic segregation and a major quantitative trait locus (QTL) for the THCA/CBDA ratio were consistent with a simple model of codominant alleles at a single locus, the diversity of THCA and CBDA synthase sequences observed in the mapping population, the position of enzyme coding loci on the map, and patterns of expression suggest multiple linked loci. Phylogenetic analysis further suggests a history of duplication and divergence affecting drug content. Marijuana is distinguished from hemp by a nonfunctional CBDA synthase that appears to have been positively selected to enhance psychoactivity. An unlinked QTL for cannabinoid quantity may also have played a role in the recent escalation of drug potency.

  1. Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates.

    PubMed

    Cañestro, Cristian; Albalat, Ricard; Irimia, Manuel; Garcia-Fernàndez, Jordi

    2013-02-01

    The study of the evolutionary origin of vertebrates has been linked to the study of genome duplications since Susumu Ohno suggested that the successful diversification of vertebrate innovations was facilitated by two rounds of whole-genome duplication (2R-WGD) in the stem vertebrate. Since then, studies on the functional evolution of many genes duplicated in the vertebrate lineage have provided the grounds to support this link experimentally. This article reviews cases of gene duplications derived either from the 2R-WGD or from local gene duplication events in vertebrates, analyzing their impact on the evolution of developmental innovations. We analyze how gene regulatory networks can be rewired by the activity of transposable elements after genome duplications, discuss how different mechanisms of duplication might affect the fate of duplicated genes, and how the loss of gene duplicates might influence the fate of surviving paralogs. We also discuss the evolutionary relationships between gene duplication and alternative splicing, in particular in the vertebrate lineage. Finally, we discuss the role that the 2R-WGD might have played in the evolution of vertebrate developmental gene networks, paying special attention to those related to vertebrate key features such as neural crest cells, placodes, and the complex tripartite brain. In this context, we argue that current evidence indicates that the 2R-WGD may not be linked to the origin of vertebrate innovations, but to their subsequent diversification in a broad variety of complex structures and functions that facilitated the successful transition from peaceful filter-feeding non-vertebrate ancestors to voracious vertebrate predators.

  2. Evolution of the Vertebrate Resistin Gene Family.

    PubMed

    Hu, Qingda; Tan, Huanran; Irwin, David M

    2015-01-01

    Resistin (encoded by Retn) was previously identified in rodents as a hormone associated with diabetes; however human resistin is instead linked to inflammation. Resistin is a member of a small gene family that includes the resistin-like peptides (encoded by Retnl genes) in mammals. Genomic searches of available genome sequences of diverse vertebrates and phylogenetic analyses were conducted to determine the size and origin of the resistin-like gene family. Genes encoding peptides similar to resistin were found in Mammalia, Sauria, Amphibia, and Actinistia (coelacanth, a lobe-finned fish), but not in Aves or fish from Actinopterygii, Chondrichthyes, or Agnatha. Retnl originated by duplication and transposition from Retn on the early mammalian lineage after divergence of the platypus, but before the placental and marsupial mammal divergence. The resistin-like gene family illustrates an instance where the locus of origin of duplicated genes can be identified, with Retn continuing to reside at this location. Mammalian species typically have a single copy Retn gene, but are much more variable in their numbers of Retnl genes, ranging from 0 to 9. Since Retn is located at the locus of origin, thus likely retained the ancestral expression pattern, largely maintained its copy number, and did not display accelerated evolution, we suggest that it is more likely to have maintained an ancestral function, while Retnl, which transposed to a new location, displays accelerated evolution, and shows greater variability in gene number, including gene loss, likely evolved new, but potentially lineage-specific, functions.

  3. The Roles of Whole-Genome and Small-Scale Duplications in the Functional Specialization of Saccharomyces cerevisiae Genes

    PubMed Central

    Fares, Mario A.; Keane, Orla M.; Toft, Christina; Carretero-Paulet, Lorenzo; Jones, Gary W.

    2013-01-01

    Researchers have long been enthralled with the idea that gene duplication can generate novel functions, crediting this process with great evolutionary importance. Empirical data shows that whole-genome duplications (WGDs) are more likely to be retained than small-scale duplications (SSDs), though their relative contribution to the functional fate of duplicates remains unexplored. Using the map of genetic interactions and the re-sequencing of 27 Saccharomyces cerevisiae genomes evolving for 2,200 generations we show that SSD-duplicates lead to neo-functionalization while WGD-duplicates partition ancestral functions. This conclusion is supported by: (a) SSD-duplicates establish more genetic interactions than singletons and WGD-duplicates; (b) SSD-duplicates copies share more interaction-partners than WGD-duplicates copies; (c) WGD-duplicates interaction partners are more functionally related than SSD-duplicates partners; (d) SSD-duplicates gene copies are more functionally divergent from one another, while keeping more overlapping functions, and diverge in their sub-cellular locations more than WGD-duplicates copies; and (e) SSD-duplicates complement their functions to a greater extent than WGD-duplicates. We propose a novel model that uncovers the complexity of evolution after gene duplication. PMID:23300483

  4. Polymorphism and divergence at three duplicate genes in Brassica nigra.

    PubMed

    Sjödin, Per; Hedman, Harald; Kruskopf Osterberg, Marita; Gustafsson, Susanne; Lagercrantz, Ulf; Lascoux, Martin

    2008-06-01

    The CONSTANS-like gene family has been shown to evolve exceptionally fast in Brassicaceae. In the present study we analyzed sequence polymorphism and divergence of three genes from this family: COL1 (CONSTANS-LIKE 1) and two copies of CO (CONSTANS), COa and COb, in B. nigra. There was a significant fourfold difference in overall nucleotide diversity among the three genes, with BniCOb having twice as much variation as BniCOL1, which in turn was twice as variable as BniCOa. The ratio of nonsynonymous-to-synonymous substitutions (dN/dS) was high for all three genes, confirming previous studies. While we did not detect evidence of selection at BniCOa and BniCOb, there was a significant excess of polymorphic synonymous mutations in a McDonald-Kreitman test comparing COL1 in B. nigra and A. thaliana. This is apparently the result of an increase in selective constraint on COL1 in B. nigra combined with a decrease in A. thaliana. In conclusion, a complex scenario involving both demography and selection seems to have shaped the pattern of polymorphism at the three genes.
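    The McDonald-Kreitman test named in this abstract compares polymorphic and fixed differences at nonsynonymous and synonymous sites in a 2x2 table. The following is a minimal sketch, assuming a two-sided Fisher's exact test on that table plus the neutrality index NI = (Pn/Ps)/(Dn/Ds); the counts in the usage line are made-up placeholders, not data from the study, and the function names are hypothetical.

```python
from math import comb
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def mk_test(pn: int, ps: int, dn: int, ds: int) -> Tuple[float, float]:
    """Return (neutrality index, p-value).

    pn, ps: nonsynonymous / synonymous polymorphisms within a species.
    dn, ds: nonsynonymous / synonymous fixed differences between species.
    Raises ZeroDivisionError if ps or dn is zero.
    """
    ni = (pn * ds) / (ps * dn)  # equals (pn / ps) / (dn / ds)
    return ni, fisher_exact_two_sided(pn, ps, dn, ds)


# Placeholder counts, for illustration only.
ni, p = mk_test(pn=8, ps=2, dn=1, ds=5)
print(f"NI = {ni:.2f}, p = {p:.4f}")
```

    Under neutrality the ratio of nonsynonymous to synonymous changes is expected to be the same within and between species, so NI is expected to be close to 1 and the test non-significant.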

  5. The major resistance gene cluster in lettuce is highly duplicated and spans several megabases.

    PubMed

    Meyers, B C; Chin, D B; Shen, K A; Sivaramakrishnan, S; Lavelle, D O; Zhang, Z; Michelmore, R W

    1998-11-01

    At least 10 Dm genes conferring resistance to the oomycete downy mildew fungus Bremia lactucae map to the major resistance cluster in lettuce. We investigated the structure of this cluster in the lettuce cultivar Diana, which contains Dm3. A deletion breakpoint map of the chromosomal region flanking Dm3 was saturated with a variety of molecular markers. Several of these markers are components of a family of resistance gene candidates (RGC2) that encode a nucleotide binding site and a leucine-rich repeat region. These motifs are characteristic of plant disease resistance genes. Bacterial artificial chromosome clones were identified by using duplicated restriction fragment length polymorphism markers from the region, including the nucleotide binding site-encoding region of RGC2. Twenty-two distinct members of the RGC2 family were characterized from the bacterial artificial chromosomes; at least two additional family members exist. The RGC2 family is highly divergent; the nucleotide identity was as low as 53% between the most distantly related copies. These RGC2 genes span at least 3.5 Mb. Eighteen members were mapped on the deletion breakpoint map. A comparison between the phylogenetic and physical relationships of these sequences demonstrated that closely related copies are physically separated from one another and indicated that complex rearrangements have shaped this region. Analysis of low-copy genomic sequences detected no genes, including RGC2, in the Dm3 region, other than sequences related to retrotransposons and transposable elements. The related but divergent family of RGC2 genes may act as a resource for the generation of new resistance phenotypes through infrequent recombination or unequal crossing over.

  6. The major resistance gene cluster in lettuce is highly duplicated and spans several megabases.

    PubMed Central

    Meyers, B C; Chin, D B; Shen, K A; Sivaramakrishnan, S; Lavelle, D O; Zhang, Z; Michelmore, R W

    1998-01-01

    At least 10 Dm genes conferring resistance to the oomycete downy mildew fungus Bremia lactucae map to the major resistance cluster in lettuce. We investigated the structure of this cluster in the lettuce cultivar Diana, which contains Dm3. A deletion breakpoint map of the chromosomal region flanking Dm3 was saturated with a variety of molecular markers. Several of these markers are components of a family of resistance gene candidates (RGC2) that encode a nucleotide binding site and a leucine-rich repeat region. These motifs are characteristic of plant disease resistance genes. Bacterial artificial chromosome clones were identified by using duplicated restriction fragment length polymorphism markers from the region, including the nucleotide binding site-encoding region of RGC2. Twenty-two distinct members of the RGC2 family were characterized from the bacterial artificial chromosomes; at least two additional family members exist. The RGC2 family is highly divergent; the nucleotide identity was as low as 53% between the most distantly related copies. These RGC2 genes span at least 3.5 Mb. Eighteen members were mapped on the deletion breakpoint map. A comparison between the phylogenetic and physical relationships of these sequences demonstrated that closely related copies are physically separated from one another and indicated that complex rearrangements have shaped this region. Analysis of low-copy genomic sequences detected no genes, including RGC2, in the Dm3 region, other than sequences related to retrotransposons and transposable elements. The related but divergent family of RGC2 genes may act as a resource for the generation of new resistance phenotypes through infrequent recombination or unequal crossing over. PMID:9811791

  7. Prevalent Role of Gene Features in Determining Evolutionary Fates of Whole-Genome Duplication Duplicated Genes in Flowering Plants

    PubMed Central

    Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi

    2013-01-01

    The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs. PMID:23396833

  8. Gene duplication of type-B ARR transcription factors systematically extends transcriptional regulatory structures in Arabidopsis.

    PubMed

    Choi, Seung Hee; Hyeon, Do Young; Lee, Ll Hwan; Park, Su Jin; Han, Seungmin; Lee, In Chul; Hwang, Daehee; Nam, Hong Gil

    2014-11-26

    Many duplicated genes are enriched in signaling pathways. Recently, gene duplication of kinases has been shown to provide genetic buffering and functional diversification in cellular signaling. Transcription factors (TFs) are also often duplicated. However, how duplication of TFs affects their regulatory structures and functions of target genes has not been explored at the systems level. Here, we examined regulatory and functional roles of duplication of three major ARR TFs (ARR1, 10, and 12) in Arabidopsis cytokinin signaling using wild-type and single, double, and triple deletion mutants of the TFs. Comparative analysis of gene expression profiles obtained from Arabidopsis roots in wild-type and these mutants showed that duplication of ARR TFs systematically extended their transcriptional regulatory structures, leading to enhanced robustness and diversification in functions of target genes, as well as in regulation of cellular networks of target genes. Therefore, our results suggest that duplication of TFs contributes to robustness and diversification in functions of target genes by extending transcriptional regulatory structures.

  9. Gene duplication of type-B ARR transcription factors systematically extends transcriptional regulatory structures in Arabidopsis

    PubMed Central

    Choi, Seung Hee; Hyeon, Do Young; Lee, ll Hwan; Park, Su Jin; Han, Seungmin; Lee, In Chul; Hwang, Daehee; Nam, Hong Gil

    2014-01-01

    Many duplicated genes are enriched in signaling pathways. Recently, gene duplication of kinases has been shown to provide genetic buffering and functional diversification in cellular signaling. Transcription factors (TFs) are also often duplicated. However, how duplication of TFs affects their regulatory structures and functions of target genes has not been explored at the systems level. Here, we examined regulatory and functional roles of duplication of three major ARR TFs (ARR1, 10, and 12) in Arabidopsis cytokinin signaling using wild-type and single, double, and triple deletion mutants of the TFs. Comparative analysis of gene expression profiles obtained from Arabidopsis roots in wild-type and these mutants showed that duplication of ARR TFs systematically extended their transcriptional regulatory structures, leading to enhanced robustness and diversification in functions of target genes, as well as in regulation of cellular networks of target genes. Therefore, our results suggest that duplication of TFs contributes to robustness and diversification in functions of target genes by extending transcriptional regulatory structures. PMID:25425016

  10. Molecular cytogenetics to characterize mechanisms of gene duplication in pesticide resistance.

    PubMed

    Jugulam, Mithila; Gill, Bikram S

    2017-07-17

    Recent advances in molecular cytogenetics empower construction of physical maps to illustrate the precise position of genetic loci on the chromosomes. Such maps provide visible information about the position of DNA sequences, including the distribution of repetitive sequences on the chromosomes. This is an important step toward unraveling the genetic mechanisms implicated in chromosomal aberrations (e.g., gene duplication). In response to stress, such as pesticide selection, duplicated genes provide an immediate adaptive advantage to organisms that overcome unfavorable conditions. Although the significance of gene duplication as one of the important events driving genetic diversity has been reported, the precise mechanisms of gene duplication that contribute to pesticide resistance, especially to herbicides, are elusive. With particular reference to pesticide resistance, we discuss the prospects of application of molecular cytogenetic tools to uncover mechanism(s) of gene duplication, and illustrate hypothetical models that predict the evolutionary basis of gene duplication. The cytogenetic basis of duplicated genes, their stability, as well as the magnitude of selection pressure, can determine the dynamics of the genetic locus (loci) conferring pesticide resistance not only at the population level, but also at the individual level. © 2017 Society of Chemical Industry.

  11. Lineage-Specific Expansion of IFIT Gene Family: An Insight into Coevolution with IFN Gene Family

    PubMed Central

    Liu, Ying; Zhang, Yi-Bing; Liu, Ting-Kai; Gui, Jian-Fang

    2013-01-01

    In mammals, IFIT (Interferon [IFN]-induced proteins with Tetratricopeptide Repeat [TPR] motifs) family genes are involved in many cellular and viral processes, which are tightly related to the mammalian IFN response. However, little is known about non-mammalian IFIT genes. In the present study, IFIT genes are identified in the genome databases from the jawed vertebrates including the cartilaginous elephant shark but not from non-vertebrates such as lancelet, sea squirt and acorn worm, suggesting that the IFIT gene family originated from a vertebrate ancestor about 450 million years ago. IFIT family genes show conserved gene structure and gene arrangements. Phylogenetic analyses reveal that this gene family has expanded through lineage-specific and species-specific gene duplication. Interestingly, the IFN gene family seems to share a common ancestor and a similar evolutionary mechanism; the functional link of IFIT genes to the IFN response has been present since the origin of both gene families, as evidenced by the finding that zebrafish IFIT genes are upregulated by fish IFNs, poly(I:C) and two transcription factors IRF3/IRF7, likely via the IFN-stimulated response elements (ISRE) within the promoters of vertebrate IFIT family genes. These coevolutionary features create a functional association between the two gene families to fulfill a common biological process, which was likely selected by viral infection during the evolution of vertebrates. Our results are helpful for understanding the evolution of the vertebrate IFN system. PMID:23818968

  12. Exon duplications in the ATP7A gene: Frequency and Transcriptional Behaviour

    PubMed Central

    2011-01-01

    Background: Menkes disease (MD) is an X-linked, fatal neurodegenerative disorder of copper metabolism, caused by mutations in the ATP7A gene. Thirty-three Menkes patients in whom no mutation had been detected with standard diagnostic tools were screened for exon duplications in the ATP7A gene. Methods: The ATP7A gene was screened for exon duplications using multiplex ligation-dependent probe amplification (MLPA). The expression level of ATP7A was investigated by real-time PCR and detailed analysis of the ATP7A mRNA was performed by RT-PCR followed by sequencing. In order to investigate whether the identified duplicated fragments originated from a single or from two different X-chromosomes, polymorphic markers located in the duplicated fragments were analyzed. Results: Partial ATP7A gene duplication was identified in 20 unrelated patients including one patient with Occipital Horn Syndrome (OHS). Duplications in the ATP7A gene are estimated from our material to be the disease-causing mutation in 4% of the Menkes disease patients. The duplicated regions consist of between 2 and 15 exons. In at least one of the cases, the duplication was due to an intra-chromosomal event. Characterization of the ATP7A mRNA transcripts in 11 patients revealed that the duplications were organized in tandem, in a head to tail direction. The reading frame was disrupted in all 11 cases. Small amounts of wild-type transcript were found in all patients as a result of exon-skipping events occurring in the duplicated regions. In the OHS patient with a duplication of exon 3 and 4, the duplicated out-of-frame transcript coexists with an almost equally represented wild-type transcript, presumably leading to the milder phenotype. Conclusions: In general, patients with duplication of only 2 exons exhibit a milder phenotype as compared to patients with duplication of more than 2 exons. This study provides insight into exon duplications in the ATP7A gene. PMID:22074552
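
    As an illustration of why a tandem, head-to-tail exon duplication can disrupt the reading frame, the minimal Python sketch below checks whether the duplicated block adds a multiple of three nucleotides to the transcript. It assumes that every duplicated exon lies entirely within the coding region. The function names and the exon lengths are hypothetical placeholders chosen for the example; they are not the actual ATP7A exon sizes and are not taken from the study.

```python
from typing import List


def tandem_duplicate(exons: List[int], first: int, last: int) -> List[int]:
    """Return the exon lengths of a transcript in which exons first..last
    (1-based, inclusive) are duplicated in tandem, head to tail."""
    if not 1 <= first <= last <= len(exons):
        raise ValueError("exon range out of bounds")
    block = exons[first - 1:last]
    # Original exons up to `last`, then the duplicated block, then the rest.
    return exons[:last] + block + exons[last:]


def frame_shift(exons: List[int], first: int, last: int) -> int:
    """Nucleotides (0, 1 or 2) by which the reading frame is shifted
    downstream of the duplication; 0 means the frame is preserved."""
    return sum(exons[first - 1:last]) % 3


# Hypothetical exon lengths in nucleotides (NOT real ATP7A values).
EXON_LENGTHS = [120, 95, 143, 201, 88]

if __name__ == "__main__":
    print(tandem_duplicate(EXON_LENGTHS, 3, 4))
    print(frame_shift(EXON_LENGTHS, 3, 4))
```

    With these made-up lengths, duplicating exons 3 and 4 inserts 344 nucleotides, which is not a multiple of three, so the sketch reports a shifted frame. A transcript that skips the duplicated block returns to the original exon order and therefore to the original frame.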

  13. Investigating different duplication pattern of essential genes in mouse and human.

    PubMed

    Acharya, Debarun; Mukherjee, Dola; Podder, Soumita; Ghosh, Tapash C

    2015-01-01

    Gene duplication is one of the major driving forces shaping genome and organism evolution and is thought to be itself regulated by some intrinsic properties of the gene. Comparing the essential genes of mouse and human, we observed that essential genes avoid duplication in mouse but tend to remain duplicated in humans. In this study, we explored the reasons behind such differences in gene essentiality by cross-species comparison of human and mouse. Moreover, we observed that essential genes that are duplicated in humans are functionally more redundant than those in mouse. The proportion of paralog pseudogenization of essential genes is higher in mouse than in humans. These duplicates of essential genes are under more stringent dosage regulation in human than in mouse. We also observed a slower evolutionary rate in the paralogs of human essential genes than in their mouse counterparts. Together, these results clearly indicate that human essential genes are retained as duplicates to serve as backup copies that may shield them from harmful mutations.

  14. Extensive Local Gene Duplication and Functional Divergence among Paralogs in Atlantic Salmon

    PubMed Central

    Warren, Ian A.; Ciborowski, Kate L.; Casadei, Elisa; Hazlerigg, David G.; Martin, Sam; Jordan, William C.; Sumner, Seirian

    2014-01-01

    Many organisms can generate alternative phenotypes from the same genome, enabling individuals to exploit diverse and variable environments. A prevailing hypothesis is that such adaptation has been favored by gene duplication events, which generate redundant genomic material that may evolve divergent functions. Vertebrate examples of recent whole-genome duplications are sparse although one example is the salmonids, which have undergone a whole-genome duplication event within the last 100 Myr. The life-cycle of the Atlantic salmon, Salmo salar, depends on the ability to produce alternating phenotypes from the same genome, to facilitate migration and maintain its anadromous life history. Here, we investigate the hypothesis that genome-wide and local gene duplication events have contributed to the salmonid adaptation. We used high-throughput sequencing to characterize the transcriptomes of three key organs involved in regulating migration in S. salar: Brain, pituitary, and olfactory epithelium. We identified over 10,000 undescribed S. salar sequences and designed an analytic workflow to distinguish between paralogs originating from local gene duplication events or from whole-genome duplication events. These data reveal that substantial local gene duplications took place shortly after the whole-genome duplication event. Many of the identified paralog pairs have either diverged in function or become noncoding. Future functional genomics studies will reveal to what extent this rich source of divergence in genetic sequence is likely to have facilitated the evolution of extreme phenotypic plasticity required for an anadromous life-cycle. PMID:24951567

  15. Extensive local gene duplication and functional divergence among paralogs in Atlantic salmon.

    PubMed

    Warren, Ian A; Ciborowski, Kate L; Casadei, Elisa; Hazlerigg, David G; Martin, Sam; Jordan, William C; Sumner, Seirian

    2014-06-19

    Many organisms can generate alternative phenotypes from the same genome, enabling individuals to exploit diverse and variable environments. A prevailing hypothesis is that such adaptation has been favored by gene duplication events, which generate redundant genomic material that may evolve divergent functions. Vertebrate examples of recent whole-genome duplications are sparse although one example is the salmonids, which have undergone a whole-genome duplication event within the last 100 Myr. The life-cycle of the Atlantic salmon, Salmo salar, depends on the ability to produce alternating phenotypes from the same genome, to facilitate migration and maintain its anadromous life history. Here, we investigate the hypothesis that genome-wide and local gene duplication events have contributed to the salmonid adaptation. We used high-throughput sequencing to characterize the transcriptomes of three key organs involved in regulating migration in S. salar: Brain, pituitary, and olfactory epithelium. We identified over 10,000 undescribed S. salar sequences and designed an analytic workflow to distinguish between paralogs originating from local gene duplication events or from whole-genome duplication events. These data reveal that substantial local gene duplications took place shortly after the whole-genome duplication event. Many of the identified paralog pairs have either diverged in function or become noncoding. Future functional genomics studies will reveal to what extent this rich source of divergence in genetic sequence is likely to have facilitated the evolution of extreme phenotypic plasticity required for an anadromous life-cycle.

  16. An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens

    PubMed Central

    Rensing, Stefan A; Ick, Julia; Fawcett, Jeffrey A; Lang, Daniel; Zimmer, Andreas; Van de Peer, Yves; Reski, Ralf

    2007-01-01

    Background: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all seed plants have undergone one or more genome duplications in their evolutionary past. Results: In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Conclusion: Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants. PMID:17683536

  17. Gene duplication, exon gain and neofunctionalization of OEP16-related genes in land plants.

    PubMed

    Drea, Sinéad C; Lao, Nga T; Wolfe, Kenneth H; Kavanagh, Tony A

    2006-06-01

    OEP16, a channel protein of the outer membrane of chloroplasts, has been implicated in amino acid transport and in the substrate-dependent import of protochlorophyllide oxidoreductase A. Two major clades of OEP16-related sequences were identified in land plants (OEP16-L and OEP16-S), which arose by a gene duplication event predating the divergence of seed plants and bryophytes. Remarkably, in angiosperms, OEP16-S genes evolved by gaining an additional exon that extends an interhelical loop domain in the pore-forming region of the protein. We analysed the sequence, structure and expression of the corresponding Arabidopsis genes (atOEP16-S and atOEP16-L) and demonstrated that following duplication, both genes diverged in terms of expression patterns and coding sequence. AtOEP16-S, which contains multiple G-box ABA-responsive elements (ABREs) in the promoter region, is regulated by ABI3 and ABI5 and is strongly expressed during the maturation phase in seeds and pollen grains, both desiccation-tolerant tissues. In contrast, atOEP16-L, which lacks promoter ABREs, is expressed predominantly in leaves, is induced strongly by low-temperature stress and shows weak induction in response to osmotic stress, salicylic acid and exogenous ABA. Our results indicate that gene duplication, exon gain and regulatory sequence evolution each played a role in the divergence of OEP16 homologues in plants.

  18. Preservation of Gene Duplication Increases the Regulatory Spectrum of Ribosomal Protein Genes and Enhances Growth under Stress.

    PubMed

    Parenteau, Julie; Lavoie, Mathieu; Catala, Mathieu; Malik-Ghulam, Mustafa; Gagnon, Jules; Abou Elela, Sherif

    2015-12-22

    In baker's yeast, the majority of ribosomal protein genes (RPGs) are duplicated, and it was recently proposed that such duplications are preserved via the functional specialization of the duplicated genes. However, the origin and nature of duplicated RPGs' (dRPGs) functional specificity remain unclear. In this study, we show that differences in dRPG functions are generated by variations in the modality of gene expression and, to a lesser extent, by protein sequence. Analysis of the sequence and expression patterns of non-intron-containing RPGs indicates that each dRPG is controlled by specific regulatory sequences modulating its expression levels in response to changing growth conditions. Homogenization of dRPG sequences reduces cell tolerance to growth under stress without changing the number of expressed genes. Together, the data reveal a model where duplicated genes provide a means for modulating the expression of ribosomal proteins in response to stress.

  19. New organelles by gene duplication in a biophysical model of eukaryote endomembrane evolution.

    PubMed

    Ramadas, Rohini; Thattai, Mukund

    2013-06-04

    Extant eukaryotic cells have a dynamic traffic network that consists of diverse membrane-bound organelles exchanging matter via vesicles. This endomembrane system arose and diversified during a period characterized by massive expansions of gene families involved in trafficking after the acquisition of a mitochondrial endosymbiont by a prokaryotic host cell >1.8 billion years ago. Here we investigate the mechanistic link between gene duplication and the emergence of new nonendosymbiotic organelles, using a minimal biophysical model of traffic. Our model incorporates membrane-bound compartments, coat proteins and adaptors that drive vesicles to bud and segregate cargo from source compartments, and SNARE proteins and associated factors that cause vesicles to fuse into specific destination compartments. In simulations, arbitrary numbers of compartments with heterogeneous initial compositions segregate into a few compositionally distinct subsets that we term organelles. The global structure of the traffic system (i.e., the number, composition, and connectivity of organelles) is determined completely by local molecular interactions. On evolutionary timescales, duplication of the budding and fusion machinery followed by loss of cross-interactions leads to the emergence of new organelles, with increased molecular specificity being necessary to maintain larger organellar repertoires. These results clarify potential modes of early eukaryotic evolution as well as more recent eukaryotic diversification.

  20. Capillary electrophoresis for the detection of PMP22 gene duplication: study in Mexican patients.

    PubMed

    Hernández-Zamora, Edgar; de la Luz Arenas-Sordo, María; Maldonado-Rodríguez, Rogelio

    2008-04-01

    Charcot-Marie-Tooth (CMT) disease is the most common inherited disorder of the human peripheral nerve, with an estimated overall prevalence of 17-40/10 000 [1]. The typical phenotype presents peroneal muscular atrophy and pes cavus [2]. CMT is usually divided into two large types: about two-thirds of patients have CMT type 1 (CMT1), which affects the myelin sheath (demyelinating), whereas in type 2 (CMT2) the nerve fibers themselves are affected (axonal). CMT diseases have autosomal dominant, autosomal recessive, and X-linked inheritance [1]. The most frequent subtype is 1A (CMT1A), with autosomal dominant transmission, secondary in most cases to a tandem duplication of a 1.5 Mb DNA fragment on chromosome 17p11.2-p12 [4-7]. This region contains the gene encoding peripheral myelin protein 22 (PMP22). The severity of the disease varies among patients, even within the same family, from almost no symptoms to severe foot-drop and sensory loss. The PMP22 gene has four exons and is regulated by two promoters located toward the 5' end. The duplication that causes the disease originates from an unequal exchange between chromatids during meiosis. This unequal recombination occurs between two 24 kb regions flanking the PMP22 gene, described as the proximal and distal REP sites [3, 4].

  1. New Organelles by Gene Duplication in a Biophysical Model of Eukaryote Endomembrane Evolution

    PubMed Central

    Ramadas, Rohini; Thattai, Mukund

    2013-01-01

    Extant eukaryotic cells have a dynamic traffic network that consists of diverse membrane-bound organelles exchanging matter via vesicles. This endomembrane system arose and diversified during a period characterized by massive expansions of gene families involved in trafficking after the acquisition of a mitochondrial endosymbiont by a prokaryotic host cell >1.8 billion years ago. Here we investigate the mechanistic link between gene duplication and the emergence of new nonendosymbiotic organelles, using a minimal biophysical model of traffic. Our model incorporates membrane-bound compartments, coat proteins and adaptors that drive vesicles to bud and segregate cargo from source compartments, and SNARE proteins and associated factors that cause vesicles to fuse into specific destination compartments. In simulations, arbitrary numbers of compartments with heterogeneous initial compositions segregate into a few compositionally distinct subsets that we term organelles. The global structure of the traffic system (i.e., the number, composition, and connectivity of organelles) is determined completely by local molecular interactions. On evolutionary timescales, duplication of the budding and fusion machinery followed by loss of cross-interactions leads to the emergence of new organelles, with increased molecular specificity being necessary to maintain larger organellar repertoires. These results clarify potential modes of early eukaryotic evolution as well as more recent eukaryotic diversification. PMID:23746528

  2. Gene duplication in the major insecticide target site, Rdl, in Drosophila melanogaster.

    PubMed

    Remnant, Emily J; Good, Robert T; Schmidt, Joshua M; Lumb, Christopher; Robin, Charles; Daborn, Phillip J; Batterham, Philip

    2013-09-03

    The Resistance to Dieldrin gene, Rdl, encodes a GABA-gated chloride channel subunit that is targeted by cyclodiene and phenylpyrazole insecticides. The gene was first characterized in Drosophila melanogaster by genetic mapping of resistance to the cyclodiene dieldrin. The 4,000-fold resistance observed was due to a single amino acid replacement, Ala(301) to Ser. The equivalent change was subsequently identified in Rdl orthologs of a large range of resistant insect species. Here, we report identification of a duplication at the Rdl locus in D. melanogaster. The 113-kb duplication contains one WT copy of Rdl and a second copy with two point mutations: an Ala(301) to Ser resistance mutation and Met(360) to Ile replacement. Individuals with this duplication exhibit intermediate dieldrin resistance compared with single copy Ser(301) homozygotes, reduced temperature sensitivity, and altered RNA editing associated with the resistant allele. Ectopic recombination between Roo transposable elements is involved in generating this genomic rearrangement. The duplication phenotypes were confirmed by construction of a transgenic, artificial duplication integrating the 55.7-kb Rdl locus with a Ser(301) change into an Ala(301) background. Gene duplications can contribute significantly to the evolution of insecticide resistance, most commonly by increasing the amount of gene product produced. Here however, duplication of the Rdl target site creates permanent heterozygosity, providing unique potential for adaptive mutations to accrue in one copy, without abolishing the endogenous role of an essential gene.

  3. Distinct Defects in Spine Formation or Pruning in Two Gene Duplication Mouse Models of Autism.

    PubMed

    Wang, Miao; Li, Huiping; Takumi, Toru; Qiu, Zilong; Xu, Xiu; Yu, Xiang; Bian, Wen-Jie

    2017-04-01

    Autism spectrum disorder (ASD) encompasses a complex set of developmental neurological disorders, characterized by deficits in social communication and excessive repetitive behaviors. In recent years, ASD has increasingly been considered a disease of the synapse. One main type of genetic aberration leading to ASD is gene duplication, and several mouse models have been generated mimicking these mutations. Here, we studied the effects of MECP2 duplication and human chromosome 15q11-13 duplication on synaptic development and neural circuit wiring in the mouse sensory cortices. We showed that mice carrying MECP2 duplication had specific defects in spine pruning, while the 15q11-13 duplication mouse model had impaired spine formation. Our results demonstrate that spine pathology varies significantly between autism models and that distinct aspects of neural circuit development may be targeted in different ASD mutations. Our results further underscore the importance of gene dosage in normal development and function of the brain.

  4. Duplication and functional diversification of HAP3 genes leading to the origin of the seed-developmental regulatory gene, LEAFY COTYLEDON1 (LEC1), in nonseed plant genomes.

    PubMed

    Xie, Zengyan; Li, Xia; Glover, Beverley J; Bai, Shunong; Rao, Guang-Yuan; Luo, Jingchu; Yang, Ji

    2008-08-01

    The HAP3 gene encodes a subunit of the CCAAT-box-binding factor (CBF), a highly conserved trimeric activator that recognizes and binds the ubiquitous CCAAT promoter element with high affinity. Two types of HAP3 gene have been identified in plant genomes. The LEAFY COTYLEDON1 (LEC1)-type HAP3 genes encode a functionally specialized subunit of CBF, which is expressed specifically in developing seeds. In contrast, most non-LEC1-type HAP3 genes are expressed in various tissues. It has been proposed that the LEC1-type HAP3 genes originated from the duplication and functional divergence of non-LEC1-type HAP3 genes. However, it is not yet known when this duplication event took place or whether the LEC1-type HAP3 genes appeared at the same time as the origin of seed plants. Here we describe a comprehensive comparison of the duplication patterns of HAP3 genes in different plant genomes. We recognize a major expansion of the HAP3 gene family accompanying the origin and early diversification of land plants and postulate that retrotransposition and other mechanisms of gene duplication have been involved in the expansion of the plant HAP3 gene family. We provide evidence that the LEC1-type HAP3 genes originated in nonseed vascular plant genomes and demonstrate that they are inductively expressed under drought stress in nonseed plants. These genes, however, were recruited to a novel regulatory network in the early stages of seed plant evolution and steadily expressed during seed development and maturation.

  5. Interchromosomal segmental duplications explain the unusual structure of PRSS3, the gene for an inhibitor-resistant trypsinogen.

    PubMed

    Rowen, Lee; Williams, Eleanor; Glusman, Gustavo; Linardopoulou, Elena; Friedman, Cynthia; Ahearn, Mary Ellen; Seto, Jason; Boysen, Cecilie; Qin, Shizhen; Wang, Kai; Kaur, Amardeep; Bloom, Scott; Hood, Leroy; Trask, Barbara J

    2005-08-01

    Homo sapiens possess several trypsinogen or trypsinogen-like genes of which three (PRSS1, PRSS2, and PRSS3) produce functional trypsins in the digestive tract. PRSS1 and PRSS2 are located on chromosome 7q35, while PRSS3 is found on chromosome 9p13. Here, we report a variation of the theme of new gene creation by duplication: the PRSS3 gene was formed by segmental duplications originating from chromosomes 7q35 and 11q24. As a result, PRSS3 transcripts display two variants of exon 1. The PRSS3 transcript whose gene organization most resembles PRSS1 and PRSS2 encodes a functional protein originally named mesotrypsinogen. The other variant is a fusion transcript, called trypsinogen IV. We show that the first exon of trypsinogen IV is derived from the noncoding first exon of LOC120224, a chromosome 11 gene. LOC120224 codes for a widely conserved transmembrane protein of unknown function. Comparative analyses suggest that these interchromosomal duplications occurred after the divergence of Old World monkeys and hominids. PRSS3 transcripts consist of a mixed population of mRNAs, some expressed in the pancreas and encoding an apparently functional trypsinogen and others of unknown function expressed in brain and a variety of other tissues. Analysis of the selection pressures acting on the trypsinogen gene family shows that, while the apparently functional genes are under mild to strong purifying selection overall, a few residues appear under positive selection. These residues could be involved in interactions with inhibitors.

  6. Genesis of the vertebrate FoxP subfamily member genes occurred during two ancestral whole genome duplication events.

    PubMed

    Song, Xiaowei; Tang, Yezhong; Wang, Yajun

    2016-08-22

    The vertebrate FoxP subfamily genes play important roles in the construction of essential functional modules involved in physiological and developmental processes. To explore the adaptive evolution of functional modules associated with the FoxP subfamily member genes, it is necessary to study the gene duplication process. We detected four member genes of the FoxP subfamily in sea lampreys (a representative species of jawless vertebrates) through genome screenings and phylogenetic analyses. Reliable paralogons (i.e. paralogous chromosome segments) have rarely been detected in scaffolds of FoxP subfamily member genes in sea lampreys due to the considerable existence of HTH_Tnp_Tc3_2 transposases. However, these transposases did not alter gene numbers of the FoxP subfamily in sea lampreys. The coincidence between the "1-4" gene duplication pattern of FoxP subfamily genes from invertebrates to vertebrates and two rounds of ancestral whole genome duplication (1R- and 2R-WGD) events reveals that the FoxP subfamily of vertebrates was quadruplicated in the 1R- and 2R-WGD events. Furthermore, we deduced that a synchronous gene duplication process occurred for the FoxP subfamily and for three linked gene families/subfamilies (i.e. MIT family, mGluR group III and PLXNA subfamily) in the 1R- and 2R-WGD events using phylogenetic analyses and mirror-dendrogram methods (i.e. algorithms to test protein-protein interactions). Specifically, the ancestor of FoxP1 and FoxP3 and the ancestor of FoxP2 and FoxP4 were generated in the 1R-WGD event. In the subsequent 2R-WGD event, these two ancestral genes gave rise to FoxP1, FoxP2, FoxP3 and FoxP4. The elucidation of these gene duplication processes sheds light on the phylogenetic relationships between functional modules of the FoxP subfamily member genes.

  7. Duplication-dependent CG suppression of the seed storage protein genes of maize.

    PubMed Central

    Lund, Gertrud; Lauria, Massimiliano; Guldberg, Per; Zaina, Silvio

    2003-01-01

    This study investigates the prevalence of CG and CNG suppression in single- vs. multicopy DNA regions of the maize genome. The analysis includes the single- and multicopy seed storage proteins (zeins), the miniature inverted-repeat transposable elements (MITEs), and long terminal repeat (LTR) retrotransposons. Zein genes are clustered on specific chromosomal regions, whereas MITEs and LTRs are dispersed in the genome. The multicopy zein genes are CG suppressed and exhibit large variations in CG suppression. The variation observed correlates with the extent of duplication each zein gene has undergone, indicating that gene duplication results in an increased turnover of cytosine residues. Alignment of individual zein genes confirms this observation and demonstrates that CG depletion results primarily from polarized C:T and G:A transition mutations from a less to a more extensively duplicated gene. In addition, transition mutations occur primarily in a CG or CNG context suggesting that CG suppression may result from deamination of methylated cytosine residues. Duplication-dependent CG depletion is likely to occur at other loci as duplicated MITEs and LTR elements, or elements inserted into duplicated gene regions, also exhibit CG depletion. PMID:14573492
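
    CG suppression of the kind discussed above is conventionally quantified as an observed/expected dinucleotide ratio, where the expected count assumes that C and G occur independently. The Python sketch below computes that conventional ratio for CG (gap of 0) and, by the same approximation, for CNG (gap of 1). It is an illustrative assumption about how such a measure can be computed, not necessarily the exact statistic used in the study, and the function name is hypothetical.

```python
def c_g_observed_expected(seq: str, gap: int = 0) -> float:
    """Observed/expected ratio of C followed by G after `gap` arbitrary bases.

    gap=0 scores CG dinucleotides, gap=1 scores CNG trinucleotides.
    Expected count uses the common approximation (#C * #G) / N.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # ratio is undefined without both bases; report 0
    step = gap + 1
    # Count every position i where a C is followed by a G `step` bases later.
    observed = sum(
        1 for i in range(n - step) if seq[i] == "C" and seq[i + step] == "G"
    )
    return observed * n / (c * g)


if __name__ == "__main__":
    print(c_g_observed_expected("CCGG"))
    print(c_g_observed_expected("CCGG", gap=1))
```

    Ratios well below 1 indicate depletion relative to base composition, which is the pattern expected if methylated cytosines in a CG or CNG context are lost through deamination.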

  8. A new duplication in the mitochondrially encoded tRNA proline gene in a patient with dilated cardiomyopathy.

    PubMed

    Cardena, Mari Maki Siria Godoy; Mansur, Alfredo José; Pereira, Alexandre Da Costa; Fridman, Cintia

    2013-02-01

    Mitochondria provide an environment conducive to mutations in DNA molecules (mtDNA). Analyses of mtDNA have shown mutations potentially leading to many cardiovascular traits. Here, we describe a patient with dilated cardiomyopathy and a new mtDNA duplication. The patient presented symptoms of heart failure New York Heart Association functional class III and was diagnosed with non-familial dilated cardiomyopathy with important left ventricular systolic dysfunction. Sequencing of the mtDNA control region was done, and a 15 bp duplication was observed between nucleotides 16,018 and 16,032. Part of this duplication is localized within the tRNA proline gene (tRNA(Pro)) that has an important role in cell protection against oxidative stress and is considered an important regulatory factor for cellular reactive oxygen species balance. This duplication could alter the stability or secondary structure of tRNA(Pro), affecting mt-protein synthesis. In turn, the presence of the duplication in tRNA(Pro) could cause some oxidative stress imbalance and, so, mitochondrial dysfunction could result in the pathogenicity.

  9. Frequent changes in expression profile and accelerated sequence evolution of duplicated imprinted genes in arabidopsis.

    PubMed

    Qiu, Yichun; Liu, Shao-Lun; Adams, Keith L

    2014-07-01

    Eukaryotic genomes have large numbers of duplicated genes that can evolve new functions or expression patterns by changes in coding and regulatory sequences, referred to as neofunctionalization. In flowering plants, some duplicated genes are imprinted in the endosperm, where only one allele is expressed depending on its parental origin. We found that 125 imprinted genes in Arabidopsis arose from gene duplication events during the evolution of the Brassicales. Analyses of 46 gene pairs duplicated by an ancient whole-genome duplication (alpha WGD) indicated that many imprinted genes show an accelerated rate of amino acid changes compared with their paralogs. Analyses of microarray expression data from 63 organ types and developmental stages indicated that many imprinted genes have expression patterns restricted to flowers and/or seeds in contrast to their broadly expressed paralogs. Assays of expression in orthologs from outgroup species revealed that some imprinted genes have acquired an organ-specific expression pattern restricted to flowers and/or seeds. The changes in expression pattern and the accelerated sequence evolution in the imprinted genes suggest that some of them may have undergone neofunctionalization. The imprinted genes MPC, HOMEODOMAIN GLABROUS6 (HDG6), and HDG3 are particularly interesting cases that have different functions from their paralogs. This study indicates that a large number of imprinted genes in Arabidopsis are evolutionarily recent duplicates and that many of them show changes in expression profiles and accelerated sequence evolution. Acquisition of imprinting is a mode of duplicate gene divergence in plants that is more common than previously thought.

  10. Duplicative activation mechanisms of two trypanosome telomeric VSG genes with structurally simple 5' flanks.

    PubMed

    Matthews, K R; Shiels, P G; Graham, S V; Cowan, C; Barry, J D

    1990-12-25

    In the mammalian bloodstream, African trypanosomes express variant surface glycoprotein (VSG) genes from a family of long and complex telomeric expression sites. VSG switching generally occurs by the duplication of different VSG genes into these sites by gene conversion involving a series of 70 base pair (70bp) repeats in the 5' flank. In contrast, when VSG is first synthesised by trypanosomes in the tsetse fly at the metacyclic stage, a separate set of telomeric expression sites is activated. These latter telomeres appear not to act as recipients in gene conversion. We have found that the structure of two such expression sites is simple, with very short 70bp repeat regions and very little other sequence in common with bloodstream expression sites. However, the two telomeres readily act as donors in VSG gene conversion in the bloodstream and we show for one a consistent association of the conversion 5' end point with the short 70bp repeat region. These findings help explain why a very predictable set of VSGs is expressed in the tsetse fly and have implications for VSG gene conversion mechanisms.

  11. β amyloid gene duplication in Alzheimer's disease and karyotypically normal Down syndrome

    SciTech Connect

    Delabar, J.; Goldgaber, D.; Lamour, Y.; Nicole, A.; Huret, J.; De Groucy, J.; Brown, P.; Gajdusek, D.C.; Sinet, P.

    1987-03-13

    With the recently cloned complementary DNA probe, lambdaAm4 for the chromosome 21 gene encoding brain amyloid polypeptide (β amyloid protein) of Alzheimer's disease, leukocyte DNA from three patients with sporadic Alzheimer's disease and two patients with karyotypically normal Down syndrome was found to contain three copies of this gene. Because a small region of chromosome 21 containing the ets-2 gene is duplicated in patients with Alzheimer's disease, as well as in karyotypically normal Down syndrome, duplication of a subsection of the critical segment of chromosome 21 that is duplicated in Down syndrome may be the genetic defect in Alzheimer's disease.

  12. Extensive horizontal gene transfer, duplication, and loss of chlorophyll synthesis genes in the algae

    SciTech Connect

    Hunsperger, Heather M.; Randhawa, Tejinder; Cattolico, Rose Ann

    2015-02-10

    Two non-homologous, isofunctional enzymes catalyze the penultimate step of chlorophyll a synthesis in oxygenic photosynthetic organisms such as cyanobacteria, eukaryotic algae and land plants: the light-independent (LIPOR) and light-dependent (POR) protochlorophyllide oxidoreductases. Whereas the distribution of these enzymes in cyanobacteria and land plants is well understood, the presence, loss, duplication, and replacement of these genes have not been surveyed in the polyphyletic and remarkably diverse eukaryotic algal lineages.

  13. Calcium-Activated Potassium (BK) Channels Are Encoded by Duplicate slo1 Genes in Teleost Fishes

    PubMed Central

    Deitcher, David L.; Bass, Andrew H.

    2009-01-01

    Calcium-activated, large conductance potassium (BK) channels in tetrapods are encoded by a single slo1 gene, which undergoes extensive alternative splicing. Alternative splicing generates a high level of functional diversity in BK channels that contributes to the wide range of frequencies electrically tuned by the inner ear hair cells of many tetrapods. To date, the role of BK channels in hearing among teleost fishes has not been investigated at the molecular level, although teleosts account for approximately half of all extant vertebrate species. We identified slo1 genes in teleost and nonteleost fishes using polymerase chain reaction and genetic sequence databases. In contrast to tetrapods, all teleosts examined were found to express duplicate slo1 genes in the central nervous system, whereas nonteleosts that diverged prior to the teleost whole-genome duplication event express a single slo1 gene. Phylogenetic analyses further revealed that whereas other slo1 duplicates were the result of a single duplication event, an independent duplication occurred in a basal teleost (Anguilla rostrata) following the slo1 duplication in teleosts. A third, independent slo1 duplication (autotetraploidization) occurred in salmonids. Comparison of teleost slo1 genomic sequences to their tetrapod orthologue revealed a reduced number of alternative splice sites in both slo1 co-orthologues. For the teleost Porichthys notatus, a focal study species that vocalizes with maximal spectral energy in the range electrically tuned by BK channels in the inner ear, peripheral tissues show the expression of either one (e.g., vocal muscle) or both (e.g., inner ear) slo1 paralogues with important implications for both auditory and vocal physiology. Additional loss of expression of one slo1 paralogue in nonneural tissues in P. notatus suggests that slo1 duplicates were retained via subfunctionalization. 
Together, the results predict that teleost fish achieve a diversity of BK channel subfunction via

  14. Dynamics of gene duplication in the genomes of chlorophyll d-producing cyanobacteria: implications for the ecological niche.

    PubMed

    Miller, Scott R; Wood, A Michelle; Blankenship, Robert E; Kim, Maria; Ferriera, Steven

    2011-01-01

    Gene duplication may be an important mechanism for the evolution of new functions and for the adaptive modulation of gene expression via dosage effects. Here, we analyzed the fate of gene duplicates for two strains of a novel group of cyanobacteria (genus Acaryochloris) that produces the far-red light absorbing chlorophyll d as its main photosynthetic pigment. The genomes of both strains contain an unusually high number of gene duplicates for bacteria. As has been observed for eukaryotic genomes, we find that the demography of gene duplicates can be well modeled by a birth-death process. Most duplicated Acaryochloris genes are of comparatively recent origin, are strain-specific, and tend to be located on different genetic elements. Analyses of selection on duplicates of different divergence classes suggest that a minority of paralogs exhibit near neutral evolutionary dynamics immediately following duplication but that most duplicate pairs (including those which have been retained for long periods) are under strong purifying selection against amino acid change. The likelihood of duplicate retention varied among gene functional classes, and the pronounced differences between strains in the pool of retained recent duplicates likely reflect differences in the nutrient status and other characteristics of their respective environments. We conclude that most duplicates are quickly purged from Acaryochloris genomes and that those which are retained likely make important contributions to organism ecology by conferring fitness benefits via gene dosage effects. The mechanism of enhanced duplication may involve homologous recombination between genetic elements mediated by paralogous copies of recA.
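    To make the birth-death idea concrete, here is a minimal Python sketch, assuming a deliberately simplified discrete-time version in which a fixed number of duplicates is born per time step and each existing duplicate is lost independently with a constant probability. The abstract gives no model details, so the parameter names (births_per_step, loss_prob) and the discrete-time form are illustrative assumptions rather than the authors' implementation.

```python
import random


def expected_count(age: int, births_per_step: int, loss_prob: float) -> float:
    """Expected number of surviving duplicates of a given age (in steps)
    at steady state: births thinned by survival over `age` steps."""
    return births_per_step * (1.0 - loss_prob) ** age


def simulate_duplicate_ages(n_steps: int, births_per_step: int,
                            loss_prob: float, seed: int = 0) -> list:
    """Simulate the birth-death process and return the ages of the
    duplicates that are still present after n_steps."""
    rng = random.Random(seed)
    ages = []
    for _ in range(n_steps):
        # Death: each existing duplicate is lost with probability loss_prob;
        # survivors grow one step older.
        ages = [a + 1 for a in ages if rng.random() >= loss_prob]
        # Birth: new duplicates enter with age 0.
        ages.extend([0] * births_per_step)
    return ages
```

    With a high loss probability the simulated age distribution is dominated by young duplicates and decays roughly geometrically with age, which is the qualitative pattern the abstract reports (most duplicates are recent and most are quickly purged).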

  15. Restriction and recruitment-gene duplication and the origin and evolution of snake venom toxins.

    PubMed

    Hargreaves, Adam D; Swain, Martin T; Hegarty, Matthew J; Logan, Darren W; Mulley, John F

    2014-08-01

    Snake venom has been hypothesized to have originated and diversified through a process that involves duplication of genes encoding body proteins with subsequent recruitment of the copy to the venom gland, where natural selection acts to develop or increase toxicity. However, gene duplication is known to be a rare event in vertebrate genomes, and the recruitment of duplicated genes to a novel expression domain (neofunctionalization) is an even rarer process that requires the evolution of novel combinations of transcription factor binding sites in upstream regulatory regions. Therefore, although this hypothesis concerning the evolution of snake venom is very unlikely and should be regarded with caution, it is nonetheless often assumed to be established fact, hindering research into the true origins of snake venom toxins. To critically evaluate this hypothesis, we have generated transcriptomic data for body tissues and salivary and venom glands from five species of venomous and nonvenomous reptiles. Our comparative transcriptomic analysis of these data reveals that snake venom does not evolve through the hypothesized process of duplication and recruitment of genes encoding body proteins. Indeed, our results show that many proposed venom toxins are in fact expressed in a wide variety of body tissues, including the salivary gland of nonvenomous reptiles and that these genes have therefore been restricted to the venom gland following duplication, not recruited. Thus, snake venom evolves through the duplication and subfunctionalization of genes encoding existing salivary proteins. These results highlight the danger of the elegant and intuitive "just-so story" in evolutionary biology.

  16. Restriction and Recruitment—Gene Duplication and the Origin and Evolution of Snake Venom Toxins

    PubMed Central

    Hargreaves, Adam D.; Swain, Martin T.; Hegarty, Matthew J.; Logan, Darren W.; Mulley, John F.

    2014-01-01

    Snake venom has been hypothesized to have originated and diversified through a process that involves duplication of genes encoding body proteins with subsequent recruitment of the copy to the venom gland, where natural selection acts to develop or increase toxicity. However, gene duplication is known to be a rare event in vertebrate genomes, and the recruitment of duplicated genes to a novel expression domain (neofunctionalization) is an even rarer process that requires the evolution of novel combinations of transcription factor binding sites in upstream regulatory regions. Therefore, although this hypothesis concerning the evolution of snake venom is very unlikely and should be regarded with caution, it is nonetheless often assumed to be established fact, hindering research into the true origins of snake venom toxins. To critically evaluate this hypothesis, we have generated transcriptomic data for body tissues and salivary and venom glands from five species of venomous and nonvenomous reptiles. Our comparative transcriptomic analysis of these data reveals that snake venom does not evolve through the hypothesized process of duplication and recruitment of genes encoding body proteins. Indeed, our results show that many proposed venom toxins are in fact expressed in a wide variety of body tissues, including the salivary gland of nonvenomous reptiles and that these genes have therefore been restricted to the venom gland following duplication, not recruited. Thus, snake venom evolves through the duplication and subfunctionalization of genes encoding existing salivary proteins. These results highlight the danger of the elegant and intuitive “just-so story” in evolutionary biology. PMID:25079342

  17. Predicting the Stability of Homologous Gene Duplications in a Plant RNA Virus.

    PubMed

    Willemsen, Anouk; Zwart, Mark P; Higueras, Pablo; Sardanyés, Josep; Elena, Santiago F

    2016-10-12

    One of the striking features of many eukaryotes is the apparent amount of redundancy in coding and non-coding elements of their genomes. Despite the possible evolutionary advantages, there are fewer examples of redundant sequences in viral genomes, particularly those with RNA genomes. The factors constraining the maintenance of redundant sequences in present-day RNA virus genomes are not well known. Here, we use Tobacco etch virus, a plant RNA virus, to investigate the stability of genetically redundant sequences by generating viruses with potentially beneficial gene duplications. Subsequently, we tested the viability of these viruses and performed experimental evolution. We found that all gene duplication events resulted in a loss of viability or in a significant reduction in viral fitness. Moreover, upon analyzing the genomes of the evolved viruses, we always observed the deletion of the duplicated gene copy and maintenance of the ancestral copy. Interestingly, there were clear differences in the deletion dynamics of the duplicated gene associated with the passage duration and the size and position of the duplicated copy. Based on the experimental data, we developed a mathematical model to characterize the stability of genetically redundant sequences, and showed that fitness effects are not enough to predict genomic stability. A context-dependent recombination rate is also required, with the context being the duplicated gene and its position. Our results therefore demonstrate experimentally the deleterious nature of gene duplications in RNA viruses. Besides previously described constraints on genome size, we identified additional factors that reduce the likelihood of the maintenance of duplicated genes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
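    The interplay between a fitness cost and a deletion (recombination) rate can be illustrated with a toy deterministic model in Python. This is not the authors' model, which the abstract does not specify; it assumes two genotypes only (with and without the duplication), a relative fitness w for duplication carriers, and a per-generation deletion rate r, all of which are hypothetical names introduced here.

```python
from typing import Optional


def next_freq(p: float, w: float, r: float) -> float:
    """One generation of selection followed by deletion.

    p: frequency of genomes carrying the duplication
    w: fitness of carriers relative to the ancestral genotype (= 1)
    r: fraction of carriers that lose the duplicated copy per generation
    """
    mean_fitness = p * w + (1.0 - p)
    p_after_selection = p * w / mean_fitness
    return p_after_selection * (1.0 - r)


def generations_until_below(p0: float, w: float, r: float,
                            threshold: float = 0.5,
                            max_gen: int = 10_000) -> Optional[int]:
    """Generations needed for the duplication frequency to drop below
    `threshold`, or None if that does not happen within max_gen."""
    p = p0
    for gen in range(1, max_gen + 1):
        p = next_freq(p, w, r)
        if p < threshold:
            return gen
    return None
```

    In this toy setting, a population that starts fully duplicated (p0 = 1) never loses the duplication when r = 0, however costly the duplication is, because no deletion variants arise for selection to act on. That is one simple way to see why a fitness effect alone cannot determine how quickly a duplicated copy disappears.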

  18. Additional duplicated Hox genes in the earthworm: Perionyx excavatus Hox genes consist of eleven paralog groups.

    PubMed

    Cho, Sung-Jin; Vallès, Yvonne; Kim, Kyong Min; Ji, Seong Chul; Han, Seock Jung; Park, Soon Cheol

    2012-02-10

    Annelida is a lophotrochozoan phylum whose members have a high degree of diversity in body plan morphology, reproductive strategies and ecological niches among others. Of the two traditional classes pertaining to the phylum Annelida (Polychaeta and Clitellata), the structure and function of the Hox genes have not been clearly defined within the Oligochaeta class. Using a PCR-based survey, we were able to identify five new Hox genes from the earthworm Perionyx excavatus: a Hox3 gene (Pex-Hox3b), two Dfd genes (Pex-Lox6 and Pex-Lox18), and two posterior genes (Pex-post1 and -post2a). Our results suggest that the eleven earthworm Hox genes contain at least four paralog groups (PG) that have duplicated. We found the clitellates-diagnostic signature residues and annelid signature motif. Also, we show by semi-quantitative RT-PCR that duplicated Hox gene orthologs are differentially expressed in six different anterior-posterior body regions. These results provide essential data for comparative evolution of the Hox cluster within the Annelida.

  19. The enrichment of TATA box and the scarcity of depleted proximal nucleosome in the promoters of duplicated yeast genes.

    PubMed

    Kim, Yuseob; Lee, Jang H; Babbitt, Gregory A

    2010-01-01

    Population genetic theory of gene duplication suggests that the preservation of duplicate copies requires functional divergence upon duplication. Genes that can be readily modified to produce new gene expression patterns may thus be duplicated often. In yeast, genes exhibit dichotomous expression patterns based on their promoter architectures. The expression of genes that contain TATA box or occupied proximal nucleosome (OPN) tends to be variable and respond to external signals. On the other hand, genes without TATA box or with depleted proximal nucleosome (DPN) are expressed constitutively. We find that recent duplicates in the yeast genome are heavily biased to be TATA box containing genes and not to be DPN genes. This suggests that variably expressed genes, due to the functional organization in their promoters, have higher duplicability than constitutively expressed genes.

  20. Inverted duplication of histone genes in chicken and disposition of regulatory sequences.

    PubMed Central

    Wang, S W; Robins, A J; d'Andrea, R; Wells, J R

    1985-01-01

    Sequence analysis of an 8.4 kb fragment containing five chicken histone genes shows that an H4-H2A gene pair is duplicated and inverted around a central H3 gene. A left and right region, each of 2.1 kb are 97% homologous and the boundaries of homology coincide with ten base pair repeats. These boundary regions also contain highly conserved gene promoter elements, suggesting that interaction of transcriptional machinery with histone genes may be connected with recombination in promoter regions, resulting in the inverted duplication structure seen in this cluster. PMID:4000938
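    The comparison behind a statement such as "left and right regions are 97% homologous" in an inverted duplication can be sketched in Python as follows. The sketch assumes the two arms have already been extracted and have equal length, and it uses a simple ungapped, position-by-position comparison; a real analysis would normally use a sequence aligner. The function names are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def inverted_repeat_identity(left_arm: str, right_arm: str) -> float:
    """Identity between the left arm and the reverse complement of the
    right arm, i.e. how well the two arms match as an inverted repeat."""
    return percent_identity(left_arm, reverse_complement(right_arm))
```

    Because the right arm is inverted relative to the left, it has to be reverse-complemented before the two are compared; comparing the arms in their genomic orientation would badly underestimate their similarity.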

  1. Human-specific evolution of novel SRGAP2 genes by incomplete segmental duplication

    PubMed Central

    Dennis, Megan Y.; Nuttle, Xander; Sudmant, Peter H.; Antonacci, Francesca; Graves, Tina A.; Nefedov, Mikhail; Rosenfeld, Jill A.; Sajjadian, Saba; Malig, Maika; Kotkiewicz, Holland; Curry, Cynthia J.; Shafer, Susan; Shaffer, Lisa G.; de Jong, Pieter J.; Wilson, Richard K.; Eichler, Evan E.

    2012-01-01

    Gene duplication is an important source of phenotypic change and adaptive evolution. We use a novel genomic approach to identify highly identical sequence missing from the reference genome, confirming the cortical development gene Slit-Robo Rho GTPase activating protein 2 (SRGAP2) duplicated three times in humans. We show that the promoter and first nine exons of SRGAP2 duplicated from 1q32.1 (SRGAP2A) to 1q21.1 (SRGAP2B) ~3.4 million years ago (mya). Two larger duplications later copied SRGAP2B to chromosome 1p12 (SRGAP2C) and to proximal 1q21.1 (SRGAP2D), ~2.4 and ~1 mya, respectively. Sequence and expression analysis shows SRGAP2C is the most likely duplicate to encode a functional protein and among the most fixed human-specific duplicate genes. Our data suggest a mechanism where incomplete duplication created a novel function at birth, antagonizing parental SRGAP2 function 2–3 mya, a time corresponding to the transition from Australopithecus to Homo and the beginning of neocortex expansion. PMID:22559943

  2. Evolution of Vertebrate Adam Genes; Duplication of Testicular Adams from Ancient Adam9/9-like Loci

    PubMed Central

    Wei, Shuo

    2015-01-01

    Members of the disintegrin metalloproteinase (ADAM) family have important functions in regulating cell-cell and cell-matrix interactions as well as cell signaling. There are two major types of ADAMs: the somatic ADAMs (sADAMs) that have a significant presence in somatic tissues, and the testicular ADAMs (tADAMs) that are expressed predominantly in the testis. Genes encoding tADAMs can be further divided into two groups: group I (intronless) and group II (intron-containing). To date, tAdams have only been reported in placental mammals, and their evolutionary origin and relationship to sAdams remain largely unknown. Using phylogenetic and syntenic tools, we analyzed the Adam genes in various vertebrates ranging from fishes to placental mammals. Our analyses reveal duplication and loss of some sAdams in certain vertebrate species. In particular, there exists an Adam9-like gene in non-mammalian vertebrates but not mammals. We also identified putative group I and group II tAdams in all amniote species that have been examined. These tAdam homologues are more closely related to Adams 9 and 9-like than to other sAdams. In all amniote species examined, group II tAdams lie in close vicinity to Adam9 and hence likely arose from tandem duplication, whereas group I tAdams likely originated through retroposition because of their lack of introns. Clusters of multiple group I tAdams are also common, suggesting tandem duplication after retroposition. Therefore, Adam9/9-like and some of the derived tAdam loci are likely preferred targets for tandem duplication and/or retroposition. Consistent with this hypothesis, we identified a young retroposed gene that duplicated recently from Adam9 in the opossum. As a result of gene duplication, some tAdams were pseudogenized in certain species, whereas others acquired new expression patterns and functions. The rapid duplication of Adam genes has a major contribution to the diversity of ADAMs in various vertebrate species. PMID:26308360

  3. Evolution of Vertebrate Adam Genes; Duplication of Testicular Adams from Ancient Adam9/9-like Loci.

    PubMed

    Bahudhanapati, Harinath; Bhattacharya, Shashwati; Wei, Shuo

    2015-01-01

    Members of the disintegrin metalloproteinase (ADAM) family have important functions in regulating cell-cell and cell-matrix interactions as well as cell signaling. There are two major types of ADAMs: the somatic ADAMs (sADAMs) that have a significant presence in somatic tissues, and the testicular ADAMs (tADAMs) that are expressed predominantly in the testis. Genes encoding tADAMs can be further divided into two groups: group I (intronless) and group II (intron-containing). To date, tAdams have only been reported in placental mammals, and their evolutionary origin and relationship to sAdams remain largely unknown. Using phylogenetic and syntenic tools, we analyzed the Adam genes in various vertebrates ranging from fishes to placental mammals. Our analyses reveal duplication and loss of some sAdams in certain vertebrate species. In particular, there exists an Adam9-like gene in non-mammalian vertebrates but not mammals. We also identified putative group I and group II tAdams in all amniote species that have been examined. These tAdam homologues are more closely related to Adams 9 and 9-like than to other sAdams. In all amniote species examined, group II tAdams lie in close vicinity to Adam9 and hence likely arose from tandem duplication, whereas group I tAdams likely originated through retroposition because of their lack of introns. Clusters of multiple group I tAdams are also common, suggesting tandem duplication after retroposition. Therefore, Adam9/9-like and some of the derived tAdam loci are likely preferred targets for tandem duplication and/or retroposition. Consistent with this hypothesis, we identified a young retroposed gene that duplicated recently from Adam9 in the opossum. As a result of gene duplication, some tAdams were pseudogenized in certain species, whereas others acquired new expression patterns and functions. The rapid duplication of Adam genes has a major contribution to the diversity of ADAMs in various vertebrate species.

  4. Extensive divergence in alternative splicing patterns after gene and genome duplication during the evolutionary history of Arabidopsis.

    PubMed

    Zhang, Peter G; Huang, Suzanne Z; Pin, Anne-Laure; Adams, Keith L

    2010-07-01

    Gene duplication at various scales, from single gene duplication to whole-genome (WG) duplication, has occurred throughout eukaryotic evolution and contributed greatly to the large number of duplicated genes in the genomes of many eukaryotes. Previous studies have shown divergence in expression patterns of many duplicated genes at various evolutionary time scales and cases of gain of a new function or expression pattern by one duplicate or partitioning of functions or expression patterns between duplicates. Alternative splicing (AS) is a fundamental aspect of the expression of many genes that can increase gene product diversity and affect gene regulation. However, the evolution of AS patterns of genes duplicated by polyploidy, as well as in a sizable number of duplicated gene pairs in plants, has not been examined. Here, we have characterized conservation and divergence in AS patterns in genes duplicated by a polyploidy event during the evolutionary history of Arabidopsis thaliana. We used reverse transcription-polymerase chain reaction to assay 104 WG duplicates in six organ types and in plants grown under three abiotic stress treatments to detect organ- and stress-specific patterns of AS. Differences in splicing patterns in one or more organs, or under stress conditions, were found between the genes in a large majority of the duplicated pairs. In a few cases, AS patterns were the same between duplicates only under one or more abiotic stress treatments and not under normal growing conditions or vice versa. We also examined AS in 42 tandem duplicates and we found patterns of AS roughly comparable with the genes duplicated by polyploidy. The alternatively spliced forms in some of the genes created premature stop codons that would result in missing or partial functional domains if the transcripts are translated, which could affect gene function and cause functional divergence between duplicates. 
Our results indicate that AS patterns have diverged considerably after

  5. Detecting Functional Divergence after Gene Duplication through Evolutionary Changes in Posttranslational Regulatory Sequences

    PubMed Central

    Nguyen Ba, Alex N.; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L.; Landry, Christian R.; Moses, Alan M.

    2014-01-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. PMID:25474245

  6. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences.

    PubMed

    Nguyen Ba, Alex N; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L; Landry, Christian R; Moses, Alan M

    2014-12-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.

  7. Origin, duplication and reshuffling of plasmid genes: Insights from Burkholderia vietnamiensis G4 genome.

    PubMed

    Maida, Isabel; Fondi, Marco; Orlandini, Valerio; Emiliani, Giovanni; Papaleo, Maria Cristiana; Perrin, Elena; Fani, Renato

    2014-01-01

    Using a computational pipeline based on similarity networks reconstruction we analysed the 1133 genes of the Burkholderia vietnamiensis (Bv) G4 five plasmids, showing that gene and operon duplication played an important role in shaping the plasmid architecture. Several single/multiple duplications occurring at intra- and/or interplasmids level involving 253 paralogous genes (stand-alone, clustered or operons) were detected. An extensive gene/operon exchange between plasmids and chromosomes was also disclosed. The larger the plasmid, the higher the number and size of paralogous fragments. Many paralogs encoded mobile genetic elements and duplicated very recently, suggesting that the rearrangement of the Bv plastic genome is ongoing. Concerning the "molecular habitat" and the "taxonomical status" (the Preferential Organismal Sharing) of Bv plasmid genes, most of them have been exchanged with other plasmids of bacteria belonging (or phylogenetically very close) to Burkholderia, suggesting that taxonomical proximity of bacterial strains is a crucial issue in plasmid-mediated gene exchange.

  8. Contribution of nonohnologous duplicated genes to high habitat variability in mammals.

    PubMed

    Tamate, Satoshi C; Kawata, Masakado; Makino, Takashi

    2014-07-01

    The mechanism by which genetic systems affect environmental adaptation is a focus of considerable attention in the fields of ecology, evolution, and conservation. However, the genomic characteristics that constrain adaptive evolution have remained unknown. A recent study showed that the proportion of duplicated genes in whole Drosophila genomes correlated with environmental variability within habitat, but it remains unclear whether the correlation is observed even in vertebrates, whose genomes include a large number of duplicated genes generated by whole-genome duplication (WGD). Here, we focus on fully sequenced mammalian genomes that experienced WGD in early vertebrate lineages and show that the proportion of small-scale duplication (SSD) genes in the genome, but not that of WGD genes, is significantly correlated with habitat variability. Moreover, species with low habitat variability have a higher proportion of lost duplicated genes, particularly SSD genes, than those with high habitat variability. These results indicate that species that inhabit variable environments may maintain more SSD genes in their genomes and suggest that SSD genes are important for adapting to novel environments and surviving environmental changes. These insights may be applied to predicting invasive and endangered species.

  9. A novel KCNQ4 mutation and a private IMMP2L-DOCK4 duplication segregating with nonsyndromic hearing loss in a Brazilian family.

    PubMed

    Uehara, Daniela T; Freitas, Érika L; Alves, Leandro U; Mazzeu, Juliana F; Auricchio, Maria TBM; Tabith, Alfredo; Monteiro, Mário LR; Rosenberg, Carla; Mingroni-Netto, Regina C

    2015-01-01

    Here we describe a novel missense variant in the KCNQ4 gene and a private duplication at 7q31.1 partially involving two genes (IMMP2L and DOCK4). Both mutations segregated with nonsyndromic hearing loss in a family with three affected individuals. Initially, we identified the duplication in a screening of 132 unrelated cases of hearing loss with a multiplex ligation-dependent probe amplification panel of genes that are candidates to have a role in hearing, including IMMP2L. Mapping of the duplication by array-CGH revealed that the duplication also encompassed the 3'-end of DOCK4. Subsequently, whole-exome sequencing identified the breakpoint of the rearrangement, thereby confirming the existence of a fusion IMMP2L-DOCK4 gene. Transcription products of the fusion gene were identified, indicating that they escaped nonsense-mediated messenger RNA decay. A missense substitution (c.701A>T) in KCNQ4 (a gene at the DFNA2A locus) was also identified by whole-exome sequencing. Because the substitution is predicted to be probably damaging and KCNQ4 has been implicated in hearing loss, this mutation might explain the deafness in the affected individuals, although a hypothetical effect of the product of the fusion gene on hearing cannot be completely ruled out.

  10. A novel KCNQ4 mutation and a private IMMP2L-DOCK4 duplication segregating with nonsyndromic hearing loss in a Brazilian family

    PubMed Central

    Uehara, Daniela T; Freitas, Érika L; Alves, Leandro U; Mazzeu, Juliana F; Auricchio, Maria TBM; Tabith, Alfredo; Monteiro, Mário LR; Rosenberg, Carla; Mingroni-Netto, Regina C

    2015-01-01

    Here we describe a novel missense variant in the KCNQ4 gene and a private duplication at 7q31.1 partially involving two genes (IMMP2L and DOCK4). Both mutations segregated with nonsyndromic hearing loss in a family with three affected individuals. Initially, we identified the duplication in a screening of 132 unrelated cases of hearing loss with a multiplex ligation-dependent probe amplification panel of genes that are candidates to have a role in hearing, including IMMP2L. Mapping of the duplication by array-CGH revealed that the duplication also encompassed the 3′-end of DOCK4. Subsequently, whole-exome sequencing identified the breakpoint of the rearrangement, thereby confirming the existence of a fusion IMMP2L-DOCK4 gene. Transcription products of the fusion gene were identified, indicating that they escaped nonsense-mediated messenger RNA decay. A missense substitution (c.701A>T) in KCNQ4 (a gene at the DFNA2A locus) was also identified by whole-exome sequencing. Because the substitution is predicted to be probably damaging and KCNQ4 has been implicated in hearing loss, this mutation might explain the deafness in the affected individuals, although a hypothetical effect of the product of the fusion gene on hearing cannot be completely ruled out. PMID:27081546

  11. Increased gene dosage plays a predominant role in the initial stages of evolution of duplicate TEM-1 beta lactamase genes.

    PubMed

    Dhar, Riddhiman; Bergmiller, Tobias; Wagner, Andreas

    2014-06-01

    Gene duplication is important in evolution, because it provides new raw material for evolutionary adaptations. Several existing hypotheses about the causes of duplicate retention and diversification differ in their emphasis on gene dosage, subfunctionalization, and neofunctionalization. Little experimental data exist on the relative importance of gene expression changes and changes in coding regions for the evolution of duplicate genes. Furthermore, we do not know how strongly the environment could affect this importance. To address these questions, we performed evolution experiments with the TEM-1 beta lactamase gene in Escherichia coli to study the initial stages of duplicate gene evolution in the laboratory. We mimicked tandem duplication by inserting two copies of the TEM-1 gene on the same plasmid. We then subjected these copies to repeated cycles of mutagenesis and selection in various environments that contained antibiotics in different combinations and concentrations. Our experiments showed that gene dosage is the most important factor in the initial stages of duplicate gene evolution, and overshadows the importance of point mutations in the coding region.

  12. Evaluating dosage compensation as a cause of duplicate gene retention in Paramecium tetraurelia

    PubMed Central

    Hughes, Timothy; Ekman, Diana; Ardawatia, Himanshu; Elofsson, Arne; Liberles, David A

    2007-01-01

    The high retention of duplicate genes in the genome of Paramecium tetraurelia has led to the hypothesis that most of the retained genes have persisted because of constraints due to gene dosage. This and other possible mechanisms are discussed in the light of expectations from population genetics and systems biology. PMID:17521457

  13. Gene-Family Extension Measures and Correlations

    PubMed Central

    Carmi, Gon; Bolshoy, Alexander

    2016-01-01

    The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes, formed by gene duplication. In earlier works conducted on a limited number of completely sequenced and annotated genomes it was found that size of gene family and size of genome are positively correlated. Additionally, it was found that several atypical microbes deviated from the observed general trend. In this study, we reexamined these associations on a larger dataset consisting of 1484 prokaryotic genomes and using several ranking approaches. We applied ranking methods in such a way that genomes with lower numbers of gene copies would have lower rank. Until now only simple ranking methods were used; we applied the Kemeny optimal aggregation approach as well. Regression and correlation analysis were utilized in order to accurately quantify and characterize the relationships between measures of paralog indices and genome size. In addition, boxplot analysis was employed as a method for outlier detection. We found that, in general, all paralog indexes positively correlate with an increase of genome size. As expected, different groups of atypical prokaryotic genomes were found for different types of paralog quantities. Mycoplasmataceae and Halobacteria appeared to be among the most interesting candidates for further research of evolution through gene duplication. PMID:27527218

  14. Gene-Family Extension Measures and Correlations.

    PubMed

    Carmi, Gon; Bolshoy, Alexander

    2016-08-03

    The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes, formed by gene duplication. In earlier works conducted on a limited number of completely sequenced and annotated genomes it was found that size of gene family and size of genome are positively correlated. Additionally, it was found that several atypical microbes deviated from the observed general trend. In this study, we reexamined these associations on a larger dataset consisting of 1484 prokaryotic genomes and using several ranking approaches. We applied ranking methods in such a way that genomes with lower numbers of gene copies would have lower rank. Until now only simple ranking methods were used; we applied the Kemeny optimal aggregation approach as well. Regression and correlation analysis were utilized in order to accurately quantify and characterize the relationships between measures of paralog indices and genome size. In addition, boxplot analysis was employed as a method for outlier detection. We found that, in general, all paralog indexes positively correlate with an increase of genome size. As expected, different groups of atypical prokaryotic genomes were found for different types of paralog quantities. Mycoplasmataceae and Halobacteria appeared to be among the most interesting candidates for further research of evolution through gene duplication.
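
    The Kemeny optimal aggregation mentioned above can be illustrated with a minimal sketch. It assumes each paralog index yields a full ranking of the same genomes (lowest copy number first) and finds, by brute force, the consensus ordering that minimizes the total Kendall tau distance to all input rankings. The genome labels and rankings are hypothetical, and exhaustive search is only feasible for a handful of items; it is not presented as the procedure the authors used on 1484 genomes.

```python
from itertools import combinations, permutations


def kendall_tau_distance(order_a, order_b):
    """Count item pairs that the two orderings place in opposite relative order."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_aggregate(rankings):
    """Brute-force Kemeny consensus: ordering with minimal summed Kendall tau distance."""
    items = sorted(rankings[0])
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


# Hypothetical genomes ranked from fewest to most gene copies by three paralog indices.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g3", "g4"],
]
consensus = kemeny_aggregate(rankings)
print(consensus)
```

    Exact Kemeny aggregation is computationally hard in general, so a dataset of realistic size would need a heuristic or an integer-programming formulation rather than this exhaustive search.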

  15. Mosquito vitellogenin genes: Comparative sequence analysis, gene duplication, and the role of rare synonymous codon usage in regulating expression.

    PubMed

    Isoe, Jun; Hagedorn, Henry H

    2007-01-01

    Comparative sequence analysis of mosquito vitellogenin (Vg) genes was carried out to gain a better understanding of their evolution. The genomic clones of vitellogenin genes were isolated and sequenced from all three subfamilies of the family Culicidae including Culicinae (Aedes aegypti, Ochlerotatus atropalpus, Ae. polynesiensis, Ae. albopictus, Ochlerotatus triseriatus and Culex quinquefasciatus), Toxorhynchitinae (Toxorhynchites amboinensis), and Anophelinae (Anopheles albimanus). Genomic clones of vitellogenin genes Vg-B and Vg-C were isolated from Ae. aegypti and sequenced. A comparison of Vg-B and Vg-C, with the previously characterized vitellogenin gene, Vg-A1, suggests that Vg-A1 and Vg-B probably arose by a recent gene duplication, and Vg-C apparently diverged from the two other members of the gene family in an earlier gene duplication event. Two vitellogenin genes orthologous to Vg-C were cloned from a Cx. quinquefasciatus DNA library, one of which is truncated at the N-terminal end. Single vitellogenin genes, orthologous to Vg-C, were cloned from the An. albimanus and Tx. amboinensis libraries. Incomplete sequences orthologous to Vg-B and Vg-C were isolated from the Oc. atropalpus library. Only partial sequences were isolated from Ae. polynesiensis, Ae. albopictus and Oc. triseriatus. Inferred phylogenetic relationships based on analysis of these sequences suggest that Vg-C was the ancestral gene and that a recent gene duplication gave rise to Vg-A1 and Vg-B after the separation of the genus Aedes. The deduced amino acid composition of mosquito vitellogenin proteins exhibits higher tyrosine and phenylalanine composition than other mosquito proteins except for the hexamerin storage proteins. Analysis of vitellogenin coding sequences showed that a majority of amino acid substitutions were due to conserved and moderately conserved changes suggesting that the vitellogenins are under moderately selective constraints to maintain tertiary structure. The

  16. Gene Duplication, Population Genomics, and Species-Level Differentiation within a Tropical Mountain Shrub

    PubMed Central

    Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H.; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C.

    2014-01-01

    Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species. PMID:25223767

  17. Evolution of CONSTANS Regulation and Function after Gene Duplication Produced a Photoperiodic Flowering Switch in the Brassicaceae.

    PubMed

    Simon, Samson; Rühl, Mark; de Montaigu, Amaury; Wötzel, Stefan; Coupland, George

    2015-09-01

    Environmental control of flowering allows plant reproduction to occur under optimal conditions and facilitates adaptation to different locations. At high latitude, flowering of many plants is controlled by seasonal changes in day length. The photoperiodic flowering pathway confers this response in the Brassicaceae, which colonized temperate latitudes after divergence from the Cleomaceae, their subtropical sister family. The CONSTANS (CO) transcription factor of Arabidopsis thaliana, a member of the Brassicaceae, is central to the photoperiodic flowering response and shows characteristic patterns of transcription required for day-length sensing. CO is believed to be widely conserved among flowering plants; however, we show that it arose after gene duplication at the root of the Brassicaceae followed by divergence of transcriptional regulation and protein function. CO has two close homologs, CONSTANS-LIKE1 (COL1) and COL2, which are related to CO by tandem duplication and whole-genome duplication, respectively. The single CO homolog present in the Cleomaceae shows transcriptional and functional features similar to those of COL1 and COL2, suggesting that these were ancestral. We detect cis-regulatory and codon changes characteristic of CO and use transgenic assays to demonstrate their significance in the day-length-dependent activation of the CO target gene FLOWERING LOCUS T. Thus, the function of CO as a potent photoperiodic flowering switch evolved in the Brassicaceae after gene duplication. The origin of CO may have contributed to the range expansion of the Brassicaceae and suggests that in other families CO genes involved in photoperiodic flowering arose by convergent evolution.

  18. Evolution dynamics of a model for gene duplication under adaptive conflict

    NASA Astrophysics Data System (ADS)

    Ancliff, Mark; Park, Jeong-Man

    2014-06-01

    We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
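
    The idea of a fitness landscape defined by Hamming distances from two reference sequences can be sketched as follows. This is only an illustration under stated assumptions: the linear fitness form, the binary sequences, and the function names are hypothetical, and the sketch omits the Crow-Kimura dynamics, the mutational load of carrying two copies, and the path-integral solution described in the abstract.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def single_copy_fitness(seq, ref1, ref2):
    """Assumed form: one pleiotropic gene is scored by its closeness to both references."""
    n = len(seq)
    return (1 - hamming(seq, ref1) / n) + (1 - hamming(seq, ref2) / n)


def double_copy_fitness(copy1, copy2, ref1, ref2):
    """Assumed form: each function is served by whichever copy is closer to its reference."""
    n = len(ref1)
    d1 = min(hamming(copy1, ref1), hamming(copy2, ref1))
    d2 = min(hamming(copy1, ref2), hamming(copy2, ref2))
    return (1 - d1 / n) + (1 - d2 / n)


ref1 = "00000000"         # hypothetical optimum for function 1
ref2 = "11110000"         # hypothetical optimum for function 2
pleiotropic = "11000000"  # compromise sequence, two mismatches from each reference

single = single_copy_fitness(pleiotropic, ref1, ref2)
double = double_copy_fitness(ref1, ref2, ref1, ref2)
print(single, double)
```

    In this toy landscape the single compromise gene cannot match both references at once, whereas two specialized copies can, which is the adaptive conflict that duplication is said to resolve.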

  19. Enhanced fixation and preservation of a newly arisen duplicate gene by masking deleterious loss-of-function mutations.

    PubMed

    Tanaka, Kentaro M; Takahasi, K Ryo; Takano-Shimizu, Toshiyuki

    2009-08-01

    Segmental duplications are enriched within many eukaryote genomes, and their potential consequence is gene duplication. While previous theoretical studies of gene duplication have mainly focused on the gene silencing process after fixation, the process leading to fixation is even more important for segmental duplications, because the majority of duplications would be lost before reaching a significant frequency in a population. Here, by a series of computer simulations, we show that purifying selection against loss-of-function mutations increases the fixation probability of a new duplicate gene, especially when the gene is haplo-insufficient. Theoretically, the probability of simultaneous preservation of both duplicate genes becomes twice the loss-of-function mutation rate (u(c)) when the population size (N), the degree of dominance of mutations (h) and the recombination rate between the duplicate genes (c) are all sufficiently large (Nu(c)>1, h>0.1 and c>u(c)). The preservation probability declines rapidly with h and becomes 0 when h=0 (haplo-sufficiency). We infer that masking deleterious loss-of-function mutations gives duplicate genes an immediate selective advantage and, together with effects of increased gene dosage, would predominantly determine the fates of the duplicate genes in the early phase of their evolution.
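
    The limiting result quoted above can be written as a small check. This sketch assumes that "Nu(c)" denotes the product of N and the mutation rate u(c), returns twice u(c) only when all three stated conditions hold, returns 0 for h=0 as the abstract states, and returns nothing outside those cases because the abstract gives no formula there. The function name and parameter values are hypothetical.

```python
def preservation_probability(u_c, N, h, c):
    """Preservation probability of both duplicates in the regimes the abstract describes.

    u_c: loss-of-function mutation rate, N: population size,
    h: degree of dominance, c: recombination rate between the duplicates.
    """
    if h == 0:
        return 0.0  # haplo-sufficiency: the abstract states the probability becomes 0
    if N * u_c > 1 and h > 0.1 and c > u_c:
        return 2 * u_c  # large-parameter limit: twice the mutation rate
    return None  # no closed form is given in the abstract for other regimes


# Hypothetical parameter values chosen to satisfy all three conditions.
p = preservation_probability(u_c=1e-5, N=1_000_000, h=0.5, c=1e-3)
print(p)
```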

  20. Characterizing gene family evolution

    PubMed Central

    Liberles, David A.

    2008-01-01

    Gene families are widely used in comparative genomics, molecular evolution, and in systematics. However, they are constructed in different manners, their data analyzed and interpreted differently, with different underlying assumptions, leading to sometimes divergent conclusions. In systematics, concepts like monophyly and the dichotomy between homoplasy and homology have been central to the analysis of phylogenies. We critique the traditional use of such concepts as applied to gene families and give examples of incorrect inferences they may lead to. Operational definitions that have emerged within functional genomics are contrasted with the common formal definitions derived from systematics. Lastly, we question the utility of layers of homology and the meaning of homology at the character state level in the context of sequence evolution. From this, we move forward to present an idealized strategy for characterizing gene family evolution for both systematic and functional purposes, including recent methodological improvements. PMID:19461954

  1. Gene body methylation shows distinct patterns associated with different gene origins and duplication modes and has a heterogeneous relationship with gene expression in Oryza sativa (rice).

    PubMed

    Wang, Yupeng; Wang, Xiyin; Lee, Tae-Ho; Mansoor, Shahid; Paterson, Andrew H

    2013-04-01

    Whole-genome duplication (WGD) has been recurring and single-gene duplication is also widespread in angiosperms. Recent whole-genome DNA methylation maps indicate that gene body methylation (i.e. of coding regions) has a functional role. However, whether gene body methylation is related to gene origins and duplication modes has yet to be reported. In rice (Oryza sativa), we computed a body methylation level (proportion of methylated CpG within coding regions) for each gene in five tissues. Body methylation levels follow a bimodal distribution, but show distinct patterns associated with transposable element-related genes; WGD, tandem, proximal and transposed duplicates; and singleton genes. For pairs of duplicated genes, divergence in body methylation levels increases with physical distance and synonymous (Ks) substitution rates, and WGDs show lower divergence than single-gene duplications of similar Ks levels. Intermediate body methylation tends to be associated with high levels of gene expression, whereas heavy body methylation is associated with lower levels of gene expression. The biological trends revealed here are consistent across five rice tissues, indicating that genes of different origins and duplication modes have distinct body methylation patterns, and body methylation has a heterogeneous relationship with gene expression and may be related to survivorship of duplicated genes.
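
    The body methylation level defined above (proportion of methylated CpG within coding regions) and the divergence between a duplicated pair can be sketched as follows. The sketch assumes each CpG site already carries a binary methylated/unmethylated call, and it uses the absolute difference as a stand-in divergence measure; the function names and the example calls are hypothetical, and real pipelines would typically derive calls from per-site read counts.

```python
def body_methylation_level(cpg_calls):
    """Fraction of CpG sites in a gene's coding region that are called methylated.

    cpg_calls: iterable of booleans, one per CpG site (True = methylated).
    Returns None when the coding region contains no CpG sites.
    """
    calls = list(cpg_calls)
    if not calls:
        return None
    return sum(calls) / len(calls)


def methylation_divergence(calls_a, calls_b):
    """Assumed divergence measure: absolute difference in body methylation level."""
    level_a = body_methylation_level(calls_a)
    level_b = body_methylation_level(calls_b)
    if level_a is None or level_b is None:
        return None
    return abs(level_a - level_b)


# Hypothetical CpG calls for a pair of duplicated genes in one tissue.
gene_a = [True, False, True, True]
gene_b = [False, False, True, False]
print(body_methylation_level(gene_a), methylation_divergence(gene_a, gene_b))
```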

  2. Independent Origin and Global Distribution of Distinct Plasmodium vivax Duffy Binding Protein Gene Duplications

    PubMed Central

    Hostetler, Jessica B.; Lo, Eugenia; Kanjee, Usheer; Amaratunga, Chanaki; Suon, Seila; Sreng, Sokunthea; Mao, Sivanna; Yewhalaw, Delenasaw; Mascarenhas, Anjali; Kwiatkowski, Dominic P.; Ferreira, Marcelo U.; Rathod, Pradipsinh K.; Yan, Guiyun; Fairhurst, Rick M.; Duraisingh, Manoj T.; Rayner, Julian C.

    2016-01-01

    Background: Plasmodium vivax causes the majority of malaria episodes outside Africa, but remains a relatively understudied pathogen. The pathology of P. vivax infection depends critically on the parasite’s ability to recognize and invade human erythrocytes. This invasion process involves an interaction between P. vivax Duffy Binding Protein (PvDBP) in merozoites and the Duffy antigen receptor for chemokines (DARC) on the erythrocyte surface. Whole-genome sequencing of clinical isolates recently established that some P. vivax genomes contain two copies of the PvDBP gene. The frequency of this duplication is particularly high in Madagascar, where there is also evidence for P. vivax infection in DARC-negative individuals. The functional significance and global prevalence of this duplication, and whether there are other copy number variations at the PvDBP locus, is unknown. Methodology/Principal Findings: Using whole-genome sequencing and PCR to study the PvDBP locus in P. vivax clinical isolates, we found that PvDBP duplication is widespread in Cambodia. The boundaries of the Cambodian PvDBP duplication differ from those previously identified in Madagascar, meaning that current molecular assays were unable to detect it. The Cambodian PvDBP duplication did not associate with parasite density or DARC genotype, and ranged in prevalence from 20% to 38% over four annual transmission seasons in Cambodia. This duplication was also present in P. vivax isolates from Brazil and Ethiopia, but not India. Conclusions/Significance: PvDBP duplications are much more widespread and complex than previously thought, and at least two distinct duplications are circulating globally. The same duplication boundaries were identified in parasites from three continents, and were found at high prevalence in human populations where DARC-negativity is essentially absent. It is therefore unlikely that PvDBP duplication is associated with infection of DARC-negative individuals, but functional tests

  3. Tandem duplication PCR: an ultra-sensitive assay for the detection of internal tandem duplications of the FLT3 gene

    PubMed Central

    Lin, Ming-Tseh; Tseng, Li-Hui; Beierl, Katie; Hsieh, Antony; Thiess, Michele; Chase, Nadine; Stafford, Amanda; Levis, Mark J.; Eshleman, James R.; Gocke, Christopher D.

    2013-01-01

    Internal tandem duplication (ITD) mutations of the FLT3 gene have been associated with a poor prognosis in acute myeloid leukemia (AML). Detection of ITD-positive minor clones at the initial diagnosis and during the minimal residual disease (MRD) stage may be essential. We previously designed a delta-PCR strategy to improve the sensitivity to 0.1% ITD-positive leukemia cells and showed that minor mutants with an allele burden of less than 1% can be clinically significant. In this study, we report on tandem duplication PCR (TD-PCR), a modified inverse PCR assay, and demonstrate a limit of detection of a few molecules of ITD mutants. The TD-PCR was initially designed to confirm ITD mutation of an amplicon which was undetectable by capillary electrophoresis and was incidentally isolated by a molecular fraction collecting tool. Subsequently, TD-PCR detected ITD mutation in 2 of 77 patients previously reported as negative for ITD mutation by a standard PCR assay. TD-PCR can also potentially be applied to monitor MRD with high analytic sensitivity in a portion of ITD-positive AML patients. Further studies using TD-PCR to detect ITD mutants at diagnosis may clarify the clinical significance of those ITD mutants with extremely low allele burden. PMID:23846441

  4. Differential transcriptional modulation of duplicated fatty acid-binding protein genes by dietary fatty acids in zebrafish (Danio rerio): evidence for subfunctionalization or neofunctionalization of duplicated genes.

    PubMed

    Karanth, Santhosh; Lall, Santosh P; Denovan-Wright, Eileen M; Wright, Jonathan M

    2009-09-02

    In the Duplication-Degeneration-Complementation (DDC) model, subfunctionalization and neofunctionalization have been proposed as important processes driving the retention of duplicated genes in the genome. These processes are thought to occur by gain or loss of regulatory elements in the promoters of duplicated genes. We tested the DDC model by determining the transcriptional induction of fatty acid-binding protein (Fabp) genes by dietary fatty acids (FAs) in zebrafish. We chose zebrafish for this study for two reasons: extensive bioinformatics resources are available for zebrafish at zfin.org and zebrafish contains many duplicated genes owing to a whole genome duplication event that occurred early in the ray-finned fish lineage approximately 230-400 million years ago. Adult zebrafish were fed diets containing either fish oil (12% lipid, rich in highly unsaturated fatty acid), sunflower oil (12% lipid, rich in linoleic acid), linseed oil (12% lipid, rich in linolenic acid), or low fat (4% lipid, low fat diet) for 10 weeks. FA profiles and the steady-state levels of fabp mRNA and heterogeneous nuclear RNA in intestine, liver, muscle and brain of zebrafish were determined. FA profiles assayed by gas chromatography differed in the intestine, brain, muscle and liver depending on diet. The steady-state level of mRNA for three sets of duplicated genes, fabp1a/fabp1b.1/fabp1b.2, fabp7a/fabp7b, and fabp11a/fabp11b, was determined by reverse transcription, quantitative polymerase chain reaction (RT-qPCR). In brain, the steady-state level of fabp7b mRNAs was induced in fish fed the linoleic acid-rich diet; in intestine, the transcript levels of fabp1b.1 and fabp7b were elevated in fish fed the linolenic acid-rich diet; in liver, the level of fabp7a mRNAs was elevated in fish fed the low fat diet; and in muscle, the levels of fabp7a and fabp11a mRNAs were elevated in fish fed the linolenic acid-rich or the low fat diets. In all cases, induction of the steady-state level of

  5. Two complementary recessive genes in duplicated segments control etiolation in rice.

    PubMed

    Mao, Donghai; Yu, Huihui; Liu, Touming; Yang, Gaiyu; Xing, Yongzhong

    2011-02-01

    The main objective of this study was to identify the genes causing etiolation in a rice mutant, the thylakoids of which were scattered. Three populations were employed to map the genes for etiolation using bulked segregant analysis. Genetic analysis confirmed that etiolation was controlled by two recessive genes, et11 and et12, which were fine mapped to an approximately 147-kb region and an approximately 209-kb region on the short arms of chromosomes 11 and 12, respectively. Both regions were within the duplicated segments on chromosomes 11 and 12. They possessed a highly similar sequence of 38 kb at the locations of a pair of duplicated genes with protein sequences very similar to that of HCF152 in Arabidopsis that are required for the processing of chloroplast RNA. These genes are likely the candidates for et11 and et12. Expression profiling was used to compare the expression patterns of paralogs in the duplicated segments. Expression profiling indicated that the duplicated segments had undergone concerted evolution, and a large number of the paralogs within the duplicated segments were functionally redundant like et11 and et12.

  6. Mosaic gene conversion after a tandem duplication of mtDNA sequence in Diomedeidae (albatrosses).

    PubMed

    Eda, Masaki; Kuro-o, Masaki; Higuchi, Hiroyoshi; Hasegawa, Hiroshi; Koike, Hiroko

    2010-04-01

    Although the tandem duplication of mitochondrial (mt) sequences, especially those of the control region (CR), has been detected in metazoan species, few studies have focused on the features of the duplicated sequence itself, such as the gene conversion rate, distribution patterns of the variation, and relative rates of evolution between the copies. To investigate the features of duplicated mt sequences, we partially sequenced the mt genome of 16 Phoebastria albatrosses belonging to three species (P. albatrus, P. nigripes, and P. immutabilis). More than 2,300 base pairs of tandemly-duplicated sequence were shared by all three species. The observed gene arrangement was shared in the three Phoebastria albatrosses and suggests that the duplication event occurred in the common ancestor of the three species. Most of the copies in each individual were identical or nearly identical, and were maintained through frequent gene conversions. By contrast, portions of CR domains I and III had different phylogenetic signals, suggesting that gene conversion had not occurred in those sections after the speciation of the three species. Several lines of data, including the heterogeneity of the rate of molecular evolution, nucleotide differences, and putative secondary structures, suggest that the two sequences in CR domain I are maintained through selection; however, additional studies into the mechanisms of gene conversion and mtDNA synthesis are required to confirm this hypothesis.

  7. The pro-opiomelanocortin genes in rainbow trout (Oncorhynchus mykiss): duplications, splice variants, and differential expression.

    PubMed

    Leder, E H; Silverstein, J T

    2006-02-01

    Pro-opiomelanocortin (POMC) is a precursor for several important peptide hormones involved in a variety of functions ranging from stress response to energy homeostasis. In mammals and fish, the POMC-derived peptide alpha-melanocyte stimulating hormone (MSH) is known to be involved in appetite suppression through its interaction with melanocortin-4 receptors. The details of energy homeostasis in fishes are beginning to be elucidated and many of the genes involved in mammalian neuroendocrine signaling pathways are being discovered in fish. In salmonid fishes such as the rainbow trout, genome duplication adds another degree of complexity when trying to compare gene function and homology with other vertebrates. This is true of the POMC gene. Two copies of the POMC gene were previously identified, A and B, presumably resulting from the salmonid duplication. However, while investigating POMC involvement in the feeding response of rainbow trout, a second copy of POMC-A was discovered which is more likely the result of the salmonid duplication and suggests that POMC-B is a duplicate resulting from the earlier teleost duplication prior to tetrapod divergence. The duplicated POMC-A had five deleted amino acids, five inserted amino acids, and 39 amino acid differences from the published POMC-A. In addition to the duplicate POMC-A, a splice variant of the published POMC-A sequence was also identified. Quantitative real-time PCR assays were developed for the different POMC transcripts, and expression was examined in a variety of tissues. Expression of POMC transcripts was highest in the pituitary for all POMC genes, but varied among other tissues for POMC-A1, POMC-A2, POMC-A2s, and POMC-B. POMC-A1 was the only transcript to respond significantly to food deprivation.

  8. The sulfatase gene family.

    PubMed

    Parenti, G; Meroni, G; Ballabio, A

    1997-06-01

    During the past few years, molecular analyses have provided important insights into the biochemistry and genetics of the sulfatase family of enzymes, identifying the molecular bases of inherited diseases caused by sulfatase deficiencies. New members of the sulfatase gene family have been identified in man and other species using a genomic approach. These include the gene encoding arylsulfatase E, which is involved in X-linked recessive chondrodysplasia punctata, a disorder of cartilage and bone development. Another important breakthrough has been the discovery of the biochemical basis of multiple sulfatase deficiency, an autosomal recessive disorder characterized by a severe deficiency of all sulfatase activities. These discoveries, together with the resolution of the crystallographic structure of sulfatases, have improved our understanding of the function and evolution of this fascinating family of enzymes.

  9. The mammalian alcohol dehydrogenase genome shows several gene duplications and gene losses resulting in a large set of different enzymes including pseudoenzymes.

    PubMed

    Östberg, Linus J; Persson, Bengt; Höög, Jan-Olov

    2015-06-05

    Mammalian alcohol dehydrogenase (ADH) is a protein family divided into six classes and the number of known family members is increasing rapidly. Several primate genomes are completely analyzed for the ADH region, where higher primates (human and hominoids) have seven genes of classes ADH1-ADH5. Within the group of non-hominoids apes there have been further duplications and species with more than the typical three isozymic forms for ADH1 are present. In contrast there are few completely analyzed ADH genomes in the non-primate group of mammals, where an additional class has been identified, ADH6, that has been lost during the evolution of primates. In this study 85 mammalian genomes with at least one ADH gene have been compiled. In total more than 500 ADH amino acid sequences were analyzed for patterns that distinguish the different classes. For ADH1-ADH4 intensive investigations have been performed both at the functional and at structural levels. However, a corresponding functional protein to the ADH5 gene, which is found in most ADH genomes, has never been detected. The same is true for ADH6, which is only present in non-primates. The entire mammalian ADH family shows a broad spectrum of gene duplications and gene losses where the numbers differ from six genes (most non-primate mammals) up to ten genes (vole). Included in these sets are examples of pseudogenes and pseudoenzymes.

  10. The Use of Duplication-Generating Rearrangements for Studying Heterokaryon Incompatibility Genes in Neurospora

    PubMed Central

    Perkins, David D.

    1975-01-01

    Heterokaryon (vegetative) incompatibility, governing the fusion of somatic hyphal filaments to form stable heterokaryons, is of interest because of its widespread occurrence in fungi and its bearing on cellular recognition. Conventional investigations of the genetic basis of heterokaryon incompatibility in N. crassa are difficult because in commonly used stocks differences are present at several het loci, all with similar incompatibility phenotypes. This difficulty is overcome by using duplications (partial diploids) that are unlikely to contain more than one het locus. A phenotypically expressed incompatibility reaction occurs when unlike het alleles are present within the same somatic nucleus, and this parallels the heterokaryon incompatibility reaction that occurs when unlike alleles in different haploid nuclei are introduced into the same somatic hypha by mycelial fusion.—Nontandem duplications were used to confirm that the incompatibility reactions in heterokaryons and in duplications are alternate expressions of the same genes. This was demonstrated for three loci which had previously been established by conventional heterokaryon tests—het-e, het-c and mt. These were each obtained in duplications as recombinant meiotic segregants from crosses heterozygous for duplication-generating chromosome rearrangements. The particular method of producing the duplications is irrelevant so long as the incompatibility alleles are heterozygous.—The duplication technique has made it possible to determine easily the het-e and het-c genotypes of numerous laboratory and wild strains of unknown constitution. In laboratory strains both loci are represented simply by two alleles. Analysis of het-c is more complicated in some wild strains, where differences have been demonstrated at one or more additional het loci within the duplication used and multiple allelism is also possible.—The results show that the duplication method can be used to identify and map additional

  11. GIPC gene family (Review).

    PubMed

    Katoh, Masaru

    2002-06-01

    GIPC1/GIPC/RGS19IP1, GIPC2, and GIPC3 genes constitute the human GIPC gene family. GIPC1 and GIPC2 show 62.0% total-amino-acid identity. GIPC1 and GIPC3 show 59.9% total-amino-acid identity. GIPC2 and GIPC3 show 55.3% total-amino-acid identity. GIPCs are proteins with a central PDZ domain and GIPC homology (GH1 and GH2) domains. PDZ, GH1, and GH2 domains are conserved among human GIPCs, Xenopus GIPC/Kermit, and Drosophila GIPC/LP09416. Bioinformatics revealed that GIPC genes are linked to prostanoid receptor genes and DNAJB genes in the human genome as follows: GIPC1 gene is linked to prostaglandin E receptor 1 (PTGER1) gene and DNAJB1 gene in human chromosome 19p13.2-p13.1 region; GIPC2 gene to prostaglandin F receptor (PTGFR) gene and DNAJB4 gene in human chromosome 1p31.1-p22.3 region; GIPC3 gene to thromboxane A2 receptor (TBXA2R) gene in human chromosome 19p13.3 region. GIPC1 and GIPC2 mRNAs are expressed together in OKAJIMA, TMK1, MKN45 and KATO-III cells derived from diffuse-type gastric cancer, and are up-regulated in several cases of primary gastric cancer. The PDZ domain of GIPC family proteins interacts with Frizzled-3 (FZD3) class of WNT receptor, insulin-like growth factor-I (IGF1) receptor, receptor tyrosine kinase TrkA, TGF-beta type III receptor (TGF-beta RIII), integrin alpha6A subunit, transmembrane glycoprotein 5T4, and RGS19/RGS-GAIP. Because RGS19 is a member of the RGS family that regulates heterotrimeric G-protein signaling, GIPCs might be scaffold proteins linking heterotrimeric G-proteins to seven-transmembrane-type WNT receptor or to receptor tyrosine kinases. Therefore, GIPC1, GIPC2 and GIPC3 might play key roles in carcinogenesis and embryogenesis through modulation of growth factor signaling and cell adhesion.

  12. Heterogeneous expression pattern of tandem duplicated sHsps genes during fruit ripening in two tomato species

    NASA Astrophysics Data System (ADS)

    Arce, DP; Krsticevic, FJ; Ezpeleta, J.; Ponce, SD; Pratta, GR; Tapia, E.

    2016-04-01

    The small heat shock proteins (sHSPs) have been found to play a critical role under physiological stress conditions by protecting proteins from irreversible aggregation. To characterize the gene expression profile of four sHsps with a tandem gene structure arrangement in the domesticated Solanum lycopersicum (Heinz 1706) genome and its wild close relative Solanum pimpinellifolium (LA1589), differential gene expression analysis using RNA-Seq was conducted at three ripening stages in the fruits of both cultivars. Gene promoter analysis was performed to explain the heterogeneous pattern of gene expression found for these tandem duplicated sHsps. The in silico results help to refocus wet-lab experimental analysis of tomato sHsp family proteins.

  13. Lineage-Specific Gene Duplication and Loss in Human and Great Ape Evolution

    PubMed Central

    MacLaren, Erik; Marshall, Kriste; Hahn, Gretchen; Meltesen, Lynne; Brenton, Matthew; Hink, Raquel; Burgers, Sonya; Hernandez-Boussard, Tina; Karimpour-Fard, Anis; Glueck, Deborah; McGavran, Loris; Berry, Rebecca

    2004-01-01

    Given that gene duplication is a major driving force of evolutionary change and the key mechanism underlying the emergence of new genes and biological processes, this study sought to use a novel genome-wide approach to identify genes that have undergone lineage-specific duplications or contractions among several hominoid lineages. Interspecies cDNA array-based comparative genomic hybridization was used to individually compare copy number variation for 39,711 cDNAs, representing 29,619 human genes, across five hominoid species, including human. We identified 1,005 genes, either as isolated genes or in clusters positionally biased toward rearrangement-prone genomic regions, that produced relative hybridization signals unique to one or more of the hominoid lineages. Measured as a function of the evolutionary age of each lineage, genes showing copy number expansions were most pronounced in human (134) and include a number of genes thought to be involved in the structure and function of the brain. This work represents, to our knowledge, the first genome-wide gene-based survey of gene duplication across hominoid species. The genes identified here likely represent a significant majority of the major gene copy number changes that have occurred over the past 15 million years of human and great ape evolution and are likely to underlie some of the key phenotypic characteristics that distinguish these species. PMID:15252450

  14. Duplicate gene expression in allopolyploid Gossypium reveals two temporally distinct phases of expression evolution

    PubMed Central

    Flagel, Lex; Udall, Joshua; Nettleton, Dan; Wendel, Jonathan

    2008-01-01

    Background Polyploidy has played a prominent role in shaping the genomic architecture of the angiosperms. Through allopolyploidization, several modern Gossypium (cotton) species contain two divergent, although largely redundant genomes. Owing to this redundancy, these genomes can play host to an array of evolutionary processes that act on duplicate genes. Results We compared homoeolog (genes duplicated by polyploidy) contributions to the transcriptome of a natural allopolyploid and a synthetic interspecific F1 hybrid, both derived from a merger between diploid species from the Gossypium A-genome and D-genome groups. Relative levels of A- and D-genome contributions to the petal transcriptome were determined for 1,383 gene pairs. This comparison permitted partitioning of homoeolog expression biases into those arising from genomic merger and those resulting from polyploidy. Within allopolyploid Gossypium, approximately 24% of the genes with biased (unequal contributions from the two homoeologous copies) expression patterns are inferred to have arisen as a consequence of genomic merger, indicating that a substantial fraction of homoeolog expression biases occur instantaneously with hybridization. The remaining 76% of biased homoeologs reflect long-term evolutionary forces, such as duplicate gene neofunctionalization and subfunctionalization. Finally, we observed a greater number of genes biased toward the paternal D-genome and that expression biases have tended to increase during allopolyploid evolution. Conclusion Our results indicate that allopolyploidization entails significant homoeolog expression modulation, both immediately as a consequence of genomic merger, and secondarily as a result of long-term evolutionary transformations in duplicate gene expression. PMID:18416842

  15. Evolutionary analyses of non-family genes in plants

    SciTech Connect

    Ye, Chuyu; Li, Ting; Yin, Hengfu; Weston, David; Tuskan, Gerald A; Tschaplinski, Timothy J; Yang, Xiaohan

    2013-01-01

    There are a large number of non-family (NF) genes that do not cluster into families with three or more members per genome. While gene families have been extensively studied, a systematic analysis of NF genes has not been reported. We performed comparative studies on NF genes in 14 plant species. Based on the clustering of protein sequences, we identified ~94,000 NF genes across these species that were divided into five evolutionary groups: Viridiplantae-wide, angiosperm-specific, monocot-specific, dicot-specific, and those that were species-specific. Our analysis revealed that the NF genes resulted largely from less frequent gene duplications and/or a higher rate of gene loss after segmental duplication relative to genes in both low-copy-number families (LF; 3–10 copies per genome) and high-copy-number families (HF; >10 copies). Furthermore, we identified functions enriched in the NF gene set as compared with the HF genes. We found that NF genes were involved in essential biological processes shared by all plant lineages (e.g., photosynthesis and translation), as well as gene regulation and stress responses associated with phylogenetic diversification. In particular, our analysis of an Arabidopsis protein-protein interaction network revealed that hub proteins with the top 10% most connections were over-represented in the NF set relative to the HF set. This research highlights the roles that NF genes may play in evolutionary and functional genomics research.
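
    The copy-number classes used in this abstract can be written as a small rule. The following is a minimal sketch, assuming that each family is summarized by its copy number in a single genome; the function name, the family names, and the cluster sizes are hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_family(copies: int) -> str:
    """Label a gene family by its copy number in a single genome."""
    if copies < 1:
        raise ValueError("a family needs at least one gene")
    if copies < 3:
        return "NF"  # non-family: fewer than three members per genome
    if copies <= 10:
        return "LF"  # low-copy-number family: 3-10 copies
    return "HF"      # high-copy-number family: more than 10 copies


# Hypothetical cluster sizes for one genome (illustrative values only).
family_sizes = {"fam_a": 1, "fam_b": 2, "fam_c": 3, "fam_d": 10, "fam_e": 11}
labels = {name: classify_family(n) for name, n in family_sizes.items()}
counts = Counter(labels.values())
```

    The thresholds follow the definitions quoted in the abstract; how the protein clusters themselves were built is not described here and is outside this sketch.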

  16. Evolutionary analyses of non-family genes in plants

    SciTech Connect

    Ye, Chuyu; Li, Ting; Yin, Hengfu; Weston, David; Tuskan, Gerald A; Tschaplinski, Timothy J; Yang, Xiaohan

    2013-03-01

    There are a large number of non-family (NF) genes that do not cluster into families with three or more members per genome. While gene families have been extensively studied, a systematic analysis of NF genes has not been reported. We performed comparative studies on NF genes in 14 plant species. Based on the clustering of protein sequences, we identified ~94,000 NF genes across these species that were divided into five evolutionary groups: Viridiplantae-wide, angiosperm-specific, monocot-specific, dicot-specific, and those that were species-specific. Our analysis revealed that the NF genes resulted largely from less frequent gene duplications and/or a higher rate of gene loss after segmental duplication relative to genes in both low-copy-number families (LF; 3–10 copies per genome) and high-copy-number families (HF; >10 copies). Furthermore, we identified functions enriched in the NF gene set as compared with the HF genes. We found that NF genes were involved in essential biological processes shared by all plant lineages (e.g., photosynthesis and translation), as well as gene regulation and stress responses associated with phylogenetic diversification. In particular, our analysis of an Arabidopsis protein-protein interaction network revealed that hub proteins with the top 10% most connections were over-represented in the NF set relative to the HF set. This research highlights the roles that NF genes may play in evolutionary and functional genomics research.

  17. An Efficient Algorithm for Gene/Species Trees Parsimonious Reconciliation with Losses, Duplications and Transfers

    NASA Astrophysics Data System (ADS)

    Doyon, Jean-Philippe; Scornavacca, Celine; Gorbunov, K. Yu.; Szöllősi, Gergely J.; Ranwez, Vincent; Berry, Vincent

    Tree reconciliation methods aim at estimating the evolutionary events that cause discrepancy between gene trees and species trees. We provide a discrete computational model that considers duplications, transfers and losses of genes. The model yields a fast and exact algorithm to infer time-consistent and most parsimonious reconciliations. Then we study the conditions under which parsimony is able to accurately infer such events. Overall, it performs well even under realistic rates, transfers being in general less accurately recovered than duplications. An implementation is freely available at http://www.atgc-montpellier.fr/MPR.
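
    To illustrate what reconciling a gene tree with a species tree involves, the sketch below uses the classical last-common-ancestor (LCA) mapping, which infers duplications only. It is not the algorithm of this paper: transfers, loss counting and time consistency are all ignored. Trees are assumed to be nested 2-tuples whose leaves are species names, species names are assumed to be unique in the species tree, and all function names are hypothetical.

```python
def leaves(tree):
    """Return the set of species labels found under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)


def lca(species_tree, wanted):
    """Return the smallest species subtree whose leaves contain `wanted`."""
    if not isinstance(species_tree, str):
        for child in species_tree:
            if wanted <= leaves(child):
                return lca(child, wanted)
    return species_tree


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that the LCA mapping labels as duplications."""
    if isinstance(gene_tree, str):
        return 0
    left, right = gene_tree
    here = lca(species_tree, leaves(gene_tree))
    child_maps = [lca(species_tree, leaves(c)) for c in (left, right)]
    # A node is a duplication if it maps to the same species node as a child.
    is_dup = any(m == here for m in child_maps)
    return (int(is_dup)
            + count_duplications(left, species_tree)
            + count_duplications(right, species_tree))


species = (("A", "B"), "C")
congruent = count_duplications((("A", "B"), "C"), species)
discordant = count_duplications((("A", "B"), ("A", "C")), species)
```

    In this toy input the gene tree that matches the species tree needs no duplication, whereas the discordant tree needs one at its root. Recomputing leaf sets at every node keeps the sketch short but is quadratic; a practical implementation would precompute them.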

  18. Gene duplication can impart fragility, not robustness, in the yeast protein interaction network.

    PubMed

    Diss, Guillaume; Gagnon-Arsenault, Isabelle; Dion-Coté, Anne-Marie; Vignaud, Hélène; Ascencio, Diana I; Berger, Caroline M; Landry, Christian R

    2017-02-10

    The maintenance of duplicated genes is thought to protect cells from genetic perturbations, but the molecular basis of this robustness is largely unknown. By measuring the interaction of yeast proteins with their partners in wild-type cells and in cells lacking a paralog, we found that 22 out of 56 paralog pairs compensate for the lost interactions. An equivalent number of pairs exhibit the opposite behavior and require each other's presence for maintaining their interactions. These dependent paralogs generally interact physically, regulate each other's abundance, and derive from ancestral self-interacting proteins. This reveals that gene duplication may actually increase mutational fragility instead of robustness in a large number of cases.

  19. A gene duplication affecting expression of the ovine ASIP gene is responsible for white and black sheep

    PubMed Central

    Norris, Belinda J.; Whan, Vicki A.

    2008-01-01

    Agouti signaling protein (ASIP) functions to regulate pigmentation in mice, while its role in many other animals and in humans has not been fully determined. In this study, we identify a 190-kb tandem duplication encompassing the ovine ASIP and AHCY coding regions and the ITCH promoter region as the genetic cause of white coat color of dominant white/tan (AWt) agouti sheep. The duplication 5′ breakpoint is located upstream of the ASIP coding sequence. Ubiquitous expression of a second copy of the ASIP coding sequence regulated by a duplicated copy of the nearby ITCH promoter causes the white sheep phenotype. A single copy ASIP gene with a silenced ASIP promoter occurs in recessive black sheep. In contrast, a single copy functional wild-type (A+) ASIP is responsible for the ancient Barbary sheep coat color phenotype. The gene duplication was facilitated by homologous recombination between two non-LTR SINE sequences flanking the duplicated segment. This is the first sheep trait attributable to gene duplication and shows nonallelic homologous recombination and gene conversion events at the ovine ASIP locus could have an important role in the evolution of sheep pigmentation. PMID:18493018

  20. Opossum carboxylesterases: sequences, phylogeny and evidence for CES gene duplication events predating the marsupial-eutherian common ancestor

    PubMed Central

    2008-01-01

    Background Carboxylesterases (CES) perform diverse metabolic roles in mammalian organisms in the detoxification of a broad range of drugs and xenobiotics and may also serve in specific roles in lipid, cholesterol, pheromone and lung surfactant metabolism. Five CES families have been reported in mammals with human CES1 and CES2 the most extensively studied. Here we describe the genetics, expression and phylogeny of CES isozymes in the opossum and report on the sequences and locations of CES1, CES2 and CES6 'like' genes within two gene clusters on chromosome one. We also discuss the likely sequence of gene duplication events generating multiple CES genes during vertebrate evolution. Results We report a cDNA sequence for an opossum CES and present evidence for CES1 and CES2 like genes expressed in opossum liver and intestine and for distinct gene locations of five opossum CES genes, CES1, CES2.1, CES2.2, CES2.3 and CES6, on chromosome 1. Phylogenetic and sequence alignment studies compared the predicted amino acid sequences for opossum CES with those for human, mouse, chicken, frog, salmon and Drosophila CES gene products. Phylogenetic analyses produced congruent phylogenetic trees depicting a rapid early diversification into at least five distinct CES gene family clusters: CES2, CES1, CES7, CES3, and CES6. Molecular divergence estimates based on a Bayesian relaxed clock approach revealed an origin for the five mammalian CES gene families between 328–378 MYA. Conclusion The deduced amino acid sequence for an opossum cDNA was consistent with its identity as a mammalian CES2 gene product (designated CES2.1). Distinct gene locations for opossum CES1 (1: 446,222,550–446,274,850), three CES2 genes (1: 677,773,395–677,927,030) and a CES6 gene (1: 677,585,520–677,730,419) were observed on chromosome 1. Opossum CES1 and multiple CES2 genes were expressed in liver and intestine. Amino acid sequences for opossum CES1 and three CES2 gene products revealed conserved

  1. Opossum carboxylesterases: sequences, phylogeny and evidence for CES gene duplication events predating the marsupial-eutherian common ancestor.

    PubMed

    Holmes, Roger S; Chan, Jeannie; Cox, Laura A; Murphy, William J; VandeBerg, John L

    2008-02-20

    Carboxylesterases (CES) perform diverse metabolic roles in mammalian organisms in the detoxification of a broad range of drugs and xenobiotics and may also serve in specific roles in lipid, cholesterol, pheromone and lung surfactant metabolism. Five CES families have been reported in mammals with human CES1 and CES2 the most extensively studied. Here we describe the genetics, expression and phylogeny of CES isozymes in the opossum and report on the sequences and locations of CES1, CES2 and CES6 'like' genes within two gene clusters on chromosome one. We also discuss the likely sequence of gene duplication events generating multiple CES genes during vertebrate evolution. We report a cDNA sequence for an opossum CES and present evidence for CES1 and CES2 like genes expressed in opossum liver and intestine and for distinct gene locations of five opossum CES genes, CES1, CES2.1, CES2.2, CES2.3 and CES6, on chromosome 1. Phylogenetic and sequence alignment studies compared the predicted amino acid sequences for opossum CES with those for human, mouse, chicken, frog, salmon and Drosophila CES gene products. Phylogenetic analyses produced congruent phylogenetic trees depicting a rapid early diversification into at least five distinct CES gene family clusters: CES2, CES1, CES7, CES3, and CES6. Molecular divergence estimates based on a Bayesian relaxed clock approach revealed an origin for the five mammalian CES gene families between 328-378 MYA. The deduced amino acid sequence for an opossum cDNA was consistent with its identity as a mammalian CES2 gene product (designated CES2.1). Distinct gene locations for opossum CES1 (1: 446,222,550-446,274,850), three CES2 genes (1: 677,773,395-677,927,030) and a CES6 gene (1: 677,585,520-677,730,419) were observed on chromosome 1. Opossum CES1 and multiple CES2 genes were expressed in liver and intestine. Amino acid sequences for opossum CES1 and three CES2 gene products revealed conserved residues previously reported for human CES1

  2. Pinda: a web service for detection and analysis of intraspecies gene duplication events.

    PubMed

    Kontopoulos, Dimitrios-Georgios; Glykos, Nicholas M

    2013-09-01

    We present Pinda, a Web service for the detection and analysis of possible duplications of a given protein or DNA sequence within a source species. Pinda fully automates the whole gene duplication detection procedure, from performing the initial similarity searches, to generating the multiple sequence alignments and the corresponding phylogenetic trees, to bootstrapping the trees and producing a Z-score-based list of duplication candidates for the input sequence. Pinda has been cross-validated using an extensive set of known and bibliographically characterized duplication events. The service facilitates the automatic and dependable identification of gene duplication events, using some of the most successful bioinformatics software to perform an extensive analysis protocol. Pinda will prove of use for the analysis of newly discovered genes and proteins, thus also assisting the study of recently sequenced genomes. The service's location is http://orion.mbg.duth.gr/Pinda. The source code is freely available via https://github.com/dgkontopoulos/Pinda/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  3. Functional Characterization of Duplicated SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1-Like Genes in Petunia

    PubMed Central

    Preston, Jill C.; Jorgensen, Stacy A.; Jha, Suryatapa G.

    2014-01-01

    Flowering time is strictly controlled by a combination of internal and external signals that match seed set with favorable environmental conditions. In the model plant species Arabidopsis thaliana (Brassicaceae), many of the genes underlying development and evolution of flowering have been discovered. However, much remains unknown about how conserved the flowering gene networks are in plants with different growth habits, gene duplication histories, and distributions. Here we functionally characterize three homologs of the flowering gene SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) in the short-lived perennial Petunia hybrida (petunia, Solanaceae). Similar to A. thaliana soc1 mutants, co-silencing of duplicated petunia SOC1-like genes results in late flowering. This phenotype is most severe when all three SOC1-like genes are silenced. Furthermore, expression levels of the SOC1-like genes UNSHAVEN (UNS) and FLORAL BINDING PROTEIN 21 (FBP21), but not FBP28, are positively correlated with developmental age. In contrast to A. thaliana, petunia SOC1-like gene expression did not increase with longer photoperiods, and FBP28 transcripts were actually more abundant under short days. Despite evidence of functional redundancy, differential spatio-temporal expression data suggest that SOC1-like genes might fine-tune petunia flowering in response to photoperiod and developmental stage. This likely resulted from modification of SOC1-like gene regulatory elements following recent duplication, and is a possible mechanism to ensure flowering under both inductive and non-inductive photoperiods. PMID:24787903

  4. Duplication of the bcr and gamma-glutamyl transpeptidase genes.

    PubMed Central

    Heisterkamp, N; Groffen, J

    1988-01-01

    The Philadelphia (Ph') translocation involves rearrangement of the bcr gene located on chromosome 22. Hybridization experiments revealed the presence of multiple bcr gene-related loci within the human genome. Two of these were molecularly cloned and characterized. Both loci contain exons and introns corresponding to the 3' region of the bcr gene. Restriction enzyme and DNA sequence analysis indicate a very high degree of conservation between bcr and the two related genomic sequences. Both bcr-related loci are located on chromosome 22, one centromeric, the other telomeric, of the bcr gene. Within the two bcr-related genomic sequences, fragments or the complete coding sequences of an unrelated gene were found to be present. This gene was identified; it encodes gamma-glutamyl transferase, an enzyme involved in glutathione metabolism. PMID:2901712

  5. Inferring the Recent Duplication History of a Gene Cluster

    NASA Astrophysics Data System (ADS)

    Song, Giltae; Zhang, Louxin; Vinař, Tomáš; Miller, Webb

    Much important evolutionary activity occurs in gene clusters, where a copy of a gene may be free to evolve new functions. Computational methods to extract evolutionary information from sequence data for such clusters are currently imperfect, in part because accurate sequence data are often lacking in these genomic regions, making the existing methods difficult to apply. We describe a new method for reconstructing the recent evolutionary history of gene clusters. The method’s performance is evaluated on simulated data and on actual human gene clusters.

  6. Positive selection and ancient duplications in the evolution of class B floral homeotic genes of orchids and grasses.

    PubMed

    Mondragón-Palomino, Mariana; Hiese, Luisa; Härter, Andrea; Koch, Marcus A; Theissen, Günter

    2009-04-21

    Positive selection is recognized as the prevalence of nonsynonymous over synonymous substitutions in a gene. Models of the functional evolution of duplicated genes consider neofunctionalization as key to the retention of paralogues. For instance, duplicate transcription factors are specifically retained in plant and animal genomes and both positive selection and transcriptional divergence appear to have played a role in their diversification. However, the relative impact of these two factors has not been systematically evaluated. Class B MADS-box genes, comprising DEF-like and GLO-like genes, encode developmental transcription factors essential for establishment of perianth and male organ identity in the flowers of angiosperms. Here, we contrast the role of positive selection and the known divergence in expression patterns of genes encoding class B-like MADS-box transcription factors from monocots, with emphasis on the family Orchidaceae and the order Poales. Although in the monocots these two groups are highly diverse and have a strongly canalized floral morphology, there is no information on the role of positive selection in the evolution of their distinctive flower morphologies. Published research shows that in Poales, class B-like genes are expressed in stamens and in lodicules, the perianth organs whose identity might also be specified by class B-like genes, like the identity of the inner tepals of their lily-like relatives. In orchids, however, the number and pattern of expression of class B-like genes have greatly diverged. The DEF-like genes from Orchidaceae form four well-supported, ancient clades of orthologues. In contrast, orchid GLO-like genes form a single clade of ancient orthologues and recent paralogues. DEF-like genes from orchid clade 2 (OMADS3-like genes) are under less stringent purifying selection than the other orchid DEF-like and GLO-like genes. In comparison with orchids, purifying selection was less stringent in DEF-like and GLO-like genes
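
    The opening definition of this abstract, positive selection as an excess of nonsynonymous over synonymous substitutions, is usually expressed as the ratio of the two substitution rates (Ka/Ks, also written dN/dS). The sketch below is a minimal illustration of that ratio only; the function name, the tolerance band around 1 and the example rates are assumptions, and the study's own statistical tests are not reproduced here.

```python
def selection_regime(ka: float, ks: float, tol: float = 0.05) -> str:
    """Classify a gene by its Ka/Ks ratio (nonsynonymous over synonymous rate)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega > 1 + tol:
        return "positive"   # nonsynonymous changes prevail
    if omega < 1 - tol:
        return "purifying"  # nonsynonymous changes are removed
    return "neutral"        # ratio is close to 1 (tol is an arbitrary band)


# Hypothetical rate pairs (ka, ks), for illustration only.
examples = {"gene_x": (0.02, 0.10), "gene_y": (0.30, 0.10), "gene_z": (0.10, 0.10)}
regimes = {name: selection_regime(ka, ks) for name, (ka, ks) in examples.items()}
```

    A lower ratio within the purifying range corresponds to what the abstract calls more stringent purifying selection. Estimating Ka and Ks from an alignment is a separate step that this sketch does not cover.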

  7. Positive selection and ancient duplications in the evolution of class B floral homeotic genes of orchids and grasses

    PubMed Central

    Mondragón-Palomino, Mariana; Hiese, Luisa; Härter, Andrea; Koch, Marcus A; Theißen, Günter

    2009-01-01

    Background: Positive selection is recognized as the prevalence of nonsynonymous over synonymous substitutions in a gene. Models of the functional evolution of duplicated genes consider neofunctionalization as key to the retention of paralogues. For instance, duplicate transcription factors are specifically retained in plant and animal genomes and both positive selection and transcriptional divergence appear to have played a role in their diversification. However, the relative impact of these two factors has not been systematically evaluated. Class B MADS-box genes, comprising DEF-like and GLO-like genes, encode developmental transcription factors essential for establishment of perianth and male organ identity in the flowers of angiosperms. Here, we contrast the role of positive selection and the known divergence in expression patterns of genes encoding class B-like MADS-box transcription factors from monocots, with emphasis on the family Orchidaceae and the order Poales. Although in the monocots these two groups are highly diverse and have a strongly canalized floral morphology, there is no information on the role of positive selection in the evolution of their distinctive flower morphologies. Published research shows that in Poales, class B-like genes are expressed in stamens and in lodicules, the perianth organs whose identity might also be specified by class B-like genes, like the identity of the inner tepals of their lily-like relatives. In orchids, however, the number and pattern of expression of class B-like genes have greatly diverged. Results: The DEF-like genes from Orchidaceae form four well-supported, ancient clades of orthologues. In contrast, orchid GLO-like genes form a single clade of ancient orthologues and recent paralogues. DEF-like genes from orchid clade 2 (OMADS3-like genes) are under less stringent purifying selection than the other orchid DEF-like and GLO-like genes. In comparison with orchids, purifying selection was less stringent in DEF
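
    The abstract above defines positive selection as an excess of nonsynonymous over synonymous substitutions. A minimal Python sketch of that idea follows, assuming per-site rates dN and dS have already been estimated elsewhere. The function names and the tolerance band around 1 are illustrative assumptions, not the authors' method; published analyses typically rely on likelihood-based tests rather than a fixed cutoff.

```python
def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio (nonsynonymous over synonymous rate)."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Label the selective regime implied by dN/dS.

    tol is a hypothetical tolerance band around 1, used here only
    to avoid calling tiny deviations from neutrality significant.
    """
    w = omega(dn, ds)
    if w > 1 + tol:
        return "positive"   # nonsynonymous changes prevail
    if w < 1 - tol:
        return "purifying"  # nonsynonymous changes are removed
    return "neutral"


# Made-up rates: two genes can both be under purifying selection,
# with the larger ratio indicating less stringent constraint.
strict = omega(0.02, 0.20)
relaxed = omega(0.08, 0.20)
print(classify_selection(0.02, 0.20), classify_selection(0.08, 0.20), relaxed > strict)
```

    In this framing, "less stringent purifying selection" corresponds to a ratio that is higher but still below 1.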

  8. Genome-wide analysis of homeobox genes from Mesobuthus martensii reveals Hox gene duplication in scorpions.

    PubMed

    Di, Zhiyong; Yu, Yao; Wu, Yingliang; Hao, Pei; He, Yawen; Zhao, Huabin; Li, Yixue; Zhao, Guoping; Li, Xuan; Li, Wenxin; Cao, Zhijian

    2015-06-01

    Homeobox genes belong to a large gene group, which encodes the famous DNA-binding homeodomain that plays a key role in development and cellular differentiation during embryogenesis in animals. Here, one hundred forty-nine homeobox genes were identified from the Asian scorpion, Mesobuthus martensii (Chelicerata: Arachnida: Scorpiones: Buthidae) based on our newly assembled genome sequence with approximately 248× coverage. The identified homeobox genes were categorized into eight classes including 82 families: 67 ANTP class genes, 33 PRD genes, 11 LIM genes, five POU genes, six SINE genes, 14 TALE genes, five CUT genes, two ZF genes and six unclassified genes. Transcriptome data confirmed that more than half of the genes were expressed in adults. The homeobox gene diversity of the eight classes is similar to the previously analyzed Mandibulata arthropods. Interestingly, it is hypothesized that the scorpion M. martensii may have two Hox clusters. The first complete genome-wide analysis of homeobox genes in Chelicerata not only reveals the repertoire of scorpion, arachnid and chelicerate homeobox genes, but also shows some insights into the evolution of arthropod homeobox genes.

  9. Gene Duplication and Gene Expression Changes Play a Role in the Evolution of Candidate Pollen Feeding Genes in Heliconius Butterflies.

    PubMed

    Smith, Gilbert; Macias-Muñoz, Aide; Briscoe, Adriana D

    2016-09-02

    Heliconius possess a unique ability among butterflies to feed on pollen. Pollen feeding significantly extends their lifespan, and is thought to have been important to the diversification of the genus. We used RNA sequencing to examine feeding-related gene expression in the mouthparts of four species of Heliconius and one nonpollen feeding species, Eueides isabella. We hypothesized that genes involved in morphology and protein metabolism might be upregulated in Heliconius because they have longer proboscides than Eueides, and because pollen contains more protein than nectar. Using de novo transcriptome assemblies, we tested these hypotheses by comparing gene expression in mouthparts against antennae and legs. We first looked for genes upregulated in mouthparts across all five species and discovered several hundred genes, many of which had functional annotations involving metabolism of proteins (cocoonase), lipids, and carbohydrates. We then looked specifically within Heliconius where we found eleven common upregulated genes with roles in morphology (CPR cuticle proteins), behavior (takeout-like), and metabolism (luciferase-like). Closer examination of these candidates revealed that cocoonase underwent several duplications along the lineage leading to heliconiine butterflies, including two Heliconius-specific duplications. Luciferase-like genes also underwent duplication within lepidopterans, and upregulation in Heliconius mouthparts. Reverse-transcription PCR confirmed that three cocoonases, a peptidase, and one luciferase-like gene are expressed in the proboscis with little to no expression in labial palps and salivary glands. Our results suggest pollen feeding, like other dietary specializations, was likely facilitated by adaptive expansions of preexisting genes, and that the butterfly proboscis is involved in digestive enzyme production.

  10. Independent and Parallel Evolution of New Genes by Gene Duplication in Two Origins of C4 Photosynthesis Provides New Insight into the Mechanism of Phloem Loading in C4 Species

    PubMed Central

    Emms, David M.; Covshoff, Sarah; Hibberd, Julian M.; Kelly, Steven

    2016-01-01

    C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Furthermore, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species. Key words: C4 photosynthesis, gene duplication, gene families, parallel evolution. PMID:27016024

  11. Independent and Parallel Evolution of New Genes by Gene Duplication in Two Origins of C4 Photosynthesis Provides New Insight into the Mechanism of Phloem Loading in C4 Species.

    PubMed

    Emms, David M; Covshoff, Sarah; Hibberd, Julian M; Kelly, Steven

    2016-07-01

    C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Furthermore, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species. Key words: C4 photosynthesis, gene duplication, gene families, parallel evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Segmental Duplication, Microinversion, and Gene Loss Associated with a Complex Inversion Breakpoint Region in Drosophila

    PubMed Central

    Calvete, Oriol; González, Josefa; Betrán, Esther; Ruiz, Alfredo

    2012-01-01

    Chromosomal inversions are usually portrayed as simple two-breakpoint rearrangements changing gene order but not gene number or structure. However, increasing evidence suggests that inversion breakpoints may often have a complex structure and entail gene duplications with potential functional consequences. Here, we used a combination of different techniques to investigate the breakpoint structure and the functional consequences of a complex rearrangement fixed in Drosophila buzzatii and comprising two tandemly arranged inversions sharing the middle breakpoint: 2m and 2n. By comparing the sequence in the breakpoint regions between D. buzzatii (inverted chromosome) and D. mojavensis (noninverted chromosome), we corroborate the breakpoint reuse at the molecular level and infer that inversion 2m was associated with a duplication of a ∼13 kb segment and likely generated by staggered breaks plus repair by nonhomologous end joining. The duplicated segment contained the gene CG4673, involved in nuclear transport, and its two nested genes CG5071 and CG5079. Interestingly, we found that other than the inversion and the associated duplication, both breakpoints suffered additional rearrangements, that is, the proximal breakpoint experienced a microinversion event associated at both ends with a 121-bp long duplication that contains a promoter. As a consequence of all these different rearrangements, CG5079 has been lost from the genome, CG5071 is now a single copy nonnested gene, and CG4673 has a transcript ∼9 kb shorter and seems to have acquired a more complex gene regulation. Our results illustrate the complex effects of chromosomal rearrangements and highlight the need of complementing genomic approaches with detailed sequence-level and functional analyses of breakpoint regions if we are to fully understand genome structure, function, and evolutionary dynamics. PMID:22328714

  13. Genomic characterization of ribitol teichoic acid synthesis in Staphylococcus aureus: genes, genomic organization and gene duplication.

    PubMed

    Qian, Ziliang; Yin, Yanbin; Zhang, Yong; Lu, Lingyi; Li, Yixue; Jiang, Ying

    2006-04-05

    Staphylococcus aureus, or MRSA (methicillin-resistant S. aureus), is an acquired pathogen and the primary cause of nosocomial infections worldwide. In S. aureus, teichoic acid is an essential component of the cell wall, and its biosynthesis is not yet well characterized. Studies in Bacillus subtilis have discovered two different pathways of teichoic acid biosynthesis, in two strains W23 and 168 respectively, namely teichoic acid ribitol (tar) and teichoic acid glycerol (tag). The genes involved in these two pathways are also characterized: tarA, tarB, tarD, tarI, tarJ, tarK, tarL for the tar pathway, and tagA, tagB, tagD, tagE, tagF for the tag pathway. With the genome sequences of several MRSA strains (Mu50, MW2, N315, MRSA252, COL) as well as the methicillin-susceptible strain MSSA476 available, a comparative genomic analysis was performed to characterize teichoic acid biosynthesis in these S. aureus strains. We identified all S. aureus tar and tag gene orthologs in the selected S. aureus strains which would contribute to teichoic acid synthesis. Based on our identification of genes orthologous to tarI, tarJ, tarL, which are specific to the tar pathway in B. subtilis W23, we also concluded that tar is the major teichoic acid biogenesis pathway in S. aureus. Further analyses indicated that the S. aureus tar genes, different from the divergon organization in B. subtilis, are organized into several clusters in cis. Most interestingly, compared with genes in the B. subtilis tar pathway, the S. aureus tar-specific genes (tarI, tarJ, tarL) are duplicated in all six S. aureus genomes. In the S. aureus strains we analyzed, tar (teichoic acid ribitol) is the main teichoic acid biogenesis pathway. The tar genes are organized into several genomic groups in cis and the genes specific to tar (relative to tag), tarI, tarJ and tarL, are duplicated. The genomic organization of the S. aureus tar pathway suggests their regulations are different when compared to B. subtilis tar or tag pathway, which are

  14. Alcohol Dehydrogenase in the Diploid Plant STEPHANOMERIA EXIGUA (Compositae): Gene Duplication, Mode of Inheritance and Linkage

    PubMed Central

    Roose, M. L.; Gottlieb, L. D.

    1980-01-01

    Study of the biochemical genetics of alcohol dehydrogenase (ADH) in the annual plant Stephanomeria exigua (Compositae) revealed that the isozymes are specified by a small family of tightly linked structural genes. One set of ADH isozymes (ADH-1) was induced in roots by flooding, and was also expressed in thickened unflooded tap roots, stems, ovaries and seeds. As in other plants, the enzymes are dimeric and form homo- and heterodimers. An electrophoretic survey of ADH-1 phenotypes in two natural populations revealed seven different ADH-1 homodimers in various phenotypes having one to eight enzyme bands. Genetic analysis of segregations from crosses involving 59 plants showed that the ADH-1 isozymes are inherited as a single Mendelian unit, Adh1. Adh1 is polymorphic for forms that specify one, two, or three different ADH-1 subunits (which combine to form homo- and heterodimers), and are expressed co-dominantly in all genotypic combinations. Staining intensity of enzymes extracted from various homozygous and heterozygous plants indicated that the different subunit types specified by Adh1 are produced in approximately equal amounts. These observations suggest that Adh1 is a compound locus consisting of one to several tightly linked (0 recombinants among 579 testcross progeny), coordinately expressed structural genes. The genes in the two triplications also occur in various duplicate complexes and thus could have originated via unequal crossing over. The ADH-2 isozyme found in pollen and seeds is apparently specified by a different gene, Adh2. Adh1 and Adh2 are tightly linked (0 recombinants among 81 testcross progeny). PMID:17249032
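
    The linkage statements above rest on observing zero recombinants among a fixed number of testcross progeny. As a worked illustration only, the Python sketch below turns such a count into an upper confidence bound on the recombination fraction. It assumes independent progeny and a simple binomial model, which is our assumption rather than the analysis reported in the abstract; the function name is hypothetical.

```python
def recombination_upper_bound(n_progeny: int, alpha: float = 0.05) -> float:
    """Upper (1 - alpha) bound on recombination fraction r given 0 recombinants.

    Under a binomial model, P(0 recombinants in n progeny) = (1 - r) ** n.
    The bound is the r at which that probability drops to alpha.
    """
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - alpha ** (1.0 / n_progeny)


# Progeny counts taken from the abstract: 579 (within Adh1) and 81 (Adh1-Adh2).
within_adh1 = recombination_upper_bound(579)   # roughly 0.005
adh1_vs_adh2 = recombination_upper_bound(81)   # roughly 0.036
print(f"{within_adh1:.4f} {adh1_vs_adh2:.4f}")
```

    Under these assumptions the smaller sample supports only a looser bound, so "tightly linked" is a stronger statement for the 579-progeny cross than for the 81-progeny cross.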

  15. Becker Muscular Dystrophy (BMD) caused by duplication of exons 3-6 of the dystrophin gene presenting as dilated cardiomyopathy

    SciTech Connect

    Tsai, A.C.; Allingham-Hawkins, D.J.; Becker, L.

    1994-09-01

    X-linked dilated cardiomyopathy (XLCM) is a progressive myocardial disease presenting with congestive heart failure in teenage males without clinical signs of skeletal myopathy. Tight linkage of XLCM to the DMD locus has been demonstrated; it has been suggested that, at least in some families, XLCM is a "dystrophinopathy." We report a 14-year-old boy who presented with acute heart failure due to dilated cardiomyopathy. He had no history of muscle weakness, but physical examination revealed pseudohypertrophy of the calf muscles. He subsequently received a heart transplantation. Family history was negative. Serum CK level at the time of diagnosis was 10,416. Myocardial biopsy showed no evidence of carditis. Dystrophin staining of cardiac and skeletal muscle with anti-sera to COOH and NH2 termini showed a patchy distribution of positivity suggestive of Becker muscular dystrophy. Analysis of 18 of the 79 dystrophin exons detected a duplication that included exons 3-6. The proband's mother has an elevated serum CK and was confirmed to be a carrier of the same duplication. A mutation in the muscle promoter region of the dystrophin gene has been implicated in the etiology of XLCM. However, Towbin et al. (1991) argued that other 5′ mutations in the dystrophin gene could cause selective cardiomyopathy. The findings in our patient support the latter hypothesis. This suggests that there are multiple regions in the dystrophin gene which, when disrupted, can cause isolated dilated cardiomyopathy.

  16. Familial 4.3 Mb duplication of 21q22 sheds new light on the Down syndrome critical region

    PubMed Central

    Ronan, Anne; Fagan, Kerry; Christie, Louise; Conroy, Jeffrey; Nowak, Norma J; Turner, Gillian

    2007-01-01

    A 4.3 Mb duplication of chromosome 21 bands q22.13–q22.2 was diagnosed by interphase fluorescent in‐situ hybridisation (FISH) in a 31‐week gestational age baby with cystic hygroma and hydrops; the duplication was later found in the mother and in her 8‐year‐old daughter by the same method and confirmed by array comparative genomic hybridisation (aCGH). All had the facial gestalt of Down syndrome (DS). This is the smallest accurately defined duplication of chromosome 21 reported with a DS phenotype. The duplication encompasses the gene DYRK1 but not DSCR1 or DSCAM, all of which have previously been implicated in the causation of DS. Previous karyotype analysis and telomere screening of the mother, and karyotype analysis and metaphase FISH of a chorionic villus sample, had all failed to reveal the duplication. The findings in this family add to the identification and delineation of a “critical region” for the DS phenotype on chromosome 21. Cryptic chromosomal abnormalities can be missed on a routine karyotype for investigation of abnormal prenatal ultrasound findings, lending support to the use of aCGH analysis in this setting. PMID:17237124

  17. The evolution of the SEPALLATA subfamily of MADS-box genes: a preangiosperm origin with multiple duplications throughout angiosperm history.

    PubMed

    Zahn, Laura M; Kong, Hongzhi; Leebens-Mack, James H; Kim, Sangtae; Soltis, Pamela S; Landherr, Lena L; Soltis, Douglas E; Depamphilis, Claude W; Ma, Hong

    2005-04-01

    Members of the SEPALLATA (SEP) MADS-box subfamily are required for specifying the "floral state" by contributing to floral organ and meristem identity. SEP genes have not been detected in gymnosperms and seem to have originated since the lineage leading to extant angiosperms diverged from extant gymnosperms. Therefore, both functional and evolutionary studies suggest that SEP genes may have been critical for the origin of the flower. To gain insights into the evolution of SEP genes, we isolated nine genes from plants that occupy phylogenetically important positions. Phylogenetic analyses of SEP sequences show that several gene duplications occurred during the evolution of this subfamily, providing potential opportunities for functional divergence. The first duplication occurred prior to the origin of the extant angiosperms, resulting in the AGL2/3/4 and AGL9 clades. Subsequent duplications occurred within these clades in the eudicots and monocots. The timing of the first SEP duplication approximately coincides with duplications in the DEFICIENS/GLOBOSA and AGAMOUS MADS-box subfamilies, which may have resulted from either a proposed genome-wide duplication in the ancestor of extant angiosperms or multiple independent duplication events. Regardless of the mechanism of gene duplication, these pairs of duplicate transcription factors provided new possibilities of genetic interactions that may have been important in the origin of the flower.

  18. Gene Duplication and Gene Expression Changes Play a Role in the Evolution of Candidate Pollen Feeding Genes in Heliconius Butterflies

    PubMed Central

    Smith, Gilbert; Macias-Muñoz, Aide; Briscoe, Adriana D.

    2016-01-01

    Heliconius possess a unique ability among butterflies to feed on pollen. Pollen feeding significantly extends their lifespan, and is thought to have been important to the diversification of the genus. We used RNA sequencing to examine feeding-related gene expression in the mouthparts of four species of Heliconius and one nonpollen feeding species, Eueides isabella. We hypothesized that genes involved in morphology and protein metabolism might be upregulated in Heliconius because they have longer proboscides than Eueides, and because pollen contains more protein than nectar. Using de novo transcriptome assemblies, we tested these hypotheses by comparing gene expression in mouthparts against antennae and legs. We first looked for genes upregulated in mouthparts across all five species and discovered several hundred genes, many of which had functional annotations involving metabolism of proteins (cocoonase), lipids, and carbohydrates. We then looked specifically within Heliconius where we found eleven common upregulated genes with roles in morphology (CPR cuticle proteins), behavior (takeout-like), and metabolism (luciferase-like). Closer examination of these candidates revealed that cocoonase underwent several duplications along the lineage leading to heliconiine butterflies, including two Heliconius-specific duplications. Luciferase-like genes also underwent duplication within lepidopterans, and upregulation in Heliconius mouthparts. Reverse-transcription PCR confirmed that three cocoonases, a peptidase, and one luciferase-like gene are expressed in the proboscis with little to no expression in labial palps and salivary glands. Our results suggest pollen feeding, like other dietary specializations, was likely facilitated by adaptive expansions of preexisting genes—and that the butterfly proboscis is involved in digestive enzyme production. PMID:27553646

  19. Preferential duplication of intermodular hub genes: an evolutionary signature in eukaryotes genome networks.

    PubMed

    Ferreira, Ricardo M; Rybarczyk-Filho, José Luiz; Dalmolin, Rodrigo J S; Castro, Mauro A A; Moreira, José C F; Brunnet, Leonardo G; de Almeida, Rita M C

    2013-01-01

    Whole genome protein-protein association networks are not random and their topological properties stem from genome evolution mechanisms. In fact, more connected, but less clustered proteins are related to genes that, in general, present more paralogs as compared to other genes, indicating frequent previous gene duplication episodes. On the other hand, genes related to conserved biological functions present few or no paralogs and yield proteins that are highly connected and clustered. These general network characteristics must have an evolutionary explanation. Considering data from the STRING database, we present here experimental evidence that, more than not being scale free, protein degree distributions of organisms present an increased probability for high degree nodes. Furthermore, based on this experimental evidence, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated with a probability that linearly grows with gene degree and decreases with its clustering coefficient. For the first time a model yields results that simultaneously describe different topological distributions. Also, this model correctly predicts that, to produce protein-protein association networks with number of links and number of nodes in the observed range for Eukaryotes, 90% gene duplication and 10% de novo gene acquisition are required. This scenario implies a universal mechanism for genome evolution.
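
    The growth rule described in this abstract (de novo acquisition by preferential attachment, or duplication favoured for high-degree, low-clustering genes) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' simulation: the weight degree × (1 − clustering), the full inheritance of the parent's links by the duplicate, the three-gene seed network, and the parameter m are all hypothetical choices made here; only the 90%/10% split comes from the abstract.

```python
import random


def clustering(adj, v):
    """Local clustering coefficient of node v in an adjacency-set graph."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


def grow(n_genes, p_dup=0.9, m=2, seed=0):
    """Grow a gene network by duplication (prob. p_dup) or de novo gain."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0, 2}, 2: {1}}  # assumed seed network: a 3-gene path
    while len(adj) < n_genes:
        nodes = list(adj)
        new = len(adj)
        if rng.random() < p_dup:
            # Duplication: weight rises with degree, falls with clustering.
            # The small constant keeps the total weight strictly positive.
            weights = [len(adj[v]) * (1.0 - clustering(adj, v)) + 1e-9
                       for v in nodes]
            parent = rng.choices(nodes, weights=weights, k=1)[0]
            adj[new] = set(adj[parent])  # assumed: copy all parent links
        else:
            # De novo gene: attach to m distinct genes, chosen by degree.
            weights = [len(adj[v]) for v in nodes]
            targets = set()
            while len(targets) < min(m, len(nodes)):
                targets.add(rng.choices(nodes, weights=weights, k=1)[0])
            adj[new] = targets
        for u in adj[new]:
            adj[u].add(new)
    return adj


net = grow(60)
degrees = sorted((len(nb) for nb in net.values()), reverse=True)
print(len(net), degrees[:5])
```

    Recomputing every clustering coefficient at each step is deliberately naive and only suits small networks; a larger run would need incremental updates.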

  20. Orosomucoid: A new variant and additional duplicated ORM1 gene in Qatari population

    SciTech Connect

    Sebetan, I.M.; Alali, K.A.; Alzaman, A.

    1994-09-01

    A new genetically determined ORM2 variant and an additional duplicated ORM1 gene were observed in a Qatari population using isoelectric focusing in ultrathin-layer polyacrylamide gels. The studied population samples indicate the occurrence of six ORM1 alleles and three ORM2 alleles. A simple, reliable method for separating orosomucoid variants will be presented, together with a comparison of previously reported methods.

  1. A possible role of DNA methylation in functional divergence of a fast evolving duplicate gene encoding odorant binding protein 11 in the honeybee.

    PubMed

    Kucharski, R; Maleszka, J; Maleszka, R

    2016-06-29

    Although gene duplication is seen as the main path to evolution of new functions, molecular mechanisms by which selection favours the gain versus loss of newly duplicated genes and minimizes the fixation of pseudo-genes are not well understood. Here, we investigate in detail a duplicate honeybee gene obp11 belonging to a fast evolving insect gene family encoding odorant binding proteins (OBPs). We report that obp11 is expressed only in female bees in rare antennal sensilla basiconica in contrast to its tandem partner obp10 that is expressed in the brain in both females and males (drones). Unlike all other obp genes in the honeybee, obp11 is methylated suggesting that functional diversification of obp11 and obp10 may have been driven by an epigenetic mechanism. We also show that increased methylation in drones near one donor splice site that correlates with higher abundance of a transcript variant encoding a truncated OBP11 protein is one way of controlling its contrasting expression. Our data suggest that like in mammals and plants, DNA methylation in insects may contribute to functional diversification of proteins produced from duplicated genes, in particular to their subfunctionalization by generating complementary patterns of expression.

  2. Evolution by selection, recombination, and gene duplication in MHC class I genes of two Rhacophoridae species

    PubMed Central

    2013-01-01

    Background: Comparison of major histocompatibility complex (MHC) genes across vertebrate species can reveal molecular mechanisms underlying the evolution of adaptive immunity-related proteins. As the first terrestrial tetrapods, amphibians deserve special attention because of their exposure to probably increased spectrum of microorganisms compared with ancestral aquatic fishes. Knowledge regarding the evolutionary patterns and mechanisms associated with amphibian MHC genes remains limited. The goal of the present study was to isolate MHC class I genes from two Rhacophoridae species (Rhacophorus omeimontis and Polypedates megacephalus) and examine their evolution. Results: We identified 27 MHC class I alleles spanning the region from exon 2 to 4 in 38 tree frogs. The available evidence suggests that these 27 sequences all belong to classical MHC class I (MHC Ia) genes. Although several anuran species only display one MHC class Ia locus, at least two or three loci were observed in P. megacephalus and R. omeimontis, indicating that the number of MHC class Ia loci varies among anuran species. Recombination events, which mainly involve the entire exons, played an important role in shaping the genetic diversity of the 27 MHC class Ia alleles. In addition, signals of positive selection were found in Rhacophoridae MHC class Ia genes. Amino acid sites strongly suggested by program to be under positive selection basically accorded with the putative antigen binding sites deduced from crystal structure of human HLA. Phylogenetic relationships among MHC class I alleles revealed the presence of trans-species polymorphisms. Conclusions: In the two Rhacophoridae species (1) there are two or three MHC class Ia loci; (2) recombination mainly occurs between the entire exons of MHC class Ia genes; (3) balancing selection, gene duplication and recombination all contribute to the diversity of MHC class Ia genes. These findings broaden our knowledge on the evolution of amphibian MHC systems

  3. Multispecies Analysis of Expression Pattern Diversification in the Recently Expanded Insect Ly6 Gene Family

    PubMed Central

    Tanaka, Kohtaro; Hazbun, Alexis; Hijazi, Assia; Vreede, Barbara; Sucena, Élio

    2015-01-01

    Gene families often consist of members with diverse expression domains reflecting their functions in a wide variety of tissues. However, how the expression of individual members, and thus their tissue-specific functions, diversified during the course of gene family expansion is not well understood. In this study, we approached this question through the analysis of the duplication history and transcriptional evolution of a rapidly expanding subfamily of insect Ly6 genes. We analyzed different insect genomes and identified seven Ly6 genes that have originated from a single ancestor through sequential duplication within the higher Diptera. We then determined how the original embryonic expression pattern of the founding gene diversified by characterizing its tissue-specific expression in the beetle Tribolium castaneum, the butterfly Bicyclus anynana, and the mosquito Anopheles stephensi and those of its duplicates in three higher dipteran species, representing various stages of the duplication history (Megaselia abdita, Ceratitis capitata, and Drosophila melanogaster). Our results revealed that frequent neofunctionalization episodes contributed to the increased expression breadth of this subfamily and that these events occurred after duplication and speciation events at comparable frequencies. In addition, at each duplication node, we consistently found asymmetric expression divergence. One paralog inherited most of the tissue-specificities of the founder gene, whereas the other paralog evolved drastically reduced expression domains. Our approach attests to the power of combining a well-established duplication history with a comprehensive coverage of representative species in acquiring unequivocal information about the dynamics of gene expression evolution in gene families. PMID:25743545

  4. Pericentromeric effects shape the patterns of divergence, retention, and expression of duplicated genes in the Paleopolyploid Soybean

    USDA-ARS's Scientific Manuscript database

    Sequence divergence and fractionation of duplicated genes following whole genome duplication (WGD) are important processes in the course of polyploid genome evolution. However, the evolutionary forces that govern the divergence and retention of WGD-derived genes are poorly understood. In this study,...

  5. New Genes Originated via Multiple Recombinational Pathways in the β-Globin Gene Family of Rodents

    PubMed Central

    Hoffmann, Federico G.; Opazo, Juan C.; Storz, Jay F.

    2008-01-01

    Species differences in the size or membership composition of multigene families can be attributed to lineage-specific additions of new genes via duplication, losses of genes via deletion or inactivation, and the creation of chimeric genes via domain shuffling or gene fusion. In principle, it should be possible to infer the recombinational pathways responsible for each of these different types of genomic change by conducting detailed comparative analyses of genomic sequence data. Here, we report an attempt to unravel the complex evolutionary history of the β-globin gene family in a taxonomically diverse set of rodent species. The main objectives were: 1) to characterize the genomic structure of the β-globin gene cluster of rodents; 2) to assign orthologous and paralogous relationships among duplicate copies of β-like globin genes; and 3) to infer the specific recombinational pathways responsible for gene duplications, gene deletions, and the creation of chimeric fusion genes. Results of our comparative genomic analyses revealed that variation in gene family size among rodent species is mainly attributable to the differential gain and loss of later expressed β-globin genes via unequal crossing-over. However, two distinct recombinational mechanisms were implicated in the creation of chimeric fusion genes. In muroid rodents, a chimeric γ/ε fusion gene was created by unequal crossing-over between the embryonic ε- and γ-globin genes. Interestingly, this γ/ε fusion gene was generated in the same fashion as the “anti-Lepore” 5′-δ-(β/δ)-β-3′ duplication mutant in humans (the reciprocal exchange product of the pathological hemoglobin Lepore deletion mutant). By contrast, in the house mouse, Mus musculus, a chimeric β/δ fusion pseudogene was created by a β-globin → δ-globin gene conversion event. Although the γ/ε and β/δ fusion genes share a similar chimeric gene structure, they originated via completely different recombinational pathways. PMID

  6. Ancestral gene duplication enabled the evolution of multifunctional cellulases in stick insects (Phasmatodea).

    PubMed

    Shelomi, Matan; Heckel, David G; Pauchet, Yannick

    2016-04-01

    The Phasmatodea (stick insects) have multiple, endogenous, highly expressed copies of glycoside hydrolase family 9 (GH9) genes. The reason for retaining so many copies was unknown. We cloned and expressed the enzymes in transfected insect cell lines, and tested the individual proteins against different plant cell wall component poly- and oligosaccharides. Nearly all isolated enzymes were active against carboxymethylcellulose; however, most could also degrade glucomannan, and some could also degrade either xylan or xyloglucan. The latter two enzyme groups were each monophyletic, suggesting the evolution of these novel substrate specificities in an early ancestor of the order. Such enzymes are highly unusual for Metazoa, for which no xyloglucanases had been reported. Phasmatodea gut extracts could degrade multiple plant cell wall components fully into sugar monomers, suggesting that enzymatic breakdown of plant cell walls by the entire Phasmatodea digestome may contribute to the Phasmatodea nutritional budget. The duplication and neofunctionalization of GH9s in the ancestral Phasmatodea may have enabled them to specialize as folivores and diverge from their omnivorous ancestors. The structural changes enabling these unprecedented activities in the cellulases require further study.

  7. Effects of whole genome duplication on cell size and gene expression in mouse embryonic stem cells

    PubMed Central

    Imai, Hiroyuki; Fujii, Wataru; Kusakabe, Ken Takeshi; Kiso, Yasuo; Kano, Kiyoshi

    2016-01-01

    Alterations in ploidy tend to influence cell physiology, which, in the long term, contributes to species adaptation and evolution. Polyploid cells are observed under physiological conditions in the nerve and liver tissues, and in tumorigenic processes. Although tetraploid cells have been studied in mammalian cells, the basic characteristics and alterations caused by whole genome duplication are still poorly understood. The purpose of this study was to acquire basic knowledge about the effect of whole genome duplication on the cell cycle, cell size, and gene expression. Using flow cytometry, we demonstrate that cell cycle subpopulations in mouse tetraploid embryonic stem cells (TESCs) were similar to those in embryonic stem cells (ESCs). We performed smear preparations and flow cytometric analysis to identify cell size alterations. These indicated that the relative cell volume of TESCs was approximately 2.2–2.5-fold that of ESCs. We also investigated the effect of whole genome duplication on the expression of housekeeping and pluripotency marker genes using quantitative real-time PCR with external RNA. We found that the target transcripts were 2.2 times more abundant in TESCs than in ESCs. This indicated that gene expression and cell volume increased in parallel. Our findings suggest the existence of a homeostatic mechanism controlling the cytoplasmic transcript levels in accordance with genome volume changes caused by whole genome duplication. PMID:27569766
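
    The "increased in parallel" claim in the abstract above can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes that transcript concentration scales as transcript abundance divided by cell volume, and the function and variable names are hypothetical. Only the fold values (2.2 for transcripts, 2.2–2.5 for volume) come from the abstract.

```python
def concentration_ratio(transcript_fold: float, volume_fold: float) -> float:
    """Relative transcript concentration (TESC vs ESC).

    Assumption (not from the paper): concentration = abundance / volume,
    so the fold change in concentration is the ratio of the two folds.
    """
    return transcript_fold / volume_fold


# Fold changes reported in the abstract (TESC relative to ESC).
transcript_fold = 2.2        # target transcript abundance
volume_folds = (2.2, 2.5)    # lower and upper bound of relative cell volume

ratios = [concentration_ratio(transcript_fold, v) for v in volume_folds]

for v, r in zip(volume_folds, ratios):
    print(f"volume fold {v}: relative transcript concentration {r:.2f}")
```

    Under that assumption the relative concentration stays between roughly 0.88 and 1.0, that is, close to unchanged, which is consistent with the homeostatic interpretation given in the abstract.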

  8. Ancient gene duplication provided a key molecular step for anaerobic growth of Baker's yeast.

    PubMed

    Hayashi, Masaya; Schilke, Brenda; Marszalek, Jaroslaw; Williams, Barry; Craig, Elizabeth A

    2011-07-01

    Mitochondria are essential organelles required for a number of key cellular processes. As most mitochondrial proteins are nuclear encoded, their efficient translocation into the organelle is critical. Transport of proteins across the inner membrane is driven by a multicomponent, matrix-localized "import motor," which is based on the activity of the molecular chaperone Hsp70 and a J-protein cochaperone. In Saccharomyces cerevisiae, two paralogous J-proteins, Pam18 and Mdj2, can form the import motor. Both contain transmembrane and matrix domains, with Pam18 having an additional intermembrane space (IMS) domain. Evolutionary analyses revealed that the origin of the IMS domain of S. cerevisiae Pam18 coincides with a gene duplication event that generated the PAM18/MDJ2 gene pair. The duplication event and origin of the Pam18 IMS domain occurred at the relatively ancient divergence of the fungal subphylum Saccharomycotina. The timing of the duplication event also corresponds with a number of additional functional changes related to mitochondrial function and respiration. Physiological and genetic studies revealed that the IMS domain of Pam18 is required for efficient growth under anaerobic conditions, even though it is dispensable when oxygen i