Chapman, Brad A; Bowers, John E; Feltus, Frank A; Paterson, Andrew H
2006-02-21
Genome duplication followed by massive gene loss has permanently shaped the genomes of many higher eukaryotes, particularly angiosperms. It has long been believed that a primary advantage of genome duplication is the opportunity for the evolution of genes with new functions by modification of duplicated genes. If so, then patterns of genetic diversity among strains within taxa might reveal footprints of selection that are consistent with this advantage. Contrary to classical predictions that duplicated genes may be relatively free to acquire unique functionality, we find among both Arabidopsis ecotypes and Oryza subspecies that SNPs encode less radical amino acid changes in genes for which there exists a duplicated copy at a "paleologous" locus than in "singleton" genes. Preferential retention of duplicated genes encoding long complex proteins and their unexpectedly slow divergence (perhaps because of homogenization) suggest that a primary advantage of retaining duplicated paleologs may be the buffering of crucial functions. Functional buffering and functional divergence may represent extremes in the spectrum of duplicated gene fates. Functional buffering may be especially important during "genomic turmoil" immediately after genome duplication but continues to act approximately 60 million years later, and its gradual deterioration may contribute cyclicality to genome duplication in some lineages. PMID:16467140
Evolution of tuf genes: ancient duplication, differential loss and gene conversion.
Lathe, W C; Bork, P
2001-08-03
The tuf gene of eubacteria, encoding the EF-tu elongation factor, was duplicated early in the evolution of the taxon. Phylogenetic and genomic location analysis of 20 complete eubacterial genomes suggests that this ancient duplication has been differentially lost and maintained in eubacteria.
van der Ley, P
1988-11-01
Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
Guselnikov, S.V.; Grayfer, L.; De Jesús Andino, F.; Rogozin, I.B.; Robert, J.; Taranin, A.V.
2015-01-01
The ITAM-bearing transmembrane signaling subunits (TSS) are indispensable components of activating leukocyte receptor complexes. The TSS-encoding genes map to paralogous chromosomal regions, which are thought to arise from ancient genome tetraploidization(s). To assess a possible role of tetraploidization in the TSS evolution, we studied TSS and other functionally linked genes in the amphibian species Xenopus laevis whose genome was duplicated about 40 MYR ago. We found that X. laevis has retained a duplicated set of sixteen TSS genes, all except one being transcribed. Furthermore, duplicated TCRα loci and genes encoding TSS-coupling protein kinases have also been retained. No clear evidence for functional divergence of the TSS paralogs was obtained from gene expression and sequence analyses. We suggest that the main factor of maintenance of duplicated TSS genes in X. laevis was a protein dosage effect and that this effect might have facilitated the TSS set expansion in early vertebrates. PMID:26170006
Restriction and Recruitment—Gene Duplication and the Origin and Evolution of Snake Venom Toxins
Hargreaves, Adam D.; Swain, Martin T.; Hegarty, Matthew J.; Logan, Darren W.; Mulley, John F.
2014-01-01
Snake venom has been hypothesized to have originated and diversified through a process that involves duplication of genes encoding body proteins with subsequent recruitment of the copy to the venom gland, where natural selection acts to develop or increase toxicity. However, gene duplication is known to be a rare event in vertebrate genomes, and the recruitment of duplicated genes to a novel expression domain (neofunctionalization) is an even rarer process that requires the evolution of novel combinations of transcription factor binding sites in upstream regulatory regions. Therefore, although this hypothesis concerning the evolution of snake venom is very unlikely and should be regarded with caution, it is nonetheless often assumed to be established fact, hindering research into the true origins of snake venom toxins. To critically evaluate this hypothesis, we have generated transcriptomic data for body tissues and salivary and venom glands from five species of venomous and nonvenomous reptiles. Our comparative transcriptomic analysis of these data reveals that snake venom does not evolve through the hypothesized process of duplication and recruitment of genes encoding body proteins. Indeed, our results show that many proposed venom toxins are in fact expressed in a wide variety of body tissues, including the salivary gland of nonvenomous reptiles and that these genes have therefore been restricted to the venom gland following duplication, not recruited. Thus, snake venom evolves through the duplication and subfunctionalization of genes encoding existing salivary proteins. These results highlight the danger of the elegant and intuitive “just-so story” in evolutionary biology. PMID:25079342
Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi
2014-01-03
Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare them with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after its split from A. thaliana. Here we present a genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among the 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in the Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among the 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. For the TNL type, however, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences.
Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results, together with expression pattern analysis of NBS-encoding orthologous genes, provide a useful resource for functional characterization of these genes and genetic improvement of relevant crops.
Shen, Danyu; Liu, Tingli; Ye, Wenwu; Liu, Li; Liu, Peihan; Wu, Yuren; Wang, Yuanchao; Dou, Daolong
2013-01-01
Phytophthora and other oomycetes secrete a large number of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling and necrosis inducing proteins (CRN), or Crinkler. Here, we first investigated the evolutionary patterns and mechanisms of CRN effectors in Phytophthora sojae and compared them to two other Phytophthora species. The genes encoding CRN effectors could be divided into 45 orthologous gene groups (OGG), and most OGGs were unequally distributed among the three species, each of which underwent a large number of gene gains or losses, indicating that the CRN genes expanded after species evolution in Phytophthora and evolved through pathoadaptation. The 134 expanded genes in P. sojae encoded family proteins including 82 functional genes and were expressed at higher levels, while the other 68 genes encoding orphan proteins were less expressed and contained 50 pseudogenes. Furthermore, we demonstrated that most expanded genes underwent gene duplication and/or fragment recombination. Three different mechanisms that drove gene duplication or recombination were identified. Finally, the expanded CRN effectors exhibited varying pathogenic functions, including induction of programmed cell death (PCD) and suppression of PCD through PAMP-triggered immunity and/or effector-triggered immunity. Overall, these results suggest that gene duplication and fragment recombination may be two mechanisms that drive the expansion and neofunctionalization of the CRN family in P. sojae, which aids in understanding the roles of CRN effectors within each oomycete pathogen.
Gacesa, Ranko; Chung, Ray; Dunn, Simon R; Weston, Andrew J; Jaimes-Becerra, Adrian; Marques, Antonio C; Morandini, André C; Hranueli, Daslav; Starcevic, Antonio; Ward, Malcolm; Long, Paul F
2015-10-13
Gene duplication followed by adaptive selection is a well-accepted process leading to toxin diversification in venoms. However, emergent genomic, transcriptomic and proteomic evidence now challenges this role to be at best equivocal to other processes. Cnidaria are arguably the most ancient phylum of the extant metazoa that are venomous and as such provide a definitive ancestral anchor to examine the evolution of this trait. Here we compare predicted toxins from the translated genome of the coral Acropora digitifera to putative toxins revealed by proteomic analysis of soluble proteins discharged from nematocysts, to determine the extent to which gene duplications contribute to venom innovation in this reef-building coral species. A new bioinformatics tool called HHCompare was developed to detect potential gene duplications in the genomic data, which is made freely available ( https://github.com/rgacesa/HHCompare ). A total of 55 potential toxin encoding genes could be predicted from the A. digitifera genome, of which 36 (65%) had likely arisen by gene duplication as evinced using the HHCompare tool and verified using two standard phylogeny methods. Surprisingly, only 22% (12/55) of the potential toxin repertoire could be detected following rigorous proteomic analysis, for which only half (6/12) of the toxin proteome could be accounted for as peptides encoded by the gene duplicates. Biological activities of these toxins are dominated by putative phospholipases and toxic peptidases. Gene expansions in A. digitifera venom are the most extensive yet described in any venomous animal, and gene duplication plays a significant role leading to toxin diversification in this coral species. Since such low numbers of toxins were detected in the proteome, it is unlikely that the venom is evolving rapidly by prey-driven positive natural selection.
Rather, we contend that the venom has a defensive role deterring predation or harm from interspecific competition and overgrowth by fouling organisms. Factors influencing translation of toxin encoding genes perhaps warrant more profound experimental consideration.
Kaltenegger, Elisabeth; Eich, Eckart; Ober, Dietrich
2013-01-01
Homospermidine synthase (HSS), the first pathway-specific enzyme of pyrrolizidine alkaloid biosynthesis, is known to have its origin in the duplication of a gene encoding deoxyhypusine synthase. To study the processes that followed this gene duplication event and gave rise to HSS, we identified sequences encoding HSS and deoxyhypusine synthase from various species of the Convolvulaceae. We show that HSS evolved only once in this lineage. This duplication event was followed by several losses of a functional gene copy attributable to gene loss or pseudogenization. Statistical analyses of sequence data suggest that, in those lineages in which the gene copy was successfully recruited as HSS, the gene duplication event was followed by phases of various selection pressures, including purifying selection, relaxed functional constraints, and possibly positive Darwinian selection. Site-specific mutagenesis experiments have confirmed that the substitution of sites predicted to be under positive Darwinian selection is sufficient to convert a deoxyhypusine synthase into a HSS. In addition, analyses of transcript levels have shown that HSS and deoxyhypusine synthase have also diverged with respect to their regulation. The impact of protein–protein interaction on the evolution of HSS is discussed with respect to current models of enzyme evolution. PMID:23572540
Wei, Hengling; Li, Wei; Sun, Xiwei; Zhu, Shuijin; Zhu, Jun
2013-01-01
Plant disease resistance genes are a key component of defending plants from a range of pathogens. The majority of these resistance genes belong to the super-family that harbors a Nucleotide-binding site (NBS). A number of studies have focused on NBS-encoding genes in disease-resistance breeding programs for diverse plants. However, little information has been reported with an emphasis on systematic analysis and comparison of NBS-encoding genes in cotton. To fill this gap in knowledge, in this study, we identified and investigated the NBS-encoding resistance genes in cotton using the whole genome sequence information of Gossypium raimondii. In total, 355 NBS-encoding resistance genes were identified. Analyses of the conserved motifs and structural diversity showed that the two most distinctive features of these genes are the high proportion of non-regular NBS genes and the high diversity of N-terminal domains. Analyses of the physical locations and duplications of NBS-encoding genes showed that gene duplication of disease resistance genes could play an important role in cotton by leading to an increase in the functional diversity of the cotton NBS-encoding genes. Analyses of phylogenetic comparisons indicated that, in cotton, the NBS-encoding genes with a TIR domain not only have their own evolution pattern different from those of genes without a TIR domain, but also have their own species-specific pattern that differs from those of TIR genes in other plants. Analyses of the correlation between disease resistance QTL and NBS-encoding resistance genes showed that more than half of the disease resistance QTL could be associated with the NBS-encoding genes in cotton, which agrees with previous studies establishing that more than half of plant resistance genes are NBS-encoding genes. PMID:23936305
Yue, Jia-Xing; Holland, Nicholas D.; Holland, Linda Z.; Deheyn, Dimitri D.
2016-06-01
Green Fluorescent Protein (GFP) was originally found in cnidarians, and later in copepods and cephalochordates (amphioxus) (Branchiostoma spp). Here, we looked for GFP-encoding genes in Asymmetron, an early-diverged cephalochordate lineage, and found two such genes closely related to some of the Branchiostoma GFPs. Dim fluorescence was found throughout the body in adults of Asymmetron lucayanum, and, as in Branchiostoma floridae, was especially intense in the ripe ovaries. Spectra of the fluorescence were similar between Asymmetron and Branchiostoma. Lineage-specific expansion of GFP-encoding genes in the genus Branchiostoma was observed, largely driven by tandem duplications. Despite such expansion, purifying selection has strongly shaped the evolution of GFP-encoding genes in cephalochordates, with apparent relaxation for highly duplicated clades. All cephalochordate GFP-encoding genes are quite different from those of copepods and cnidarians. Thus, the ancestral cephalochordates probably had GFP, but since GFP appears to be lacking in more early-diverged deuterostomes (echinoderms, hemichordates), it is uncertain whether the ancestral cephalochordates (i.e. the common ancestor of Asymmetron and Branchiostoma) acquired GFP by horizontal gene transfer (HGT) from copepods or cnidarians or inherited it from the common ancestor of copepods and deuterostomes, i.e. the ancestral bilaterians.
Phylogenetics of Lophotrochozoan bHLH Genes and the Evolution of Lineage-Specific Gene Duplicates.
Bao, Yongbo; Xu, Fei; Shimeld, Sebastian M
2017-04-01
The gain and loss of genes encoding transcription factors is of importance to understanding the evolution of gene regulatory complexity. The basic helix-loop-helix (bHLH) genes encode a large superfamily of transcription factors. We systematically classify the bHLH genes from five mollusc, two annelid and one brachiopod genomes, tracing the pattern of bHLH gene evolution across these poorly studied Phyla. In total, 56-88 bHLH genes were identified in each genome, with most identifiable as members of previously described bilaterian families, or of new families we define. Of such families only one, Mesp, appears lost by all these species. Additional duplications have also played a role in the evolution of the bHLH gene repertoire, with many new lophotrochozoan-, mollusc-, bivalve-, or gastropod-specific genes defined. Using a combination of transcriptome mining, RT-PCR, and in situ hybridization we compared the expression of several of these novel genes in tissues and embryos of the molluscs Crassostrea gigas and Patella vulgata, finding both conserved expression and evidence for neofunctionalization. We also map the positions of the genes across these genomes, identifying numerous gene linkages. Some reflect recent paralog divergence by tandem duplication, others are remnants of ancient tandem duplications dating to the lophotrochozoan or bilaterian common ancestors. These data are built into a model of the evolution of bHLH genes in molluscs, showing formidable evolutionary stasis at the family level but considerable within-family diversification by tandem gene duplication. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Seabra, Ana R; Vieira, Cristina P; Cullimore, Julie V; Carvalho, Helena G
2010-08-19
Nitrogen is a crucial nutrient that is both essential and rate limiting for plant growth and seed production. Glutamine synthetase (GS) occupies a central position in nitrogen assimilation and recycling, justifying the extensive number of studies that have been dedicated to this enzyme from several plant sources. All plant species studied to date have been reported as containing a single, nuclear gene encoding a plastid located GS isoenzyme per haploid genome. This study reports the existence of a second nuclear gene encoding a plastid located GS in Medicago truncatula. This study characterizes a new, second gene encoding a plastid located glutamine synthetase (GS2) in M. truncatula. The gene encodes a functional GS isoenzyme with unique kinetic properties, which is exclusively expressed in developing seeds. Based on molecular data and the assumption of a molecular clock, it is estimated that the gene arose from a duplication event that occurred about 10 My ago, after legume speciation, and that duplicated sequences are also present in closely related species of the Vicioide subclade. Expression analyses by RT-PCR and western blot indicate that the gene is exclusively expressed in developing seeds and its expression is related to seed filling, suggesting a specific function of the enzyme associated with legume seed metabolism. Interestingly, the gene was found to be subjected to alternative splicing over the first intron, leading to the formation of two transcripts with similar open reading frames but varying 5' UTR lengths, due to retention of the first intron. To our knowledge, this is the first report of alternative splicing in a plant GS gene. This study shows that Medicago truncatula contains an additional GS gene encoding a plastid located isoenzyme, which is functional and exclusively expressed during seed development.
Because legumes produce protein-rich seeds requiring high amounts of nitrogen, we postulate that this gene duplication represents a functional innovation of plastid located GS related to storage protein accumulation exclusive to legume seed metabolism.
Complexity of Gene Expression Evolution after Duplication: Protein Dosage Rebalancing
Rogozin, Igor B.
2014-01-01
Ongoing debates about functional importance of gene duplications have been recently intensified by a heated discussion of the “ortholog conjecture” (OC). Under the OC, which is central to functional annotation of genomes, orthologous genes are functionally more similar than paralogous genes at the same level of sequence divergence. However, a recent study challenged the OC by reporting a greater functional similarity, in terms of gene ontology (GO) annotations and expression profiles, among within-species paralogs compared to orthologs. These findings were taken to indicate that functional similarity of homologous genes is primarily determined by the cellular context of the genes, rather than evolutionary history. Subsequent studies suggested that the OC appears to be generally valid when applied to mammalian evolution but the complete picture of evolution of gene expression also has to incorporate lineage-specific aspects of paralogy. The observed complexity of gene expression evolution after duplication can be explained through selection for gene dosage effect combined with the duplication-degeneration-complementation model. This paper discusses expression divergence of recent duplications occurring before functional divergence of proteins encoded by duplicate genes. PMID:25197576
Calcium-activated potassium (BK) channels are encoded by duplicate slo1 genes in teleost fishes.
Rohmann, Kevin N; Deitcher, David L; Bass, Andrew H
2009-07-01
Calcium-activated, large conductance potassium (BK) channels in tetrapods are encoded by a single slo1 gene, which undergoes extensive alternative splicing. Alternative splicing generates a high level of functional diversity in BK channels that contributes to the wide range of frequencies electrically tuned by the inner ear hair cells of many tetrapods. To date, the role of BK channels in hearing among teleost fishes has not been investigated at the molecular level, although teleosts account for approximately half of all extant vertebrate species. We identified slo1 genes in teleost and nonteleost fishes using polymerase chain reaction and genetic sequence databases. In contrast to tetrapods, all teleosts examined were found to express duplicate slo1 genes in the central nervous system, whereas nonteleosts that diverged prior to the teleost whole-genome duplication event express a single slo1 gene. Phylogenetic analyses further revealed that whereas other slo1 duplicates were the result of a single duplication event, an independent duplication occurred in a basal teleost (Anguilla rostrata) following the slo1 duplication in teleosts. A third, independent slo1 duplication (autotetraploidization) occurred in salmonids. Comparison of teleost slo1 genomic sequences to their tetrapod orthologue revealed a reduced number of alternative splice sites in both slo1 co-orthologues. For the teleost Porichthys notatus, a focal study species that vocalizes with maximal spectral energy in the range electrically tuned by BK channels in the inner ear, peripheral tissues show the expression of either one (e.g., vocal muscle) or both (e.g., inner ear) slo1 paralogues with important implications for both auditory and vocal physiology. Additional loss of expression of one slo1 paralogue in nonneural tissues in P. notatus suggests that slo1 duplicates were retained via subfunctionalization. 
Together, the results predict that teleost fish achieve a diversity of BK channel subfunction via gene duplication, rather than increased alternative splicing as witnessed for the tetrapod and invertebrate orthologue. PMID:19321796
Kang, Sung-Hwan; Atallah, Osama O; Sun, Yong-Duo; Folimonova, Svetlana Y
2018-01-15
Viruses from the family Closteroviridae show an example of intra-genome duplications of more than one gene. In addition to the hallmark coat protein gene duplication, several members possess a tandem duplication of papain-like leader proteases. In this study, we demonstrate that domains encoding the L1 and L2 proteases in the Citrus tristeza virus genome underwent a significant functional divergence at the RNA and protein levels. We show that the L1 protease is crucial for viral accumulation and establishment of initial infection, whereas its coding region is vital for virus transport. On the other hand, the second protease is indispensable for virus infection of its natural citrus host, suggesting that L2 has evolved an important adaptive function that mediates virus interaction with the woody host. Copyright © 2017 Elsevier Inc. All rights reserved.
Bulky Trichomonad Genomes: Encoding a Swiss Army Knife.
Barratt, Joel; Gough, Rory; Stark, Damien; Ellis, John
2016-10-01
The trichomonads are a remarkably successful lineage of ancient, predominantly parasitic protozoa. Recent molecular analyses have revealed extensive duplication of certain genetic loci in trichomonads. Consequently, their genomes are exceptionally large compared to other parasitic protozoa. Retention of these large gene expansions across different trichomonad families raises the question: do these duplications afford an advantage? Many duplicated genes are linked to the parasitic lifestyle and some are regulated differently to their paralogues, suggesting they have acquired new functions. It is proposed that these large genomes encode a Swiss army knife of sorts, packed with a multitude of tools for use in many different circumstances. This may have bestowed trichomonads with the extraordinary versatility that has undoubtedly contributed to their success. Copyright © 2016 Elsevier Ltd. All rights reserved.
Vatansever, Recep; Koc, Ibrahim; Ozyigit, Ibrahim Ilker; Sen, Ugur; Uras, Mehmet Emin; Anjum, Naser A; Pereira, Eduarda; Filiz, Ertugrul
2016-12-01
Solanum tuberosum genome analysis revealed 12 StSULTR genes encoding 18 transcripts. Among genes annotated at group level (StSULTR I-IV), group III members formed the largest SULTR cluster and were potentially involved in biotic/abiotic stress responses via various regulatory factors, and stress and signaling proteins. Employing bioinformatics tools, this study performed genome-wide identification and expression analysis of SULTR (StSULTR) genes in potato (Solanum tuberosum L.). A very strict homology search and subsequent domain verification with a Hidden Markov Model revealed 12 StSULTR genes encoding 18 transcripts. StSULTR genes were mapped on seven S. tuberosum chromosomes. StSULTR genes were also annotated as StSULTR I-IV at group level, based mainly on their phylogenetic distribution with Arabidopsis SULTRs. Several tandem and segmental duplications were identified between StSULTR genes. Among these duplications, Ka/Ks ratios indicated the neutral nature of mutations that might not be causing any selection. Two segmental duplications and one tandem duplication were calculated to have occurred around 147.69, 180.80 and 191.00 million years ago (MYA), approximately corresponding to the time of monocot/dicot divergence. Two other segmental duplications were found to have occurred around 61.23 and 67.83 MYA, which is very close to the origination of monocotyledons. Most cis-regulatory elements in StSULTRs were found to be associated with major hormones (such as abscisic acid and methyl jasmonate), and with defense and stress responsiveness. The cis-element distribution in duplicated gene pairs indicated the contribution of duplication events in conferring neofunctionalization in StSULTR genes. Notably, RNAseq data analyses unveiled expression profiles of StSULTR genes under different stress conditions. In particular, expression profiles of StSULTR III members suggested their involvement in plant stress responses.
Additionally, gene co-expression networks of these group members included various regulatory factors, stress and signaling proteins, and housekeeping and some other proteins with unknown functions.
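The abstract above reports Ka/Ks ratios and duplication ages but not the formulas behind them. A minimal sketch of the usual calculations follows, assuming the common approximation T = Ks / (2λ). The substitution rate `rate`, the tolerance `tol`, and both function names are illustrative assumptions and are not taken from the study.

```python
def duplication_age_mya(ks, rate=6.5e-9):
    """Approximate age of a duplication in million years (MYA).

    ks   : synonymous substitutions per synonymous site between the two copies
    rate : assumed substitutions per synonymous site per year (hypothetical
           default; the abstract does not state which rate was used)
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    # Both copies accumulate changes independently, hence the factor of 2.
    return ks / (2 * rate) / 1e6


def selection_regime(ka, ks, tol=0.1):
    """Label a duplicate pair by its Ka/Ks ratio.

    tol is an assumed band around 1 that is treated as neutral.
    """
    if ks <= 0:
        raise ValueError("ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"
```

With these assumed defaults, a pair with Ks = 1.3 would be dated to roughly 100 MYA; a different assumed rate shifts every estimate proportionally.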
Gene Duplication and Evolutionary Innovations in Hemoglobin-Oxygen Transport
2016-01-01
During vertebrate evolution, duplicated hemoglobin (Hb) genes diverged with respect to functional properties as well as the developmental timing of expression. For example, the subfamilies of genes that encode the different subunit chains of Hb are ontogenetically regulated such that functionally distinct Hb isoforms are expressed during different developmental stages. In some vertebrate taxa, functional differentiation between co-expressed Hb isoforms may also contribute to physiologically important divisions of labor. PMID:27053736
Pantzartzi, Chrysoula N.; Drosopoulou, Elena; Scouras, Zacharias G.
2013-01-01
Hsp90s, members of the Heat Shock Protein class, protect the structure and function of proteins and play a significant role in cellular homeostasis and signal transduction. In order to determine the number of hsp90 gene copies and encoded proteins in fungal and animal lineages and, through that, the key duplication events that this family has undergone, we collected and evaluated Hsp90 protein sequences and corresponding Expressed Sequence Tags and analyzed available genomes from various taxa. We provide evidence for duplication events affecting either single species or wider taxonomic groups. With regard to Fungi, duplicated genes have been detected in several lineages. In invertebrates, we demonstrate key duplication events in certain clades of Arthropoda and Mollusca, and a possible gene loss event in a hymenopteran family. Finally, we infer that the duplication event responsible for the two (a and b) isoforms in vertebrates probably occurred shortly after the split of Hyperoartia and Gnathostomata. PMID:24066039
Drift diffusion model of reward and punishment learning in rare alpha-synuclein gene carriers.
Moustafa, Ahmed A; Kéri, Szabolcs; Polner, Bertalan; White, Corey
To understand the cognitive effects of alpha-synuclein polymorphism, we employed a drift diffusion model (DDM) to analyze reward- and punishment-guided probabilistic learning task data from participants with the rare alpha-synuclein gene duplication and from age- and education-matched controls. Overall, the DDM analysis showed that, relative to controls, asymptomatic alpha-synuclein gene duplication carriers had significantly increased learning from negative feedback, while they tended to show impaired learning from positive feedback. No significant differences were found in response caution, response bias, or motor/encoding time. We discuss the implications of these computational findings for understanding the neural mechanism of alpha-synuclein gene duplication.
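For readers unfamiliar with the model named above, the following is a minimal sketch of a generic drift diffusion trial, not the authors' fitting procedure. It assumes the standard mapping of response caution to boundary separation, response bias to the starting point, and motor/encoding time to non-decision time; the function name, parameter names, and default values are all hypothetical.

```python
import random


def simulate_ddm_trial(drift, boundary, start_frac=0.5, non_decision=0.3,
                       dt=0.001, noise=1.0, max_time=10.0, rng=None):
    """Simulate one trial of a two-boundary drift diffusion process.

    drift        : mean rate of evidence accumulation
    boundary     : separation between the two bounds (response caution)
    start_frac   : starting point as a fraction of boundary (response bias)
    non_decision : time added for encoding and motor execution, in seconds
    Returns (choice, reaction_time), or (None, None) if no bound is reached.
    """
    rng = rng or random.Random()
    evidence = start_frac * boundary
    elapsed = 0.0
    step_sd = noise * dt ** 0.5  # standard deviation of one Euler step
    while 0.0 < evidence < boundary:
        if elapsed >= max_time:
            return None, None  # trial timed out without a decision
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    choice = "upper" if evidence >= boundary else "lower"
    return choice, elapsed + non_decision
```

Repeating the simulation with a fixed seed, for example `random.Random(0)`, gives reproducible choice proportions and reaction-time distributions that can be compared across parameter settings.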
Duplicated growth hormone genes in a passerine bird, the jungle crow (Corvus macrorhynchos).
Arai, Natsumi; Iigo, Masayuki
2010-07-02
Molecular cloning, molecular phylogeny, gene structure and expression analyses of growth hormone (GH) were performed in a passerine bird, the jungle crow (Corvus macrorhynchos). Unexpectedly, duplicated GH cDNA and genes were identified and designated as GH1A and GH1B. In silico analyses identified the zebra finch orthologs. Both GH genes encode 217 amino acid residues and consist of five exons and four introns, spanning 5.2 kbp in GH1A and 4.2 kbp in GH1B. Predicted GH proteins of the jungle crow and zebra finch contain four conserved cysteine residues, suggesting that the duplicated GH genes are functional. Molecular phylogenetic analysis revealed that duplication of GH genes occurred after divergence of the passerine lineage from the other avian orders, as has been suggested from partial genomic DNA sequences of passerine GH genes. RT-PCR analyses confirmed expression of GH1A and GH1B in the pituitary gland. In addition, the GH1A gene is expressed in all the tissues examined, whereas expression of GH1B is confined to several brain areas and blood cells. These results indicate that the regulatory mechanisms of duplicated GH genes are different and that duplicated GH genes exert both endocrine and autocrine/paracrine functions.
Uchiumi, Fumiaki; Watanabe, Takeshi; Tanuma, Sei-ichi
2010-05-15
DNA helicases are important in the regulation of DNA transactions and thereby of various cellular functions. In this study, we developed a cost-effective multiple DNA transfection assay with DEAE-dextran reagent and analyzed the promoter activities of the human DNA helicases. The 5'-flanking regions of the human DNA helicase-encoding genes were isolated and subcloned into luciferase (Luc) expression plasmids. The plasmids were coated onto 96-well plates and used for co-transfection with a renilla-Luc expression vector into various cells, and dual-Luc assays were performed. The profiles of promoter activities depended on the cell lines used. Among these human DNA helicase genes, the XPB, RecQL5, and RTEL promoters were activated during TPA-induced HL-60 cell differentiation. Interestingly, duplicated ets (GGAA) elements are commonly located around the transcription start sites of these genes. The duplicated GGAA motifs are also found in the promoters of DNA replication/repair synthesis factor genes including PARG, ATR, TERC, and Rb1. Mutation analyses suggested that the duplicated GGAA motifs are necessary for the basal promoter activity in various cells and that some of them respond positively to TPA in HL-60 cells. The TPA-induced response of a 44-bp region in the RTEL promoter was attenuated by co-transfection of the PU.1 expression vector. These findings suggest that the duplicated ets motifs regulate the expression of DNA repair-associated genes during macrophage-like differentiation of HL-60 cells.
Lyons, Jonathan J; Yu, Xiaomin; Hughes, Jason D; Le, Quang T; Jamil, Ali; Bai, Yun; Ho, Nancy; Zhao, Ming; Liu, Yihui; O'Connell, Michael P; Trivedi, Neil N; Nelson, Celeste; DiMaggio, Thomas; Jones, Nina; Matthews, Helen; Lewis, Katie L; Oler, Andrew J; Carlson, Ryan J; Arkwright, Peter D; Hong, Celine; Agama, Sherene; Wilson, Todd M; Tucker, Sofie; Zhang, Yu; McElwee, Joshua J; Pao, Maryland; Glover, Sarah C; Rothenberg, Marc E; Hohman, Robert J; Stone, Kelly D; Caughey, George H; Heller, Theo; Metcalfe, Dean D; Biesecker, Leslie G; Schwartz, Lawrence B; Milner, Joshua D
2016-12-01
Elevated basal serum tryptase levels are present in 4-6% of the general population, but the cause and relevance of such increases are unknown. Previously, we described subjects with dominantly inherited elevated basal serum tryptase levels associated with multisystem complaints including cutaneous flushing and pruritus, dysautonomia, functional gastrointestinal symptoms, chronic pain, and connective tissue abnormalities, including joint hypermobility. Here we report the identification of germline duplications and triplications in the TPSAB1 gene encoding α-tryptase that segregate with inherited increases in basal serum tryptase levels in 35 families presenting with associated multisystem complaints. Individuals harboring alleles encoding three copies of α-tryptase had higher basal serum levels of tryptase and were more symptomatic than those with alleles encoding two copies, suggesting a gene-dose effect. Further, we found in two additional cohorts (172 individuals) that elevated basal serum tryptase levels were exclusively associated with duplication of α-tryptase-encoding sequence in TPSAB1, and affected individuals reported symptom complexes seen in our initial familial cohort. Thus, our findings link duplications in TPSAB1 with irritable bowel syndrome, cutaneous complaints, connective tissue abnormalities, and dysautonomia.
Genome-wide identification and evolution of the PIN-FORMED (PIN) gene family in Glycine max.
Liu, Yuan; Wei, Haichao
2017-07-01
Soybean (Glycine max) is one of the most important crop plants. Wild and cultivated soybean varieties have significant differences worth further investigation, such as plant morphology, seed size, and seed coat development; these characters may be related to auxin biology. The PIN gene family encodes essential transport proteins in cell-to-cell auxin transport, but little research on soybean PIN genes (GmPIN genes) has been done, especially with respect to the evolution and differences between wild and cultivated soybean. In this study, we retrieved 23 GmPIN genes from the latest updated G. max genome database; six GmPIN protein sequences were changed compared with the previous database. Based on the Plant Genome Duplication Database, 18 GmPIN genes have been involved in segment duplication. Three pairs of GmPIN genes arose after the second soybean genome duplication, and six occurred after the first genome duplication. The duplicated GmPIN genes retained similar expression patterns. All the duplicated GmPIN genes experienced purifying selection (Ka/Ks < 1) to prevent accumulation of non-synonymous mutations and thus remained more similar. In addition, we also focused on the artificial selection of the soybean PIN genes. Five artificially selected GmPIN genes were identified by comparing the genome sequence of 17 wild and 14 cultivated soybean varieties. Our research provides useful and comprehensive basic information for understanding GmPIN genes.
Schwarte, Sandra; Tiedemann, Ralph
2011-06-01
Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase; EC 4.1.1.39), the most abundant protein in nature, catalyzes the assimilation of CO2 (worldwide about 10^11 t each year) by carboxylation of ribulose-1,5-bisphosphate. It is a hexadecamer consisting of eight large and eight small subunits. Although the Rubisco large subunit (rbcL) is encoded by a single gene on the multicopy chloroplast genome, the Rubisco small subunits (rbcS) are encoded by a family of nuclear genes. In Arabidopsis thaliana, the rbcS gene family comprises four members, that is, rbcS-1a, rbcS-1b, rbcS-2b, and rbcS-3b. We sequenced all Rubisco genes in 26 worldwide distributed A. thaliana accessions. In three of these accessions, we detected a gene duplication/loss event, where rbcS-1b was lost and substituted by a duplicate of rbcS-2b (called rbcS-2b*). By screening 74 additional accessions using a specific polymerase chain reaction assay, we detected five additional accessions with this duplication/loss event. In summary, we found the gene duplication/loss in 8 of 100 A. thaliana accessions, namely, Bch, Bu, Bur, Cvi, Fei, Lm, Sha, and Sorbo. We sequenced an about 1-kb promoter region for all Rubisco genes as well. This analysis revealed that the gene duplication/loss event was associated with promoter alterations (two insertions of 450 and 850 bp, one deletion of 730 bp) in rbcS-2b and a promoter deletion (2.3 kb) in rbcS-2b* in all eight affected accessions. The substitution of rbcS-1b by a duplicate of rbcS-2b (i.e., rbcS-2b*) might be caused by gene conversion. All four Rubisco genes evolve under purifying selection, as expected for central genes of the highly conserved photosystem of green plants. We inferred a single positive selected site, a tyrosine to aspartic acid substitution at position 72 in rbcS-1b. Exactly the same substitution compromises carboxylase activity in the cyanobacterium Anacystis nidulans. In A. thaliana, this substitution is associated with an inferred recombination. Functional implications of the substitution remain to be evaluated.
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals.
Popova, Olga V; Mikhailov, Kirill V; Nikitin, Mikhail A; Logacheva, Maria D; Penin, Aleksey A; Muntyan, Maria S; Kedrova, Olga S; Petrov, Nikolai B; Panchin, Yuri V; Aleoshin, Vladimir V
2016-01-01
Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation, are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha, an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of the two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that the gene arrangement specific to Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia.
PMID:27755612
Kito, Keiji; Ito, Haruka; Nohara, Takehiro; Ohnishi, Mihoko; Ishibashi, Yuko; Takeda, Daisuke
2016-01-01
Omics analysis is a versatile approach for understanding the conservation and diversity of molecular systems across multiple taxa. In this study, we compared the proteome expression profiles of four yeast species (Saccharomyces cerevisiae, Saccharomyces mikatae, Kluyveromyces waltii, and Kluyveromyces lactis) grown on glucose- or glycerol-containing media. Conserved expression changes across all species were observed only for a small proportion of all proteins differentially expressed between the two growth conditions. Two Kluyveromyces species, both of which exhibited a high growth rate on glycerol, a nonfermentative carbon source, showed distinct species-specific expression profiles. In K. waltii grown on glycerol, proteins involved in the glyoxylate cycle and gluconeogenesis were expressed in high abundance. In K. lactis grown on glycerol, the expression of glycolytic and ethanol metabolic enzymes was unexpectedly low, whereas proteins involved in cytoplasmic translation, including ribosomal proteins and elongation factors, were highly expressed. These marked differences in the types of predominantly expressed proteins suggest that K. lactis optimizes the balance of proteome resource allocation between metabolism and protein synthesis giving priority to cellular growth. In S. cerevisiae, about 450 duplicate gene pairs were retained after whole-genome duplication. Intriguingly, we found that in the case of duplicates with conserved sequences, the total abundance of proteins encoded by a duplicate pair in S. cerevisiae was similar to that of protein encoded by nonduplicated ortholog in Kluyveromyces yeast. Given the frequency of haploinsufficiency, this observation suggests that conserved duplicate genes, even though minor cases of retained duplicates, do not exhibit a dosage effect in yeast, except for ribosomal proteins. 
Thus, comparative proteomic analyses across multiple species may reveal not only species-specific characteristics of metabolic processes under nonoptimal culture conditions but also provide valuable insights into intriguing biological principles, including the balance of proteome resource allocation and the role of gene duplication in evolutionary history. PMID:26560065
Origin and functional diversification of an amphibian defense peptide arsenal.
Roelants, Kim; Fry, Bryan G; Ye, Lumeng; Stijlemans, Benoit; Brys, Lea; Kok, Philippe; Clynen, Elke; Schoofs, Liliane; Cornelis, Pierre; Bossuyt, Franky
2013-01-01
The skin secretion of many amphibians contains an arsenal of bioactive molecules, including hormone-like peptides (HLPs) acting as defense toxins against predators, and antimicrobial peptides (AMPs) providing protection against infectious microorganisms. Several amphibian taxa seem to have independently acquired the genes to produce skin-secreted peptide arsenals, but it remains unknown how these originated from a non-defensive ancestral gene and evolved diverse defense functions against predators and pathogens. We conducted transcriptome, genome, peptidome and phylogenetic analyses to chart the full gene repertoire underlying the defense peptide arsenal of the frog Silurana tropicalis and reconstruct its evolutionary history. Our study uncovers a cluster of 13 transcriptionally active genes, together encoding up to 19 peptides, including diverse HLP homologues and AMPs. This gene cluster arose from a duplicated gastrointestinal hormone gene that attained a HLP-like defense function after major remodeling of its promoter region. Instead, new defense functions, including antimicrobial activity, arose by mutation of the precursor proteins, resulting in the proteolytic processing of secondary peptides alongside the original ones. Although gene duplication did not trigger functional innovation, it may have subsequently facilitated the convergent loss of the original function in multiple gene lineages (subfunctionalization), completing their transformation from HLP gene to AMP gene. The processing of multiple peptides from a single precursor entails a mechanism through which peptide-encoding genes may establish new functions without the need for gene duplication to avoid adaptive conflicts with older ones.
PMID:23935531
Marlétaz, Ferdinand; Maeso, Ignacio; Faas, Laura; Isaacs, Harry V; Holland, Peter W H
2015-08-01
The functional consequences of whole genome duplications in vertebrate evolution are not fully understood. It remains unclear, for instance, why paralogues were retained in some gene families but extensively lost in others. Cdx homeobox genes encode conserved transcription factors controlling posterior development across diverse bilaterians. These genes are part of the ParaHox gene cluster. Multiple Cdx copies were retained after genome duplication, raising questions about how functional divergence, overlap, and redundancy respectively contributed to their retention and evolutionary fate. We examined the degree of regulatory and functional overlap between the three vertebrate Cdx genes using single and triple morpholino knock-down in Xenopus tropicalis followed by RNA-seq. We found that one paralogue, Cdx4, has a much stronger effect on gene expression than the others, including a strong regulatory effect on FGF and Wnt genes. Functional annotation revealed distinct and overlapping roles and subtly different temporal windows of action for each gene. The data also reveal a colinear-like effect of Cdx genes on Hox genes, with repression of Hox paralogy groups 1 and 2, and activation increasing from Hox group 5 to 11. We also highlight cases in which duplicated genes regulate distinct paralogous targets revealing pathway elaboration after whole genome duplication. Despite shared core pathways, Cdx paralogues have acquired distinct regulatory roles during development. This implies that the degree of functional overlap between paralogues is relatively low and that gene expression pattern alone should be used with caution when investigating the functional evolution of duplicated genes. We therefore suggest that developmental programmes were extensively rewired after whole genome duplication in the early evolution of vertebrates.
Gene duplication in the major insecticide target site, Rdl, in Drosophila melanogaster
Remnant, Emily J.; Good, Robert T.; Schmidt, Joshua M.; Lumb, Christopher; Robin, Charles; Daborn, Phillip J.; Batterham, Philip
2013-01-01
The Resistance to Dieldrin gene, Rdl, encodes a GABA-gated chloride channel subunit that is targeted by cyclodiene and phenylpyrazole insecticides. The gene was first characterized in Drosophila melanogaster by genetic mapping of resistance to the cyclodiene dieldrin. The 4,000-fold resistance observed was due to a single amino acid replacement, Ala301 to Ser. The equivalent change was subsequently identified in Rdl orthologs of a large range of resistant insect species. Here, we report identification of a duplication at the Rdl locus in D. melanogaster. The 113-kb duplication contains one WT copy of Rdl and a second copy with two point mutations: an Ala301 to Ser resistance mutation and Met360 to Ile replacement. Individuals with this duplication exhibit intermediate dieldrin resistance compared with single copy Ser301 homozygotes, reduced temperature sensitivity, and altered RNA editing associated with the resistant allele. Ectopic recombination between Roo transposable elements is involved in generating this genomic rearrangement. The duplication phenotypes were confirmed by construction of a transgenic, artificial duplication integrating the 55.7-kb Rdl locus with a Ser301 change into an Ala301 background. Gene duplications can contribute significantly to the evolution of insecticide resistance, most commonly by increasing the amount of gene product produced. Here however, duplication of the Rdl target site creates permanent heterozygosity, providing unique potential for adaptive mutations to accrue in one copy, without abolishing the endogenous role of an essential gene. PMID:23959864
Yin, Guangjun; Xu, Hongliang; Xiao, Shuyang; Qin, Yajuan; Li, Yaxuan; Yan, Yueming; Hu, Yingkao
2013-10-03
WRKY genes encode one of the most abundant groups of transcription factors in higher plants, and their members regulate important biological processes such as growth, development, and responses to biotic and abiotic stresses. Although the soybean genome sequence has been published, functional studies on soybean genes still lag behind those of other species. We identified a total of 133 WRKY members in the soybean genome. According to structural features of their encoded proteins and to the phylogenetic tree, the soybean WRKY family could be classified into three groups (groups I, II, and III). A majority of WRKY genes (76.7%; 102 of 133) were segmentally duplicated and 13.5% (18 of 133) of the genes were tandemly duplicated. This pattern was not apparent in Arabidopsis or rice. The transcriptome atlas revealed notable differential expression in either transcript abundance or in expression patterns under normal growth conditions, which indicated wide functional divergence in this family. Furthermore, some critical amino acids were detected using DIVERGE v2.0 in specific comparisons, suggesting that these sites have contributed to functional divergence among groups or subgroups. In addition, site model and branch-site model analyses of positive Darwinian selection (PDS) showed that different selection regimes could have affected the evolution of these groups. Sites with high probabilities of having been under PDS were found in groups I, IIc, IIe, and III. In this work, all the WRKY genes, which were generated mainly through segmental duplication, were identified in the soybean genome. Moreover, differential expression and functional divergence of the duplicated WRKY genes were two major features of this family throughout their evolutionary history. Positive selection analysis revealed that the different groups have different evolutionary rates. Together, these results contribute to a detailed understanding of the molecular evolution of the WRKY gene family in soybean.
Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.
2016-01-01
Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576
Le, Thong Minh; Le, Quy Van Chanh; Truong, Dung Minh; Lee, Hye-Jeong; Choi, Min-Kyeung; Cho, Hyesun; Chung, Hak-Jae; Kim, Jin-Hoi; Do, Jeong-Tae; Song, Hyuk; Park, Chankyu
2017-01-01
Several β2-microglobulin (B2M)-bound protein complexes undertake key roles in various immune system pathways, including the neonatal Fc receptor (FcRn), cluster of differentiation 1 (CD1) protein, non-classical major histocompatibility complex (MHC), and well-known MHC class I molecules. Therefore, the duplication of B2M may lead to an increase in the biological competence of organisms to the environment. Based on the pig genome assembly SSC10.2, a segmental duplication of ~45.5 kb, encoding the entire B2M protein, was identified in pig chromosome 1. Through experimental validation, we confirmed the functional duplication of the B2M gene with a completely identical coding sequence between two copies in pigs. Considering the importance of B2M in the immune system, we performed the phylogenetic analysis of B2M duplication in ten mammalian species, confirming the presence of B2M duplication in cetartiodactyls, like cattle, sheep, goats, pigs and whales, but not in non-cetartiodactyl species, like mice, cats, dogs, horses, and humans. The density of long interspersed nuclear element (LINE) at the edges of duplicated blocks (39 to 66%) was found to be 2 to 3-fold higher than the average (20.12%) of the pig genome, suggesting its role in the duplication event. The B2M mRNA expression level in pigs was 12.71 and 7.57 times (2^-ΔΔCt values) higher than in humans and mice, respectively. However, we were unable to experimentally demonstrate the difference in the level of B2M protein because species-specific anti-B2M antibodies are not available. We reported, for the first time, the functional duplication of the B2M gene in animals. The identification of partially remaining duplicated B2M sequences in the genomes of only cetartiodactyls indicates that the event was lineage specific.
B2M duplication could be beneficial to the immune system of pigs by increasing the availability of MHC class I light chain protein, B2M, to complex with the proteins encoded by the relatively large number of MHC class I heavy chain genes in pigs. Further studies are necessary to address the biological meaning of increased expression of B2M.
Extensive concerted evolution of rice paralogs and the road to regaining independence.
Wang, Xiyin; Tang, Haibao; Bowers, John E; Feltus, Frank A; Paterson, Andrew H
2007-11-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the approximately 0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, approximately 8% of japonica paralogs produced 5-7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while approximately 70-MY-old "paleologs" resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice-sorghum divergence approximately 41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity--that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5-7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization.
Djogbénou, Luc S.; Berthomieu, Arnaud; Makoundou, Patrick; Baba-Moussa, Lamine S.; Fiston-Lavier, Anna-Sophie; Belkhir, Khalid; Labbé, Pierrick; Weill, Mylène
2016-01-01
Gene copy-number variations are widespread in natural populations, but investigating their phenotypic consequences requires contemporary duplications under selection. Such duplications have been found at the ace-1 locus (encoding the organophosphate and carbamate insecticides’ target) in the mosquito Anopheles gambiae (the major malaria vector); recent studies have revealed their intriguing complexity, consistent with the involvement of various numbers and types (susceptible or resistant to insecticide) of copies. We used an integrative approach, from genome to phenotype level, to investigate the influence of duplication architecture and gene-dosage on mosquito fitness. We found that both heterogeneous (i.e., one susceptible and one resistant ace-1 copy) and homogeneous (i.e., identical resistant copies) duplications segregated in field populations. The number of copies in homogeneous duplications was variable and positively correlated with acetylcholinesterase activity and resistance level. Determining the genomic structure of the duplicated region revealed that, in both types of duplication, ace-1 and 11 other genes formed tandem 203kb amplicons. We developed a diagnostic test for duplications, which showed that ace-1 was amplified in all 173 resistant mosquitoes analyzed (field-collected in several African countries), in heterogeneous or homogeneous duplications. Each type was associated with different fitness trade-offs: heterogeneous duplications conferred an intermediate phenotype (lower resistance and fitness costs), whereas homogeneous duplications tended to increase both resistance and fitness cost, in a complex manner. The type of duplication selected seemed thus to depend on the intensity and distribution of selection pressures. This versatility of trade-offs available through gene duplication highlights the importance of large mutation events in adaptation to environmental variation. 
This impressive adaptability could have a major impact on vector control in Africa. PMID:27918584
Gautier, Philippe; Loosli, Felix; Tay, Boon-Hui; Tay, Alice; Murdoch, Emma; Coutinho, Pedro; van Heyningen, Veronica; Brenner, Sydney; Venkatesh, Byrappa; Kleinjan, Dirk A.
2013-01-01
Pax6 is a developmental control gene essential for eye development throughout the animal kingdom. In addition, Pax6 plays key roles in other parts of the CNS, olfactory system, and pancreas. In mammals a single Pax6 gene encoding multiple isoforms delivers these pleiotropic functions. Here we provide evidence that the genomes of many other vertebrate species contain multiple Pax6 loci. We sequenced Pax6-containing BACs from the cartilaginous elephant shark (Callorhinchus milii) and found two distinct Pax6 loci. Pax6.1 is highly similar to mammalian Pax6, while Pax6.2 encodes a paired-less Pax6. Using synteny relationships, we identify homologs of this novel paired-less Pax6.2 gene in lizard and in frog, as well as in zebrafish and in other teleosts. In zebrafish two full-length Pax6 duplicates were known previously, originating from the fish-specific genome duplication (FSGD) and expressed in divergent patterns due to paralog-specific loss of cis-elements. We show that teleosts other than zebrafish also maintain duplicate full-length Pax6 loci, but differences in gene and regulatory domain structure suggest that these Pax6 paralogs originate from a more ancient duplication event and are hence renamed as Pax6.3. Sequence comparisons between mammalian and elephant shark Pax6.1 loci highlight the presence of short- and long-range conserved noncoding elements (CNEs). Functional analysis demonstrates the ancient role of long-range enhancers for Pax6 transcription. We show that the paired-less Pax6.2 ortholog in zebrafish is expressed specifically in the developing retina. Transgenic analysis of elephant shark and zebrafish Pax6.2 CNEs with homology to the mouse NRE/Pα internal promoter revealed highly specific retinal expression. Finally, morpholino depletion of zebrafish Pax6.2 resulted in a “small eye” phenotype, supporting a role in retinal development. 
In summary, our study reveals that the pleiotropic functions of Pax6 in vertebrates are served by a divergent family of Pax6 genes, forged by ancient duplication events and by independent, lineage-specific gene losses. PMID:23359656
Evolution and functional divergence of NLRP genes in mammalian reproductive systems
2009-01-01
Background: NLRPs (Nucleotide-binding oligomerization domain, Leucine rich Repeat and Pyrin domain containing Proteins) are members of the NLR (Nod-like receptor) protein family. Recent research has shown that NLRP genes play important roles in both the mammalian innate immune system and the reproductive system. Several NLRP genes were shown to be specifically expressed in the oocyte in mammals. The aim of the present work was to study how these genes evolved and diverged after their duplication, as well as whether natural selection played a role during their evolution. Results: By using in silico methods, we have evaluated the evolution and functional divergence of NLRP genes, in particular of mouse reproduction-related Nlrp genes. We found that (1) major NLRP genes were duplicated before the divergence of mammals, with certain lineage-specific duplications in primates (NLRP7 and 11) and in rodents (Nlrp1, 4 and 9 duplicates); (2) tandem duplication events gave rise to a mammalian reproduction-related NLRP cluster including NLRP2, 4, 5, 7, 8, 9, 11, 13 and 14 genes; (3) the function of mammalian oocyte-specific NLRP genes (NLRP4, 5, 9 and 14) might have diverged during gene evolution; (4) recent segmental duplications involving Nlrp4 copies and vomeronasal 1 receptor encoding genes (V1r) have occurred in the mouse; and (5) duplicates of Nlrp4 and 9 in the mouse might have been subjected to adaptive evolution. Conclusion: This study provides novel information on the evolution of mammalian reproduction-related NLRPs. On the one hand, NLRP genes duplicated and functionally diversified in mammalian reproductive systems (such as NLRP4, 5, 9 and 14). On the other hand, during evolution, different lineages adapted to develop their own NLRP genes, particularly in reproductive function (such as the specific expansion of Nlrp4 and Nlrp9 in the mouse). PMID:19682372
Marandel, Lucie; Panserat, Stéphane; Plagnes-Juan, Elisabeth; Arbenoits, Eva; Soengas, José Luis; Bobe, Julien
2017-05-02
Glucose-6-phosphatase (G6pc) is a key enzyme involved in the regulation of glucose homeostasis. The present study aims at revisiting and clarifying the evolutionary history of g6pc genes in vertebrates. g6pc duplications happened by successive rounds of whole genome duplication that occurred during vertebrate evolution. g6pc duplicated before or around Osteichthyes/Chondrichthyes radiation, giving rise to g6pca and g6pcb as a consequence of the second vertebrate whole genome duplication. g6pca was lost after this duplication in Sarcopterygii whereas both g6pca and g6pcb then duplicated as a consequence of the teleost-specific whole genome duplication. One g6pca duplicate was lost after this duplication in teleosts. Similarly one g6pcb2 duplicate was lost at least in the ancestor of percomorpha. The analysis of the evolution of spatial expression patterns of g6pc genes in vertebrates showed that all g6pc were mainly expressed in intestine and liver whereas teleost-specific g6pcb2 genes were mainly and surprisingly expressed in brain and heart. g6pcb2b, one gene previously hypothesised to be involved in the glucose intolerant phenotype in trout, was unexpectedly up-regulated (as it was in liver) by carbohydrates in trout telencephalon without showing significant changes in other brain regions. This up-regulation is in striking contrast with expected glucosensing mechanisms suggesting that its positive response to glucose relates to specific unknown processes in this brain area. Our results suggested that the fixation and the divergence of g6pc duplicated genes during vertebrates' evolution may lead to adaptive novelty and probably to the emergence of novel phenotypes related to glucose homeostasis.
Tornow, J; Santangelo, G M
1994-06-01
A duplicate copy of the RPL37A gene (encoding ribosomal protein L37) was cloned and sequenced. The coding region of RPL37B is very similar to that of RPL37A, with only one conservative amino-acid difference. However, the intron and flanking sequences of the two genes are extremely dissimilar. Disruption experiments indicate that the two loci are not functionally equivalent: disruption of RPL37B had no significant effect, but disruption of RPL37A severely impaired the growth rate of the cell. When both RPL37 loci are disrupted, the cell is unable to grow at all, indicating that rpL37 is an essential protein. The functional disparity between the two RPL37 loci could be explained by differential gene expression. The results of two experiments support this idea: gene fusion of RPL37A to a reporter gene resulted in six-fold higher mRNA levels than were generated by the same reporter gene fused to RPL37B, and a modest increase in gene dosage of RPL37B overcame the lack of a functional RPL37A gene.
Peñalosa-Ruiz, Georgina; Aranda, Cristina; Ongay-Larios, Laura; Colon, Maritrini; Quezada, Hector; Gonzalez, Alicia
2012-01-01
Background: Gene duplication and the subsequent divergence of paralogous pairs play a central role in the evolution of novel gene functions. S. cerevisiae possesses two paralogous genes (ALT1/ALT2) which presumably encode alanine aminotransferases. It has been previously shown that ALT1 encodes an alanine aminotransferase, involved in alanine metabolism; however the physiological role of Alt2 is not known. Here we investigate whether ALT2 encodes an active alanine aminotransferase. Principal Findings: Our results show that although ALT1 and ALT2 encode 65% identical proteins, only Alt1 displays alanine aminotransferase activity; in contrast ALT2 encodes a catalytically inert protein. ALT1 and ALT2 expression is modulated by Nrg1 and by the intracellular alanine pool. ALT1 is alanine-induced showing a regulatory profile of a gene encoding an enzyme involved in amino acid catabolism, in agreement with the fact that Alt1 is the sole pathway for alanine catabolism present in S. cerevisiae. Conversely, ALT2 expression is alanine-repressed, indicating a role in alanine biosynthesis, although the encoded protein has no alanine aminotransferase enzymatic activity. In the ancestral-like yeast L. kluyveri, the alanine aminotransferase activity was higher in the presence of alanine than in the presence of ammonium, suggesting that as for ALT1, LkALT1 expression could be alanine-induced. ALT2 retention poses the questions of whether the encoded protein plays a particular function, and if this function was present in the ancestral gene. It could be hypothesized that ALT2 diverged after duplication, through neo-functionalization or that ALT2 function was present in the ancestral gene, with a yet undiscovered function. Conclusions: ALT1 and ALT2 divergence has resulted in delegation of alanine aminotransferase activity to Alt1. These genes display opposed regulatory profiles: ALT1 is alanine-induced, while ALT2 is alanine-repressed.
Both genes are negatively regulated by the Nrg1 repressor. Presented results indicate that alanine could act as ALT2 Nrg1-co-repressor. PMID:23049841
Arya, Preeti; Kumar, Gulshan; Acharya, Vishal; Singh, Anil K.
2014-01-01
Nucleotide binding site leucine-rich repeat (NBS-LRR) disease resistance proteins play an important role in plant defense against pathogen attack. A number of recent studies have been carried out to identify and characterize NBS-LRR gene families in many important plant species. In this study, we identified an NBS-LRR gene family comprising 1015 NBS-LRRs using highly stringent computational methods. These NBS-LRRs were characterized on the basis of conserved protein motifs, gene duplication events, chromosomal locations, phylogenetic relationships and digital gene expression analysis. Surprisingly, an equal (1∶1) distribution of Toll/interleukin-1 receptor (TIR) and coiled-coil (CC) types was detected in apple, whereas an unequal distribution has been reported in the majority of other plant genome studies. Prediction of gene duplication events intriguingly revealed that not only tandem duplication but also segmental duplication may equally be responsible for the expansion of the apple NBS-LRR gene family. Gene expression profiling using the expressed sequence tag database of apple and quantitative real-time PCR (qRT-PCR) revealed the expression of these genes in a wide range of tissues and disease conditions, respectively. Taken together, this study will provide a blueprint for future efforts towards improvement of disease resistance in apple. PMID:25232838
Assessing duplication and loss of APETALA1/FRUITFULL homologs in Ranunculales
Pabón-Mora, Natalia; Hidalgo, Oriane; Gleissberg, Stefan; Litt, Amy
2013-01-01
Gene duplication and loss provide raw material for evolutionary change within organismal lineages as functional diversification of gene copies provides a mechanism for phenotypic variation. Here we focus on the APETALA1/FRUITFULL MADS-box gene lineage evolution. AP1/FUL genes are angiosperm-specific and have undergone several duplications. By far the most significant one is the core-eudicot duplication resulting in the euAP1 and euFUL clades. Functional characterization of several euAP1 and euFUL genes has shown that both function in proper floral meristem identity, and axillary meristem repression. Independently, euAP1 genes function in floral meristem and sepal identity, whereas euFUL genes control phase transition, cauline leaf growth, compound leaf morphogenesis and fruit development. Significant functional variation has been detected among pre-duplication basal-eudicot FUL-like genes, but the underlying mechanisms for change have not been identified. FUL-like genes in the Papaveraceae encode all functions reported for euAP1 and euFUL genes, whereas FUL-like genes in Aquilegia (Ranunculaceae) function in inflorescence development and leaf complexity, but not in flower or fruit development. Here we isolated FUL-like genes across the Ranunculales and used phylogenetic approaches to analyze their evolutionary history. We identified an early duplication resulting in the RanFL1 and RanFL2 clades. RanFL1 genes were present in all the families sampled and are mostly under strong negative selection in the MADS, I and K domains. RanFL2 genes were only identified from Eupteleaceae, Papaveraceae s.l., Menispermaceae and Ranunculaceae and show relaxed purifying selection at the I and K domains. We discuss how asymmetric sequence diversification, new motifs, differences in codon substitutions and likely protein-protein interactions resulting from this Ranunculiid-specific duplication can help explain the functional differences among basal-eudicot FUL-like genes.
PMID:24062757
Sagara, N; Kirikoshi, H; Terasaki, H; Yasuhiko, Y; Toda, G; Shiokawa, K; Katoh, M
2001-04-06
Frizzled-1 (FZD1)-FZD10 are seven-transmembrane-type WNT receptors, and SFRP1-SFRP5 are soluble-type WNT antagonists. These molecules are encoded by mutually distinct genes. We have previously isolated and characterized the 7.7-kb FZD4 mRNA, encoding a seven-transmembrane receptor with the extracellular cysteine-rich domain (CRD). Here, we have cloned and characterized FZD4S, a splicing variant of the FZD4 gene. FZD4S, corresponding to the 10.0-kb FZD4 mRNA, consisted of exon 1, intron 1, and exon 2 of the FZD4 gene. FZD4S encoded a soluble-type polypeptide with the N-terminal part of CRD, and was expressed in human fetal kidney. Injection of synthetic FZD4S mRNA into the ventral marginal zone of Xenopus embryos at the 4-cell stage did not induce axis duplication by itself, but augmented the axis duplication potential of coinjected Xwnt-8 mRNA. These results indicate that the FZD4 gene gives rise to soluble-type FZD4S as well as seven-transmembrane-type FZD4 due to alternative splicing, and strongly suggest that FZD4S plays a role as a positive regulator of the WNT signaling pathway. Copyright 2001 Academic Press.
Jonas, V; Lin, C R; Kawashima, E; Semon, D; Swanson, L W; Mermod, J J; Evans, R M; Rosenfeld, M G
1985-01-01
Two mRNAs generated as a consequence of alternative RNA processing events in expression of the human calcitonin gene encode the protein precursors of either calcitonin or calcitonin gene-related peptide (CGRP). Both calcitonin and CGRP RNAs and their encoded peptide products are expressed in the human pituitary and in medullary thyroid tumors. On the basis of sequence comparison, it is suggested that both the calcitonin and CGRP exons arose from a common primordial sequence, suggesting that duplication and rearrangement events are responsible for the generation of this complex transcription unit. PMID:3872459
Multiple conversion between the genes encoding bacterial class-I release factors
Ishikawa, Sohta A.; Kamikawa, Ryoma; Inagaki, Yuji
2015-01-01
Bacteria require two class-I release factors, RF1 and RF2, that recognize stop codons and promote peptide release from the ribosome. RF1 and RF2 were most likely established through gene duplication followed by alteration of their stop codon specificities in the common ancestor of extant bacteria. This scenario predicts that the two RF gene families have taken independent evolutionary trajectories after the ancestral gene duplication event. However, we here report two independent cases of conversion between RF1 and RF2 genes (RF1-RF2 gene conversion), which were rigorously examined by procedures incorporating the maximum-likelihood phylogenetic method. In both cases, RF1-RF2 gene conversion was predicted to occur in the region encoding nearly the entire domain 3, whose functions are common between RF paralogues. Nevertheless, the ‘direction’ of gene conversion appeared to be opposite in the two cases: from the RF2 gene to the RF1 gene in one case, and from the RF1 gene to the RF2 gene in the other. The two cases of RF1-RF2 gene conversion prompt us to propose two novel aspects in the evolution of bacterial class-I release factors: (i) domain 3 is interchangeable between RF paralogues, and (ii) RF1-RF2 gene conversion has occurred frequently in bacterial genome evolution. PMID:26257102
Adomako-Ankomah, Yaw; English, Elizabeth D.; Danielson, Jeffrey J.; Pernas, Lena F.; Parker, Michelle L.; Boulanger, Martin J.; Dubey, Jitender P.; Boyle, Jon P.
2016-01-01
In Toxoplasma gondii, an intracellular parasite of humans and other animals, host mitochondrial association (HMA) is driven by a gene family that encodes multiple mitochondrial association factor 1 (MAF1) proteins. However, the importance of MAF1 gene duplication in the evolution of HMA is not understood, nor is the impact of HMA on parasite biology. Here we used within- and between-species comparative analysis to determine that the MAF1 locus is duplicated in T. gondii and its nearest extant relative Hammondia hammondi, but not another close relative, Neospora caninum. Using cross-species complementation, we determined that the MAF1 locus harbors multiple distinct paralogs that differ in their ability to mediate HMA, and that only T. gondii and H. hammondi harbor HMA+ paralogs. Additionally, we found that exogenous expression of an HMA+ paralog in T. gondii strains that do not normally exhibit HMA provides a competitive advantage over their wild-type counterparts during a mouse infection. These data indicate that HMA likely evolved by neofunctionalization of a duplicate MAF1 copy in the common ancestor of T. gondii and H. hammondi, and that the neofunctionalized gene duplicate is selectively advantageous. PMID:26920761
Stankiewicz, Paweł; Kulkarni, Shashikant; Dharmadhikari, Avinash V.; Sampath, Srirangan; Bhatt, Samarth S.; Shaikh, Tamim H.; Xia, Zhilian; Pursley, Amber N.; Cooper, M. Lance; Shinawi, Marwan; Paciorkowski, Alex R.; Grange, Dorothy K.; Noetzel, Michael J.; Saunders, Scott; Simons, Paul; Summar, Marshall; Lee, Brendan; Scaglia, Fernando; Fellmann, Florence; Martinet, Danielle; Beckmann, Jacques S.; Asamoah, Alexander; Platky, Kathryn; Sparks, Susan; Martin, Ann S.; Madan-Khetarpal, Suneeta; Hoover, Jacqueline; Medne, Livija; Bonnemann, Carsten G.; Moeschler, John B.; Vallee, Stephanie E.; Parikh, Sumit; Irwin, Polly; Dalzell, Victoria P.; Smith, Wendy E.; Banks, Valerie C.; Flannery, David B.; Lovell, Carolyn M.; Bellus, Gary A.; Golden-Grant, Kathryn; Gorski, Jerome L.; Kussmann, Jennifer L.; McGregor, Tracy L.; Hamid, Rizwan; Pfotenhauer, Jean; Ballif, Blake C.; Shaw, Chad A.; Kang, Sung-Hae L.; Bacino, Carlos A.; Patel, Ankita; Rosenfeld, Jill A.; Cheung, Sau Wai; Shaffer, Lisa G.
2013-01-01
We report 24 unrelated individuals with deletions and 17 additional cases with duplications at 10q11.21q21.1 identified by chromosomal microarray analysis. The rearrangements range in size from 0.3 to 12 Mb. Nineteen of the deletions and eight duplications are flanked by large, directly oriented segmental duplications of >98% sequence identity, suggesting that nonallelic homologous recombination (NAHR) caused these genomic rearrangements. Nine individuals with deletions and five with duplications have additional copy number changes. Detailed clinical evaluation of 20 patients with deletions revealed variable clinical features, with developmental delay (DD) and/or intellectual disability (ID) as the only features common to a majority of individuals. We suggest that some of the other features present in more than one patient with deletion, including hypotonia, sleep apnea, chronic constipation, gastroesophageal and vesicoureteral refluxes, epilepsy, ataxia, dysphagia, nystagmus, and ptosis may result from deletion of the CHAT gene, encoding choline acetyltransferase, and the SLC18A3 gene, mapping in the first intron of CHAT and encoding vesicular acetylcholine transporter. The phenotypic diversity and presence of the deletion in apparently normal carrier parents suggest that subjects carrying 10q11.21q11.23 deletions may exhibit variable phenotypic expressivity and incomplete penetrance influenced by additional genetic and nongenetic modifiers. PMID:21948486
Philip Stewart; Daniel Cullen
1999-06-01
The lignin peroxidases of Phanerochaete chrysosporium are encoded by a minimum of 10 closely related genes. Physical and genetic mapping of a cluster of eight lip genes revealed six genes occurring in pairs and transcriptionally convergent, suggesting that portions of the lip family arose by gene duplication events. The completed sequence of lipG and lipJ, together...
Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence
Wang, Xiyin; Tang, Haibao; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.
2007-01-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the ∼0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, ∼8% of japonica paralogs produced 5–7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while ∼70-MY-old “paleologs” resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice–sorghum divergence ∼41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity—that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5–7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization. PMID:18039882
Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P
1993-01-01
The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521
2013-01-01
Background: WRKY genes encode one of the most abundant groups of transcription factors in higher plants, and its members regulate important biological processes such as growth, development, and responses to biotic and abiotic stresses. Although the soybean genome sequence has been published, functional studies on soybean genes still lag behind those of other species. Results: We identified a total of 133 WRKY members in the soybean genome. According to structural features of their encoded proteins and to the phylogenetic tree, the soybean WRKY family could be classified into three groups (groups I, II, and III). A majority of WRKY genes (76.7%; 102 of 133) were segmentally duplicated and 13.5% (18 of 133) of the genes were tandemly duplicated. This pattern was not apparent in Arabidopsis or rice. The transcriptome atlas revealed notable differential expression in either transcript abundance or in expression patterns under normal growth conditions, which indicated wide functional divergence in this family. Furthermore, some critical amino acids were detected using DIVERGE v2.0 in specific comparisons, suggesting that these sites have contributed to functional divergence among groups or subgroups. In addition, site model and branch-site model analyses of positive Darwinian selection (PDS) showed that different selection regimes could have affected the evolution of these groups. Sites with high probabilities of having been under PDS were found in groups I, II c, II e, and III. Together, these results contribute to a detailed understanding of the molecular evolution of the WRKY gene family in soybean. Conclusions: In this work, all the WRKY genes, which were generated mainly through segmental duplication, were identified in the soybean genome. Moreover, differential expression and functional divergence of the duplicated WRKY genes were two major features of this family throughout their evolutionary history.
Positive selection analysis revealed that the different groups have different evolutionary rates. Together, these results contribute to a detailed understanding of the molecular evolution of the WRKY gene family in soybean. PMID:24088323
Qiu, Wen-Ming; Li, Jing; Zhou, Hui; Zhang, Qiong; Guo, Wenwu; Zhu, Tingting; Peng, Junhua; Sun, Fengjie; Li, Shaohua; Korban, Schuyler S.; Han, Yuepeng
2012-01-01
Starch is one of the major components of cereals, tubers, and fruits. Genes encoding granule-bound starch synthase (GBSS), which is responsible for amylose synthesis, have been extensively studied in cereals but little is known about them in fruits. Due to their low gene copy number, GBSS genes have been used to study plant phylogenetic and evolutionary relationships. In this study, GBSS genes have been isolated and characterized in three fruit trees: apple, peach, and orange. Moreover, a comprehensive evolutionary study of GBSS genes has also been conducted across monocots and eudicots. Results have revealed that genomic structures of GBSS genes in plants are conserved, suggesting they all have evolved from a common ancestor. In addition, the GBSS gene in an ancestral angiosperm must have undergone genome duplication ∼251 million years ago (MYA) to generate two families, GBSSI and GBSSII. Both GBSSI and GBSSII are found in monocots; however, GBSSI is absent in eudicots. The ancestral GBSSII must have undergone further divergence when monocots and eudicots split ∼165 MYA. This is consistent with expression profiles of GBSS genes, wherein these profiles are more similar to those of GBSSII in eudicots than to those of GBSSI genes in monocots. In dicots, GBSSII must have undergone further divergence when rosids and asterids split from each other ∼126 MYA. Taken together, these findings suggest that it is GBSSII rather than GBSSI of monocots that has orthologous relationships with GBSS genes of eudicots. Moreover, diversification of GBSS genes is mainly associated with genome-wide duplication events throughout the evolutionary history of monocots and eudicots. PMID:22291904
Stø, Ida M.; Orr, Russell J. S.; Fooyontphanich, Kim; Jin, Xu; Knutsen, Jonfinn M. B.; Fischer, Urs; Tranbarger, Timothy J.; Nordal, Inger; Aalen, Reidunn B.
2015-01-01
The peptide INFLORESCENCE DEFICIENT IN ABSCISSION (IDA), which signals through the leucine-rich repeat receptor-like kinases HAESA (HAE) and HAESA-LIKE2 (HSL2), controls different cell separation events in Arabidopsis thaliana. We hypothesize the involvement of this signaling module in abscission processes in other plant species even though they may shed other organs than A. thaliana. As the first step toward testing this hypothesis from an evolutionary perspective we have identified genes encoding putative orthologs of IDA and its receptors by BLAST searches of publicly available protein, nucleotide and genome databases for angiosperms. Genes encoding IDA or IDA-LIKE (IDL) peptides and HSL proteins were found in all investigated species, which were selected to represent each angiosperm order with available genomic sequences. The 12 amino acids representing the bioactive peptide in A. thaliana have remained virtually unchanged throughout the evolution of the angiosperms; however, the number of IDL and HSL genes varies between different orders and species. The phylogenetic analyses suggest that IDA, HSL2, and the related HSL1 gene were present in the species that gave rise to the angiosperms. HAE has arisen from HSL1 after a genome duplication that took place after the monocot-eudicot split. HSL1 has also independently been duplicated in the monocots, while HSL2 has been lost in gingers (Zingiberales) and grasses (Poales). IDA has been duplicated in eudicots to give rise to functionally divergent IDL peptides. We postulate that the high number of IDL homologs present in the core eudicots is a result of multiple whole genome duplications (WGD). We substantiate the involvement of IDA and HAE/HSL2 homologs in abscission by providing gene expression data of different organ separation events from various species. PMID:26579174
Zimmer, Christoph T; Garrood, William T; Singh, Kumar Saurabh; Randall, Emma; Lueke, Bettina; Gutbrod, Oliver; Matthiesen, Svend; Kohler, Maxie; Nauen, Ralf; Davies, T G Emyr; Bass, Chris
2018-01-22
Gene duplication is a major source of genetic variation that has been shown to underpin the evolution of a wide range of adaptive traits [1, 2]. For example, duplication or amplification of genes encoding detoxification enzymes has been shown to play an important role in the evolution of insecticide resistance [3-5]. In this context, gene duplication performs an adaptive function as a result of its effects on gene dosage and not as a source of functional novelty [3, 6-8]. Here, we show that duplication and neofunctionalization of a cytochrome P450, CYP6ER1, led to the evolution of insecticide resistance in the brown planthopper. Considerable genetic variation was observed in the coding sequence of CYP6ER1 in populations of brown planthopper collected from across Asia, but just two sequence variants are highly overexpressed in resistant strains and metabolize imidacloprid. Both variants are characterized by profound amino-acid alterations in substrate recognition sites, and the introduction of these mutations into a susceptible P450 sequence is sufficient to confer resistance. CYP6ER1 is duplicated in resistant strains with individuals carrying paralogs with and without the gain-of-function mutations. Despite numerical parity in the genome, the susceptible and mutant copies exhibit marked asymmetry in their expression with the resistant paralogs overexpressed. In the primary resistance-conferring CYP6ER1 variant, this results from an extended region of novel sequence upstream of the gene that provides enhanced expression. Our findings illustrate the versatility of gene duplication in providing opportunities for functional and regulatory innovation during the evolution of an adaptive trait. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
2008-01-01
Background: The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results: Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion: We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification. PMID:18269759
Laukaitis, Christina M; Heger, Andreas; Blakley, Tyler D; Munclinger, Pavel; Ponting, Chris P; Karn, Robert C
2008-02-12
The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) alpha, beta and gamma subunits. Further investigation of 14 alpha-like (Abpa) and 13 beta- or gamma-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.
Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek
2016-11-08
Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted. We analyzed the natural CNVs in the region overlapping MSH2 gene that encodes the DNA mismatch repair protein, and AT3G18530 and AT3G18535 genes that encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2-14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12-14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region, determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99 %-identical sequences flanking these genes. The widespread geographical distribution of the deletions supported by the SNP and linkage disequilibrium analyses of the genomic sequence confirmed the recurrent nature of this CNV. We characterized in detail for the first time the complex multiallelic CNV in Arabidopsis genome. The region encoding MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. 
We provided molecular insight into the mechanism underlying the recurrent nature of AT3G18530-AT3G18535 duplications/deletions. We also performed the first direct comparison of the two leading experimental methods suitable for assessing DNA copy number status. Our comprehensive case study provides foundational information for further analyses of CNV evolution in Arabidopsis and other plants, and their possible use in plant breeding.
High level of microsynteny and purifying selection affect the evolution of WRKY family in Gramineae.
Jin, Jing; Kong, Jingjing; Qiu, Jianle; Zhu, Huasheng; Peng, Yuancheng; Jiang, Haiyang
2016-01-01
The WRKY gene family, which encodes proteins involved in the regulation of diverse developmental stages, is one of the largest families of transcription factors in higher plants. In this study, by searching for interspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that 35 chromosomal segments containing subgroup I genes of the WRKY family (WRKY I) in four Gramineae species (Brachypodium, rice, sorghum, and maize) formed eight orthologous groups. After a stepwise gene-by-gene reciprocal comparison of all the protein sequences in the WRKY I gene flanking areas, highly conserved regions of microsynteny were found in the four Gramineae species. Most gene pairs showed conserved orientation within syntenic genome regions. Furthermore, tandem duplication events played the leading role in gene expansion. Finally, environmental selection pressure analysis indicated strong purifying selection for the WRKY I genes in Gramineae, which may have been followed by gene loss and rearrangement. The results presented in this study provide basic information on Gramineae WRKY I genes and form the foundation for future functional studies of these genes. The high level of microsynteny in the four grass species provides further evidence that a large-scale genome duplication event predated speciation.
Ma, Zhaowu; Zhou, Yang; Abbood, Nibras Najm; Liu, Jianfeng; Su, Li; Jia, Haibo; Guo, An-Yuan
2012-01-01
Background: HES/HEY genes encode a family of basic helix-loop-helix (bHLH) transcription factors with both bHLH and Orange domains. HES/HEY proteins are direct targets of the Notch signaling pathway and play an essential role in developmental decisions, such as the development of the nervous system, somitogenesis, blood vessels and heart. Despite their important functions, the origin and evolution of the HES/HEY gene family have yet to be elucidated. Methods and Findings: In this study, we identified genes of the HES/HEY family in representative species and performed evolutionary analysis to elucidate their origin and evolutionary process. Our results showed that the HES/HEY genes exist only in metazoans and may have originated in the common ancestor of metazoans. We identified HES/HEY genes in more than 10 species representing the main lineages. Combining the bHLH and Orange domain sequences, we constructed phylogenetic trees by different methods (Bayesian, ML, NJ and ME) and classified the HES/HEY gene family into four groups. Our results indicated that this gene family had undergone three expansions, which coincided with the origins of Eumetazoa, vertebrates, and teleosts. Gene structure analysis revealed that the HES/HEY genes underwent exon and/or intron loss in different species lineages. Genes of this family were duplicated in bony fishes and are twice as numerous as in other vertebrates. Furthermore, we studied the teleost-specific duplications in zebrafish and investigated the expression pattern of duplicated genes in different tissues by RT-PCR. Finally, we proposed a model to show the evolution of this gene family with processes of expansion, exon/intron loss, and motif loss. Conclusions: Our study revealed the evolution of the HES/HEY gene family and the expression and functional divergence of duplicated genes, which also provide clues for research on Notch function in development. This study presents a model of gene family analysis incorporating gene structure evolution and duplication. PMID:22808219
Zhou, Mi; Yan, Jun; Ma, Zhaowu; Zhou, Yang; Abbood, Nibras Najm; Liu, Jianfeng; Su, Li; Jia, Haibo; Guo, An-Yuan
2012-01-01
HES/HEY genes encode a family of basic helix-loop-helix (bHLH) transcription factors with both bHLH and Orange domains. HES/HEY proteins are direct targets of the Notch signaling pathway and play an essential role in developmental decisions, such as the development of the nervous system, somitogenesis, blood vessels and heart. Despite their important functions, the origin and evolution of the HES/HEY gene family have yet to be elucidated. In this study, we identified genes of the HES/HEY family in representative species and performed evolutionary analysis to elucidate their origin and evolutionary process. Our results showed that the HES/HEY genes exist only in metazoans and may have originated in the common ancestor of metazoans. We identified HES/HEY genes in more than 10 species representing the main lineages. Combining the bHLH and Orange domain sequences, we constructed phylogenetic trees by different methods (Bayesian, ML, NJ and ME) and classified the HES/HEY gene family into four groups. Our results indicated that this gene family had undergone three expansions, which coincided with the origins of Eumetazoa, vertebrates, and teleosts. Gene structure analysis revealed that the HES/HEY genes underwent exon and/or intron loss in different species lineages. Genes of this family were duplicated in bony fishes and are twice as numerous as in other vertebrates. Furthermore, we studied the teleost-specific duplications in zebrafish and investigated the expression pattern of duplicated genes in different tissues by RT-PCR. Finally, we proposed a model to show the evolution of this gene family with processes of expansion, exon/intron loss, and motif loss. Our study revealed the evolution of the HES/HEY gene family and the expression and functional divergence of duplicated genes, which also provide clues for research on Notch function in development. This study presents a model of gene family analysis incorporating gene structure evolution and duplication.
Evolutionary history of the enolase gene family.
Tracy, M R; Hedges, S B
2000-12-23
The enzyme enolase [EC 4.2.1.11] is found in all organisms, with vertebrates exhibiting tissue-specific isozymes encoded by three genes: alpha (α), beta (β), and gamma (γ) enolase. Limited taxonomic sampling of enolase has obscured the timing of gene duplication events. To help clarify the evolutionary history of the gene family, cDNAs were sequenced from six taxa representing major lineages of vertebrates: Chiloscyllium punctatum (shark), Amia calva (bowfin), Salmo trutta (trout), Latimeria chalumnae (coelacanth), Lepidosiren paradoxa (South American lungfish), and Neoceratodus forsteri (Australian lungfish). Phylogenetic analysis of all enolase and related gene sequences revealed an early gene duplication event prior to the last common ancestor of living organisms. Several distantly related archaebacterial sequences were designated as 'enolase-2', whereas all other enolase sequences were designated 'enolase-1'. Two of the three isozymes of enolase-1, α- and β-enolase, were discovered in actinopterygian, sarcopterygian, and chondrichthian fishes. Phylogenetic analysis of vertebrate enolases revealed that the two gene duplications leading to the three isozymes of enolase-1 occurred subsequent to the divergence of living agnathans, near the Proterozoic/Phanerozoic boundary (approximately 550 Mya). Two copies of enolase, designated α1 and α2, were found in the trout and are presumed to be the result of a genome duplication event.
Alternative splicing and the evolution of phenotypic novelty.
Bush, Stephen J; Chen, Lu; Tovar-Corona, Jaime M; Urrutia, Araxi O
2017-02-05
Alternative splicing, a mechanism of post-transcriptional RNA processing whereby a single gene can encode multiple distinct transcripts, has been proposed to underlie morphological innovations in multicellular organisms. Genes with developmental functions are enriched for alternative splicing events, suggestive of a contribution of alternative splicing to developmental programmes. The role of alternative splicing as a source of transcript diversification has previously been compared to that of gene duplication, with the relationship between the two extensively explored. Alternative splicing is reduced following gene duplication with the retention of duplicate copies higher for genes which were alternatively spliced prior to duplication. Furthermore, and unlike the case for overall gene number, the proportion of alternatively spliced genes has also increased in line with the evolutionary diversification of cell types, suggesting alternative splicing may contribute to the complexity of developmental programmes. Together these observations suggest a prominent role for alternative splicing as a source of functional innovation. However, it is unknown whether the proliferation of alternative splicing events indeed reflects a functional expansion of the transcriptome or instead results from weaker selection acting on larger species, which tend to have a higher number of cell types and lower population sizes. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).
Alternative splicing and the evolution of phenotypic novelty
Bush, Stephen J.; Chen, Lu; Tovar-Corona, Jaime M.
2017-01-01
Alternative splicing, a mechanism of post-transcriptional RNA processing whereby a single gene can encode multiple distinct transcripts, has been proposed to underlie morphological innovations in multicellular organisms. Genes with developmental functions are enriched for alternative splicing events, suggestive of a contribution of alternative splicing to developmental programmes. The role of alternative splicing as a source of transcript diversification has previously been compared to that of gene duplication, with the relationship between the two extensively explored. Alternative splicing is reduced following gene duplication with the retention of duplicate copies higher for genes which were alternatively spliced prior to duplication. Furthermore, and unlike the case for overall gene number, the proportion of alternatively spliced genes has also increased in line with the evolutionary diversification of cell types, suggesting alternative splicing may contribute to the complexity of developmental programmes. Together these observations suggest a prominent role for alternative splicing as a source of functional innovation. However, it is unknown whether the proliferation of alternative splicing events indeed reflects a functional expansion of the transcriptome or instead results from weaker selection acting on larger species, which tend to have a higher number of cell types and lower population sizes. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’. PMID:27994117
Chai, Wenbo; Si, Weina; Ji, Wei; Qin, Qianqian; Zhao, Manli; Jiang, Haiyang
2018-01-01
HD-Zip proteins represent major transcription factors in higher plants, playing essential roles in plant development and stress responses. Foxtail millet is a crop used to investigate the systems biology of millet and biofuel grasses, yet its HD-Zip gene family has not been studied. To investigate the expression profile of the HD-Zip gene family in foxtail millet, a comprehensive genome-wide expression analysis was conducted in this study. We found 47 protein-encoding genes in foxtail millet using BLAST search tools; the putative proteins were classified into four subfamilies, namely, subfamilies I, II, III, and IV. Gene structure and motif analysis indicated that the genes within each subfamily were conserved. Promoter analysis showed that HD-Zip genes were involved in abiotic stress. Duplication analysis revealed that 8 (~17%) hdz genes were tandemly duplicated and 28 (58%) were segmentally duplicated; purifying duplication plays important roles in gene expansion. Microsynteny analysis revealed the maximum relationship in foxtail millet-sorghum and foxtail millet-rice. Expression profiling upon the abiotic stresses of drought and high salinity and the biotic stress of ABA revealed that some genes regulated responses to drought and salinity stresses via an ABA-dependent process, especially sihdz29 and sihdz45. Our study provides new insight into evolutionary and functional analyses of HD-Zip genes involved in environmental stress responses in foxtail millet.
Isolation and characterization of the pea cytochrome c oxidase Vb gene.
Kubo, Nakao; Arimura, Shin-Ichi; Tsutsumi, Nobuhiro; Kadowaki, Koh-Ichi; Hirai, Masashi
2006-11-01
Three copies of the gene that encodes cytochrome c oxidase subunit Vb were isolated from the pea (PscoxVb-1, PscoxVb-2, and PscoxVb-3). Northern blot and reverse transcriptase-PCR analyses suggest that all 3 genes are transcribed in the pea. Each pea coxVb gene has an N-terminal extended sequence that can encode a mitochondrial targeting signal, called a presequence. The localization of green fluorescent proteins fused with the presequence strongly suggests the targeting of pea COXVb proteins to mitochondria. Each pea coxVb gene has 5 intron sites within the coding region. These are similar to those of Arabidopsis and rice, although the intron lengths vary greatly. A phylogenetic analysis of coxVb suggests the occurrence of gene duplication events during angiosperm evolution. In particular, 2 duplication events might have occurred in legumes, grasses, and Solanaceae. A comparison of amino acid sequences in COXVb or its counterpart shows the conservation of several amino acids within a zinc finger motif. Interestingly, a homology search analysis showed that bacterial protein COG4391 and a mitochondrial complex I 13 kDa subunit also have similar amino acid compositions around this motif. Such similarity might reflect evolutionary relationships among the 3 proteins.
2011-01-01
Background: Missense mutations in three different genes encoding amyloid-β precursor protein, presenilin 1 and presenilin 2 are recognized to cause familial early-onset Alzheimer disease. Also duplications of the amyloid precursor protein gene have been shown to cause the disease. At the Dept. of Geriatric Medicine, Karolinska University Hospital, Sweden, patients are referred for mutation screening for the identification of nucleotide variations and for determining copy-number of the APP locus. Methods: We combined the method of microsatellite marker genotyping with a quantitative real-time PCR analysis to detect duplications in patients with Alzheimer disease. Results: In 22 DNA samples from individuals diagnosed with clinical Alzheimer disease, we identified one patient carrying a duplication on chromosome 21 which included the APP locus. Further mapping of the chromosomal region by array-comparative genome hybridization showed that the duplication spanned a maximal region of 1.09 Mb. Conclusions: This is the first report of an APP duplication in a Swedish Alzheimer patient and describes the use of quantitative real-time PCR as a tool for determining copy-number of the APP locus. PMID:22044463
Thonberg, Håkan; Fallström, Marie; Björkström, Jenny; Schoumans, Jacqueline; Nennesmo, Inger; Graff, Caroline
2011-11-01
Missense mutations in three different genes encoding amyloid-β precursor protein, presenilin 1 and presenilin 2 are recognized to cause familial early-onset Alzheimer disease. Also duplications of the amyloid precursor protein gene have been shown to cause the disease. At the Dept. of Geriatric Medicine, Karolinska University Hospital, Sweden, patients are referred for mutation screening for the identification of nucleotide variations and for determining copy-number of the APP locus. We combined the method of microsatellite marker genotyping with a quantitative real-time PCR analysis to detect duplications in patients with Alzheimer disease. In 22 DNA samples from individuals diagnosed with clinical Alzheimer disease, we identified one patient carrying a duplication on chromosome 21 which included the APP locus. Further mapping of the chromosomal region by array-comparative genome hybridization showed that the duplication spanned a maximal region of 1.09 Mb. This is the first report of an APP duplication in a Swedish Alzheimer patient and describes the use of quantitative real-time PCR as a tool for determining copy-number of the APP locus.
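As a rough illustration of how quantitative real-time PCR readings can be turned into a copy-number estimate, the sketch below applies the widely used 2^-ΔΔCt calculation. The abstract does not state which quantification model was used, so the method, the function name relative_copy_number, the use of a two-copy reference locus, and all Ct values are assumptions for illustration only.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control,
                         control_copies=2):
    """Estimate target-locus copy number with the 2^-ddCt method.

    Assumes (hypothetically) ~100% amplification efficiency for both
    amplicons and a control individual carrying `control_copies` copies.
    """
    # Normalise the target locus to the reference locus in each DNA sample.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # A lower Ct means more template, so a negative ddCt indicates a gain.
    dd_ct = d_ct_sample - d_ct_control
    return control_copies * 2 ** (-dd_ct)


# Illustrative Ct values only, not data from the study.
print(relative_copy_number(24.0, 24.0, 25.0, 24.0))  # gain relative to control
print(relative_copy_number(25.0, 24.0, 25.0, 24.0))  # same as control
```

In practice a heterozygous duplication gives three copies rather than four, so real estimates fall between the integer values printed here and are usually judged against replicate measurements.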
USDA-ARS's Scientific Manuscript database
In Toxoplasma gondii, an intracellular parasite of humans and other warm-blooded animals, the ability to associate with host mitochondria (HMA) is driven by a locally expanded gene family that encodes multiple mitochondrial association factor 1 (MAF1) proteins. The importance of copy number in the e...
Adomako-Ankomah, Yaw; English, Elizabeth D; Danielson, Jeffrey J; Pernas, Lena F; Parker, Michelle L; Boulanger, Martin J; Dubey, Jitender P; Boyle, Jon P
2016-05-01
In Toxoplasma gondii, an intracellular parasite of humans and other animals, host mitochondrial association (HMA) is driven by a gene family that encodes multiple mitochondrial association factor 1 (MAF1) proteins. However, the importance of MAF1 gene duplication in the evolution of HMA is not understood, nor is the impact of HMA on parasite biology. Here we used within- and between-species comparative analysis to determine that the MAF1 locus is duplicated in T. gondii and its nearest extant relative Hammondia hammondi, but not another close relative, Neospora caninum. Using cross-species complementation, we determined that the MAF1 locus harbors multiple distinct paralogs that differ in their ability to mediate HMA, and that only T. gondii and H. hammondi harbor HMA(+) paralogs. Additionally, we found that exogenous expression of an HMA(+) paralog in T. gondii strains that do not normally exhibit HMA provides a competitive advantage over their wild-type counterparts during a mouse infection. These data indicate that HMA likely evolved by neofunctionalization of a duplicate MAF1 copy in the common ancestor of T. gondii and H. hammondi, and that the neofunctionalized gene duplicate is selectively advantageous. Copyright © 2016 by the Genetics Society of America.
Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis.
Spangler, Jacob B; Feltus, Frank Alex
2013-01-01
Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates, while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression.
Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis
Spangler, Jacob B.; Feltus, Frank Alex
2013-01-01
Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates, while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377
Indrasumunar, Arief; Wilde, Julia; Hayashi, Satomi; Li, Dongxue; Gresshoff, Peter M
2015-03-15
Association between legumes and rhizobia results in the formation of root nodules, where symbiotic nitrogen fixation occurs. The early stages of this association involve a complex of signalling events between the host and microsymbiont. Several genes dealing with early signal transduction have been cloned, and one of them encodes the leucine-rich repeat (LRR) receptor kinase (SymRK; also termed NORK). The Symbiosis Receptor Kinase gene is required by legumes to establish a root endosymbiosis with Rhizobium bacteria as well as mycorrhizal fungi. Using degenerate primers and BAC sequencing, we cloned duplicated SymRK homeologues in soybean called GmSymRKα and GmSymRKβ. These duplicated genes have high nucleotide (96%) and amino acid (95%) sequence similarity. Sequence analysis predicted a malectin-like domain within the extracellular domain of both genes. Several putative cis-acting elements were found in promoter regions of GmSymRKα and GmSymRKβ, suggesting a participation in lateral root development, cell division and peribacteroid membrane formation. No SymRK mutants are available in soybean; therefore, to investigate the functions of these genes, RNA interference (RNAi) of the duplicated genes was performed. For this purpose, an RNAi construct of each gene was generated and introduced into the soybean genome by Agrobacterium rhizogenes-mediated hairy root transformation. RNAi of the GmSymRKβ gene resulted in a greater reduction of nodulation and mycorrhizal infection than RNAi of GmSymRKα, suggesting it has the major activity of the duplicated gene pair. The results from the important crop legume soybean confirm the joint phenotypic action of GmSymRK genes in both mycorrhizal and rhizobial infection seen in model legumes. Copyright © 2015 Elsevier GmbH. All rights reserved.
Hou, Zhaoqi; Jia, Bing; Li, Fei; Liu, Pu; Liu, Li; Ye, Zhenfeng; Zhu, Liwu; Wang, Qi; Heng, Wei
2018-01-01
The plant genes encoding ABCGs that have been identified to date play a role in suberin formation in response to abiotic and biotic stress. In the present study, 80 ABCG genes were identified in 'Dangshansuli' Chinese white pear and designated as PbABCGs. Based on the structural characteristics and phylogenetic analysis, the PbABCG family genes could be classified into seven main groups: classes A-G. Segmental and dispersed duplications were the primary forces underlying the PbABCG gene family expansion in 'Dangshansuli' pear. Most of the PbABCG duplicated gene pairs date to the recent whole-genome duplication that occurred 30~45 million years ago. Purifying selection has also played a critical role in the evolution of the ABCG genes. Ten PbABCG genes screened in the transcriptome of 'Dangshansuli' pear and its russet mutant 'Xiusu' were validated, and the expression levels of the PbABCG genes exhibited significant differences at different stages. The results presented here will undoubtedly be useful for better understanding of the complexity of the PbABCG gene family and will facilitate the functional characterization of suberin formation in the russet mutant.
Expansion by whole genome duplication and evolution of the sox gene family in teleost fish
Naville, Magali; Volff, Jean-Nicolas
2017-01-01
It is now recognized that several rounds of whole genome duplication (WGD) have occurred during the evolution of vertebrates, but the link between WGDs and phenotypic diversification remains unsolved. We have investigated in this study the impact of the teleost-specific WGD on the evolution of the sox gene family in teleostean fishes. The sox gene family, which encodes transcription factors, has an essential role in the morphology, physiology and behavior of vertebrates and teleosts, the current largest group of vertebrates. We have first redrawn the evolution of all sox genes identified in eleven teleost genomes using a comparative genomic approach including phylogenetic and synteny analyses. We noticed, compared to tetrapods, an important expansion of the sox family: 58% (11/19) of sox genes are duplicated in teleost genomes. Furthermore, all duplicated sox genes, except sox17 paralogs, are derived from the teleost-specific WGD. Then, focusing on five sox genes, analyzing the evolution of coding and non-coding sequences, as well as the expression patterns in fish embryos and adult tissues, we demonstrated that these paralogs followed lineage-specific evolutionary trajectories in teleost genomes. This work, based on whole genome data from multiple teleostean species, supports the contribution of WGDs to the expansion of gene families, as well as to the emergence of genomic differences between lineages that might promote genetic and phenotypic diversity in teleosts. PMID:28738066
Yoneyama, Keisuke; Akashi, Tomoyoshi; Aoki, Toshio
2016-01-01
Soybean (Glycine max) accumulates several prenylated isoflavonoid phytoalexins, collectively referred to as glyceollins. Glyceollins (I, II, III, IV and V) possess modified pterocarpan skeletons with C5 moieties from dimethylallyl diphosphate, and they are commonly produced from (6aS, 11aS)-3,9,6a-trihydroxypterocarpan [(−)-glycinol]. The metabolic fate of (−)-glycinol is determined by the enzymatic introduction of a dimethylallyl group into C-4 or C-2, which is reportedly catalyzed by regiospecific prenyltransferases (PTs). 4-Dimethylallyl (−)-glycinol and 2-dimethylallyl (−)-glycinol are precursors of glyceollin I and other glyceollins, respectively. Although multiple genes encoding (−)-glycinol biosynthetic enzymes have been identified, those involved in the later steps of glyceollin formation mostly remain unidentified, except for (−)-glycinol 4-dimethylallyltransferase (G4DT), which is involved in glyceollin I biosynthesis. In this study, we identified four genes that encode isoflavonoid PTs, including (−)-glycinol 2-dimethylallyltransferase (G2DT), using homology-based in silico screening and biochemical characterization in yeast expression systems. Transcript analyses illustrated that changes in G2DT gene expression were correlated with the induction of glyceollins II, III, IV and V in elicitor-treated soybean cells and leaves, suggesting its involvement in glyceollin biosynthesis. Moreover, the genomic signatures of these PT genes revealed that G4DT and G2DT are paralogs derived from whole-genome duplications of the soybean genome, whereas other PT genes [isoflavone dimethylallyltransferase 1 (IDT1) and IDT2] were derived via local gene duplication on soybean chromosome 11. PMID:27986914
Chilian, B; Abdollahpour, H; Bierhals, T; Haltrich, I; Fekete, G; Nagel, I; Rosenberger, G; Kutsche, K
2013-12-01
Synaptopathies constitute a group of neurological diseases including autism spectrum disorders (ASD) and intellectual disability (ID). They have been associated with mutations in genes encoding proteins important for the formation and stabilization of synapses, such as SHANK1-3. Loss-of-function mutations in the SHANK genes have been identified in individuals with ASD and ID suggesting that other factors modify the neurological phenotype. We report a boy with severe ID, behavioral anomalies, and language impairment who carries a balanced de novo triple translocation 46,XY,t(11;17;19)(q13.3;q25.1;q13.42). The 11q13.3 breakpoint was found to disrupt the SHANK2 gene. The patient also carries copy number variations at 15q13.3 and 10q22.11 encompassing ARHGAP11B and two synaptic genes. The CHRNA7 gene encoding α7-nicotinic acetylcholine receptor subunit and the GPRIN2 gene encoding G-protein-regulated inducer of neurite growth 2 were duplicated. Co-occurrence of a de novo SHANK2 mutation and a CHRNA7 duplication in two reported patients with ASD and ID as well as in the patient with t(11;17;19), severe ID and behavior problems suggests convergence of these genes on a common synaptic pathway. Our results strengthen the oligogenic inheritance model and highlight the presence of a large effect mutation and modifier genes collectively determining phenotypic expression of the synaptopathy. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Biedrzycka, Aleksandra; O'Connor, Emily; Sebastian, Alvaro; Migalska, Magdalena; Radwan, Jacek; Zając, Tadeusz; Bielański, Wojciech; Solarz, Wojciech; Ćmiel, Adam; Westerdahl, Helena
2017-07-05
Recent work suggests that gene duplications may play an important role in the evolution of immunity genes. Passerine birds, and in particular Sylvioidea warblers, have highly duplicated major histocompatibility complex (MHC) genes, which are key in immunity, compared to other vertebrates. However, the reasons for this high MHC gene copy number are still unclear. High-throughput sequencing (HTS) allows MHC genotyping even in individuals with extremely duplicated genes. These HTS data can reveal evidence of selection, which may help to unravel the putative functions of different gene copies, i.e. neofunctionalization. We performed exhaustive genotyping of MHC class I in a Sylvioidea warbler, the sedge warbler, Acrocephalus schoenobaenus, using the Illumina MiSeq technique on individuals from a wild study population. The MHC diversity in 863 genotyped individuals by far exceeds that of any other bird species described to date. A single individual could carry up to 65 different alleles, a large proportion of which are expressed (transcribed). The MHC alleles were of three different lengths differing in evidence of selection, diversity and divergence within our study population. Alleles without any deletions and alleles containing a 6 bp deletion showed characteristics of classical MHC genes, with evidence of multiple sites subject to positive selection and high sequence divergence. In contrast, alleles containing a 3 bp deletion had no sites subject to positive selection and had low divergence. Our results suggest that sedge warbler MHC alleles that either have no deletion, or contain a 6 bp deletion, encode classical antigen presenting MHC molecules. In contrast, MHC alleles containing a 3 bp deletion may encode molecules with a different function. This study demonstrates that highly duplicated MHC genes can be characterised with HTS and that selection patterns can be useful for revealing neofunctionalization. Importantly, our results highlight the need to consider the putative function of different MHC genes in future studies of MHC in relation to disease resistance and fitness.
Citerne, Hélène L.; Le Guilloux, Martine; Sannier, Julie; Nadot, Sophie; Damerval, Catherine
2013-01-01
TCP ECE genes encode transcription factors which have received much attention for their repeated recruitment in the control of floral symmetry in core eudicots, and more recently in monocots. Major duplications of TCP ECE genes have been described in core eudicots, but the evolutionary history of this gene family is unknown in basal eudicots. Reconstructing the phylogeny of ECE genes in basal eudicots will help set a framework for understanding the functional evolution of these genes. TCP ECE genes were sequenced in all major lineages of basal eudicots and Gunnera which belongs to the sister clade to all other core eudicots. We show that in these lineages they have a complex evolutionary history with repeated duplications. We estimate the timing of the two major duplications already identified in the core eudicots within a timeframe before the divergence of Gunnera and after the divergence of Proteales. We also use a synteny-based approach to examine the extent to which the expansion of TCP ECE genes in diverse eudicot lineages may be due to genome-wide duplications. The three major core-eudicot specific clades share a number of collinear genes, and their common evolutionary history may have originated at the γ event. Genomic comparisons in Arabidopsis thaliana and Solanum lycopersicum highlight their separate polyploid origin, with syntenic fragments with and without TCP ECE genes showing differential gene loss and genomic rearrangements. Comparison between recently available genomes from two basal eudicots Aquilegia coerulea and Nelumbo nucifera suggests that the two TCP ECE paralogs in these species are also derived from large-scale duplications. TCP ECE loci from basal eudicots share many features with the three main core eudicot loci, and allow us to infer the makeup of the ancestral eudicot locus. PMID:24019982
Ohtani, Haruka; Morimoto, Takuya; Beppu, Kenji; Kataoka, Ikuo
2018-01-01
Dioecy, the presence of male and female flowers on distinct individuals, has evolved independently in multiple plant lineages, and the genes involved in this differential development are just starting to be uncovered in a few species. Here, we used genomic approaches to investigate this pathway in kiwifruits (genus Actinidia). Genome-wide cataloging of male-specific subsequences, combined with transcriptome analysis, led to the identification of a type-C cytokinin response regulator as a potential sex determinant gene in this genus. Functional transgenic analyses in two model systems, Arabidopsis thaliana and Nicotiana tabacum, indicated that this gene acts as a dominant suppressor of carpel development, prompting us to name it Shy Girl (SyGI). Evolutionary analyses in a panel of Actinidia species revealed that SyGI is located in the Y-specific region of the genome and probably arose from a lineage-specific gene duplication. Comparisons with the duplicated autosomal counterpart, and with orthologs from other angiosperms, suggest that the SyGI-specific duplication and subsequent evolution of cis-elements may have played a key role in the acquisition of separate sexes in this species. PMID:29626069
USDA-ARS's Scientific Manuscript database
Catalase/peroxidases (KatGs) are a superfamily of reactive oxygen species (ROS)-degrading enzymes believed to be horizontally acquired by ancient Ascomycota from bacteria. Subsequent gene duplication resulted in two KatG paralogs in ascomycetes: the widely distributed intracellular KatG1 group, and ...
Kongchum, Pawapol; Hallerman, Eric M; Hulata, Gideon; David, Lior; Palti, Yniv
2011-01-01
Induction of innate immune pathways is critical for early host defense, but there is limited understanding of how teleost fishes recognize pathogen molecules and activate these pathways. In mammals, cells of the innate immune system detect pathogenic molecular structures using pattern recognition receptors (PRRs). TLR9 functions as a PRR that recognizes CpG motifs in bacterial and viral DNA and requires adaptor molecules MyD88 and TRAF6 for signal transduction. Here we report full-length cDNA isolation, structural characterization and tissue mRNA expression analysis of the common carp (cc) TLR9, MyD88 and TRAF6 gene orthologs. The ccTLR9 open-reading frame (ORF) is predicted to encode a 1064-amino acid (aa) protein. We found that MyD88 and TRAF6 genes are duplicated in common carp. This is the first report of TRAF6 duplication in a vertebrate genome and stronger evidence in support of MyD88 duplication is provided. The ccMyD88a and b ORFs are predicted to encode 288-aa and 284-aa peptides, respectively. They share 91% aa sequence identity between paralogs. The ccTRAF6a and b ORFs are both predicted to encode 543-aa peptides sharing 95% aa sequence identity between paralogs. The ccTLR9 gene is contained in a single large exon. The ccMyD88a and ccMyD88b coding sequences span five exons. The TRAF6b gene spans six exons. PCR amplification to obtain the entire coding sequence of ccTRAF6a gene was not successful. The 2104-bp fragment amplified covers the 3' end of the gene and it contains a partial sequence of one exon and three complete exons. The predicted protein domains of the ccTLR9, ccMyD88 and ccTRAF6 are conserved and resemble orthologs from other vertebrates. Real-time quantitative PCR assays of the ccTLR9, MyD88a and b, and TRAF6a and b gene transcripts in healthy common carp indicated that mRNA expression varied between tissues. Differential expression of duplicate copies was found for ccMyD88 and ccTRAF6 in white and red muscle tissues, suggesting that paralogs may have evolved and attained a new function. The genomic information we describe in this paper provides evidence of sequence and structural conservation of immune response genes in common carp. Published by Elsevier Ltd.
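The paralog identities quoted in this abstract (91% and 95% aa identity) are percent-identity figures over aligned sequences. A minimal sketch of one common way to compute such a figure is given below; the abstract does not state which definition or alignment tool was used, so the gap-excluding denominator, the function name `percent_identity`, and the toy sequences are all assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: identity is counted over gap-free columns only; other
    conventions (e.g. dividing by the shorter sequence length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real carp sequences.
toy_a = "MKT-LLVA"
toy_b = "MKTALIVA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because the choice of denominator changes the reported value, a percent identity is only comparable across studies when the same convention is used.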
Meyer, Thomas; Pankuweit, Sabine; Richter, Anette; Maisch, Bernhard; Ruppert, Volker
2013-09-15
Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease with autosomal dominant inheritance caused by mutations in genes coding for sarcomeric and/or regulatory proteins expressed in cardiomyocytes. In a small cohort of HCM patients (n=8), we searched for mutations in the two most common genes responsible for HCM and found four missense mutations in the MYH7 gene encoding cardiac β-myosin heavy chain (R204H, M493V, R719W, and R870H) and three mutations in the myosin-binding protein C3 gene (MYBPC3) including one missense (A848V) and two frameshift mutations (c.3713delTG and c.702ins26bp). The c.702ins26bp insertion resulted from the duplication of a 26-bp fragment in a 54-year-old female HCM patient presenting with clinical signs of heart failure due to diastolic dysfunction. Although such large duplications (>10 bp) in the MYBPC3 gene are very rare and have been identified only in 4 families reported so far, the identical duplication mutation was found earlier in a Dutch patient, demonstrating that it may constitute a hitherto unknown founder mutation in central European populations. This observation underscores the significance of insertions into the coding sequence of the MYBPC3 gene for the development and pathogenesis of HCM. © 2013 Elsevier B.V. All rights reserved.
Axelsen, Jacob Bock; Yan, Koon-Kiu; Maslov, Sergei
2007-01-01
Background The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins. Results We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities p generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form $\sim p^{-\gamma}$ with the value of the exponent γ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent α ≈ 0.33. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate. Conclusion We separately measure the short-term ("raw") duplication and deletion rates $r^{*}_{dup}$, $r^{*}_{del}$ which include gene copies that will be removed soon after the duplication event and their dramatically reduced long-term counterparts $r_{dup}$, $r_{del}$. High deletion rate among recently duplicated proteins is consistent with a scenario in which they didn't have enough time to significantly change their functional roles and thus are to a large degree disposable. Systematic trends of each of the four duplication/deletion rates with the total number of genes in the genome were analyzed. All but the deletion rate of recent duplicates $r^{*}_{del}$ were shown to systematically increase with $N_{genes}$. Abnormally flat shapes of sequence identity histograms observed for yeast and human are consistent with lineages leading to these organisms undergoing one or more whole-genome duplications. This interpretation is corroborated by our analysis of the genome of Paramecium tetraurelia where the $p^{-4}$ profile of the histogram is gradually restored by the successive removal of paralogs generated in its four known whole-genome duplication events. PMID:18039386
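The power-law form of the identity histogram described above can be illustrated with a short sketch. This is not the authors' birth-and-death fitting procedure: it simply draws synthetic identities from a truncated power law and recovers the exponent by a log-log linear regression on histogram densities. The function names, the identity range 0.3 to 1.0, the bin count, and the sample size are all assumptions chosen for illustration.

```python
import numpy as np


def sample_truncated_power_law(gamma, p_min, p_max, size, rng):
    """Draw samples with density proportional to p**(-gamma) on [p_min, p_max]
    by inverting the cumulative distribution (requires gamma != 1)."""
    u = rng.random(size)
    a = p_min ** (1.0 - gamma)
    b = p_max ** (1.0 - gamma)
    return (a - u * (a - b)) ** (1.0 / (1.0 - gamma))


def fit_power_law_exponent(identities, p_min, p_max, n_bins=20):
    """Estimate gamma from the slope of log(density) versus log(bin center)."""
    density, edges = np.histogram(
        identities, bins=n_bins, range=(p_min, p_max), density=True
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = density > 0  # empty bins cannot be log-transformed
    slope, _intercept = np.polyfit(np.log(centers[keep]), np.log(density[keep]), 1)
    return -slope


# Synthetic data only; real input would be pairwise paralog identities.
rng = np.random.default_rng(0)
identities = sample_truncated_power_law(4.0, 0.3, 1.0, 200_000, rng)
gamma_hat = fit_power_law_exponent(identities, 0.3, 1.0)
print(f"estimated exponent: {gamma_hat:.2f}")
```

Regression on binned densities is a quick diagnostic rather than a rigorous estimator; a maximum-likelihood fit would be the more careful choice for real data, and a histogram that is markedly flatter than the fitted line is the kind of deviation the abstract associates with whole-genome duplication.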
Edelmann, Lisa; Stankiewicz, Pavel; Spiteri, Elizabeth; Pandita, Raj K.; Shaffer, Lisa; Lupski, James; Morrow, Bernice E.
2001-01-01
The DGCR6 (DiGeorge critical region) gene encodes a putative protein with sequence similarity to gonadal (gdl), a Drosophila melanogaster gene of unknown function. We mapped the DGCR6 gene to chromosome 22q11 within a low copy repeat, termed sc11.1a, and identified a second copy of the gene, DGCR6L, within the duplicate locus, termed sc11.1b. Both sc11.1 repeats are deleted in most persons with velo-cardio-facial syndrome/DiGeorge syndrome (VCFS/DGS), and they map immediately adjacent and internal to the low copy repeats, termed LCR22, that mediate the deletions associated with VCFS/DGS. We sequenced genomic clones from both loci and determined that the putative initiator methionine is located further upstream than originally described, but in a position similar to the mouse and chicken orthologs. DGCR6L encodes a highly homologous, functional copy of DGCR6, with some base changes rendering amino acid differences. Expression studies of the two genes indicate that both genes are widely expressed in fetal and adult tissues. Evolutionary studies using FISH mapping in several different species of ape combined with sequence analysis of DGCR6 in a number of different primate species indicate that the duplication is at least 12 million years old and may date back to before the divergence of Catarrhines from Platyrrhines, 35 mya. These data suggest that there has been selective evolutionary pressure toward the functional maintenance of both paralogs. Interestingly, a full-length HERV-K provirus integrated into the sc11.1a locus after the divergence of chimpanzees and humans. PMID:11157784
Gene Duplication and the Evolution of Hemoglobin Isoform Differentiation in Birds*
Grispo, Michael T.; Natarajan, Chandrasekhar; Projecto-Garcia, Joana; Moriyama, Hideaki; Weber, Roy E.; Storz, Jay F.
2012-01-01
The majority of bird species co-express two functionally distinct hemoglobin (Hb) isoforms in definitive erythrocytes as follows: HbA (the major adult Hb isoform, with α-chain subunits encoded by the αA-globin gene) and HbD (the minor adult Hb isoform, with α-chain subunits encoded by the αD-globin gene). The αD-globin gene originated via tandem duplication of an embryonic α-like globin gene in the stem lineage of tetrapod vertebrates, which suggests the possibility that functional differentiation between the HbA and HbD isoforms may be attributable to a retained ancestral character state in HbD that harkens back to a primordial, embryonic function. To investigate this possibility, we conducted a combined analysis of protein biochemistry and sequence evolution to characterize the structural and functional basis of Hb isoform differentiation in birds. Functional experiments involving purified HbA and HbD isoforms from 11 different bird species revealed that HbD is characterized by a consistently higher O2 affinity in the presence of allosteric effectors such as organic phosphates and Cl− ions. In the case of both HbA and HbD, analyses of oxygenation properties under the two-state Monod-Wyman-Changeux allosteric model revealed that the pH dependence of Hb-O2 affinity stems primarily from changes in the O2 association constant of deoxy (T-state)-Hb. Ancestral sequence reconstructions revealed that the amino acid substitutions that distinguish the adult-expressed Hb isoforms are not attributable to the retention of an ancestral (pre-duplication) character state in the αD-globin gene that is shared with the embryonic α-like globin gene. PMID:22962007
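For readers unfamiliar with the two-state Monod-Wyman-Changeux (MWC) model mentioned above, its standard saturation function for a tetramer can be written as follows. The symbols are assumptions introduced here rather than notation from the abstract: $P$ is the O2 partial pressure, $K_T$ and $K_R$ are the O2 association constants of the T and R states, and $L$ is the T/R equilibrium constant in the absence of ligand.

```latex
\[
Y(P) \;=\;
\frac{K_R P\,(1 + K_R P)^{3} \;+\; L\,K_T P\,(1 + K_T P)^{3}}
     {(1 + K_R P)^{4} \;+\; L\,(1 + K_T P)^{4}}
\]
```

In this form, a pH dependence that acts mainly through $K_T$, as the abstract reports, changes O2 affinity chiefly at low saturation, where the T state dominates.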
Landsverk, Megan L.; Ruzzo, Elizabeth K.; Mefford, Heather C.; Buysse, Karen; Buchan, Jillian G.; Eichler, Evan E.; Petty, Elizabeth M.; Peterson, Esther A.; Knutzen, Dana M.; Barnett, Karen; Farlow, Martin R.; Caress, Judy; Parry, Gareth J.; Quan, Dianna; Gardner, Kathy L.; Hong, Ming; Simmons, Zachary; Bird, Thomas D.; Chance, Phillip F.; Hannibal, Mark C.
2009-01-01
Hereditary neuralgic amyotrophy (HNA) is an autosomal dominant disorder associated with recurrent episodes of focal neuropathy primarily affecting the brachial plexus. Point mutations in the SEPT9 gene have been previously identified as the molecular basis of HNA in some pedigrees. However in many families, including those from North America demonstrating a genetic founder haplotype, no sequence mutations have been detected. We report an intragenic 38 Kb SEPT9 duplication that is linked to HNA in 12 North American families that share the common founder haplotype. Analysis of the breakpoints showed that the duplication is identical in all pedigrees, and molecular analysis revealed that the duplication includes the 645 bp exon in which previous HNA mutations were found. The SEPT9 transcript variants that span this duplication contain two in-frame repeats of this exon, and immunoblotting demonstrates larger molecular weight SEPT9 protein isoforms. This exon also encodes for a majority of the SEPT9 N-terminal proline rich region suggesting that this region plays a role in the pathogenesis of HNA. PMID:19139049
Holland, Peter W H
2013-01-01
Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.
Ma, Jiale; Pan, Zihao; Huang, Jinhu; Sun, Min; Lu, Chengping; Yao, Huochun
2017-01-01
The type VI secretion system (T6SS) is a widespread molecular weapon deployed by many bacterial species to target eukaryotic host cells or rival bacteria. Using a dynamic injection mechanism, diverse effectors can be delivered by T6SS directly into recipient cells. Here, we report a new family of T6SS effectors encoded by extended Hcps carrying diverse toxin domains. Bioinformatic analyses revealed that these Hcps with C-terminal extension toxins, designated as Hcp-ET, exist widely in the Enterobacteriaceae. To verify our findings, Hcp-ET1 was tested for its antibacterial effect, and showed effective inhibition of target cell growth via the predicted HNH-DNase activity by T6SS-dependent delivery. Further studies showed that Hcp-ET2 mediated interbacterial antagonism via a Tle1 phospholipase (encoded by DUF2235 domain) activity. Notably, comprehensive analyses of protein homology and genomic neighborhoods revealed that Hcp-ET3–4 is fused with 2 toxin domains (Pyocin S3 and Colicin-DNase) C-terminally, and its encoding gene is followed by 3 duplications of the cognate immunity genes. However, some bacteria encode separate hcp-et3 and orphan et4 (et4O1) genes caused by a termination-codon mutation in the fusion region between Pyocin S3 and Colicin-DNase encoding fragments. Our results demonstrated that both of these toxins had antibacterial effects. Further, all duplications of the cognate immunity protein contributed to neutralizing the DNase toxicity of Pyocin S3 and Colicin, which has not been reported previously. In conclusion, we propose that Hcp-ET proteins are polymorphic T6SS effectors, and thus present a novel encoding pattern of T6SS effectors. PMID:28060574
Molecular evolution of psbA gene in ferns: unraveling selective pressure and co-evolutionary pattern
2012-01-01
Background The photosynthetic oxygen-evolving photosystem II (PS II) produces almost all of the oxygen in the atmosphere. This unique biochemical system comprises a functional core complex that is encoded by psbA and other genes. Unraveling the evolutionary dynamics of this gene is of particular interest owing to its direct role in oxygen production. psbA underwent gene duplication in leptosporangiates, in which both copies have been preserved since. Because gene duplication is often followed by the non-functionalization of one of the copies and its subsequent erosion, preservation of both psbA copies pinpoints functional or regulatory specialization events. The aim of this study was to investigate the molecular evolution of psbA among fern lineages. Results We sequenced psbA, which encodes the D1 protein in the core complex of PS II, in 20 species representing 8 orders of extant ferns; then we searched for selection and co-evolution signatures in psbA across the 11 fern orders. Collectively, our results indicate that: (1) selective constraints on the D1 protein relaxed after the duplication in 4 leptosporangiate orders; (2) a handful of positively selected codons were detected within species with single-copy psbA, but none in duplicated ones; (3) a few sites in the D1 protein were involved in a co-evolution process which may intimate significant functional/structural communications between them. Conclusions The strong competition between ferns and angiosperms for light may have been the main cause for a continuous fixation of adaptive amino acid changes in psbA, in particular after its duplication. Alternatively, a single psbA copy may have undergone bursts of adaptive changes at the molecular level to overcome competition from angiosperms. The strong signature of positive Darwinian selection in a major part of the D1 protein is testament to this. At the same time, species that possess two psbA copies show hardly any positive selection signals among the D1 protein coding sequences. In this study, eleven co-evolving sites have been detected via different molecules, which may be more important than others. PMID:22899792
Xu, Aishi; Li, Guang; Yang, Dong; Wu, Songfeng; Ouyang, Hongsheng; Xu, Ping; He, Fuchu
2015-12-04
Although the "missing protein" is a temporary concept in C-HPP, the biological reasons why these proteins are "missing" could provide important clues for evolutionary studies. Here we classified missing-protein-encoding genes into two groups: genes encoding PE2 proteins (with transcript evidence) and genes encoding PE3/4 proteins (with no transcript evidence). These missing-protein-encoding genes are distributed unevenly among chromosomes, chromosomal regions, and gene clusters. In terms of evolutionary features, PE3/4 genes tend to be young, located in nonhomologous chromosomal regions, and evolving at higher rates. Interestingly, the proportion of singletons is higher among PE3/4 genes than among all genes (background) or among OTCSGs (organ-, tissue-, and cell type-specific genes). More importantly, most of the paralogous PE3/4 genes are newly duplicated members of their paralogous gene groups, which mainly contribute to specialized biological functions such as "smell perception". These functions are largely restricted to specific cell types, tissues, or developmental stages, and they represent the new functional requirements that facilitated the emergence of the missing-protein-encoding genes during evolution. In addition, criteria for proteins with extreme physicochemical properties were first established based on the properties of PE2 proteins, and the evolutionary characteristics of those proteins were explored. Overall, evolutionary analyses of missing-protein-encoding genes are expected to be highly instructive for future proteomics and functional studies.
Berke, Lidija; Snel, Berend
2014-01-01
The histone modification H3K27me3 is involved in repression of transcription and plays a crucial role in developmental transitions in both animals and plants. It is deposited by PRC2 (Polycomb repressive complex 2), a conserved protein complex. In Arabidopsis thaliana, H3K27me3 is found at 15% of all genes. These tend to encode transcription factors and other regulators important for development. However, it is not known how PRC2 is recruited to target loci nor how this set of target genes arose during Arabidopsis evolution. To resolve the latter, we integrated A. thaliana gene families with five independent genome-wide H3K27me3 data sets. Gene families were either significantly enriched or depleted of H3K27me3, showing a strong impact of shared ancestry to H3K27me3 distribution. To quantify this, we performed ancestral state reconstruction of H3K27me3 on phylogenetic trees of gene families. The set of H3K27me3-marked genes changed less than expected by chance, suggesting that H3K27me3 was retained after gene duplication. This retention suggests that the PRC2-recruiting signal could be encoded in the DNA and also conserved among certain duplicated genes. Indeed, H3K27me3-marked genes were overrepresented among paralogs sharing conserved noncoding sequences (CNSs) that are enriched with transcription factor binding sites. The association of upstream CNSs with H3K27me3-marked genes represents the first genome-wide connection between H3K27me3 and potential regulatory elements in plants. Thus, we propose that CNSs likely function as part of the PRC2 recruitment in plants. PMID:24567304
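The abstract above does not say which ancestral state reconstruction method was used. As a minimal sketch of the general idea, assuming a binary marked/unmarked trait and plain Fitch parsimony (an illustrative substitute, not the authors' procedure), the minimum number of state changes on a gene-family tree can be counted as follows. The tree, the gene names, and the `fitch` function are hypothetical.

```python
def fitch(node, leaf_states):
    """Fitch parsimony on a rooted binary tree.

    node        -- a leaf name (str) or a (left, right) tuple of subtrees
    leaf_states -- dict mapping leaf name to its observed state (e.g. 1 = marked)
    Returns (set of most-parsimonious states at this node, minimum changes below it).
    """
    if isinstance(node, str):
        # A leaf carries exactly its observed state and needs no changes.
        return {leaf_states[node]}, 0
    left, right = node
    left_states, left_changes = fitch(left, leaf_states)
    right_states, right_changes = fitch(right, leaf_states)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change at this node.
        return shared, left_changes + right_changes
    # Children disagree: take the union and count one additional change.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical gene family: three marked paralogs and one unmarked paralog.
tree = (("geneA", "geneB"), ("geneC", "geneD"))
marks = {"geneA": 1, "geneB": 1, "geneC": 0, "geneD": 1}
root_states, changes = fitch(tree, marks)
print(root_states, changes)
```

A count like `changes` could then be compared with counts obtained after shuffling the marks among the leaves, which is one simple way to ask whether a trait changed less often than expected by chance.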
Salojärvi, Jarkko; Smolander, Olli-Pekka; Nieminen, Kaisa; Rajaraman, Sitaram; Safronov, Omid; Safdari, Pezhman; Lamminmäki, Airi; Immanen, Juha; Lan, Tianying; Tanskanen, Jaakko; Rastas, Pasi; Amiryousefi, Ali; Jayaprakash, Balamuralikrishna; Kammonen, Juhana I; Hagqvist, Risto; Eswaran, Gugan; Ahonen, Viivi Helena; Serra, Juan Alonso; Asiegbu, Fred O; de Dios Barajas-Lopez, Juan; Blande, Daniel; Blokhina, Olga; Blomster, Tiina; Broholm, Suvi; Brosché, Mikael; Cui, Fuqiang; Dardick, Chris; Ehonen, Sanna E; Elomaa, Paula; Escamez, Sacha; Fagerstedt, Kurt V; Fujii, Hiroaki; Gauthier, Adrien; Gollan, Peter J; Halimaa, Pauliina; Heino, Pekka I; Himanen, Kristiina; Hollender, Courtney; Kangasjärvi, Saijaliisa; Kauppinen, Leila; Kelleher, Colin T; Kontunen-Soppela, Sari; Koskinen, J Patrik; Kovalchuk, Andriy; Kärenlampi, Sirpa O; Kärkönen, Anna K; Lim, Kean-Jin; Leppälä, Johanna; Macpherson, Lee; Mikola, Juha; Mouhu, Katriina; Mähönen, Ari Pekka; Niinemets, Ülo; Oksanen, Elina; Overmyer, Kirk; Palva, E Tapio; Pazouki, Leila; Pennanen, Ville; Puhakainen, Tuula; Poczai, Péter; Possen, Boy J H M; Punkkinen, Matleena; Rahikainen, Moona M; Rousi, Matti; Ruonala, Raili; van der Schoot, Christiaan; Shapiguzov, Alexey; Sierla, Maija; Sipilä, Timo P; Sutela, Suvi; Teeri, Teemu H; Tervahauta, Arja I; Vaattovaara, Aleksia; Vahala, Jorma; Vetchinnikova, Lidia; Welling, Annikki; Wrzaczek, Michael; Xu, Enjun; Paulin, Lars G; Schulman, Alan H; Lascoux, Martin; Albert, Victor A; Auvinen, Petri; Helariutta, Ykä; Kangasjärvi, Jaakko
2017-06-01
Silver birch (Betula pendula) is a pioneer boreal tree that can be induced to flower within 1 year. Its rapid life cycle, small (440-Mb) genome, and advanced germplasm resources make birch an attractive model for forest biotechnology. We assembled and chromosomally anchored the nuclear genome of an inbred B. pendula individual. Gene duplicates from the paleohexaploid event were enriched for transcriptional regulation, whereas tandem duplicates were overrepresented by environmental responses. Population resequencing of 80 individuals showed effective population size crashes at major points of climatic upheaval. Selective sweeps were enriched among polyploid duplicates encoding key developmental and physiological triggering functions, suggesting that local adaptation has tuned the timing of and cross-talk between fundamental plant processes. Variation around the tightly-linked light response genes PHYC and FRS10 correlated with latitude and longitude and temperature, and with precipitation for PHYC. Similar associations characterized the growth-promoting cytokinin response regulator ARR1, and the wood development genes KAK and MED5A.
Evolution of developmental regulation in the vertebrate FgfD subfamily.
Jovelin, Richard; Yan, Yi-Lin; He, Xinjun; Catchen, Julian; Amores, Angel; Canestro, Cristian; Yokoi, Hayato; Postlethwait, John H
2010-01-15
Fibroblast growth factors (Fgfs) encode small signaling proteins that help regulate embryo patterning. Fgfs fall into seven families, including FgfD. Nonvertebrate chordates have a single FgfD gene; mammals have three (Fgf8, Fgf17, and Fgf18); and teleosts have six (fgf8a, fgf8b, fgf17, fgf18a, fgf18b, and fgf24). What are the evolutionary processes that led to the structural duplication and functional diversification of FgfD genes during vertebrate phylogeny? To study this question, we investigated conserved syntenies, patterns of gene expression, and the distribution of conserved noncoding elements (CNEs) in FgfD genes of stickleback and zebrafish, and compared them with data from cephalochordates, urochordates, and mammals. Genomic analysis suggests that Fgf8, Fgf17, Fgf18, and Fgf24 arose in two rounds of whole genome duplication at the base of the vertebrate radiation; that fgf8 and fgf18 duplications occurred at the base of the teleost radiation; and that Fgf24 is an ohnolog that was lost in the mammalian lineage. Expression analysis suggests that ancestral subfunctions partitioned between gene duplicates and points to the evolution of novel expression domains. Analysis of CNEs, at least some of which are candidate regulatory elements, suggests that ancestral CNEs partitioned between gene duplicates. These results help explain the evolutionary pathways by which the developmentally important family of FgfD molecules arose and the deduced principles that guided FgfD evolution are likely applicable to the evolution of developmental regulation in many vertebrate multigene families. (c) 2009 Wiley-Liss, Inc.
NASA Astrophysics Data System (ADS)
Tian, Z. H.; Jiao, C. Z.
2017-07-01
RIG-I-like receptors (RLRs) play key roles in sensing non-self nucleic acids in the cytoplasm and trigger the antiviral innate immune response in vertebrates, including humans. Here we carried out an in silico analysis to identify and investigate the putative RLRs encoded in the genome of the marine mollusk Crassostrea gigas (cgRLRs), an invertebrate species. We found unusual duplication of, and variation in the domain architecture of, the putative cgRLRs encoded in the C. gigas genome. Three putative cgRLRs (GenBank accession numbers EKC24603, EKC31344.1 and EKC38304.1) have a domain architecture similar to that of human RIG-I or MDA5, and one protein (EKC34573.1) has an architecture similar to that of human LGP2. The fifth putative cgRLR (EKC38303.1) is somewhat similar to human RIG-I/MDA5, except that it has only one caspase activation and recruitment domain (CARD) at its N-terminus. Nine other proteins were identified as partially similar to RLRs but with incomplete sequences, which may reflect partial duplication events of cgRLR genes in the oyster genome.
Yabe, Taijiro; Ge, Xiaoyan; Pelegri, Francisco
2007-12-01
A female-sterile zebrafish maternal-effect mutation in cellular atoll (cea) results in defects in the initiation of cell division starting at the second cell division cycle. This phenomenon is caused by defects in centrosome duplication, which in turn affect the formation of a bipolar spindle. We show that cea encodes the centriolar coiled-coil protein Sas-6, and that zebrafish Cea/Sas-6 protein localizes to centrosomes. cea also has a genetic paternal contribution, which when mutated results in an arrested first cell division followed by normal cleavage. Our data supports the idea that, in zebrafish, paternally inherited centrosomes are required for the first cell division while maternally derived factors are required for centrosomal duplication and cell divisions in subsequent cell cycles. DNA synthesis ensues in the absence of centrosome duplication, and the one-cycle delay in the first cell division caused by cea mutant sperm leads to whole genome duplication. We discuss the potential implications of these findings with regards to the origin of polyploidization in animal species. In addition, the uncoupling of developmental time and cell division count caused by the cea mutation suggests the presence of a time window, normally corresponding to the first two cell cycles, which is permissive for germ plasm recruitment.
Evolutionary and Expression Analyses of the Apple Basic Leucine Zipper Transcription Factor Family
Zhao, Jiao; Guo, Rongrong; Guo, Chunlei; Hou, Hongmin; Wang, Xiping; Gao, Hua
2016-01-01
Transcription factors (TFs) play essential roles in the regulatory networks controlling many developmental processes in plants. Members of the basic leucine (Leu) zipper (bZIP) TF family, which is unique to eukaryotes, are involved in regulating diverse processes, including flower and vascular development, seed maturation, stress signaling, and defense responses to pathogens. The bZIP proteins have a characteristic bZIP domain composed of a DNA-binding basic region and a Leu zipper dimerization region. In this study, we identified 112 apple (Malus domestica Borkh) bZIP TF-encoding genes, termed MdbZIP genes. Synteny analysis indicated that segmental and tandem duplication events, as well as whole genome duplication, have contributed to the expansion of the apple bZIP family. The family could be divided into 11 groups based on structural features of the encoded proteins, as well as on the phylogenetic relationship of the apple bZIP proteins to those of the model plant Arabidopsis thaliana (AtbZIP genes). Synteny analysis revealed that several paired MdbZIP genes and AtbZIP gene homologs were located in syntenic genomic regions. Furthermore, expression analyses of group A MdbZIP genes showed distinct expression levels in 10 different organs. Moreover, changes in these expression profiles in response to abiotic stress conditions and various hormone treatments identified MdbZIP genes that were responsive to high salinity and drought, as well as to different phytohormones. PMID:27066030
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyer, K.D.; Handen, J.S.; Rosenberg, H.F.
The Charcot-Leyden crystal (CLC) protein, or eosinophil lysophospholipase, is a characteristic protein of human eosinophils and basophils; recent work has demonstrated that the CLC protein is both structurally and functionally related to the galectin family of β-galactoside binding proteins. The galectins as a group share a number of features in common, including a linear ligand binding site encoded on a single exon. In this work, we demonstrate that the intron-exon structure of the gene encoding CLC is analogous to those encoding the galectins. The coding sequence of the CLC gene is divided into four exons, with the entire β-galactoside binding site encoded by exon III. We have isolated CLC β-galactoside binding sites from both orangutan (Pongo pygmaeus) and murine (Mus musculus) genomic DNAs, both encoded on single exons, and noted conservation of the amino acids shown to interact directly with the β-galactoside ligand. The most likely interpretation of these results suggests the occurrence of one or more exon duplication and insertion events, resulting in the distribution of this lectin domain to CLC as well as to the multiple galectin genes. 35 refs., 3 figs.
A highly divergent gene cluster in honey bees encodes a novel silk family.
Sutherland, Tara D; Campbell, Peter M; Weisman, Sarah; Trueman, Holly E; Sriskantha, Alagacone; Wanjura, Wolfgang J; Haritos, Victoria S
2006-11-01
The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1-4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-rich amid low GC intergenic regions. The genes encode similar proteins that are highly helical and predicted to form unusually tight coiled coils. Despite the similarity in size, structure, and composition of the encoded proteins, the genes have low primary sequence identity. We propose that the four fiber genes have arisen from gene duplication events but have subsequently diverged significantly. The silk-associated genes encode proteins likely to act as a glue (AmelSA1) and involved in silk processing (AmelSA2). Although the silks of honey bees and silkmoths both originate in larval labial glands, the silk proteins are completely different in their primary, secondary, and tertiary structures as well as the genomic arrangement of the genes encoding them. This implies independent evolutionary origins for these functionally related proteins.
Sembongi, Hiroshi; Di Re, Miriam; Bokori-Brown, Monika; Holt, Ian J
2007-10-01
Rearrangements of mitochondrial DNA (mtDNA) are a well-recognized cause of human disease; deletions are more frequent, but duplications are more readily transmitted to offspring. In theory, partial duplications of mtDNA can be resolved to partially deleted and wild-type (WT) molecules, via homologous recombination. Therefore, the yeast CCE1 gene, encoding a Holliday junction resolvase, was introduced into cells carrying partially duplicated or partially triplicated mtDNA. Some cell lines carrying the CCE1 gene had substantial amounts of WT mtDNA suggesting that the enzyme can mediate intramolecular recombination in human mitochondria. However, high levels of expression of CCE1 frequently led to mtDNA loss, and so it is necessary to strictly regulate the expression of CCE1 in human cells to ensure the selection and maintenance of WT mtDNA.
Taylor, William R.; Gibbs, Melanie; Breuker, Casper J.; Holland, Peter W. H.
2014-01-01
Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks. PMID:25340822
POM-ZP3, a bipartite transcript derived from human ZP3 and a POM121 homologue
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kipersztok, S.; Osawa, G.A.; Liang, L.F.
1995-01-20
Human POM-ZP3 is a novel bipartite RNA transcript that is derived from a gene homologous to rat POM121 (a nuclear pore membrane protein) and ZP3 (a sperm receptor ligand in the zona pellucida). The 5′ region is 77% identical to the 5′ end of the coding region of rat POM121 and appears to represent a partial duplication of a gene encoding a human homologue of this rodent gene. The 3′ end of the POM-ZP3 transcript is 99% identical to ZP3 and appears to have arisen from a duplication of the last four exons (exons 5-8) of ZP3. Using Northern blots and RT-PCR, POM-ZP3 transcripts were detected in human ovaries, testes, spleen, thymus, lymphocytes, prostate, and intestines. The longest open reading frame encodes a conceptual protein of 210 amino acids, the first 76 of which are 83% identical to residues 241-315 of rat POM121. The next 125 amino acids are 98% identical to residues 239-363 of the 424-amino-acid human ZP3 protein. By fluorescence in situ hybridization, genomic fragments of ZP3 and a human homologue of POM121 were localized to chromosome 7q11.23. Taken together, these data suggest that partial duplications of human ZP3 and a POM121-like gene have resulted in a fusion transcript, POM-ZP3, that is expressed in multiple human tissues. 24 refs., 5 figs.
Mondragón-Palomino, Mariana; Hiese, Luisa; Härter, Andrea; Koch, Marcus A; Theißen, Günter
2009-01-01
Background Positive selection is recognized as the prevalence of nonsynonymous over synonymous substitutions in a gene. Models of the functional evolution of duplicated genes consider neofunctionalization as key to the retention of paralogues. For instance, duplicate transcription factors are specifically retained in plant and animal genomes and both positive selection and transcriptional divergence appear to have played a role in their diversification. However, the relative impact of these two factors has not been systematically evaluated. Class B MADS-box genes, comprising DEF-like and GLO-like genes, encode developmental transcription factors essential for establishment of perianth and male organ identity in the flowers of angiosperms. Here, we contrast the role of positive selection and the known divergence in expression patterns of genes encoding class B-like MADS-box transcription factors from monocots, with emphasis on the family Orchidaceae and the order Poales. Although in the monocots these two groups are highly diverse and have a strongly canalized floral morphology, there is no information on the role of positive selection in the evolution of their distinctive flower morphologies. Published research shows that in Poales, class B-like genes are expressed in stamens and in lodicules, the perianth organs whose identity might also be specified by class B-like genes, like the identity of the inner tepals of their lily-like relatives. In orchids, however, the number and pattern of expression of class B-like genes have greatly diverged. Results The DEF-like genes from Orchidaceae form four well-supported, ancient clades of orthologues. In contrast, orchid GLO-like genes form a single clade of ancient orthologues and recent paralogues. DEF-like genes from orchid clade 2 (OMADS3-like genes) are under less stringent purifying selection than the other orchid DEF-like and GLO-like genes. 
In comparison with orchids, purifying selection was less stringent in DEF-like and GLO-like genes from Poales. Most importantly, positive selection took place before the major organ reduction and losses in the floral axis that eventually yielded the zygomorphic grass floret. Conclusion In DEF-like genes of Poales, positive selection on the region mediating interactions with other proteins or DNA could have triggered the evolution of the regulatory mechanisms behind the development of grass-specific reproductive structures. Orchidaceae show a different trend, where gene duplication and transcriptional divergence appear to have played a major role in the canalization and modularization of perianth development. PMID:19383167
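The abstract above defines positive selection as an excess of nonsynonymous over synonymous substitutions. As a rough illustration of that distinction only, the sketch below classifies codon differences between two aligned coding sequences using the standard genetic code. It is a naive count, not a dN/dS estimator: it does not normalize by the number of synonymous and nonsynonymous sites and does not correct for multiple substitutions. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count codons that differ synonymously and nonsynonymously.

    Both sequences must be aligned, in frame, and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    Returns (synonymous_count, nonsynonymous_count).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal-length, and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not classifiable
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical pair: TTT -> TTC keeps Phe, GGT -> GAT changes Gly to Asp.
print(count_codon_differences("ATGTTTGGT", "ATGTTCGAT"))
```

Tests for selection on real data would rely on codon substitution models fitted along a phylogeny; this count only shows what the two substitution classes mean.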
A Synergism between Adaptive Effects and Evolvability Drives Whole Genome Duplication to Fixation
Cuypers, Thomas D.; Hogeweg, Paulien
2014-01-01
Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes. This pattern has been explained by a neutral process of subfunctionalization and more recently, dosage balance selection. However, much about the relationship between environmental change, WGD and adaptation remains unknown. Here, we study the duplicate retention pattern postWGD, by letting virtual cells adapt to environmental changes. The virtual cells have structured genomes that encode a regulatory network and simple metabolism. Populations are under selection for homeostasis and evolve by point mutations, small indels and WGD. After populations had initially adapted fully to fluctuating resource conditions re-adaptation to a broad range of novel environments was studied by tracking mutations in the line of descent. WGD was established in a minority (≈30%) of lineages, yet, these were significantly more successful at re-adaptation. Unexpectedly, WGD lineages conserved more seemingly redundant genes, yet had higher per gene mutation rates. While WGD duplicates of all functional classes were significantly over-retained compared to a model of neutral losses, duplicate retention was clearly biased towards highly connected TFs. Importantly, no subfunctionalization occurred in conserved pairs, strongly suggesting that dosage balance shaped retention. Meanwhile, singles diverged significantly. WGD, therefore, is a powerful mechanism to cope with environmental change, allowing conservation of a core machinery, while adapting the peripheral network to accommodate change. PMID:24743268
2014-01-01
Background Starch is the main source of carbon storage in the Archaeplastida. The starch biosynthesis pathway (sbp) emerged from cytosolic glycogen metabolism shortly after plastid endosymbiosis and was redirected to the plastid stroma during the green lineage divergence. The SBP is a complex network of genes, most of which are members of large multigene families. While some gene duplications occurred in the Archaeplastida ancestor, most were generated during the sbp redirection process, and the remaining few paralogs were generated through compartmentalization or tissue specialization during the evolution of the land plants. In the present study, we tested models of duplicated gene evolution in order to understand the evolutionary forces that have led to the development of SBP in angiosperms. We combined phylogenetic analyses and tests on the rates of evolution along branches emerging from major duplication events in six gene families encoding sbp enzymes. Results We found evidence of positive selection along branches following cytosolic or plastidial specialization in two starch phosphorylases and identified numerous residues that exhibited changes in volume, polarity or charge. Starch synthases, branching and debranching enzymes functional specializations were also accompanied by accelerated evolution. However, none of the sites targeted by selection corresponded to known functional domains, catalytic or regulatory. Interestingly, among the 13 duplications tested, 7 exhibited evidence of positive selection in both branches emerging from the duplication, 2 in only one branch, and 4 in none of the branches. Conclusions The majority of duplications were followed by accelerated evolution targeting specific residues along both branches. This pattern was consistent with the optimization of the two sub-functions originally fulfilled by the ancestral gene before duplication. 
Our results thereby provide strong support to the so-called “Escape from Adaptive Conflict” (EAC) model. Because none of the residues targeted by selection occurred in characterized functional domains, we propose that enzyme specialization has occurred through subtle changes in affinity, activity or interaction with other enzymes in complex formation, while the basic function defined by the catalytic domain has been maintained. PMID:24884572
Dolferus, R.; Osterman, J. C.; Peacock, W. J.; Dennis, E. S.
1997-01-01
This article reports the cloning of the genes encoding the Arabidopsis and rice class III ADH enzymes, members of the alcohol dehydrogenase or medium chain reductase/dehydrogenase superfamily of proteins with glutathione-dependent formaldehyde dehydrogenase activity (GSH-FDH). Both genes contain eight introns in exactly the same positions, and these positions are conserved in plant ethanol-active Adh genes (class P). These data provide further evidence that plant class P genes have evolved from class III genes by gene duplication and acquisition of new substrate specificities. The position of introns and similarities in the nucleic acid and amino acid sequences of the different classes of ADH enzymes in plants and humans suggest that plant and animal class III enzymes diverged before they duplicated to give rise to plant and animal ethanol-active ADH enzymes. Plant class P ADH enzymes have gained substrate specificities and evolved promoters with different expression properties, in keeping with their metabolic function as part of the alcohol fermentation pathway. PMID:9215914
Gak, Eugene; Tyurin, Michael; Kiriukhin, Michael
2014-05-01
The cell energy fraction that powered maintenance and expression of genes encoding pro-phage elements, pta-ack cluster, early sporulation, sugar ABC transporter periplasmic proteins, 6-phosphofructokinase, pyruvate kinase, and fructose-1,6-bisphosphatase in the acetogen Clostridium sp. MT871 was redirected, by eliminating these genes, to power a synthetic operon encoding isobutanol biosynthesis. Genome tailoring decreased cell duplication time by 7.0 ± 0.1 min (p < 0.05) compared to the parental strain, with intact genome and cell duplication time of 68 ± 1 min (p < 0.05). Clostridium sp. MT871 with tailored genome was UVC-mutated to withstand 6.1 % isobutanol in fermentation broth to prevent product inhibition in an engineered commercial biocatalyst producing 5 % (674.5 mM) isobutanol during two-step continuous fermentation of CO2/H2 gas blend. Biocatalyst Clostridium sp. MT871RG- 11IBR6 was engineered to express six copies of a synthetic operon comprising optimized synthetic formate dehydrogenase, pyruvate formate lyase, acetolactate synthase, acetohydroxyacid reductoisomerase, 2,3-dihydroxy-isovalerate dehydratase, branched-chain alpha-ketoacid decarboxylase, aldehyde dehydrogenase, and alcohol dehydrogenase genes, regaining the cell duplication time of 68 ± 1 min (p < 0.05) of the parental strain. This is the first report on isobutanol production by an engineered acetogen biocatalyst suitable for commercial manufacturing of this chemical/fuel using continuous fermentation of CO2/H2 blend, thus contributing to the reversal of global warming.
Evolution of the nuclear receptor gene superfamily.
Laudet, V; Hänni, C; Coll, J; Catzeflis, F; Stéhelin, D
1992-01-01
Nuclear receptor genes represent a large family of genes encoding receptors for various hydrophobic ligands such as steroids, vitamin D, retinoic acid and thyroid hormones. This family also contains genes encoding putative receptors for unknown ligands. Nuclear receptor gene products are composed of several domains important for transcriptional activation, DNA binding (C domain), hormone binding and dimerization (E domain). It is not known whether these genes have evolved through gene duplication from a common ancestor or if their different domains came from different independent sources. To test these possibilities we have constructed and compared the phylogenetic trees derived from two different domains of 30 nuclear receptor genes. The tree built from the DNA binding C domain clearly shows a common progeny of all nuclear receptors, which can be grouped into three subfamilies: (i) thyroid hormone and retinoic acid receptors, (ii) orphan receptors and (iii) steroid hormone receptors. The tree constructed from the central part of the E domain which is implicated in transcriptional regulation and dimerization shows the same distribution in three subfamilies but two groups of receptors are in a different position from that in the C domain tree: (i) the Drosophila knirps family genes have acquired very different E domains during evolution, and (ii) the vitamin D and ecdysone receptors, as well as the FTZ-F1 and the NGF1B genes, seem to have DNA binding and hormone binding domains belonging to different classes. These data suggest a complex evolutionary history for nuclear receptor genes in which gene duplication events and swapping between domains of different origins took place. PMID:1312460
Xie, Jian-Bo; Du, Zhenglin; Bai, Lanqing; Tian, Changfu; Zhang, Yunzhi; Xie, Jiu-Yan; Wang, Tianshu; Liu, Xiaomeng; Chen, Xi; Cheng, Qi; Chen, Sanfeng; Li, Jilun
2014-01-01
We provide here a comparative genome analysis of 31 strains within the genus Paenibacillus including 11 new genomic sequences of N2-fixing strains. The heterogeneity of the 31 genomes (15 N2-fixing and 16 non-N2-fixing Paenibacillus strains) was reflected in the large size of the shell genome, which makes up approximately 65.2% of the genes in the pan genome. Large numbers of transposable elements might be related to the heterogeneity. We discovered that a minimal and compact nif cluster comprising nine genes nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV encoding Mo-nitrogenase is conserved in the 15 N2-fixing strains. The nif cluster is under control of a σ70-dependent promoter and possesses a GlnR/TnrA-binding site in the promoter. The Suf system encoding [Fe–S] cluster is highly conserved in N2-fixing and non-N2-fixing strains. Furthermore, we demonstrate that the nif cluster enabled Escherichia coli JM109 to fix nitrogen. Phylogeny of the concatenated NifHDK sequences indicates that Paenibacillus and Frankia are sister groups. Phylogeny of the concatenated 275 single-copy core genes suggests that the ancestral Paenibacillus did not fix nitrogen. The N2-fixing Paenibacillus strains were generated by acquiring the nif cluster via horizontal gene transfer (HGT) from a source related to Frankia. During the history of evolution, the nif cluster was lost, producing some non-N2-fixing strains, and vnf encoding V-nitrogenase or anf encoding Fe-nitrogenase was acquired, causing further diversification of some strains. In addition, some N2-fixing strains have additional nif and nif-like genes which may result from gene duplications. The evolution of nitrogen fixation in Paenibacillus involves a mix of gain, loss, HGT and duplication of nif/anf/vnf genes. This study not only reveals the organization and distribution of nitrogen fixation genes in Paenibacillus, but also provides insight into the complex evolutionary history of nitrogen fixation. PMID:24651173
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong
2014-10-16
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile roles in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence.
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C.; Zhang, Baohong
2014-01-01
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile roles in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence. PMID:25322260
Zhou, Qingxiang; Zhang, Tianyi; Xu, Weihua; Yu, Linlin; Yi, Yongzhu; Zhang, Zhifang
2008-01-01
Background: The achaete-scute complex (AS-C) has been widely studied at genetic, developmental and evolutionary levels. Genes of this family encode proteins containing a highly conserved bHLH domain, which take part in regulating the development of the central nervous system and peripheral nervous system. Many AS-C homologs have been isolated from various vertebrates and invertebrates. Also, AS-C genes were duplicated during the evolution of Diptera. Functions besides the control of neural development have also been found in Drosophila AS-C genes. Results: We cloned four achaete-scute homologs (ASH) from the lepidopteran model organism Bombyx mori, including three proneural genes and one neural precursor gene. Proteins encoded by them contained the characteristic bHLH domain and the three proneural ones were also found to have the C-terminal conserved motif. These genes regulated promoter activity through the Class A E-boxes in vitro. Though both Bm-ASH and Drosophila AS-C have four members, they do not show a one-to-one correspondence. Results of RT-PCR and real-time PCR showed that Bm-ASH genes were expressed in different larval tissues, and had well-regulated expression profiles during the development of embryo and wing/wing disc. Conclusion: There are four achaete-scute homologs in Bombyx mori, the second insect found to have four AS-C genes so far, and these genes have multiple functions in the silkworm life cycle. AS-C gene duplication in insects occurred after or in parallel with, but not before, the formation of the taxonomic orders during evolution. PMID:18321391
Zhou, Qingxiang; Zhang, Tianyi; Xu, Weihua; Yu, Linlin; Yi, Yongzhu; Zhang, Zhifang
2008-03-06
The achaete-scute complex (AS-C) has been widely studied at genetic, developmental and evolutionary levels. Genes of this family encode proteins containing a highly conserved bHLH domain, which take part in regulating the development of the central nervous system and peripheral nervous system. Many AS-C homologs have been isolated from various vertebrates and invertebrates. Also, AS-C genes were duplicated during the evolution of Diptera. Functions besides the control of neural development have also been found in Drosophila AS-C genes. We cloned four achaete-scute homologs (ASH) from the lepidopteran model organism Bombyx mori, including three proneural genes and one neural precursor gene. Proteins encoded by them contained the characteristic bHLH domain and the three proneural ones were also found to have the C-terminal conserved motif. These genes regulated promoter activity through the Class A E-boxes in vitro. Though both Bm-ASH and Drosophila AS-C have four members, they do not show a one-to-one correspondence. Results of RT-PCR and real-time PCR showed that Bm-ASH genes were expressed in different larval tissues, and had well-regulated expression profiles during the development of embryo and wing/wing disc. There are four achaete-scute homologs in Bombyx mori, the second insect found to have four AS-C genes so far, and these genes have multiple functions in the silkworm life cycle. AS-C gene duplication in insects occurred after or in parallel with, but not before, the formation of the taxonomic orders during evolution.
Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.
Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui
2016-06-01
In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed across the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on their domains, these genes are divided into two groups and are distributed across all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.
Evolution and Expression of Tissue Globins in Ray-Finned Fishes.
Gallagher, Michael D; Macqueen, Daniel J
2017-01-01
The globin gene family encodes oxygen-binding hemeproteins conserved across the major branches of multicellular life. The origins and evolutionary histories of complete globin repertoires have been established for many vertebrates, but there remain major knowledge gaps for ray-finned fish. Therefore, we used phylogenetic, comparative genomic and gene expression analyses to discover and characterize canonical “non-blood” globin family members (i.e., myoglobin, cytoglobin, neuroglobin, globin-X, and globin-Y) across multiple ray-finned fish lineages, revealing novel gene duplicates (paralogs) conserved from whole genome duplication (WGD) and small-scale duplication events. Our key findings were that: (1) globin-X paralogs in teleosts have been retained from the teleost-specific WGD, (2) functional paralogs of cytoglobin, neuroglobin, and globin-X, but not myoglobin, have been conserved from the salmonid-specific WGD, (3) triplicate lineage-specific myoglobin paralogs are conserved in arowanas (Osteoglossiformes), which arose by tandem duplication and diverged under positive selection, (4) globin-Y is retained in multiple early branching fish lineages that diverged before teleosts, and (5) marked variation in tissue-specific expression of globin gene repertoires exists across ray-finned fish evolution, including several previously uncharacterized sites of expression. In this respect, our data provide an interesting link between myoglobin expression and the evolution of air breathing in teleosts. Together, our findings demonstrate great, previously unrecognized diversity in the repertoire and expression of non-blood globins that has arisen during ray-finned fish evolution.
Origins of neurogenesis, a cnidarian view.
Galliot, Brigitte; Quiquand, Manon; Ghila, Luiza; de Rosa, Renaud; Miljkovic-Licina, Marijana; Chera, Simona
2009-08-01
New perspectives on the origin of neurogenesis emerged with the identification of genes encoding post-synaptic proteins as well as many "neurogenic" regulators such as the NK, Six, Pax, and bHLH proteins in the Demosponge genome, from a species that might differentiate sensory cells but no neurons. However, poriferans seem to lack some key regulators of the neurogenic circuitry, such as the Hox/paraHox and Otx-like gene families. Moreover, as a general feature, many gene families encoding evolutionarily-conserved signaling proteins and transcription factors underwent a wave of gene duplication in the last common eumetazoan ancestor, after Porifera divergence. In contrast, gene duplications in the last common bilaterian ancestor, Urbilateria, are limited, except for the bHLH Atonal-class. Hence Cnidaria share with Bilateria a large number of genetic tools. The expression and functional analyses currently available suggest a neurogenic function for numerous orthologs in developing or adult cnidarians where neurogenesis takes place continuously. As an example, in the Hydra polyp, the Clytia medusa and the Acropora coral, the Gsx/cnox2/Anthox-2 ParaHox gene likely supports neurogenesis. Also, neurons and nematocytes (mechanosensory cells) share in hydrozoans a common stem cell and several regulatory genes, indicating that they can be considered sister cells. Performed in anthozoan and medusozoan species, these studies should tell us more about the way(s) in which the hazards of evolution achieved the transition from epithelial to neuronal cell fate, and about the robustness of the genetic circuitry that allowed neuromuscular transmission to arise and be maintained across evolution.
Sharma, V K; Bayles, D O; Alt, D P; Looft, T; Brunelle, B W; Stasko, J A
2017-03-08
Escherichia coli O157:H7 (O157) strain 86-24, linked to a 1986 disease outbreak, displays curli- and biofilm-negative phenotypes that are correlated with the lack of Congo red (CR) binding and formation of white colonies (CR-) on a CR-containing medium. However, on a CR medium this strain produces red isolates (CR+) capable of producing curli fimbriae and biofilms. To identify genes controlling differential expression of curli fimbriae and biofilm formation, the RNA-Seq profile of a CR+ isolate was compared to the CR- parental isolate. Of the 242 genes expressed differentially in the CR+ isolate, 201 genes encoded proteins of known functions while the remaining 41 encoded hypothetical proteins. Among the genes with known functions, 149 were down- and 52 were up-regulated. Some of the upregulated genes were linked to biofilm formation through biosynthesis of curli fimbriae and flagella. The genes encoding transcriptional regulators, such as CsgD, QseB, YkgK, YdeH, Bdm, CspD, BssR and FlhDC, which modulate biofilm formation, were significantly altered in their expression. Several genes of the envelope stress (cpxP), heat shock (rpoH, htpX, degP), oxidative stress (ahpC, katE), nutrient limitation stress (phoB-phoR and pst) response pathways, and amino acid metabolism were downregulated in the CR+ isolate. Many genes mediating acid resistance and colanic acid biosynthesis, which influence biofilm formation directly or indirectly, were also down-regulated. Comparative genomics of CR+ and CR- isolates revealed the presence of a short duplicated sequence in the rcsB gene of the CR+ isolate. The alignment of the amino acid sequences of RcsB of the two isolates showed truncation of RcsB in the CR+ isolate at the insertion site of the duplicated sequence. Complementation of the CR+ isolate with rcsB of the CR- parent restored parental phenotypes to the CR+ isolate.
The results of this study indicate that RcsB is a global regulator affecting bacterial survival in growth-restrictive environments through upregulation of genes promoting biofilm formation while downregulating certain metabolic functions. Understanding whether rcsB inactivation enhances persistence and survival of O157 in carrier animals and the environment would be important in developing strategies for controlling this bacterial pathogen in these niches.
Yang, Ya; Moore, Michael J.; Brockington, Samuel F.; Soltis, Douglas E.; Wong, Gane Ka-Shu; Carpenter, Eric J.; Zhang, Yong; Chen, Li; Yan, Zhixiang; Xie, Yinlong; Sage, Rowan F.; Covshoff, Sarah; Hibberd, Julian M.; Nelson, Matthew N.; Smith, Stephen A.
2015-01-01
Many phylogenomic studies based on transcriptomes have been limited to “single-copy” genes due to methodological challenges in homology and orthology inferences. Only a relatively small number of studies have explored analyses beyond reconstructing species relationships. We sampled 69 transcriptomes in the hyperdiverse plant clade Caryophyllales and 27 outgroups from annotated genomes across eudicots. Using a combined similarity- and phylogenetic tree-based approach, we recovered 10,960 homolog groups, where each was represented by at least eight ingroup taxa. By decomposing these homolog trees, and taking gene duplications into account, we obtained 17,273 ortholog groups, where each was represented by at least ten ingroup taxa. We reconstructed the species phylogeny using a 1,122-gene data set with a gene occupancy of 92.1%. From the homolog trees, we found that both synonymous and nonsynonymous substitution rates in herbaceous lineages are up to three times as fast as in their woody relatives. This is the first time such a pattern has been shown across thousands of nuclear genes with dense taxon sampling. We also pinpointed regions of the Caryophyllales tree that were characterized by relatively high frequencies of gene duplication, including three previously unrecognized whole-genome duplications. By further combining information from homolog tree topology and synonymous distance between paralog pairs, phylogenetic locations for 13 putative genome duplication events were identified. Genes that experienced the greatest gene family expansion were concentrated among those involved in signal transduction and oxidoreduction, including a cytochrome P450 gene that encodes a key enzyme in the betalain synthesis pathway. Our study demonstrates a new approach for functional phylogenomic analysis in nonmodel species that is based on homolog groups in addition to inferred ortholog groups. PMID:25837578
Heterogeneous conservation of Dlx paralog co-expression in jawed vertebrates.
Debiais-Thibaud, Mélanie; Metcalfe, Cushla J; Pollack, Jacob; Germon, Isabelle; Ekker, Marc; Depew, Michael; Laurenti, Patrick; Borday-Birraux, Véronique; Casane, Didier
2013-01-01
The Dlx gene family encodes transcription factors involved in the development of a wide variety of morphological innovations that first evolved at the origins of vertebrates or of the jawed vertebrates. This gene family expanded with the two rounds of genome duplications that occurred before jawed vertebrates diversified. It includes at least three bigene pairs sharing conserved regulatory sequences in tetrapods and teleost fish, but has been only partially characterized in chondrichthyans, the third major group of jawed vertebrates. Here we take advantage of developmental and molecular tools applied to the shark Scyliorhinus canicula to fill in the gap and provide an overview of the evolution of the Dlx family in the jawed vertebrates. These results are analyzed in the theoretical framework of the DDC (Duplication-Degeneration-Complementation) model. The genomic organisation of the catshark Dlx genes is similar to that previously described for tetrapods. Conserved non-coding elements identified in bony fish were also identified in catshark Dlx clusters and showed regulatory activity in transgenic zebrafish. Gene expression patterns in the catshark showed that there are some expression sites with high conservation of the expressed paralog(s) and other expression sites with events of paralog sub-functionalization during jawed vertebrate diversification, resulting in a wide variety of evolutionary scenarios within this gene family. Dlx gene expression patterns in the catshark show that there has been little neo-functionalization in Dlx genes over gnathostome evolution. In most cases, one tandem duplication and two rounds of vertebrate genome duplication have led to at least six Dlx coding sequences with redundant expression patterns followed by some instances of paralog sub-functionalization. 
Regulatory constraints such as shared enhancers, and functional constraints including gene pleiotropy, may have contributed to the evolutionary inertia leading to high redundancy between gene expression patterns.
Genome-wide analysis of WRKY gene family in Cucumis sativus
2011-01-01
Background: WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results: We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions: Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985
Genome-wide analysis of WRKY gene family in Cucumis sativus.
Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan
2011-09-28
WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
Transcriptional analysis of the R locus: Progress report, September 1986 through October 1987
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wessler, S.R.
1987-11-01
The R locus controls where, when and how much anthocyanins are expressed in at least 11 different tissues of the corn plant and seed. Enormous natural variation has been seen when the phenotypes of different R alleles are compared in a common genetic background. Some alleles have been shown to have a compound structure resulting from gene duplication and divergence. In these complex alleles, each member of the duplication (called R genic elements) has a unique pattern of expression. The function of the R locus is not known; genetic and biochemical analyses suggest that it may encode a protein that regulates other genes in the anthocyanin pathway. Over the past year we have determined that the genic elements (P), (S), and (Lc) all encode a very rare 2.8 kb transcript that is present in tissue displaying anthocyanin pigmentation. cDNA libraries have been constructed using mRNA isolated from tissues shown by Northern blots to be enriched for the R transcript. Full-length cDNAs will be sequenced and compared to each other.
The ribosomal protein genes and Minute loci of Drosophila melanogaster
Marygold, Steven J; Roote, John; Reuter, Gunter; Lambertsson, Andrew; Ashburner, Michael; Millburn, Gillian H; Harrison, Paul M; Yu, Zhan; Kenmochi, Naoya; Kaufman, Thomas C; Leevers, Sally J; Cook, Kevin R
2007-01-01
Background: Mutations in genes encoding ribosomal proteins (RPs) have been shown to cause an array of cellular and developmental defects in a variety of organisms. In Drosophila melanogaster, disruption of RP genes can result in the 'Minute' syndrome of dominant, haploinsufficient phenotypes, which include prolonged development, short and thin bristles, and poor fertility and viability. While more than 50 Minute loci have been defined genetically, only 15 have so far been characterized molecularly and shown to correspond to RP genes. Results: We combined bioinformatic and genetic approaches to conduct a systematic analysis of the relationship between RP genes and Minute loci. First, we identified 88 genes encoding 79 different cytoplasmic RPs (CRPs) and 75 genes encoding distinct mitochondrial RPs (MRPs). Interestingly, nine CRP genes are present as duplicates and, while all appear to be functional, one member of each gene pair has relatively limited expression. Next, we defined 65 discrete Minute loci by genetic criteria. Of these, 64 correspond to, or very likely correspond to, CRP genes; the single non-CRP-encoding Minute gene encodes a translation initiation factor subunit. Significantly, MRP genes and more than 20 CRP genes do not correspond to Minute loci. Conclusion: This work answers a longstanding question about the molecular nature of Minute loci and suggests that Minute phenotypes arise from suboptimal protein synthesis resulting from reduced levels of cytoribosomes. Furthermore, by identifying the majority of haplolethal and haplosterile loci at the molecular level, our data will directly benefit efforts to attain complete deletion coverage of the D. melanogaster genome. PMID:17927810
JGI Plant Genomics Gene Annotation Pipeline
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shu, Shengqiang; Rokhsar, Dan; Goodstein, David
2014-07-14
Plant genomes vary in size and are highly complex, with many repeats, genome duplications and tandem duplications. Genes encode a wealth of information useful in studying organisms, and it is critical to have high-quality and stable gene annotation. Thanks to advances in sequencing technology, the genomes of many plant species have been sequenced, and transcriptomes are also being sequenced. To use these vast amounts of sequence data for gene annotation or re-annotation in a timely fashion, an automatic pipeline is needed. The JGI plant genomics gene annotation pipeline, called integrated gene call (IGC), is our effort toward this aim, with the aid of an RNA-seq transcriptome assembly pipeline. It utilizes several gene predictors based on homolog peptides and transcript ORFs. See Methods for detail. Here we present genome annotation of JGI flagship green plants produced by this pipeline plus Arabidopsis and rice except for chlamy which is done by a third party. The genome annotations of these species and others are used in our gene family build pipeline and are accessible via the JGI Phytozome portal, whose URL and front page snapshot are shown below.
'Laminopathies': A wide spectrum of human diseases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Worman, Howard J.; Bonne, Gisele; Universite Pierre et Marie Curie-Paris 6, Faculte de medecine, Paris F-75013
2007-06-10
Mutations in genes encoding the intermediate filament nuclear lamins and associated proteins cause a wide spectrum of diseases sometimes called 'laminopathies.' Diseases caused by mutations in LMNA encoding A-type lamins include autosomal dominant Emery-Dreifuss muscular dystrophy and related myopathies, Dunnigan-type familial partial lipodystrophy, Charcot-Marie-Tooth disease type 2B1 and developmental and accelerated aging disorders. Duplication in LMNB1 encoding lamin B1 causes autosomal dominant leukodystrophy and mutations in LMNB2 encoding lamin B2 are associated with acquired partial lipodystrophy. Disorders caused by mutations in genes encoding lamin-associated integral inner nuclear membrane proteins include X-linked Emery-Dreifuss muscular dystrophy, sclerosing bone dysplasias, HEM/Greenberg skeletal dysplasia and Pelger-Huet anomaly. While mutations and clinical phenotypes of 'laminopathies' have been carefully described, data explaining pathogenic mechanisms are only emerging. Future investigations will likely identify new 'laminopathies' and a combination of basic and clinical research will lead to a better understanding of pathophysiology and the development of therapies.
Wood, Gwendolyn E.; Haydock, Andrew K.; Leigh, John A.
2003-01-01
Methanococcus maripaludis is a mesophilic species of Archaea capable of producing methane from two substrates: hydrogen plus carbon dioxide and formate. To study the latter, we identified the formate dehydrogenase genes of M. maripaludis and found that the genome contains two gene clusters important for formate utilization. Phylogenetic analysis suggested that the two formate dehydrogenase gene sets arose from duplication events within the methanococcal lineage. The first gene cluster encodes homologs of formate dehydrogenase α (FdhA) and β (FdhB) subunits and a putative formate transporter (FdhC) as well as a carbonic anhydrase analog. The second gene cluster encodes only FdhA and FdhB homologs. Mutants lacking either fdhA gene exhibited a partial growth defect on formate, whereas a double mutant was completely unable to grow on formate as a sole methanogenic substrate. Investigation of fdh gene expression revealed that transcription of both gene clusters is controlled by the presence of H2 and not by the presence of formate. PMID:12670979
The Goddard and Saturn Genes Are Essential for Drosophila Male Fertility and May Have Arisen De Novo
Gubala, Anna M.; Schmitz, Jonathan F.; Kearns, Michael J.; Vinh, Tery T.; Bornberg-Bauer, Erich; Wolfner, Mariana F.
2017-01-01
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. PMID:28104747
Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B., Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj
2013-01-01
The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress tolerant crop, Setaria italica (foxtail millet) that is famed as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants. PMID:23691254
Lu, Hsiao-ling; Tanguy, Sylvie; Rispe, Claude; Gauthier, Jean-Pierre; Walsh, Tom; Gordon, Karl; Edwards, Owain; Tagu, Denis; Chang, Chun-che; Jaubert-Possamai, Stéphanie
2011-01-01
Piwi-interacting RNAs (piRNAs) are known to regulate transposon activity in germ cells of several animal models that propagate sexually. However, the role of piRNAs during asexual reproduction remains almost unknown. Aphids that can alternate sexual and asexual reproduction cycles in response to seasonal changes of photoperiod provide a unique opportunity to study piRNAs and the piRNA pathway in both reproductive modes. Taking advantage of the recently sequenced genome of the pea aphid Acyrthosiphon pisum, we found an unusually large lineage-specific expansion of genes encoding the Piwi sub-clade of Argonaute proteins. In situ hybridisation showed differential expressions between the duplicated piwi copies: while Api-piwi2 and Api-piwi6 are “specialised” in germ cells their most closely related copy, respectively Api-piwi5 and Api-piwi3, are expressed in the somatic cells. The differential expression was also identified in duplicated ago3: Api-ago3a in germ cells and Api-ago3b in somatic cells. Moreover, analyses of expression profiles of the expanded piwi and ago3 genes by semi-quantitative RT-PCR showed that expressions varied according to the reproductive types. These specific expression patterns suggest that expanded aphid piwi and ago3 genes have distinct roles in asexual and sexual reproduction. PMID:22162754
Evolution and Expression of Tissue Globins in Ray-Finned Fishes
Gallagher, Michael D.
2017-01-01
The globin gene family encodes oxygen-binding hemeproteins conserved across the major branches of multicellular life. The origins and evolutionary histories of complete globin repertoires have been established for many vertebrates, but there remain major knowledge gaps for ray-finned fish. Therefore, we used phylogenetic, comparative genomic and gene expression analyses to discover and characterize canonical “non-blood” globin family members (i.e., myoglobin, cytoglobin, neuroglobin, globin-X, and globin-Y) across multiple ray-finned fish lineages, revealing novel gene duplicates (paralogs) conserved from whole genome duplication (WGD) and small-scale duplication events. Our key findings were that: (1) globin-X paralogs in teleosts have been retained from the teleost-specific WGD, (2) functional paralogs of cytoglobin, neuroglobin, and globin-X, but not myoglobin, have been conserved from the salmonid-specific WGD, (3) triplicate lineage-specific myoglobin paralogs are conserved in arowanas (Osteoglossiformes), which arose by tandem duplication and diverged under positive selection, (4) globin-Y is retained in multiple early branching fish lineages that diverged before teleosts, and (5) marked variation in tissue-specific expression of globin gene repertoires exists across ray-finned fish evolution, including several previously uncharacterized sites of expression. In this respect, our data provide an interesting link between myoglobin expression and the evolution of air breathing in teleosts. Together, our findings demonstrate great unrecognized diversity in the repertoire and expression of non-blood globins that has arisen during ray-finned fish evolution. PMID:28173090
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress
Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming
2017-01-01
The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, the knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family, and their expression analysis in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. There are 76 pairs of duplicate genes identified, including genome-wide segmental and tandem duplication events, which lead to the expansion of the number of F-box genes. The in silico expression analysis showed that these genes would be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance. PMID:28417911
Genome-Wide Analyses of the Soybean F-Box Gene Family in Response to Salt Stress.
Jia, Qi; Xiao, Zhi-Xia; Wong, Fuk-Ling; Sun, Song; Liang, Kang-Jing; Lam, Hon-Ming
2017-04-12
The F-box family is one of the largest gene families in plants that regulate diverse life processes, including salt responses. However, the knowledge of the soybean F-box genes and their roles in salt tolerance remains limited. Here, we conducted a genome-wide survey of the soybean F-box family, and their expression analysis in response to salinity via in silico analysis of online RNA-sequencing (RNA-seq) data and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) to predict their potential functions. A total of 725 potential F-box proteins encoded by 509 genes were identified and classified into 9 subfamilies. The gene structures, conserved domains and chromosomal distributions were characterized. There are 76 pairs of duplicate genes identified, including genome-wide segmental and tandem duplication events, which lead to the expansion of the number of F-box genes. The in silico expression analysis showed that these genes would be involved in diverse developmental functions and play an important role in salt response. Our qRT-PCR analysis confirmed 12 salt-responding F-box genes. Overall, our results provide useful information on soybean F-box genes, especially their potential roles in salt tolerance.
Solis-Escalante, Daniel; Kuijpers, Niels G. A.; Barrajon-Simancas, Nuria; van den Broek, Marcel; Pronk, Jack T.
2015-01-01
As a result of ancestral whole-genome and small-scale duplication events, the genomes of Saccharomyces cerevisiae and many eukaryotes still contain a substantial fraction of duplicated genes. In all investigated organisms, metabolic pathways, and more particularly glycolysis, are specifically enriched for functionally redundant paralogs. In ancestors of the Saccharomyces lineage, the duplication of glycolytic genes is purported to have played an important role leading to S. cerevisiae's current lifestyle favoring fermentative metabolism even in the presence of oxygen and characterized by a high glycolytic capacity. In modern S. cerevisiae strains, the 12 glycolytic reactions leading to the biochemical conversion from glucose to ethanol are encoded by 27 paralogs. In order to experimentally explore the physiological role of this genetic redundancy, a yeast strain with a minimal set of 14 paralogs was constructed (the “minimal glycolysis” [MG] strain). Remarkably, a combination of a quantitative systems approach and semiquantitative analysis in a wide array of growth environments revealed the absence of a phenotypic response to the cumulative deletion of 13 glycolytic paralogs. This observation indicates that duplication of glycolytic genes is not a prerequisite for achieving the high glycolytic fluxes and fermentative capacities that are characteristic of S. cerevisiae and essential for many of its industrial applications and argues against gene dosage effects as a means of fixing minor glycolytic paralogs in the yeast genome. The MG strain was carefully designed and constructed to provide a robust prototrophic platform for quantitative studies and has been made available to the scientific community. PMID:26071034
Makeyev, Aleksandr V.; Erdenechimeg, Lkhamsuren; Mungunsukh, Ognoon; Roth, Jutta J.; Enkhmandakh, Badam; Ruddle, Frank H.; Bayarsaihan, Dashzeveg
2004-01-01
Williams–Beuren syndrome (also known as Williams syndrome) is caused by a deletion of a 1.55- to 1.84-megabase region from chromosome band 7q11.23. GTF2IRD1 and GTF2I, located within this critical region, encode proteins of the TFII-I family with multiple helix–loop–helix domains known as I repeats. In the present work, we characterize a third member, GTF2IRD2, which has sequence and structural similarity to the GTF2I and GTF2IRD1 paralogs. The ORF encodes a protein with several features characteristic of regulatory factors, including two I repeats, two leucine zippers, and a single Cys-2/His-2 zinc finger. The genomic organization of human, baboon, rat, and mouse genes is well conserved. Our exon-by-exon comparison has revealed that GTF2IRD2 is more closely related to GTF2I than to GTF2IRD1 and apparently is derived from the GTF2I sequence. The comparison of GTF2I and GTF2IRD2 genes revealed two distinct regions of homology, indicating that the helix–loop–helix domain structure of the GTF2IRD2 gene has been generated by two independent genomic duplications. We speculate that GTF2I is derived from GTF2IRD1 as a result of local duplication and the further evolution of its structure was associated with its functional specialization. Comparison of genomic sequences surrounding GTF2IRD2 genes in mice and humans allows refinement of the centromeric breakpoint position of the primate-specific inversion within the Williams–Beuren syndrome critical region. PMID:15243160
Makeyev, Aleksandr V; Erdenechimeg, Lkhamsuren; Mungunsukh, Ognoon; Roth, Jutta J; Enkhmandakh, Badam; Ruddle, Frank H; Bayarsaihan, Dashzeveg
2004-07-27
Williams-Beuren syndrome (also known as Williams syndrome) is caused by a deletion of a 1.55- to 1.84-megabase region from chromosome band 7q11.23. GTF2IRD1 and GTF2I, located within this critical region, encode proteins of the TFII-I family with multiple helix-loop-helix domains known as I repeats. In the present work, we characterize a third member, GTF2IRD2, which has sequence and structural similarity to the GTF2I and GTF2IRD1 paralogs. The ORF encodes a protein with several features characteristic of regulatory factors, including two I repeats, two leucine zippers, and a single Cys-2/His-2 zinc finger. The genomic organization of human, baboon, rat, and mouse genes is well conserved. Our exon-by-exon comparison has revealed that GTF2IRD2 is more closely related to GTF2I than to GTF2IRD1 and apparently is derived from the GTF2I sequence. The comparison of GTF2I and GTF2IRD2 genes revealed two distinct regions of homology, indicating that the helix-loop-helix domain structure of the GTF2IRD2 gene has been generated by two independent genomic duplications. We speculate that GTF2I is derived from GTF2IRD1 as a result of local duplication and the further evolution of its structure was associated with its functional specialization. Comparison of genomic sequences surrounding GTF2IRD2 genes in mice and humans allows refinement of the centromeric breakpoint position of the primate-specific inversion within the Williams-Beuren syndrome critical region.
Matoso, Eunice; Melo, Joana B; Ferreira, Susana I; Jardim, Ana; Castelo, Teresa M; Weise, Anja; Carreira, Isabel M
2013-08-01
An insertional translocation (IT) can result in pure segmental aneusomy for the inserted genomic segment allowing to define a more accurate clinical phenotype. Here, we report on two siblings sharing an unbalanced IT inherited from the mother with a history of learning difficulty. An 8-year-old girl with developmental delay, speech disability, and attention-deficit hyperactivity disorder (ADHD), showed by GTG banding analysis a subtle interstitial alteration in 21q21. Oligonucleotide array comparative genomic hybridization (array-CGH) analysis showed a 4q13.1-q13.3 duplication spanning 8.6 Mb. Fluorescence in situ hybridization (FISH) with bacterial artificial chromosome (BAC) clones confirmed the rearrangement, a der(21)ins(21;4)(q21;q13.1q13.3). The duplication described involves 50 RefSeq genes including the EPHA5 gene that encodes for the EphA5 receptor involved in embryonic development of the brain and also in synaptic remodeling and plasticity thought to underlie learning and memory. The same rearrangement was observed in a younger brother with behavioral problems and also exhibiting ADHD. ADHD is among the most heritable of neuropsychiatric disorders. There are few reports of patients with duplications involving the proximal region of 4q and a mild phenotype. To the best of our knowledge this is the first report of a duplication restricted to band 4q13. This abnormality could be easily missed in children who have nonspecific cognitive impairment. The presence of this behavioral disorder in the two siblings reinforces the hypothesis that the region involved could include genes involved in ADHD. Copyright © 2013 Wiley Periodicals, Inc.
Johansson, Tomas; Nyman, Per Olof; Cullen, Daniel
2002-04-01
A peroxidase-encoding gene, mnp2, and its corresponding cDNA were characterized from the white-rot basidiomycete Trametes versicolor PRL 572. We used quantitative reverse transcriptase-mediated PCR to identify mnp2 transcripts in nutrient-limited stationary cultures. Although mnp2 lacks upstream metal response elements (MREs), addition of MnSO4 to cultures increased mnp2 transcript levels 250-fold. In contrast, transcript levels of an MRE-containing gene of T. versicolor, mnp1, increased only eightfold under the same conditions. Thus, the manganese peroxidase genes in T. versicolor are differentially regulated, and upstream MREs are not necessarily involved. Our results support the hypothesis that fungal and plant peroxidases arose through an ancient duplication and folding of two structural domains, since we found the mnp1 and mnp2 polypeptides to have internal homology.
Johansson, Tomas; Nyman, Per Olof; Cullen, Daniel
2002-01-01
A peroxidase-encoding gene, mnp2, and its corresponding cDNA were characterized from the white-rot basidiomycete Trametes versicolor PRL 572. We used quantitative reverse transcriptase-mediated PCR to identify mnp2 transcripts in nutrient-limited stationary cultures. Although mnp2 lacks upstream metal response elements (MREs), addition of MnSO4 to cultures increased mnp2 transcript levels 250-fold. In contrast, transcript levels of an MRE-containing gene of T. versicolor, mnp1, increased only eightfold under the same conditions. Thus, the manganese peroxidase genes in T. versicolor are differentially regulated, and upstream MREs are not necessarily involved. Our results support the hypothesis that fungal and plant peroxidases arose through an ancient duplication and folding of two structural domains, since we found the mnp1 and mnp2 polypeptides to have internal homology. PMID:11916737
Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Fu, Bolei; Cullen, Dan
2013-06-01
The oxidative enzymatic machinery for degradation of organic substrates in Agaricus bisporus (Ab) is at the core of the carbon recycling mechanisms in this fungus. To date, 156 genes have been tentatively identified as part of this oxidative enzymatic machinery, which includes 26 peroxidase-encoding genes, nine copper radical oxidases [including three putative glyoxal oxidase-encoding genes (GLXs)], 12 laccases sensu stricto and 109 cytochrome P450 monooxygenases. Comparative analyses of these enzymes in Ab with those of the white-rot fungus, Phanerochaete chrysosporium, the brown-rot fungus, Postia placenta, the coprophilic litter fungus, Coprinopsis cinerea and the ectomycorrhizal fungus, Laccaria bicolor, revealed enzyme diversity consistent with adaptation to substrates rich in humic substances and partially degraded plant material. For instance, relative to wood decay fungi, Ab cytochrome P450 genes were less numerous (109 gene models), distributed among distinctive families, and lacked extensive duplication and clustering. Viewed together with P450 transcript accumulation patterns in three tested growth conditions, these observations were consistent with the unique Ab lifestyle. Based on tandem gene arrangements, a certain degree of gene duplication seems to have occurred in this fungus in the copper radical oxidase (CRO) and the laccase gene families. In Ab, high transcript levels and regulation of the heme-thiolate peroxidases, two manganese peroxidases and the three GLX-like genes are likely in response to complex natural substrates, including lignocellulose and its derivatives, thereby suggesting an important role in lignin degradation. On the other hand, the expression patterns of the related CROs suggest a developmental role in this fungus. Based on these observations, a brief comparative genomic overview of the Ab oxidative enzyme machinery is presented. Copyright © 2013 Elsevier Inc. All rights reserved.
[Divergence of paralogous growth-hormone-encoding genes and their promoters in Salmonidae].
Kamenskaya, D N; Pankova, M V; Atopkin, D M; Brykov, V A
2017-01-01
In many fish species, including salmonids, growth hormone is encoded by two duplicated paralogous genes, gh1 and gh2. Both genes were already in place at the time of divergence of species in this group. A comparison of the entire sequence of these genes of salmonids has shown that their conserved regions are associated with exons, while their most variable regions correspond to introns. Introns C and D include putative regulatory elements (sites Pit-1, CRE, and ERE) that are also conserved. In chars, the degree of polymorphism of the gh2 gene is 2-3 times as large as that of the gh1 gene. However, a comparison across all Salmonidae species does not extend this observation to other species. In both of these genes in chars, the promoters are conserved mainly because they correspond to putative regulatory sequences (TATA box, binding sites for the pituitary transcription factor Pit-1 (F1-F4), CRE, GRE and RAR/RXR elements). The promoter of the gh2 gene has a greater degree of polymorphism than the gh1 gene promoter in all investigated species of salmonids. The observed differences in the rates of accumulation of changes in growth hormone-encoding paralogs could be explained by differences in the intensity of selection.
Lagman, David; Ocampo Daza, Daniel; Widmark, Jenny; Abalo, Xesús M; Sundström, Görel; Larhammar, Dan
2013-11-02
Vertebrate color vision is dependent on four major color opsin subtypes: RH2 (green opsin), SWS1 (ultraviolet opsin), SWS2 (blue opsin), and LWS (red opsin). Together with the dim-light receptor rhodopsin (RH1), these form the family of vertebrate visual opsins. Vertebrate genomes contain many multi-membered gene families that can largely be explained by the two rounds of whole genome duplication (WGD) in the vertebrate ancestor (2R) followed by a third round in the teleost ancestor (3R). Related chromosome regions resulting from WGD or block duplications are said to form a paralogon. We describe here a paralogon containing the genes for visual opsins, the G-protein alpha subunit families for transducin (GNAT) and adenylyl cyclase inhibition (GNAI), the oxytocin and vasopressin receptors (OT/VP-R), and the L-type voltage-gated calcium channels (CACNA1-L). Sequence-based phylogenies and analyses of conserved synteny show that the above-mentioned gene families, and many neighboring gene families, expanded in the early vertebrate WGDs. This allows us to deduce the following evolutionary scenario: The vertebrate ancestor had a chromosome containing the genes for two visual opsins, one GNAT, one GNAI, two OT/VP-Rs and one CACNA1-L gene. This chromosome was quadrupled in 2R. Subsequent gene losses resulted in a set of five visual opsin genes, three GNAT and GNAI genes, six OT/VP-R genes and four CACNA1-L genes. These regions were duplicated again in 3R resulting in additional teleost genes for some of the families. Major chromosomal rearrangements have taken place in the teleost genomes. By comparison with the corresponding chromosomal regions in the spotted gar, which diverged prior to 3R, we could time these rearrangements to post-3R. We present an extensive analysis of the paralogon housing the visual opsin, GNAT and GNAI, OT/VP-R, and CACNA1-L gene families. 
The combined data imply that the early vertebrate WGD events contributed to the evolution of vision and the other neuronal and neuroendocrine functions exerted by the proteins encoded by these gene families. In pouched lamprey all five visual opsin genes have previously been identified, suggesting that lampreys diverged from the jawed vertebrates after 2R.
Takeuchi, Takeshi; Koyanagi, Ryo; Gyoja, Fuki; Kanda, Miyuki; Hisata, Kanako; Fujie, Manabu; Goto, Hiroki; Yamasaki, Shinichi; Nagai, Kiyohito; Morino, Yoshiaki; Miyamoto, Hiroshi; Endo, Kazuyoshi; Endo, Hirotoshi; Nagasawa, Hiromichi; Kinoshita, Shigeharu; Asakawa, Shuichi; Watabe, Shugo; Satoh, Noriyuki; Kawashima, Takeshi
2016-01-01
Bivalve molluscs have flourished in marine environments, and many species constitute important aquatic resources. Recently, whole genome sequences from two bivalves, the pearl oyster, Pinctada fucata, and the Pacific oyster, Crassostrea gigas, have been decoded, making it possible to compare genomic sequences among molluscs, and to explore general and lineage-specific genetic features and trends in bivalves. In order to improve the quality of sequence data for these purposes, we have updated the entire P. fucata genome assembly. We present a new genome assembly of the pearl oyster, Pinctada fucata (version 2.0). To update the assembly, we conducted additional sequencing, obtaining accumulated sequence data amounting to 193× the P. fucata genome. Sequence redundancy in contigs that was caused by heterozygosity was removed in silico, which significantly improved subsequent scaffolding. Gene model version 2.0 was generated with the aid of manual gene annotations supplied by the P. fucata research community. Comparison of mollusc and other bilaterian genomes shows that gene arrangements of Hox, ParaHox, and Wnt clusters in the P. fucata genome are similar to those of other molluscs. Like the Pacific oyster, P. fucata possesses many genes involved in environmental responses and in immune defense. Phylogenetic analyses of heat shock protein70 and C1q domain-containing protein families indicate that extensive expansion of genes occurred independently in each lineage. Several gene duplication events prior to the split between the pearl oyster and the Pacific oyster are also evident. In addition, a number of tandem duplications of genes that encode shell matrix proteins are also well characterized in the P. fucata genome. Both the Pinctada and Crassostrea lineages have expanded specific gene families in a lineage-specific manner. Frequent duplication of genes responsible for shell formation in the P. fucata genome explains the diversity of mollusc shell structures. 
These duplications reveal dynamic genome evolution to forge the complex physiology that enables bivalves to employ a sessile lifestyle in the intertidal zone.
Duplications and losses in gene families of rust pathogens highlight putative effectors.
Pendleton, Amanda L; Smith, Katherine E; Feau, Nicolas; Martin, Francis M; Grigoriev, Igor V; Hamelin, Richard; Nelson, C Dana; Burleigh, J Gordon; Davis, John M
2014-01-01
Rust fungi are a group of fungal pathogens that cause some of the world's most destructive diseases of trees and crops. A shared characteristic among rust fungi is obligate biotrophy, the inability to complete a lifecycle without a host. This dependence on a host species likely affects patterns of gene expansion, contraction, and innovation within rust pathogen genomes. The establishment of disease by biotrophic pathogens is reliant upon effector proteins that are encoded in the fungal genome and secreted from the pathogen into the host's cell apoplast or within the cells. This study uses a comparative genomic approach to elucidate putative effectors and determine their evolutionary histories. We used OrthoMCL to identify nearly 20,000 gene families in proteomes of 16 diverse fungal species, which include 15 basidiomycetes and one ascomycete. We inferred patterns of duplication and loss for each gene family and identified families with distinctive patterns of expansion/contraction associated with the evolution of rust fungal genomes. To recognize potential contributors for the unique features of rust pathogens, we identified families harboring secreted proteins that: (i) arose or expanded in rust pathogens relative to other fungi, or (ii) contracted or were lost in rust fungal genomes. While the origin of rust fungi appears to be associated with considerable gene loss, there are many gene duplications associated with each sampled rust fungal genome. We also highlight two putative effector gene families that have expanded in Cqf that we hypothesize have roles in pathogenicity.
Virts, Elizabeth L; Jankowska, Anna; Mackay, Craig; Glaas, Marcel F; Wiek, Constanze; Kelich, Stephanie L; Lottmann, Nadine; Kennedy, Felicia M; Marchal, Christophe; Lehnert, Erik; Scharf, Rüdiger E; Dufour, Carlo; Lanciotti, Marina; Farruggia, Piero; Santoro, Alessandra; Savasan, Süreyya; Scheckenbach, Kathrin; Schipper, Jörg; Wagenmann, Martin; Lewis, Todd; Leffak, Michael; Farlow, Janice L; Foroud, Tatiana M; Honisch, Ellen; Niederacher, Dieter; Chakraborty, Sujata C; Vance, Gail H; Pruss, Dmitry; Timms, Kirsten M; Lanchbury, Jerry S; Alpi, Arno F; Hanenberg, Helmut
2015-09-15
Fanconi anemia (FA) is a rare inherited disorder clinically characterized by congenital malformations, progressive bone marrow failure and cancer susceptibility. At the cellular level, FA is associated with hypersensitivity to DNA-crosslinking genotoxins. Eight of 17 known FA genes assemble the FA E3 ligase complex, which catalyzes monoubiquitination of FANCD2 and is essential for replicative DNA crosslink repair. Here, we identify the first FA patient with biallelic germline mutations in the ubiquitin E2 conjugase UBE2T. Both mutations were aluY-mediated: a paternal deletion and maternal duplication of exons 2-6. These loss-of-function mutations in UBE2T induced a cellular phenotype similar to biallelic defects in early FA genes with the absence of FANCD2 monoubiquitination. The maternal duplication produced a mutant mRNA that could encode a functional protein but was degraded by nonsense-mediated mRNA decay. In the patient's hematopoietic stem cells, the maternal allele with the duplication of exons 2-6 spontaneously reverted to a wild-type allele by monoallelic recombination at the duplicated aluY repeat, thereby preventing bone marrow failure. Analysis of germline DNA of 814 normal individuals and 850 breast cancer patients for deletion or duplication of UBE2T exons 2-6 identified the deletion in only two controls, suggesting aluY-mediated recombinations within the UBE2T locus are rare and not associated with an increased breast cancer risk. Finally, a loss-of-function germline mutation in UBE2T was detected in a high-risk breast cancer patient with wild-type BRCA1/2. Cumulatively, we identified UBE2T as a bona fide FA gene (FANCT) that also may be a rare cancer susceptibility gene. © The Author 2015. Published by Oxford University Press.
Duplication and selection in the evolution of primate β-defensin genes
Semple, Colin AM; Rolfe, Mark; Dorin, Julia R
2003-01-01
Background: Innate immunity is the first line of defense against microorganisms in vertebrates and acts by providing an initial barrier to microorganisms and triggering adaptive immune responses. Peptides such as β-defensins are an important component of this defense, providing a broad spectrum of antimicrobial activity against bacteria, fungi, mycobacteria and several enveloped viruses. β-defensins are small cationic peptides that vary in their expression patterns and spectrum of pathogen specificity. Disruptions in β-defensin function have been implicated in human diseases, including cystic fibrosis, and a fuller understanding of the variety, function and evolution of human β-defensins might form the basis for novel therapies. Here we use a combination of laboratory and computational techniques to characterize the main human β-defensin locus on chromosome 8p22-p23. Results: In addition to known genes in the region we report the genomic structures and expression patterns of four novel human β-defensin genes and a related pseudogene. These genes show an unusual pattern of evolution, with rapid divergence between second exon sequences that encode the mature β-defensin peptides matched by relative stasis in first exons that encode signal peptides. Conclusions: We conclude that the 8p22-p23 locus has evolved by successive rounds of duplication followed by substantial divergence involving positive selection, to produce a diverse cluster of paralogous genes established before the human-baboon divergence more than 23 million years ago. Positive selection, disproportionately favoring alterations in the charge of amino-acid residues, is implicated as driving second exon divergence in these genes. PMID:12734011
Gubala, Anna M; Schmitz, Jonathan F; Kearns, Michael J; Vinh, Tery T; Bornberg-Bauer, Erich; Wolfner, Mariana F; Findlay, Geoffrey D
2017-05-01
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zhao, Yang; Zhou, Yuqiong; Jiang, Haiyang; Li, Xiaoyu; Gan, Defang; Peng, Xiaojian; Zhu, Suwen; Cheng, Beijiu
2011-01-01
Background: Members of the homeodomain-leucine zipper (HD-Zip) gene family encode transcription factors that are unique to plants and have diverse functions in plant growth and development such as various stress responses, organ formation and vascular development. Although systematic characterization of this family has been carried out in Arabidopsis and rice, little is known about HD-Zip genes in maize (Zea mays L.). Methods and Findings: In this study, we describe the identification and structural characterization of HD-Zip genes in the maize genome. A complete set of 55 HD-Zip genes (Zmhdz1-55) was identified in the maize genome using Blast search tools and categorized into four classes (HD-Zip I-IV) based on phylogeny. Chromosomal location of these genes revealed that they are distributed unevenly across all 10 chromosomes. Segmental duplication contributed largely to the expansion of the maize HD-Zip gene family, while tandem duplication was only responsible for the amplification of the HD-Zip II genes. Furthermore, most of the maize HD-Zip I genes were found to contain an overabundance of stress-related cis-elements in their promoter sequences. The expression levels of the 17 HD-Zip I genes under drought stress were also investigated by quantitative real-time PCR (qRT-PCR). All 17 maize HD-Zip I genes were found to be regulated by drought stress, and the duplicated genes within a sister pair exhibited similar expression patterns, suggesting that their functions have been conserved during evolution. Conclusions: Our results reveal a comprehensive overview of the maize HD-Zip gene family and provide the first step towards the selection of Zmhdz genes for cloning and functional research to uncover their roles in maize growth and development. PMID:22164299
Structural Divergence in Vertebrate Phylogeny of a Duplicated Prototype Galectin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhat, R.; Chakraborty, M.; Mian, I. S.; ...
2014-09-25
Prototype galectins, endogenously expressed animal lectins with a single carbohydrate recognition domain, are well-known regulators of tissue properties such as growth and adhesion. The earliest discovered and best studied of the prototype galectins is Galectin-1 (Gal-1). In the Gallus gallus (chicken) genome, Gal-1 is represented by two homologs: Gal-1A and Gal-1B, with distinct biochemical properties, tissue expression, and developmental functions. We investigated the origin of the Gal-1A/Gal-1B divergence to gain insight into when their developmental functions originated and how they could have contributed to vertebrate phenotypic evolution. Sequence alignment and phylogenetic tree construction showed that the Gal-1A/Gal-1B divergence can be traced back to the origin of the sauropsid lineage (consisting of extinct and extant reptiles and birds) although lineage-specific duplications also occurred in the amphibian and actinopterygian genomes. Gene synteny analysis showed that sauropsid gal-1b (the gene for Gal-1B) and its frog and actinopterygian gal-1 homologs share a similar chromosomal location, whereas sauropsid gal-1a has translocated to a new position. Surprisingly, we found that chicken Gal-1A, encoded by the translocated gal-1a, was more similar in its tertiary folding pattern than Gal-1B, encoded by the untranslocated gal-1b, to experimentally determined and predicted folds of nonsauropsid Gal-1s. This inference is consistent with our finding of a lower proportion of conserved residues in sauropsid Gal-1Bs, and evidence for positive selection of sauropsid gal-1b, but not gal-1a genes. We propose that the duplication and structural divergence of Gal-1B away from Gal-1A led to specialization in both expression and function in the sauropsid lineage.
Protein synthesis in sperm: dialog between mitochondria and cytoplasm.
Gur, Yael; Breitbart, Haim
2008-01-30
Ejaculated sperm are capable of using mRNA transcripts for protein translation during the final maturation steps before fertilization. In a capacitation-dependent process, nuclear-encoded mRNAs are translated by mitochondrial-type ribosomes while the cytoplasmic translation machinery is not involved. Our findings suggest that new proteins are synthesized to replace proteins degraded while sperm swim and wait in the female reproductive tract before fertilization, or are produced to meet the specific needs of the capacitating spermatozoa. In addition, a growing number of articles have reported evidence for the correlation of nuclear-encoded mRNA and protein synthesis in somatic mitochondria. It is known that all of the proteins necessary for the replication, transcription and translation of the genes encoded in mtDNA are now encoded in the nuclear genome. This genetic investment is far out of proportion to the number of proteins involved, as there have been multiple movements and duplications of genes. However, the evolutionary retention (or secondary uptake) of the mitochondrial machinery for translation of nuclear-encoded mRNAs may shed light on this paradox.
Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles
2016-08-01
Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fap nc . The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.
Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P
1994-01-01
We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046
Korablev, Alexei N; Serova, Irina A; Serov, Oleg L
2017-12-28
Copy Number Variation (CNV) of the human CNTN6 gene (encoding the contactin-6 protein), caused by deletions or duplications, is responsible for severe neurodevelopmental impairments, often in combination with facial dysmorphias. Conversely, deleterious point mutations of this gene do not show any clinical phenotypes. The aim of this study is to generate mice carrying large deletions, duplications and inversions involving the Cntn6 gene as a new experimental model to study CNV of the human CNTN6 locus. To generate large chromosomal rearrangements on mouse chromosome 6, we applied CRISPR/Cas9 technology in zygotes. Two guide RNAs (gRNAs) (flanking a DNA fragment of 1137 kb) together with Cas9 mRNA and single-stranded DNA oligonucleotides (ssODN) were microinjected into the cytoplasm of 599 zygotes of F1 (C57BL x CBA) mice, and 256 of them were transplanted into oviducts of CD-1 females. As a result, we observed the birth of 41 viable F0 offspring. Genotyping of these mice was performed by PCR analysis and sequencing of PCR products. Among the 41 F0 offspring, we identified seven mice with deletions, two animals carrying duplications of the gene and four carrying inversions. Interestingly, two F0 offspring had both deletions and duplications. It is important to note that while three of seven deletion carriers showed expected sequences at the new joint sites, in another three, we identified an absence of 1-10 nucleotides at the CRISPR/Cas9 cut sites, and in one animal, 103 bp were missing, presumably due to error-prone non-homologous end joining. In addition, we detected the absence of 5 and 13 nucleotides at these sites in two F0 duplication carriers. Similar sequence changes at CRISPR/Cas9 cut sites were observed at the right and left boundaries of inversions. Thus, megabase-scale deletions, duplications and inversions were identified in 11 F0 offspring among 41 analyzed, i.e., approximately 25% efficiency.
All genetically modified F0 offspring were viable and able to transmit these large chromosomal rearrangements to the next generation. Using CRISPR/Cas9 technology, we created mice carrying megabase-scale deletions, duplications, and inversions involving the full-sized Cntn6 gene. These mice became founders of new mouse lines, which may be more appropriate experimental models of CNV in the human 3p26.3 region than Cntn6 knockout mice.
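The efficiency figure quoted in this abstract can be checked with a short calculation. The following is a minimal sketch in Python, assuming that the two F0 animals reported to carry both a deletion and a duplication are included in both the deletion and the duplication counts; the variable names are illustrative and do not come from the study.

```python
# Counts reported in the abstract (F0 offspring analyzed: 41).
total_f0 = 41
deletion_carriers = 7
duplication_carriers = 2
inversion_carriers = 4

# Assumption: the two animals carrying both a deletion and a duplication
# appear in both counts above, so they are subtracted once.
both_deletion_and_duplication = 2

rearranged = (deletion_carriers + duplication_carriers
              + inversion_carriers - both_deletion_and_duplication)
efficiency = rearranged / total_f0

print(f"{rearranged} of {total_f0} F0 offspring ({efficiency:.1%})")
```

Under that assumption the count matches the 11 animals stated in the abstract, and the ratio is the value the abstract rounds to approximately 25%.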
2010-01-01
Background: Clock family genes encode transcription factors that regulate clock-controlled genes and thus regulate many physiological mechanisms/processes in a circadian fashion. Clock1 duplicates and copies of Clock3 and NPAS2-like genes were partially characterized (genomic sequencing) and mapped using family-based indels/SNPs in rainbow trout (RT) (Oncorhynchus mykiss), Arctic charr (AC) (Salvelinus alpinus), and Atlantic salmon (AS) (Salmo salar) mapping panels. Results: Clock1 duplicates mapped to linkage groups RT-8/-24, AC-16/-13 and AS-2/-18. Clock3/NPAS2-like genes mapped to RT-9/-20, AC-20/-43, and AS-5. Most of these linkage group regions containing the Clock gene duplicates were derived from the most recent 4R whole genome duplication event specific to the salmonids. These linkage groups contain quantitative trait loci (QTL) for life history and growth traits (i.e., reproduction and cell cycling). Comparative synteny analyses with other model teleost species reveal a high degree of conservation for genes in these chromosomal regions suggesting that functionally related or co-regulated genes are clustered in syntenic blocks. For example, anti-müllerian hormone (amh), regulating sexual maturation, and ornithine decarboxylase antizymes (oaz1 and oaz2), regulating cell cycling, are contained within these syntenic blocks. Conclusions: Synteny analyses indicate that regions homologous to major life-history QTL regions in salmonids contain many candidate genes that are likely to influence reproduction and cell cycling. The order of these genes is highly conserved across the vertebrate species examined, and as such, these genes may make up a functional cluster of genes that are likely co-regulated. CLOCK, as a transcription factor, is found within this block and therefore has the potential to cis-regulate the processes influenced by these genes.
Additionally, clock-controlled genes (CCGs) are located in other life-history QTL regions within salmonids suggesting that at least in part, trans-regulation of these QTL regions may also occur via Clock expression. PMID:20670436
Pandey, Ashutosh; Misra, Prashant; Alok, Anshu; Kaur, Navneet; Sharma, Shivani; Lakhwani, Deepika; Asif, Mehar H.; Tiwari, Siddharth; Trivedi, Prabodh K.
2016-01-01
The homeodomain zipper family (HD-ZIP) of transcription factors is present only in plants and plays an important role in the regulation of plant-specific processes. The subfamily IV of HDZ transcription factors (HD-ZIP IV) has primarily been implicated in the regulation of epidermal structure development. Though this gene family is present in all lineages of land plants, members of this gene family have not been identified in banana, which is one of the major staple fruit crops. In the present work, we identified 21 HDZIV-encoding genes in banana by computational analysis of the banana genome resource. Our analysis suggested that these genes putatively encode proteins having all the characteristic domains of HDZIV transcription factors. The phylogenetic analysis of the banana HDZIV family genes further confirmed that, after separation from a common ancestor, the banana and Poales lineages might have followed distinct evolutionary paths. Further, we conclude that segmental duplication played a major role in the evolution of banana HDZIV-encoding genes. All the identified banana HDZIV genes are expressed in different banana tissues, albeit at varying levels. Transcripts of some of the banana HDZIV genes were also detected in banana fruit pulp, suggesting a putative role in fruit attributes. A large number of genes of this family showed modulated expression under drought and salinity stress. Taken together, the present work lays a foundation for elucidation of functional aspects of the banana HDZIV-encoding genes and for their possible use in banana improvement programs. PMID:26870050
Evolution of Prdm Genes in Animals: Insights from Comparative Genomics
Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre
2016-01-01
Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes. PMID:26560352
Akhter, Yusuf; Ehebauer, Matthias T; Mukhopadhyay, Sangita; Hasnain, Seyed E
2012-01-01
The PE/PPE multigene family codes for approximately 10% of the Mycobacterium tuberculosis proteome and is encoded by 176 open reading frames. These proteins possess, and have been named after, the conserved proline-glutamate (PE) or proline-proline-glutamate (PPE) motifs at their N-terminus. Their genes have a conserved structure and repeat motifs that could be a potential source of antigenic variation in M. tuberculosis. PE/PPE genes are scattered throughout the genome and PE/PPE pairs are usually encoded in bicistronic operons although this is not universally so. This gene family has evolved by specific gene duplication events. PE/PPE proteins are either secreted or localized to the cell surface. Several are thought to be virulence factors, which participate in evasion of the host immune response. This review summarizes the current knowledge about the gene family in order to better understand its biological function. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
Developmentally distinct MYB genes encode functionally equivalent proteins in Arabidopsis.
Lee, M M; Schiefelbein, J
2001-05-01
The duplication and divergence of developmental control genes is thought to have driven morphological diversification during the evolution of multicellular organisms. To examine the molecular basis of this process, we analyzed the functional relationship between two paralogous MYB transcription factor genes, WEREWOLF (WER) and GLABROUS1 (GL1), in Arabidopsis. The WER and GL1 genes specify distinct cell types and exhibit non-overlapping expression patterns during Arabidopsis development. Nevertheless, reciprocal complementation experiments with a series of gene fusions showed that WER and GL1 encode functionally equivalent proteins, and their unique roles in plant development are entirely due to differences in their cis-regulatory sequences. Similar experiments with a distantly related MYB gene (MYB2) showed that its product cannot functionally substitute for WER or GL1. Furthermore, an analysis of the WER and GL1 proteins shows that conserved sequences correspond to specific functional domains. These results provide new insights into the evolution of the MYB gene family in Arabidopsis, and, more generally, they demonstrate that novel developmental gene function may arise solely by the modification of cis-regulatory sequences.
Evolution and Variation of Renin Genes in Mice
Dickinson, Douglas P.; Gross, Kenneth W.; Piccini, Nina; Wilson, Carol M.
1984-01-01
Inbred strains of mice carry Ren-1, a gene encoding the thermostable Renin-1 isozyme. Ren-1 is expressed at relatively low levels in mouse submandibular gland and kidney. Some strains also carry Ren-2, a gene encoding the thermolabile Renin-2 isozyme. Ren-2 is expressed at high levels in the mouse submandibular gland and at very low levels, if at all, in the kidney. Ren-1 and Ren-2 are closely linked on mouse chromosome 1, show extensive homology in coding and noncoding regions and provide a model for studying the regulation of gene expression. An investigation of renin genes and enzymatic activity in wild-derived mice identified several restriction site polymorphisms as well as putative variants in renin gene expression and protein structure. The number of renin genes carried by different subpopulations of wild-derived mice is consistent with the occurrence of a gene duplication event prior to the divergence of M. spretus (2.75–5.5 million yr ago). This conclusion is in agreement with a prior estimate based upon comparative sequence analysis of Ren-1 and Ren-2 from inbred laboratory mice. PMID:6389258
Jiang, Shu-Ye; Ramachandran, Srinivasan
2016-01-01
DNA glycosylases catalyze the release of methylated bases. They play vital roles in the base excision repair pathway and might also function in DNA demethylation. At least three families of DNA glycosylases have been identified, which include 3′-methyladenine DNA glycosylase (MDG) I, MDG II, and HhH-GPD (Helix–hairpin–Helix and Glycine/Proline/aspartate (D)). However, little is known about their genome-wide identification, expansion, and evolutionary history, or about their expression profiles and biological functions. In this study, we identified these family members genome-wide and characterized their evolution. In most organisms, a genome encodes only one MDG II gene. No MDG I or MDG II gene was detected in green algae. However, HhH-GPD genes were detectable in all available organisms. Ancestral species contain small MDG I and HhH-GPD families. These two families were mainly expanded through whole-genome duplication and segmental duplication. They were evolutionarily conserved and were generally under purifying selection. However, we detected recent positive selection within the Oryza genus, which might play a role in species divergence. Further investigation showed that expression divergence played an important role in gene survival after expansion. Genes from all of these families were expressed in most developmental stages and tissues in rice plants. A high proportion of family genes were downregulated by drought and fungal pathogen infection as well as by abscisic acid (ABA) and jasmonic acid (JA) treatments, suggesting negative regulation in response to drought stress and pathogen infection through ABA- and/or JA-dependent hormone signaling pathways. PMID:27026054
Jiang, Shu-Ye; Sevugan, Mayalagu; Ramachandran, Srinivasan
2018-05-09
Valine-glutamine (VQ) motif-containing proteins play important roles in abiotic and biotic stress responses in plants. However, little is known about the origin and evolution of the VQ gene family or about how its expression is regulated. In this study, we systematically surveyed this gene family in 50 plant genomes from algae, moss, gymnosperm and angiosperm and explored their presence in other species from animals, bacteria, fungi and viruses. No VQs were detected in any of the tested algal genomes, whereas all genomes from moss, gymnosperm and angiosperm encode varying numbers of VQs. Interestingly, some fungi, lower animals and bacteria also encode one to a few VQs. Thus, they are not plant-specific and should be regarded as an ancient family. Their family expansion was mainly due to segmental duplication, followed by tandem duplication and mobile elements. Gene conversion made only a limited contribution to the evolution of the family. Generally, VQs were highly conserved in their motif-coding region and were under purifying selection. However, positive selection was also observed during species divergence. Many VQs were up- or down-regulated by various abiotic/biotic stresses and phytohormones in rice and Arabidopsis. They were also co-expressed with some other stress-related genes. All of the expression data suggest a comprehensive expression regulation of the VQ gene family. We provide new insights into the expansion, divergence, evolution and expression regulation of this VQ family. VQs were detectable not only in plants but also in some fungi, lower animals and bacteria, suggesting evolutionary conservation and an ancient origin. Overall, VQs are not plant-specific and play roles in abiotic/biotic responses or other biological processes through comprehensive expression regulation.
The major resistance gene cluster in lettuce is highly duplicated and spans several megabases.
Meyers, B C; Chin, D B; Shen, K A; Sivaramakrishnan, S; Lavelle, D O; Zhang, Z; Michelmore, R W
1998-01-01
At least 10 Dm genes conferring resistance to the oomycete downy mildew fungus Bremia lactucae map to the major resistance cluster in lettuce. We investigated the structure of this cluster in the lettuce cultivar Diana, which contains Dm3. A deletion breakpoint map of the chromosomal region flanking Dm3 was saturated with a variety of molecular markers. Several of these markers are components of a family of resistance gene candidates (RGC2) that encode a nucleotide binding site and a leucine-rich repeat region. These motifs are characteristic of plant disease resistance genes. Bacterial artificial chromosome clones were identified by using duplicated restriction fragment length polymorphism markers from the region, including the nucleotide binding site-encoding region of RGC2. Twenty-two distinct members of the RGC2 family were characterized from the bacterial artificial chromosomes; at least two additional family members exist. The RGC2 family is highly divergent; the nucleotide identity was as low as 53% between the most distantly related copies. These RGC2 genes span at least 3.5 Mb. Eighteen members were mapped on the deletion breakpoint map. A comparison between the phylogenetic and physical relationships of these sequences demonstrated that closely related copies are physically separated from one another and indicated that complex rearrangements have shaped this region. Analysis of low-copy genomic sequences detected no genes, including RGC2, in the Dm3 region, other than sequences related to retrotransposons and transposable elements. The related but divergent family of RGC2 genes may act as a resource for the generation of new resistance phenotypes through infrequent recombination or unequal crossing over. PMID:9811791
The early stages of duplicate gene evolution
Moore, Richard C.; Purugganan, Michael D.
2003-01-01
Gene duplications are one of the primary driving forces in the evolution of genomes and genetic systems. Gene duplicates account for 8–20% of the genes in eukaryotic genomes, and the rates of gene duplication are estimated at between 0.2% and 2% per gene per million years. Duplicate genes are believed to be a major mechanism for the establishment of new gene functions and the generation of evolutionary novelty, yet very little is known about the early stages of the evolution of duplicated gene pairs. It is unclear, for example, to what extent selection, rather than neutral genetic drift, drives the fixation and early evolution of duplicate loci. Analysis of recently duplicated genes in the Arabidopsis thaliana genome reveals significantly reduced species-wide levels of nucleotide polymorphisms in the progenitor and/or duplicate gene copies, suggesting that selective sweeps accompany the initial stages of the evolution of these duplicated gene pairs. Our results support recent theoretical work that indicates that fates of duplicate gene pairs may be determined in the initial phases of duplicate gene evolution and that positive selection plays a prominent role in the evolutionary dynamics of the very early histories of duplicate nuclear genes. PMID:14671323
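As a rough illustration of what the quoted duplication rates imply, the short sketch below converts a per-gene rate into an expected number of new duplicates per million years. The genome size of 25,000 genes is a hypothetical round number chosen only for the arithmetic, and the calculation ignores subsequent loss of duplicates; it is not part of the study.

```python
def expected_duplications(n_genes: int, rate_per_gene_per_myr: float, myr: float = 1.0) -> float:
    """Expected number of new gene duplicates arising in `myr` million years.

    Simple product of genome size, per-gene rate and elapsed time;
    loss or silencing of duplicates is deliberately not modelled.
    """
    return n_genes * rate_per_gene_per_myr * myr


# Hypothetical genome size, used for illustration only.
N_GENES = 25_000

# Rate bounds quoted in the abstract: 0.2% and 2% per gene per million years.
LOW_RATE, HIGH_RATE = 0.002, 0.02

low = expected_duplications(N_GENES, LOW_RATE)
high = expected_duplications(N_GENES, HIGH_RATE)
print(f"Roughly {low:.0f} to {high:.0f} new duplicates per million years")
```

Under these assumptions the range spans an order of magnitude, which simply mirrors the tenfold spread in the quoted rate estimates.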
Evolution of Gene Duplication in Plants
2016-01-01
Ancient duplication events and a high rate of retention of extant pairs of duplicate genes have contributed to an abundance of duplicate genes in plant genomes. These duplicates have contributed to the evolution of novel functions, such as the production of floral structures, induction of disease resistance, and adaptation to stress. Additionally, recent whole-genome duplications that have occurred in the lineages of several domesticated crop species, including wheat (Triticum aestivum), cotton (Gossypium hirsutum), and soybean (Glycine max), have contributed to important agronomic traits, such as grain quality, fruit shape, and flowering time. Therefore, understanding the mechanisms and impacts of gene duplication will be important to future studies of plants in general and of agronomically important crops in particular. In this review, we survey the current knowledge about gene duplication, including gene duplication mechanisms, the potential fates of duplicate genes, models explaining duplicate gene retention, the properties that distinguish duplicate from singleton genes, and the evolutionary impact of gene duplication. PMID:27288366
Evolution of Gene Duplication in Plants.
Panchy, Nicholas; Lehti-Shiu, Melissa; Shiu, Shin-Han
2016-08-01
Ancient duplication events and a high rate of retention of extant pairs of duplicate genes have contributed to an abundance of duplicate genes in plant genomes. These duplicates have contributed to the evolution of novel functions, such as the production of floral structures, induction of disease resistance, and adaptation to stress. Additionally, recent whole-genome duplications that have occurred in the lineages of several domesticated crop species, including wheat (Triticum aestivum), cotton (Gossypium hirsutum), and soybean (Glycine max), have contributed to important agronomic traits, such as grain quality, fruit shape, and flowering time. Therefore, understanding the mechanisms and impacts of gene duplication will be important to future studies of plants in general and of agronomically important crops in particular. In this review, we survey the current knowledge about gene duplication, including gene duplication mechanisms, the potential fates of duplicate genes, models explaining duplicate gene retention, the properties that distinguish duplicate from singleton genes, and the evolutionary impact of gene duplication. © 2016 American Society of Plant Biologists. All Rights Reserved.
Schwartze, Volker U; Winter, Sascha; Shelest, Ekaterina; Marcet-Houben, Marina; Horn, Fabian; Wehner, Stefanie; Linde, Jörg; Valiante, Vito; Sammeth, Michael; Riege, Konstantin; Nowrousian, Minou; Kaerger, Kerstin; Jacobsen, Ilse D; Marz, Manja; Brakhage, Axel A; Gabaldón, Toni; Böcker, Sebastian; Voigt, Kerstin
2014-08-01
Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 Mb and 12,379 protein-coding genes. This study reveals four major differences between the L. corymbifera genome and that of R. oryzae: (i) a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD); (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress; (iii) the content of repetitive elements is strikingly low (<5%); (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1, encoding a putative siderophore transporter, and LCor00410.1, involved in siderophore metabolism. Genes encoding the transcription factors LCor08192.1 and LCor01236.1, which are similar to GATA-type regulators and to calcineurin-regulated CRZ1, respectively, indicate an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1-4 copies usually found in other fungi. Further findings are: (i) a lower content of tRNAs, but unique codons in L. corymbifera; (ii) over 25% of the proteins are apparently specific to L. corymbifera; (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. The number of secreted proteases, however, is roughly twice as high as in R. oryzae.
Wehner, Stefanie; Linde, Jörg; Valiante, Vito; Sammeth, Michael; Riege, Konstantin; Nowrousian, Minou; Kaerger, Kerstin; Jacobsen, Ilse D.; Marz, Manja; Brakhage, Axel A.; Gabaldón, Toni; Böcker, Sebastian; Voigt, Kerstin
2014-01-01
Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 Mb and 12,379 protein-coding genes. This study reveals four major differences between the L. corymbifera genome and that of R. oryzae: (i) a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD); (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress; (iii) the content of repetitive elements is strikingly low (<5%); (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1, encoding a putative siderophore transporter, and LCor00410.1, involved in siderophore metabolism. Genes encoding the transcription factors LCor08192.1 and LCor01236.1, which are similar to GATA-type regulators and to calcineurin-regulated CRZ1, respectively, indicate an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1–4 copies usually found in other fungi. Further findings are: (i) a lower content of tRNAs, but unique codons in L. corymbifera; (ii) over 25% of the proteins are apparently specific to L. corymbifera; (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. The number of secreted proteases, however, is roughly twice as high as in R. oryzae. PMID:25121733
Sehrish, Tina; Symonds, V. Vaughan; Soltis, Douglas E.; Soltis, Pamela S.; Tate, Jennifer A.
2015-01-01
Allopolyploids, formed by hybridization and chromosome doubling, face the immediate challenge of having duplicated nuclear genomes that interact with the haploid and maternally inherited cytoplasmic (plastid and mitochondrial) genomes. Most of our knowledge of the genomic consequences of allopolyploidy has focused on the fate of the duplicated nuclear genes without regard to their potential interactions with cytoplasmic genomes. As a step toward understanding the fates of nuclear-encoded subunits that are plastid-targeted, here we examine the retention and expression of the gene encoding the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco; rbcS) in multiple populations of allotetraploid Tragopogon miscellus (Asteraceae). These polyploids formed recently (~80 years ago) and repeatedly from T. dubius and T. pratensis in the northwestern United States. Examination of 79 T. miscellus individuals from 10 natural populations, as well as 25 synthetic allotetraploids, including reciprocally formed plants, revealed a low percentage of naturally occurring individuals that show a bias in either gene (homeolog) loss (12%) or expression (16%), usually toward maintaining the maternal nuclear copy of rbcS. For individuals showing loss, seven retained the maternally derived rbcS homeolog only, while three had the paternally derived copy. All of the synthetic polyploid individuals examined (S0 and S1 generations) retained and expressed both parental homeologs. These results demonstrate that cytonuclear coordination does not happen immediately upon polyploid formation in Tragopogon miscellus. PMID:26646761
Campbell, Elsie L; Hagen, Kari D; Chen, Rui; Risser, Douglas D; Ferreira, Daniela P; Meeks, John C
2015-02-15
In cyanobacterial Nostoc species, substratum-dependent gliding motility is confined to specialized nongrowing filaments called hormogonia, which differentiate from vegetative filaments as part of a conditional life cycle and function as dispersal units. Here we confirm that Nostoc punctiforme hormogonia are positively phototactic to white light over a wide range of intensities. N. punctiforme contains two gene clusters (clusters 2 and 2i), each of which encodes modular cyanobacteriochrome-methyl-accepting chemotaxis proteins (MCPs) and other proteins that putatively constitute a basic chemotaxis-like signal transduction complex. Transcriptional analysis established that all genes in clusters 2 and 2i, plus two additional clusters (clusters 1 and 3) with genes encoding MCPs lacking cyanobacteriochrome sensory domains, are upregulated during the differentiation of hormogonia. Mutational analysis determined that only genes in cluster 2i are essential for positive phototaxis in N. punctiforme hormogonia; here these genes are designated ptx (for phototaxis) genes. The cluster is unusual in containing complete or partial duplicates of genes encoding proteins homologous to the well-described chemotaxis elements CheY, CheW, MCP, and CheA. The cyanobacteriochrome-MCP gene (ptxD) lacks transmembrane domains and has 7 potential binding sites for bilins. The transcriptional start site of the ptx genes does not resemble a sigma 70 consensus recognition sequence; moreover, it is upstream of two genes encoding gas vesicle proteins (gvpA and gvpC), which also are expressed only in the hormogonium filaments of N. punctiforme. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Allison, Andrew B; Mead, Daniel G; Palacios, Gustavo F; Tesh, Robert B; Holmes, Edward C
2014-01-05
Flanders virus (FLAV) and Hart Park virus (HPV) are rhabdoviruses that circulate in mosquito-bird cycles in the eastern and western United States, respectively, and constitute the only two North American representatives of the Hart Park serogroup. Previously, it was suggested that FLAV is unique among the rhabdoviruses in that it contains two pseudogenes located between the P and M genes, while the cognate sequence for HPV has been lacking. Herein, we demonstrate that FLAV and HPV do not contain pseudogenes in this region, but encode three small functional proteins designated as U1-U3 that apparently arose by gene duplication. To further investigate the U1-U3 region, we conducted the first large-scale evolutionary analysis of a member of the Hart Park serogroup by analyzing over 100 spatially and temporally distinct FLAV isolates. Our phylogeographic analysis demonstrates that although FLAV appears to be slowly evolving, phylogenetically divergent lineages co-circulate sympatrically. © 2013 Published by Elsevier Inc.
Evolutionary history of the ABCB2 genomic region in teleosts
Palti, Y.; Rodriguez, M.F.; Gahr, S.A.; Hansen, J.D.
2007-01-01
Gene duplication, silencing and translocation have all been implicated in shaping the unique genomic architecture of the teleost MH regions. Previously, we demonstrated that trout possess five unlinked regions encoding MH genes. One of these regions harbors ABCB2 which in all other vertebrate classes is found in the MHC class II region. In this study, we sequenced a BAC contig for the trout ABCB2 region. Analysis of this region revealed the presence of genes homologous to those located in the human class II (ABCB2, BRD2, ψDAA), extended class II (RGL2, PHF1, SYGP1) and class III (PBX2, Notch-L) regions. The organization and syntenic relationships of this region were then compared to similar regions in humans, Tetraodon and zebrafish to learn more about the evolutionary history of this region. Our analysis indicates that this region was generated during the teleost-specific duplication event while also providing insight about potential MH paralogous regions in teleosts. © 2006 Elsevier Ltd. All rights reserved.
Lu, Jianguo; Peatman, Eric; Tang, Haibao; Lewis, Joshua; Liu, Zhanjiang
2012-06-15
Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes. Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish. We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication. 
This is particularly true for the zebrafish genome. Further analysis of the duplicated gene sets indicated that a significant portion of duplicated genes in the zebrafish genome were of recent, lineage-specific duplication events. Most strikingly, a subset of duplicated genes is enriched among the recently duplicated genes involved in immune or sensory response pathways. Such findings demonstrated the significance of continuous gene duplication as well as that of whole genome duplication in the course of genome evolution.
Structural and transcriptional characterization of a novel member of the soybean urease gene family.
Wiebke-Strohm, Beatriz; Ligabue-Braun, Rodrigo; Rechenmacher, Ciliana; De Oliveira-Busatto, Luisa Abruzzi; Carlini, Célia Regina; Bodanese-Zanettini, Maria Helena
2016-04-01
In plants, ureases have been related to urea degradation, to defense against pathogenic fungi and phytophagous insects, and to the soybean-Bradyrhizobium japonicum symbiosis. Two urease isoforms have been described for soybean: the embryo-specific urease, encoded by the Eu1 gene, and the ubiquitous urease, encoded by Eu4. A third urease-encoding locus exists in the completed soybean genome. The gene was designated Eu5 and the putative product of its ORF SBU-III. Phylogenetic analysis shows that 41 plant, moss and algal ureases have diverged from a common ancestor protein, but ureases from monocots, eudicots and ancient species have evolved independently. Genomes of ancient organisms contain a single urease-encoding gene, and duplication of urease-encoding genes has occurred independently during the evolution of some eudicot species. SBU-III has a shorter amino acid sequence, with many gaps when aligned with other sequences. A mutation in a highly conserved amino acid residue suggests absence of ureolytic activity, but the overall protein architecture remains very similar to that of the other ureases. The expression profile of urease-encoding genes in different organs and developmental stages was determined by RT-qPCR. Eu5 transcripts were detected in seeds one day after dormancy break, roots of young plants and embryos of developing seeds. Eu1 and Eu4 transcripts were found in all analyzed organs, but Eu4 expression was more prominent in seeds one day after dormancy break whereas Eu1 predominated in developing seeds. The evidence suggests that SBU-III may not be involved in nitrogen availability to plants, but it could have other biological role(s). Copyright © 2016 Elsevier Masson SAS. All rights reserved.
2014-01-01
Background: The Maternally expressed gene (Meg) family is a locally duplicated gene family of maize which encodes cysteine-rich proteins (CRPs). The founding member of the family, Meg1, is required for normal development of the basal endosperm transfer cell layer (BETL) and is involved in the allocation of maternal nutrients to growing seeds. Despite the important roles of Meg1 in maize seed development, the evolutionary history of the Meg cluster and the activities of the duplicate genes are not understood. Results: In maize, the Meg gene cluster resides in a 2.3 Mb-long genomic region that exhibits many features of non-centromeric heterochromatin. Using phylogenetic reconstruction and syntenic alignments, we identified the pedigree of the Meg family, in which 11 of its 13 members arose in maize after allotetraploidization ~4.8 mya. Phylogenetic and population-genetic analyses identified possible signatures suggesting recent positive selection in Meg homologs. Structural analyses of the Meg proteins indicated potentially adaptive changes in secondary structure from α-helix to β-strand during the expansion. Transcriptomic analysis of the maize endosperm indicated that 6 Meg genes are selectively activated in the BETL, and younger Meg genes are more active than older ones. In endosperms from B73 by Mo17 reciprocal crosses, most Meg genes did not display parent-specific expression patterns. Conclusions: Recently duplicated Meg genes have different protein secondary structures, and their expression in the BETL dominates over that of older members. Together with the signs of positive selection in the young Meg genes, these results suggest that the expansion of the Meg family involves potentially adaptive transitions in which new members with novel functions prevailed over older members. PMID:25084677
Unique Temporal Expression of Triplicated Long-Wavelength Opsins in Developing Butterfly Eyes
Arikawa, Kentaro; Iwanaga, Tomoyuki; Wakakuwa, Motohiro; Kinoshita, Michiyo
2017-01-01
Following gene duplication events, the expression patterns of the resulting gene copies can often diverge both spatially and temporally. Here we report on gene duplicates that are expressed in distinct but overlapping patterns, and which exhibit temporally divergent expression. Butterflies have sophisticated color vision and spectrally complex eyes, typically with three types of heterogeneous ommatidia. The eyes of the butterfly Papilio xuthus express two green- and one red-absorbing visual pigment, which came about via gene duplication events, in addition to one ultraviolet (UV)- and one blue-absorbing visual pigment. We localized mRNAs encoding opsins of these visual pigments in developing eye disks throughout the pupal stage. The mRNAs of the UV and blue opsin are expressed early in pupal development (pd), specifying the type of the ommatidium in which they appear. Red sensitive photoreceptors first express a green opsin mRNA, which is replaced later by the red opsin mRNA. Broadband photoreceptors (that coexpress the green and red opsins) first express the green opsin mRNA, later change to red opsin mRNA and finally re-express the green opsin mRNA in addition to the red mRNA. Such a unique temporal and spatial expression pattern of opsin mRNAs may reflect the evolution of visual pigments and provide clues toward understanding how the spectrally complex eyes of butterflies evolved. PMID:29238294
Structure and vascular tissue expression of duplicated TERMINAL EAR1-like paralogues in poplar.
Charon, Céline; Vivancos, Julien; Mazubert, Christelle; Paquet, Nicolas; Pilate, Gilles; Dron, Michel
2010-02-01
TERMINAL EAR1-like (TEL) genes encode putative RNA-binding proteins only found in land plants. Previous studies suggested that they may regulate tissue and organ initiation in Poaceae. Two TEL genes were identified in both Populus trichocarpa and the hybrid aspen Populus tremula x P. alba, named, respectively, PoptrTEL1-2 and PtaTEL1-2. The analysis of the organisation around the PoptrTEL genes in the P. trichocarpa genome and the estimation of the synonymous substitution rate for PtaTEL1-2 genes indicate that the paralogous link between these two Populus TEL genes probably results from the Salicoid large-scale gene-duplication event. Phylogenetic analyses confirmed their orthology link with the other TEL genes. The expression pattern of both PtaTEL genes appeared to be restricted to the mother cells of the plant body: leaf founder cells, leaf primordia, axillary buds and root differentiating tissues, as well as to mother cells of vascular tissues. Most interestingly, PtaTEL1-2 transcripts were found in differentiating cells of secondary xylem and phloem, but probably not in the cambium itself. Taken together, these results indicate specific expression of the TEL genes in differentiating cells controlling tissue and organ development in Populus (and other Angiosperm species).
Expression analysis of genes encoding double B-box zinc finger proteins in maize.
Li, Wenlan; Wang, Jingchao; Sun, Qi; Li, Wencai; Yu, Yanli; Zhao, Meng; Meng, Zhaodong
2017-11-01
The B-box proteins play key roles in plant development. The double B-box (DBB) family is one of the subfamilies of the B-box family, with two B-box domains and without a CCT domain. In this study, 12 maize double B-box genes (ZmDBBs) were identified through a genome-wide survey. Phylogenetic analysis of DBB proteins from maize, rice, Sorghum bicolor, Arabidopsis, and poplar classified them into five major clades. Gene duplication analysis indicated that segmental duplications made a large contribution to the expansion of ZmDBBs. Furthermore, a large number of cis-acting regulatory elements related to plant development and to light and phytohormone responses were identified in the promoter regions of the ZmDBB genes. The expression patterns of the ZmDBB genes in various tissues and at different developmental stages indicated that ZmDBBs might play essential roles in plant development, and that some ZmDBB genes might have unique functions in specific developmental stages. In addition, several ZmDBB genes showed diurnal expression patterns. The expression levels of some ZmDBB genes changed significantly under light/dark treatment conditions and phytohormone treatments, implying that they might participate in light and hormone signaling pathways. Our results provide new information to better understand the complexity of the DBB gene family in maize.
Książkiewicz, Michał; Rychel, Sandra; Nelson, Matthew N; Wyrwa, Katarzyna; Naganowska, Barbara; Wolko, Bogdan
2016-10-21
The Arabidopsis FLOWERING LOCUS T (FT) gene, a member of the phosphatidylethanolamine binding protein (PEBP) family, is a major controller of flowering in response to photoperiod, vernalization and light quality. In legumes, FT evolved into three, functionally diversified clades, FTa, FTb and FTc. A milestone achievement in narrow-leafed lupin (Lupinus angustifolius L.) domestication was the loss of vernalization responsiveness at the Ku locus. Recently, one of two existing L. angustifolius homologs of FTc, LanFTc1, was revealed to be the gene underlying Ku. It is the first recorded involvement of an FTc homologue in vernalization. The evolutionary basis of this phenomenon in lupin has not yet been deciphered. Bacterial artificial chromosome (BAC) clones carrying LanFTc1 and LanFTc2 genes were localized in different mitotic chromosomes and constituted sequence-specific landmarks for linkage groups NLL-10 and NLL-17. BAC-derived superscaffolds containing LanFTc genes revealed clear microsyntenic patterns to genome sequences of nine legume species. Superscaffold-1 carrying LanFTc1 aligned to regions encoding one or more FT-like genes whereas superscaffold-2 mapped to a region lacking such a homolog. Comparative mapping of the L. angustifolius genome assembly anchored to linkage map localized superscaffold-1 in the middle of a 15 cM conserved, collinear region. In contrast, superscaffold-2 was found at the edge of a 20 cM syntenic block containing highly disrupted collinearity at the LanFTc2 locus. 118 PEBP-family full-length homologs were identified in 10 legume genomes. Bayesian phylogenetic inference provided novel evidence supporting the hypothesis that whole-genome and tandem duplications contributed to expansion of PEBP-family genes in legumes. Duplicated genes were subjected to strong purifying selection. 
Promoter analysis of FT genes revealed no statistically significant sequence similarity between duplicated copies; only RE-alpha and CCAAT-box motifs were found at conserved positions and orientations. Numerous lineage-specific duplications occurred during the evolution of legume PEBP-family genes. Whole-genome duplications resulted in the origin of subclades FTa, FTb and FTc and in the multiplication of FTa and FTb copy number. LanFTc1 is located in the region conserved among all main lineages of Papilionoideae. LanFTc1 is a direct descendant of ancestral FTc, whereas LanFTc2 appeared by subsequent duplication.
Le, Dung Tien; Nguyen, Kim-Lien; Chu, Ha Duc; Vu, Nam Tuan; Pham, Thu Thi Ly; Tran, Lam-Son Phan
2018-05-28
In plants, two types of methionine sulfoxide reductase (MSR) exist, namely methionine-S-sulfoxide reductase (MSRA) and methionine-R-sulfoxide reductase (MSRB). These enzymes catalyze the reduction of methionine sulfoxides (MetO) back to methionine (Met) by a catalytic cysteine (Cys) and one or two resolving Cys residues. Interestingly, a group of MSRAs encoded by plant genomes does not have a catalytic residue. We asked why, if this group of MSRAs has no function that contributes to fitness, it was not lost during evolution. To address this question, we analyzed the gene family encoding MSRA in soybean (GmMSRAs). We found seven genes encoding GmMSRAs, which included three segmentally duplicated pairs. Among them, one pair of duplicated genes, GmMSRA1 and GmMSRA6, lacked a catalytic Cys residue. Pseudogenes were ruled out because their transcripts were detected in various tissues and their Ka/Ks ratio indicated negative selection pressure. In vivo analysis in the Δ3MSR yeast strain indicated that GmMSRA6 had no activity toward MetO, in contrast to GmMSRA3, which has a catalytic Cys and was active. When exposed to H2O2-induced oxidative stress, GmMSRA6 did not confer any protection on the Δ3MSR yeast strain. Overexpression of GmMSRA6 in Arabidopsis thaliana did not alter the plant's phenotype under physiological conditions. However, the transgenic plants exhibited slightly higher sensitivity toward salinity-induced stress. Taken together, these data suggest that plant MSRAs without the catalytic Cys are not enzymatically active and that their existence may be explained by a role in regulating plant MSR activity via a dominant-negative substrate competition mechanism.
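The reasoning from a Ka/Ks ratio to "negative selection" in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard interpretation (ratio below 1 suggests purifying selection, near 1 neutrality, above 1 positive selection), not the authors' analysis; the function name, the tolerance band and the example values are all hypothetical.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.1) -> str:
    """Label the selection regime suggested by a Ka/Ks ratio.

    ka: nonsynonymous substitutions per nonsynonymous site.
    ks: synonymous substitutions per synonymous site.
    tolerance: arbitrary illustrative band around 1.0 treated as "neutral";
               a real analysis would use a statistical test instead.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio < 1.0 - tolerance:
        return "purifying (negative) selection"
    if ratio > 1.0 + tolerance:
        return "positive selection"
    return "approximately neutral"


# Hypothetical values for illustration only; not taken from the study.
print(classify_selection(ka=0.05, ks=0.40))
```

A transcribed gene pair with a ratio well below 1 is being conserved at the protein level, which is why the abstract treats this, together with detectable transcripts, as evidence against pseudogenes.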
Ngcungcu, Thandiswa; Oti, Martin; Sitek, Jan C; Haukanes, Bjørn I; Linghu, Bolan; Bruccoleri, Robert; Stokowy, Tomasz; Oakeley, Edward J; Yang, Fan; Zhu, Jiang; Sultan, Marc; Schalkwijk, Joost; van Vlijmen-Willems, Ivonne M J J; von der Lippe, Charlotte; Brunner, Han G; Ersland, Kari M; Grayson, Wayne; Buechmann-Moller, Stine; Sundnes, Olav; Nirmala, Nanguneri; Morgan, Thomas M; van Bokhoven, Hans; Steen, Vidar M; Hull, Peter R; Szustakowski, Joseph; Staedtler, Frank; Zhou, Huiqing; Fiskerstrand, Torunn; Ramsay, Michele
2017-05-04
Keratolytic winter erythema (KWE) is a rare autosomal-dominant skin disorder characterized by recurrent episodes of palmoplantar erythema and epidermal peeling. KWE was previously mapped to 8p23.1-p22 (KWE critical region) in South African families. Using targeted resequencing of the KWE critical region in five South African families and SNP array and whole-genome sequencing in two Norwegian families, we identified two overlapping tandem duplications of 7.67 kb (South Africans) and 15.93 kb (Norwegians). The duplications segregated with the disease and were located upstream of CTSB, a gene encoding cathepsin B, a cysteine protease involved in keratinocyte homeostasis. Included in the 2.62 kb overlapping region of these duplications is an enhancer element that is active in epidermal keratinocytes. The activity of this enhancer correlated with CTSB expression in normal differentiating keratinocytes and other cell lines, but not with FDFT1 or NEIL2 expression. Gene expression (qPCR) analysis and immunohistochemistry of the palmar epidermis demonstrated significantly increased expression of CTSB, as well as stronger staining of cathepsin B in the stratum granulosum of affected individuals than in that of control individuals. Analysis of higher-order chromatin structure data and RNA polymerase II ChIA-PET data from MCF-7 cells did not suggest remote effects of the enhancer. In conclusion, KWE in South African and Norwegian families is caused by tandem duplications in a non-coding genomic region containing an active enhancer element for CTSB, resulting in upregulation of this gene in affected individuals. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.
Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J
2016-11-04
Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred-B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks - maize1 and maize2 exhibit similar levels of intra subgenome correlations. Intriguingly, the level of inter subgenome co-expression was similar to the level of intra subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize.
Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types, duplication ages and co-expression consequences.
Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms.
Li, Zhen; Defoort, Jonas; Tasdighian, Setareh; Maere, Steven; Van de Peer, Yves; De Smet, Riet
2016-02-01
Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes. © 2016 American Society of Plant Biologists. All rights reserved.
Genome-Wide Analysis of bZIP-Encoding Genes in Maize
Wei, Kaifa; Chen, Juan; Wang, Yanmei; Chen, Yanhui; Chen, Shaoxiang; Lin, Yina; Pan, Si; Zhong, Xiaojun; Xie, Daoxin
2012-01-01
In plants, basic leucine zipper (bZIP) proteins regulate numerous biological processes such as seed maturation, flower and vascular development, stress signalling and pathogen defence. We have carried out a genome-wide identification and analysis of 125 bZIP genes that exist in the maize genome, encoding 170 distinct bZIP proteins. This family can be divided into 11 groups according to the phylogenetic relationship among the maize bZIP proteins and those in Arabidopsis and rice. Six kinds of intron patterns (a–f) within the basic and hinge regions are defined. The additional conserved motifs have been identified and present the group specificity. Detailed three-dimensional structure analysis has been done to display the sequence conservation and potential distribution of the bZIP domain. Further, we predict the DNA-binding pattern and the dimerization property on the basis of the characteristic features in the basic and hinge regions and the leucine zipper, respectively, which supports our classification greatly and helps to classify 26 distinct subfamilies. The chromosome distribution and the genetic analysis reveal that 58 ZmbZIP genes are located in the segmental duplicate regions in the maize genome, suggesting that the segment chromosomal duplications contribute greatly to the expansion of the maize bZIP family. Across the 60 different developmental stages of 11 organs, three apparent clusters formed represent three kinds of different expression patterns among the ZmbZIP gene family in maize development. A similar but slightly different expression pattern of bZIPs in two inbred lines displays that 22 detected ZmbZIP genes might be involved in drought stress. Thirteen pairs and 143 pairs of ZmbZIP genes show strongly negative and positive correlations in the four distinct fungal infections, respectively, based on the expression profile and Pearson's correlation coefficient analysis. PMID:23103471
Evolutionary Insights into Taste Perception of the Invasive Pest Drosophila suzukii.
Crava, Cristina M; Ramasamy, Sukanya; Ometto, Lino; Anfora, Gianfranco; Rota-Stabelli, Omar
2016-12-07
Chemosensory perception allows insects to interact with the environment by perceiving odorant or tastant molecules; genes encoding chemoreceptors are the molecular interface between the environment and the insect, and play a central role in mediating its chemosensory behavior. Here, we explore how the evolution of these genes in the emerging pest Drosophila suzukii correlates with the peculiar ecology of this species. We annotated approximately 130 genes coding for gustatory receptors (GRs) and divergent ionotropic receptors (dIRs) in D. suzukii and in its close relative D. biarmipes. We then analyzed the evolution, in terms of size, of each gene family as well as the molecular evolution of the genes in a 14 Drosophila species phylogenetic framework. We show that the overall evolution of GRs parallels that of dIRs not only in D. suzukii, but also in all other analyzed Drosophila. Our results reveal an unprecedented burst of gene family size in the lineage leading to the suzukii subgroup, as well as genomic changes that characterize D. suzukii, particularly duplications and strong signs of positive selection in the putative bitter-taste receptor GR59d. Expression studies of duplicate genes in D. suzukii support a spatio-temporal subfunctionalization of the duplicate isoforms. Our results suggest that D. suzukii is not characterized by gene loss, as observed in other specialist Drosophila species, but rather by a dramatic acceleration of gene gains, compatible with a highly generalist feeding behavior. Overall, our analyses provide candidate taste receptors specific for D. suzukii that may correlate with its specific behavior, and which may be tested in functional studies to ultimately enhance its control in the field. Copyright © 2016 Crava et al.
Tester, David J.; Benton, Amber J.; Train, Laura; Deal, Barbara; Baudhuin, Linnea M.; Ackerman, Michael J.
2010-01-01
Long QT Syndrome (LQTS) is a cardiac channelopathy associated with syncope, seizures, and sudden death. Approximately 75% of LQTS is due to mutations in genes encoding for three cardiac ion channel alpha-subunits (LQT1-3). However, traditional mutational analyses have limited detection capabilities for atypical mutations such as large gene rearrangements. Here, we set out to determine the prevalence and spectrum of large deletions/duplications in the major LQTS-susceptibility genes among unrelated patients who were mutation-negative following point mutation analysis of LQT1-12-susceptibility genes. Forty-two unrelated clinically strong LQTS patients were analyzed using multiplex ligation-dependent probe amplification (MLPA), a quantitative fluorescent technique for detecting multiple exon deletions and duplications. The SALSA-MLPA LQTS Kit from MRC-Holland was used to analyze the three major LQTS-associated genes: KCNQ1, KCNH2, and SCN5A and the two minor genes: KCNE1 and KCNE2. Overall, 2 gene rearrangements were found in 2/42 (4.8%, CI, 1.7–11%) unrelated patients. A deletion of KCNQ1 exon 3 was identified in a 10 year-old Caucasian boy with a QTc of 660 milliseconds (ms), a personal history of exercise-induced syncope, and a family history of syncope. A deletion of KCNQ1 exon 7 was identified in a 17 year-old Caucasian girl with a QTc of 480 ms, a personal history of exercise-induced syncope, and a family history of sudden cardiac death. In conclusion, since nearly 5% of patients with genetically elusive LQTS had large genomic rearrangements involving the canonical LQTS-susceptibility genes, reflex genetic testing to investigate genomic rearrangements may be of clinical value. PMID:20920651
2013-01-01
Background: Hydrophobins are small secreted cysteine-rich proteins that play diverse roles during different phases of fungal life cycle. In basidiomycetes, hydrophobin-encoding genes often form large multigene families with up to 40 members. The evolutionary forces driving hydrophobin gene expansion and diversification in basidiomycetes are poorly understood. The functional roles of individual genes within such gene families also remain unclear. The relationship between the hydrophobin gene number, the genome size and the lifestyle of respective fungal species has not yet been thoroughly investigated. Here, we present results of our survey of hydrophobin gene families in two species of wood-degrading basidiomycetes, Phlebia brevispora and Heterobasidion annosum s.l. We have also investigated the regulatory pattern of hydrophobin-encoding genes from H. annosum s.s. during saprotrophic growth on pine wood as well as on culture filtrate from Phlebiopsis gigantea using micro-arrays. These data are supplemented by results of the protein structure modeling for a representative set of hydrophobins. Results: We have identified hydrophobin genes from the genomes of two wood-degrading species of basidiomycetes, Heterobasidion irregulare, representing one of the microspecies within the aggregate H. annosum s.l., and Phlebia brevispora. Although a high number of hydrophobin-encoding genes were observed in H. irregulare (16 copies), a remarkable expansion of these genes was recorded in P. brevispora (26 copies). A significant expansion of hydrophobin-encoding genes in other analyzed basidiomycetes was also documented (1–40 copies), whereas contraction through gene loss was observed among the analyzed ascomycetes (1–11 copies). Our phylogenetic analysis confirmed the important role of gene duplication events in the evolution of hydrophobins in basidiomycetes.
Increased number of hydrophobin-encoding genes appears to have been linked to the species’ ecological strategy, with the non-pathogenic fungi having increased numbers of hydrophobins compared with their pathogenic counterparts. However, there was no significant relationship between the number of hydrophobin-encoding genes and genome size. Furthermore, our results revealed significant differences in the expression levels of the 16 H. annosum s.s. hydrophobin-encoding genes which suggest possible differences in their regulatory patterns. Conclusions: A considerable expansion of the hydrophobin-encoding genes in basidiomycetes has been observed. The distribution and number of hydrophobin-encoding genes in the analyzed species may be connected to their ecological preferences. Results of our analysis also have shown that H. annosum s.l. hydrophobin-encoding genes may be under positive selection. Our gene expression analysis revealed differential expression of H. annosum s.s. hydrophobin genes under different growth conditions, indicating their possible functional diversification. PMID:24188142
Wang, Yupeng; Ficklin, Stephen P; Wang, Xiyin; Feltus, F Alex; Paterson, Andrew H
2016-01-01
Different modes of gene duplication including whole-genome duplication (WGD), and tandem, proximal and dispersed duplications are widespread in angiosperm genomes. Small-scale, stochastic gene relocations and transposed gene duplications are widely accepted to be the primary mechanisms for the creation of dispersed duplicates. However, here we show that most surviving ancient dispersed duplicates in core eudicots originated from large-scale gene relocations within a narrow window of time following a genome triplication (γ) event that occurred in the stem lineage of core eudicots. We name these surviving ancient dispersed duplicates as relocated γ duplicates. In Arabidopsis thaliana, relocated γ, WGD and single-gene duplicates have distinct features with regard to gene functions, essentiality, and protein interactions. Relative to γ duplicates, relocated γ duplicates have higher non-synonymous substitution rates, but comparable levels of expression and regulation divergence. Thus, relocated γ duplicates should be distinguished from WGD and single-gene duplicates for evolutionary investigations. Our results suggest large-scale gene relocations following the γ event were associated with the diversification of core eudicots.
Wang, Yupeng; Ficklin, Stephen P.; Wang, Xiyin; Feltus, F. Alex; Paterson, Andrew H.
2016-01-01
Different modes of gene duplication including whole-genome duplication (WGD), and tandem, proximal and dispersed duplications are widespread in angiosperm genomes. Small-scale, stochastic gene relocations and transposed gene duplications are widely accepted to be the primary mechanisms for the creation of dispersed duplicates. However, here we show that most surviving ancient dispersed duplicates in core eudicots originated from large-scale gene relocations within a narrow window of time following a genome triplication (γ) event that occurred in the stem lineage of core eudicots. We name these surviving ancient dispersed duplicates as relocated γ duplicates. In Arabidopsis thaliana, relocated γ, WGD and single-gene duplicates have distinct features with regard to gene functions, essentiality, and protein interactions. Relative to γ duplicates, relocated γ duplicates have higher non-synonymous substitution rates, but comparable levels of expression and regulation divergence. Thus, relocated γ duplicates should be distinguished from WGD and single-gene duplicates for evolutionary investigations. Our results suggest large-scale gene relocations following the γ event were associated with the diversification of core eudicots. PMID:27195960
DOE Office of Scientific and Technical Information (OSTI.GOV)
Onda, M.; Kudo, S.; Fukuda, M.
Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.
Gene Duplicability of Core Genes Is Highly Consistent across All Angiosperms[OPEN]
Li, Zhen; Van de Peer, Yves; De Smet, Riet
2016-01-01
Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of “gene duplicability” is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes. PMID:26744215
Gene duplication and the evolution of phenotypic diversity in insect societies.
Chau, Linh M; Goodisman, Michael A D
2017-12-01
Gene duplication is an important evolutionary process thought to facilitate the evolution of phenotypic diversity. We investigated if gene duplication was associated with the evolution of phenotypic differences in a highly social insect, the honeybee Apis mellifera. We hypothesized that the genetic redundancy provided by gene duplication could promote the evolution of social and sexual phenotypes associated with advanced societies. We found a positive correlation between sociality and rate of gene duplications across the Apoidea, indicating that gene duplication may be associated with sociality. We also discovered that genes showing biased expression between A. mellifera alternative phenotypes tended to be found more frequently than expected among duplicated genes than singletons. Moreover, duplicated genes had higher levels of caste-, sex-, behavior-, and tissue-biased expression compared to singletons, as expected if gene duplication facilitated phenotypic differentiation. We also found that duplicated genes were maintained in the A. mellifera genome through the processes of conservation, neofunctionalization, and specialization, but not subfunctionalization. Overall, we conclude that gene duplication may have facilitated the evolution of social and sexual phenotypes, as well as tissue differentiation. Thus this study further supports the idea that gene duplication allows species to evolve an increased range of phenotypic diversity. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
USDA-ARS's Scientific Manuscript database
The Q gene encodes an AP2-like transcription factor that played an important role in domestication of polyploid wheat. The chromosome 5A Q alleles (5AQ and 5Aq) have been well studied, but much less is known about the q alleles on wheat homoeologous chromosomes 5B (5Bq) and 5D (5Dq). We investigated...
Molecular evolution of multiple arylalkylamine N-acetyltransferase (AANAT) in fish.
Zilberman-Peled, Bina; Bransburg-Zabary, Sharron; Klein, David C; Gothilf, Yoav
2011-01-01
Arylalkylamine N-acetyltransferase (AANAT) catalyzes the transfer of an acetyl group from acetyl coenzyme A (AcCoA) to arylalkylamines, including indolethylamines and phenylethylamines. Multiple aanats are present in teleost fish as a result of whole genome and gene duplications. Fish aanat1a and aanat2 paralogs display different patterns of tissue expression and encode proteins with different substrate preference: AANAT1a is expressed in the retina, and acetylates both indolethylamines and phenylethylamines; while AANAT2 is expressed in the pineal gland, and preferentially acetylates indolethylamines. The two enzymes are therefore thought to serve different roles. Here, the molecular changes that led to their specialization were studied by investigating the structure-function relationships of AANATs in the gilthead seabream (sb, Sparus aurata). Acetylation activity of reciprocal mutated enzymes pointed to specific residues that contribute to substrate specificity of the enzymes. Inhibition tests, followed by complementary analyses of the predicted three-dimensional models of the enzymes, suggested that both phenylethylamines and indolethylamines bind to the catalytic pocket of both enzymes. These results suggest that substrate selectivity of AANAT1a and AANAT2 is determined by the positioning of the substrate within the catalytic pocket, and its accessibility to catalysis. This illustrates the evolutionary process by which enzymes encoded by duplicated genes acquire different activities and play different biological roles.
Wang, Guifeng; Zhong, Mingyu; Wang, Gang; Song, Rentao
2014-01-01
The actin-based myosin system is essential for the organization and dynamics of the endomembrane system and transport network in plant cells. Plants harbour two unique myosin groups, class VIII and class XI, and the latter is structurally and functionally analogous to the animal and fungal class V myosin. Little is known about myosins in grass, even though grass includes several agronomically important cereal crops. Here, we identified 14 myosin genes from the genome of maize (Zea mays). The relatively larger sizes of maize myosin genes are due to their much longer introns, which are abundant in transposable elements. Phylogenetic analysis indicated that maize myosin genes could be classified into class VIII and class XI, with three and 11 members, respectively. Apart from subgroup XI-F, the remaining subgroups were duplicated at least in one analysed lineage, and the duplication events occurred more extensively in Arabidopsis than in maize. Only two pairs of maize myosins were generated from segmental duplication. Expression analysis revealed that most maize myosin genes were expressed universally, whereas a few members (XI-1, -6, and -11) showed an anther-specific pattern, and many underwent extensive alternative splicing. We also found a short transcript at the O1 locus, which conceptually encoded a headless myosin that most likely functions at the transcriptional level rather than via a dominant-negative mechanism at the translational level. Together, these data provide significant insights into the evolutionary and functional characterization of maize myosin genes that could transfer to the identification and application of homologous myosins of other grasses. PMID:24363426
A limited role for gene duplications in the evolution of platypus venom.
Wong, Emily S W; Papenfuss, Anthony T; Whittington, Camilla M; Warren, Wesley C; Belov, Katherine
2012-01-01
Gene duplication followed by adaptive selection is believed to be the primary driver of venom evolution. However, to date, no studies have evaluated the importance of gene duplications for venom evolution using a genomic approach. The availability of a sequenced genome and a venom gland transcriptome for the enigmatic platypus provides a unique opportunity to explore the role that gene duplication plays in venom evolution. Here, we identify gene duplication events and correlate them with expressed transcripts in an in-season venom gland. Gene duplicates (1,508) were identified. These duplicated pairs (421), including genes that have undergone multiple rounds of gene duplications, were expressed in the venom gland. The majority of these genes are involved in metabolism and protein synthesis not toxin functions. Twelve secretory genes including serine proteases, metalloproteinases, and protease inhibitors likely to produce symptoms of envenomation such as vasodilation and pain were detected. Only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication. Platypus venom C-type natriuretic peptides and nerve growth factor do not possess lineage-specific gene duplicates. Extensive duplications, believed to increase the potency of toxic content and promote toxin diversification, were not found. This is the first study to take a genome-wide approach in order to examine the impact of gene duplication on venom evolution. Our findings support the idea that adaptive selection acts on gene duplicates to drive the independent evolution and functional diversification of similar venom genes in venomous species. However, gene duplications alone do not explain the "venome" of the platypus. Other mechanisms, such as alternative splicing and mutation, may be important in venom innovation.
A Limited Role for Gene Duplications in the Evolution of Platypus Venom
Wong, Emily S. W.; Papenfuss, Anthony T.; Whittington, Camilla M.; Warren, Wesley C.; Belov, Katherine
2012-01-01
Gene duplication followed by adaptive selection is believed to be the primary driver of venom evolution. However, to date, no studies have evaluated the importance of gene duplications for venom evolution using a genomic approach. The availability of a sequenced genome and a venom gland transcriptome for the enigmatic platypus provides a unique opportunity to explore the role that gene duplication plays in venom evolution. Here, we identify gene duplication events and correlate them with expressed transcripts in an in-season venom gland. Gene duplicates (1,508) were identified. These duplicated pairs (421), including genes that have undergone multiple rounds of gene duplications, were expressed in the venom gland. The majority of these genes are involved in metabolism and protein synthesis not toxin functions. Twelve secretory genes including serine proteases, metalloproteinases, and protease inhibitors likely to produce symptoms of envenomation such as vasodilation and pain were detected. Only 16 of 107 platypus genes with high similarity to known toxins evolved through gene duplication. Platypus venom C-type natriuretic peptides and nerve growth factor do not possess lineage-specific gene duplicates. Extensive duplications, believed to increase the potency of toxic content and promote toxin diversification, were not found. This is the first study to take a genome-wide approach in order to examine the impact of gene duplication on venom evolution. Our findings support the idea that adaptive selection acts on gene duplicates to drive the independent evolution and functional diversification of similar venom genes in venomous species. However, gene duplications alone do not explain the “venome” of the platypus. Other mechanisms, such as alternative splicing and mutation, may be important in venom innovation. PMID:21816864
Genetic structure of the mating-type locus of Chlamydomonas reinhardtii.
Ferris, Patrick J; Armbrust, E Virginia; Goodenough, Ursula W
2002-01-01
Portions of the cloned mating-type (MT) loci (mt(+) and mt(-)) of Chlamydomonas reinhardtii, defined as the approximately 1-Mb domains of linkage group VI that are under recombinational suppression, were subjected to Northern analysis to elucidate their coding capacity. The four central rearranged segments of the loci were found to contain both housekeeping genes (expressed during several life-cycle stages) and mating-related genes, while the sequences unique to mt(+) or mt(-) carried genes expressed only in the gametic or zygotic phases of the life cycle. One of these genes, Mtd1, is a candidate participant in gametic cell fusion; two others, Mta1 and Ezy2, are candidate participants in the uniparental inheritance of chloroplast DNA. The identified housekeeping genes include Pdk, encoding pyruvate dehydrogenase kinase, and GdcH, encoding glycine decarboxylase complex subunit H. Unusual genetic configurations include three genes whose sequences overlap, one gene that has inserted into the coding region of another, several genes that have been inactivated by rearrangements in the region, and genes that have undergone tandem duplication. This report extends our original conclusion that the MT locus has incurred high levels of mutational change. PMID:11805055
Coady, A.M.; Murray, A.L.; Elliott, D.G.; Rhodes, L.D.
2006-01-01
Renibacterium salmoninarum, a gram-positive diplococcobacillus that causes bacterial kidney disease among salmon and trout, has two chromosomal loci encoding the major soluble antigen (msa) gene. Because the MSA protein is widely suspected to be an important virulence factor, we used insertion-duplication mutagenesis to generate disruptions of either the msa1 or msa2 gene. Surprisingly, expression of MSA protein in broth cultures appeared unaffected. However, the virulence of either mutant in juvenile Chinook salmon (Oncorhynchus tshawytscha) by intraperitoneal challenge was severely attenuated, suggesting that disruption of the msa1 or msa2 gene affected in vivo expression. Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Both mechanism and age of duplications contribute to biased gene retention patterns in plants.
Rody, Hugo V S; Baute, Gregory J; Rieseberg, Loren H; Oliveira, Luiz O
2017-01-06
All extant seed plants are successful paleopolyploids, whose genomes carry duplicate genes that have survived repeated episodes of diploidization. However, the survival of gene duplicates is biased with respect to gene function and mechanism of duplication. Transcription factors, in particular, are reported to be preferentially retained following whole-genome duplications (WGDs), but disproportionately lost when duplicated by tandem events. An explanation for this pattern is provided by the Gene Balance Hypothesis (GBH), which posits that duplicates of highly connected genes are retained following WGDs to maintain optimal stoichiometry among gene products; but such connected gene duplicates are disfavored following tandem duplications. We used genomic data from 25 taxonomically diverse plant species to investigate the roles of duplication mechanism, gene function, and age of duplication in the retention of duplicate genes. Enrichment analyses were conducted to identify Gene Ontology (GO) functional categories that were overrepresented in either WGD or tandem duplications, or across ranges of divergence times. Tandem paralogs were much younger, on average, than WGD paralogs and the most frequently overrepresented GO categories were not shared between tandem and WGD paralogs. Transcription factors were overrepresented among ancient paralogs regardless of mechanism of origin or presence of a WGD. Also, in many cases, there was no bias toward transcription factor retention following recent WGDs. Both the fixation and the retention of duplicated genes in plant genomes are context-dependent events. The strong bias toward ancient transcription factor duplicates can be reconciled with the GBH if selection for optimal stoichiometry among gene products is strongest following the earliest polyploidization events and becomes increasingly relaxed as gene families expand.
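The enrichment analyses mentioned above are commonly framed as a one-sided hypergeometric test for overrepresentation of a GO category in a set of paralogs. The following is a minimal sketch of that general technique, not the authors' pipeline; the function name and all gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the genome (background)
    K: background genes annotated with the GO category
    n: genes in the duplicate set being tested (e.g. WGD paralogs)
    k: genes in that set annotated with the GO category
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the upper tail exactly with integer arithmetic, then divide once.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 20,000 genes, 1,500 transcription factors,
# 3,000 WGD paralogs of which 300 are transcription factors.
p_value = hypergeom_sf(300, 20000, 1500, 3000)
expected = 3000 * 1500 / 20000  # 225 expected by chance
print(f"expected {expected:.0f}, observed 300, p = {p_value:.3g}")
```

In a study spanning many GO categories and species, the resulting p-values would normally be corrected for multiple testing before any category is called overrepresented.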
Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui
2016-01-01
WRKY proteins comprise one of the largest transcription factor families in plants and form key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in Euphorbiaceous plants.
Hardigan, Michael A.; Crisovan, Emily; Hamilton, John P.; Laimbeer, Parker; Leisner, Courtney P.; Manrique-Carpintero, Norma C.; Newton, Linsey; Pham, Gina M.; Vaillancourt, Brieanne; Zeng, Zixian; Jiang, Jiming
2016-01-01
Clonally reproducing plants have the potential to bear a significantly greater mutational load than sexually reproducing species. To investigate this possibility, we examined the breadth of genome-wide structural variation in a panel of monoploid/doubled monoploid clones generated from native populations of diploid potato (Solanum tuberosum), a highly heterozygous asexually propagated plant. As rare instances of purely homozygous clones, they provided an ideal set for determining the degree of structural variation tolerated by this species and deriving its minimal gene complement. Extensive copy number variation (CNV) was uncovered, impacting 219.8 Mb (30.2%) of the potato genome with nearly 30% of genes subject to at least partial duplication or deletion, revealing the highly heterogeneous nature of the potato genome. Dispensable genes (>7000) were associated with limited transcription and/or a recent evolutionary history, with lower deletion frequency observed in genes conserved across angiosperms. Association of CNV with plant adaptation was highlighted by enrichment in gene clusters encoding functions for environmental stress response, with gene duplication playing a part in species-specific expansions of stress-related gene families. This study revealed unique impacts of CNV in a species with asexual reproductive habits and how CNV may drive adaption through evolution of key stress pathways. PMID:26772996
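The two CNV figures quoted above (219.8 Mb and 30.2%) together imply the size of the genome they were measured against. A small sketch of that arithmetic follows; the function name is hypothetical, and the result is only a back-calculation from the rounded numbers in the abstract, not a value reported by the authors.

```python
def implied_total(part, fraction):
    """Back-calculate a total from a part and the fraction it represents."""
    return part / fraction


cnv_mb = 219.8        # Mb affected by CNV, from the abstract
cnv_fraction = 0.302  # 30.2% of the genome, from the abstract

genome_mb = implied_total(cnv_mb, cnv_fraction)
print(f"Implied reference genome size: about {genome_mb:.0f} Mb")
```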
Evolution of the duplicated intracellular lipid-binding protein genes of teleost fishes.
Venkatachalam, Ananda B; Parmar, Manoj B; Wright, Jonathan M
2017-08-01
Increasing organismal complexity during the evolution of life has been attributed to the duplication of genes and entire genomes. More recently, theoretical models have been proposed that postulate the fate of duplicated genes, among them the duplication-degeneration-complementation (DDC) model. In the DDC model, the common fate of a duplicated gene is loss from the genome owing to nonfunctionalization. Duplicated genes are retained in the genome either by subfunctionalization, where the functions of the ancestral gene are sub-divided between the sister duplicate genes, or by neofunctionalization, where one of the duplicate genes acquires a new function. Both processes occur either by loss or gain of regulatory elements in the promoters of duplicated genes. Here, we review the genomic organization, evolution, and transcriptional regulation of the multigene family of intracellular lipid-binding protein (iLBP) genes from teleost fishes. Teleost fishes possess many copies of iLBP genes owing to a whole genome duplication (WGD) early in the teleost fish radiation. Moreover, the retention of duplicated iLBP genes is substantially higher than the retention of all other genes duplicated in the teleost genome. The fatty acid-binding protein genes, a subfamily of the iLBP multigene family in zebrafish, are differentially regulated by peroxisome proliferator-activated receptor (PPAR) isoforms, which may account for the retention of iLBP genes in the zebrafish genome by the process of subfunctionalization of cis-acting regulatory elements in iLBP gene promoters.
Evolution of the Kdo2-lipid A Biosynthesis in Bacteria
DOE Office of Scientific and Technical Information (OSTI.GOV)
S Opiyo; R Pardy; H Moriyama
BACKGROUND: Lipid A is the highly immunoreactive endotoxic center of lipopolysaccharide (LPS). It anchors the LPS into the outer membrane of most Gram-negative bacteria. Lipid A can be recognized by animal cells, triggers defense-related responses, and causes Gram-negative sepsis. The biosynthesis of Kdo2-lipid A, the LPS substructure, involves nine enzymatic steps. RESULTS: In order to elucidate the evolutionary pathway of Kdo2-lipid A biosynthesis, we examined the distribution of genes encoding the nine enzymes across bacteria. We found that not all Gram-negative bacteria have all nine enzymes. Some Gram-negative bacteria have no genes encoding these enzymes and others have genes only for the first four enzymes (LpxA, LpxC, LpxD, and LpxB). Among the nine enzymes, five appeared to have arisen from three independent gene duplication events. Two of such events happened within the Proteobacteria lineage, followed by functional specialization of the duplicated genes and pathway optimization in these bacteria. CONCLUSIONS: The nine-enzyme pathway, which was established based on the studies mainly in Escherichia coli K12, appears to be the most derived and optimized form. It is found only in E. coli and related Proteobacteria. Simpler and probably less efficient pathways are found in other bacterial groups, with Kdo2-lipid A variants as the likely end products. The Kdo2-lipid A biosynthetic pathway exemplifies extremely plastic evolution of bacterial genomes, especially those of Proteobacteria, and how these mainly pathogenic bacteria have adapted to their environment.
Wei, Chaoling; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Xia, Enhua; Lu, Ying; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Tong, Wei; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Li, Fangdong; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Bennetzen, Jeffrey L; Zhao, Shancen; Wan, Xiaochun
2018-05-01
Tea, one of the world's most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ∼0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ∼30 to 40 and ∼90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties. Copyright © 2018 the Author(s). Published by PNAS.
Hamilton, P T; Reeve, J N
1985-01-01
DNA fragments cloned from the methanogenic archaebacterium Methanobrevibacter smithii which complement mutations in the purE and proC genes of E. coli have been sequenced. Sequence analyses, transposon mutagenesis and expression in E. coli minicells indicate that purE and proC complementations result from the synthesis of M. smithii polypeptides with molecular weights of 36,697 and 27,836 respectively. The encoding genes appear to be located in operons. The M. smithii genome contains 69% A/T basepairs (bp) which is reflected in unusual codon usages and intergenic regions containing approximately 85% A/T bp. An insertion element, designated ISM1, was found within the cloned M. smithii DNA located adjacent to the proC complementing region. ISM1 is 1381 bp in length, has 29 bp terminal inverted repeat sequences and contains one major ORF encoded in 87% of the ISM1 sequence. ISM1 is mobile, present in approximately 10 copies per genome and integration duplicates 8 bp at the site of insertion. The duplicated sequences show homology with sequences within the 29 bp terminal repeat sequence of ISM1. Comparison of our data with sequences from halophilic archaebacteria suggests that 5'GAANTTTCA and 5'TTTTAATATAAA may be consensus promoter sequences for archaebacteria. These sequences closely resemble the consensus sequences which precede Drosophila heat-shock genes (Pelham 1982; Davidson et al. 1983). Methanogens appear to employ the eubacterial system of mRNA: 16SrRNA hybridization to ensure initiation of translation; the consensus ribosome binding sequence is 5'AGGTGA.
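The proposed consensus promoter sequences above include an ambiguous position (the N in 5'GAANTTTCA). One simple way to scan a sequence for such a consensus is to expand the IUPAC code into a regular expression. This is an illustrative sketch only; the helper names and the test sequences are made up, and only the code N is handled here.

```python
import re

# Minimal IUPAC table: only the symbols that occur in the two motifs above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    core = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer(f"(?=({core}))", seq.upper())]


# Hypothetical sequences for illustration, not M. smithii data.
print(find_motif("TTGAACTTTCAGG", "GAANTTTCA"))
print(find_motif("CCTTTTAATATAAAGG", "TTTTAATATAAA"))
```

Exact matching is the strictest reading of a consensus; a position weight matrix or a mismatch tolerance would be the usual next step when scanning real intergenic regions.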
Wei, Chaoling; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Xia, Enhua; Lu, Ying; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Tong, Wei; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Li, Fangdong; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Bennetzen, Jeffrey L.; Zhao, Shancen; Wan, Xiaochun
2018-01-01
Tea, one of the world’s most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ∼0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ∼30 to 40 and ∼90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties. PMID:29678829
Paterson, Andrew H; Chapman, Brad A; Kissinger, Jessica C; Bowers, John E; Feltus, Frank A; Estill, James C
2006-11-01
Genome duplication is potentially a good source of new genes, but such genes take time to evolve. We have found a group of "duplication-resistant" genes, which have undergone convergent restoration to singleton status following several independent genome duplications. Restoration of duplication-resistant genes to singleton status could be important to long-term survival of a polyploid lineage. Angiosperms show more frequent polyploidization and a higher degree of duplicate gene preservation than other paleopolyploids, making them well-suited to further study of duplication-resistant genes.
Sanzol, Javier
2010-05-14
Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication; an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs; those originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies, allowed evaluation of the role of both polyploidy and small-scale duplications during this process. 
Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.
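The comparison of expression profiles between duplicated genes described above can be illustrated with a Pearson correlation computed per paralog pair. The sketch below assumes that a pair counts as uncorrelated when its coefficient falls below a cutoff; the cutoff of 0.5, the function names, and the toy profiles are all assumptions, since the abstract does not state the criterion used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / (sd_x * sd_y)


def fraction_uncorrelated(pairs, threshold=0.5):
    """Share of paralog pairs whose correlation is below the cutoff."""
    below = sum(1 for a, b in pairs if pearson(a, b) < threshold)
    return below / len(pairs)


# Toy expression profiles across four tissues (hypothetical values).
toy_pairs = [
    ([1, 2, 3, 4], [2, 4, 6, 8]),  # co-expressed
    ([1, 2, 3, 4], [4, 3, 2, 1]),  # opposite pattern
    ([1, 2, 3, 4], [2, 1, 2, 1]),  # weak negative relationship
]
print(fraction_uncorrelated(toy_pairs))
```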
The evolution of an osmotically inducible dps in the genus Streptomyces.
Facey, Paul D; Hitchings, Matthew D; Williams, Jason S; Skibinski, David O F; Dyson, Paul J; Del Sol, Ricardo
2013-01-01
Dps proteins are found almost ubiquitously in bacterial genomes and there is now an appreciation of their multifaceted roles in various stress responses. Previous studies have shown that this family of proteins assembles into dodecamers and that their quaternary structure is critical to their function. Moreover, the number of dps genes per bacterial genome is variable, even amongst closely related species; however, for many genera this enigma is yet to be satisfactorily explained. We reconstruct the most probable evolutionary history of Dps in Streptomyces genomes. Typically, these bacteria encode more than one Dps protein. We offer the explanation that variation in the number of dps per genome among closely related Streptomyces can be explained by gene duplication or lateral acquisition, and the former preceded a subsequent shift in expression patterns for one of the resultant paralogs. We show that the genome of S. coelicolor encodes three Dps proteins including a tailless Dps. Our in vivo observations show that the tailless protein, unlike the other two Dps in S. coelicolor, does not readily oligomerise. Phylogenetic and bioinformatic analyses combined with expression studies indicate that in several Streptomyces species at least one Dps is significantly over-expressed during osmotic shock, but the identity of the ortholog varies. In silico analysis of dps promoter regions coupled with gene expression studies of duplicated dps genes shows that paralogous gene pairs are expressed differentially and this correlates with the presence of a sigB promoter. Lastly, we identify a rare novel clade of Dps and show that a representative of these proteins in S. coelicolor possesses a dodecameric quaternary structure of high stability.
Expression of Duplicate msa Genes in the Salmonid Pathogen Renibacterium salmoninarum
Rhodes, Linda D.; Coady, Alison M.; Strom, Mark S.
2002-01-01
Renibacterium salmoninarum is a gram-positive bacterium responsible for bacterial kidney disease of salmon and trout. R. salmoninarum has two identical copies of the gene encoding major soluble antigen (MSA), an immunodominant, extracellular protein. To determine whether one or both copies of msa are expressed, reporter plasmids encoding a fusion of MSA and green fluorescent protein controlled by 0.6 kb of promoter region from msa1 or msa2 were constructed and introduced into R. salmoninarum. Single copies of the reporter plasmids integrated into the chromosome by homologous recombination. Expression of mRNA and protein from the integrated plasmids was detected, and transformed cells were fluorescent, demonstrating that both msa1 and msa2 are expressed under in vitro conditions. This is the first report of successful transformation and homologous recombination in R. salmoninarum. PMID:12406741
Mathews, Kristina Wehr; Cavegn, Margrith; Zwicky, Monica
2017-03-01
Drosophila females are larger than males. In this article, we describe how X-chromosome dosage drives sexual dimorphism of body size through two means: first, through unbalanced expression of a key X-linked growth-regulating gene, and second, through female-specific activation of the sex-determination pathway. X-chromosome dosage determines phenotypic sex by regulating the genes of the sex-determining pathway. In the presence of two sets of X-chromosome signal elements (XSEs), Sex-lethal (Sxl) is activated in female (XX) but not male (XY) animals. Sxl activates transformer (tra), a gene that encodes a splicing factor essential for female-specific development. It has previously been shown that null mutations in the tra gene result in only a partial reduction of body size of XX animals, which shows that other factors must contribute to size determination. We tested whether X dosage directly affects animal size by analyzing males with duplications of X-chromosomal segments. Upon tiling across the X chromosome, we found four duplications that increase male size by >9%. Within these, we identified several genes that promote growth as a result of duplication. Only one of these, Myc, was found not to be dosage compensated. Together, our results indicate that both Myc dosage and tra expression play crucial roles in determining sex-specific size in Drosophila larvae and adult tissue. Since Myc also acts as an XSE that contributes to tra activation in early development, a double dose of Myc in females serves at least twice in development to promote sexual size dimorphism. Copyright © 2017 by the Genetics Society of America.
Coppin, Evelyne; Silar, Philippe
2007-08-01
In the filamentous fungus Podospora anserina, many pigmentation mutations map to the median region of the complex locus '14', called segment '29'. The data presented in this paper show that segment 29 corresponds to a gene encoding a polyketide synthase, designated PaPKS1, and identifies two mutations that completely or partially abolish the activity of the PaPKS1 polypeptide. We present evidence that the P. anserina green pigment is a (DHN)-melanin. Using the powerful genetic system of PaPKS1 cloning, we demonstrate that in P. anserina trans-duplicated sequences are subject to the RIP process as previously demonstrated for the cis-duplicated regions.
Zhang, Zhengrong; Yuan, Li; Liu, Xin; Chen, Xuesen; Wang, Xiaoyun
2018-01-10
As a family of transcription factors, DNA binding with one finger (Dof) proteins play important roles in various biological processes in plants. Here, a total of 60 putative apple (Malus domestica) Dof genes (MdDof) were identified and mapped to different chromosomes. Chromosomal distribution and synteny analysis indicated that the expansion of the MdDof genes came primarily from segmental and duplication events, and from whole genome duplication, which led to more Dof members in apples than in other plants. All 60 MdDof genes were classified into thirteen groups, according to multiple sequence alignment and the phylogenetic tree constructed of Dof genes from apple, peach (Prunus persica), Arabidopsis and rice. Within each group, the members shared similar exon/intron and motif compositions, although the sizes of the MdDof genes and encoded proteins were quite different. Several Dof genes from the apple and peach were identified to be homologues based on their close synteny relationship, which suggested that these genes bear similar functions. Half of the MdDof genes were randomly selected to determine their responses to different stresses. The majority of MdDof genes were quite sensitive to PEG, NaCl, cold and exogenous ABA treatment. Our results suggested that MdDof family members may play important roles in plant tolerance to abiotic stress. Copyright © 2017 Elsevier B.V. All rights reserved.
Pan, Deng; Zhang, Liqing
2007-01-01
BACKGROUND: The rate of gene duplication is an important parameter in the study of evolution, but the influence of gene conversion and technical problems have confounded previous attempts to provide a satisfying estimate. We propose a new strategy to estimate the rate that involves separate quantification of the rates of two different mechanisms of gene duplication and subsequent combination of the two rates, based on their respective contributions to the overall gene duplication rate. RESULTS: Previous estimates of gene duplication rates are based on small gene families. Therefore, to assess the applicability of this to families of all sizes, we looked at both two-copy gene families and the entire genome. We studied unequal crossover and retrotransposition, and found that these mechanisms of gene duplication are largely independent and account for a substantial amount of duplicated genes. Unequal crossover contributed more to duplications in the entire genome than retrotransposition did, but this contribution was significantly less in two-copy gene families, and duplicated genes arising from this mechanism are more likely to be retained. Combining rates of duplication using the two mechanisms, we estimated the overall rates to be from approximately 0.515 to 1.49 × 10⁻³ per gene per million years in human, and from approximately 1.23 to 4.23 × 10⁻³ in mouse. The rates estimated from two-copy gene families are always lower than those from the entire genome, and so it is not appropriate to use small families to estimate the rate for the entire genome. CONCLUSION: We present a novel strategy for estimating gene duplication rates. Our results show that different mechanisms contribute differently to the evolution of small and large gene families. PMID:17683522
Homez, a homeobox leucine zipper gene specific to the vertebrate lineage.
Bayarsaihan, Dashzeveg; Enkhmandakh, Badam; Makeyev, Aleksandr; Greally, John M; Leckman, James F; Ruddle, Frank H
2003-09-02
This work describes a vertebrate homeobox gene, designated Homez (homeodomain leucine zipper-encoding gene), that encodes a protein with an unusual structural organization. There are several regions within Homez, including three atypical homeodomains, two leucine zipper-like motifs, and an acidic domain. The gene is ubiquitously expressed in human and murine tissues, although the expression pattern is more restricted during mouse development. Genomic analysis revealed that human and mouse genes are located at 14q11.2 and 14C, respectively, and are composed of two exons. The zebrafish and pufferfish homologs share high similarity to mammalian sequences, particularly within the homeodomain sequences. Based on homology of homeodomains and on the similarity in overall protein structure, we delineate Homez and members of ZHX family of zinc finger homeodomain factors as a subset within the superfamily of homeobox-containing proteins. The type and composition of homeodomains in the Homez subfamily are vertebrate-specific. Phylogenetic analysis indicates that Homez lineage was separated from related genes >400 million years ago before separation of ray- and lobe-finned fishes. We apply a duplication-degeneration-complementation model to explain how this family of genes has evolved.
Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.
Lin, X; Kaul, S; Rounsley, S; Shea, T P; Benito, M I; Town, C D; Fujii, C Y; Mason, T; Bowman, C L; Barnstead, M; Feldblyum, T V; Buell, C R; Ketchum, K A; Lee, J; Ronning, C M; Koo, H L; Moffat, K S; Cronin, L A; Shen, M; Pai, G; Van Aken, S; Umayam, L; Tallon, L J; Gill, J E; Adams, M D; Carrera, A J; Creasy, T H; Goodman, H M; Somerville, C R; Copenhaver, G P; Preuss, D; Nierman, W C; White, O; Eisen, J A; Salzberg, S L; Fraser, C M; Venter, J C
1999-12-16
Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130-140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.
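The figures in the abstract above can be cross-checked against each other with a few lines of arithmetic: the two contigs sum to the sequenced length of chromosome 2, and dividing by its stated 15% share of the genome should land within the quoted 130-140 Mb range. The sketch below uses only numbers from the abstract; the variable names are hypothetical, and the gene density is a derived illustration rather than a reported value.

```python
contig_lengths_mb = [3.6, 16.0]  # the two gap-free assemblies
chromosome_share = 0.15          # chromosome 2 as a fraction of the genome
gene_count = 4037                # genes encoded on chromosome 2

chromosome_mb = sum(contig_lengths_mb)
implied_genome_mb = chromosome_mb / chromosome_share
genes_per_mb = gene_count / chromosome_mb
kb_per_gene = 1000 / genes_per_mb

print(f"Sequenced chromosome 2: {chromosome_mb:.1f} Mb")
print(f"Implied genome size: {implied_genome_mb:.1f} Mb")
print(f"Gene density: {genes_per_mb:.0f} genes per Mb "
      f"(one gene per {kb_per_gene:.1f} kb on average)")
```

The implied genome size of roughly 131 Mb sits at the lower end of the stated range, which is consistent with the 15% share being a rounded figure.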
Wang, Yupeng; Wang, Xiyin; Tang, Haibao; Tan, Xu; Ficklin, Stephen P; Feltus, F Alex; Paterson, Andrew H
2011-01-01
Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution.
Wang, Yupeng; Wang, Xiyin; Tang, Haibao; Tan, Xu; Ficklin, Stephen P.; Feltus, F. Alex; Paterson, Andrew H.
2011-01-01
BACKGROUND: Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. RESULTS: In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. CONCLUSION: Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution. PMID:22164235
Tempo and Mode of Gene Duplication in Mammalian Ribosomal Protein Evolution
Gajdosik, Matthew D.; Simon, Amanda; Nelson, Craig E.
2014-01-01
Gene duplication has been widely recognized as a major driver of evolutionary change and organismal complexity through the generation of multi-gene families. Therefore, understanding the forces that govern the evolution of gene families through the retention or loss of duplicated genes is fundamentally important in our efforts to study genome evolution. Previous work from our lab has shown that ribosomal protein (RP) genes constitute one of the largest classes of conserved duplicated genes in mammals. This result was surprising because ribosomal protein genes evolve slowly and their transcript levels are very tightly regulated. In our present study, we identified and characterized all RP duplicates in eight mammalian genomes in order to investigate the tempo and mode of ribosomal protein family evolution. We show that a sizable number of duplicates are transcriptionally active and are very highly conserved. Furthermore, we conclude that existing gene duplication models do not readily account for the preservation of a very large number of intact retroduplicated ribosomal protein (RT-RP) genes observed in mammalian genomes. We suggest that selection against dominant-negative mutations may underlie the unexpected retention and conservation of duplicated RP genes, and may shape the fate of newly duplicated genes, regardless of duplication mechanism. PMID:25369106
The Early ANTP Gene Repertoire: Insights from the Placozoan Genome
Schierwater, Bernd; Kamm, Kai; Srivastava, Mansi; Rokhsar, Daniel; Rosengarten, Rafael D.; Dellaporta, Stephen L.
2008-01-01
The evolution of ANTP genes in the Metazoa has been the subject of conflicting hypotheses derived from full or partial gene sequences and genomic organization in higher animals. Whole genome sequences have recently filled in some crucial gaps for the basal metazoan phyla Cnidaria and Porifera. Here we analyze the complete genome of Trichoplax adhaerens, representing the basal metazoan phylum Placozoa, for its set of ANTP class genes. The Trichoplax genome encodes representatives of Hox/ParaHox-like, NKL, and extended Hox genes. This repertoire possibly mirrors the condition of a hypothetical cnidarian-bilaterian ancestor. The evolution of the cnidarian and bilaterian ANTP gene repertoires can be deduced by a limited number of cis-duplications of NKL and “extended Hox” genes and the presence of a single ancestral “ProtoHox” gene. PMID:18716659
Garcia, Nelson; Messing, Joachim
2017-01-01
The TEL2, TTI1, and TTI2 proteins are co-chaperones for heat shock protein 90 (HSP90) that regulate the folding and maturation of phosphatidylinositol 3-kinase-related kinases (PIKKs). Together they are referred to as the TTT complex, and the genes that encode them are highly conserved from man to maize. TTT complex and PIKK genes exist mostly as single-copy genes in organisms where they have been characterized. Members of this interacting protein network in maize were identified and synteny analyses were performed to study their evolution. As in other species, there is only one copy of each of these genes in maize, owing to loss of the duplicated copy created by ancient allotetraploidy. Moreover, the retained copies of the TTT complex and the PIKK genes tolerated extensive retrotransposon insertion in their introns that resulted in increased gene lengths and gene body methylation, without apparent effect on normal gene expression and function. The results raise an interesting question of whether the reversion to single copy was due to selection against deleterious unbalanced gene duplications between members of the complex, as predicted by the gene balance hypothesis, or due to neutral loss of extra copies. Uneven alteration of dosage, either by adding extra copies or by modulating gene expression of complex members, is proposed as a means to investigate whether the data support the gene balance hypothesis.
Li, Hongxia; Yu, Juhua; Li, Jianlin; Tang, Yongkai; Yu, Fan; Zhou, Jie; Yu, Wenjuan
2016-04-01
Interleukin-17 (IL-17) plays an important role in inflammation and host defense in mammals. In this study, we identified two duplicated IL-17A/F2 genes in the common carp (Cyprinus carpio), ccIL-17A/F2a and ccIL-17A/F2b; the putative encoded proteins contain 140 amino acids (aa) with conserved IL-17 family motifs. Expression analysis revealed high constitutive expression of ccIL-17A/F2s in mucosal tissues, including gill, skin and intestine, and their expression could be induced by Aeromonas hydrophila, suggesting a potential role in mucosal immunity. Recombinant ccIL-17A/F2a protein (rccIL-17A/F2a) produced in Escherichia coli could induce the expression of proinflammatory cytokines (IL-1β) and the antimicrobial peptides S100A1, S100A10a and S100A10b in the primary kidney in a dose- and time-dependent manner. These findings suggest that ccIL-17A/F2 plays an important role in both proinflammatory responses and innate immunity. The two duplicated ccIL-17A/F2s showed different expression levels, with ccIL-17A/F2a higher than ccIL-17A/F2b; comparison of the two 5' regulatory regions indicated that the distance from the predicted promoter to the transcriptional start site (TSS) and the putative transcription factor binding sites (TFBS) differed. Promoter activity of ccIL-17A/F2a was 2.5 times that of ccIL-17A/F2b, which is consistent with the expression results for the two genes. These results suggest that mutations in the 5' regulatory region contributed to the differentiation of the duplicated genes. To our knowledge, this is the first report to analyze the 5' regulatory region of piscine IL-17 family genes.
Jin, Jing; Jin, Xiaolei; Jiang, Haiyang; Yan, Hanwei; Cheng, Beijiu
2014-01-01
Whole-genome duplication events (polyploidy events) and gene loss events have played important roles in the evolution of legumes. Here we show that the vast majority of Hsf gene duplications resulted from whole genome duplication events rather than tandem duplication, and significant differences in gene retention exist between species. By searching for intraspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that genome duplications accounted for 42 of 46 Hsf-containing segments in Glycine max, while paired segments were rarely identified in Lotus japonicus, Medicago truncatula and Cajanus cajan. However, by comparing interspecies microsynteny, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions shared ancient genome duplication with Hsf genes in Glycine max, but more than half of the copies of these genes were lost. On the other hand, the Glycine max Hsf gene family retained approximately 75% and 84% of duplicated genes produced from the ancient genome duplication and recent Glycine-specific genome duplication, respectively. Continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max. Expression analysis of the Hsf genes in Lotus japonicus revealed their putative involvement in multiple tissue-/developmental stages and responses to various abiotic stimuli. This study traces the evolution of Hsf genes in legume species and demonstrates that the rates of gene gain and loss are far from equilibrium in different species. PMID:25047803
Fares, Mario A; Sabater-Muñoz, Beatriz; Toft, Christina
2017-05-01
Gene duplication generates new genetic material, which has been shown to lead to major innovations in unicellular and multicellular organisms. A whole-genome duplication occurred in the ancestor of Saccharomyces yeast species but 92% of duplicates returned to single-copy genes shortly after duplication. The persisting duplicated genes in Saccharomyces led to the origin of major metabolic innovations, which have been the source of the unique biotechnological capabilities of the baker's yeast Saccharomyces cerevisiae. What factors have determined the fate of duplicated genes remains unknown. Here, we report the first demonstration that the local genome mutation and transcription rates determine the fate of duplicates. We show, for the first time, a preferential location of duplicated genes in the mutational and transcriptional hotspots of the S. cerevisiae genome. The mechanism of duplication matters, with whole-genome duplicates exhibiting different preservation trends compared to small-scale duplicates. Genome mutational and transcriptional hotspots are rich in duplicates with large repetitive promoter elements. Saccharomyces cerevisiae shows more tolerance to deleterious mutations in duplicates with repetitive promoter elements, which in turn exhibit higher transcriptional plasticity against environmental perturbations. Our data demonstrate that the genome traps duplicates through the accelerated regulatory and functional divergence of their gene copies, providing a source of novel adaptations in yeast.
Conserved structure and expression of hsp70 paralogs in teleost fishes.
Metzger, David C H; Hemmer-Hansen, Jakob; Schulte, Patricia M
2016-06-01
The cytosolic 70 kDa heat shock proteins (Hsp70s) are widely used as biomarkers of environmental stress in ecological and toxicological studies in fish. Here we analyze teleost genome sequences to show that two genes encoding inducible hsp70s (hsp70-1 and hsp70-2) are likely present in all teleost fish. Phylogenetic and synteny analyses indicate that hsp70-1 and hsp70-2 are distinct paralogs that originated prior to the diversification of the teleosts. The promoters of both genes contain a TATA box and conserved heat shock elements (HSEs), but unlike mammalian HSP70s, both genes contain an intron in the 5' UTR. The hsp70-2 gene has undergone tandem duplication in several species. In addition, many other teleost genome assemblies have multiple copies of hsp70-2 present on separate, small, genomic scaffolds. To verify that these represent poorly assembled tandem duplicates, we cloned the genomic region surrounding hsp70-2 in Fundulus heteroclitus and showed that the hsp70-2 gene copies that are on separate scaffolds in the genome assembly are arranged as tandem duplicates. Real-time quantitative PCR of F. heteroclitus genomic DNA indicates that four copies of the hsp70-2 gene are likely present in the F. heteroclitus genome. Comparison of expression patterns in F. heteroclitus and Gasterosteus aculeatus demonstrates that hsp70-2 has a higher fold increase than hsp70-1 following heat shock in gill but not in muscle tissue, revealing a conserved difference in expression patterns between isoforms and tissues. These data indicate that ecological and toxicological studies using hsp70 as a biomarker in teleosts should take this complexity into account.
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
2017-01-01
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. PMID:28981708
Functional requirements driving the gene duplication in 12 Drosophila species.
Zhong, Yan; Jia, Yanxiao; Gao, Yang; Tian, Dacheng; Yang, Sihai; Zhang, Xiaohui
2013-08-15
Gene duplication supplies the raw materials for novel gene functions, and many gene families that have arisen from duplication experience adaptive evolution. Most studies of young duplicates have focused on mammals, especially humans, whereas reports describing their genome-wide evolutionary patterns across the closely related Drosophila species are rare. The 12 sequenced Drosophila genomes provide the opportunity to address this issue. In our study, 3,647 young duplicate gene families were identified across the 12 Drosophila species, and three types of expansions (species-specific, lineage-specific and complex) were detected in these gene families. Our data showed that the species-specific young duplicate genes predominated (86.6%) over the other two types. Interestingly, independent species-specific expansions in the same gene family were observed in many species, in some cases in 11 or all 12 Drosophila species. Our data also showed that the functional bias observed in these young duplicate genes was mainly related to responses to environmental stimuli and biotic stresses. This study reveals the evolutionary patterns of young duplicates across 12 Drosophila species on a genomic scale. Our results suggest that convergent evolution acts on young duplicate genes after species differentiation and that adaptive evolution may play an important role in duplicate genes for adaptation to ecological factors and environmental changes in Drosophila.
Elphick, Maurice R; Rowe, Matthew L
2009-04-01
The myoactive neuropeptide NGIWYamide was originally isolated from the holothurian (sea cucumber) Apostichopus japonicus but there is evidence that NGIWYamide-like peptides also occur in other echinoderms. Here we report the discovery of a gene in the sea urchin Strongylocentrotus purpuratus that encodes two copies of an NGIWYamide-like peptide: Asn-Gly-Phe-Phe-Phe-NH2, or NGFFFamide. Interestingly, the C-terminal region of the NGFFFamide precursor shares sequence similarity with neurophysins, carrier proteins hitherto uniquely associated with precursors of vasopressin/oxytocin-like neuropeptides. Thus, the NGFFFamide precursor is the first neurophysin-containing neuropeptide precursor to be discovered that does not contain a vasopressin/oxytocin-like peptide. However, it remains to be determined whether neurophysin acts as a carrier protein for NGFFFamide. The S. purpuratus genome also contains a gene encoding a precursor comprising a neurophysin polypeptide and 'echinotocin' (CFISNCPKGamide) - the first vasopressin/oxytocin-like peptide to be identified in an echinoderm. Therefore, in S. purpuratus there are two genes encoding precursors that have a neurophysin domain but which encode neuropeptides that are structurally unrelated. Furthermore, both NGFFFamide and echinotocin cause contraction of tube foot and oesophagus preparations from the sea urchin Echinus esculentus, consistent with the myoactivity of NGIWYamide in sea cucumbers and the myoactivity of vasopressin/oxytocin-like peptides in other animal phyla. Presumably the NGFFFamide precursor acquired its neurophysin domain following partial or complete duplication of a gene encoding a vasopressin/oxytocin-like peptide, but it remains to be determined when in evolutionary history this occurred.
Soybean kinome: functional classification and gene expression patterns
Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek
2015-01-01
The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
Characterisation of single domain ATP-binding cassette protein homologues of Theileria parva.
Kibe, M K; Macklin, M; Gobright, E; Bishop, R; Urakawa, T; ole-MoiYoi, O K
2001-09-01
Two distinct genes encoding single domain, ATP-binding cassette transport protein homologues of Theileria parva were cloned and sequenced. Neither of the genes is tandemly duplicated. One gene, TpABC1, encodes a predicted protein of 593 amino acids with an N-terminal hydrophobic domain containing six potential membrane-spanning segments. A single discontinuous ATP-binding element was located in the C-terminal region of TpABC1. The second gene, TpABC2, also contains a single C-terminal ATP-binding motif. Copies of TpABC2 were present at four loci in the T. parva genome on three different chromosomes. TpABC1 exhibited allelic polymorphism between stocks of the parasite. Comparison of cDNA and genomic sequences revealed that TpABC1 contained seven short introns, between 29 and 84 bp in length. The full-length TpABC1 protein was expressed in insect cells using the baculovirus system. Application of antibodies raised against the recombinant antigen to western blots of T. parva piroplasm lysates detected an 85 kDa protein in this life-cycle stage.
Pineda, Sandy S; Sollod, Brianna L; Wilson, David; Darling, Aaron; Sunagar, Kartik; Undheim, Eivind A B; Kely, Laurence; Antunes, Agostinho; Fry, Bryan G; King, Glenn F
2014-03-05
Spiders have evolved pharmacologically complex venoms that serve to rapidly subdue prey and deter predators. The major toxic factors in most spider venoms are small, disulfide-rich peptides. While there is abundant evidence that snake venoms evolved by recruitment of genes encoding normal body proteins followed by extensive gene duplication accompanied by explosive structural and functional diversification, the evolutionary trajectory of spider-venom peptides is less clear. Here we present evidence of a spider-toxin superfamily encoding a high degree of sequence and functional diversity that has evolved via accelerated duplication and diversification of a single ancestral gene. The peptides within this toxin superfamily are translated as prepropeptides that are posttranslationally processed to yield the mature toxin. The N-terminal signal sequence, as well as the protease recognition site at the junction of the propeptide and mature toxin are conserved, whereas the remainder of the propeptide and mature toxin sequences are variable. All toxin transcripts within this superfamily exhibit a striking cysteine codon bias. We show that different pharmacological classes of toxins within this peptide superfamily evolved under different evolutionary selection pressures. Overall, this study reinforces the hypothesis that spiders use a combinatorial peptide library strategy to evolve a complex cocktail of peptide toxins that target neuronal receptors and ion channels in prey and predators. We show that the ω-hexatoxins that target insect voltage-gated calcium channels evolved under the influence of positive Darwinian selection in an episodic fashion, whereas the κ-hexatoxins that target insect calcium-activated potassium channels appear to be under negative selection. A majority of the diversifying sites in the ω-hexatoxins are concentrated on the molecular surface of the toxins, thereby facilitating neofunctionalisation leading to new toxin pharmacology.
A local duplication of the Melanocortin receptor 1 locus in Astyanax
Gross, Joshua B.; Weagley, James; Stahl, Bethany A.; Ma, Li; Espinasa, Luis; McGaugh, Suzanne E.
2017-01-01
In this study, we report evidence of a novel duplication of Melanocortin receptor 1 (Mc1r) in the cavefish genome. This locus was discovered following the observation of excessive allelic diversity in a ~820 bp fragment of Mc1r amplified via degenerate PCR from a natural population of Astyanax aeneus fish from Guerrero, Mexico. The cavefish genome reveals the presence of two closely related Mc1r open reading frames separated by a 1.46 kb intergenic region. One open reading frame corresponds to the previously reported Mc1r receptor, and the other open reading frame (duplicate copy) is 975 bp in length, encoding a receptor of 325 amino acids. Sequence similarity analyses position both copies in the syntenic region of the single Mc1r locus in 16 representative craniate genomes spanning bony fish (including Astyanax) to mammals, suggesting we discovered tandem duplicates of this important gene. The two Mc1r copies share ~89% sequence similarity and, within Astyanax, are more similar to one another than to other melanocortin family members. Future studies will inform the precise functional significance of the duplicated Mc1r locus, and whether this novel copy number variant may have adaptive significance for the Astyanax lineage. PMID:28738163
Shang, Shuai; Zhong, Huaming; Wu, Xiaoyang; Wei, Qinguo; Zhang, Huanxin; Chen, Jun; Chen, Yao; Tang, Xuexi; Zhang, Honghai
2018-04-01
Toll-like receptors (TLRs) encoded by the TLR multigene family play an important role in initial pathogen recognition in vertebrates. Among the TLRs, TLR2 and TLR4 may be of particular importance to reptiles. In order to study the evolutionary patterns and structural characteristics of TLRs, we explored the available genomes of several representative members of reptiles. In total, 25 TLR2 genes and 19 TLR4 genes from reptiles were obtained in this study. Phylogenetic results showed that TLR2 gene duplication occurred in several species. Evolutionary analysis by at least two methods identified 30 and 13 common positively selected codons in TLR2 and TLR4, respectively. Most positively selected sites of TLR2 and TLR4 were located in the leucine-rich repeats (LRRs). Branch model analysis showed that TLR2 genes were under different evolutionary forces in reptiles, while the TLR4 genes showed no significant selection pressure. The different evolutionary adaptation of TLR2 and TLR4 among the reptiles might be due to their different functions in recognizing bacteria. Overall, we explored the structure and evolution of TLR2 and TLR4 genes in reptiles for the first time. Our study revealed valuable information regarding TLR2 and TLR4 in reptiles, and provided novel insights into the conservation concern of natural populations.
Gu, Xun; Wang, Yufeng; Gu, Jianying
2002-06-01
The classical (two-round) hypothesis of vertebrate genome duplication proposes two successive whole-genome duplication(s) (polyploidizations) predating the origin of fishes, a view now being seriously challenged. As the debate largely concerns the relative merits of the 'big-bang mode' theory (large-scale duplication) and the 'continuous mode' theory (constant creation by small-scale duplications), we tested whether a significant proportion of paralogous genes in the contemporary human genome was indeed generated in the early stage of vertebrate evolution. After an extensive search of major databases, we dated 1,739 gene duplication events from the phylogenetic analysis of 749 vertebrate gene families. We found a pattern characterized by two waves (I, II) and an ancient component. Wave I represents a recent gene family expansion by tandem or segmental duplications, whereas wave II, a rapid paralogous gene increase in the early stage of vertebrate evolution, supports the idea of genome duplication(s) (the big-bang mode). Further analysis indicated that large- and small-scale gene duplications both make a significant contribution during the early stage of vertebrate evolution to build the current hierarchy of the human proteome.
Detecting long tandem duplications in genomic sequences.
Audemard, Eric; Schiex, Thomas; Faraut, Thomas
2012-05-08
Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. In this paper, we introduce ReD Tandem, software that uses a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non-coding elements such as pseudo-genes or RNA genes. ReD Tandem allows identification of large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein-gene-based approach, which ignores duplications involving non-coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.
Sullivan, James A.; Gray, John C.
2000-01-01
The pea lip1 (light-independent photomorphogenesis1) mutant shows many of the characteristics of light-grown development when grown in continuous darkness. To investigate the identity of LIP1, cDNAs encoding the pea homolog of COP1, a repressor of photomorphogenesis identified in Arabidopsis, were isolated from wild-type and lip1 pea seedlings. lip1 seedlings contained a wild-type COP1 transcript as well as a larger COP1′ transcript that contained an internal in-frame duplication of 894 bp. The COP1′ transcript segregated with the lip1 phenotype in F2 seedlings and could be translated in vitro to produce a protein of ∼100 kD. The COP1 gene in lip1 peas contained a 7.5-kb duplication, consisting of exons 1 to 7 of the wild-type sequence, located 2.5 kb upstream of a region of genomic DNA identical to the wild-type COP1 DNA sequence. Transcription and splicing of the mutant COP1 gene was predicted to produce the COP1′ transcript, whereas transcription from an internal promoter in the 2.5-kb region of DNA located between the duplicated regions of COP1 would produce the wild-type COP1 transcript. The presence of small quantities of wild-type COP1 transcripts may reduce the severity of the phenotype produced by the mutated COP1′ protein. The genomic DNA sequences of the COP1 gene from wild-type and lip1 peas and the cDNA sequences of COP1 and COP1′ transcripts have been submitted to the EMBL database under the EMBL accession numbers AJ276591, AJ276592, AJ289773, and AJ289774, respectively. PMID:11041887
Mutational analysis of the major soybean UreF paralogue involved in urease activation
Polacco, Joe C.; Hyten, David L.; Medeiros-Silva, Mônica; Sleper, David A.; Bilyeu, Kristin D.
2011-01-01
The soybean genome duplicated ∼14 and 45 million years ago and has many paralogous genes, including those in urease activation (emplacement of Ni and CO2 in the active site). Activation requires the UreD and UreF proteins, each encoded by two paralogues. UreG, a third essential activation protein, is encoded by the single-copy Eu3, and eu3 mutants lack activity of both urease isozymes. eu2 has the same urease-negative phenotype, consistent with Eu2 being a single-copy gene, possibly encoding a Ni carrier. Unexpectedly, two eu2 alleles co-segregated with missense mutations in the chromosome 2 UreF paralogue (Ch02UreF), suggesting lack of expression/function of Ch14UreF. However, Ch02UreF and Ch14UreF transcripts accumulate at the same level. Further, it had been shown that expression of the Ch14UreF ORF complemented a fungal ureF mutant. A third, nonsense (Q2*) allelic mutant, eu2-c, exhibited 5- to 10-fold more residual urease activity than missense eu2-a or eu2-b, though eu2-c should lack all Ch02UreF protein. It is hypothesized that low-level activation by Ch14UreF is ‘spoiled’ by the altered missense Ch02UreF proteins (‘epistatic dominant-negative’). In agreement with active ‘spoiling’ by eu2-b-encoded Ch02UreF (G31D), eu2-b/eu2-c heterozygotes had less than half the urease activity of eu2-c/eu2-c siblings. Ch02UreF (G31D) could spoil activation by Chr14UreF because of higher affinity for the activation complex, or because Ch02UreF (G31D) is more abundant than Ch14UreF. Here, the latter is favoured, consistent with a reported in-frame AUG in the 5' leader of Chr14UreF transcript. Translational inhibition could represent a form of ‘functional divergence’ of duplicated genes. PMID:21430294
Christensen, Kris A; Davidson, William S
2017-01-01
Salmonids (e.g. Atlantic salmon, Pacific salmon, and trouts) have a long legacy of genome duplication. In addition to three ancient genome duplications that all teleosts are thought to share, salmonids have had one additional genome duplication. We explored a methodology for untangling these duplications from each other to better understand them in Atlantic salmon. In this methodology, homeologous regions (paralogous/duplicated genomic regions originating from a whole genome duplication) from the most recent genome duplication were assumed to have duplicated genes at greater density and have greater sequence similarity. This assumption was used to differentiate duplicated gene pairs in Atlantic salmon that are either from the most recent genome duplication or from earlier duplications. From a comparison with multiple vertebrate species, it is clear that Atlantic salmon have retained more duplicated genes from ancient genome duplications than other vertebrates--often at higher density in the genome and containing fewer synonymous mutations. It may be that polysomic inheritance is the mechanism responsible for maintaining ancient gene duplicates in salmonids. Polysomic inheritance (when multiple chromosomes pair during meiosis) is thought to be relatively common in salmonids compared to other vertebrate species. These findings illuminate how genome duplications may not only increase the number of duplicated genes, but may also be involved in the maintenance of them from previous genome duplications as well.
Draft Genome of the Scarab Beetle Oryctes borbonicus on La Réunion Island
Meyer, Jan M.; Markov, Gabriel V.; Baskaran, Praveen; Herrmann, Matthias; Sommer, Ralf J.; Rödelsperger, Christian
2016-01-01
Beetles represent the largest insect order and they display extreme morphological, ecological and behavioral diversity, which makes them ideal models for evolutionary studies. Here, we present the draft genome of the scarab beetle Oryctes borbonicus, which has a more basal phylogenetic position than the two previously sequenced pest species Tribolium castaneum and Dendroctonus ponderosae providing the potential for sequence polarization. Oryctes borbonicus is endemic to La Réunion, an island located in the Indian Ocean, and is the host of the nematode Pristionchus pacificus, a well-established model organism for integrative evolutionary biology. At 518 Mb, the O. borbonicus genome is substantially larger and encodes more genes than T. castaneum and D. ponderosae. We found that only 25% of the predicted genes of O. borbonicus are conserved as single copy genes across the nine investigated insect genomes, suggesting substantial gene turnover within insects. Even within beetles, up to 21% of genes are restricted to only one species, whereas most other genes have undergone lineage-specific duplications and losses. We illustrate lineage-specific duplications using detailed phylogenetic analysis of two gene families. This study serves as a reference point for insect/coleopteran genomics, although its original motivation was to find evidence for potential horizontal gene transfer (HGT) between O. borbonicus and P. pacificus. The latter was previously shown to be the recipient of multiple horizontally transferred genes including some genes from insect donors. However, our study failed to provide any clear evidence for additional HGTs between the two species. PMID:27289092
Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F
1985-01-01
Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover, a small duplication occurred at the very end of the coding region of the L1b gene, which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively, the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512
Tester, David J; Benton, Amber J; Train, Laura; Deal, Barbara; Baudhuin, Linnea M; Ackerman, Michael J
2010-10-15
Long QT syndrome (LQTS) is a cardiac channelopathy associated with syncope, seizures, and sudden death. Approximately 75% of LQTS is due to mutations in genes encoding for 3 cardiac ion channel α-subunits (LQT1 to LQT3). However, traditional mutational analyses have limited detection capabilities for atypical mutations such as large gene rearrangements. We set out to determine the prevalence and spectrum of large deletions/duplications in the major LQTS-susceptibility genes in unrelated patients who were mutation negative after point mutation analysis of LQT1- to LQT12-susceptibility genes. Forty-two unrelated, clinically strong LQTS patients were analyzed using multiplex ligation-dependent probe amplification, a quantitative fluorescent technique for detecting multiple exon deletions and duplications. The SALSA multiplex ligation-dependent probe amplification LQTS kit from MRC-Holland was used to analyze the 3 major LQTS-associated genes, KCNQ1, KCNH2, and SCN5A, and the 2 minor genes, KCNE1 and KCNE2. Overall, 2 gene rearrangements were found in 2 of 42 unrelated patients (4.8%, confidence interval 1.7 to 11). A deletion of KCNQ1 exon 3 was identified in a 10-year-old Caucasian boy with a corrected QT duration of 660 ms, a personal history of exercise-induced syncope, and a family history of syncope. A deletion of KCNQ1 exon 7 was identified in a 17-year-old Caucasian girl with a corrected QT duration of 480 ms, a personal history of exercise-induced syncope, and a family history of sudden cardiac death. In conclusion, because nearly 5% of patients with genetically elusive LQTS had large genomic rearrangements involving the canonical LQTS-susceptibility genes, reflex genetic testing to investigate genomic rearrangements may be of clinical value. Copyright © 2010 Elsevier Inc. All rights reserved.
Xu, Zongda; Zhang, Qixiang; Sun, Lidan; Du, Dongliang; Cheng, Tangren; Pan, Huitang; Yang, Weiru; Wang, Jia
2014-10-01
MADS-box genes encode transcription factors that play crucial roles in plant development, especially in flower and fruit development. To gain insight into this gene family in Prunus mume, an important ornamental and fruit plant in East Asia, and to elucidate their roles in flower organ determination and fruit development, we performed a genome-wide identification, characterisation and expression analysis of MADS-box genes in this Rosaceae tree. In this study, 80 MADS-box genes were identified in P. mume and categorised into MIKC, Mα, Mβ, Mγ and Mδ groups based on gene structures and phylogenetic relationships. The MIKC group could be further classified into 12 subfamilies. The FLC subfamily was absent in P. mume and the six tandemly arranged DAM genes might experience a species-specific evolution process in P. mume. The MADS-box gene family might experience an evolution process from MIKC genes to Mδ genes to Mα, Mβ and Mγ genes. The expression analysis suggests that P. mume MADS-box genes have diverse functions in P. mume development and the functions of duplicated genes diverged after the duplication events. In addition to its involvement in the development of female gametophytes, type I genes also play roles in male gametophytes development. In conclusion, this study adds to our understanding of the roles that the MADS-box genes played in flower and fruit development and lays a foundation for selecting candidate genes for functional studies in P. mume and other species. Furthermore, this study also provides a basis to study the evolution of the MADS-box family.
Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin
2014-01-01
Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog–paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. PMID:25169981
Positive Selection in Rapidly Evolving Plastid–Nuclear Enzyme Complexes
Rockenbach, Kate; Havird, Justin C.; Monroe, J. Grey; Triant, Deborah A.; Taylor, Douglas R.; Sloan, Daniel B.
2016-01-01
Rates of sequence evolution in plastid genomes are generally low, but numerous angiosperm lineages exhibit accelerated evolutionary rates in similar subsets of plastid genes. These genes include clpP1 and accD, which encode components of the caseinolytic protease (CLP) and acetyl-coA carboxylase (ACCase) complexes, respectively. Whether these extreme and repeated accelerations in rates of plastid genome evolution result from adaptive change in proteins (i.e., positive selection) or simply a loss of functional constraint (i.e., relaxed purifying selection) is a source of ongoing controversy. To address this, we have taken advantage of the multiple independent accelerations that have occurred within the genus Silene (Caryophyllaceae) by examining phylogenetic and population genetic variation in the nuclear genes that encode subunits of the CLP and ACCase complexes. We found that, in species with accelerated plastid genome evolution, the nuclear-encoded subunits in the CLP and ACCase complexes are also evolving rapidly, especially those involved in direct physical interactions with plastid-encoded proteins. A massive excess of nonsynonymous substitutions between species relative to levels of intraspecific polymorphism indicated a history of strong positive selection (particularly in CLP genes). Interestingly, however, some species are likely undergoing loss of the native (heteromeric) plastid ACCase and putative functional replacement by a duplicated cytosolic (homomeric) ACCase. Overall, the patterns of molecular evolution in these plastid–nuclear complexes are unusual for anciently conserved enzymes. They instead resemble cases of antagonistic coevolution between pathogens and host immune genes. We discuss a possible role of plastid–nuclear conflict as a novel cause of accelerated evolution. PMID:27707788
Baker, Richard H; Narechania, Apurva; Johns, Philip M; Wilkinson, Gerald S
2012-08-19
Gene duplication provides an essential source of novel genetic material to facilitate rapid morphological evolution. Traits involved in reproduction and sexual dimorphism represent some of the fastest evolving traits in nature, and gene duplication is intricately involved in the origin and evolution of these traits. Here, we review genomic research on stalk-eyed flies (Diopsidae) that has been used to examine the extent of gene duplication and its role in the genetic architecture of sexual dimorphism. Stalk-eyed flies are remarkable because of the elongation of the head into long stalks, with the eyes and antenna laterally displaced at the ends of these stalks. Many species are strongly sexually dimorphic for eyespan, and these flies have become a model system for studying sexual selection. Using both expressed sequence tag and next-generation sequencing, we have established an extensive database of gene expression in the developing eye-antennal imaginal disc, the adult head and testes. Duplicated genes exhibit narrower expression patterns than non-duplicated genes, and the testes, in particular, provide an abundant source of gene duplication. Within somatic tissue, duplicated genes are more likely to be differentially expressed between the sexes, suggesting gene duplication may provide a mechanism for resolving sexual conflict. PMID:22777023
Identification of three duplicated Spin genes in medaka (Oryzias latipes).
Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang
2005-05-09
Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequence and function is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA BLAST analysis and phylogenetic analysis indicated that the three OlSpin genes arose by gene duplication. Furthermore, Western blot analysis revealed significant expression differences among the three OlSpins in different tissues and during embryogenesis in medaka, suggesting that sequence and functional divergence might have occurred among them during evolution.
New genes from old: asymmetric divergence of gene duplicates and the evolution of development.
Holland, Peter W H; Marlétaz, Ferdinand; Maeso, Ignacio; Dunwell, Thomas L; Paps, Jordi
2017-02-05
Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).
The Caenorhabditis chemoreceptor gene families.
Thomas, James H; Robertson, Hugh M
2008-10-06
Background: Chemoreceptor proteins mediate the first step in the transduction of environmental chemical stimuli, defining the breadth of detection and conferring stimulus specificity. Animal genomes contain families of genes encoding chemoreceptors that mediate taste, olfaction, and pheromone responses. The size and diversity of these families reflect the biology of chemoperception in specific species. Results: Based on manual curation and sequence comparisons among putative G-protein-coupled chemoreceptor genes in the nematode Caenorhabditis elegans, we identified approximately 1300 genes and 400 pseudogenes in the 19 largest gene families, most of which fall into larger superfamilies. In the related species C. briggsae and C. remanei, we identified most or all genes in each of the 19 families. For most families, C. elegans has the largest number of genes and C. briggsae the smallest number, suggesting changes in the importance of chemoperception among the species. Protein trees reveal family-specific and species-specific patterns of gene duplication and gene loss. The frequency of strict orthologs varies among the families, from just over 50% in two families to less than 5% in three families. Several families include large species-specific expansions, mostly in C. elegans and C. remanei. Conclusion: Chemoreceptor gene families in Caenorhabditis species are large and evolutionarily dynamic as a result of gene duplication and gene loss. These dynamics shape the chemoreceptor gene complements in Caenorhabditis species and define the receptor space available for chemosensory responses. To explain these patterns, we propose the gray pawn hypothesis: individual genes are of little significance, but the aggregate of a large number of diverse genes is required to cover a large phenotype space. PMID:18837995
Innate Immune Complexity in the Purple Sea Urchin: Diversity of the Sp185/333 System
Smith, L. Courtney
2012-01-01
The California purple sea urchin, Strongylocentrotus purpuratus, is a long-lived echinoderm with a complex and sophisticated innate immune system. There are several large gene families that function in immunity in this species including the Sp185/333 gene family that has ∼50 (±10) members. The family shows intriguing sequence diversity and encodes a broad array of diverse yet similar proteins. The genes have two exons of which the second encodes the mature protein and has repeats and blocks of sequence called elements. Mosaics of element patterns plus single nucleotide polymorphisms-based variants of the elements result in significant sequence diversity among the genes yet maintain similar structure among the members of the family. Sequence of a bacterial artificial chromosome insert shows a cluster of six, tightly linked Sp185/333 genes that are flanked by GA microsatellites. The sequences between the GA microsatellites in which the Sp185/333 genes and flanking regions are located, are much more similar to each other than are the sequences outside the microsatellites suggesting processes such as gene conversion, recombination, or duplication. However, close linkage does not correspond with greater sequence similarity compared to randomly cloned and sequenced genes that are unlikely to be linked. There are three segmental duplications that are bounded by GAT microsatellites and include three almost identical genes plus flanking regions. RNA editing is detectable throughout the mRNAs based on comparisons to the genes, which, in combination with putative post-translational modifications to the proteins, results in broad arrays of Sp185/333 proteins that differ among individuals. The mature proteins have an N-terminal glycine-rich region, a central RGD motif, and a C-terminal histidine-rich region. The Sp185/333 proteins are localized to the cell surface and are found within vesicles in subsets of polygonal and small phagocytes. The coelomocyte proteome shows full-length and truncated proteins, including some with missense sequence. Current results suggest that both native Sp185/333 proteins and a recombinant protein bind bacteria and are likely important in sea urchin innate immunity. PMID:22566951
Liu, Gangbiao; Zou, Yangyun; Cheng, Qiqun; Zeng, Yanwu; Gu, Xun; Su, Zhixi
2014-04-01
The age distribution of gene duplication events within the human genome exhibits two waves of duplications along with an ancient component. However, because of functional constraint differences, genes in different functional categories might show dissimilar retention patterns after duplication. It is known that genes in some functional categories are highly duplicated in the early stage of vertebrate evolution. However, the correlations of the age distribution pattern of gene duplication between the different functional categories are still unknown. To investigate this issue, we developed a robust pipeline to date the gene duplication events in the human genome. We successfully estimated about three-quarters of the duplication events within the human genome, along with the age distribution pattern in each Gene Ontology (GO) slim category. We found that some GO slim categories show different distribution patterns when compared to the whole genome. Further hierarchical clustering of the GO slim functional categories enabled grouping into two main clusters. We found that human genes located in the duplicated copy number variant regions, whose duplicate genes have not been fixed in the human population, were mainly enriched in the groups with a high proportion of recently duplicated genes. Moreover, we used a phylogenetic tree-based method to date the age of duplications in three signaling-related gene superfamilies: transcription factors, protein kinases and G-protein coupled receptors. These superfamilies were expressed in different subcellular localizations. They showed a similar age distribution as the signaling-related GO slim categories. We also compared the differences between the age distributions of gene duplications in multiple subcellular localizations. We found that the distribution patterns of the major subcellular localizations were similar to that of the whole genome. This study revealed the whole picture of the evolution patterns of gene functional categories in the human genome.
Consensus properties and their large-scale applications for the gene duplication problem.
Moon, Jucheol; Lin, Harris T; Eulenstein, Oliver
2016-06-01
Solving the gene duplication problem is a classical approach for species tree inference from gene trees that are confounded by gene duplications. This problem takes a collection of gene trees and seeks a species tree that implies the minimum number of gene duplications. Wilkinson et al. posed the conjecture that the gene duplication problem satisfies the desirable Pareto property for clusters. That is, for every instance of the problem, all clusters that are commonly present in the input gene trees of this instance, called strict consensus, will also be found in every solution to this instance. We prove that this conjecture does not generally hold. Despite this negative result we show that the gene duplication problem satisfies a weaker version of the Pareto property where the strict consensus is found in at least one solution (rather than all solutions). This weaker property contributes to our design of an efficient scalable algorithm for the gene duplication problem. We demonstrate the performance of our algorithm in analyzing large-scale empirical datasets. Finally, we utilize the algorithm to evaluate the accuracy of standard heuristics for the gene duplication problem using simulated datasets.
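The objective described in the abstract above (a species tree implying the minimum number of gene duplications) can be illustrated with a small sketch. It is not the authors' algorithm: it assumes rooted binary trees written as nested tuples of species names, counts duplications with the standard least-common-ancestor (LCA) mapping, and picks the best species tree by brute force over a hand-supplied candidate list. All function names (`clusters`, `lca_cluster`, `count_duplications`, `best_species_tree`) are hypothetical and introduced here only for illustration.

```python
def clusters(tree):
    """Leaf set (cluster) of a tree given as nested tuples of species names."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(clusters(child) for child in tree))


def lca_cluster(species_tree, leaves):
    """Cluster of the lowest species-tree node that contains all of `leaves`."""
    node = species_tree
    while not isinstance(node, str):
        for child in node:
            if leaves <= clusters(child):
                node = child  # a child still covers every leaf: descend
                break
        else:
            break  # no single child covers the leaves: this node is the LCA
    return clusters(node)


def count_duplications(gene_tree, species_tree):
    """Number of gene-tree nodes inferred as duplications under LCA mapping.

    A gene-tree node is counted as a duplication when it maps to the same
    species-tree node as at least one of its children.
    """
    def visit(node):
        if isinstance(node, str):
            return lca_cluster(species_tree, frozenset([node])), 0
        mapped, dups = [], 0
        for child in node:
            child_map, child_dups = visit(child)
            mapped.append(child_map)
            dups += child_dups
        here = lca_cluster(species_tree, frozenset().union(*mapped))
        if any(m == here for m in mapped):
            dups += 1
        return here, dups

    return visit(gene_tree)[1]


def best_species_tree(gene_trees, candidates):
    """Brute force: return (cost, tree) for the cheapest candidate species tree."""
    scored = [(sum(count_duplications(g, s) for g in gene_trees), s)
              for s in candidates]
    return min(scored, key=lambda pair: pair[0])
```

For three species there are only three rooted topologies, so exhaustive search is feasible; for realistic inputs the number of candidate species trees grows super-exponentially, which is why heuristics and the consensus-based pruning discussed in the abstract matter.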
PTGBase: an integrated database to study tandem duplicated genes in plants.
Yu, Jingyin; Ke, Tao; Tehrim, Sadia; Sun, Fengming; Liao, Boshou; Hua, Wei
2015-01-01
Tandem duplication is a wide-spread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions will lead to the expansion of gene families and bring increase of gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically present the integration of large amounts of information about publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available. © The Author(s) 2015. Published by Oxford University Press.
Chen, Yuan; Ding, Yun; Zhang, Zuming; Wang, Wen; Chen, Jun-Yuan; Ueno, Naoto; Mao, Bingyu
2011-12-20
The evolution of the central nervous system (CNS) is one of the most striking changes during the transition from invertebrates to vertebrates. As a major source of genetic novelties, gene duplication might play an important role in the functional innovation of vertebrate CNS. In this study, we focused on a group of CNS-biased genes that duplicated during early vertebrate evolution. We investigated the tempo-spatial expression patterns of 33 duplicate gene families and their orthologs during the embryonic development of the vertebrate Xenopus laevis and the cephalochordate Branchiostoma belcheri. Almost all the identified duplicate genes are differentially expressed in the CNS in Xenopus embryos, and more than 50% and 30% duplicate genes are expressed in the telencephalon and mid-hindbrain boundary, respectively, which are mostly considered as two innovations in the vertebrate CNS. Interestingly, more than 50% of the amphioxus orthologs do not show apparent expression in the CNS in amphioxus embryos as detected by in situ hybridization, indicating that some of the vertebrate CNS-biased duplicate genes might arise from non-CNS genes in invertebrates. Our data accentuate the functional contribution of gene duplication in the CNS evolution of vertebrates and uncover an invertebrate non-CNS history for some vertebrate CNS-biased duplicate genes. Copyright © 2011. Published by Elsevier Ltd.
Shi, LiLi; Li, Bin; Zhou, Ting Ting; Wang, Wei; Chan, Siuming F
2018-01-01
The recent use of RNA-Seq to study the transcriptomes of different species has helped identify a large number of new genes from different non-model organisms. In this study, five distinctive transcripts encoding for neuropeptide members of the CHH/MIH/GIH family have been identified from the spermatophore transcriptome of the shrimp Fenneropenaeus merguiensis. The size of these transcripts ranged from 531 bp to 1771 bp. Four transcripts encoded different CHH-family subtype I members, and one transcript encoded a subtype II member. RT-PCR and RACE approaches have confirmed the expression of these genes in males. The low degree of amino acid sequence identity among these neuropeptides suggests that they may have different specific function(s). Results from a phylogenetic tree analysis indicated that these neuropeptides were likely derived from a common ancestor gene resulting from mutation and gene duplication. These CHH-family members could be grouped into distinct clusters, indicating a strong structural/functional relationship among these neuropeptides. Eyestalk removal caused a significant increase in the expression of transcript 32710 but decreases in expression for transcript 28020. These findings suggest the possible regulation of these genes by eyestalk factor(s). In summary, the results of this study would justify a re-evaluation of the more generalized and pleiotropic functions of these neuropeptides. This study also represents the first report on the cloning/identification of five CHH family neuropeptides in a non-neuronal tissue from a single crustacean species.
Li, Jun; Hou, Hongmin; Li, Xiaoqin; Xiang, Jiang; Yin, Xiangjing; Gao, Hua; Zheng, Yi; Bassett, Carole L; Wang, Xiping
2013-09-01
SQUAMOSA promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors and play many crucial roles in plant development. In this study, 27 SBP-box gene family members were identified in the apple (Malus × domestica Borkh.) genome, 15 of which were suggested to be putative targets of MdmiR156. Plant SBPs were classified into eight groups according to the phylogenetic analysis of SBP-domain proteins. Gene structure, gene chromosomal location and synteny analyses of MdSBP genes within the apple genome demonstrated that tandem and segmental duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of the SBP-box gene family in apple. Additionally, synteny analysis between apple and Arabidopsis indicated that several paired homologs of MdSBP and AtSPL genes were located in syntenic genomic regions. Tissue-specific expression analysis of MdSBP genes in apple demonstrated their diversified spatiotemporal expression patterns. Most MdmiR156-targeted MdSBP genes, which had relatively high transcript levels in stems, leaves, apical buds and some floral organs, exhibited a more differential expression pattern than most MdmiR156-nontargeted MdSBP genes. Finally, expression analysis of MdSBP genes in leaves upon various plant hormone treatments showed that many MdSBP genes were responsive to different plant hormones, indicating that MdSBP genes may be involved in responses to hormone signaling during stress or in apple development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Fan, Sheng; Zhang, Dong; Xing, Libo; Qi, Siyan; Du, Lisha; Wu, Haiqin; Shao, Hongxia; Li, Youmei; Ma, Juanjuan; Han, Mingyu
2017-08-01
Although INDETERMINATE DOMAIN (IDD) genes encoding specific plant transcription factors have important roles in plant growth and development, little is known about apple IDD (MdIDD) genes and their potential functions in flower induction. In this study, we identified 20 putative IDD genes in apple and named them according to their chromosomal locations. All identified MdIDD genes shared a conserved IDD domain. A phylogenetic analysis separated MdIDDs and other plant IDD genes into four groups. Bioinformatic analysis of chemical characteristics, gene structure, and prediction of protein-protein interactions demonstrated the functional and structural diversity of MdIDD genes. To further uncover their potential functions, we performed analysis of tandem, synteny, and gene duplications, which indicated several paired homologs of IDD genes between apple and Arabidopsis. Additionally, genome duplications also promoted the expansion and evolution of the MdIDD genes. Quantitative real-time PCR revealed that all the MdIDD genes showed distinct expression levels in five different tissues (stems, leaves, buds, flowers, and fruits). Furthermore, the expression levels of candidate MdIDD genes were also investigated in response to various circumstances, including GA treatment (decreased the flowering rate), sugar treatment (increased the flowering rate), alternate-bearing conditions, and two varieties with different flowering intensities. Some of them were affected by exogenous treatments and showed different expression patterns. Additionally, changes in response to alternate-bearing and different-flowering varieties of apple trees indicated that they were also responsive to flower induction. Taken together, our comprehensive analysis provided valuable information for further analysis of IDD genes aiming at flower induction.
Benito-Sanz, Sara; Belinchon-Martínez, Alberta; Aza-Carmona, Miriam; de la Torre, Carolina; Huber, Celine; González-Casado, Isabel; Ross, Judith L; Thomas, N Simon; Zinn, Andrew R; Cormier-Daire, Valerie; Heath, Karen E
2017-02-01
Short stature homeobox gene (SHOX) is located in the pseudoautosomal region 1 of the sex chromosomes. It encodes a transcription factor implicated in skeletal growth. Point mutations, deletions or duplications of SHOX or its transcriptional regulatory elements are associated with two skeletal dysplasias, Léri-Weill dyschondrosteosis (LWD) and Langer mesomelic dysplasia (LMD), as well as with a small proportion of idiopathic short stature (ISS) individuals. We have identified a total of 15 partial SHOX deletions and 13 partial SHOX duplications in LWD, LMD and ISS patients referred for routine SHOX diagnostics during a 10 year period (2004-2014). Subsequently, we characterized these alterations using MLPA (multiplex ligation-dependent probe amplification assay), fine-tiling array CGH (comparative genomic hybridization) and breakpoint PCR. Nearly half of the alterations have a distal or proximal breakpoint in intron 3. Evaluation of our data and that in the literature reveals that although partial deletions and duplications only account for a small fraction of SHOX alterations, intron 3 appears to be a breakpoint hotspot, with alterations arising by non-allelic homologous recombination, non-homologous end joining or other complex mechanisms.
Ma, Xiaodong; Ma, Jianchao; Fan, Di; Li, Chaofeng; Jiang, Yuanzhong; Luo, Keming
2016-01-01
Higher plants have been shown to experience a juvenile vegetative phase, an adult vegetative phase, and a reproductive phase during their postembryonic development, and distinct lateral organ morphologies have been observed at the different developmental stages. Populus euphratica, commonly known as a desert poplar, develops heteromorphic leaves during its development. The TCP family genes encode a group of plant-specific transcription factors involved in several aspects of plant development. In particular, TCPs have been shown to influence leaf size and shape in many herbaceous plants. However, whether these functions are conserved in woody plants remains unknown. In the present study, we carried out genome-wide identification of TCP genes in P. euphratica and P. trichocarpa, and 33 and 36 genes encoding putative TCP proteins were found, respectively. Phylogenetic analysis of the poplar TCPs together with Arabidopsis TCPs indicated a biased expansion of the TCP gene family via segmental duplications. In addition, our results have also shown a correlation between different expression patterns of several P. euphratica TCP genes and leaf shape variations, indicating their involvement in the regulation of leaf shape development. PMID:27605130
RNase 1 genes from the Family Sciuridae define a novel rodent ribonuclease cluster
Siegel, Steven J.; Percopo, Caroline M.; Dyer, Kimberly D.; Zhao, Wei; Roth, V. Louise; Mercer, John M.; Rosenberg, Helene F.
2009-01-01
The RNase A ribonucleases are a complex group of functionally diverse secretory proteins with conserved enzymatic activity. We have identified novel RNase 1 genes from four species of squirrel (order Rodentia, family Sciuridae). Squirrel RNase 1 genes encode typical RNase A ribonucleases, each with eight cysteines, a conserved CKXXNTF signature motif, and a canonical His12-Lys41-His119 catalytic triad. Two alleles encode Callosciurus prevostii RNase 1, which include a Ser18↔Pro, analogous to the sequence polymorphisms found among the RNase 1 duplications in the genome of Rattus exulans. Interestingly, although the squirrel RNase 1 genes are closely related to one another (77 to 95% amino acid sequence identity), the cluster as a whole is distinct and divergent from the clusters including RNase 1 genes from other rodent species. We examined the specific sites at which Sciuridae RNase 1s diverge from Muridae / Cricetidae RNase 1s, and determined that the divergent sites are located on the external surface, with complete sparing of the catalytic crevice. The full significance of these findings awaits a more complete understanding of the biological role of mammalian RNase 1s. PMID:19771477
Furihata, Hazuka Y; Suenaga, Kazuya; Kawanabe, Takahiro; Yoshida, Takanori; Kawabe, Akira
2016-10-13
PRC2 genes were analyzed for their number of gene duplications, dN/dS ratios and expression patterns among Brassicaceae and Gramineae species. Although both amino acid sequences and copy number of the PRC2 genes were generally well conserved in both Brassicaceae and Gramineae species, we observed that some rapidly evolving genes experienced duplications and expression pattern changes. After multiple duplication events, all but one or two of the duplicated copies tend to be silenced. Silenced copies were reactivated in the endosperm and showed ectopic expression in developing seeds. The results indicated that rapid evolution of some PRC2 genes is initially caused by a relaxation of selective constraint following the gene duplication events. Several loci could become maternally expressed imprinted genes and acquired functional roles in the endosperm.
Kumar, Kamal; Srivastava, Vikas; Purayannur, Savithri; Kaladhar, V Chandra; Cheruvu, Purnima Jaiswal; Verma, Praveen Kumar
2016-06-01
The WRKY genes have been identified as important transcriptional modulators predominantly during environmental stresses, but they also play critical roles at various stages of the plant life cycle. We report the identification of WRKY domain (WD)-encoding genes from galegoid clade legumes chickpea (Cicer arietinum L.) and barrel medic (Medicago truncatula). In total, 78 and 98 WD-encoding genes were found in chickpea and barrel medic, respectively. Comparative analysis suggests the presence of both conserved and unique WRKYs, and expansion of the WRKY family in M. truncatula primarily by tandem duplication. Exclusively found in galegoid legumes, CaWRKY16 and its orthologues encode a novel protein having a transmembrane and partial Exo70 domains flanking a group-III WD. The genomic region of galegoids having CaWRKY16 is more dynamic when compared with millettioids. In onion cells, fused CaWRKY16-EYFP showed punctate fluorescent signals in cytoplasm. The chickpea WRKY group-III genes were further characterized for their transcript level modulation during pathogenic stress and treatments of abscisic acid, jasmonic acid, and salicylic acid (SA) by real-time PCR. Differential regulation of genes was observed during Ascochyta rabiei infection and SA treatment. Characterization of the A. rabiei and SA inducible gene CaWRKY50 showed that it localizes to the plant nucleus, binds to the W-box, and has a C-terminal transactivation domain. Overexpression of CaWRKY50 in tobacco plants resulted in early flowering and senescence. The in-depth comparative account presented here for two legume WRKY genes will be of great utility in hastening functional characterization of crop legume WRKYs and will also help in characterization of Exo70Js. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Design of the Detector II: A CMOS Gate Array for the Study of Concurrent Error Detection Techniques.
1987-07-01
detection schemes and temporary failures. The circuit consists of six different adders with concurrent error detection schemes. The error detection... schemes are: simple duplication, duplication with functional dual implementation, duplication with different implementations, two-rail encoding...
Schnable, James C.; Pedersen, Brent S.; Subramaniam, Sabarinath; Freeling, Michael
2011-01-01
Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose-sensitive protein–protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein–protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose-sensitive protein–DNA interactions between the regulatory regions of CNS-rich genes (nicknamed bigfoot genes) and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy. PMID:22645525
Chang, Dan; Duda, Thomas F
2014-06-05
Predatory marine gastropods of the genus Conus exhibit substantial variation in venom composition both within and among species. Apart from mechanisms associated with extensive turnover of gene families and rapid evolution of genes that encode venom components ('conotoxins'), the evolution of distinct conotoxin expression patterns is an additional source of variation that may drive interspecific differences in the utilization of species' 'venom gene space'. To determine the evolution of expression patterns of venom genes of Conus species, we evaluated the expression of A-superfamily conotoxin genes of a set of closely related Conus species by comparing recovered transcripts of A-superfamily genes that were previously identified from the genomes of these species. We modified community phylogenetics approaches to incorporate phylogenetic history and disparity of genes and their expression profiles to determine patterns of venom gene space utilization. Less than half of the A-superfamily gene repertoire of these species is expressed, and only a few orthologous genes are coexpressed among species. Species exhibit substantially distinct expression strategies, with some expressing sets of closely related loci ('under-dispersed' expression of available genes) while others express sets of more disparate genes ('over-dispersed' expression). In addition, expressed genes show higher dN/dS values than either unexpressed or ancestral genes; this implies that expression exposes genes to selection and facilitates rapid evolution of these genes. Few recent lineage-specific gene duplicates are expressed simultaneously, suggesting that expression divergence among redundant gene copies may be established shortly after gene duplication. Our study demonstrates that venom gene space is explored differentially by Conus species, a process that effectively permits the independent and rapid evolution of venoms in these species.
Sukalo, Maja; Schäflein, Eva; Schanze, Ina; Everman, David B; Rezaei, Nima; Argente, Jesús; Lorda-Sanchez, Isabel; Deshpande, Charu; Takahashi, Tsutomu; Kleger, Alexander; Zenker, Martin
2017-11-01
Johanson-Blizzard syndrome (JBS, MIM #243800) is a very rare autosomal recessive disorder characterized by exocrine pancreatic insufficiency, nasal wing hypoplasia, hypodontia, and other abnormalities. JBS is caused by mutations of the UBR1 gene (MIM *605981), encoding a ubiquitin ligase of the N-end rule pathway. Molecular findings in a total of 65 unrelated patients with a clinical diagnosis of JBS who were previously screened for UBR1 mutations by Sanger sequencing were reviewed and cases lacking a disease-causing UBR1 mutation on either one or both alleles were included in this study. In order to discover mutations that are not detectable by Sanger sequencing, we designed a probe set for multiplex ligation-dependent probe amplification (MLPA) analysis of the UBR1 gene and analyzed the copy number status of all 47 UBR1 exons. Our previous studies using Sanger sequencing could detect mutations in 93.1% of 130 disease-associated UBR1 alleles. Six patients with a highly suggestive clinical diagnosis of JBS and unsolved genotype were included in this study. MLPA analysis detected six alleles harboring exon deletions/duplications, thereby raising the mutation detection rate in the entire cohort to 97.7% (127/130 alleles). We conclude that single or multi-exon deletions or duplications account for a substantial proportion of JBS-associated UBR1 mutations. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
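The detection rates quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the count of 121 alleles resolved by Sanger sequencing is not stated in the abstract and is inferred here from 93.1% of 130, and the function name `detection_rate` is a hypothetical helper.

```python
TOTAL_ALLELES = 130  # disease-associated UBR1 alleles reported in the abstract


def detection_rate(detected: int, total: int = TOTAL_ALLELES) -> float:
    """Return the mutation detection rate as a percentage, rounded to one decimal."""
    return round(100 * detected / total, 1)


# Assumption: 121 alleles is inferred from "93.1% of 130"; it is not given explicitly.
sanger_detected = 121
# Alleles with exon deletions/duplications found by MLPA, as reported.
mlpa_detected = 6

combined = sanger_detected + mlpa_detected

print(f"Sanger only:   {detection_rate(sanger_detected)}%")
print(f"Sanger + MLPA: {detection_rate(combined)}% ({combined}/{TOTAL_ALLELES})")
```

Under that assumption, the six MLPA-detected alleles account for the reported rise from 93.1% to 97.7% (127/130 alleles).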
Wang, Shan-Ning; Peng, Yong; Lu, Zi-Yun; Dhiloo, Khalid Hussain; Zheng, Yao; Shan, Shuang; Li, Rui-Jun; Zhang, Yong-Jun; Guo, Yu-Yuan
2016-07-01
Ionotropic receptors (IRs) mainly detect the acids and amines having great importance in many insect species, representing an ancient olfactory receptor family in insects. In the present work, we performed RNAseq of Microplitis mediator antennae and identified seventeen IRs. Full-length MmedIRs were cloned and sequenced. Phylogenetic analysis of the Hymenoptera IRs revealed that ten MmedIR genes encoded "antennal IRs" and seven encoded "divergent IRs". Among the IR25a orthologous groups, two genes, MmedIR25a.1 and MmedIR25a.2, were found in M. mediator. Gene structure analysis of MmedIR25a revealed a tandem duplication of IR25a in M. mediator. The tissue distribution and development specific expression of the MmedIR genes suggested that these genes showed a broad expression profile. Quantitative gene expression analysis showed that most of the genes are highly enriched in adult antennae, indicating the candidate chemosensory function of this family in parasitic wasps. Using immunocytochemistry, we confirmed that one co-receptor, MmedIR8a, was expressed in the olfactory sensory neurons. Our data will supply fundamental information for functional analysis of the IRs in parasitoid wasp chemoreception. Copyright © 2016 Elsevier Ltd. All rights reserved.
Zhu, Kaikai; Wang, Xiaolong; Liu, Jinyi; Tang, Jun; Cheng, Qunkang; Chen, Jin-Gui; Cheng, Zong-Ming Max
2018-01-01
Protein kinases (PKs) have evolved as the largest family of molecular switches that regulate protein activities associated with almost all essential cellular functions. Only a fraction of plant PKs, however, have been functionally characterized even in model plant species. In the present study, the entire grapevine kinome was identified and annotated using the most recent version of the grapevine genome. A total of 1168 PK-encoding genes were identified and classified into 20 groups and 121 families, with the RLK-Pelle group being the largest, with 872 members. The 1168 kinase genes were unevenly distributed over all 19 chromosomes, and both tandem and segmental duplications contributed to the expansion of the grapevine kinome, especially of the RLK-Pelle group. Ka/Ks values indicated that most of the tandem and segmental duplication events were under purifying selection. The grapevine kinome families exhibited different expression patterns during plant development and in response to various stress treatments, with many being coexpressed. The comprehensive annotation of grapevine kinase genes, their patterns of expression and coexpression, and the related information facilitate a more complete understanding of the roles of various grapevine kinases in growth and development, responses to abiotic stress, and evolutionary history.
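As a rough illustration of how Ka/Ks values are conventionally read, the sketch below labels a duplicate pair by its ratio: below 1 suggests purifying selection, near 1 neutral evolution, and above 1 positive selection. The function name `classify_selection`, the tolerance band around 1, and the example values are assumptions for illustration; they are not taken from the grapevine study.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Label a gene pair by its Ka/Ks ratio.

    ka  -- nonsynonymous substitutions per nonsynonymous site
    ks  -- synonymous substitutions per synonymous site
    tol -- hypothetical half-width of the band treated as "neutral"
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"


# Made-up example pairs (Ka, Ks), not values from the paper.
for ka, ks in [(0.10, 0.50), (0.50, 0.50), (0.60, 0.50)]:
    print(f"Ka={ka:.2f} Ks={ks:.2f} -> {classify_selection(ka, ks)}")
```

In practice, whether a ratio differs meaningfully from 1 is usually judged with a statistical test rather than a fixed tolerance; the fixed band here only keeps the example short.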
Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich
2017-01-01
Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues, including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects, seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S. antibioticus. PMID:28435299
Acharya, Debarun; Ghosh, Tapash C
2016-01-22
Gene duplication is a genetic mutation that creates functionally redundant gene copies that are initially relieved from selective pressures and may adapt themselves to new functions with time. The levels of gene duplication may vary from small-scale duplication (SSD) to whole genome duplication (WGD). Studies with yeast revealed ample differences between these duplicates: yeast WGD pairs were functionally more similar, less divergent in subcellular localization and contained a lesser proportion of essential genes. In this study, we explored the differences in evolutionary genomic properties of human SSD and WGD genes, with the identifiable human duplicates coming from the two rounds of whole genome duplication that occurred early in vertebrate evolution. We observed that these two groups of duplicates were also dissimilar in terms of their evolutionary and genomic properties. Interestingly, however, the pattern differs from that observed in yeast. The human WGDs were found to be functionally less similar, to diverge more in subcellular localization and to contain a higher proportion of essential genes than the SSDs, all of which are opposite to the findings in yeast. Additionally, we found that human WGDs were more divergent in their gene expression profiles, have higher multifunctionality, are more often associated with disease, and are evolutionarily more conserved than human SSDs. Our study suggests that human WGD duplicates are more divergent, which entails the adaptation of WGDs to novel and important functions that consequently leads to their evolutionary conservation.
Comparative inference of duplicated genes produced by polyploidization in soybean genome.
Yang, Yanmei; Wang, Jinpeng; Di, Jianyong
2013-01-01
Soybean (Glycine max) is one of the most important crop plants for providing protein and oil. It is important to investigate the soybean genome for its economic and scientific value. Polyploidy is a widespread and recursive phenomenon during plant evolution, and it can generate large numbers of duplicated genes, which are an important resource for genetic innovation. Improved sequence alignment criteria and statistical analysis are used to identify and characterize duplicated genes produced by polyploidization in soybean. Based on the collinearity method, duplicated genes by whole genome duplication account for 70.3% in soybean. From the statistical analysis of the molecular distances between duplicated genes, our study indicates that the whole genome duplication event occurred more than once in the genome evolution of soybean, which is often distributed near the ends of chromosomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Proia, R.L.
1988-03-01
Lysosomal β-hexosaminidase is composed of two structurally similar chains, α and β, that are the products of different genes. Mutations in either gene causing β-hexosaminidase deficiency result in the lysosomal storage disease GM2-gangliosidosis. To enable the investigation of the molecular lesions in this disorder and to study the evolutionary relationship between the α and β chains, the β-chain gene was isolated, and its organization was characterized. The β-chain coding region is divided into 14 exons distributed over ≈40 kilobases of DNA. Comparison with the α-chain gene revealed that 12 of the 13 introns interrupt the coding regions at homologous positions. This extensive sharing of intron placement demonstrates that the α and β chains evolved by way of the duplication of a common ancestor.
Gaines, William A.; Marcotte, William R.
2010-01-01
Spider dragline silk is primarily composed of proteins called major ampullate spidroins (MaSp) that consist of a large repeat array flanked by non-repetitive N- and C-terminal domains. Until recently, there has been little evidence for more than one gene encoding each of the two major spidroin silk proteins, MaSp1 and MaSp2. Here, we report the deduced N-terminal domain sequences for two distinct MaSp1 genes from Nephila clavipes (MaSp1A and MaSp1B) and for MaSp2. All three MaSp genes are co-expressed in the major ampullate gland. A search of the GenBank database also revealed two distinct MaSp1 C-terminal domain sequences. Sequencing confirmed that both MaSp1 genes are present in all seven Nephila clavipes spiders examined. The presence of nucleotide polymorphisms in these genes confirmed that MaSp1A and MaSp1B are distinct genetic loci and not merely alleles of the same gene. We have experimentally determined the transcription start sites for all three MaSp genes and established preliminary pairing between the two MaSp1 N- and C-terminal domains. Phylogenetic analysis of these new sequences and other published MaSp N- and C-terminal domain sequences illustrated that duplications of MaSp genes may be widespread among spider species. PMID:18828837
Salaneck, Erik; Ardell, David H; Larson, Earl T; Larhammar, Dan
2003-08-01
It has been debated whether the increase in gene number during early vertebrate evolution was due to multiple independent gene duplications or synchronous duplications of many genes. We describe here the cloning of three neuropeptide Y (NPY) receptor genes belonging to the Y1 subfamily in the spiny dogfish, Squalus acanthias, a cartilaginous fish. The three genes are orthologs of the mammalian subtypes Y1, Y4, and Y6, which are located in paralogous gene regions on different chromosomes in mammals. Thus, these genes arose by duplications of a chromosome region before the radiation of gnathostomes (jawed vertebrates). Estimates of duplication times from linearized trees, together with evidence from other gene families, support two rounds of chromosome duplications or tetraploidizations early in vertebrate evolution. The anatomical distribution of mRNA was determined by reverse-transcriptase PCR and was found to differ from that in mammals, suggesting differential functional diversification of the new gene copies during the radiation of the vertebrate classes.
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.
Anderson, Olin D; Huo, Naxin; Gu, Yong Q
2013-06-01
The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Schilf, Paul; Peter, Annette; Hurek, Thomas; Stick, Reimer
2014-07-01
Lamin proteins are found in all metazoans. Most non-vertebrate genomes, including those of the closest relatives of vertebrates, the cephalochordates and tunicates, encode only a single lamin. In teleosts and tetrapods the number of lamin genes has quadrupled. They can be divided into four sub-types, lmnb1, lmnb2, LIII, and lmna, each characterized by particular features and functional differentiations. Little is known about when during vertebrate evolution these features emerged. Lampreys belong to the Agnatha, the sister group of the Gnathostomata. They split off first within the vertebrate lineage. Analysis of the sea lamprey (Petromyzon marinus) lamin complement presented here identified three functional lamin genes, one encoding a lamin LIII, indicating that the characteristic gene structure of this subtype had been established prior to the agnathan/gnathostome split. Two other genes encode lamins for which orthology to gnathostome lamins cannot be designated. A search for lamin gene sequences in all vertebrate taxa for which sufficient sequence data are available reveals the evolutionary time frame in which specific features of the vertebrate lamins were established. Structural features characteristic of A-type lamins are not found in the lamprey genome. In contrast, lmna genes are present in all gnathostome lineages, suggesting that this gene evolved with the emergence of the gnathostomes. The analysis of lamin gene neighborhoods reveals noticeable similarities between the different vertebrate lamin genes, supporting the hypothesis that they emerged through two rounds of whole genome duplication, and makes clear that an orthologous relationship between a particular vertebrate paralog and lamins outside the vertebrate lineage cannot be established. Copyright © 2014 Elsevier GmbH. All rights reserved.
Shaheen, Ranad; Al Tala, Saeed; Almoisheer, Agaadir; Alkuraya, Fowzan S
2014-12-01
Primordial dwarfism (PD) is a heterogeneous clinical entity characterised by severe prenatal and postnatal growth deficiency. Despite the recent wave of disease gene discovery, the causal mutations in many PD patients remain unknown. Our objective was to describe a PD family that maps to a novel locus. We performed clinical, imaging and laboratory phenotyping of a new family with PD, followed by autozygosity mapping, linkage analysis and candidate gene sequencing. We describe a multiplex consanguineous Saudi family in which two full siblings and one half-sibling presented with classical features of Seckel syndrome in addition to optic nerve hypoplasia. We were able to map the phenotype to a single novel locus on 4q25-q28.2, in which we identified a five base-pair deletion in PLK4, which encodes a master regulator of centriole duplication. Our discovery further confirms the role of genes involved in centriole biology in the pathogenesis of PD. Published by the BMJ Publishing Group Limited.
Lengyel, Peter
2014-01-01
My Ph.D. thesis in the laboratory of Severo Ochoa at New York University School of Medicine in 1962 included the determination of the nucleotide compositions of codons specifying amino acids. The experiments were based on the use of random copolyribonucleotides (synthesized by polynucleotide phosphorylase) as messenger RNA in a cell-free protein-synthesizing system. At Yale University, where I joined the faculty, my co-workers and I first studied the mechanisms of protein synthesis. Thereafter, we explored the interferons (IFNs), which were discovered as antiviral defense agents but were revealed to be components of a highly complex multifunctional system. We isolated pure IFNs and characterized IFN-activated genes, the proteins they encode, and their functions. We concentrated on a cluster of IFN-activated genes, the p200 cluster, which arose by repeated gene duplications and which encodes a large family of highly multifunctional proteins. For example, the murine protein p204 can be activated in numerous tissues by distinct transcription factors. It modulates cell proliferation and the differentiation of a variety of tissues by binding to many proteins. p204 also inhibits the activities of wild-type Ras proteins and Ras oncoproteins. PMID:24867946
Gene family size conservation is a good indicator of evolutionary rates.
Chen, Feng-Chi; Chen, Chiuan-Jung; Li, Wen-Hsiung; Chuang, Trees-Juen
2010-08-01
The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
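The comparison described in this abstract can be illustrated with a minimal sketch. It assumes that each gene family is summarized by its copy number in each species and by a single evolutionary rate; the field names, the labels, and the toy numbers below are hypothetical and are not taken from the study.

```python
from statistics import median


def is_size_conserved(counts):
    """Return True if every species carries the same number of copies."""
    return len(set(counts.values())) == 1


def median_rate_by_class(families):
    """Group families by size-conservation class and return each class's median rate."""
    groups = {}
    for fam in families:
        counts = fam["counts"]
        if is_size_conserved(counts):
            # One copy everywhere is a singleton; more than one is a conserved duplicate family.
            copies = next(iter(counts.values()))
            label = "singleton" if copies == 1 else "conserved_duplicate"
        else:
            label = "nonconserved"
        groups.setdefault(label, []).append(fam["rate"])
    return {label: median(rates) for label, rates in groups.items()}


# Toy data invented for illustration only (not results from the paper).
families = [
    {"counts": {"human": 1, "chimp": 1, "macaque": 1}, "rate": 0.20},
    {"counts": {"human": 2, "chimp": 2, "macaque": 2}, "rate": 0.10},
    {"counts": {"human": 2, "chimp": 3, "macaque": 2}, "rate": 0.40},
    {"counts": {"human": 3, "chimp": 3, "macaque": 3}, "rate": 0.12},
    {"counts": {"human": 3, "chimp": 3, "macaque": 3}, "rate": 0.08},
]

summary = median_rate_by_class(families)
```

With real data, the per-class medians in `summary` would be the quantities to compare; a significance test on the underlying rate distributions would be needed before drawing any conclusion.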
Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates.
Cañestro, Cristian; Albalat, Ricard; Irimia, Manuel; Garcia-Fernàndez, Jordi
2013-02-01
The study of the evolutionary origin of vertebrates has been linked to the study of genome duplications since Susumu Ohno suggested that the successful diversification of vertebrate innovations was facilitated by two rounds of whole-genome duplication (2R-WGD) in the stem vertebrate. Since then, studies on the functional evolution of many genes duplicated in the vertebrate lineage have provided the grounds to support experimentally this link. This article reviews cases of gene duplications derived either from the 2R-WGD or from local gene duplication events in vertebrates, analyzing their impact on the evolution of developmental innovations. We analyze how gene regulatory networks can be rewired by the activity of transposable elements after genome duplications, discuss how different mechanisms of duplication might affect the fate of duplicated genes, and how the loss of gene duplicates might influence the fate of surviving paralogs. We also discuss the evolutionary relationships between gene duplication and alternative splicing, in particular in the vertebrate lineage. Finally, we discuss the role that the 2R-WGD might have played in the evolution of vertebrate developmental gene networks, paying special attention to those related to vertebrate key features such as neural crest cells, placodes, and the complex tripartite brain. In this context, we argue that current evidence indicates that the 2R-WGD may not be linked to the origin of vertebrate innovations, but to their subsequent diversification in a broad variety of complex structures and functions that facilitated the successful transition from peaceful filter-feeding non-vertebrate ancestors to voracious vertebrate predators. Copyright © 2013 Elsevier Ltd. All rights reserved.
Detection of two distinct forms of apoC-I in great apes.
Puppione, Donald L; Ryan, Christopher M; Bassilian, Sara; Souda, Puneet; Xiao, Xinshu; Ryder, Oliver A; Whitelegge, Julian P
2010-03-01
ApoC-I, the smallest of the soluble apolipoproteins, associates with both TG-rich lipoproteins and HDL. Mass spectral analyses of human apoC-I previously had demonstrated that in the circulation there are two forms, either a 57 amino acid protein or a 55 amino acid protein, due to the loss of two amino acids from the N-terminus. In our analyses of the apolipoproteins of the other great apes by mass spectrometry, four forms of apoC-I were detected. Two of these showed a high degree of identity to the mature and truncated forms of human apoC-I. The other two were homologous to the virtual protein and its truncated form that are encoded by a human pseudogene. In humans, the genes for apoC-I and its pseudogene are located on chromosome 19, the pseudogene being 2.5 kb downstream from the apoC-I gene. Based on the similarity between the apoC-I gene and the pseudogene, it has been concluded that the latter arose from the former as a result of gene duplication approximately 35 million years ago. Interestingly, the virtual protein encoded by the pseudogene is acidic, not basic like apoC-I. In the chimpanzee, there also are two genes for apoC-I, the one upstream encodes a basic protein and the downstream gene, rather than being a pseudogene, encodes an acidic protein (P86336). In addition to reporting on the molecular masses of great ape apoC-I, we were able to clearly demonstrate by "Top-down" sequencing that the acidic form arose from a separate gene. In our analyses, we have measured the molecular masses of apoC-I associated with the HDL of the following great apes: bonobo (Pan paniscus), chimpanzee (Pan troglodytes), and the Sumatran orangutan (Pongo abelii). Genomic variations in chromosome 19 among great apes, baboons and macaques as they relate to both genes for apoC-I and the pseudogene are compared and discussed.
Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin
2014-08-27
Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog-paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zeng, Jia; Hannenhalli, Sridhar
2013-01-01
Gene duplication, followed by functional evolution of duplicate genes, is a primary engine of evolutionary innovation. In turn, gene expression evolution is a critical component of overall functional evolution of paralogs. Inferring the evolutionary history of gene expression among paralogs is therefore a problem of considerable interest. It also presents significant challenges. The standard approaches of evolutionary reconstruction assume that at an internal node of the duplication tree, the two duplicates evolve independently. However, because of various selection pressures, functional evolution of the two paralogs may be coupled. The coupling of paralog evolution corresponds to three major fates of gene duplicates: subfunctionalization (SF), conserved function (CF) or neofunctionalization (NF). Quantitative analysis of these fates is of great interest and clearly influences evolutionary inference of expression. These two interrelated problems of inferring gene expression and evolutionary fates of gene duplicates have not been studied together previously and motivate the present study. Here we propose a novel probabilistic framework and algorithm to simultaneously infer (i) ancestral gene expression and (ii) the likely fate (SF, NF, CF) at each duplication event during the evolution of a gene family. Using tissue-specific gene expression data, we develop a nonparametric belief propagation (NBP) algorithm to predict the ancestral expression level as a proxy for function, and describe a novel probabilistic model that relates the predicted and known expression levels to the possible evolutionary fates. We validate our model using simulation and then apply it to a genome-wide set of gene duplicates in human. Our results suggest that SF tends to be more frequent at the earlier stage of gene family expansion, while NF occurs more frequently later on.
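The three fates named in this abstract can be made concrete with a toy rule-based classifier. This is not the authors' nonparametric belief propagation algorithm: it assumes the ancestral expression is already known (the paper infers it), it reduces expression to a set of tissues in which a gene is "on", and the function name and tissue labels are hypothetical.

```python
def classify_fate(ancestor, dup1, dup2):
    """Classify a duplication event from sets of tissues in which each gene is expressed.

    NF: at least one duplicate is expressed in a tissue the ancestor was not.
    CF: both duplicates keep exactly the ancestral expression pattern.
    SF: each duplicate lost part of the ancestral pattern, but together they cover it.
    """
    if (dup1 - ancestor) or (dup2 - ancestor):
        return "NF"
    if dup1 == ancestor and dup2 == ancestor:
        return "CF"
    if dup1 != ancestor and dup2 != ancestor and (dup1 | dup2) == ancestor:
        return "SF"
    # Anything else (e.g. one copy silenced) is left unclassified in this sketch.
    return "other"


# Hypothetical expression patterns, for illustration only.
ancestral = {"brain", "liver"}
fate_sf = classify_fate(ancestral, {"brain"}, {"liver"})
fate_cf = classify_fate(ancestral, {"brain", "liver"}, {"brain", "liver"})
fate_nf = classify_fate(ancestral, {"brain", "liver"}, {"brain", "testis"})
```

A hard-threshold rule like this ignores expression levels and uncertainty in the ancestral state, which is the gap a probabilistic model of the kind described in the abstract is meant to address.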
Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas.
Zhang, Jinghui; Wu, Gang; Miller, Claudia P; Tatevossian, Ruth G; Dalton, James D; Tang, Bo; Orisme, Wilda; Punchihewa, Chandanamali; Parker, Matthew; Qaddoumi, Ibrahim; Boop, Fredrick A; Lu, Charles; Kandoth, Cyriac; Ding, Li; Lee, Ryan; Huether, Robert; Chen, Xiang; Hedlund, Erin; Nagahawatte, Panduka; Rusch, Michael; Boggs, Kristy; Cheng, Jinjun; Becksfort, Jared; Ma, Jing; Song, Guangchun; Li, Yongjin; Wei, Lei; Wang, Jianmin; Shurtleff, Sheila; Easton, John; Zhao, David; Fulton, Robert S; Fulton, Lucinda L; Dooling, David J; Vadodaria, Bhavin; Mulder, Heather L; Tang, Chunlao; Ochoa, Kerri; Mullighan, Charles G; Gajjar, Amar; Kriwacki, Richard; Sheer, Denise; Gilbertson, Richard J; Mardis, Elaine R; Wilson, Richard K; Downing, James R; Baker, Suzanne J; Ellison, David W
2013-06-01
The most common pediatric brain tumors are low-grade gliomas (LGGs). We used whole-genome sequencing to identify multiple new genetic alterations involving BRAF, RAF1, FGFR1, MYB, MYBL1 and genes with histone-related functions, including H3F3A and ATRX, in 39 LGGs and low-grade glioneuronal tumors (LGGNTs). Only a single non-silent somatic alteration was detected in 24 of 39 (62%) tumors. Intragenic duplications of the portion of FGFR1 encoding the tyrosine kinase domain (TKD) and rearrangements of MYB were recurrent and mutually exclusive in 53% of grade II diffuse LGGs. Transplantation of Trp53-null neonatal astrocytes expressing FGFR1 with the duplication involving the TKD into the brains of nude mice generated high-grade astrocytomas with short latency and 100% penetrance. FGFR1 with the duplication induced FGFR1 autophosphorylation and upregulation of the MAPK/ERK and PI3K pathways, which could be blocked by specific inhibitors. Focusing on the therapeutically challenging diffuse LGGs, our study of 151 tumors has discovered genetic alterations and potential therapeutic targets across the entire range of pediatric LGGs and LGGNTs.
Kennerknecht, Nicole; Sahm, Hermann; Yen, Ming-Ren; Pátek, Miroslav; Saier, Jr., Milton H.; Eggeling, Lothar
2002-01-01
Bacteria possess amino acid export systems, and Corynebacterium glutamicum excretes l-isoleucine in a process dependent on the proton motive force. In order to identify the system responsible for l-isoleucine export, we have used transposon mutagenesis to isolate mutants of C. glutamicum sensitive to the peptide isoleucyl-isoleucine. In one such mutant, strong peptide sensitivity resulted from insertion into a gene designated brnF encoding a hydrophobic protein predicted to possess seven transmembrane spanning helices. brnE is located downstream of brnF and encodes a second hydrophobic protein with four putative membrane-spanning helices. A mutant deleted of both genes no longer exports l-isoleucine, whereas an overexpressing strain exports this amino acid at an increased rate. BrnF and BrnE together are also required for the export of l-leucine and l-valine. BrnFE is thus a two-component export permease specific for aliphatic hydrophobic amino acids. Upstream of brnFE and transcribed divergently is an Lrp-like regulatory gene required for active export. Searches for homologues of BrnFE show that this type of exporter is widespread in prokaryotes but lacking in eukaryotes and that both gene products which together comprise the members of a novel family, the LIV-E family, generally map together within a single operon. Comparisons of the BrnF and BrnE phylogenetic trees show that gene duplication events in the early bacterial lineage gave rise to multiple paralogues that have been retained in α-proteobacteria but not in other prokaryotes analyzed. PMID:12081967
Dong, Chun-Juan; Shang, Qing-Mao
2013-07-01
Phenylalanine ammonia-lyase (PAL), the first enzyme in the phenylpropanoid pathway, plays a critical role in plant growth, development, and adaptation. PAL enzymes are encoded by a gene family in plants. Here, we report a genome-wide search for PAL genes in watermelon. A total of 12 PAL genes, designated ClPAL1-12, are identified. Nine are arranged in tandem in two duplication blocks located on chromosomes 4 and 7, and the other three ClPAL genes are distributed as single copies on chromosomes 2, 3, and 8. Both the cDNA and protein sequences of ClPALs share an overall high identity with each other. A phylogenetic analysis places 11 of the ClPALs into a separate cucurbit subclade, whereas ClPAL2, which belongs to neither monocots nor dicots, may serve as an ancestral PAL in plants. In the cucurbit subclade, seven ClPALs form homologous pairs with their counterparts from cucumber. Expression profiling reveals that 11 of the ClPAL genes are expressed and show preferential expression in the stems and male and female flowers. Six of the 12 ClPALs are moderately or strongly expressed in the fruits, particularly in the pulp, suggesting the potential roles of PAL in the development of fruit color and flavor. A promoter motif analysis of the ClPAL genes implies redundant but distinctive cis-regulatory structures for stress responsiveness. Finally, duplication events during the evolution and expansion of the ClPAL gene family are discussed, and the relationships between the ClPAL genes and their cucumber orthologs are estimated.
Identification and characterization of a second CD4-like gene in teleost fish.
Dijkstra, Johannes Martinus; Somamoto, Tomonori; Moore, Lindsey; Hordvik, Ivar; Ototake, Mitsuru; Fischer, Uwe
2006-02-01
In fish, T cell subdivision is not well studied, although CD8 and CD4 homologues have been reported. This study describes a second teleost CD4-like gene, CD4-like 2 (CD4L-2). Two rainbow trout copies of this gene were found, -2a and -2b, encoding molecules sharing 81% aa identity. The 2a/2b duplication may be related to tetraploid ancestry of salmonid fishes. In the Fugu genome CD4L-2 lies head to tail with an earlier reported, very different CD4-like gene [Suetake, H., Araki, K., Suzuki, Y., 2004. Cloning, expression, and characterization of fugu CD4, the first ectothermic animal CD4. Immunogenetics 56, 368-374], which was designated CD4L-1 in the present article. The flanking genes of the Fugu CD4L-1 and CD4L-2 are reminiscent of the genes surrounding CD4 and LAG-3 in mammals. However, neither synteny nor phylogenetic analysis could decide between CD4 and LAG-3 identity for the fish CD4L genes. CD4L-1 and CD4L-2 share a tyrosine protein kinase p56(lck) binding motif in the cytoplasmic tail with CD4 but not with LAG-3. Trout CD4L-2 expression is highest in the thymus, similar to mammalian and chicken CD4, whereas Fugu CD4L-1 expression was highest in the spleen. However, CD4L-2 encodes only two IG-like domains, whereas CD4L-1, CD4 and LAG-3 encode four. The CD4-like genes 1 and 2 in fish apparently went through an evolution different from that of LAG-3 and CD4 in higher vertebrates.
The complete chloroplast genome sequence of Chikusichloa aquatica (Poaceae: Oryzeae).
Zhang, Jie; Zhang, Dan; Shi, Chao; Gao, Ju; Gao, Li-Zhi
2016-07-01
The complete chloroplast genome sequence of Chikusichloa aquatica was determined in this study. The genome consists of 136 563 bp and contains a pair of inverted repeats (IRs) of 20 837 bp, which are separated by a large single-copy region and a small single-copy region of 82 315 bp and 33 411 bp, respectively. The C. aquatica cp genome encodes 111 functional genes (71 protein-coding genes, four rRNA genes, and 36 tRNA genes): 92 are unique, while 19 are duplicated in the IR regions. The genic regions account for 58.9% of the whole cp genome, and the GC content of the plastome is 39.0%. A phylogenomic analysis showed that C. aquatica is closely related to Rhynchoryza subulata, which belongs to the tribe Oryzeae.
Corradi, Nicolas; Sanders, Ian R
2006-03-10
The P-type II ATPase gene family encodes proteins with an important role in adaptation of the cell to variation in external K+, Ca2+ and Na+ concentrations. The presence of P-type II gene subfamilies that are specific for certain kingdoms has been reported but was sometimes contradicted by discovery of previously unknown homologous sequences in newly sequenced genomes. Members of this gene family have been sampled in all of the fungal phyla except the arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota), which are known to play a key role in terrestrial ecosystems and to be genetically highly variable within populations. Here we used highly degenerate primers on AMF genomic DNA to increase the sampling of fungal P-type II ATPases and to test previous predictions about their evolution. In parallel, homologous sequences of the P-type II ATPases have been used to determine the nature and amount of polymorphism that is present at these loci among isolates of Glomus intraradices harvested from the same field. In this study, four P-type II ATPase sub-families have been isolated from three AMF species. We show that, contrary to previous predictions, P-type IIC ATPases are present in all basal fungal taxa. Additionally, P-type IIE ATPases should no longer be considered as exclusive to the Ascomycota and the Basidiomycota, since we also demonstrate their presence in the Zygomycota. Finally, a comparison of homologous sequences encoding P-type IID ATPases showed unexpectedly that indel mutations among coding regions, as well as specific gene duplications, occur among AMF individuals within the same field. On the basis of these results we suggest that the diversification of P-type IIC and E ATPases followed the diversification of the extant fungal phyla with independent events of gene gains and losses.
Consistent with recent findings on the human genome, but at a much smaller geographic scale, we provided evidence that structural genomic changes, such as exonic indel mutations and gene duplications are less rare than previously thought and that these also occur within fungal populations.
Deeg, Christoph M; Chow, Cheryl-Emiliane T
2018-01-01
Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. PMID:29582753
Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H.; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C.
2014-01-01
Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species. PMID:25223767
Suvorov, Anton; Jensen, Nicholas O; Sharkey, Camilla R; Fujimoto, M Stanley; Bodily, Paul; Wightman, Haley M Cahill; Ogden, T Heath; Clement, Mark J; Bybee, Seth M
2017-03-01
Gene duplication plays a central role in adaptation to novel environments by providing new genetic material for functional divergence and evolution of biological complexity. Several evolutionary models have been proposed for gene duplication to explain how new gene copies are preserved by natural selection, but these models have rarely been tested using empirical data. Opsin proteins, when combined with a chromophore, form a photopigment that is responsible for the absorption of light, the first step in the phototransduction cascade. Adaptive gene duplications have occurred many times within the animal opsins' gene family, leading to novel wavelength sensitivities. Consequently, opsins are an attractive choice for the study of gene duplication evolutionary models. Odonata (dragonflies and damselflies) have the largest opsin repertoire of any insect currently known. Additionally, there is tremendous variation in opsin copy number between species, particularly in the long-wavelength-sensitive (LWS) class. Using comprehensive phylotranscriptomic and statistical approaches, we tested various evolutionary models of gene duplication. Our results suggest that both the blue-sensitive (BS) and LWS opsin classes were subjected to strong positive selection that greatly weakens after multiple duplication events, a pattern that is consistent with the permanent heterozygote model. Due to the immense interspecific variation and duplicability potential of opsin genes among odonates, they represent a unique model system to test hypotheses regarding opsin gene duplication and diversification at the molecular level. © 2016 John Wiley & Sons Ltd.
Muthamilarasan, Mehanathan; Khandelwal, Rohit; Yadav, Chandra Bhan; Bonthala, Venkata Suresh; Khan, Yusuf; Prasad, Manoj
2014-01-01
MYB proteins represent one of the largest transcription factor families in plants, playing important roles in diverse developmental and stress-responsive processes. Considering its significance, several genome-wide analyses have been conducted in almost all land plants except foxtail millet. Foxtail millet (Setaria italica L.) is a model crop for investigating systems biology of millets and bioenergy grasses. Further, the crop is also known for its potential abiotic stress-tolerance. In this context, a comprehensive genome-wide survey was conducted and 209 MYB protein-encoding genes were identified in foxtail millet. All 209 S. italica MYB (SiMYB) genes were physically mapped onto nine chromosomes of foxtail millet. Gene duplication analysis showed that segmental and tandem duplications have occurred in the genome, resulting in expansion of this gene family. The protein domain investigation classified SiMYB proteins into three classes according to the number of MYB repeats present. The phylogenetic analysis categorized SiMYBs into ten groups (I-X). SiMYB-based comparative mapping revealed a maximum orthology between foxtail millet and sorghum, followed by maize, rice and Brachypodium. Heat map analysis showed tissue-specific expression patterns of predominant SiMYB genes. Expression profiling of candidate MYB genes against abiotic stresses and hormone treatments using qRT-PCR revealed specific and/or overlapping expression patterns of SiMYBs. Taken together, the present study provides a foundation for evolutionary and functional characterization of MYB TFs in foxtail millet to dissect their functions in response to environmental stimuli.
Pleiotropy, redundancy and the evolution of flowers.
Albert, Victor A; Oppenheimer, David G; Lindqvist, Charlotte
2002-07-01
Most angiosperm flowers are tightly integrated, functionally bisexual shoots that have carpels with enclosed ovules. Flowering plants evolved from within the gymnosperms, which lack this combination of innovations. Paradoxically, phylogenetic reconstructions suggest that the flowering plant lineage substantially pre-dates the evolution of flowers themselves. We provide a model based on known gene regulatory networks whereby positive selection on a single, partially redundant gene duplicate 'trapped' the ancestors of flower-bearing plants into the condensed, bisexual state approximately 130 million years ago. The LEAFY (LFY) gene of Arabidopsis encodes a master regulator that functions as the main conduit of environmental signals to the reproductive developmental program. We directly link the elimination of one LFY paralog, pleiotropically maintained in gymnosperms, to the sudden appearance of flowers in the fossil record.
Nahas, John V; Iosue, Christine L; Shaik, Noor F; Selhorst, Kathleen; He, Bin Z; Wykoff, Dennis D
2018-05-10
Convergent evolution is often due to selective pressures generating a similar phenotype. We observe relatively recent duplications in a spectrum of Saccharomycetaceae yeast species resulting in multiple phosphatases that are regulated by different nutrient conditions: thiamine and phosphate starvation. This specialization is both transcriptional and at the level of phosphatase substrate specificity. In Candida glabrata, loss of the ancestral phosphatase family was compensated by the co-option of a different histidine phosphatase family with three paralogs. Using RNA-seq and functional assays, we identify one of these paralogs, CgPMU3, as a thiamine phosphatase. We further determine that the 81% identical paralog CgPMU2 does not encode thiamine phosphatase activity; however, both are capable of cleaving the phosphatase substrate, 1-naphthyl-phosphate. We functionally demonstrate that members of this family evolved novel enzymatic functions for phosphate and thiamine starvation, and are regulated transcriptionally by either nutrient condition, and observe similar trends in other yeast species. This independent, parallel evolution involving two different families of histidine phosphatases suggests that there were likely similar selective pressures on multiple yeast species to recycle thiamine and phosphate. In this work, we focused on duplication and specialization, but there is also repeated loss of phosphatases, indicating that the expansion and contraction of the phosphatase family is dynamic in many Ascomycetes. The dynamic evolution of the phosphatase gene families is perhaps just one example of how gene duplication, co-option, and transcriptional and functional specialization together allow species to adapt to their environment with existing genetic resources. Copyright © 2018, G3: Genes, Genomes, Genetics.
Fiebig, Michael; Kelly, Steven; Gluenz, Eva
2015-01-01
Leishmania spp. are protozoan parasites that have two principal life cycle stages: the motile promastigote forms that live in the alimentary tract of the sandfly and the amastigote forms, which are adapted to survive and replicate in the harsh conditions of the phagolysosome of mammalian macrophages. Here, we used Illumina sequencing of poly-A selected RNA to characterise and compare the transcriptomes of L. mexicana promastigotes, axenic amastigotes and intracellular amastigotes. These data allowed the production of the first transcriptome evidence-based annotation of gene models for this species, including genome-wide mapping of trans-splice sites and poly-A addition sites. The revised genome annotation encompassed 9,169 protein-coding genes including 936 novel genes as well as modifications to previously existing gene models. Comparative analysis of gene expression across promastigote and amastigote forms revealed that 3,832 genes are differentially expressed between promastigotes and intracellular amastigotes. A large proportion of genes that were downregulated during differentiation to amastigotes were associated with the function of the motile flagellum. In contrast, those genes that were upregulated included cell surface proteins, transporters, peptidases and many uncharacterized genes, including 293 of the 936 novel genes. Genome-wide distribution analysis of the differentially expressed genes revealed that the tetraploid chromosome 30 is highly enriched for genes that were upregulated in amastigotes, providing the first evidence of a link between this whole chromosome duplication event and adaptation to the vertebrate host in this group. Peptide evidence for 42 proteins encoded by novel transcripts supports the idea of an as yet uncharacterised set of small proteins in Leishmania spp. with possible implications for host-pathogen interactions. PMID:26452044
Shah, Firoz; Nicolás, César; Bentzer, Johan; Ellström, Magnus; Smits, Mark; Rineau, Francois; Canbäck, Björn; Floudas, Dimitrios; Carleer, Robert; Lackner, Gerald; Braesel, Jana; Hoffmeister, Dirk; Henrissat, Bernard; Ahrén, Dag; Johansson, Tomas; Hibbett, David S; Martin, Francis; Persson, Per; Tunlid, Anders
2016-03-01
Ectomycorrhizal fungi are thought to have a key role in mobilizing organic nitrogen that is trapped in soil organic matter (SOM). However, the extent to which ectomycorrhizal fungi decompose SOM and the mechanism by which they do so remain unclear, considering that they have lost many genes encoding lignocellulose-degrading enzymes that are present in their saprotrophic ancestors. Spectroscopic analyses and transcriptome profiling were used to examine the mechanisms by which five species of ectomycorrhizal fungi, representing at least four origins of symbiosis, decompose SOM extracted from forest soils. In the presence of glucose and when acquiring nitrogen, all species converted the organic matter in the SOM extract using oxidative mechanisms. The transcriptome expressed during oxidative decomposition has diverged over evolutionary time. Each species expressed a different set of transcripts encoding proteins associated with oxidation of lignocellulose by saprotrophic fungi. The decomposition 'toolbox' has diverged through differences in the regulation of orthologous genes, the formation of new genes by gene duplications, and the recruitment of genes from diverse but functionally similar enzyme families. The capacity to oxidize SOM appears to be common among ectomycorrhizal fungi. We propose that the ancestral decay mechanisms used primarily to obtain carbon have been adapted in symbiosis to scavenge nutrients instead. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen
2015-01-01
Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16-taxon angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.
Bassham, Susan; Cañestro, Cristian; Postlethwait, John H
2008-08-22
Gene duplication provides opportunities for lineage diversification and evolution of developmental novelties. Duplicated genes generally either disappear by accumulation of mutations (nonfunctionalization), or are preserved either by the origin of positively selected functions in one or both duplicates (neofunctionalization), or by the partitioning of original gene subfunctions between the duplicates (subfunctionalization). The Pax2/5/8 family of important developmental regulators has undergone parallel expansion among chordate groups. After the divergence of urochordate and vertebrate lineages, two rounds of independent gene duplications resulted in the Pax2, Pax5, and Pax8 genes of most vertebrates (the sister group of the urochordates), and an additional duplication provided the pax2a and pax2b duplicates in teleost fish. Separate from the vertebrate genome expansions, a duplication also created two Pax2/5/8 genes in the common ancestor of ascidian and larvacean urochordates. To better understand mechanisms underlying the evolution of duplicated genes, we investigated, in the larvacean urochordate Oikopleura dioica, the embryonic gene expression patterns of Pax2/5/8 paralogs. We compared the larvacean and ascidian expression patterns to infer modular subfunctions present in the single pre-duplication Pax2/5/8 gene of stem urochordates, and we compared vertebrate and urochordate expression to infer the suite of Pax2/5/8 gene subfunctions in the common ancestor of olfactores (vertebrates + urochordates). Expression pattern differences of larvacean and ascidian Pax2/5/8 orthologs in the endostyle, pharynx and hindgut suggest that some ancestral gene functions have been partitioned differently to the duplicates in the two urochordate lineages. Novel expression in the larvacean heart may have resulted from the neofunctionalization of a Pax2/5/8 gene in the urochordates. 
Expression of larvacean Pax2/5/8 in the endostyle, in sites of epithelial remodeling, and in sensory tissues evokes like functions of Pax2, Pax5 and Pax8 in vertebrate embryos, and may indicate ancient origins for these functions in the chordate common ancestor. Comparative analysis of expression patterns of chordate Pax2/5/8 duplicates, rooted on the single-copy Pax2/5/8 gene of amphioxus, whose lineage diverged basally among chordates, provides new insights into the evolution and development of the heart, thyroid, pharynx, stomodeum and placodes in chordates; supports the controversial conclusion that the atrial siphon of ascidians and the otic placode in vertebrates are homologous; and backs the notion that Pax2/5/8 functioned in ancestral chordates to engineer epithelial fusions and perforations, including gill slit openings.
Wang, Zhihui; Cheng, Ke; Wan, Liyun; Yan, Liying; Jiang, Huifang; Liu, Shengyi; Lei, Yong; Liao, Boshou
2015-12-10
Plant bZIP proteins characteristically harbor a highly conserved bZIP domain with two structural features: a DNA-binding basic region and a leucine (Leu) zipper dimerization region. They have been shown to be diverse transcriptional regulators, playing crucial roles in plant development, physiological processes, and biotic/abiotic stress responses. Despite the availability of six completely sequenced legume genomes, a comprehensive investigation of bZIP family members in legumes has yet to be presented. In this study, we identified 428 bZIP genes encoding 585 distinct proteins in six legumes, Glycine max, Medicago truncatula, Phaseolus vulgaris, Cicer arietinum, Cajanus cajan, and Lotus japonicus. The legume bZIP genes were categorized into 11 groups according to their phylogenetic relationships with genes from Arabidopsis. Four kinds of intron patterns (a-d) within the basic and hinge regions were defined and additional conserved motifs were identified, both presenting high group specificity and supporting the group classification. We predicted the DNA-binding patterns and the dimerization properties, based on the characteristic features in the basic and hinge regions and the Leu zipper, respectively, which indicated that some highly conserved amino acid residues existed across each major group. The chromosome distribution and analysis for WGD-derived duplicated blocks revealed that the legume bZIP genes have expanded mainly by segmental duplication rather than tandem duplication. Expression data further revealed that the legume bZIP genes were expressed constitutively or in an organ-specific, development-dependent manner playing roles in multiple seed developmental stages and tissues. We also detected several key legume bZIP genes involved in drought- and salt-responses by comparing fold changes of expression values in drought-stressed or salt-stressed roots and leaves. 
In summary, this genome-wide identification, characterization and expression analysis of legume bZIP genes provides valuable information for understanding the molecular functions and evolution of the legume bZIP transcription factor family, and highlights potential legume bZIP genes involved in regulating tissue development and abiotic stress responses.
Garzón-Ospina, Diego; Forero-Rodríguez, Johanna; Patarroyo, Manuel A
2014-12-13
The msp-7 gene has become differentially expanded in the Plasmodium genus; Plasmodium vivax has the highest copy number of this gene, several of which encode antigenic proteins in merozoites. DNA sequences of the P. vivax (pv) msp-7E, -7F and -7L genes from thirty-six Colombian clinical isolates were analysed to characterize the genetic diversity of these pvmsp-7 members, which are expressed during the intra-erythrocyte stage; the natural selection signals producing the observed variation pattern were evaluated. The pvmsp-7E gene was highly polymorphic compared to pvmsp-7F and pvmsp-7L, which were seen to have limited genetic diversity; pvmsp-7E polymorphism was seen to have been maintained by different types of positive selection. Even though these copies seemed to be species-specific duplications, a search in the Plasmodium cynomolgi genome (P. vivax sister taxon) showed that both species shared the whole msp-7 repertoire. This led to exploring the long-term effect of natural selection by comparing the orthologous sequences, which revealed signatures of lineage-specific positive selection. The results confirmed that the P. vivax msp-7 family has a heterogeneous genetic diversity pattern; some members are highly conserved whilst others are highly diverse. The results suggested that the 3'-end of these genes encodes the functional region of MSP-7 proteins whilst the central region of pvmsp-7E has evolved rapidly. The lineage-specific positive selection signals found suggested that mutations occurring in msp-7 genes during host switch may have succeeded in adapting the ancestral P. vivax parasite population to humans.
Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate
Dehal, Paramvir; Boore, Jeffrey L
2005-01-01
The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, and then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish–tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of four-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage. PMID:16128622
Core histone genes of Giardia intestinalis: genomic organization, promoter structure, and expression
Yee, Janet; Tang, Anita; Lau, Wei-Ling; Ritter, Heather; Delport, Dewald; Page, Melissa; Adam, Rodney D; Müller, Miklós; Wu, Gang
2007-01-01
Background Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones. Results We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding a H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia. Conclusion In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome. PMID:17425802
Kaiser, Ann-Sophie; Maas, Bianca; Wolff, Anna; Sutter, Christian; Janssen, Johannes W G; Hinderhofer, Katrin; Moog, Ute
2015-05-01
SATB2, a gene encoding a highly conserved DNA-binding protein, is known to have an important role in craniofacial and neuronal development. Only a few patients with SATB2 variants have been described so far. Recently, Döcker et al provided a summary of these patients and delineated the SAS (SATB2-associated syndrome). We here report on a girl with intellectual disability, nearly absent speech and suspected hypodontia who was shown to carry an intragenic SATB2 tandem duplication hypothesized to lead to haploinsufficiency of SATB2. Preliminary information on this patient had already been included in the article by Döcker et al. We want to give a detailed description of the patient's phenotype and genotype, providing further insight into the spectrum of the molecular mechanisms leading to SAS.
Models for loosely linked gene duplicates suggest lengthy persistence of both copies.
O'Hely, Martin; Wockner, Leesa
2007-06-21
Consider the appearance of a duplicate copy of a gene at a locus linked loosely, if at all, to the locus at which the gene is usually found. If all copies of the gene are subject to non-functionalizing mutations, then two fates are possible: loss of functional copies at the duplicate locus (loss of duplicate expression), or loss of functional copies at the original locus (map change). This paper proposes a simple model to address the probability of map change, the time taken for a map change and/or loss of duplicate expression, and considers where in the spectrum between loss of duplicate expression and map change such a duplicate complex is likely to be found. The findings are: the probability of map change is always half the reciprocal of the population size N, the time for a map change to occur is order NlogN generations, and that there is a marked tendency for duplicates to remain near equi-frequency with the gene at the original locus for a large portion of that time. This is in excellent agreement with simulations.
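The two scaling results stated in the abstract above can be made concrete with a short numerical sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the abstract gives the time to map change only as "order NlogN" (so the constant factor is unknown and omitted here), and the natural logarithm is assumed because the abstract does not specify a base.

```python
import math


def map_change_probability(population_size: int) -> float:
    """Probability of map change as stated in the abstract: 1 / (2N)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return 1.0 / (2 * population_size)


def map_change_time_scale(population_size: int) -> float:
    """Growth scale N * ln(N) for the time to map change, in generations.

    Assumption: natural log and a constant factor of 1. The abstract only
    gives the order of growth, so this is a scale, not a predicted time.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return population_size * math.log(population_size)


if __name__ == "__main__":
    # Illustrative population sizes only; not taken from the paper.
    for n in (100, 10_000, 1_000_000):
        p = map_change_probability(n)
        t = map_change_time_scale(n)
        print(f"N={n:>9,}  P(map change)={p:.2e}  N*ln(N)={t:.3e}")
```

The point of the comparison is that the probability of map change falls off as 1/N while the time scale grows slightly faster than linearly in N, so in large populations map change is both rare and slow.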
ISC, a Novel Group of Bacterial and Archaeal DNA Transposons That Encode Cas9 Homologs
Kapitonov, Vladimir V.; Makarova, Kira S.
2015-01-01
ABSTRACT Bacterial genomes encode numerous homologs of Cas9, the effector protein of the type II CRISPR-Cas systems. The homology region includes the arginine-rich helix and the HNH nuclease domain that is inserted into the RuvC-like nuclease domain. These genes, however, are not linked to cas genes or CRISPR. Here, we show that Cas9 homologs represent a distinct group of nonautonomous transposons, which we denote ISC (insertion sequences Cas9-like). We identify many diverse families of full-length ISC transposons and demonstrate that their terminal sequences (particularly 3′ termini) are similar to those of IS605 superfamily transposons that are mobilized by the Y1 tyrosine transposase encoded by the TnpA gene and often also encode the TnpB protein containing the RuvC-like endonuclease domain. The terminal regions of the ISC and IS605 transposons contain palindromic structures that are likely recognized by the Y1 transposase. The transposons from these two groups are inserted either exactly in the middle or upstream of specific 4-bp target sites, without target site duplication. We also identify autonomous ISC transposons that encode TnpA-like Y1 transposases. Thus, the nonautonomous ISC transposons could be mobilized in trans either by Y1 transposases of other, autonomous ISC transposons or by Y1 transposases of the more abundant IS605 transposons. These findings imply an evolutionary scenario in which the ISC transposons evolved from IS605 family transposons, possibly via insertion of a mobile group II intron encoding the HNH domain, and Cas9 subsequently evolved via immobilization of an ISC transposon. IMPORTANCE Cas9 endonucleases, the effectors of type II CRISPR-Cas systems, represent the new generation of genome-engineering tools. Here, we describe in detail a novel family of transposable elements that encode the likely ancestors of Cas9 and outline the evolutionary scenario connecting different varieties of these transposons and Cas9. PMID:26712934
Mastretta-Yanes, Alicia; Zamudio, Sergio; Jorgensen, Tove H; Arrigo, Nils; Alvarez, Nadir; Piñero, Daniel; Emerson, Brent C
2014-09-14
Gene duplication leads to paralogy, which complicates the de novo assembly of genotyping-by-sequencing (GBS) data. The issue of paralogous genes is exacerbated in plants, because they are particularly prone to gene duplication events. Paralogs are normally filtered from GBS data before undertaking population genomics or phylogenetic analyses. However, gene duplication plays an important role in the functional diversification of genes and it can also lead to the formation of postzygotic barriers. Using populations and closely related species of a tropical mountain shrub, we examine 1) the genomic differentiation produced by putative orthologs, and 2) the distribution of recent gene duplication among lineages and geography. We find high differentiation among populations from isolated mountain peaks and species-level differentiation within what is morphologically described as a single species. The inferred distribution of paralogs among populations is congruent with taxonomy and shows that GBS could be used to examine recent gene duplication as a source of genomic differentiation of nonmodel species. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Li, Xiaoqin; Guo, Rongrong; Li, Jun; Singer, Stacy D; Zhang, Yucheng; Yin, Xiangjing; Zheng, Yi; Fan, Chonghui; Wang, Xiping
2013-10-01
Aldehyde dehydrogenases (ALDHs) represent a protein superfamily encoding NAD(P)(+)-dependent enzymes that oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes. In plants, they are involved in many biological processes and play a role in the response to environmental stress. In this study, a total of 39 ALDH genes from ten families were identified in the apple (Malus × domestica Borkh.) genome. Synteny analysis of the apple ALDH (MdALDH) genes indicated that segmental and tandem duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of these gene families in apple. Moreover, synteny analysis between apple and Arabidopsis demonstrated that several MdALDH genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes appeared before the divergence of lineages that led to apple and Arabidopsis. In addition, phylogenetic analysis, as well as comparisons of exon-intron and protein structures, provided further insight into both their evolutionary relationships and their putative functions. Tissue-specific expression analysis of the MdALDH genes demonstrated diverse spatiotemporal expression patterns, while their expression profiles under abiotic stress and various hormone treatments indicated that many MdALDH genes were responsive to high salinity and drought, as well as different plant hormones. This genome-wide identification, as well as characterization of evolutionary relationships and expression profiles, of the apple MdALDH genes will not only be useful for the further analysis of ALDH genes and their roles in stress response, but may also aid in the future improvement of apple stress tolerance. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Dreyer, Hermann; Steiner, Gerhard
2006-01-01
Background Mitochondrial (mt) gene arrangement is highly variable among molluscs and especially among bivalves. Of the 30 complete molluscan mt-genomes published to date, only one is of a heterodont bivalve, although this is the most diverse taxon in terms of species numbers. We determined the complete sequence of the mitochondrial genomes of Acanthocardia tuberculata and Hiatella arctica (Mollusca, Bivalvia, Heterodonta) and describe their gene contents and genome organisations to assess the variability of these features among the Bivalvia and their value for phylogenetic inference. Results The size of the mt-genome in Acanthocardia tuberculata is 16,104 basepairs (bp), and in Hiatella arctica 18,244 bp. The Acanthocardia mt-genome contains 12 of the typical protein coding genes, lacking the ATPase subunit 8 (atp8) gene, as in all published marine bivalves. In contrast, a complete atp8 gene is present in Hiatella arctica. In addition, we found a putative truncated atp8 gene when re-annotating the mt-genome of Venerupis philippinarum. Both mt-genomes reported here encode all genes on the same strand and have an additional trnM. In Acanthocardia several large non-coding regions are present. One of these contains 3.5 nearly identical copies of a 167 bp motif. In Hiatella, the 3' end of the NADH dehydrogenase subunit (nad)6 gene is duplicated together with the adjacent non-coding region. The gene arrangement of Hiatella is markedly different from all other known molluscan mt-genomes; that of Acanthocardia shows few identities with that of Venerupis philippinarum. Phylogenetic analyses on amino acid and nucleotide levels robustly support the Heterodonta and the sister group relationship of Acanthocardia and Venerupis. Monophyletic Bivalvia are resolved only by a Bayesian inference of the nucleotide data set. In all other analyses the two unionid species, being the only ones with genes located on both strands, do not group with the remaining bivalves. 
Conclusion The two mt-genomes reported here add to and underline the high variability of gene order and presence of duplications in bivalve and molluscan taxa. Some genomic traits like the loss of the atp8 gene or the encoding of all genes on the same strand are homoplastic among the Bivalvia. These characters, gene order, and the nucleotide sequence data show considerable potential of resolving phylogenetic patterns at lower taxonomic levels. PMID:16948842
The European Eel NCCβ Gene Encodes a Thiazide-resistant Na-Cl Cotransporter
Moreno, Erika; Plata, Consuelo; Rodríguez-Gama, Alejandro; Argaiz, Eduardo R.; Vázquez, Norma; Leyva-Ríos, Karla; Islas, León; Cutler, Christopher; Pacheco-Alvarez, Diana; Mercado, Adriana; Cariño-Cortés, Raquel; Castañeda-Bueno, María; Gamba, Gerardo
2016-01-01
The thiazide-sensitive Na-Cl cotransporter (NCC) is the major pathway for salt reabsorption in the mammalian distal convoluted tubule. NCC plays a key role in the regulation of blood pressure. Its inhibition with thiazides constitutes the primary baseline therapy for arterial hypertension. However, the thiazide-binding site in NCC is unknown. Mammals have only one gene encoding for NCC. The eel, however, contains a duplicate gene. NCCα is an ortholog of mammalian NCC and is expressed in the kidney. NCCβ is present in the apical membrane of the rectum. Here we cloned and functionally characterized NCCβ from the European eel. The cRNA encodes a 1043-amino acid membrane protein that, when expressed in Xenopus oocytes, functions as an Na-Cl cotransporter with two major characteristics, making it different from other known NCCs. First, eel NCCβ is resistant to thiazides. Single-point mutagenesis supports that the absence of thiazide inhibition is, at least in part, due to the substitution of a conserved serine for a cysteine at position 379. Second, NCCβ is not activated by low-chloride hypotonic stress, although the unique Ste20-related proline alanine-rich kinase (SPAK) binding site in the amino-terminal domain is conserved. Thus, NCCβ exhibits significant functional differences from NCCs that could be helpful in defining several aspects of the structure-function relationship of this important cotransporter. PMID:27587391
Emms, David M; Covshoff, Sarah; Hibberd, Julian M; Kelly, Steven
2016-07-01
C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Furthermore, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species. Key words: C4 photosynthesis, gene duplication, gene families, parallel evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Emms, David M.; Covshoff, Sarah; Hibberd, Julian M.; Kelly, Steven
2016-01-01
C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Furthermore, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species. Key words: C4 photosynthesis, gene duplication, gene families, parallel evolution. PMID:27016024
Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community
Hemme, Christopher L.; Green, Stefan J.; Rishishwar, Lavanya; Prakash, Om; Pettenato, Angelica; Chakraborty, Romy; Deutschbauer, Adam M.; Van Nostrand, Joy D.; Wu, Liyou; He, Zhili; Jordan, I. King; Arkin, Adam P.; Kostka, Joel E.
2016-01-01
ABSTRACT Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome. PMID:27048805
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks.
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks.
Martínez-Castilla, León Patricio; Alvarez-Buylla, Elena R.
2003-01-01
Gene duplication is a substrate of evolution. However, the relative importance of positive selection versus relaxation of constraints in the functional divergence of gene copies is still under debate. Plant MADS-box genes encode transcriptional regulators key in various aspects of development and have undergone extensive duplications to form a large family. We recovered 104 MADS sequences from the Arabidopsis genome. Bayesian phylogenetic trees recover type II lineage as a monophyletic group and resolve a branching sequence of monophyletic groups within this lineage. The type I lineage is comprised of several divergent groups. However, contrasting gene structure and patterns of chromosomal distribution between type I and II sequences suggest that they had different evolutionary histories and support the placement of the root of the gene family between these two groups. Site-specific and site-branch analyses of positive Darwinian selection (PDS) suggest that different selection regimes could have affected the evolution of these lineages. We found evidence for PDS along the branch leading to flowering time genes that have a direct impact on plant fitness. Sites with high probabilities of having been under PDS were found in the MADS and K domains, suggesting that these played important roles in the acquisition of novel functions during MADS-box diversification. Detected sites are targets for further experimental analyses. We argue that adaptive changes in MADS-domain protein sequences have been important for their functional divergence, suggesting that changes within coding regions of transcriptional regulators have influenced phenotypic evolution of plants. PMID:14597714
Sorting by Cuts, Joins, and Whole Chromosome Duplications.
Zeira, Ron; Shamir, Ron
2017-02-01
Genome rearrangement problems have been extensively studied due to their importance in biology. Most studied models assumed a single copy per gene. However, in reality, duplicated genes are common, most notably in cancer. In this study, we make a step toward handling duplicated genes by considering a model that allows the atomic operations of cut, join, and whole chromosome duplication. Given two linear genomes, [Formula: see text] with one copy per gene and [Formula: see text] with two copies per gene, we give a linear time algorithm for computing a shortest sequence of operations transforming [Formula: see text] into [Formula: see text] such that all intermediate genomes are linear. We also show that computing an optimal sequence with fewest duplications is NP-hard.
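The three atomic operations named in this abstract can be illustrated with a minimal sketch. It is not the authors' linear-time algorithm: the list-of-lists genome representation, the function names `cut`, `join`, and `duplicate`, and the choice to ignore gene orientation are all simplifying assumptions made here for illustration.

```python
from typing import List

# Hypothetical representation: a linear genome is a list of linear
# chromosomes, each an ordered list of gene names (orientation ignored).
Chromosome = List[str]
Genome = List[Chromosome]


def cut(genome: Genome, chrom: int, pos: int) -> Genome:
    """Split chromosome `chrom` into two chromosomes just before index `pos`."""
    c = genome[chrom]
    if not 0 < pos < len(c):
        raise ValueError("cut position must lie strictly inside the chromosome")
    return genome[:chrom] + [c[:pos], c[pos:]] + genome[chrom + 1:]


def join(genome: Genome, a: int, b: int) -> Genome:
    """Attach chromosome `b` to the end of chromosome `a`.

    Joining a chromosome to itself is rejected because it would create a
    circular chromosome, and this sketch keeps every genome linear.
    """
    if a == b:
        raise ValueError("joining a chromosome to itself would circularize it")
    merged = genome[a] + genome[b]
    rest = [c for i, c in enumerate(genome) if i not in (a, b)]
    return rest + [merged]


def duplicate(genome: Genome, chrom: int) -> Genome:
    """Append a full copy of chromosome `chrom` (whole chromosome duplication)."""
    return genome + [list(genome[chrom])]
```

Starting from the single chromosome `[["a", "b", "c"]]`, a cut at position 1 followed by `duplicate` on the second piece yields a genome with two copies of genes `b` and `c`. Each function returns a new genome rather than mutating its input, so intermediate genomes in a sequence of operations can be kept and inspected.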
Neutral and Non-Neutral Evolution of Duplicated Genes with Gene Conversion
Fawcett, Jeffrey A.; Innan, Hideki
2011-01-01
Gene conversion is one of the major mutational mechanisms involved in the DNA sequence evolution of duplicated genes. It helps create unique patterns of DNA polymorphism within species and divergence between species. A typical pattern is so-called concerted evolution, in which the divergence between duplicates is kept low for a long time because of frequent exchanges of DNA fragments. In addition, gene conversion affects the DNA evolution of duplicates in various ways, especially when selection operates. Here, we review theoretical models to understand the evolution of duplicates in both neutral and non-neutral cases. We also explain how these theories contribute to interpreting real polymorphism and divergence data by using some intriguing examples. PMID:24710144
5p13 microduplication syndrome: a new case and better clinical definition of the syndrome.
Novara, Francesca; Alfei, Enrico; D'Arrigo, Stefano; Pantaleoni, Chiara; Beri, Silvana; Achille, Valentina; Sciacca, Francesca L; Giorda, Roberto; Zuffardi, Orsetta; Ciccone, Roberto
2013-01-01
Chromosome 5p13 duplication syndrome (OMIM #613174), a contiguous gene syndrome involving duplication of several genes on chromosome 5p13 including NIPBL (OMIM 608667), has been described in rare patients with developmental delay and learning disability, behavioral problems and peculiar facial dysmorphisms. 5p13 duplications described so far present with variable sizes, from 0.25 to 13.6 Mb, and contain a variable number of genes. Here we report another patient with 5p13 duplication syndrome whose duplication includes the NIPBL gene only. The proband's phenotype overlapped that reported in patients with 5p13 microduplication syndrome, especially that of subjects with smaller duplications. Moreover, we better define the genotype-phenotype relationship associated with this duplication and confirm that NIPBL is likely the major dosage-sensitive gene for the 5p13 microduplication phenotype. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Tang, Fang; Yang, Shengming; Liu, Jinge
2016-01-01
Rj4 is a dominant gene in soybeans (Glycine max) that restricts nodulation by many strains of Bradyrhizobium elkanii. The soybean-B. elkanii symbiosis has a low nitrogen-fixation efficiency, but B. elkanii strains are highly competitive for nodulation; thus, cultivars harboring an Rj4 allele are considered favorable. Cloning the Rj4 gene is the first step in understanding the molecular basis of Rj4-mediated nodulation restriction and facilitates the development of molecular tools for genetic improvement of nitrogen fixation in soybeans. We finely mapped the Rj4 locus within a small genomic region on soybean chromosome 1, and validated one of the candidate genes as Rj4 using both complementation tests and CRISPR/Cas9-based gene knockout experiments. We demonstrated that Rj4 encodes a thaumatin-like protein, for which a corresponding allele is not present in the surveyed rj4 genotypes, including the reference genome Williams 82. Our conclusion disagrees with the previous report that Rj4 is the Glyma.01G165800 gene (previously annotated as Glyma01g37060). Instead, we provide convincing evidence that Rj4 is Glyma.01g165800-D, a duplicated and unique version of Glyma.01g165800, that has evolved the ability to control symbiotic specificity. PMID:26582727
Pöggeler, Stefanie
2011-04-01
Multicopper oxidases (MCO) catalyze the biological oxidation of various aromatic substrates and have been identified in plants, insects, bacteria, and wood rotting fungi. In nature, they are involved in biodegradation of biopolymers such as lignin and humic compounds, but have also been tested for various industrial applications. In fungi, MCOs have been shown to play important roles during their life cycles, such as in fruiting body formation, pigment formation and pathogenicity. Coprophilous fungi, which grow on the dung of herbivores, appear to encode an unexpectedly high number of enzymes capable of at least partly degrading lignin. This study compared the MCO-coding capacity of the coprophilous filamentous ascomycetes Podospora anserina and Sordaria macrospora with closely related non-coprophilous members of the order Sordariales. An increase of MCO genes in coprophilic members of the Sordariales most probably occurred by gene duplication and horizontal gene transfer events.
The proteolipid protein gene: Double, double, . . . and trouble
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hodes, M.E.; Dlouhy, S.R.
1996-07-01
That more of a good thing may be too much has been apparent at least since the discovery that Down syndrome is caused by three copies of chromosome 21 instead of the normal two. Duplications of myelin genes also lead to trouble. An extra dose of PMP22, the gene for a protein of peripheral nervous system myelin, causes Charcot-Marie-Tooth type 1A disease (CMT1A). Increased dosage of the proteolipid protein gene, PLP, which encodes the chief protein of CNS myelin, can cause Pelizaeus-Merzbacher disease (PMD). The work of Inoue et al. is of particular importance because they found the duplication in four of five families with "classical" PMD, whereas other changes in PLP, such as missense mutations, are found in no more than one in four or five patients with the disease. 27 refs.
Starrett, James; Hedin, Marshal; Ayoub, Nadia; Hayashi, Cheryl Y
2013-07-25
Hemocyanins are multimeric copper-containing hemolymph proteins involved in oxygen binding and transport in all major arthropod lineages. Most arachnids have seven primary subunits (encoded by paralogous genes a-g), which combine to form a 24-mer (4×6) quaternary structure. Within some spider lineages, however, hemocyanin evolution has been a dynamic process with extensive paralog duplication and loss. We have obtained hemocyanin gene sequences from numerous representatives of the spider infraorders Mygalomorphae and Araneomorphae in order to infer the evolution of the hemocyanin gene family and estimate spider relationships using these conserved loci. Our hemocyanin gene tree is largely consistent with the previous hypotheses of paralog relationships based on immunological studies, but reveals some discrepancies in which paralog types have been lost or duplicated in specific spider lineages. Analyses of concatenated hemocyanin sequences resolved deep nodes in the spider phylogeny and recovered a number of clades that are supported by other molecular studies, particularly for mygalomorph taxa. The concatenated data set is also used to estimate dates of higher-level spider divergences and suggests that the diversification of extant mygalomorphs preceded that of extant araneomorphs. Spiders are diverse in behavior and respiratory morphology, and our results are beneficial for comparative analyses of spider respiration. Lastly, the conserved hemocyanin sequences allow for the inference of spider relationships and ancient divergence dates. Copyright © 2013 Elsevier B.V. All rights reserved.
Bassham, Susan; Cañestro, Cristian; Postlethwait, John H
2008-01-01
Background: Gene duplication provides opportunities for lineage diversification and evolution of developmental novelties. Duplicated genes generally either disappear by accumulation of mutations (nonfunctionalization), or are preserved either by the origin of positively selected functions in one or both duplicates (neofunctionalization), or by the partitioning of original gene subfunctions between the duplicates (subfunctionalization). The Pax2/5/8 family of important developmental regulators has undergone parallel expansion among chordate groups. After the divergence of urochordate and vertebrate lineages, two rounds of independent gene duplications resulted in the Pax2, Pax5, and Pax8 genes of most vertebrates (the sister group of the urochordates), and an additional duplication provided the pax2a and pax2b duplicates in teleost fish. Separate from the vertebrate genome expansions, a duplication also created two Pax2/5/8 genes in the common ancestor of ascidian and larvacean urochordates. Results: To better understand mechanisms underlying the evolution of duplicated genes, we investigated, in the larvacean urochordate Oikopleura dioica, the embryonic gene expression patterns of Pax2/5/8 paralogs. We compared the larvacean and ascidian expression patterns to infer modular subfunctions present in the single pre-duplication Pax2/5/8 gene of stem urochordates, and we compared vertebrate and urochordate expression to infer the suite of Pax2/5/8 gene subfunctions in the common ancestor of olfactores (vertebrates + urochordates). Expression pattern differences of larvacean and ascidian Pax2/5/8 orthologs in the endostyle, pharynx and hindgut suggest that some ancestral gene functions have been partitioned differently to the duplicates in the two urochordate lineages. Novel expression in the larvacean heart may have resulted from the neofunctionalization of a Pax2/5/8 gene in the urochordates. Expression of larvacean Pax2/5/8 in the endostyle, in sites of epithelial remodeling, and in sensory tissues evokes like functions of Pax2, Pax5 and Pax8 in vertebrate embryos, and may indicate ancient origins for these functions in the chordate common ancestor. Conclusion: Comparative analysis of expression patterns of chordate Pax2/5/8 duplicates, rooted on the single-copy Pax2/5/8 gene of amphioxus, whose lineage diverged basally among chordates, provides new insights into the evolution and development of the heart, thyroid, pharynx, stomodeum and placodes in chordates; supports the controversial conclusion that the atrial siphon of ascidians and the otic placode in vertebrates are homologous; and backs the notion that Pax2/5/8 functioned in ancestral chordates to engineer epithelial fusions and perforations, including gill slit openings. PMID:18721460
Toloza-Villalobos, Jessica; Arroyo, José Ignacio; Opazo, Juan C
2015-01-01
The circadian clock is a central oscillator that coordinates endogenous rhythms. Members of six gene families underlie the metabolic machinery of this system. Although this machinery appears to correspond to a highly conserved genetic system in metazoans, it has been recognized that vertebrates possess a more diverse gene inventory than that of non-vertebrates. This difference could have originated in the two successive rounds of whole-genome duplications that took place in the common ancestor of the group. Teleost fish underwent an extra event of whole-genome duplication, which is thought to have provided an abundance of raw genetic material for the biological innovations that facilitated the radiation of the group. In this study, we assessed the relative contributions of whole-genome duplication and small-scale gene duplication to generate the repertoire of genes associated with the circadian clock of teleost fish. To achieve this goal, we annotated genes from six gene families associated with the circadian clock in eight teleost fish species, and we reconstructed their evolutionary history by inferring phylogenetic relationships. Our comparative analysis indicated that teleost species possess a variable repertoire of genes related to the circadian clock gene families and that the actual diversity of these genes has been shaped by a variety of phenomena, such as the complete deletion of ohnologs, the differential retention of genes, and lineage-specific gene duplications. From a functional perspective, the subfunctionalization of two ohnolog genes (PER1a and PER1b) in zebrafish highlights the power of whole-genome duplications to generate biological diversity.
Evolution and Distribution of Teleost myomiRNAs: Functionally Diversified myomiRs in Teleosts.
Siddique, Bhuiyan Sharmin; Kinoshita, Shigeharu; Wongkarangkana, Chaninya; Asakawa, Shuichi; Watabe, Shugo
2016-06-01
Myosin heavy chain (MYH) genes belong to a multigene family, and the regulated expression of each member determines the physiological and contractile muscle properties. Among these, MYH6, MYH7, and MYH14 occupy unique positions in the mammalian MYH gene family because of their specific expression in slow/cardiac muscles and the existence of intronic micro(mi) RNAs. MYH6, MYH7, and MYH14 encode miR-208a, miR-208b, and miR-499, respectively. These MYH-encoded miRNAs are designated as myomiRs because of their muscle-specific expression and functions. In mammals, myomiRs and host MYHs form a transcription network involved in muscle fiber-type specification; thus, their genomic positions and expression patterns are well conserved. However, our previous studies revealed divergent distribution and expression of MYH14/miR-499 among teleosts, suggesting the unique evolution of myomiRs and host MYHs in teleosts. Here, we examined distribution and expression of myomiRs and host MYHs in various teleost species. The major cardiac MYH isoforms in teleosts are an intronless gene, atrial myosin heavy chain (amhc), and ventricular myosin heavy chain (vmhc) gene that encodes an intronic miRNA, miR-736. Phylogenetic analysis revealed that vmhc/miR-736 is a teleost-specific myomiR that differed from tetrapod MYH6/MYH7/miR-208s. Teleost genomes also contain species-specific orthologs in addition to vmhc and amhc, indicating complex gene duplication and gene loss events during teleost evolution. In medaka and torafugu, miR-499 was highly expressed in slow/cardiac muscles whereas the expression of miR-736 was quite low and not muscle specific. These results suggest functional diversification of myomiRs in teleosts along with the diversification of host MYHs.
Gambetta, Gregory A; Matthews, Mark A; Syvanen, Michael
2018-05-04
Xylella fastidiosa (Xf) is a Gram-negative bacterium inhabiting the plant vascular system. In most species this bacterium lives as a benign symbiote, but in several agriculturally important plants (e.g. coffee, citrus, grapevine) Xf is pathogenic. Xf has four loci encoding homologues to hemolysin RTX proteins, virulence factors involved in a wide range of plant-pathogen interactions. We show that all four genes are expressed during pathogenesis in grapevine. The sequences from these four genes have a complex repetitive structure. At the C-termini, sequence diversity between strains is what would be expected from orthologous genes. However, within strains there is no N-terminal homology, indicating these loci encode RTXs of different functions and/or specificities. More striking is that many of the orthologous loci between strains share this extreme variation at the N-termini. Thus these RTX orthologues are most easily visualized as fusions between the orthologous C-termini and different N-termini. Further, the four genes are found in operons having a peculiar structure with an extensively duplicated module encoding a small protein with homology to the N-terminal region of the full-length RTX. Surprisingly, some of these small peptides are most similar not to their corresponding full-length RTX, but to the N-termini of RTXs from other Xf strains, and even other remotely related species. These results demonstrate that these genes are expressed in planta during pathogenesis. Their structure suggests extensive evolutionary restructuring through horizontal gene transfers and heterologous recombination mechanisms. The sum of the evidence suggests these repetitive modules are a novel kind of mobile genetic element.
Lengyel, Peter
2014-07-11
My Ph.D. thesis in the laboratory of Severo Ochoa at New York University School of Medicine in 1962 included the determination of the nucleotide compositions of codons specifying amino acids. The experiments were based on the use of random copolyribonucleotides (synthesized by polynucleotide phosphorylase) as messenger RNA in a cell-free protein-synthesizing system. At Yale University, where I joined the faculty, my co-workers and I first studied the mechanisms of protein synthesis. Thereafter, we explored the interferons (IFNs), which were discovered as antiviral defense agents but were revealed to be components of a highly complex multifunctional system. We isolated pure IFNs and characterized IFN-activated genes, the proteins they encode, and their functions. We concentrated on a cluster of IFN-activated genes, the p200 cluster, which arose by repeated gene duplications and which encodes a large family of highly multifunctional proteins. For example, the murine protein p204 can be activated in numerous tissues by distinct transcription factors. It modulates cell proliferation and the differentiation of a variety of tissues by binding to many proteins. p204 also inhibits the activities of wild-type Ras proteins and Ras oncoproteins. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
The complete chloroplast genome of North American ginseng, Panax quinquefolius.
Han, Zeng-Jie; Li, Wei; Liu, Yuan; Gao, Li-Zhi
2016-09-01
We report the complete nucleotide sequence of the Panax quinquefolius chloroplast genome, obtained using next-generation sequencing technology. The genome size is 156 359 bp, including two inverted repeats (IRs) of 52 153 bp, separated by the large single-copy (LSC 86 184 bp) and small single-copy (SSC 18 081 bp) regions. This cp genome encodes 114 unigenes (80 protein-coding genes, four rRNA genes, and 30 tRNA genes), of which 18 are duplicated in the IR regions. Overall GC content of the genome is 38.08%. A phylogenomic analysis of the 10 complete chloroplast genomes from Araliaceae using Daucus carota from Apiaceae as outgroup showed that P. quinquefolius is closely related to the other two members of the genus Panax, P. ginseng and P. notoginseng.
Zhong, Jinshun; Kellogg, Elizabeth A
2015-01-01
Duplication, retention, and expression of CYCLOIDEA2 (CYC2)-like genes are thought to affect evolution of corolla symmetry. However, exactly which changes in CYC2-like genes correlate with the origin of corolla zygomorphy, and how, is poorly understood. We inferred and calibrated a densely sampled phylogeny of CYC2-like genes across the Lamiales and examined their expression in early diverging (EDL) and higher core clades (HCL). CYC2-like genes duplicated extensively in Lamiales, at least six times in core Lamiales (CL) around the Cretaceous-Paleogene (K-Pg) boundary, and seven more times in EDL more recently. Nested duplications and losses of CYC2-like paralogs are pervasive but may not correlate with transitions in corolla symmetry. We found evidence for dN/dS (ω) variation following gene duplications. CYC2-like paralogs in HCL show differential expression with higher expression in adaxial petals. Asymmetric expression but not recurrent duplication of CYC2-like genes correlates with the origin of corolla zygomorphy. Changes in both cis-regulatory and coding domains of CYC2-like genes are probably crucial for the evolution of corolla zygomorphy. Multiple selection regimes appear likely to play important roles in gene retention. The parallel duplications of CYC2-like genes occurred after the initial diversification of bumble bees and Euglossine bees. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
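For readers unfamiliar with the statistic mentioned in the abstract above, ω is conventionally defined as the ratio of the nonsynonymous substitution rate (dN) to the synonymous substitution rate (dS). The interpretation ranges shown are the standard textbook reading and are not values reported in this abstract.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\omega < 1 \ \text{(purifying selection)}, \quad
\omega \approx 1 \ \text{(neutral evolution)}, \quad
\omega > 1 \ \text{(positive selection)}
```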
Parmar, Manoj B; Wright, Jonathan M
2013-11-01
A whole-genome duplication (WGD) early in the teleost fish lineage makes fish ideal organisms to study the fate of duplicated genes and underlying evolutionary trajectories that have led to the retention of ohnologous gene duplicates in fish genomes. Here, we compare the genomic organization and tissue-specific transcription of the ohnologous fabp7 and fabp10 genes in medaka, three-spined stickleback, and spotted green pufferfish to the well-studied duplicated fabp7 and fabp10 genes of zebrafish. Teleost fabp7 and fabp10 genes contain four exons interrupted by three introns. Polypeptide sequences of Fabp7 and Fabp10 show the highest sequence identity and similarity with their orthologs from vertebrates. Orthology was evident as the ohnologous Fabp7 and Fabp10 polypeptides of teleost fishes each formed distinct clades and clustered together with their orthologs from other vertebrates in a phylogenetic tree. Furthermore, ohnologous teleost fabp7 and fabp10 genes exhibit conserved gene synteny with human FABP7 and chicken FABP10, respectively, which provides compelling evidence that the duplicated fabp7 and fabp10 genes of teleost fishes most likely arose from the well-documented WGD. The tissue-specific distribution of fabp7a, fabp7b, fabp10a, and fabp10b transcripts provides evidence of diverged spatial transcriptional regulation between ohnologous gene duplicates of fabp7 and fabp10 in teleost fishes.
Structure of allelic variants of subtype 5 of histone H1 in pea Pisum sativum L.
Bogdanova, V S; Lester, D R; Berdnikov, V A; Andersson, I
2005-06-01
The pea genome contains seven histone H1 genes encoding different subtypes. Previously, the DNA sequence of only one gene, His1, coding for the subtype H1-1, had been identified. We isolated a histone H1 allele from a pea genomic DNA library. Data from the electrophoretic mobility of the pea H1 subtypes and their N-bromosuccinimide cleavage products indicated that the newly isolated gene corresponded to the H1-5 subtype encoded by His5. We confirmed this result by sequencing the gene from three pea lines with H1-5 allelic variants of altered electrophoretic mobility. The allele of the slow H1-5 variant differed from the standard allele by a nucleotide substitution that caused the replacement of the positively charged lysine with asparagine in the DNA-interacting domain of the histone molecule. A temperature-related occurrence had previously been demonstrated for this H1-5 variant in a study on a worldwide collection of pea germplasm. The variant tended to occur at higher frequencies in geographic regions with a cold climate. The fast allelic variant of H1-5 displayed a deletion resulting in the loss of a duplicated pentapeptide in the C-terminal domain.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene.
Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil
2007-11-29
Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene
Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil
2007-01-01
Background: Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results: Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion: The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints. PMID:18047649
Circular DNA Intermediate in the Duplication of Nile Tilapia vasa Genes
Fujimura, Koji; Conte, Matthew A.; Kocher, Thomas D.
2011-01-01
vasa is a highly conserved RNA helicase involved in animal germ cell development. Among vertebrate species, it is typically present as a single copy per genome. Here we report the isolation and sequencing of BAC clones for Nile tilapia vasa genes. Contrary to a previous report that Nile tilapia have a single copy of the vasa gene, we find evidence for at least three vasa gene loci. The vasa gene locus was duplicated from the original site and integrated into two distant novel sites. For one of these insertions we find evidence that the duplication was mediated by a circular DNA intermediate. This mechanism of gene duplication may explain the origin of isolated gene duplicates during the evolution of fish genomes. These data provide a foundation for studying the role of multiple vasa genes in the development of tilapia gonads, and will contribute to investigations of the molecular mechanisms of sex determination and evolution in cichlid fishes. PMID:22216289
Muthamilarasan, Mehanathan; Khandelwal, Rohit; Yadav, Chandra Bhan; Bonthala, Venkata Suresh; Khan, Yusuf; Prasad, Manoj
2014-01-01
MYB proteins represent one of the largest transcription factor families in plants, playing important roles in diverse developmental and stress-responsive processes. Given this significance, genome-wide analyses of the family have been conducted in almost all land plants except foxtail millet. Foxtail millet (Setaria italica L.) is a model crop for investigating systems biology of millets and bioenergy grasses. Further, the crop is also known for its potential abiotic stress-tolerance. In this context, a comprehensive genome-wide survey was conducted and 209 MYB protein-encoding genes were identified in foxtail millet. All 209 S. italica MYB (SiMYB) genes were physically mapped onto nine chromosomes of foxtail millet. A gene duplication study showed that segmental and tandem duplications have occurred in the genome, resulting in expansion of this gene family. The protein domain investigation classified SiMYB proteins into three classes according to the number of MYB repeats present. The phylogenetic analysis categorized SiMYBs into ten groups (I - X). SiMYB-based comparative mapping revealed a maximum orthology between foxtail millet and sorghum, followed by maize, rice and Brachypodium. Heat map analysis showed tissue-specific expression patterns of predominant SiMYB genes. Expression profiling of candidate MYB genes against abiotic stresses and hormone treatments using qRT-PCR revealed specific and/or overlapping expression patterns of SiMYBs. Taken together, the present study provides a foundation for evolutionary and functional characterization of MYB TFs in foxtail millet to dissect their functions in response to environmental stimuli. PMID:25279462
Pan, Xue; Siloto, Rodrigo M P; Wickramarathna, Aruna D; Mietkiewska, Elzbieta; Weselake, Randall J
2013-08-16
The oil from flax (Linum usitatissimum L.) has high amounts of α-linolenic acid (ALA; 18:3(cis)(Δ9,12,15)) and is one of the richest sources of omega-3 polyunsaturated fatty acids (ω-3-PUFAs). To produce ∼57% ALA in triacylglycerol (TAG), it is likely that flax contains enzymes that can efficiently transfer ALA to TAG. To test this hypothesis, we conducted a systematic characterization of TAG-synthesizing enzymes from flax. We identified several genes encoding acyl-CoA:diacylglycerol acyltransferases (DGATs) and phospholipid:diacylglycerol acyltransferases (PDATs) from the flax genome database. Due to recent genome duplication, duplicated gene pairs have been identified for all genes except DGAT2-2. Analysis of gene expression indicated that two DGAT1, two DGAT2, and four PDAT genes were preferentially expressed in flax embryos. Yeast functional analysis showed that DGAT1, DGAT2, and two PDAT enzymes restored TAG synthesis when produced recombinantly in yeast H1246 strain. The activity of particular PDAT enzymes (LuPDAT1 and LuPDAT2) was stimulated by the presence of ALA. Further seed-specific expression of flax genes in Arabidopsis thaliana indicated that DGAT1, PDAT1, and PDAT2 had significant effects on seed oil phenotype. Overall, this study indicated the existence of unique PDAT enzymes from flax that are able to preferentially catalyze the synthesis of TAG containing ALA acyl moieties. The identified LuPDATs may have practical applications for increasing the accumulation of ALA and other polyunsaturated fatty acids in oilseeds for food and industrial applications.
Jiang, Minghui; Ash, Ryan T.; Baker, Steven A.; Suter, Bernhard; Ferguson, Andrew; Park, Jiyoung; Rudy, Jessica; Torsky, Sergey P.; Chao, Hsiao-Tuan; Zoghbi, Huda Y.
2013-01-01
MECP2 duplication syndrome is a childhood neurological disorder characterized by intellectual disability, autism, motor abnormalities, and epilepsy. The disorder is caused by duplications spanning the gene encoding methyl-CpG-binding protein-2 (MeCP2), a protein involved in the modulation of chromatin and gene expression. MeCP2 is thought to play a role in maintaining the structural integrity of neuronal circuits. Loss of MeCP2 function causes Rett syndrome and results in abnormal dendritic spine morphology and decreased pyramidal dendritic arbor complexity and spine density. The consequences of MeCP2 overexpression on dendritic pathophysiology remain unclear. We used in vivo two-photon microscopy to characterize layer 5 pyramidal neuron spine turnover and dendritic arborization as a function of age in transgenic mice expressing the human MECP2 gene at twice the normal levels of MeCP2 (Tg1; Collins et al., 2004). We found that spine density in terminal dendritic branches is initially higher in young Tg1 mice but falls below control levels after postnatal week 12, approximately correlating with the onset of behavioral symptoms. Spontaneous spine turnover rates remain high in older Tg1 animals compared with controls, reflecting the persistence of an immature state. Both spine gain and loss rates are higher, with a net bias in favor of spine elimination. Apical dendritic arbors in both simple- and complex-tufted layer 5 Tg1 pyramidal neurons have more branches of higher order, indicating that MeCP2 overexpression induces dendritic overgrowth. P70S6K was hyperphosphorylated in Tg1 somatosensory cortex, suggesting that elevated mTOR signaling may underlie the observed increase in spine turnover and dendritic growth. PMID:24336718
Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J
2002-02-22
The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
Rensing, Stefan A; Ick, Julia; Fawcett, Jeffrey A; Lang, Daniel; Zimmer, Andreas; Van de Peer, Yves; Reski, Ralf
2007-01-01
Background: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all, seed plants have undergone one or more genome duplications in their evolutionary past. Results: In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Conclusion: Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants. PMID:17683536
Borges, Sofia; Cravo, Pedro; Creasey, Alison; Fawcett, Richard; Modrzynska, Katarzyna; Rodrigues, Louise; Martinelli, Axel; Hunt, Paul
2011-01-01
Multidrug-resistant Plasmodium falciparum malaria parasites pose a threat to effective drug control, even to artemisinin-based combination therapies (ACTs). Here we used linkage group selection and Solexa whole-genome resequencing to investigate the genetic basis of resistance to component drugs of ACTs. Using the rodent malaria parasite P. chabaudi, we analyzed the uncloned progeny of a genetic backcross between the mefloquine-, lumefantrine-, and artemisinin-resistant mutant AS-15MF and a genetically distinct sensitive clone, AJ, following drug treatment. Genomewide scans of selection showed that parasites surviving each drug treatment bore a duplication of a segment of chromosome 12 (translocated to chromosome 04) present in AS-15MF. Whole-genome resequencing identified the size of the duplicated segment and its position on chromosome 4. The duplicated fragment extends for ∼393 kbp and contains over 100 genes, including mdr1, encoding the multidrug resistance P-glycoprotein homologue 1. We therefore show that resistance to chemically distinct components of ACTs is mediated by the same genetic mutation, highlighting a possible limitation of these therapies. PMID:21709099
Vandelle, Elodie; Vannozzi, Alessandro; Wong, Darren; Danzi, Davide; Digby, Anne-Marie; Dal Santo, Silvia; Astegno, Alessandra
2018-06-04
Calcium (Ca2+) is a ubiquitous key second messenger in plants, where it modulates many developmental and adaptive processes in response to various stimuli. Several proteins containing Ca2+-binding domains have been identified in plants, including calmodulin (CaM) and calmodulin-like (CML) proteins, which play critical roles in translating Ca2+ signals into proper cellular responses. In this work, a genome-wide analysis conducted in Vitis vinifera identified three CaM- and 62 CML-encoding genes. We assigned gene family nomenclature, analyzed gene structure, chromosomal location and gene duplication, as well as protein motif organization. The phylogenetic clustering revealed a total of eight subgroups, including one unique clade of VviCaMs distinct from VviCMLs. VviCaMs were found to contain four EF-hand motifs whereas VviCML proteins have one to five. Most grapevine CML genes were intronless, while VviCaMs were intron rich. All the genes were well spread among the 19 grapevine chromosomes and displayed a high level of duplication. The expression profiling of VviCaM/VviCML genes revealed a broad expression pattern across all grape organs and tissues at various developmental stages, and a significant modulation in biotic stress-related responses. Our results highlight the complexity of the CaM/CML protein family in grapevine as well, supporting the versatile role of its different members in modulating cellular responses to various stimuli, in particular to biotic stresses. This work lays the foundation for further functional and structural studies on specific grapevine CaMs/CMLs in order to better understand the role of Ca2+-binding proteins in grapevine and to explore their potential for further biotechnological applications. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Screening of duplicated loci reveals hidden divergence patterns in a complex salmonid genome
Limborg, Morten T.; Larson, Wesley; Seeb, Lisa W.; Seeb, James E.
2017-01-01
A whole-genome duplication (WGD) doubles the entire genomic content of a species and is thought to have catalysed adaptive radiation in some polyploid-origin lineages. However, little is known about general consequences of a WGD because gene duplicates (i.e., paralogs) are commonly filtered in genomic studies; such filtering may remove substantial portions of the genome in data sets from polyploid-origin species. We demonstrate a new method that enables genome-wide scans for signatures of selection at both nonduplicated and duplicated loci by taking locus-specific copy number into account. We apply this method to RAD sequence data from different ecotypes of a polyploid-origin salmonid (Oncorhynchus nerka) and reveal signatures of divergent selection that would have been missed if duplicated loci were filtered. We also find conserved signatures of elevated divergence at pairs of homeologous chromosomes with residual tetrasomic inheritance, suggesting that joint evolution of some nondiverged gene duplicates may affect the adaptive potential of these genes. These findings illustrate that including duplicated loci in genomic analyses enables novel insights into the evolutionary consequences of WGDs and local segmental gene duplications.
2010-01-01
Background: Salmonids are one of the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results: From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions: 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate. PMID:20433749
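The idea of testing whether one duplicate diverges faster than the other, using a diploid out-group, can be illustrated with a simpler stand-in for the pairwise dN/dS comparison reported above: Tajima's relative rate test. This is a minimal sketch and not the authors' method; the function name and the toy sequences are hypothetical, and the three sequences are assumed to be already aligned to equal length.

```python
def tajima_relative_rate(copy_a: str, copy_b: str, outgroup: str):
    """Tajima's relative rate test on three aligned sequences.

    Counts sites where only copy_a differs from the other two sequences
    and sites where only copy_b does, then returns both counts and the
    chi-square statistic (1 degree of freedom) for equal rates.
    """
    if not (len(copy_a) == len(copy_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    unique_a = unique_b = 0
    for a, b, o in zip(copy_a.upper(), copy_b.upper(), outgroup.upper()):
        # Skip alignment gaps and ambiguous bases.
        if "-" in (a, b, o) or "N" in (a, b, o):
            continue
        if b == o and a != o:
            unique_a += 1      # change specific to copy A
        elif a == o and b != o:
            unique_b += 1      # change specific to copy B
    total = unique_a + unique_b
    chi2 = (unique_a - unique_b) ** 2 / total if total else 0.0
    return unique_a, unique_b, chi2


# Toy example with made-up sequences (not real salmon or pike data).
result = tajima_relative_rate("CCCCAAAAAA", "AAAAAAAAAG", "AAAAAAAAAA")
```

A chi-square value above 3.84 would reject equal rates at the 5% level with one degree of freedom; the toy example stays below that threshold because it has very few informative sites.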
Hartmann, Fanny E; Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana
2018-04-01
Gene presence-absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence-absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence-absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence-absence polymorphism in the two species. Genes displaying presence-absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence-absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence-absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies.
Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana
2018-01-01
Gene presence–absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence–absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence–absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence–absence polymorphism in the two species. Genes displaying presence–absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence–absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence–absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies. PMID:29722826
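The statement that absence alleles are rare and skewed toward low frequencies rests on a per-gene absence frequency across isolates. A minimal sketch of that bookkeeping follows; it assumes presence/absence calls have already been made (for example from read coverage), and the function name, gene labels and calls are hypothetical rather than taken from the study.

```python
def absence_frequencies(presence_calls: dict) -> dict:
    """Return the absence-allele frequency of each polymorphic gene.

    presence_calls maps a gene name to a list of booleans, one per
    isolate (True = gene present). Genes present in all isolates or
    absent from all isolates are not polymorphic and are left out.
    """
    polymorphic = {}
    for gene, calls in presence_calls.items():
        n_absent = sum(1 for present in calls if not present)
        if 0 < n_absent < len(calls):
            polymorphic[gene] = n_absent / len(calls)
    return polymorphic


# Hypothetical calls for three genes in four isolates.
calls = {
    "geneA": [True, True, True, True],      # fixed presence
    "geneB": [True, False, True, True],     # polymorphic
    "geneC": [False, False, False, False],  # fixed absence
}
freqs = absence_frequencies(calls)
```

Plotting a histogram of the returned frequencies would give the absence-allele frequency spectrum that the abstract describes qualitatively.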
Workshop on Self-Determination in Developing and Evolving Systems
1994-02-18
processes of duplication (e.g. gene duplication, cell duplication, structural enlargement), responses to selfish DNA (e.g. suppression of outlaw...direct their development, then the genes would need some form of environmental feedback. Are there any plausible mechanisms for such feedback? 3. What is...evolutionary innovation, what is the contribution of random mutations, directed mutation, gene conversion, symbiogenesis, fusion, jumping genes or other
Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.
2015-01-01
The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392
Kaur, Kiranpreet; Bakke, Marit Jørgensen; Nilsen, Frank; Horsberg, Tor Einar
2015-01-01
Acetylcholinesterase (AChE) is an important enzyme in cholinergic synapses. Most arthropods have two genes (ace1 and ace2), but only one encodes the predominant synaptic AChE, the main target for organophosphates. Resistance towards organophosphates is widespread in the marine arthropod Lepeophtheirus salmonis. To understand this trait, it is essential to characterize the gene(s) coding for AChE(s). The full-length cDNA sequences encoding two AChEs in L. salmonis were molecularly characterized in this study. The two ace genes were highly similar (83.5% similarity at the protein level). Alignment to the L. salmonis genome revealed that the two genes were located close to each other (separated by just 26.4 kbp), resulting from a recent gene duplication. Both proteins had all the typical features of functional AChE and clustered together with AChE-type 1 proteins in other species, an observation that has not been described in other arthropods. We therefore concluded that two versions of the ace1 gene are present in L. salmonis, named ace1a and ace1b. Ace1a was predominantly expressed in different developmental stages compared to ace1b and was possibly active in the cephalothorax, indicating that ace1a is more likely to play the major role in cholinergic synaptic transmission. The study is essential to understand the role of AChEs in resistance against organophosphates in L. salmonis. PMID:25938836
Makeyev, A V; Chkheidze, A N; Liebhaber, S A
1999-08-27
Gene families normally expand by segmental genomic duplication and subsequent sequence divergence. Although copies of partially or fully processed mRNA transcripts are occasionally retrotransposed into the genome, they are usually nonfunctional ("processed pseudogenes"). The two major cytoplasmic poly(C)-binding proteins in mammalian cells, alphaCP-1 and alphaCP-2, are implicated in a spectrum of post-transcriptional controls. These proteins are highly similar in structure and are encoded by closely related mRNAs. Based on this close relationship, we were surprised to find that one of these proteins, alphaCP-2, was encoded by a multiexon gene, whereas the second gene, alphaCP-1, was identical to and colinear with its mRNA. The alphaCP-1 and alphaCP-2 genes were shown to be single copy and were mapped to separate chromosomes. The linkage groups encompassing each of the two loci were concordant between mice and humans. These data suggested that the alphaCP-1 gene was generated by retrotransposition of a fully processed alphaCP-2 mRNA and that this event occurred well before the mammalian radiation. The stringent structural conservation of alphaCP-1 and its ubiquitous tissue distribution suggested that the retrotransposed alphaCP-1 gene was rapidly recruited to a function critical to the cell and distinct from that of its alphaCP-2 progenitor.
Kauzlaric, Annamaria; Ecco, Gabriela; Cassano, Marco; Duc, Julien; Imbeault, Michael; Trono, Didier
2017-01-01
KRAB-containing poly-zinc finger proteins (KZFPs) constitute the largest family of transcription factors encoded by mammalian genomes, and growing evidence indicates that they fulfill functions critical to both embryonic development and maintenance of adult homeostasis. KZFP genes underwent broad and independent waves of expansion in many higher vertebrate lineages, yet comprehensive studies of members harbored by a given species are scarce. Here we present a thorough analysis of KZFP genes and related units in the murine genome. We first identified about twice as many elements as previously annotated as either KZFP genes or pseudogenes, notably by assigning to this family an entity formerly considered as a large group of Satellite repeats. We then could delineate an organization in clusters distributed throughout the genome, with signs of recombination, translocation, duplication and seeding of new sites by retrotransposition of KZFP genes and related genetic units (KZFP/rGUs). Moreover, we harvested evidence indicating that closely related paralogs had evolved through both drifting and shifting of sequences encoding zinc finger arrays. Finally, we could demonstrate that the KAP1-SETDB1 repressor complex tames the expression of KZFP/rGUs within clusters, yet that the primary targets of this regulation are not the KZFP/rGUs themselves but enhancers contained in neighboring endogenous retroelements and that, underneath, KZFPs conserve highly individualized patterns of expression. PMID:28334004
Maciejowski, John; Ahn, James Hyungsoo; Cipriani, Patricia Giselle; Killian, Darrell J.; Chaudhary, Aisha L.; Lee, Ji Inn; Voutev, Roumen; Johnsen, Robert C.; Baillie, David L.; Gunsalus, Kristin C.; Fitch, David H. A.; Hubbard, E. Jane Albert
2005-01-01
We report molecular genetic studies of three genes involved in early germ-line proliferation in Caenorhabditis elegans that lend unexpected insight into a germ-line/soma functional separation of autosomal/X-linked duplicated gene pairs. In a genetic screen for germ-line proliferation-defective mutants, we identified mutations in rpl-11.1 (L11 protein of the large ribosomal subunit), pab-1 [a poly(A)-binding protein], and glp-3/eft-3 (an elongation factor 1-α homolog). All three are members of autosome/X gene pairs. Consistent with a germ-line-restricted function of rpl-11.1 and pab-1, mutations in these genes extend life span and cause gigantism. We further examined the RNAi phenotypes of the three sets of rpl genes (rpl-11, rpl-24, and rpl-25) and found that for the two rpl genes with autosomal/X-linked pairs (rpl-11 and rpl-25), zygotic germ-line function is carried by the autosomal copy. Available RNAi results for highly conserved autosomal/X-linked gene pairs suggest that other duplicated genes may follow a similar trend. The three rpl and the pab-1/2 duplications predate the divergence between C. elegans and C. briggsae, while the eft-3/4 duplication appears to have occurred in the lineage to C. elegans after it diverged from C. briggsae. The duplicated C. briggsae orthologs of the three C. elegans autosomal/X-linked gene pairs also display functional differences between paralogs. We present hypotheses for evolutionary mechanisms that may underlie germ-line/soma subfunctionalization of duplicated genes, taking into account the role of X chromosome silencing in the germ line and analogous mammalian phenomena. PMID:15687263
Fukushige, Tetsunari; Goszczynski, Barbara; Tian, Helen; McGhee, James D
2003-10-01
We describe the elt-4 gene from the nematode Caenorhabditis elegans. elt-4 is predicted to encode a very small (72 residues, 8.1 kD) GATA-type zinc finger transcription factor. The elt-4 gene is located approximately 5 kb upstream of the C. elegans elt-2 gene, which also encodes a GATA-type transcription factor; the zinc finger DNA-binding domains are highly conserved (24/25 residues) between the two proteins. The elt-2 gene is expressed only in the intestine and is essential for normal intestinal development. This article explores whether elt-4 also has a role in intestinal development. Reporter fusions to the elt-4 promoter or reporter insertions into the elt-4 coding regions show that elt-4 is indeed expressed in the intestine, beginning at the 1.5-fold stage of embryogenesis and continuing into adulthood. elt-4 reporter fusions are also expressed in nine cells of the posterior pharynx. Ectopic expression of elt-4 cDNA within the embryo does not cause detectable ectopic expression of biochemical markers of gut differentiation; furthermore, ectopic elt-4 expression neither inhibits nor enhances the ectopic marker expression caused by ectopic elt-2 expression. A deletion allele of elt-4 was isolated but no obvious phenotype could be detected, either in the gut or elsewhere; brood sizes, hatching efficiencies, and growth rates were indistinguishable from wild type. We found no evidence that elt-4 provided backup functions for elt-2. We used microarray analysis to search for genes that might be differentially expressed between L1 larvae of the elt-4 deletion strain and wild-type worms. Paired hybridizations were repeated seven times, allowing us to conclude, with some confidence, that no candidate target transcript could be identified as significantly up- or downregulated by loss of elt-4 function. 
In vitro binding experiments could not detect specific binding of ELT-4 protein to candidate binding sites (double-stranded oligonucleotides containing single or multiple WGATAR sequences); ELT-4 protein neither enhanced nor inhibited the strong sequence-specific binding of the ELT-2 protein. Whereas ELT-2 protein is a strong transcriptional activator in yeast, ELT-4 protein has no such activity under similar conditions, nor does it influence the transcriptional activity of coexpressed ELT-2 protein. Although an elt-2 homolog was easily identified in the genomic sequence of the related nematode C. briggsae, no elt-4 homolog could be identified. Analysis of the changes in silent third codon positions within the DNA-binding domains indicates that elt-4 arose as a duplication of elt-2, some 25-55 MYA. Thus, elt-4 has survived far longer than the average duplicated gene in C. elegans, even though no obvious biological function could be detected. elt-4 provides an interesting example of a tandemly duplicated gene that may originally have been the same size as elt-2 but has gradually been whittled down to its present size of little more than a zinc finger. Although elt-4 must confer (or must have conferred) some selective advantage to C. elegans, we suggest that its ultimate evolutionary fate will be disappearance from the C. elegans genome.
Callebaut, Isabelle; Laurin, Michel; Pascal, Géraldine; Poupon, Anne; Goudet, Ghylène; Monget, Philippe
2012-01-01
Genes encoding proteins involved in sperm-egg interaction and fertilization exhibit a particularly fast evolution and may participate in prezygotic species isolation [1], [2]. Some of them (ZP3, ADAM1, ADAM2, ACR and CD9) have individually been shown to evolve under positive selection [3], [4], suggesting a role of positive Darwinian selection on sperm-egg interaction. However, the genes involved in this biological function have not been systematically and exhaustively studied with an evolutionary perspective, in particular across vertebrates with internal and external fertilization. Here we show that 33 genes among the 69 that have been experimentally shown to be involved in fertilization in at least one taxon in vertebrates are under positive selection. Moreover, we identified 17 pseudogenes and 39 genes that have at least one duplicate in one species. For 15 genes, we found neither positive selection, nor gene copies or pseudogenes. Genes of teleosts, especially genes involved in sperm-oolemma fusion, appear to be more frequently under positive selection than genes of birds and eutherians. In contrast, pseudogenization, gene loss and gene gain are more frequent in eutherians. Thus, each of the 19 studied vertebrate species exhibits a unique signature characterized by gene gain and loss, as well as position of amino acids under positive selection. Reflecting these clade-specific signatures, teleosts and eutherian mammals are recovered as clades in a parsimony analysis. Interestingly the same analysis places Xenopus apart from teleosts, with which it shares the primitive external fertilization, and locates it along with amniotes (which share internal fertilization), suggesting that external or internal environmental conditions of germ cell interaction may not be the unique factors that drive the evolution of fertilization genes. 
Our work should improve our understanding of the fertilization process and of the establishment of reproductive barriers, for example by offering new leads for experiments on genes identified as positively selected. PMID:22957080
Gene and domain duplication in the chordate Otx gene family: insights from amphioxus Otx.
Williams, N A; Holland, P W
1998-05-01
We report the genomic organization and deduced protein sequence of a cephalochordate member of the Otx homeobox gene family (AmphiOtx) and show its probable single-copy state in the genome. We also present molecular phylogenetic analysis indicating that there was a single ancestral Otx gene in the first chordates which was duplicated in the vertebrate lineage after it had split from the lineage leading to the cephalochordates. Duplication of a C-terminal protein domain has occurred specifically in the vertebrate lineage, strengthening the case for a single Otx gene in an ancestral chordate whose gene structure has been retained in an extant cephalochordate. Comparative analysis of protein sequences and published gene expression patterns suggests that the ancestral chordate Otx gene had roles in patterning the anterior mesendoderm and central nervous system. These roles were elaborated following Otx gene duplication in vertebrates, accompanied by regulatory and structural divergence, particularly of Otx1 descendant genes.
Maroni, G.; Wise, J.; Young, J. E.; Otto, E.
1987-01-01
A search for duplications of the Drosophila melanogaster metallothionein gene (Mtn) yielded numerous examples of this type of chromosomal rearrangement. These duplications are distributed widely—we found them in samples from four continents, and they are functional—larvae carrying Mtn duplications produce more Mtn RNA and tolerate increased cadmium and copper concentrations. Six different duplication types were characterized by restriction-enzyme analyses using probes from the Mtn region. The restriction maps show that in four cases the sequences, ranging in size between 2.2 and 6.0 kb, are arranged as direct, tandem repeats; in two other cases, this basic pattern is modified by the insertion of a putative transposable element into one of the repeated units. Duplications of the D. melanogaster metallothionein gene such as those that we found in natural populations may represent early stages in the evolution of a gene family. PMID:2828157
Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi
2013-01-01
The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs. PMID:23396833
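The regression step described in this abstract (principal component analysis followed by logistic regression on retention) can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the two toy features, the labels, the function names and all parameter values below are hypothetical, only the first principal component is used, and the component is found by simple power iteration.

```python
import math

def standardize(rows):
    """Scale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d  # arbitrary starting direction (an assumption of this sketch)
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec

def project(row, component):
    """Score of one gene along the principal component."""
    return sum(x * c for x, c in zip(row, component))

def fit_logistic(scores, labels, learning_rate=0.1, epochs=2000):
    """One-predictor logistic regression fitted by batch gradient descent."""
    w, b, n = 0.0, 0.0, len(scores)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

def retention_probability(score, w, b):
    """Predicted probability that a gene with this score is retained."""
    return 1.0 / (1.0 + math.exp(-(w * score + b)))

# Hypothetical toy data: two made-up features per gene, 1 = retained after WGD.
features = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.5], [6.0, 7.0], [7.0, 8.5], [8.0, 9.0]]
retained = [0, 0, 0, 1, 1, 1]
scaled = standardize(features)
pc1 = first_principal_component(scaled)
pc_scores = [project(row, pc1) for row in scaled]
weight, intercept = fit_logistic(pc_scores, retained)
```

Regressing on principal components rather than on the raw features is a common way to handle correlated predictors; a fuller analysis would keep several components and typically rely on an established statistics library.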
The HOPA Gene Dodecamer Duplication Is Not a Significant Etiological Factor in Autism.
ERIC Educational Resources Information Center
Michaelis, Ron C.; Copeland-Yates, Susan A.; Sossey-Alaoui, Khalid; Skinner, Cindy; Friez, Michael J.; Longshore, John W.; Simensen, Richard J.; Schroer, Richard J.; Stevenson, Roger E.
2000-01-01
A study of 202 patients with autism found the incidence of a dodecamer duplication in the HOPA gene was not significantly different between patients and controls. Three female patients inherited the duplication from nonautistic fathers. Also, there was no systematic skewing of X inactivation in female patients with the duplication. (Contains…
Pod Corn Is Caused by Rearrangement at the Tunicate1 Locus
Han, Jong-Jin; Jackson, David; Martienssen, Robert
2012-01-01
Pod corn (Zea mays var tunicata) was once regarded as ancestral to cultivated maize, and was prized by pre-Columbian cultures for its magical properties. Tunicate1 (Tu1) is a dominant pod corn mutation in which kernels are completely enclosed in leaflike glumes. Here we show that Tu1 encodes a MADS box transcription factor expressed in leaves whose 5′ regulatory region is fused by a 1.8-Mb chromosomal inversion to the 3′ region of a gene expressed in the inflorescence. Both genes are further duplicated, accounting for classical derivative alleles isolated by recombination, and Tu1 transgenes interact with these derivative alleles in a dose-dependent manner. In young ear primordia, TU1 proteins are nuclearly localized in specific cells at the base of spikelet pair meristems. Tu1 branch determination defects resemble those in ramosa mutants, which encode regulatory proteins expressed in these same cells, accounting for synergism in double mutants discovered almost 100 years ago. The Tu1 rearrangement is not found in ancestral teosinte and arose after domestication of maize. PMID:22829149
2013-01-01
Background Microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment has been found for pébrine. Compared to other known Nosema parasites, N. bombycis can unusually parasitize a broad range of hosts. To gain some insights into the underlying genetic mechanism of pathological ability and host range expansion in this parasite, a comparative genomic approach was used. The genomes of two Nosema parasites, N. bombycis and N. antheraeae (an obligatory parasite to undomesticated silkworms Antheraea pernyi), were sequenced and compared with their distantly related species, N. ceranae (an obligatory parasite to honey bees). Results Our comparative genomics analysis shows that the N. bombycis genome has greatly expanded due to the following three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been found in such abundance in any reported microsporidian genome. Our relative dating analysis further indicated that these duplication events have arisen recently over very short evolutionary time. Furthermore, several duplicated genes involved in the cytotoxic metabolic pathway were found to undergo positive selection, suggestive of the role of duplicated genes in the adaptive evolution of pathogenic ability. Conclusions Genome expansion is rarely considered as the evolutionary outcome acting on those highly reduced and compact parasitic microsporidian genomes. 
This study, for the first time, demonstrates that the parasitic genomes can expand, rather than shrink, through several common molecular mechanisms such as gene duplication, horizontal gene transfer, and transposable element expansion. We also showed that the duplicated genes can serve as raw materials for evolutionary innovations possibly contributing to the increase of pathogenic ability. Based on our research, we propose that duplicated genes of N. bombycis should be treated as primary targets for treatment designs against pébrine. PMID:23496955
Maintenance and Loss of Duplicated Genes by Dosage Subfunctionalization.
Gout, Jean-Francois; Lynch, Michael
2015-08-01
Whole-genome duplications (WGDs) have contributed to gene-repertoire enrichment in many eukaryotic lineages. However, most duplicated genes are eventually lost and it is still unclear why some duplicated genes are evolutionarily successful whereas others quickly turn to pseudogenes. Here, we show that dosage constraints are major factors opposing post-WGD gene loss in several Paramecium species that share a common ancestral WGD. We propose a model where a majority of WGD-derived duplicates preserve their ancestral function and are retained to produce enough of the proteins performing this same ancestral function. Under this model, the expression level of individual duplicated genes can evolve neutrally as long as they maintain a roughly constant summed expression, and this allows random genetic drift toward uneven contributions of the two copies to total expression. Our analysis suggests that once a high level of imbalance is reached, which can require substantial lengths of time, the copy with the lowest expression level contributes a small enough fraction of the total expression that selection no longer opposes its loss. Extension of our analysis to yeast species sharing a common ancestral WGD yields similar results, suggesting that duplicated-gene retention for dosage constraints followed by divergence in expression level and eventual deterministic gene loss might be a universal feature of post-WGD evolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
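The verbal model in this abstract (constant summed expression, neutral drift in each copy's share, loss permitted once one copy contributes very little) can be made concrete with a toy random walk. This is only an illustrative sketch under simplifying assumptions and not the authors' analysis: expression is counted in arbitrary integer units, each step moves one unit from one copy to the other with equal probability, and the function name, the threshold and all default values are hypothetical.

```python
import random

def steps_until_loss_permitted(total_units=100, loss_threshold_units=5,
                               max_steps=1_000_000, seed=None):
    """Toy neutral drift of expression share between two duplicated genes.

    The summed expression is fixed at `total_units`, so only the units
    contributed by copy A are tracked. Returns the number of steps taken
    until the weaker copy contributes fewer than `loss_threshold_units`,
    or None if that does not happen within `max_steps`.
    """
    rng = random.Random(seed)
    units_a = total_units // 2  # both copies start with equal expression
    for step in range(1, max_steps + 1):
        # Neutral change: one unit shifts between the copies, sum unchanged.
        units_a += rng.choice((-1, 1))
        units_a = min(max(units_a, 0), total_units)
        minor_units = min(units_a, total_units - units_a)
        if minor_units < loss_threshold_units:
            return step
    return None

# Waiting times for a few independent toy gene pairs.
waiting_times = [steps_until_loss_permitted(seed=s) for s in range(5)]
```

Because the walk starts balanced, the waiting time before the minor copy becomes dispensable is long relative to a single step and varies widely between runs, which mirrors the abstract's point that reaching a high level of imbalance can require substantial time.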
Divergent evolution of multiple virus-resistance genes from a progenitor in Capsicum spp.
Kim, Saet-Byul; Kang, Won-Hee; Huy, Hoang Ngoc; Yeom, Seon-In; An, Jeong-Tak; Kim, Seungill; Kang, Min-Young; Kim, Hyun Jung; Jo, Yeong Deuk; Ha, Yeaseong; Choi, Doil; Kang, Byoung-Cheorl
2017-01-01
Plants have evolved hundreds of nucleotide-binding and leucine-rich domain proteins (NLRs) as potential intracellular immune receptors, but the evolutionary mechanism leading to the ability to recognize specific pathogen effectors is elusive. Here, we cloned Pvr4 (a Potyvirus resistance gene in Capsicum annuum) and Tsw (a Tomato spotted wilt virus resistance gene in Capsicum chinense) via a genome-based approach using independent segregating populations. The genes both encode typical NLRs and are located at the same locus on pepper chromosome 10. Despite the fact that these two genes recognize completely different viral effectors, the genomic structures and coding sequences of the two genes are strikingly similar. Phylogenetic studies revealed that these two immune receptors diverged from a progenitor gene of a common ancestor. Our results suggest that sequence variations caused by gene duplication and neofunctionalization may underlie the evolution of the ability to specifically recognize different effectors. These findings thereby provide insight into the divergent evolution of plant immune receptors. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Background Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. Methodology To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. Principal Findings We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks. PMID:26927540
Innes, Roger W; Ameline-Torregrosa, Carine; Ashfield, Tom; Cannon, Ethalinda; Cannon, Steven B; Chacko, Ben; Chen, Nicolas W G; Couloux, Arnaud; Dalwani, Anita; Denny, Roxanne; Deshpande, Shweta; Egan, Ashley N; Glover, Natasha; Hans, Christian S; Howell, Stacy; Ilut, Dan; Jackson, Scott; Lai, Hongshing; Mammadov, Jafar; Del Campo, Sara Martin; Metcalf, Michelle; Nguyen, Ashley; O'Bleness, Majesta; Pfeil, Bernard E; Podicheti, Ram; Ratnaparkhe, Milind B; Samain, Sylvie; Sanders, Iryna; Ségurens, Béatrice; Sévignac, Mireille; Sherman-Broyles, Sue; Thareau, Vincent; Tucker, Dominic M; Walling, Jason; Wawrzynski, Adam; Yi, Jing; Doyle, Jeff J; Geffroy, Valérie; Roe, Bruce A; Maroof, M A Saghai; Young, Nevin D
2008-12-01
The genomes of most, if not all, flowering plants have undergone whole genome duplication events during their evolution. The impact of such polyploidy events is poorly understood, as is the fate of most duplicated genes. We sequenced an approximately 1 million-bp region in soybean (Glycine max) centered on the Rpg1-b disease resistance gene and compared this region with a region duplicated 10 to 14 million years ago. These two regions were also compared with homologous regions in several related legume species (a second soybean genotype, Glycine tomentella, Phaseolus vulgaris, and Medicago truncatula), which enabled us to determine how each of the duplicated regions (homoeologues) in soybean has changed following polyploidy. The biggest change was in retroelement content, with homoeologue 2 having expanded to 3-fold the size of homoeologue 1. Despite this accumulation of retroelements, over 77% of the duplicated low-copy genes have been retained in the same order and appear to be functional. This finding contrasts with recent analyses of the maize (Zea mays) genome, in which only about one-third of duplicated genes appear to have been retained over a similar time period. Fluorescent in situ hybridization revealed that the homoeologue 2 region is located very near a centromere. Thus, pericentromeric localization, per se, does not result in a high rate of gene inactivation, despite greatly accelerated retrotransposon accumulation. In contrast to low-copy genes, nucleotide-binding-leucine-rich repeat disease resistance gene clusters have undergone dramatic species/homoeologue-specific duplications and losses, with some evidence for partitioning of subfamilies between homoeologues.
Emms, David M.; Covshoff, Sarah; Hibberd, Julian M.; ...
2016-03-24
C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Moreover, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species.
Xue, Yufei; Chen, Baojun; Win, Aung Naing; Fu, Chun; Lian, Jianping; Liu, Xue; Wang, Rui; Zhang, Xingcui
2018-01-01
Omega-3 fatty acid desaturase (ω-3 FAD, D15D) is a key enzyme for α-linolenic acid (ALA) biosynthesis. Both chia (Salvia hispanica) and perilla (Perilla frutescens) contain high levels of ALA in seeds. In this study, the ω-3 FAD gene family was systematically and comparatively cloned from chia and perilla. Perilla FAD3, FAD7, FAD8 and chia FAD7 are encoded by single-copy (but heterozygous) genes, while chia FAD3 is encoded by 2 distinct genes. Only 1 chia FAD8 sequence was isolated. In these genes, there are 1 to 6 transcription start sites, 1 to 8 poly(A) tailing sites, and 7 introns. The 5’UTRs of PfFAD8a/b contain 1 to 2 purine-stretches and 2 pyrimidine-stretches. An alternative splice variant of ShFAD7a/b comprises a 5’UTR intron. Their encoded proteins harbor an FA_desaturase conserved domain together with 4 trans-membrane helices and 3 histidine boxes. Phylogenetic analysis validated their identity as dicot microsomal or plastidial ω-3 FAD proteins, and revealed some important evolutionary features of plant ω-3 FAD genes such as convergent evolution across different phyla, single-copy status in algae, and duplication events in certain taxa. The qRT-PCR assay showed that the ω-3 FAD genes of the two species were expressed at different levels in various organs, and they also responded to multiple stress treatments. The functionality of the ShFAD3 and PfFAD3 enzymes was confirmed by yeast expression. The systemic molecular and functional features of the ω-3 FAD gene family from chia and perilla revealed in this study will facilitate their use in future studies on genetic improvement of ALA traits in oilseed crops. PMID:29351555
Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A
2015-06-08
The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Extensive Local Gene Duplication and Functional Divergence among Paralogs in Atlantic Salmon
Warren, Ian A.; Ciborowski, Kate L.; Casadei, Elisa; Hazlerigg, David G.; Martin, Sam; Jordan, William C.; Sumner, Seirian
2014-01-01
Many organisms can generate alternative phenotypes from the same genome, enabling individuals to exploit diverse and variable environments. A prevailing hypothesis is that such adaptation has been favored by gene duplication events, which generate redundant genomic material that may evolve divergent functions. Vertebrate examples of recent whole-genome duplications are sparse although one example is the salmonids, which have undergone a whole-genome duplication event within the last 100 Myr. The life-cycle of the Atlantic salmon, Salmo salar, depends on the ability to produce alternating phenotypes from the same genome, to facilitate migration and maintain its anadromous life history. Here, we investigate the hypothesis that genome-wide and local gene duplication events have contributed to the salmonid adaptation. We used high-throughput sequencing to characterize the transcriptomes of three key organs involved in regulating migration in S. salar: Brain, pituitary, and olfactory epithelium. We identified over 10,000 undescribed S. salar sequences and designed an analytic workflow to distinguish between paralogs originating from local gene duplication events or from whole-genome duplication events. These data reveal that substantial local gene duplications took place shortly after the whole-genome duplication event. Many of the identified paralog pairs have either diverged in function or become noncoding. Future functional genomics studies will reveal to what extent this rich source of divergence in genetic sequence is likely to have facilitated the evolution of extreme phenotypic plasticity required for an anadromous life-cycle. PMID:24951567
Prevention of data duplication for high throughput sequencing repositories
Gabdank, Idan; Chan, Esther T; Davidson, Jean M; Hilton, Jason A; Davis, Carrie A; Baymuradov, Ulugbek K; Narayanan, Aditi; Onate, Kathrina C; Graham, Keenan; Miyasato, Stuart R; Dreszer, Timothy R; Strattan, J Seth; Jolanki, Otto; Tanaka, Forrest Y; Hitz, Benjamin C
2018-01-01
Prevention of unintended duplication is one of the ongoing challenges many databases have to address. With high-throughput sequencing data, the complexity of that challenge increases with the complexity of the definition of a duplicate. In a computational data model, a data object represents a real entity like a reagent or a biosample. This representation is similar to how a card represents a book in a paper library catalog. Duplicated data objects not only waste storage, they can mislead users into assuming the model represents more than the single entity. Even if it is clear that two objects represent a single entity, data duplication opens the door to potential inconsistencies between the objects since the content of the duplicated objects can be updated independently, allowing divergence of the metadata associated with the objects. This is analogous to a paper library catalog that mistakenly contains two cards for a single copy of a book: if the cards simultaneously list two different individuals as the current borrower, it would be difficult to determine which of the two actually has the book. Unfortunately, in a large database with multiple submitters, unintended duplication is to be expected. In this article, we present three principal guidelines the Encyclopedia of DNA Elements (ENCODE) Portal follows in order to prevent unintended duplication of both actual files and data objects: definition of identifiable data objects (I), object uniqueness validation (II) and de-duplication mechanism (III). In addition to explaining our modus operandi, we elaborate on the methods used for identification of sequencing data files. A comparison of the approach taken by the ENCODE Portal with those of other widely used biological data repositories is provided. Database URL: https://www.encodeproject.org/ PMID:29688363
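The second guideline, object uniqueness validation, can be sketched for data files as a check against a content checksum at submission time. The abstract does not state how the ENCODE Portal identifies files, so the use of a SHA-256 digest here is an assumption, and the `FileRegistry` class, the `DuplicateFileError` exception and the accession strings are hypothetical names invented for illustration.

```python
import hashlib

class DuplicateFileError(ValueError):
    """Raised when submitted content matches an already registered file."""

class FileRegistry:
    """Hypothetical in-memory registry that keys files by content checksum."""

    def __init__(self):
        self._accession_by_checksum = {}

    def register(self, accession, content):
        """Register `content` (bytes) under `accession`; reject duplicates."""
        checksum = hashlib.sha256(content).hexdigest()
        existing = self._accession_by_checksum.get(checksum)
        if existing is not None:
            raise DuplicateFileError(
                f"{accession} has the same content as {existing}"
            )
        self._accession_by_checksum[checksum] = accession
        return checksum

# Usage: the second submission carries identical bytes and is refused.
registry = FileRegistry()
registry.register("FILE-0001", b"@read1\nACGT\n+\nIIII\n")
try:
    registry.register("FILE-0002", b"@read1\nACGT\n+\nIIII\n")
    rejected = False
except DuplicateFileError:
    rejected = True
```

Keying on content rather than on file name means a renamed resubmission is still caught. Files that differ only in compression or in read order would need a more elaborate identity definition, which is the kind of complexity the abstract alludes to.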
Adaptive evolution and functional innovation of Populus-specific recently evolved microRNAs.
Xie, Jianbo; Yang, Xiaohui; Song, Yuepeng; Du, Qingzhang; Li, Ying; Chen, Jinhui; Zhang, Deqiang
2017-01-01
Lineage-specific microRNAs (miRNAs) undergo rapid turnover during evolution; however, their origin and functional importance have remained controversial. Here, we examine the origin, evolution, and potential roles in local adaptation of Populus-specific miRNAs, which originated after the recent salicoid-specific, whole-genome duplication. RNA sequencing was used to generate extensive, comparable miRNA and gene expression data for six tissues. A natural population of Populus trichocarpa and closely related species were used to study the divergence rates, evolution, and adaptive variation of miRNAs. MiRNAs that originated in 5' untranslated regions had higher expression levels and their expression showed high correlation with their host genes. Compared with conserved miRNAs, a significantly higher proportion of Populus-specific miRNAs appear to target genes that were duplicated in salicoids. Examination of single nucleotide polymorphisms in Populus-specific miRNA precursors showed high amounts of population differentiation. We also characterized the newly emerged MIR6445 family, which could trigger the production of phased small interfering RNAs from NAC mRNAs, which encode a transcription factor with primary roles in a variety of plant developmental processes. Together, these observations provide evolutionary insights into the birth and potential roles of Populus-specific miRNAs in genome maintenance, local adaptation, and functional innovation. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Intragenomic spread of plastid-targeting presequences in the coccolithophore Emiliania huxleyi.
Burki, Fabien; Hirakawa, Yoshihisa; Keeling, Patrick J
2012-09-01
Nucleus-encoded plastid-targeted proteins of photosynthetic organisms are generally equipped with an N-terminal presequence required for crossing the plastid membranes. The acquisition of these presequences played a fundamental role in the establishment of plastids. Here, we report a unique case of two non-homologous proteins possessing completely identical presequences consisting of a bipartite plastid-targeting signal in the coccolithophore Emiliania huxleyi. We further show that this presequence is highly conserved in five additional proteins that did not originally function in plastids, representing de novo plastid acquisitions. These are among the most recent cases of presequence spreading from gene to gene and shed light on important evolutionary processes that have been usually erased by the ancient history of plastid evolution. We propose a mechanism of acquisition involving genomic duplications and gene replacement through non-homologous recombination that may have played a more general role for equipping proteins with targeting information.
Dong, Ying; Matigian, Nick; Harvey, Tracey J; Samaratunga, Hemamali; Hooper, John D; Clements, Judith A
2008-02-01
Tissue kallikrein (kallikrein 1) was first identified in pancreas and is the namesake of the kallikrein-related peptidase (KLK) family. KLK1 and the other 14 members of the human KLK family are encoded by 15 serine protease genes clustered at chromosome 19q13.4. Our Northern blot analysis of 19 normal human tissues for expression of KLK4 to KLK15 identified pancreas as a common expression site for the gene cluster spanning KLK5 to KLK13, as well as for KLK15, which is located adjacent to KLK1. Consistent with previous reports detailing the ability of KLK genes to generate organ- and disease-specific transcripts, detailed molecular and in silico analyses indicated that KLK5 and KLK7 generate transcripts in pancreas that differ from those in skin or ovary. Consistently, we identified in the promoters of these KLK genes motifs that conform to consensus binding sites for transcription factors conferring pancreatic expression. In addition, immunohistochemical analysis revealed predominant localisation of KLK5 and KLK7 in acinar cells of the exocrine pancreas, suggesting roles for these enzymes in digestion. Our data also support expression patterns derived from gene duplication events in the human KLK cluster. These findings suggest that, in addition to KLK1, other related KLK enzymes will function in the exocrine pancreas.
The Sequence and Analysis of Duplication Rich Human Chromosome 16
DOE R&D Accomplishments Database
Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.
2004-01-01
We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.
Evolution of the eukaryotic dynactin complex, the activator of cytoplasmic dynein
2012-01-01
Background: Dynactin is a large multisubunit protein complex that enhances the processivity of cytoplasmic dynein and acts as an adapter between dynein and the cargo. It is composed of eleven different polypeptides of which eight are unique to this complex, namely dynactin1 (p150Glued), dynactin2 (p50 or dynamitin), dynactin3 (p24), dynactin4 (p62), dynactin5 (p25), dynactin6 (p27), and the actin-related proteins Arp1 and Arp10 (Arp11). Results: To reveal the evolution of dynactin across the eukaryotic tree the presence or absence of all dynactin subunits was determined in most of the available eukaryotic genome assemblies. Altogether, 3061 dynactin sequences from 478 organisms have been annotated. Phylogenetic trees of the various subunit sequences were used to reveal sub-family relationships and to reconstruct gene duplication events. Especially in the metazoan lineage, several of the dynactin subunits were duplicated independently in different branches. The largest subunit repertoire is found in vertebrates. Dynactin diversity in vertebrates is further increased by alternative splicing of several subunits. The most prominent example is the dynactin1 gene, which may code for up to 36 different isoforms due to three different transcription start sites and four exons that are spliced as differentially included exons. Conclusions: The dynactin complex is a very ancient complex that most likely included all subunits in the last common ancestor of extant eukaryotes. The absence of dynactin in certain species coincides with that of the cytoplasmic dynein heavy chain: Organisms that do not encode cytoplasmic dynein like plants and diplomonads also do not encode the unique dynactin subunits. The conserved core of dynactin consists of dynactin1, dynactin2, dynactin4, dynactin5, Arp1, and the heterodimeric actin capping protein. The evolution of the remaining subunits dynactin3, dynactin6, and Arp10 is characterized by many branch- and species-specific gene loss events. PMID:22726940
Synthetic and Evolutionary Construction of a Chlorate-Reducing Shewanella oneidensis MR-1.
Clark, Iain C; Melnyk, Ryan A; Youngblut, Matthew D; Carlson, Hans K; Iavarone, Anthony T; Coates, John D
2015-05-19
Despite evidence for the prevalence of horizontal gene transfer of respiratory genes, little is known about how pathways functionally integrate within new hosts. One example of a mobile respiratory metabolism is bacterial chlorate reduction, which is frequently encoded on composite transposons. This implies that the essential components of the metabolism are encoded on these mobile elements. To test this, we heterologously expressed genes for chlorate reduction from Shewanella algae ACDC in the non-chlorate-reducing Shewanella oneidensis MR-1. The construct that ultimately endowed robust growth on chlorate included cld, a cytochrome c gene, clrABDC, and two genes of unknown function. Although strain MR-1 was unable to grow on chlorate after initial insertion of these genes into the chromosome, 11 derived strains capable of chlorate respiration were obtained through adaptive evolution. Genome resequencing indicated that all of the evolved chlorate-reducing strains replicated a large genomic region containing chlorate reduction genes. Contraction in copy number and loss of the ability to reduce chlorate were also observed, indicating that this phenomenon was extremely dynamic. Although most strains contained more than six copies of the replicated region, a single strain with less duplication also grew rapidly. This strain contained three additional mutations that we hypothesized compensated for the low copy number. We remade the mutations combinatorially in the unevolved strain and determined that a single nucleotide polymorphism (SNP) upstream of cld enabled growth on chlorate and was epistatic to a second base pair change in the NarP binding sequence between narQP and nrfA that enhanced growth. The ability of chlorate reduction composite transposons to form functional metabolisms after transfer to a new host is an important part of their propagation. To study this phenomenon, we engineered Shewanella oneidensis MR-1 into a chlorate reducer. 
We defined a set of genes sufficient to endow growth on chlorate from a plasmid, but found that chromosomal insertion of these genes was nonfunctional. Evolution of this inoperative strain into a chlorate reducer showed that tandem duplication was a dominant mechanism of activation. While copy number changes are a relatively rapid way of increasing gene dosage, replicating almost 1 megabase of extra DNA is costly. Mutations that alleviate the need for high copy number are expected to arise and eventually predominate, and we identified a single nucleotide polymorphism (SNP) that relieved the copy number requirement. This study uses both rational and evolutionary approaches to gain insight into the evolution of a fascinating respiratory metabolism. Copyright © 2015 Clark et al.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamilton, A T; Huntley, S; Tran-Gyamfi, M
Although most genes are conserved as one-to-one orthologs in different mammalian orders, certain gene families have evolved to comprise different numbers and types of protein-coding genes through independent series of gene duplications, divergence and gene loss in each evolutionary lineage. One such family encodes KRAB-zinc finger (KRAB-ZNF) genes, which are likely to function as transcriptional repressors. One KRAB-ZNF subfamily, the ZNF91 clade, has expanded specifically in primates to comprise more than 110 loci in the human genome, yielding large gene clusters in human chromosomes 19 and 7 and smaller clusters or isolated copies at other chromosomal locations. Although phylogenetic analysis indicates that many of these genes arose before the split between old world monkeys and new world monkeys, the ZNF91 subfamily has continued to expand and diversify throughout the evolution of apes and humans. The paralogous loci are distinguished by sequence divergence within their zinc finger arrays indicating a selection for proteins with different DNA binding specificities. RT-PCR and in situ hybridization data show that some of these ZNF genes can have tissue-specific expression patterns, however many KRAB-ZNFs that are near-ubiquitous could also be playing very specific roles in halting target pathways in all tissues except for a few, where the target is released by the absence of its repressor. The number of variant KRAB-ZNF proteins is increased not only because of the large number of loci, but also because many loci can produce multiple splice variants, which because of the modular structure of these genes may have separate and perhaps even conflicting regulatory roles. The lineage-specific duplication and rapid divergence of this family of transcription factor genes suggests a role in determining species-specific biological differences and the evolution of novel primate traits.
Defense Against Cannibalism: The SdpI Family of Bacterial Immunity/Signal Transduction Proteins
Povolotsky, Tatyana Leonidovna; Orlova, Ekaterina; Tamang, Dorjee G.
2010-01-01
The SdpI family consists of putative bacterial toxin immunity and signal transduction proteins. One member of the family in Bacillus subtilis, SdpI, provides immunity to cells from cannibalism in times of nutrient limitation. SdpI family members are transmembrane proteins with 3, 4, 5, 6, 7, 8, or 12 putative transmembrane α-helical segments (TMSs). These varied topologies appear to be genuine rather than artifacts due to sequencing or annotation errors. The basic and most frequently occurring element of the SdpI family has 6 TMSs. Homologues of all topological types were aligned to determine the homologous TMSs and loop regions, and the positive-inside rule was used to determine sidedness. The two most conserved motifs were identified between TMSs 1 and 2 and TMSs 4 and 5 of the 6 TMS proteins. These showed significant sequence similarity, leading us to suggest that the primordial precursor of these proteins was a 3 TMS–encoding genetic element that underwent intragenic duplication. Various deletional and fusional events, as well as intragenic duplications and inversions, may have yielded SdpI homologues with topologies of varying numbers and positions of TMSs. We propose a specific evolutionary pathway that could have given rise to these distantly related bacterial immunity proteins. We further show that genes encoding SdpI homologues often appear in operons with genes for homologues of SdpR, SdpI’s autorepressor. Our analyses allow us to propose structure–function relationships that may be applicable to most family members. PMID:20563570
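The positive-inside rule mentioned in this abstract states that loops facing the cytoplasm tend to carry more lysine and arginine residues than loops on the other side of the membrane. The sketch below is a deliberately simplified illustration, not the authors' procedure: the function names, the half-open `(start, end)` TMS coordinates, and the plain K/R count over whole loops are all assumptions made for the example.

```python
def loop_regions(seq_len, tms):
    """Return half-open (start, end) intervals for the loops around and between TMSs."""
    loops, prev_end = [], 0
    for start, end in sorted(tms):
        loops.append((prev_end, start))
        prev_end = end
    loops.append((prev_end, seq_len))
    return loops


def predict_sidedness(sequence, tms):
    """Apply a simplified positive-inside rule.

    Loops alternate sides of the membrane, so even-indexed loops share the
    N-terminal side. The side whose loops hold more K/R residues is called
    cytoplasmic. Returns 'in' if the N-terminus is predicted cytoplasmic,
    'out' otherwise; a tie is arbitrarily reported as 'in'.
    """
    positive = set("KR")
    counts = [0, 0]  # [N-terminal side, opposite side]
    for index, (start, end) in enumerate(loop_regions(len(sequence), tms)):
        counts[index % 2] += sum(residue in positive for residue in sequence[start:end])
    return "in" if counts[0] >= counts[1] else "out"
```

Published topology predictors usually weight only short loops and residues flanking each TMS, so this count should be read as a teaching example rather than a usable predictor.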
Han, Li; Szabó, Piroska E.; Mann, Jeffrey R.
2010-01-01
The misexpressed imprinted genes causing developmental failure of mouse parthenogenones are poorly defined. To obtain further insight, we investigated misexpressions that could cause the pronounced growth deficiency and death of fetuses with maternal duplication of distal chromosome (Chr) 7 (MatDup.dist7). Their small size could involve inactivity of Igf2, encoding a growth factor, with some contribution by over-expression of Cdkn1c, encoding a negative growth regulator. Mice lacking Igf2 expression are usually viable, and MatDup.dist7 death has been attributed to the misexpression of Cdkn1c or other imprinted genes. To examine the role of misexpressions determined by two maternal copies of the Igf2/H19 imprinting control region (ICR)—a chromatin insulator, we introduced a mutant ICR (ICRΔ) into MatDup.dist7 fetuses. This activated Igf2, with correction of H19 expression and other imprinted transcripts expected. Substantial growth enhancement and full postnatal viability was obtained, demonstrating that the aberrant MatDup.dist7 phenotype is highly dependent on the presence of two unmethylated maternal Igf2/H19 ICRs. Activation of Igf2 is likely the predominant correction that rescued growth and viability. Further experiments involved the introduction of a null allele of Cdkn1c to alleviate its over-expression. Results were not consistent with the possibility that this misexpression alone, or in combination with Igf2 inactivity, mediates MatDup.dist7 death. Rather, a network of misexpressions derived from dist7 is probably involved. Our results are consistent with the idea that reduced expression of IGF2 plays a role in the aetiology of the human imprinting-related growth-deficit disorder, Silver-Russell syndrome. PMID:20062522
Two-component signal transduction systems of Xanthomonas spp.: a lesson from genomics.
Qian, Wei; Han, Zhong-Ji; He, Chaozu
2008-02-01
The two-component signal transduction systems (TCSTSs), consisting of a histidine kinase sensor (HK) and a response regulator (RR), are the dominant molecular mechanisms by which prokaryotes sense and respond to environmental stimuli. Genomes of Xanthomonas generally contain a large repertoire of TCSTS genes (approximately 92 to 121 for each genome), which encode diverse structural groups of HKs and RRs. Among them, although a core set of 70 TCSTS genes (about two-thirds in total), which accumulate point mutations at a slow rate, is shared by these genomes, the other genes, especially hybrid HKs, have experienced extensive genetic recombination, including genomic rearrangement, gene duplication, addition or deletion, and fusion or fission. The recombinations potentially promote the efficiency and complexity of TCSTSs in regulating gene expression. In addition, our analysis suggests that a co-evolutionary model, rather than a selfish operon model, is the major mechanism for the maintenance and microevolution of TCSTS genes in the genomes of Xanthomonas. Genomic annotation, secondary protein structure prediction, and comparative genomic analyses of TCSTS genes reviewed here provide insights into our understanding of signal networks in these important phytopathogenic bacteria.
The evolution of duplicate gene expression in mammalian organs
Guschanski, Katerina; Warnefors, Maria; Kaessmann, Henrik
2017-01-01
Gene duplications generate genomic raw material that allows the emergence of novel functions, likely facilitating adaptive evolutionary innovations. However, global assessments of the functional and evolutionary relevance of duplicate genes in mammals were until recently limited by the lack of appropriate comparative data. Here, we report a large-scale study of the expression evolution of DNA-based functional gene duplicates in three major mammalian lineages (placental mammals, marsupials, egg-laying monotremes) and birds, on the basis of RNA sequencing (RNA-seq) data from nine species and eight organs. We observe dynamic changes in tissue expression preference of paralogs with different duplication ages, suggesting differential contribution of paralogs to specific organ functions during vertebrate evolution. Specifically, we show that paralogs that emerged in the common ancestor of bony vertebrates are enriched for genes with brain-specific expression and provide evidence for differential forces underlying the preferential emergence of young testis- and liver-specific expressed genes. Further analyses uncovered that the overall spatial expression profiles of gene families tend to be conserved, with several exceptions of pronounced tissue specificity shifts among lineage-specific gene family expansions. Finally, we trace new lineage-specific genes that may have contributed to the specific biology of mammalian organs, including the little-studied placenta. Overall, our study provides novel and taxonomically broad evidence for the differential contribution of duplicate genes to tissue-specific transcriptomes and for their importance for the phenotypic evolution of vertebrates. PMID:28743766
Xp22.33p22.12 Duplication in a Patient with Intellectual Disability and Dysmorphic Facial Features
Lintas, Carla; Picinelli, Chiara; Piras, Ignazio S.; Sacco, Roberto; Gabriele, Stefano; Verdecchia, Magda; Persico, Antonio M.
2016-01-01
A novel 19.98-Mb duplication in chromosome Xp22.33p22.12 was detected by array CGH in a 30-year-old man affected by intellectual disability, congenital hypotonia and dysmorphic features. The duplication encompasses more than 100 known genes. Many of these genes (such as neuroligin 4, cyclin-dependent kinase-like 5, and others) have already been correlated with X-linked intellectual disability and/or neurodevelopmental disorders. Due to the high number of potentially pathogenic genes involved in the reported duplication, we cannot correlate the clinical phenotype to a single gene. Indeed, we suggest that the resulting clinical phenotype may have arisen from the overexpression and consequent perturbation of fine gene dosage. PMID:26997944
Xp22.33p22.12 Duplication in a Patient with Intellectual Disability and Dysmorphic Facial Features.
Lintas, Carla; Picinelli, Chiara; Piras, Ignazio S; Sacco, Roberto; Gabriele, Stefano; Verdecchia, Magda; Persico, Antonio M
2016-02-01
A novel 19.98-Mb duplication in chromosome Xp22.33p22.12 was detected by array CGH in a 30-year-old man affected by intellectual disability, congenital hypotonia and dysmorphic features. The duplication encompasses more than 100 known genes. Many of these genes (such as neuroligin 4, cyclin-dependent kinase-like 5, and others) have already been correlated with X-linked intellectual disability and/or neurodevelopmental disorders. Due to the high number of potentially pathogenic genes involved in the reported duplication, we cannot correlate the clinical phenotype to a single gene. Indeed, we suggest that the resulting clinical phenotype may have arisen from the overexpression and consequent perturbation of fine gene dosage.
Evolution, functions, and mysteries of plant ARGONAUTE proteins.
Zhang, Han; Xia, Rui; Meyers, Blake C; Walbot, Virginia
2015-10-01
ARGONAUTE (AGO) proteins bind small RNAs (sRNAs) to form RNA-induced silencing complexes for transcriptional and post-transcriptional gene silencing. Genomes of primitive plants encode only a few AGO proteins. The Arabidopsis thaliana genome encodes ten AGO proteins, designated AGO1 to AGO10. Most early studies focused on these ten proteins and their interacting sRNAs. AGOs in other flowering plant species have duplicated and diverged from this set, presumably corresponding to new, diverged or specific functions. Among these, the grass-specific AGO18 family has been discovered and implicated as playing important roles during plant reproduction and viral defense. This review covers our current knowledge about functions and features of AGO proteins in both eudicots and monocots and compares their similarities and differences. On the basis of these features, we propose a new nomenclature for some plant AGOs. Copyright © 2015 Elsevier Ltd. All rights reserved.
From the ultrasonic to the infrared: molecular evolution and the sensory biology of bats
Jones, Gareth; Teeling, Emma C.; Rossiter, Stephen J.
2013-01-01
Great advances have been made recently in understanding the genetic basis of the sensory biology of bats. Research has focused on the molecular evolution of candidate sensory genes, genes with known functions [e.g., olfactory receptor (OR) genes] and genes identified from mutations associated with sensory deficits (e.g., blindness and deafness). For example, the FoxP2 gene, underpinning vocal behavior and sensorimotor coordination, has undergone diversification in bats, while several genes associated with audition show parallel amino acid substitutions in unrelated lineages of echolocating bats and, in some cases, in echolocating dolphins, representing a classic case of convergent molecular evolution. Vision genes encoding the photopigments rhodopsin and the long-wave sensitive opsin are functional in bats, while that encoding the short-wave sensitive opsin has lost functionality in rhinolophoid bats using high-duty cycle laryngeal echolocation, suggesting a sensory trade-off between investment in vision and echolocation. In terms of olfaction, bats appear to have a distinctive OR repertoire compared with other mammals, and a gene involved in signal transduction in the vomeronasal system has become non-functional in most bat species. Bitter taste receptors appear to have undergone a “birth-and death” evolution involving extensive gene duplication and loss, unlike genes coding for sweet and umami tastes that show conservation across most lineages but loss in vampire bats. Common vampire bats have also undergone adaptations for thermoperception, via alternative splicing resulting in the evolution of a novel heat-sensitive channel. The future for understanding the molecular basis of sensory biology is promising, with great potential for comparative genomic analyses, studies on gene regulation and expression, exploration of the role of alternative splicing in the generation of proteomic diversity, and linking genetic mechanisms to behavioral consequences. PMID:23755015
Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou
2008-04-01
Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.
Toxin-antitoxin systems and regulatory mechanisms in Mycobacterium tuberculosis.
Slayden, Richard A; Dawson, Clinton C; Cummings, Jason E
2018-06-01
There has been a significant reduction in annual tuberculosis incidence since the World Health Organization declared tuberculosis a global health threat. However, treatment of M. tuberculosis infections requires lengthy multidrug therapeutic regimens to achieve a durable cure. The development of new drugs that are active against resistant strains and phenotypically diverse organisms remains the greatest challenge for the future. Numerous phylogenomic analyses have revealed that the Mtb genome encodes a significantly expanded repertoire of toxin-antitoxin (TA) loci that makes up the Mtb TA system. A TA locus is a two-gene operon encoding a 'toxin' protein that inhibits bacterial growth and an interacting 'antitoxin' partner that neutralizes the inhibitory activity of the toxin. The presence of multiple chromosomally encoded TA loci in Mtb raises important questions in regard to expansion, regulation and function. Thus, the functional roles of TA loci in Mtb pathogenesis have received considerable attention over the last decade. The cumulative results indicate that they are involved in regulating adaptive responses to stresses associated with the host environment and drug treatment. Here we review the TA families encoded in Mtb and discuss the duplication of TA loci in Mtb, the regulatory mechanisms of TA loci, and phenotypic heterogeneity and pathogenesis.
Barbaro, Michela; Oscarson, Mikael; Schoumans, Jacqueline; Staaf, Johan; Ivarsson, Sten A; Wedell, Anna
2007-08-01
Testis development is a tightly regulated process that requires an efficient and coordinated spatiotemporal action of many factors, and it has been shown that several genes involved in gonadal development exert a dosage effect. Chromosomal imbalances have been reported in several patients presenting with gonadal dysgenesis as part of severe dysmorphic phenotypes. We screened for submicroscopic DNA copy number variations in two sisters with an apparently normal 46,XY karyotype and female external genitalia due to gonadal dysgenesis, and in whom mutations in known candidate genes had been excluded. By high-resolution tiling bacterial artificial chromosome array comparative genome hybridization, a submicroscopic duplication at Xp21.2 containing DAX1 (NR0B1) was identified. Using fluorescence in situ hybridization, multiplex ligation-dependent probe amplification, and PCR, the rearrangement was further characterized. This revealed a 637-kb tandem duplication that in addition to DAX1 includes the four MAGEB genes, the hypothetical gene CXorf21, GK, and part of the MAP3K7IP3 gene. Sequencing and analysis of the breakpoint boundaries and duplication junction suggest that the duplication originated through a coupled homologous and nonhomologous recombination process. This represents the first duplication on Xp21.2 identified in patients with isolated gonadal dysgenesis because all previously described XY subjects with Xp21 duplications presented with gonadal dysgenesis as part of a more complex phenotype, including mental retardation and/or malformations. Thus, our data support DAX1 as a dosage sensitive gene responsible for gonadal dysgenesis and highlight the importance of considering DAX1 locus duplications in the evaluation of all cases of 46,XY gonadal dysgenesis.
Tetreau, Guillaume; Dittmer, Neal T; Cao, Xiaolong; Agrawal, Sinu; Chen, Yun-Ru; Muthukrishnan, Subbaratnam; Haobo, Jiang; Blissard, Gary W; Kanost, Michael R; Wang, Ping
2015-07-01
In insects, chitin is a major structural component of the cuticle and the peritrophic membrane (PM). In nature, chitin is always associated with proteins among which chitin-binding proteins (CBPs) are the most important for forming, maintaining and regulating the functions of these extracellular structures. In this study, a genome-wide search for genes encoding proteins with ChtBD2-type (peritrophin A-type) chitin-binding domains (CBDs) was conducted. A total of 53 genes encoding 56 CBPs were identified, including 15 CPAP1s (cuticular proteins analogous to peritrophins with 1 CBD), 11 CPAP3s (CPAPs with 3 CBDs) and 17 PMPs (PM proteins) with a variable number of CBDs, which are structural components of cuticle or of the PM. CBDs were also identified in enzymes of chitin metabolism including 6 chitinases and 7 chitin deacetylases encoded by 6 and 5 genes, respectively. RNA-seq analysis confirmed that PMP and CPAP genes have differential spatial expression patterns. The expression of PMP genes is midgut-specific, while CPAP genes are widely expressed in different cuticle forming tissues. Phylogenetic analysis of CBDs of proteins in insects belonging to different orders revealed that CPAP1s from different species constitute a separate family with 16 different groups, including 6 new groups identified in this study. The CPAP3s are clustered into a separate family of 7 groups present in all insect orders. Altogether, they reveal that duplication events of CBDs in CPAP1s and CPAP3s occurred prior to the evolutionary radiation of insect species. In contrast to the CPAPs, all CBDs from individual PMPs are generally clustered and distinct from other PMPs in the same species in phylogenetic analyses, indicating that the duplication of CBDs in each of these PMPs occurred after divergence of insect species. 
Phylogenetic analysis of these three CBP families showed that the CBDs in CPAP1s form a clearly separate family, while those found in PMPs and CPAP3s were clustered together in the phylogenetic tree. For chitinases and chitin deacetylases, most of the phylogenetic analyses performed with the CBD sequences resulted in clustering similar to that obtained by using catalytic domain sequences alone, suggesting that CBDs were incorporated into these enzymes and evolved in tandem with the catalytic domains before the diversification of different insect orders. Based on these results, the evolution of CBDs in insect CBPs is discussed to provide a new insight into the CBD sequence structure and diversity, and their evolution and expression in insects. Copyright © 2014 Elsevier Ltd. All rights reserved.
Mielczarek, M; Frąszczak, M; Giannico, R; Minozzi, G; Williams, John L; Wojdak-Maksymiec, K; Szyda, J
2017-07-01
Thirty-two whole genome DNA sequences of cows were analyzed to evaluate inter-individual variability in the distribution and length of copy number variations (CNV) and to functionally annotate CNV breakpoints. The total number of deletions per individual varied between 9,731 and 15,051, whereas the number of duplications was between 1,694 and 5,187. Most of the deletions (81%) and duplications (86%) were unique to a single cow. No relation between the pattern of variant sharing and a family relationship or disease status was found. The animal-averaged length of deletions was from 5,234 to 9,145 bp and the average length of duplications was between 7,254 and 8,843 bp. Highly significant inter-individual variation in length and number of CNV was detected for both deletions and duplications. The majority of deletion and duplication breakpoints were located in intergenic regions and introns, whereas fewer were identified in noncoding transcripts and splice regions. Only 1.35 and 0.79% of the deletion and duplication breakpoints were observed within coding regions. A gene with the highest number of deletion breakpoints codes for protein kinase cGMP-dependent type I, whereas the T-cell receptor α constant gene had the most duplication breakpoints. The functional annotation of genes with the largest incidence of deletion/duplication breakpoints identified 87/112 Kyoto Encyclopedia of Genes and Genomes pathways, but none of the pathways were significantly enriched or depleted with breakpoints. The analysis of Gene Ontology (GO) terms revealed that a cluster with the highest enrichment score among genes with many deletion breakpoints was represented by GO terms related to ion transport, whereas the GO term cluster mostly enriched among the genes with many duplication breakpoints was related to binding of macromolecules. 
Furthermore, when considering the number of deletion breakpoints per gene functional category, no significant differences were observed between the "housekeeping" and "strong selection" categories, but genes representing the "low selection pressure" group showed a significantly higher number of breakpoints. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Molecular evolution of the crustacean hyperglycemic hormone family in ecdysozoans
2010-01-01
Background Crustacean Hyperglycemic Hormone (CHH) family peptides are neurohormones known to regulate several important functions in decapod crustaceans such as ionic and energetic metabolism, molting and reproduction. The structural conservation of these peptides, together with the variety of functions they display, led us to investigate their evolutionary history. CHH family peptides exist in insects (Ion Transport Peptides) and may be present in all ecdysozoans as well. In order to extend the evolutionary study to the entire family, CHH family peptides were thus searched in taxa outside decapods, where they have been, to date, poorly investigated. Results CHH family peptides were characterized by molecular cloning in a branchiopod crustacean, Daphnia magna, and in a collembolan, Folsomia candida. Genes encoding such peptides were also rebuilt in silico from genomic sequences of another branchiopod, a chelicerate and two nematodes. These sequences were included in updated datasets to build phylogenies of the CHH family in pancrustaceans. These phylogenies suggest that peptides found in Branchiopoda and Collembola are more closely related to insect ITPs than to crustacean CHHs. Datasets were also used to support a phylogenetic hypothesis about pancrustacean relationships, which, in addition to gene structures, allowed us to propose two evolutionary scenarios of this multigenic family in ecdysozoans. Conclusions Evolutionary scenarios suggest that CHH family genes of ecdysozoans originate from an ancestral two-exon gene, and genes of arthropods from a three-exon one. In malacostracans, the evolution of the CHH family has involved several duplication, insertion or deletion events, leading to neuropeptides with a wide variety of functions, as observed in decapods. This family could thus constitute a promising model to investigate the links between gene duplications and functional divergence. PMID:20184761
[Hyperuricemia and gene mutations: a case report].
Tattoli, Fabio; Falconi, Daniela; De Prisco, Ornella; Maurizio, Gherzi; Marazzi, Federico; Marengo, Marita; Serra, Ilaria; Tamagnone, Michela; Cordero di Montezemolo, Luca; Pasini, Barbara; Formica, Marco
2017-06-01
Hyperuricemia is a frequent finding in nephrology, and the case presented here may help clarify some aspects of its pathogenesis. The patient is an 18-year-old male with hyperuricemia. His parents are non-consanguineous; there is hyperuricemia in the paternal line and no family history of neuropsychiatric disorders. He has delayed neuromotor development, moderate intellectual disability, an anxiety disorder and obsessive-compulsive personality traits. Renal function and renal ultrasound are normal. Hyperuricemia was first documented in 2015. He has never had gouty episodes or lithiasis. Allopurinol 100 mg on alternate days was started, with no side effects; urea remained in the control range and uricuria was slightly below normal. Given the complex clinical picture, a genetic analysis by array-CGH was performed. It showed a deletion on the short arm of chromosome 3 (3p12.3) and a duplication on the long arm of chromosome 19 (19q13.42). The 3p12.3 deletion (paternal inheritance) involves the ROBO2 gene. The 19q13.42 duplication (maternal inheritance) includes the NLRP12, DPRX and ZNF331 genes. Mutation of the ROBO2 gene is associated with vesicoureteral reflux. The NLRP12 gene encodes proteins called "Nalps", which form a subfamily of the "CATERPILLAR" proteins. Many "Nalps", including "Nalp 12", have an N-terminal domain (DYP) with a purin. Since uric acid is a byproduct of purine metabolism, and given the family history, we hypothesize that the mutations found, in particular those concerning the NLRP12 gene, may have a role in the presence of hyperuricemia. We believe that in patients with hyperuricemia associated with a particular neurological impairment, a common underlying genetic defect is likely. Copyright by Società Italiana di Nefrologia SIN, Rome, Italy.
Pan, Xue; Siloto, Rodrigo M. P.; Wickramarathna, Aruna D.; Mietkiewska, Elzbieta; Weselake, Randall J.
2013-01-01
The oil from flax (Linum usitatissimum L.) has high amounts of α-linolenic acid (ALA; 18:3cisΔ9,12,15) and is one of the richest sources of omega-3 polyunsaturated fatty acids (ω-3-PUFAs). To produce ∼57% ALA in triacylglycerol (TAG), it is likely that flax contains enzymes that can efficiently transfer ALA to TAG. To test this hypothesis, we conducted a systematic characterization of TAG-synthesizing enzymes from flax. We identified several genes encoding acyl-CoA:diacylglycerol acyltransferases (DGATs) and phospholipid:diacylglycerol acyltransferases (PDATs) from the flax genome database. Due to recent genome duplication, duplicated gene pairs have been identified for all genes except DGAT2-2. Analysis of gene expression indicated that two DGAT1, two DGAT2, and four PDAT genes were preferentially expressed in flax embryos. Yeast functional analysis showed that DGAT1, DGAT2, and two PDAT enzymes restored TAG synthesis when produced recombinantly in yeast H1246 strain. The activity of particular PDAT enzymes (LuPDAT1 and LuPDAT2) was stimulated by the presence of ALA. Further seed-specific expression of flax genes in Arabidopsis thaliana indicated that DGAT1, PDAT1, and PDAT2 had significant effects on seed oil phenotype. Overall, this study indicated the existence of unique PDAT enzymes from flax that are able to preferentially catalyze the synthesis of TAG containing ALA acyl moieties. The identified LuPDATs may have practical applications for increasing the accumulation of ALA and other polyunsaturated fatty acids in oilseeds for food and industrial applications. PMID:23824186
Gene-dosage effects on fitness in recent adaptive duplications: ace-1 in the mosquito Culex pipiens.
Labbé, Pierrick; Milesi, Pascal; Yébakima, André; Pasteur, Nicole; Weill, Mylène; Lenormand, Thomas
2014-07-01
Gene duplications have long been advocated to contribute to the evolution of new functions. The role of selection in their early spread is more controversial. Unless duplications are favored for a direct benefit of increased expression, they are likely detrimental. In this article, we investigated the case of duplications favored because they combine already functionally divergent alleles. Their gene-dosage/fitness relations are poorly known because selection may operate on both overall expression and duplicates relative dosage. Using the well-documented case of Culex pipiens resistance to insecticides, we compared strains with various ace-1 allele combinations, including two duplicated alleles carrying both susceptible and resistant copies. The overall protein activity was nearly additive, but, surprisingly, fitness correlated better with the relative proportion of susceptible and resistant copies rather than any absolute measure of activity. Gene dosage is thus crucial, duplications stabilizing a "heterozygote" phenotype. It corroborates the view that these were favored because they fix a permanent heterosis, thereby solving the irreducible trade-off between resistance and synaptic transmission. Moreover, we showed that the contrasted successes of the two duplicated alleles in natural populations depend on genetic changes unrelated to ace-1, confirming the probable implication of recessive sublethal mutations linked to structural rearrangements in some duplications. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
Cocquempot, Olivier; Brault, Véronique; Babinet, Charles; Herault, Yann
2009-09-01
Polyalanine expansion diseases are proposed to result from unequal crossover of sister chromatids that increases the number of repeats. In this report we suggest an alternative mechanism we put forward while we investigated a new spontaneous mutant that we named “Dyc” for “Digit in Y and Carpe” phenotype. Phenotypic analysis revealed an abnormal limb patterning similar to that of the human inherited congenital disease synpolydactyly (SPD) and to the mouse mutant model Spdh. Both human SPD and mouse Spdh mutations affect the Hoxd13 gene within a 15-residue polyalanine-encoding repeat in the first exon of the gene, leading to a dominant negative HOXD13. Genetic analysis of the Dyc mutant revealed a trinucleotide expansion in the polyalanine-encoding region of the Hoxd13 gene resulting in a 7-alanine expansion. However, unlike the Spdh mutation, this expansion cannot result from a simple duplication of a short segment. Instead, we propose the fork stalling and template switching (FosTeS) described for generation of nonrecurrent genomic rearrangements as a possible mechanism for the Dyc polyalanine extension, as well as for other polyalanine expansions described in the literature and that could not be explained by unequal crossing over. PMID:19546318
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical archaeal promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
Genome Structure of the Legume, Lotus japonicus
Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi
2008-01-01
The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.
Shao, Mingfu; Lin, Yu; Moret, Bernard M E
2015-05-01
Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.
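The linear-time case mentioned in the abstract above (genomes without duplicate genes) can be illustrated with a minimal sketch. It does not reproduce the authors' ILP method; it uses the standard adjacency-graph formula d = N - (C + I/2), where N is the number of genes, C the number of cycles and I the number of odd-length paths. The function names (extremities, neighbours, dcj_distance) and the genome representation, a list of (signed gene list, is_circular) pairs, are assumptions made for this illustration only.

```python
def extremities(gene):
    """Return the (left, right) extremities of a signed gene as read along the chromosome."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def neighbours(genome):
    """Map every gene extremity to the extremity adjacent to it, or None at a telomere."""
    nb = {}
    for genes, circular in genome:
        if not genes:
            continue
        ends = [extremities(g) for g in genes]
        # Adjacencies between consecutive genes.
        for (_, right), (left, _) in zip(ends, ends[1:]):
            nb[right], nb[left] = left, right
        first_left, last_right = ends[0][0], ends[-1][1]
        if circular:
            nb[last_right], nb[first_left] = first_left, last_right
        else:
            nb[first_left] = nb[last_right] = None
    return nb


def dcj_distance(genome_a, genome_b):
    """DCJ distance for two genomes over the same genes, each gene present exactly once."""
    na, nb = neighbours(genome_a), neighbours(genome_b)
    if na.keys() != nb.keys():
        raise ValueError("genomes must contain the same genes, each exactly once")
    seen = set()

    def walk_path(start, own, other):
        """Walk from a telomere of `own`; return True if the path ends on a telomere of `other`."""
        cur, use_other = start, True
        while True:
            seen.add(cur)
            nxt = (other if use_other else own)[cur]
            if nxt is None:
                return use_other
            cur, use_other = nxt, not use_other

    # Odd paths have one end in each genome, so they are all found from A's telomeres.
    odd_paths = 0
    for x in na:
        if na[x] is None and x not in seen:
            odd_paths += walk_path(x, na, nb)
    # Remaining paths start and end in B (even length); visit them so they are not counted as cycles.
    for x in nb:
        if nb[x] is None and x not in seen:
            walk_path(x, nb, na)

    # Everything still unvisited lies on a cycle that alternates between A and B adjacencies.
    cycles = 0
    for x in na:
        if x in seen:
            continue
        cycles += 1
        cur, use_a = x, True
        while True:
            seen.add(cur)
            cur = (na if use_a else nb)[cur]
            use_a = not use_a
            if cur == x:
                break

    n_genes = len(na) // 2
    return n_genes - cycles - odd_paths // 2
```

For example, under these assumptions dcj_distance([([1, 2], False)], [([1, -2], False)]) treats the second genome as the first with gene 2 inverted. Every extremity is visited a constant number of times, which is what makes the duplicate-free case linear in the number of genes; with duplicate genes the matching between copies is no longer fixed, which is where the hardness discussed in the abstract arises.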
2009-01-01
Background Penicillium chrysogenum converts isopenicillin N (IPN) into hydrophobic penicillins by means of the peroxisomal IPN acyltransferase (IAT), which is encoded by the penDE gene. In silico analysis of the P. chrysogenum genome revealed the presence of a gene, Pc13g09140, initially described as a paralogue of the IAT-encoding penDE gene. We have termed this gene ial because it encodes a protein with high similarity to IAT (IAL for IAT-Like). We have conducted an investigation to characterize the ial gene and to determine the role of the IAL protein in the penicillin biosynthetic pathway. Results The IAL contains motifs characteristic of the IAT such as the processing site, but lacks the peroxisomal targeting sequence ARL. Null ial mutants and overexpressing strains indicated that IAL lacks acyltransferase (penicillin biosynthetic) and amidohydrolase (6-APA forming) activities in vivo. When the canonical ARL motif (leading to peroxisomal targeting) was added to the C-terminus of the IAL protein (IALARL) by site-directed mutagenesis, no penicillin biosynthetic activity was detected. Since the IAT is only active after an accurate self-processing of the preprotein into α and β subunits, self-processing of the IAL was tested in Escherichia coli. Overexpression experiments and SDS-PAGE analysis revealed that IAL is also self-processed into two subunits, but despite the correct processing, the enzyme remained inactive in vitro. Conclusion No activity related to penicillin biosynthesis was detected for the IAL. Sequence comparison among the P. chrysogenum IAL, the A. nidulans IAL homologue and the IAT revealed that the lack of enzyme activity seems to be due to an alteration of the essential Ser309 in the thioesterase active site. Homologues of the ial gene have been found in many other ascomycetes, including non-penicillin producers. Our data suggest that, as in A. nidulans, the ial and penDE genes might have been formed from a single ancestral gene that became duplicated during evolution, although a separate evolutionary origin for the ial and penDE genes is also discussed. PMID:19470155
Synthetic and Evolutionary Construction of a Chlorate-Reducing Shewanella oneidensis MR-1
Clark, Iain C.; Melnyk, Ryan A.; Youngblut, Matthew D.; Carlson, Hans K.; Iavarone, Anthony T.
2015-01-01
ABSTRACT Despite evidence for the prevalence of horizontal gene transfer of respiratory genes, little is known about how pathways functionally integrate within new hosts. One example of a mobile respiratory metabolism is bacterial chlorate reduction, which is frequently encoded on composite transposons. This implies that the essential components of the metabolism are encoded on these mobile elements. To test this, we heterologously expressed genes for chlorate reduction from Shewanella algae ACDC in the non-chlorate-reducing Shewanella oneidensis MR-1. The construct that ultimately endowed robust growth on chlorate included cld, a cytochrome c gene, clrABDC, and two genes of unknown function. Although strain MR-1 was unable to grow on chlorate after initial insertion of these genes into the chromosome, 11 derived strains capable of chlorate respiration were obtained through adaptive evolution. Genome resequencing indicated that all of the evolved chlorate-reducing strains replicated a large genomic region containing chlorate reduction genes. Contraction in copy number and loss of the ability to reduce chlorate were also observed, indicating that this phenomenon was extremely dynamic. Although most strains contained more than six copies of the replicated region, a single strain with less duplication also grew rapidly. This strain contained three additional mutations that we hypothesized compensated for the low copy number. We remade the mutations combinatorially in the unevolved strain and determined that a single nucleotide polymorphism (SNP) upstream of cld enabled growth on chlorate and was epistatic to a second base pair change in the NarP binding sequence between narQP and nrfA that enhanced growth. PMID:25991681
Napolitano, Mauro; Rubio, Miguel Ángel; Santamaría-Gómez, Javier; Olmedo-Verd, Elvira; Robinson, Nigel J; Luque, Ignacio
2012-05-01
Zur regulators control zinc homeostasis by repressing target genes under zinc-sufficient conditions in a wide variety of bacteria. This paper describes how part of a survey of duplicated genes led to the identification of the open reading frame all2473 as the gene encoding the Zur regulator of the cyanobacterium Anabaena sp. strain PCC 7120. All2473 binds to DNA in a zinc-dependent manner, and its DNA-binding sequence was characterized, which allowed us to determine the relative contribution of particular nucleotides to Zur binding. A zur mutant was found to be impaired in the regulation of zinc homeostasis, showing sensitivity to elevated concentrations of zinc but not other metals. In an effort to characterize the Zur regulon in Anabaena, 23 genes containing upstream putative Zur-binding sequences were identified and found to be regulated by Zur. These genes are organized in six single transcriptional units and six operons, some of them containing multiple Zur-regulated promoters. The identities of genes of the Zur regulon indicate that Anabaena adapts to conditions of zinc deficiency by replacing zinc metalloproteins with paralogues that fulfill the same function but presumably with a lower zinc demand, and with inducing putative metallochaperones and membrane transport systems likely being involved in the scavenging of extracellular zinc, including plasma membrane ABC transport systems and outer membrane TonB-dependent receptors. Among the Zur-regulated genes, the ones showing the highest induction level encode proteins of the outer membrane, suggesting a primary role for components of this cell compartment in the capture of zinc cations from the extracellular medium.
A novel sodium bicarbonate cotransporter-like gene in an ancient duplicated region: SLC4A9 at 5q31
Lipovich, Leonard; Lynch, Eric D; Lee, Ming K; King, Mary-Claire
2001-01-01
Background: Sodium bicarbonate cotransporter (NBC) genes encode proteins that execute coupled Na+ and HCO3- transport across epithelial cell membranes. We report the discovery, characterization, and genomic context of a novel human NBC-like gene, SLC4A9, on chromosome 5q31. Results: SLC4A9 was initially discovered by genomic sequence annotation and further characterized by sequencing of long-insert cDNA library clones. The predicted protein of 990 amino acids has 12 transmembrane domains and high sequence similarity to other NBCs. The 23-exon gene has 14 known mRNA isoforms. In three regions, mRNA sequence variation is generated by the inclusion or exclusion of portions of an exon. Noncoding SLC4A9 cDNAs were recovered multiple times from different libraries. The 3' untranslated region is fragmented into six alternatively spliced exons and contains expressed Alu, LINE and MER repeats. SLC4A9 has two alternative stop codons and six polyadenylation sites. Its expression is largely restricted to the kidney. In silico approaches were used to characterize two additional novel SLC4A genes and to place SLC4A9 within the context of multiple paralogous gene clusters containing members of the epidermal growth factor (EGF), ankyrin (ANK) and fibroblast growth factor (FGF) families. Seven human EGF-SLC4A-ANK-FGF clusters were found. Conclusion: The novel sodium bicarbonate cotransporter-like gene SLC4A9 demonstrates abundant alternative mRNA processing. It belongs to a growing class of functionally diverse genes characterized by inefficient highly variable splicing. The evolutionary history of the EGF-SLC4A-ANK-FGF gene clusters involves multiple rounds of duplication, apparently followed by large insertions and deletions at paralogous loci and genome-wide gene shuffling. PMID:11305939
Unraveling flp-11/flp-32 dichotomy in nematodes.
Atkinson, Louise E; Miskelly, Iain R; Moffett, Christy L; McCoy, Ciaran J; Maule, Aaron G; Marks, Nikki J; Mousley, Angela
2016-10-01
FMRFamide-like peptide (FLP) signalling systems are core to nematode neuromuscular function. Novel drug discovery efforts associated with nematode FLP/FLP receptor biology are advanced through the accumulation of basic biological data that can reveal subtle complexities within the neuropeptidergic system. This study reports the characterisation of FMRFamide-like peptide encoding gene-11 (flp-11) and FMRFamide-like peptide encoding gene-32 (flp-32), two distinct flp genes which encode the analogous peptide, AMRN(A/S)LVRFamide, in multiple nematode species - the only known example of this phenomenon within the FLPergic system of nematodes. Using bioinformatics, in situ hybridisation, immunocytochemistry and behavioural assays we show that: (i) flp-11 and -32 are distinct flp genes expressed individually or in tandem across multiple nematode species, where they encode a highly similar peptide; (ii) flp-11 does not appear to be the most widely expressed flp in Caenorhabditis elegans; (iii) in species expressing both flp-11 and flp-32, flp-11 displays a conserved, restricted expression pattern across nematode clades and lifestyles; (iv) in species expressing both flp-11 and flp-32, flp-32 expression is more widespread and less conserved than flp-11; (v) in species expressing only flp-11, the flp-11 expression profile is more similar to the flp-32 profile observed in species expressing both; and (vi) FLP-11 peptides inhibit motor function in multiple nematode species. The biological significance and evolutionary origin of flp-11 and -32 peptide duplication remains unclear despite attempts to identify a common ancestor; this may become clearer as the availability of genomic data improves. This work provides insight into the complexity of the neuropeptidergic system in nematodes, and begins to examine how nematodes may compensate for structural neuronal simplicity. 
From a parasite control standpoint, this work underscores the importance of basic biological data, and has wider implications for the utility of C. elegans as a model for parasite neurobiology. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
Wentz, Elisabet; Vujic, Mihailo; Kärrstedt, Ewa-Lotta; Erlandsson, Anna; Gillberg, Christopher
2014-05-01
Autism spectrum disorder, severe behaviour problems and duplication of the Xq12 to Xq13 region have recently been described in three male relatives. Our aim was to describe the psychiatric comorbidity and dysmorphic features, including craniosynostosis, of two male siblings with autism and duplication of the Xq13 to Xq21 region, and to attempt to narrow down the number of duplicated genes proposed to lead to global developmental delay and autism. We performed DNA sequencing of certain exons of the TWIST1 gene, the FGFR2 gene and the FGFR3 gene. We also performed microarray analysis of the DNA. In addition to autism, the two male siblings exhibited severe learning disability, self-injurious behaviour, temper tantrums and hyperactivity, and had no communicative language. Chromosomal analyses were normal. Neither of the two siblings showed mutations of the sequenced exons known to produce craniosynostosis. The microarray analysis detected an extra copy of a region on the long arm of chromosome X, chromosome band Xq13.1-q21.1. Comparison of our two cases with previously described patients allowed us to identify three genes predisposing for autism in the duplicated chromosomal region. Sagittal craniosynostosis is also a new finding linked to the duplication.
Abi Rached, L; McDermott, M F; Pontarotti, P
1999-02-01
The human Major Histocompatibility Complex (MHC) shares similarities with three other chromosome regions in human. This could be the vestige of ancestral large scale duplications. We discuss here the possibility i) that these duplications occurred during two rounds of tetraploidization supposed to have taken place during chordate evolution before the jawed vertebrate radiation, and ii) that one of the quadruplicate regions, relaxed of functional constraints, gave rise to the vertebrate MHC by a quick round of gene cis-duplication and cis-exon shuffling. These different rounds of cis-duplications and exon shufflings allowed the emergence of new genes participating in novel biological functions i.e. adaptive immune responses. Cis-duplications and cis-exon shufflings are ongoing processes in the evolution of some of these genes in this region as they have occurred and were fixed at different times and in different lineages during vertebrate evolution. In contrast, other genes within the MHC have remained stable since the emergence of jawed vertebrates.
Li, Qi; Zhang, Ning; Zhang, Liangsheng; Ma, Hong
2015-04-01
Rhomboid proteins are intramembrane serine proteases that are involved in a plethora of biological functions, but the evolutionary history of the rhomboid gene family is not clear. We performed a comprehensive molecular evolutionary analysis of the rhomboid gene family and also investigated the organization and sequence features of plant rhomboids in different subfamilies. Our results showed that eukaryotic rhomboids could be divided into five subfamilies (RhoA-RhoD and PARL). Most orthology groups appeared to be conserved only as single or low-copy genes in all lineages in RhoB-RhoD and PARL, whereas RhoA genes underwent several duplication events, resulting in multiple gene copies. These duplication events were due to whole genome duplications in plants and animals and the duplicates might have experienced functional divergence. We also identified a novel group of plant rhomboid (RhoB1) that might have lost their enzymatic activity; their existence suggests that they might have evolved new mechanisms. Plant and animal rhomboids have similar evolutionary patterns. In addition, there are mutations affecting key active sites in RBL8, RBL9 and one of the Brassicaceae PARL duplicates. This study delineates a possible evolutionary scheme for intramembrane proteins and illustrates distinct fates and a mechanism of evolution of gene duplicates. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
A diffusion model for the fate of tandem gene duplicates in diploids.
O'Hely, Martin
2007-06-01
Suppose one chromosome in one member of a population somehow acquires a duplicate copy of the gene, fully linked to the original gene's locus. Preservation is the event that eventually every chromosome in the population is a descendant of the one which initially carried the duplicate. For a haploid population in which the absence of all copies of the gene is lethal, the probability of preservation has recently been estimated via a diffusion approximation. That approximation is shown to carry over to the case of diploids and arbitrary strong selection against the absence of the gene. The techniques used lead to some new results. In the large population limit, it is shown that the relative probability that descendants of a small number of individuals carrying multiple copies of the gene fix in the population is proportional to the number of copies carried. The probability of preservation is approximated when chromosomes carrying two copies of the gene are subject to additional, fully non-functionalizing mutations, thereby modelling either an additional cost of replicating a longer genome, or a partial duplication of the gene. In the latter case the preservation probability depends only on the mutation rate to null for the duplicated portion of the gene.
The Temporal Regulation of S Phase Proteins During G1
Grant, Gavin D.; Cook, Jeanette G.
2018-01-01
Successful DNA replication requires intimate coordination with cell cycle progression. Prior to DNA replication initiation in S phase, a series of essential preparatory events in G1 phase ensures timely, complete, and precise genome duplication. Among the essential molecular processes are regulated transcriptional upregulation of genes that encode replication proteins, appropriate post-transcriptional control of replication factor abundance and activity, and the assembly of DNA-loaded protein complexes to license replication origins. In this chapter we describe these critical G1 events necessary for DNA replication and their regulation in the context of both cell cycle entry and cell cycle progression. PMID:29357066
Cardoso, João C R; Félix, Rute C; Trindade, Marlene; Power, Deborah M
2014-12-01
The secretin receptor (SCTR) is a member of Class 2 subfamily B1 GPCRs and part of the PAC1/VPAC receptor subfamily. This receptor has long been known in mammals but has only recently been identified in other vertebrates including teleosts, from which it was previously considered to be absent. The ligand for SCTR in mammals is secretin (SCT), an important gastrointestinal peptide, which in teleosts has not yet been isolated, or the gene identified. This study revises the evolutionary model previously proposed for the secretin-GPCRs in metazoan by analysing in detail the fishes, the most successful of the extant vertebrates. All the Actinopterygii genomes analysed and the Chondrichthyes and Sarcopterygii fish possess a SCTR gene that shares conserved sequence, structure and synteny with the tetrapod homologue. Phylogenetic clustering and gene environment comparisons revealed that fish and tetrapod SCTR shared a common origin and diverged early from the PAC1/VPAC subfamily group. In teleosts SCTR duplicated as a result of the fish specific whole genome duplication but in all the teleost genomes analysed, with the exception of tilapia (Oreochromis niloticus), one of the duplicates was lost. The function of SCTR in teleosts is unknown but quantitative PCR revealed that in both sea bass (Dicentrarchus labrax) and tilapia (Oreochromis mossambicus) transcript abundance is high in the gastrointestinal tract suggesting it may intervene in similar processes to those in mammals. In contrast, no gene encoding the ligand SCT was identified in the ray-finned fishes (Actinopterygii) although it was present in the coelacanth (lobe finned fish, Sarcopterygii) and in the elephant shark (holocephalian). The genes in linkage with SCT in tetrapods and coelacanth were also identified in ray-finned fishes supporting the idea that it was lost from their genome. 
At present SCTR remains an orphan receptor in ray-finned fishes and it will be of interest in the future to establish why SCT was lost and which ligand substitutes for it so that full characterization of the receptor can occur. Copyright © 2014 Elsevier Inc. All rights reserved.
Divergence and evolution of cotton bHLH proteins from diploid to allotetraploid.
Liu, Bingliang; Guan, Xueying; Liang, Wenhua; Chen, Jiedan; Fang, Lei; Hu, Yan; Guo, Wangzhen; Rong, Junkang; Xu, Guohua; Zhang, Tianzhen
2018-02-23
Polyploidy is considered a major driving force in genome expansion, yielding duplicated genes whose expression may be conserved or divergent as a consequence of polyploidization. We compared the genome sequences of tetraploid cotton (Gossypium hirsutum) and its two diploid progenitors, G. arboreum and G. raimondii, and found that the bHLH genes were conserved over the polyploidization. In contrast, the expression of the homeologous gene pairs was diversified. The proportion of bHLH homeologs with biased expression (64.6%) is significantly higher than the genome-wide homeologous expression bias (40%). Compared with cacao (T. cacao), orthologous genes accounted for only a small proportion (41.7%) of the whole cotton bHLH family. Further Ks analysis indicated that bHLH genes underwent at least two distinct episodes of whole genome duplication: a recent duplication (1.0-60.0 million years ago, MYA, 0.005 < Ks < 0.312) and an old duplication (> 60.0 MYA, 0.312 < Ks < 3.0). The old duplication event might have played a key role in the expansion of the bHLH family. Both recent and old duplicated pairs (68.8%) showed a divergent expression profile, indicating specialized functions. The expression diversification of the duplicated genes suggested it might be a universal feature of the long-term evolution of cotton. An overview of cotton bHLH proteins indicated both conserved and divergent evolution from diploids to allotetraploid. Our results provide an excellent example for studying the long-term evolution of polyploidy.
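The pairing of Ks bounds with ages in the abstract above is consistent with the usual molecular-clock conversion T = Ks / (2r). The following is a minimal sketch under that assumption; the rate of 2.6e-9 substitutions per synonymous site per year is not stated in the abstract but is back-calculated here from the pairing of Ks = 0.312 with 60 MYA, and the function names are illustrative.

```python
# Assumed synonymous substitution rate (per site per year). This value is
# inferred from the abstract's pairing of Ks = 0.312 with 60 MYA; it is not
# stated by the authors.
ASSUMED_RATE = 2.6e-9


def ks_to_mya(ks, rate=ASSUMED_RATE):
    """Convert a Ks value to a divergence time in million years ago (MYA).

    Uses T = Ks / (2 * rate); the factor 2 accounts for substitutions
    accumulating independently in both duplicated copies.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    return ks / (2.0 * rate) / 1e6


def classify_duplication(ks, boundary=0.312):
    """Label a duplicate pair 'recent' or 'old' using the abstract's Ks boundary."""
    return "recent" if ks < boundary else "old"


if __name__ == "__main__":
    for ks in (0.005, 0.312, 1.5):
        print(ks, round(ks_to_mya(ks), 2), classify_duplication(ks))
```

With the assumed rate, the lower bound Ks = 0.005 maps to roughly 1 MYA and Ks = 0.312 to 60 MYA, matching the ranges quoted in the abstract.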
Manno, N; Sherratt, S; Boaretto, F; Coico, F Mejìa; Camus, C Espinoza; Campos, C Jara; Musumeci, S; Battisti, A; Quinnell, R J; León, J Mostacero; Vazza, G; Mostacciuolo, M L; Paoletti, M G; Falcone, F H
2014-11-26
The human genome encodes a gene for an enzymatically active chitinase (CHIT1) located in a single copy on Chromosome 1, which is highly expressed by activated macrophages and in other cells of the innate immune response. Several dysfunctional mutations are known in CHIT1, including a 24-bp duplication in Exon 10 causing catalytic deficiency. This duplication is a common variant conserved in many human populations, except in West and South Africans. Thus it has been proposed that human migration out of Africa and the consequent reduction of exposure to chitin from environmental factors may have enabled the conservation of dysfunctional mutations in human chitinases. Our data obtained from 85 indigenous Amerindians from Peru, representative of populations characterized by high prevalence of chitin-bearing enteroparasites and intense entomophagy, reveal a very high frequency of the 24-bp duplication (47.06%), and of other single nucleotide polymorphisms which are known to partially affect enzymatic activity (G102S: 42.7% and A442G/V: 25.5%). Our finding is in line with a founder effect, but appears to confute our previous hypothesis of a protective role against parasite infection and sustains the discussion on the redundancy of chitinolytic function. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
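As a worked check of the arithmetic in the abstract above: a frequency of 47.06% among 85 diploid individuals corresponds to 80 of 170 chromosomes. The sketch below computes an allele frequency from genotype counts; the abstract does not report genotype counts, so the split into 20 homozygotes and 40 heterozygotes is purely hypothetical and chosen only because it sums to 80 copies.

```python
def allele_frequency(n_homozygous, n_heterozygous, n_individuals):
    """Frequency of a variant allele in a sample of diploid individuals.

    Each homozygote carries two copies of the variant and each heterozygote
    carries one; the denominator is the total number of chromosomes sampled.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    if n_homozygous + n_heterozygous > n_individuals:
        raise ValueError("carriers cannot exceed the number of individuals")
    return (2 * n_homozygous + n_heterozygous) / (2 * n_individuals)


if __name__ == "__main__":
    # Hypothetical genotype counts (not from the abstract): 20 + 40 carriers.
    freq = allele_frequency(n_homozygous=20, n_heterozygous=40, n_individuals=85)
    print(f"{100 * freq:.2f}%")  # prints 47.06%
```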
Ragupathy, Raja; Naeem, Hamid A; Reimer, Elsa; Lukow, Odean M; Sapirstein, Harry D; Cloutier, Sylvie
2008-01-01
Sequencing of a BAC clone encompassing the Glu-B1 locus in Glenlea revealed a 10.3 kb segmental duplication including the Bx7 gene and flanking an LTR retroelement. To better understand the evolution of this locus, two collections of wheat were surveyed. The first consisted of 96 diploid and tetraploid species accessions while the second consisted of 316 Triticum aestivum cultivars and landraces from 41 countries. The genotypes were first characterized by SDS-PAGE and a total of 40 of the 316 T. aestivum accessions were found to display the overexpressed Bx7 phenotype (Bx7OE). Three lines from the 96 diploid/tetraploid collection also displayed the stronger intensity staining characteristic of the Bx7OE subunit. The relative amounts of the Bx7 subunit to total HMW-GS were quantified by RP-HPLC for all Bx7OE accessions and a number of checks. The entire collection was assessed for the presence of four DNA markers, namely an 18 bp indel of the coding region of Bx7 variant alleles, a 43 bp indel of the 5'-region and the left and right junctions of the LTR retrotransposon borders and the duplicated segment. All 43 accessions found to have the Bx7OE subunit by SDS-PAGE and RP-HPLC produced the four diagnostic PCR amplicons. None of the lines without the Bx7OE had the LTR retroelement/duplication genomic structure. However, the 18 and 43 bp indels were found in accessions other than Bx7OE. These results indicate that the overexpression of the Bx7 HMW-GS is likely the result of a single event, i.e., a gene duplication at the Glu-B1 locus mediated by the insertion of a retroelement. Also, the 18 and 43 bp indels pre-date the duplication event. Allelic variants Bx7*, Bx7 with and without 43 bp insert and Bx7OE were found in both tetraploid and hexaploid collections and shared the same genomic organization. Though the possibility of introgression from T. aestivum to T. turgidum cannot be ruled out, the three structural genomic changes of the B-genome taken together support the hypothesis of multiple polyploidization events involving different tetraploid progenitors.
Hayashida, Kyoko; Hara, Yuichiro; Abe, Takashi; Yamasaki, Chisato; Toyoda, Atsushi; Kosuge, Takehide; Suzuki, Yutaka; Sato, Yoshiharu; Kawashima, Shuichi; Katayama, Toshiaki; Wakaguri, Hiroyuki; Inoue, Noboru; Homma, Keiichi; Tada-Umezaki, Masahito; Yagi, Yukio; Fujii, Yasuyuki; Habara, Takuya; Kanehisa, Minoru; Watanabe, Hidemi; Ito, Kimihito; Gojobori, Takashi; Sugawara, Hideaki; Imanishi, Tadashi; Weir, William; Gardner, Malcolm; Pain, Arnab; Shiels, Brian; Hattori, Masahira; Nene, Vishvanath; Sugimoto, Chihiro
2012-01-01
ABSTRACT We sequenced the genome of Theileria orientalis, a tick-borne apicomplexan protozoan parasite of cattle. The focus of this study was a comparative genome analysis of T. orientalis relative to other highly pathogenic Theileria species, T. parva and T. annulata. T. parva and T. annulata induce transformation of infected cells of lymphocyte or macrophage/monocyte lineages; in contrast, T. orientalis does not induce uncontrolled proliferation of infected leukocytes and multiplies predominantly within infected erythrocytes. While synteny across homologous chromosomes of the three Theileria species was found to be well conserved overall, subtelomeric structures were found to differ substantially, as T. orientalis lacks the large tandemly arrayed subtelomere-encoded variable secreted protein-encoding gene family. Moreover, expansion of particular gene families by gene duplication was found in the genomes of the two transforming Theileria species, most notably, the TashAT/TpHN and Tar/Tpr gene families. Gene families that are present only in T. parva and T. annulata and not in T. orientalis, Babesia bovis, or Plasmodium were also identified. Identification of differences between the genome sequences of Theileria species with different abilities to transform and immortalize bovine leukocytes will provide insight into proteins and mechanisms that have evolved to induce and regulate this process. The T. orientalis genome database is available at http://totdb.czc.hokudai.ac.jp/. PMID:22951932
DOE Office of Scientific and Technical Information (OSTI.GOV)
Emms, David M.; Covshoff, Sarah; Hibberd, Julian M.
C4 photosynthesis is considered one of the most remarkable examples of evolutionary convergence in eukaryotes. However, it is unknown whether the evolution of C4 photosynthesis required the evolution of new genes. Genome-wide gene-tree species-tree reconciliation of seven monocot species that span two origins of C4 photosynthesis revealed that there was significant parallelism in the duplication and retention of genes coincident with the evolution of C4 photosynthesis in these lineages. Specifically, 21 orthologous genes were duplicated and retained independently in parallel at both C4 origins. Analysis of this gene cohort revealed that the set of parallel duplicated and retained genes is enriched for genes that are preferentially expressed in bundle sheath cells, the cell type in which photosynthesis was activated during C4 evolution. Moreover, functional analysis of the cohort of parallel duplicated genes identified SWEET-13 as a potential key transporter in the evolution of C4 photosynthesis in grasses, and provides new insight into the mechanism of phloem loading in these C4 species.
A duplicated PLP gene causing Pelizaeus-Merzbacher disease detected by comparative multiplex PCR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inoue, K.; Sugiyama, N.; Kawanishi, C.
1996-07-01
Pelizaeus-Merzbacher disease (PMD) is an X-linked dysmyelinating disorder caused by abnormalities in the proteolipid protein (PLP) gene, which is essential for oligodendrocyte differentiation and CNS myelin formation. Although linkage analysis has shown the homogeneity at the PLP locus in patients with PMD, exonic mutations in the PLP gene have been identified in only 10% - 25% of all cases, which suggests the presence of other genetic aberrations, including gene duplication. In this study, we examined five families with PMD not carrying exonic mutations in the PLP gene, using comparative multiplex PCR (CM-PCR) as a semiquantitative assay of gene dosage. PLP gene duplications were identified in four families by CM-PCR and confirmed in three families by densitometric RFLP analysis. Because a homologous myelin protein gene, PMP22, is duplicated in the majority of patients with Charcot-Marie-Tooth 1A, PLP gene overdosage may be an important genetic abnormality in PMD and affect myelin formation. 38 ref., 5 figs., 2 tabs.
The Genomic Basis of Evolutionary Innovation in Pseudomonas aeruginosa
Wagner, Andreas; MacLean, R. Craig
2016-01-01
Novel traits play a key role in evolution, but their origins remain poorly understood. Here we address this problem by using experimental evolution to study bacterial innovation in real time. We allowed 380 populations of Pseudomonas aeruginosa to adapt to 95 different carbon sources that challenged bacteria with either evolving novel metabolic traits or optimizing existing traits. Whole genome sequencing of more than 80 clones revealed profound differences in the genetic basis of innovation and optimization. Innovation was associated with the rapid acquisition of mutations in genes involved in transcription and metabolism. Mutations in pre-existing duplicate genes in the P. aeruginosa genome were common during innovation, but not optimization. These duplicate genes may have been acquired by P. aeruginosa due to either spontaneous gene amplification or horizontal gene transfer. High throughput phenotype assays revealed that novelty was associated with increased pleiotropic costs that are likely to constrain innovation. However, mutations in duplicate genes with close homologs in the P. aeruginosa genome were associated with low pleiotropic costs compared to mutations in duplicate genes with distant homologs in the P. aeruginosa genome, suggesting that functional redundancy between duplicates facilitates innovation by buffering pleiotropic costs. PMID:27149698
Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs
Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L
2016-01-01
Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336
Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J
2016-12-01
High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. 
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Katoh, M; Kirikoshi, H; Terasaki, H; Shiokawa, K
2001-12-21
Genetic alterations of WNT signaling molecules lead to carcinogenesis through activation of the beta-catenin-TCF signaling pathway. We have previously cloned and characterized WNT2B/WNT13 gene on human chromosome 1p13, which is homologous to proto-oncogene WNT2 on human chromosome 7q31. WNT2B1 and WNT2B2 mRNAs, generated from the WNT2B gene due to alternative splicing of the alternative promoter type, encode almost identical polypeptides with divergence in the N-terminal region. WNT2B2 mRNA rather than WNT2B1 mRNA is preferentially expressed in NT2 cells with the potential of neuronal differentiation. Here, we describe our investigations of expression of WNT2B mRNAs in various types of human primary cancer. Matched tumor/normal expression array analysis revealed that WNT2B mRNAs were significantly up-regulated in 2 of 8 cases of primary gastric cancer. WNT2B2 mRNA rather than WNT2B1 mRNA was found to be preferentially up-regulated in a case of primary gastric cancer (signet ring cell carcinoma). Function of WNT2B1 mRNA and that of WNT2B2 mRNA were investigated by using Xenopus axis duplication assay. Injection of synthetic WNT2B1 mRNA into the ventral marginal zone of fertilized Xenopus eggs at the 4-cell stage did not induce axis duplication. In contrast, ventral injection of synthetic WNT2B2 mRNA induced axis duplication in 90% of embryos (complete axis duplication, 24%). These results strongly suggest that WNT2B2 up-regulation in some cases of gastric cancer might lead to carcinogenesis through activation of the beta-catenin-TCF signaling pathway.
Chen, Nian; Lai, Xiao-Ping
2010-07-01
We obtained the complete mitochondrial genome of King Cobra (GenBank accession number: EU_921899) by Ex Taq-PCR, TA-cloning and primer-walking methods. This genome is very similar to that of other vertebrates; it is 17 267 bp in length and encodes 38 genes (including 13 protein-coding, 2 ribosomal RNA and 23 transfer RNA genes) and two long non-coding regions. The duplication of the tRNA-Ile gene forms a new mitochondrial gene rearrangement model. Eight tRNA genes and one protein-coding gene were transcribed from the L strand, and the other genes were transcribed from the H strand. Genes on the H strand show fairly similar contents of adenosine and thymine, whereas those on the L strand have a higher proportion of A than T. Combined rDNA sequence data (12S+16S rRNA) were used to reconstruct the phylogeny of 21 snake species for which complete mitochondrial genome sequences were available in the public databases. This large data set and an appropriate range of outgroup taxa demonstrated that Elapidae is more closely related to Colubridae than to Viperidae, which supports the traditional viewpoint.
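The strand comparison of A and T content described in the abstract above is commonly summarised as an AT skew, (A - T) / (A + T). The abstract itself does not name this statistic, so the sketch below is one conventional way to quantify the observation rather than the authors' method; the example sequences are invented for illustration.

```python
def at_skew(sequence):
    """Return the AT skew, (A - T) / (A + T), of a nucleotide sequence.

    A value near 0 means similar A and T content; a positive value means
    more A than T. Characters other than A and T are ignored.
    """
    seq = sequence.upper()
    a_count = seq.count("A")
    t_count = seq.count("T")
    if a_count + t_count == 0:
        raise ValueError("sequence contains no A or T")
    return (a_count - t_count) / (a_count + t_count)


if __name__ == "__main__":
    # Invented toy sequences, not taken from the King Cobra genome.
    print(at_skew("ATGCATGCAT"))  # balanced A and T -> 0.0
    print(at_skew("AAAGCATGCA"))  # A-rich -> 0.6666666666666666
```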
Polan, Michelle B; Pastore, Matthew T; Steingass, Katherine; Hashimoto, Sayaka; Thrush, Devon L; Pyatt, Robert; Reshmi, Shalini; Gastier-Foster, Julie M; Astbury, Caroline; McBride, Kim L
2014-01-01
Recent studies have shown that certain copy number variations (CNV) are associated with a wide range of neurodevelopmental disorders, including autism spectrum disorders (ASD), bipolar disorder and intellectual disabilities. Implicated regions and genes have comprised a variety of post synaptic complex proteins and neurotransmitter receptors, including gamma-amino butyric acid A (GABAA). Clusters of GABAA receptor subunit genes are found on chromosomes 4p12, 5q34, 6q15 and 15q11-13. Maternally inherited 15q11-13 duplications among individuals with neurodevelopmental disorders are well described, but few case reports exist for the other regions. We describe a family with a 2.42 Mb duplication at chromosome 4p13 to 4p12, identified in the index case and other family members by oligonucleotide array comparative genomic hybridization, that contains 13 genes including a cluster of four GABAA receptor subunit genes. Fluorescent in-situ hybridization was used to confirm the duplication. The duplication segregates with a variety of neurodevelopmental disorders in this family, including ASD (index case), developmental delay, dyspraxia and ADHD (brother), global developmental delays (brother), learning disabilities (mother) and bipolar disorder (maternal grandmother). In addition, we identified and describe another individual unrelated to this family, with a similar duplication, who was diagnosed with ASD, ADHD and borderline intellectual disability. The 4p13 to 4p12 duplication appears to confer a susceptibility to a variety of neurodevelopmental disorders in these two families. We hypothesize that the duplication acts through a dosage effect of GABAA receptor subunit genes, adding evidence for alterations in the GABAergic system in the etiology of neurodevelopmental disorders. PMID:23695283
Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
2004-03-01
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Duplicated genes evolve independently in allopolyploid cotton.
Richard C. Cronn; Randall L. Small; Jonathan F. Wendel
1999-01-01
Of the many processes that generate gene duplications, polyploidy is unique in that entire genomes are duplicated. This process has been important in the evolution of many eukaryotic groups, and it occurs with high frequency in plants. Recent evidence suggests that polyploidization may be accompanied by rapid genomic changes, but the evolutionary fate of discrete loci...
Genomic mechanisms accounting for the adaptation to parasitism in nematode-trapping fungi.
Meerupati, Tejashwari; Andersson, Karl-Magnus; Friman, Eva; Kumar, Dharmendra; Tunlid, Anders; Ahrén, Dag
2013-11-01
Orbiliomycetes is one of the earliest diverging branches of the filamentous ascomycetes. The class contains nematode-trapping fungi that form unique infection structures, called traps, to capture and kill free-living nematodes. The traps have evolved differently along several lineages and include adhesive traps (knobs, nets or branches) and constricting rings. We show, by genome sequencing of the knob-forming species Monacrosporium haptotylum and comparison with the net-forming species Arthrobotrys oligospora, that two genomic mechanisms are likely to have been important for the adaptation to parasitism in these fungi. Firstly, the expansion of protein domain families and the large number of species-specific genes indicated that gene duplication followed by functional diversification had a major role in the evolution of the nematode-trapping fungi. Gene expression indicated that many of these genes are important for pathogenicity. Secondly, gene expression of orthologs between the two fungi during infection indicated that differential regulation was an important mechanism for the evolution of parasitism in nematode-trapping fungi. Many of the highly expressed and highly upregulated M. haptotylum transcripts during the early stages of nematode infection were species-specific and encoded small secreted proteins (SSPs) that were affected by repeat-induced point mutations (RIP). An active RIP mechanism was revealed by lack of repeats, dinucleotide bias in repeats and genes, low proportion of recent gene duplicates, and reduction of recent gene family expansions. The high expression and rapid divergence of SSPs indicate a striking similarity in the infection mechanisms of nematode-trapping fungi and plant and insect pathogens from the crown groups of the filamentous ascomycetes (Pezizomycotina). The patterns of gene family expansions in the nematode-trapping fungi were more similar to plant pathogens than to insect and animal pathogens. 
The observation of RIP activity in the Orbiliomycetes suggested that this mechanism was present early in the evolution of the filamentous ascomycetes.
Genomic organization of plant aminopropyl transferases.
Rodríguez-Kessler, Margarita; Delgado-Sánchez, Pablo; Rodríguez-Kessler, Gabriela Theresia; Moriguchi, Takaya; Jiménez-Bremont, Juan Francisco
2010-07-01
Aminopropyl transferases like spermidine synthase (SPDS; EC 2.5.1.16), spermine synthase and thermospermine synthase (SPMS, tSPMS; EC 2.5.1.22) belong to a class of widely distributed enzymes that use decarboxylated S-adenosylmethionine as an aminopropyl donor and putrescine or spermidine as an amino acceptor to form in that order spermidine, spermine or thermospermine. We describe the analysis of plant genomic sequences encoding SPDS, SPMS, tSPMS and PMT (putrescine N-methyltransferase; EC 2.1.1.53). Genome organization (including exon size, gain and loss, as well as intron number, size, loss, retention, placement and phase, and the presence of transposons) of plant aminopropyl transferase genes were compared between the genomic sequences of SPDS, SPMS and tSPMS from Zea mays, Oryza sativa, Malus x domestica, Populus trichocarpa, Arabidopsis thaliana and Physcomitrella patens. In addition, the genomic organization of plant PMT genes, proposed to be derived from SPDS during the evolution of alkaloid metabolism, is illustrated. Herein, a particular conservation and arrangement of exon and intron sequences between plant SPDS, SPMS and PMT genes that clearly differs with that of ACL5 genes, is shown. The possible acquisition of the plant SPMS exon II and, in particular exon XI in the monocot SPMS genes, is a remarkable feature that allows their differentiation from SPDS genes. In accordance with our in silico analysis, functional complementation experiments of the maize ZmSPMS1 enzyme (previously considered to be SPDS) in yeast demonstrated its spermine synthase activity. Another significant aspect is the conservation of intron sequences among SPDS and PMT paralogs. In addition the existence of microsynteny among some SPDS paralogs, especially in P. trichocarpa and A. thaliana, supports duplication events of plant SPDS genes. 
Based on our analysis, we hypothesize that SPMS genes appeared with the divergence of vascular plants by a process of gene duplication and the acquisition of unique exons of as-yet unknown origin. 2010 Elsevier Masson SAS. All rights reserved.
Genomic Mechanisms Accounting for the Adaptation to Parasitism in Nematode-Trapping Fungi
Meerupati, Tejashwari; Andersson, Karl-Magnus; Friman, Eva; Kumar, Dharmendra; Tunlid, Anders; Ahrén, Dag
2013-01-01
Orbiliomycetes is one of the earliest diverging branches of the filamentous ascomycetes. The class contains nematode-trapping fungi that form unique infection structures, called traps, to capture and kill free-living nematodes. The traps have evolved differently along several lineages and include adhesive traps (knobs, nets or branches) and constricting rings. We show, by genome sequencing of the knob-forming species Monacrosporium haptotylum and comparison with the net-forming species Arthrobotrys oligospora, that two genomic mechanisms are likely to have been important for the adaptation to parasitism in these fungi. Firstly, the expansion of protein domain families and the large number of species-specific genes indicated that gene duplication followed by functional diversification had a major role in the evolution of the nematode-trapping fungi. Gene expression indicated that many of these genes are important for pathogenicity. Secondly, gene expression of orthologs between the two fungi during infection indicated that differential regulation was an important mechanism for the evolution of parasitism in nematode-trapping fungi. Many of the highly expressed and highly upregulated M. haptotylum transcripts during the early stages of nematode infection were species-specific and encoded small secreted proteins (SSPs) that were affected by repeat-induced point mutations (RIP). An active RIP mechanism was revealed by lack of repeats, dinucleotide bias in repeats and genes, low proportion of recent gene duplicates, and reduction of recent gene family expansions. The high expression and rapid divergence of SSPs indicate a striking similarity in the infection mechanisms of nematode-trapping fungi and plant and insect pathogens from the crown groups of the filamentous ascomycetes (Pezizomycotina). The patterns of gene family expansions in the nematode-trapping fungi were more similar to plant pathogens than to insect and animal pathogens. 
The observation of RIP activity in the Orbiliomycetes suggested that this mechanism was present early in the evolution of the filamentous ascomycetes. PMID:24244185
On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees.
Kordi, Misagh; Bansal, Mukul S
2017-01-01
Duplication-Transfer-Loss (DTL) reconciliation has emerged as a powerful technique for studying gene family evolution in the presence of horizontal gene transfer. DTL reconciliation takes as input a gene family phylogeny and the corresponding species phylogeny, and reconciles the two by postulating speciation, gene duplication, horizontal gene transfer, and gene loss events. Efficient algorithms exist for finding optimal DTL reconciliations when the gene tree is binary. However, gene trees are frequently non-binary. With such non-binary gene trees, the reconciliation problem seeks to find a binary resolution of the gene tree that minimizes the reconciliation cost. Given the prevalence of non-binary gene trees, many efficient algorithms have been developed for this problem in the context of the simpler Duplication-Loss (DL) reconciliation model. Yet, no efficient algorithms exist for DTL reconciliation with non-binary gene trees and the complexity of the problem remains unknown. In this work, we resolve this open question by showing that the problem is, in fact, NP-hard. Our reduction applies to both the dated and undated formulations of DTL reconciliation. By resolving this long-standing open problem, this work will spur the development of both exact and heuristic algorithms for this important problem.
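To make the reconciliation setting in the abstract above concrete, here is a minimal sketch of the classical lowest-common-ancestor (LCA) mapping for the simpler Duplication-Loss model with a binary gene tree. It is not the DTL algorithm and does not handle non-binary gene trees, transfers, or loss counting; the nested-tuple tree encoding, the function names, and the example trees are all assumptions made for illustration.

```python
def clades(tree):
    """Return (leaf_set, all_clades) for a binary tree given as nested tuples.

    Leaves are strings; internal nodes are 2-tuples. Each clade is
    represented by the frozenset of leaf names below it.
    """
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left, left_all = clades(tree[0])
    right, right_all = clades(tree[1])
    node = left | right
    return node, left_all + right_all + [node]


def lca(species_clades, species_set):
    """Smallest species-tree clade that contains every species in species_set."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_clades, gene_to_species):
    """Map each gene-tree node to a species-tree clade and count duplications.

    A gene-tree node is called a duplication when it maps to the same
    species-tree clade as at least one of its children.
    Returns (clade_of_this_node, number_of_duplications_below_and_here).
    """
    if isinstance(gene_tree, str):
        species = frozenset([gene_to_species[gene_tree]])
        return lca(species_clades, species), 0
    left_map, left_dups = count_duplications(gene_tree[0], species_clades, gene_to_species)
    right_map, right_dups = count_duplications(gene_tree[1], species_clades, gene_to_species)
    here = lca(species_clades, left_map | right_map)
    is_duplication = here == left_map or here == right_map
    return here, left_dups + right_dups + int(is_duplication)


if __name__ == "__main__":
    # Hypothetical example: species tree ((A,B),C) and a gene family in which
    # the ancestor of A and B carried two copies.
    species_tree = (("A", "B"), "C")
    gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
    gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
    _, all_clades = clades(species_tree)
    _, n_dups = count_duplications(gene_tree, all_clades, gene_to_species)
    print(n_dups)  # prints 1
```

Representing clades as leaf sets keeps the LCA lookup short at the cost of efficiency; it is meant to show the mapping idea, not to scale to large trees.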
Major COL4A5 gene rearrangements in patients with juvenile type Alport syndrome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Renieri, A.; Galli, L.; Bruttini, M.
1995-11-20
Mutations in the COL4A5 gene, which encodes the α5 chain of type IV collagen, are found in a large fraction of patients with X-linked Alport syndrome. The recently discovered COL4A6, tightly linked and highly homologous to COL4A5, represents a second candidate gene for Alport syndrome. We analyzed 177 Italian Alport syndrome families by Southern blotting using cDNA probes from both COL4A5 and COL4A6. Nine unrelated families, accounting for 5% of the cases, were found to have a rearrangement in COL4A5. No rearrangements were found in COL4A6, with the exception of a deletion encompassing the 5′ ends of both COL4A5 and COL4A6 genes in a patient with Alport syndrome and leiomyomatosis. COL4A5 rearrangements were all intragenic and included 1 duplication and 7 deletions. Polymerase chain reaction (PCR) analysis was carried out to characterize deletion and duplication boundaries and to predict the resulting protein abnormality. The two smallest deletions involved a single exon (exons 17 and 40, respectively), while the largest ones spanned exons 1 to 36. The clinical phenotype of patients in whom a rearrangement in COL4A5 was detected was severe, with progression to end-stage renal failure in juvenile age and hypoacusis occurring in most cases. These data have some important implications in the diagnosis of patients with Alport syndrome. 34 refs., 3 figs., 1 tab.
Evolution dynamics of a model for gene duplication under adaptive conflict
NASA Astrophysics Data System (ADS)
Ancliff, Mark; Park, Jeong-Man
2014-06-01
We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
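The ingredients named in this abstract (a Crow-Kimura mutation-selection equation and a fitness that depends on Hamming distances to two reference sequences) can be sketched numerically for very short binary sequences. The sketch below covers single-copy genes only and uses plain Euler integration rather than the path-integral solution described by the authors. The additive fitness form, the parameter values and all function names are assumptions chosen for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def fitness(seq, ref1, ref2, k=1.0):
    """Hypothetical landscape: reward closeness to both reference sequences."""
    n = len(seq)
    return k * ((1 - hamming(seq, ref1) / n) + (1 - hamming(seq, ref2) / n))


def neighbours(seq):
    """All sequences one point mutation away from seq (binary alphabet)."""
    for i in range(len(seq)):
        yield seq[:i] + (1 - seq[i],) + seq[i + 1:]


def mean_fitness(p, r):
    return sum(r[s] * p[s] for s in p)


def euler_step(p, r, mu, dt):
    """One Euler step of dp_s/dt = (r_s - mean r) p_s + mu * sum_t (p_t - p_s)."""
    r_bar = mean_fitness(p, r)
    new = {}
    for s in p:
        mutation = mu * sum(p[t] - p[s] for t in neighbours(s))
        new[s] = p[s] + dt * ((r[s] - r_bar) * p[s] + mutation)
    return new


def simulate(ref1, ref2, mu=0.1, dt=0.01, steps=2000):
    """Evolve a uniform starting population; return frequencies and fitnesses."""
    seqs = list(product((0, 1), repeat=len(ref1)))
    r = {s: fitness(s, ref1, ref2) for s in seqs}
    p = {s: 1.0 / len(seqs) for s in seqs}
    for _ in range(steps):
        p = euler_step(p, r, mu, dt)
    return p, r


# Two hypothetical reference sequences that disagree at two of four sites.
p_final, r = simulate((0, 0, 0, 0), (0, 0, 1, 1))
```

Both the selection term and the mutation term sum to zero over all sequences, so the frequencies stay normalised up to floating-point error. Modelling duplication would require extending the state space to pairs of sequences, which this sketch deliberately leaves out.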
Jourda, Cyril; Cardi, Céline; Gibert, Olivier; Giraldo Toro, Andrès; Ricci, Julien; Mbéguié-A-Mbéguié, Didier; Yahiaoui, Nabila
2016-01-01
Starch is the most widespread and abundant storage carbohydrate in plants. It is also a major feature of cultivated bananas as it accumulates to large amounts during banana fruit development before almost complete conversion to soluble sugars during ripening. Little is known about the structure of major gene families involved in banana starch metabolism and their evolution compared to other species. To identify genes involved in banana starch metabolism and investigate their evolutionary history, we analyzed six gene families playing a crucial role in plant starch biosynthesis and degradation: the ADP-glucose pyrophosphorylases (AGPases), starch synthases (SS), starch branching enzymes (SBE), debranching enzymes (DBE), α-amylases (AMY) and β-amylases (BAM). Using comparative genomics and phylogenetic approaches, these genes were classified into families and sub-families and orthology relationships with functional genes in Eudicots and in grasses were identified. In addition to known ancestral duplications shaping starch metabolism gene families, independent evolution in banana and grasses also occurred through lineage-specific whole genome duplications for specific sub-families of AGPase, SS, SBE, and BAM genes; and through gene-scale duplications for AMY genes. In particular, banana lineage duplications yielded a set of AGPase, SBE and BAM genes that were highly or specifically expressed in banana fruits. Gene expression analysis highlighted a complex transcriptional reprogramming of starch metabolism genes during ripening of banana fruits. A differential regulation of expression between banana gene duplicates was identified for SBE and BAM genes, suggesting that part of starch metabolism regulation in the fruit evolved in the banana lineage. PMID:27994606
Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes
Wang, Jun; Marowsky, Nicholas C.; Fan, Chuanzhu
2014-01-01
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes. PMID:25310342
Carrigan, Matthew A.; Uryasev, Oleg; Davis, Ross P.; Zhai, LanMin; Hurley, Thomas D.; Benner, Steven A.
2012-01-01
Background Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and life styles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids. Methodology/Principal Findings To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences. Conclusions/Significance We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion.
These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates, followed by the loss of an ADH1 paralog in the human lineage. PMID:22859968
Myelin protein zero gene sequencing diagnoses Charcot-Marie-Tooth Type 1B disease
DOE Office of Scientific and Technical Information (OSTI.GOV)
Su, Y.; Zhang, H.; Madrid, R.
1994-09-01
Charcot-Marie-Tooth disease (CMT), the most common genetic neuropathy, affects about 1 in 2600 people in Norway and is found worldwide. CMT Type 1 (CMT1) has slow nerve conduction with demyelinated Schwann cells. Autosomal dominant CMT Type 1B (CMT1B) results from mutations in the myelin protein zero gene which directs the synthesis of more than half of all Schwann cell protein. This gene was mapped to the chromosome 1q22-1q23.1 borderline by fluorescence in situ hybridization. The first 7 of 7 reported CMT1B mutations are unique. Thus the most effective means to identify CMT1B mutations in at-risk family members and fetuses is to sequence the entire coding sequence in dominant or sporadic CMT patients without the CMT1A duplication. Of the 19 primers used in 16 pairs to uniquely amplify the entire MPZ coding sequence, 6 primer pairs were used to amplify and sequence the 6 exons. The DyeDeoxy Terminator cycle sequencing method used with four different color fluorescent labels was superior to manual sequencing because it sequences more bases unambiguously from extracted genomic DNA samples within 24 hours. This protocol was used to test 28 CMT and Dejerine-Sottas patients without CMT1A gene duplication. Sequencing MPZ gene-specific amplified fragments identified 9 polymorphic sites within the 6 exons that encode the 248 amino acid MPZ protein. The large number of major CMT1B mutations identified by single strand sequencing are being verified by reverse strand sequencing and when possible, by restriction enzyme analysis. This protocol can be used to distinguish CMT1B patients from other CMT phenotypes and to determine the CMT1B status of relatives both presymptomatically and prenatally.
Hall, Jennifer R; Clow, Kathy A; Rise, Matthew L; Driedzic, William R
2015-09-01
Aquaglyceroporins (GLPs) are integral membrane proteins that facilitate passive movement of water, glycerol and urea across cellular membranes. In this study, GLP-encoding genes were characterized in rainbow smelt (Osmerus mordax mordax), an anadromous teleost that accumulates high glycerol and modest urea levels in plasma and tissues as an adaptive cryoprotectant mechanism in sub-zero temperatures. We report the gene and promoter sequences for two aqp10b paralogs (aqp10ba, aqp10bb) that are 82% identical at the predicted amino acid level, and aqp9b. Aqp10bb and aqp9b have the 6 exon structure common to vertebrate GLPs. Aqp10ba has 8 exons; there are two additional exons at the 5' end, and the promoter sequence is different from aqp10bb. Molecular phylogenetic analysis suggests that the aqp10b paralogs arose from a gene duplication event specific to the smelt lineage. Smelt GLP transcripts are ubiquitously expressed; however, aqp10ba transcripts were highest in kidney, aqp10bb transcripts were highest in kidney, intestine, pyloric caeca and brain, and aqp9b transcripts were highest in spleen, liver, red blood cells and kidney. In cold-temperature challenge experiments, plasma glycerol and urea levels were significantly higher in cold- compared to warm-acclimated smelt; however, GLP transcript levels were generally either significantly lower or remained constant. The exception was significantly higher aqp10ba transcript levels in kidney. High aqp10ba transcripts in smelt kidney that increase significantly in response to cold temperature in congruence with plasma urea suggest that this gene duplicate may have evolved to allow the re-absorption of urea to concomitantly conserve nitrogen and prevent freezing. Copyright © 2015 Elsevier Inc. All rights reserved.
Lenz, Tobias L; Eizaguirre, Christophe; Becker, Sven; Reusch, Thorsten BH
2009-01-01
Background In all jawed vertebrates, highly polymorphic genes of the major histocompatibility complex (MHC) encode antigen presenting molecules that play a key role in the adaptive immune response. Their polymorphism is composed of multiple copies of recently duplicated genes, each possessing many alleles within populations, as well as high nucleotide divergence between alleles of the same species. Experimental evidence is accumulating that MHC polymorphism is a result of balancing selection by parasites and pathogens. In order to describe MHC diversity and analyse the underlying mechanisms that maintain it, a reliable genotyping technique is required that is suitable for such highly variable genes. Results We present a genotyping protocol that uses Reference Strand-mediated Conformation Analysis (RSCA), optimised for recently duplicated MHC class IIB genes that are typical for many fish and bird species, including the three-spined stickleback, Gasterosteus aculeatus. In addition we use a comprehensive plasmid library of MHC class IIB alleles to determine the nucleotide sequence of alleles represented by RSCA allele peaks. Verification of the RSCA typing by cloning and sequencing demonstrates high congruency between both methods and provides new insight into the polymorphism of classical stickleback MHC genes. Analysis of the plasmid library additionally reveals the high resolution and reproducibility of the RSCA technique. Conclusion This new RSCA genotyping protocol offers a fast, but sensitive and reliable way to determine the MHC allele repertoire of three-spined sticklebacks. It therefore provides a valuable tool to employ this highly polymorphic and adaptive marker in future high-throughput studies of host-parasite co-evolution and ecological speciation in this emerging model organism. PMID:19291291
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules
Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Iskandar, Christelle F; Cailliez-Grimal, Catherine; Rahman, Abdur; Rondags, Emmanuel; Remenant, Benoît; Zagorec, Monique; Leisner, Jorgen J; Borges, Frédéric; Revol-Junelles, Anne-Marie
2016-09-01
The dairy population of Carnobacterium maltaromaticum is characterized by a high diversity suggesting a high diversity of the genetic traits linked to the dairy process. As lactose is the main carbon source in milk, the genetics of lactose metabolism was investigated in this LAB. Comparative genomic analysis revealed that the species C. maltaromaticum exhibits genes related to the Leloir and the tagatose-6-phosphate (Tagatose-6P) pathways. More precisely, strains can bear genes related to one or both pathways and several strains apparently do not contain homologs related to these pathways. Analysis at the population scale revealed that the Tagatose-6P and the Leloir encoding genes are disseminated in multiple phylogenetic lineages of C. maltaromaticum: genes of the Tagatose-6P pathway are present in the lineages I, II and III, and genes of the Leloir pathway are present in the lineages I, III and IV. These data suggest that these genes evolved thanks to horizontal transfer, genetic duplication and translocation. We hypothesize that the lac and gal genes evolved in C. maltaromaticum according to a complex scenario that mirrors the high population diversity. Copyright © 2016 Elsevier Ltd. All rights reserved.
Evolutionary Analysis of MIKCc-Type MADS-Box Genes in Gymnosperms and Angiosperms
Chen, Fei; Zhang, Xingtan; Liu, Xing; Zhang, Liangsheng
2017-01-01
MIKCc-type MADS-box genes encode transcription factors that control floral organ morphogenesis and flowering time in flowering plants. Here, in order to determine when the subfamilies of MIKCc originated and their early evolutionary trajectory, we sampled and analyzed the genomes and large-scale transcriptomes representing all the orders of gymnosperms and basal angiosperms. Through phylogenetic inference, the MIKCc-type MADS-box genes were subdivided into 14 monophyletic clades. Among them, the gymnosperm orthologs of AGL6, SEP, AP1, GMADS, SOC1, AGL32, AP3/PI, SVP, AGL15, ANR1, and AG were identified. We identified and characterized the origin of a novel subfamily, GMADS, within gymnosperms, whose orthologs were lost in monocots and Brassicaceae. ABCE model prototype genes were relatively conserved in terms of gene number in gymnosperms but expanded in angiosperms, whereas SVP, SOC1, and GMADS expanded dramatically in gymnosperms but were conserved in angiosperms. Our results provide the most detailed evolutionary history of all MIKCc gene clades in gymnosperms and angiosperms. We propose that although the near-complete set of MIKCc genes had evolved in gymnosperms, the duplication and expressional transition of ABCE model MIKCc genes in the ancestor of angiosperms triggered the first flower. PMID:28611810
Unraveling the Mechanism Underlying the Glycosylation and Methylation of Anthocyanins in Peach
Cheng, Jun; Wei, Guochao; Zhou, Hui; Gu, Chao; Vimolmangkang, Sornkanok; Liao, Liao; Han, Yuepeng
2014-01-01
Modification of anthocyanin plays an important role in increasing its stability in plants. Here, six anthocyanins were identified in peach (Prunus persica), and their structural diversity is attributed to glycosylation and methylation. Interestingly, peach is quite similar to the wild species Prunus ferganensis but differs from both Prunus davidiana and Prunus kansuensis in terms of anthocyanin composition in flowers. This indicates that peach was probably domesticated from P. ferganensis. Subsequently, genes responsible for both methylation and glycosylation of anthocyanins were identified, and their spatiotemporal expression results in different patterns of anthocyanin accumulation in flowers, leaves, and fruits. Two tandem-duplicated genes encoding flavonoid 3-O-glycosyltransferase (F3GT) in peach, PpUGT78A1 and PpUGT78A2, showed different activity toward anthocyanin, providing an example of divergent evolution of F3GT genes in plants. Two genes encoding anthocyanin O-methyltransferase (AOMT), PpAOMT1 and PpAOMT2, are expressed in leaves and flowers, but only PpAOMT2 is responsible for the O-methylation of anthocyanins at the 3′ position in peach. In addition, our study reveals a novel branch of UGT78 genes in plants that lack the highly conserved intron 2 of the UGT gene family, with a great variation of the amino acid residue at position 22 of the plant secondary product glycosyltransferase box. Our results not only provide insights into the mechanisms underlying anthocyanin glycosylation and methylation in peach but will also aid in future attempts to manipulate flavonoid biosynthesis in peach as well as in other plants. PMID:25106821
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence
Nepal, Madhav P; Benson, Benjamin V
2015-01-01
Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease resistance genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification, model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence was assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which may have implications for future soybean crop improvement. PMID:25922568
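As an illustration of how Ks values are commonly turned into approximate duplication ages, the sketch below applies the standard molecular-clock relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The abstract does not state which rate or method the authors used, so the function name `duplication_age` and the example rate are assumptions, not values taken from the study.

```python
def duplication_age(ks, rate_per_site_per_year):
    """Approximate age in years of a duplicate pair from its Ks value.

    Uses T = Ks / (2 * r): both copies accumulate synonymous
    substitutions independently, hence the factor of two.
    """
    if ks < 0 or rate_per_site_per_year <= 0:
        raise ValueError("ks must be non-negative and the rate positive")
    return ks / (2.0 * rate_per_site_per_year)


# Hypothetical inputs: Ks = 0.122 and an assumed rate of 6.1e-9 per site per year.
example_age = duplication_age(0.122, 6.1e-9)   # roughly ten million years
```

Because the estimate scales inversely with the assumed rate, pairs with different Ks values are best compared under one fixed rate, and saturated Ks values (well above 1) should be treated with caution.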
Santos-Garcia, Diego; Rollat-Farnier, Pierre-Antoine; Beitia, Francisco; Zchori-Fein, Einat; Vavre, Fabrice; Mouton, Laurence; Moya, Andrés; Latorre, Amparo; Silva, Francisco J.
2014-01-01
Many insects harbor inherited bacterial endosymbionts. Although some of them are not strictly essential and are considered facultative, they can be a key to host survival under specific environmental conditions, such as parasitoid attacks, climate changes, or insecticide pressures. The whitefly Bemisia tabaci is at the top of the list of organisms inflicting agricultural damage and outbreaks, and changes in its distribution may be associated to global warming. In this work, we have sequenced and analyzed the genome of Cardinium cBtQ1, a facultative bacterial endosymbiont of B. tabaci and propose that it belongs to a new taxonomic family, which also includes Candidatus Amoebophilus asiaticus and Cardinium cEper1, endosymbionts of amoeba and wasps, respectively. Reconstruction of their last common ancestors’ gene contents revealed an initial massive gene loss from the free-living ancestor. This was followed in Cardinium by smaller losses, associated with settlement in arthropods. Some of these losses, affecting cofactor and amino acid biosynthetic encoding genes, took place in Cardinium cBtQ1 after its divergence from the Cardinium cEper1 lineage and were related to its settlement in the whitefly and its endosymbionts. Furthermore, the Cardinium cBtQ1 genome displays a large proportion of transposable elements, which have recently inactivated genes and produced chromosomal rearrangements. The genome also contains a chromosomal duplication and a multicopy plasmid, which harbors several genes putatively associated with gliding motility, as well as two other genes encoding proteins with potential insecticidal activity. As gene amplification is very rare in endosymbionts, an important function of these genes cannot be ruled out. PMID:24723729
Valoti, Elisabetta; Alberti, Marta; Tortajada, Agustin; Garcia-Fernandez, Jesus; Gastoldi, Sara; Besso, Luca; Bresin, Elena; Remuzzi, Giuseppe; Rodriguez de Cordoba, Santiago; Noris, Marina
2015-01-01
Genomic aberrations affecting the genes encoding factor H (FH) and the five FH-related proteins (FHRs) have been described in patients with atypical hemolytic uremic syndrome (aHUS), a rare condition characterized by microangiopathic hemolytic anemia, thrombocytopenia, and ARF. These genomic rearrangements occur through nonallelic homologous recombinations caused by the presence of repeated homologous sequences in CFH and CFHR1-R5 genes. In this study, we found heterozygous genomic rearrangements among CFH and CFHR genes in 4.5% of patients with aHUS. CFH/CFHR rearrangements were associated with poor clinical prognosis and high risk of post-transplant recurrence. Five patients carried known CFH/CFHR1 genes, but we found a duplication leading to a novel CFHR1/CFH hybrid gene in a family with two affected subjects. The resulting fusion protein contains the first four short consensus repeats of FHR1 and the terminal short consensus repeat 20 of FH. In an FH-dependent hemolysis assay, we showed that the hybrid protein causes sheep erythrocyte lysis. Functional analysis of the FHR1 fraction purified from serum of heterozygous carriers of the CFHR1/CFH hybrid gene indicated that the FHR1/FH hybrid protein acts as a competitive antagonist of FH. Furthermore, sera from carriers of the hybrid CFHR1/CFH gene induced more C5b-9 deposition on endothelial cells than control serum. These results suggest that this novel genomic hybrid mediates disease pathogenesis through dysregulation of complement at the endothelial cell surface. We recommend that genetic screening of aHUS includes analysis of CFH and CFHR rearrangements, particularly before a kidney transplant. Copyright © 2015 by the American Society of Nephrology.
Chromosome-encoded narrow-spectrum Ambler class A beta-lactamase GIL-1 from Citrobacter gillenii.
Naas, Thierry; Aubert, Daniel; Ozcan, Ayla; Nordmann, Patrice
2007-04-01
A novel beta-lactamase gene was cloned from the whole-cell DNA of an enterobacterial Citrobacter gillenii reference strain that displayed a weak narrow-spectrum beta-lactam-resistant phenotype and was expressed in Escherichia coli. It encoded a clavulanic acid-inhibited Ambler class A beta-lactamase, GIL-1, with a pI value of 7.5 and a molecular mass of ca. 29 kDa. GIL-1 had the highest percent amino acid sequence identity with TEM-1 and SHV-1, 77% and 67%, respectively, and only 46%, 31%, and 32% amino acid sequence identity with CKO-1 (C. koseri), CdiA1 (C. diversus), and SED-1 (C. sedlakii), respectively. The substrate profile of the purified GIL-1 was similar to that of beta-lactamases TEM-1 and SHV-1. The blaGIL-1 gene was chromosomally located, as revealed by I-CeuI experiments, and was constitutively expressed at a low level in C. gillenii. No gene homologous to the regulatory ampR genes of chromosomal class C beta-lactamases was found upstream of the blaGIL-1 gene, which fits the noninducibility of beta-lactamase expression in C. gillenii. Rapid amplification of DNA 5' ends analysis of the promoter region revealed putative promoter sequences that diverge from what has been identified as the consensus sequence in E. coli. The blaGIL-1 gene was part of a 5.5-kb DNA fragment bracketed by a 9-bp duplication and inserted between the d-lactate dehydrogenase gene and the ydbH gene; this DNA fragment was absent in other Citrobacter species. This work further illustrates the heterogeneity of beta-lactamases in Citrobacter spp., which may indicate that the variability of Citrobacter species is greater than expected.
He, Yajun; Mao, Shaoshuai; Gao, Yulong; Zhu, Liying; Wu, Daoming; Cui, Yixin; Li, Jiana; Qian, Wei
2016-01-01
WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. Among the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance of various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars.
We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress-related QTL regions, indicating a role for tandem duplicate WRKYs in adaptive responses to environmental stimuli during evolution. Our results provide a framework for future studies regarding the function of WRKY genes in response to stress in B. napus. PMID:27322342
The role of retrotransposons in gene family expansions: insights from the mouse Abp gene family.
Janoušek, Václav; Karn, Robert C; Laukaitis, Christina M
2013-05-29
Retrotransposons have been suggested to provide a substrate for non-allelic homologous recombination (NAHR) and thereby promote gene family expansion. Their precise role, however, is controversial. Here we ask whether retrotransposons contributed to the recent expansions of the Androgen-binding protein (Abp) gene families that occurred independently in the mouse and rat genomes. Using dot plot analysis, we found that the most recent duplication in the Abp region of the mouse genome is flanked by L1Md_T elements. Analysis of the sequence of these elements revealed breakpoints that are the relicts of the recombination that caused the duplication, confirming that the duplication arose as a result of NAHR using L1 elements as substrates. L1 and ERVII retrotransposons are considerably denser in the Abp regions than in one Mb flanking regions, while other repeat types are depleted in the Abp regions compared to flanking regions. L1 retrotransposons preferentially accumulated in the Abp gene regions after lineage separation and roughly followed the pattern of Abp gene expansion. By contrast, the proportion of shared vs. lineage-specific ERVII repeats in the Abp region resembles the rest of the genome. We confirmed the role of L1 repeats in Abp gene duplication with the identification of recombinant L1Md_T elements at the edges of the most recent mouse Abp gene duplication. High densities of L1 and ERVII repeats were found in the Abp gene region with abrupt transitions at the region boundaries, suggesting that their higher densities are tightly associated with Abp gene duplication. We observed that the major accumulation of L1 elements occurred after the split of the mouse and rat lineages and that there is a striking overlap between the timing of L1 accumulation and expansion of the Abp gene family in the mouse genome. 
Establishing a link between the accumulation of L1 elements and the expansion of the Abp gene family and identification of an NAHR-related breakpoint in the most recent duplication are the main contributions of our study.
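The dot plot analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the sequence below is a hypothetical toy, the function names dot_plot and off_diagonal_offsets are invented for illustration, and the word size of 6 is an arbitrary assumption. The idea is that comparing a region against itself places every repeated word off the main diagonal, so a tandem duplication appears as a diagonal displaced by the length of the duplicated unit.

```python
from collections import Counter

def dot_plot(seq_a, seq_b, word=6):
    """Return (i, j) pairs where the word at seq_a[i] equals the word at seq_b[j]."""
    positions = {}
    for j in range(len(seq_b) - word + 1):
        positions.setdefault(seq_b[j:j + word], []).append(j)
    return [(i, j)
            for i in range(len(seq_a) - word + 1)
            for j in positions.get(seq_a[i:i + word], [])]

def off_diagonal_offsets(hits):
    """Count hits on each diagonal (offset j - i), skipping the main diagonal."""
    return dict(Counter(j - i for i, j in hits if j != i))

# Hypothetical toy sequence: a 12-bp unit duplicated in tandem between unique flanks.
LEFT, UNIT, RIGHT = "ACGTTGCA", "GATTACAGGCTC", "TTAGCCGA"
toy = LEFT + UNIT + UNIT + RIGHT

hits = dot_plot(toy, toy)             # self-comparison of the region
offsets = off_diagonal_offsets(hits)  # diagonals other than the trivial i == j line
print(offsets)
```

In this toy case the only off-diagonal signal sits at offsets of plus and minus 12, the length of the duplicated unit; in real genomic data, repeats such as L1 elements flanking a duplication would add further diagonals that need to be inspected separately.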
The role of retrotransposons in gene family expansions: insights from the mouse Abp gene family
2013-01-01
Background: Retrotransposons have been suggested to provide a substrate for non-allelic homologous recombination (NAHR) and thereby promote gene family expansion. Their precise role, however, is controversial. Here we ask whether retrotransposons contributed to the recent expansions of the Androgen-binding protein (Abp) gene families that occurred independently in the mouse and rat genomes. Results: Using dot plot analysis, we found that the most recent duplication in the Abp region of the mouse genome is flanked by L1Md_T elements. Analysis of the sequence of these elements revealed breakpoints that are the relicts of the recombination that caused the duplication, confirming that the duplication arose as a result of NAHR using L1 elements as substrates. L1 and ERVII retrotransposons are considerably denser in the Abp regions than in one Mb flanking regions, while other repeat types are depleted in the Abp regions compared to flanking regions. L1 retrotransposons preferentially accumulated in the Abp gene regions after lineage separation and roughly followed the pattern of Abp gene expansion. By contrast, the proportion of shared vs. lineage-specific ERVII repeats in the Abp region resembles the rest of the genome. Conclusions: We confirmed the role of L1 repeats in Abp gene duplication with the identification of recombinant L1Md_T elements at the edges of the most recent mouse Abp gene duplication. High densities of L1 and ERVII repeats were found in the Abp gene region with abrupt transitions at the region boundaries, suggesting that their higher densities are tightly associated with Abp gene duplication. We observed that the major accumulation of L1 elements occurred after the split of the mouse and rat lineages and that there is a striking overlap between the timing of L1 accumulation and expansion of the Abp gene family in the mouse genome.
Establishing a link between the accumulation of L1 elements and the expansion of the Abp gene family and identification of an NAHR-related breakpoint in the most recent duplication are the main contributions of our study. PMID:23718880
Macronuclear Genome Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote
Eisen, Jonathan A; Coyne, Robert S; Wu, Martin; Wu, Dongying; Thiagarajan, Mathangi; Wortman, Jennifer R; Badger, Jonathan H; Ren, Qinghu; Amedeo, Paolo; Jones, Kristie M; Tallon, Luke J; Delcher, Arthur L; Salzberg, Steven L; Silva, Joana C; Haas, Brian J; Majoros, William H; Farzad, Maryam; Carlton, Jane M; Smith, Roger K; Garg, Jyoti; Pearlman, Ronald E; Karrer, Kathleen M; Sun, Lei; Manning, Gerard; Elde, Nels C; Turkewitz, Aaron P; Asai, David J; Wilkes, David E; Wang, Yufeng; Cai, Hong; Collins, Kathleen; Stewart, B. Andrew; Lee, Suzanne R; Wilamowska, Katarzyna; Weinberg, Zasha; Ruzzo, Walter L; Wloga, Dorota; Gaertig, Jacek; Frankel, Joseph; Tsao, Che-Chia; Gorovsky, Martin A; Keeling, Patrick J; Waller, Ross F; Patron, Nicola J; Cherry, J. Michael; Stover, Nicholas A; Krieger, Cynthia J; del Toro, Christina; Ryder, Hilary F; Williamson, Sondra C; Barbeau, Rebecca A; Hamilton, Eileen P; Orias, Eduardo
2006-01-01
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance. PMID:16933976
Lovejoy, David A; Pavlović, Téa
2015-11-01
In humans, the teneurin gene family consists of four highly conserved paralogous genes that are the result of early vertebrate gene duplications arising from a gene introduced into multicellular organisms from a bacterial ancestor. In vertebrates and humans, the teneurins have become integrated into a number of critical physiological systems including several aspects of reproductive physiology. Structurally complex, these genes possess a sequence in their terminal exon that encodes for a bioactive peptide sequence termed the 'teneurin C-terminal associated peptide' (TCAP). The teneurin/TCAP protein forms an intercellular adhesive unit with its receptor, latrophilin, an Adhesion family G-protein coupled receptor. It is present in numerous cell types and has been implicated in gamete migration and gonadal morphology. Moreover, TCAP is highly effective at reducing the corticotropin-releasing factor (CRF) stress response. As a result, TCAP may also play a role in regulating the stress-associated inhibition of reproduction. In addition, the teneurins and TCAP have been implicated in tumorigenesis associated with reproductive tissues. Therefore, the teneurin/TCAP system may offer clinicians a novel biomarker system upon which to diagnose some reproductive pathologies.
Singh, Amarjeet; Baranwal, Vinay; Shankar, Alka; Kanwar, Poonam; Ranjan, Rajeev; Yadav, Sandeep; Pandey, Amita; Kapoor, Sanjay; Pandey, Girdhar K.
2012-01-01
Background: Phospholipase A (PLA) is an important group of enzymes responsible for phospholipid hydrolysis in lipid signaling. PLAs have been implicated in abiotic stress signaling and developmental events in various plant species. Genome-wide analysis of the PLA superfamily has been carried out in the dicot plant Arabidopsis. A comprehensive genome-wide analysis of PLAs has not yet been presented in the crop plant rice. Methodology/Principal Findings: A comprehensive bioinformatics analysis identified a total of 31 PLA-encoding genes in the rice genome, which are divided into three classes: phospholipase A1 (PLA1), patatin-like phospholipases (pPLA) and low molecular weight secretory phospholipase A2 (sPLA2), based on their sequences and phylogeny. A subset of 10 rice PLAs exhibited chromosomal duplication, emphasizing the role of duplication in the expansion of this gene family in rice. Microarray expression profiling revealed a number of PLA members expressed differentially and significantly under abiotic stresses and during reproductive development. Comparative expression analysis with Arabidopsis PLAs revealed a high degree of functional conservation between the orthologs in the two plant species, which also indicated the vital role of PLAs in stress signaling and plant development across different plant species. Moreover, sub-cellular localization of a few candidates suggests their differential localization and functional role in lipid signaling. Conclusion/Significance: The comprehensive analysis and expression profiling would provide a critical platform for the functional characterization of the candidate PLA genes in crop plants. PMID:22363522
Evolutionary origins of a novel host plant detoxification gene in butterflies.
Fischer, Hanna M; Wheat, Christopher W; Heckel, David G; Vogel, Heiko
2008-05-01
Chemical interactions between plants and their insect herbivores provide an excellent opportunity to study the evolution of species interactions on a molecular level. Here, we investigate the molecular evolutionary events that gave rise to a novel detoxifying enzyme (nitrile-specifier protein [NSP]) in the butterfly family Pieridae, previously identified as a coevolutionary key innovation. By generating and sequencing expressed sequence tags, genomic libraries, and screening databases we found NSP to be a member of an insect-specific gene family, which we characterized and named the NSP-like gene family. Members consist of variable tandem repeats, are gut expressed, and are found across Insecta evolving in a dynamic, ongoing birth-death process. In the Lepidoptera, multiple copies of single-domain major allergen genes are present and originate via tandem duplications. Multiple domain genes are found solely within the brassicaceous-feeding Pieridae butterflies, one of them being NSP and another called major allergen (MA). Analyses suggest that NSP and its paralog MA have a unique single-domain evolutionary origin, being formed by intragenic domain duplication followed by tandem whole-gene duplication. Duplicates subsequently experienced a period of relaxed constraint followed by an increase in constraint, perhaps after neofunctionalization. NSP and its ortholog MA are still experiencing high rates of change, reflecting a dynamic evolution consistent with the known role of NSP in plant-insect interactions. Our results provide direct evidence to the hypothesis that gene duplication is one of the driving forces for speciation and adaptation, showing that both within- and whole-gene tandem duplications are a powerful force underlying evolutionary adaptation.
Levels of duplicate gene expression in armoured catfishes.
Dunham, R A; Philipp, D P; Whitt, G S
1980-01-01
Species of armoured catfishes differ significantly in their cellular DNA content and chromosome number. Starch gel electrophoresis of isozymes was used to determine whether each of 16 enzyme loci was expressed in a single or duplicate state. The percent of enzyme loci exhibiting duplicate locus expression in Corydoras aeneus, Corydoras julii, Corydoras melanistius, and Corydoras myersi was 37.5 percent, 18.75 percent, 12.5 percent, and 6.25 percent, respectively. The percentage of loci expressed in duplicate is higher in the species with higher haploid DNA contents, which are 4.4 pg, 3.0 pg, and 2.3 pg, respectively. These differences in DNA contents are also associated with differences in chromosome number. These data are consistent with the hypothesis that increases in DNA contents and enzyme loci occur both by tetraploidization and by regional gene duplication and that these increases are then followed by a partial loss of DNA and a reduction in the number of the duplicate isozyme loci expressed. Such analyses provide insight into the mechanisms of genome amplification and reduction as well as insights into the fates of duplicate genes.
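The percentages in this abstract correspond to whole numbers of loci out of the 16 scored. The short sketch below works through that arithmetic; the helper name duplicated_loci is an illustrative assumption, and only the figures quoted in the abstract are used.

```python
from fractions import Fraction

TOTAL_LOCI = 16  # enzyme loci scored per species, as stated in the abstract

# Percent of loci with duplicate expression, as reported in the abstract.
percent_duplicated = {
    "Corydoras aeneus": 37.5,
    "Corydoras julii": 18.75,
    "Corydoras melanistius": 12.5,
    "Corydoras myersi": 6.25,
}

def duplicated_loci(percent, total=TOTAL_LOCI):
    """Convert a percentage of loci into a whole-number locus count."""
    count = Fraction(str(percent)) * total / 100  # exact arithmetic, no float rounding
    if count.denominator != 1:
        raise ValueError(f"{percent}% of {total} loci is not a whole number")
    return int(count)

loci_counts = {sp: duplicated_loci(p) for sp, p in percent_duplicated.items()}
for species, n in loci_counts.items():
    print(f"{species}: {n} of {TOTAL_LOCI} loci expressed in duplicate")
```

Each reported percentage is a multiple of 6.25 percent, that is, one locus in 16, so the four species differ by only a handful of loci.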
Friedberg, Felix
2009-05-01
In this paper we examine (restricted to Homo sapiens) the products resulting from gene duplication and subsequent alternative splicing for the members of a multidomain group of proteins that possess the evolutionarily conserved calponin homology (CH) domain, i.e. an "actin binding domain", as a singlet and that, in addition, contain the conserved cysteine-rich, double Zn finger-possessing LIM domain, also as a singlet. Seven genes, resulting from gene duplications, were identified that code for seven group members whose pre-mRNAs appear to have undergone multiple alternative splicing: Mical 1, 2 and 3 are located on chromosomes 6q21, 11p15 and 22q11, respectively. The LMO7 gene is present on chromosome 13q22 and the LIMCH1 gene on chromosome 4p13. Micall1 is mapped to chromosome 22q13 and Micall2 to chromosome 7p22. Translated GenBank ESTs suggest the existence of multiple products alternatively spliced from the pre-mRNAs encoded by these genes. Characteristic indicators of such splicing among the proteins derived from one gene must include some extensive common regions that are 100% identical. In some instances only one exon might be partly or completely eliminated. Sometimes alternative splicing is also associated with an increased frequency of creation of an exon or part of an exon from an intron. Not only the coding regions for the body of the protein but also those for its N- or C-terminal ends could be affected by the splicing. If the forms created merely begin at different starting points but remain identical in sequence thereafter, their existence as products of alternative splicing must be questioned. In the splicing events described in this paper, multiple isoforms rather than a single isoform appear as products during gene expression.
Lacerra, Giuseppina; Fiorito, Mirella; Musollino, Gennaro; Di Noce, Francesca; Esposito, Maria; Nigro, Vincenzo; Gaudiano, Carlo; Carestia, Clementina
2004-10-01
The alpha-globin chains are encoded by two duplicated genes (HBA2 and HBA1, 5'-3') showing overall sequence homology >96% and average CG content >60%. alpha-Thalassemia, the most prevalent worldwide autosomal recessive disorder, is a hereditary anemia caused by sequence variations of these genes in about 25% of carriers. We evaluated the overall sensitivity and suitability of DHPLC and DG-DGGE in scanning both the alpha-globin genes by carrying out a retrospective analysis of 19 variant alleles in 29 genotypes. The HBA2 alleles c.1A>G, c.79G>A, and c.281T>G, and the HBA1 allele c.475C>A were new. Three pathogenic sequence variations were associated in cis with nonpathogenic variations in all families studied; they were the HBA2 variation c.2T>C associated with c.-24C>G, and the HBA2 variations c.391G>C and c.427T>C, both associated with c.565G>A. We set up original experimental conditions for DHPLC and DG-DGGE and analyzed 10 normal subjects, 46 heterozygotes, seven homozygotes, seven compound heterozygotes, and six compound heterozygotes for a hybrid gene. Both the methodologies gave reproducible results and no false-positive was detected. DHPLC showed 100% sensitivity and DG-DGGE nearly 90%. About 100% of the sequence from the cap site to the polyA addition site could be scanned by DHPLC, about 87% by DG-DGGE. It is noteworthy that the three most common pathogenic sequence variations (HBA2 alleles c.2T>C, c.95+2_95+6del, and c.523A>G) were unambiguously detected by both the methodologies. Genotype diagnosis must be confirmed with PCR sequencing of single amplicons or with an allele-specific method. This study can be helpful for scanning genes with high CG content and offers a model suitable for duplicated genes with high homology. Copyright 2004 Wiley-Liss, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roa, B.B.; Warner, L.E.; Lupski, J.R.
1994-09-01
The MPZ gene that maps to chromosome 1q22q23 encodes myelin protein zero, which is the most abundant peripheral nerve myelin protein that functions as a homophilic adhesion molecule in myelin compaction. Association of the MPZ gene with the dysmyelinating peripheral neuropathies Charcot-Marie-Tooth disease type 1B (CMT1B) and the more severe Dejerine-Sottas syndrome (DSS) was previously demonstrated by MPZ mutations identified in CMT1B and in rare DSS patients. In this study, the coding region of the MPZ gene was screened for mutations in a cohort of 74 unrelated patients with either CMT type 1 or DSS who do not carry the most common CMT1-associated molecular lesion of a 1.5 Mb DNA duplication on 17p11.2-p12. Heteroduplex analysis detected base mismatches in ten patients that were distributed over three exons of MPZ. Direct sequencing of PCR-amplified genomic DNA identified a de novo MPZ mutation associated with CMT1B that predicts an Ile(135)Thr substitution. This finding further confirms the role of MPZ in the CMT1B disease process. In addition, two polymorphisms were identified within the Gly(200) and Ser(228) codons that do not alter the respective amino acid residues. A fourth base mismatch in MPZ exon 3 detected by heteroduplex analysis is currently being characterized by direct sequence determination. Previously, four unrelated patients in this same cohort were found to have unique point mutations in the coding region of the PMP22 gene. The collective findings on CMT1 point mutations could suggest that regulatory region mutations, and possibly mutations in CMT gene(s) apart from the MPZ, PMP22 and Cx32 genes identified thus far, may prove to be significant for a number of CMT1 cases that do not involve DNA duplication.
Bright, Lydia J.; Gout, Jean-Francois; Lynch, Michael
2017-01-01
New gene functions arise within existing gene families as a result of gene duplication and subsequent diversification. To gain insight into the steps that led to the functional diversification of paralogues, we tracked duplicate retention patterns, expression-level divergence, and subcellular markers of functional diversification in the Rab GTPase gene family in three Paramecium aurelia species. After whole-genome duplication, Rab GTPase duplicates are more highly retained than other genes in the genome but appear to be diverging more rapidly in expression levels, consistent with early steps in functional diversification. However, by localizing specific Rab proteins in Paramecium cells, we found that paralogues from the two most recent whole-genome duplications had virtually identical localization patterns, and that less closely related paralogues showed evidence of both conservation and diversification. The functionally conserved paralogues appear to target to compartments associated with both endocytic and phagocytic recycling functions, confirming evolutionary and functional links between the two pathways in a divergent eukaryotic lineage. Because the functionally diversifying paralogues are still closely related to and derived from a clade of functionally conserved Rab11 genes, we were able to pinpoint three specific amino acid residues that may be driving the change in the localization and thus the function in these proteins. PMID:28251922
Law, Sheran Hiu Wan; Redelings, Benjamin David; Kullman, Seth William
2012-01-15
The availability of multiple teleost (bony fish) genomes is providing unprecedented opportunities to understand the diversity and function of gene duplication events using comparative genomics. Here we examine multiple paralogous genes of γ-glutamyl transferase (GGT) in several distantly related teleost species including medaka, stickleback, green spotted pufferfish, fugu, and zebrafish. Through mining genome databases, we have identified multiple GGT orthologs. Duplicate (paralogous) GGT sequences for GGT1 (GGT1 a and b), GGTL1 (GGTL1 a and b), and GGTL3 (GGTL3 a and b) were identified for each species. Phylogenetic analysis suggests that GGTs are ancient proteins conserved across most metazoan phyla and those paralogous GGTs in teleosts likely arose from the serial 3R genome duplication events. A third GGTL1 gene (GGTL1c) was found in green spotted pufferfish; however, this gene is not present in medaka, stickleback, or fugu. Similarly, one or both paralogs of GGTL3 appear to have been lost in green spotted pufferfish, fugu, and zebrafish. Syntenic relationships were highly maintained between duplicated teleost chromosomes, among teleosts and across ray-finned (Actinopterygii) and lobe-finned (Sarcopterygii) species. To assess subfunction partitioning, six medaka GGT genes were cloned and assessed for developmental and tissue-specific expression. On the basis of these data, we propose a modification of the "duplication-degeneration-complementation" model of subfunction partitioning where quantitative differences rather than absolute differences in gene expression are observed between gene paralogs. Our results demonstrate that multiple GGT genes have been retained within teleost genomes. Questions remain, however, regarding the functional roles of multiple GGTs in these species. Copyright © 2011 Wiley Periodicals, Inc., A Wiley Company.
Zheng, Deyou
2008-01-01
Background: Sequencing and annotation of several mammalian genomes have revealed that segmental duplications are a common architectural feature of primate genomes; in fact, about 5% of the human genome is composed of large blocks of interspersed segmental duplications. These segmental duplications have been implicated in genomic copy-number variation, gene novelty, and various genomic disorders. However, the molecular processes involved in the evolution and regulation of duplicated sequences remain largely unexplored. Results: In this study, the profile of about 20 histone modifications within human segmental duplications was characterized using high-resolution, genome-wide data derived from a ChIP-Seq study. The analysis demonstrates that derivative loci of segmental duplications often differ significantly from the original with respect to many histone methylations. Further investigation showed that genes are present three times more frequently in the original than in the derivative, whereas pseudogenes exhibit the opposite trend. These asymmetries tend to increase with the age of segmental duplications. The uneven distribution of genes and pseudogenes does not, however, fully account for the asymmetry in the profile of histone modifications. Conclusion: The first systematic analysis of histone modifications between segmental duplications demonstrates that two seemingly 'identical' genomic copies are distinct in their epigenomic properties. Results here suggest that local chromatin environments may be implicated in the discrimination of derived copies of segmental duplications from their originals, leading to a biased pseudogenization of the new duplicates. The data also indicate that further exploration of the interactions between histone modification and sequence degeneration is necessary in order to understand the divergence of duplicated sequences. PMID:18598352
The evolution of Dscam genes across the arthropods.
Armitage, Sophie A O; Freiburg, Rebecca Y; Kurtz, Joachim; Bravo, Ignacio G
2012-04-13
One way of creating phenotypic diversity is through alternative splicing of precursor mRNAs. A gene that has evolved a hypervariable form is Down syndrome cell adhesion molecule (Dscam-hv), which in Drosophila melanogaster can produce thousands of isoforms via mutually exclusive alternative splicing. The extracellular region of this protein is encoded by three variable exon clusters, each containing multiple exon variants. The protein is vital for neuronal wiring where the extreme variability at the somatic level is required for axonal guidance, and it plays a role in immunity where the variability has been hypothesised to relate to recognition of different antigens. Dscam-hv has been found across the Pancrustacea. Additionally, three paralogous non-hypervariable Dscam-like genes have also been described for D. melanogaster. Here we took a bioinformatics approach, building profile Hidden Markov Models to search across species for putative orthologs to the Dscam genes and for hypervariable alternatively spliced exons, and inferring the phylogenetic relationships among them. Our aims were to examine whether Dscam orthologs exist outside the Bilateria, whether the origin of Dscam-hv could lie outside the Pancrustacea, when the Dscam-like orthologs arose, how many alternatively spliced exons of each exon cluster were present in the most recent common ancestor, and how these clusters evolved. Our results suggest that the origin of Dscam genes may lie after the split between the Cnidaria and the Bilateria and support the hypothesis that Dscam-hv originated in the common ancestor of the Pancrustacea. Our phylogeny of Dscam gene family members shows six well-supported clades: five containing Dscam-like genes and one containing all the Dscam-hv genes; a seventh clade contains arachnid putative Dscam genes. Furthermore, the exon clusters appear to have experienced different evolutionary histories.
Dscam genes have undergone independent duplication events in the insects and in an arachnid genome, which adds to the more well-known tandem duplications that have taken place within Dscam-hv genes. Therefore, two forms of gene expansion seem to be active within this gene family. The evolutionary history of this dynamic gene family will be further unfolded as genomes of species from more disparate groups become available.
Xu, Jia Meng; Fan, Wei; Jin, Jian Feng; Lou, He Qiang; Chen, Wei Wei; Yang, Jian Li; Zheng, Shao Jian
2017-01-01
Relying on Al-activated root oxalate secretion, and internal detoxification and accumulation of Al, buckwheat is highly Al resistant. However, the molecular mechanisms responsible for these processes are still poorly understood. It is well known that the root apex is the critical region of Al toxicity, which rapidly impairs a series of events and thus results in inhibition of root elongation. Here, we carried out transcriptome analysis of the buckwheat root apex (0–1 cm) with regard to the early response (first 6 h) to Al stress (20 μM), which is crucial for identification of both genes and processes involved in Al toxicity and tolerance mechanisms. We obtained 34,469 unigenes with 26,664 unigenes annotated in the NCBI database, and identified 589 up-regulated and 255 down-regulated differentially expressed genes (DEGs) under Al stress. Functional category analysis revealed that biological processes differ between up- and down-regulated genes, although ‘metabolic processes’ were the most affected category in both up- and down-regulated DEGs. Based on the data, it is proposed that Al stress affects a variety of biological processes that collectively contribute to the inhibition of root elongation. We identified 30 transporter genes and 27 transcription factor (TF) genes induced by Al. Gene homology analysis highlighted candidate genes encoding transporters associated with Al uptake, transport, detoxification, and accumulation. We also found that TFs play a critical role in transcriptional regulation of Al resistance genes in buckwheat. In addition, gene duplication events are very common in the buckwheat genome, suggesting a possible role for gene duplication in the species’ high Al resistance. Taken together, the transcriptomic analysis of the buckwheat root apex sheds light on the processes that contribute to the inhibition of root elongation.
Furthermore, the comprehensive analysis of both transporter genes and TF genes not only deepens our understanding of the responses of buckwheat roots to Al toxicity but also provides a good starting point for functional characterization of genes critical for Al tolerance. PMID:28702047
The evolution of Dscam genes across the arthropods
2012-01-01
Background One way of creating phenotypic diversity is through alternative splicing of precursor mRNAs. A gene that has evolved a hypervariable form is Down syndrome cell adhesion molecule (Dscam-hv), which in Drosophila melanogaster can produce thousands of isoforms via mutually exclusive alternative splicing. The extracellular region of this protein is encoded by three variable exon clusters, each containing multiple exon variants. The protein is vital for neuronal wiring where the extreme variability at the somatic level is required for axonal guidance, and it plays a role in immunity where the variability has been hypothesised to relate to recognition of different antigens. Dscam-hv has been found across the Pancrustacea. Additionally, three paralogous non-hypervariable Dscam-like genes have also been described for D. melanogaster. Here we took a bioinformatics approach, building profile Hidden Markov Models to search across species for putative orthologs to the Dscam genes and for hypervariable alternatively spliced exons, and inferring the phylogenetic relationships among them. Our aims were to examine whether Dscam orthologs exist outside the Bilateria, whether the origin of Dscam-hv could lie outside the Pancrustacea, when the Dscam-like orthologs arose, how many alternatively spliced exons of each exon cluster were present in the most recent common ancestor, and how these clusters evolved. Results Our results suggest that the origin of Dscam genes may lie after the split between the Cnidaria and the Bilateria and support the hypothesis that Dscam-hv originated in the common ancestor of the Pancrustacea. Our phylogeny of Dscam gene family members shows six well-supported clades: five containing Dscam-like genes and one containing all the Dscam-hv genes; a seventh clade contains arachnid putative Dscam genes. Furthermore, the exon clusters appear to have experienced different evolutionary histories. 
Conclusions Dscam genes have undergone independent duplication events in the insects and in an arachnid genome, which adds to the more well-known tandem duplications that have taken place within Dscam-hv genes. Therefore, two forms of gene expansion seem to be active within this gene family. The evolutionary history of this dynamic gene family will be further unfolded as genomes of species from more disparate groups become available. PMID:22500922
Clayton-Smith, Jill; Walters, Sarah; Hobson, Emma; Burkitt-Wright, Emma; Smith, Rupert; Toutain, Annick; Amiel, Jeanne; Lyonnet, Stanislas; Mansour, Sahar; Fitzpatrick, David; Ciccone, Roberto; Ricca, Ivana; Zuffardi, Orsetta; Donnai, Dian
2009-01-01
Xq28 duplications encompassing MECP2 have been described in male patients with a severe neurodevelopmental disorder associated with hypotonia and spasticity, severe learning disability and recurrent pneumonia. We identified an Xq28 duplication in three families where several male patients had presented with intestinal pseudo-obstruction or bladder distension. The affected boys had similar dysmorphic facial appearances. Subsequently, we ascertained seven further families where the proband presented with similar features. We demonstrated duplications of the Xq28 region in five of these additional families. In addition to MECP2, these duplications encompassed several other genes already known to be associated with diseases including SLC6A8, L1CAM and Filamin A (FLNA). The two remaining families were shown to have intragenic duplications of FLNA only. We discuss which elements of the Xq28 duplication phenotype may be associated with the various genes in the duplication. We propose that duplication of FLNA may contribute to the bowel and bladder phenotype seen in these seven families. PMID:18854860
Human-Specific Duplication and Mosaic Transcripts: The Recent Paralogous Structure of Chromosome 22
Bailey, Jeffrey A.; Yavor, Amy M.; Viggiano, Luigi; Misceo, Doriana; Horvath, Juliann E.; Archidiacono, Nicoletta; Schwartz, Stuart; Rocchi, Mariano; Eichler, Evan E.
2002-01-01
In recent decades, comparative chromosomal banding, chromosome painting, and gene-order studies have shown strong conservation of gross chromosome structure and gene order in mammals. However, findings from the human genome sequence suggest an unprecedented degree of recent (<35 million years ago) segmental duplication. This dynamism of segmental duplications has important implications in disease and evolution. Here we present a chromosome-wide view of the structure and evolution of the most highly homologous duplications (⩾1 kb and ⩾90%) on chromosome 22. Overall, 10.8% (3.7/33.8 Mb) of chromosome 22 is duplicated, with an average sequence identity of 95.4%. To organize the duplications into tractable units, intron-exon structure and well-defined duplication boundaries were used to define 78 duplicated modules (minimally shared evolutionary segments) with 157 copies on chromosome 22. Analysis of these modules provides evidence for the creation or modification of 11 novel transcripts. Comparative FISH analyses of human, chimpanzee, gorilla, orangutan, and macaque reveal qualitative and quantitative differences in the distribution of these duplications—consistent with their recent origin. Several duplications appear to be human specific, including a ∼400-kb duplication (99.4%–99.8% sequence identity) that transposed from chromosome 14 to the most proximal pericentromeric region of chromosome 22. Experimental and in silico data further support a pericentromeric gradient of duplications where the most recent duplications transpose adjacent to the centromere. Taken together, these data suggest that segmental duplications have been an ongoing process of primate genome evolution, contributing to recent gene innovation and the dynamic transformation of genome architecture within and among closely related species. PMID:11731936
2013-01-01
Background Mitochondrial genomic (mitogenomic) reorganizations are rarely found in closely-related animals, yet drastic reorganizations have been found in the Ranoides frogs. The phylogenetic relationships of the three major ranoid taxa (Natatanura, Microhylidae, and Afrobatrachia) have been problematic, and mitogenomic information for afrobatrachians has not been available. Several molecular models for mitochondrial (mt) gene rearrangements have been proposed, but observational evidence has been insufficient to evaluate them. Furthermore, evolutionary trends in rearranged mt genes have not been well understood. To gain molecular and phylogenetic insights into these issues, we analyzed the mt genomes of four afrobatrachian species (Breviceps adspersus, Hemisus marmoratus, Hyperolius marmoratus, and Trichobatrachus robustus) and performed molecular phylogenetic analyses. Furthermore we searched for two evolutionary patterns expected in the rearranged mt genes of ranoids. Results Extensively reorganized mt genomes having many duplicated and rearranged genes were found in three of the four afrobatrachians analyzed. In fact, Breviceps has the largest known mt genome among vertebrates. Although the kinds of duplicated and rearranged genes differed among these species, a remarkable gene rearrangement pattern of non-tandemly copied genes situated within tandemly-copied regions was commonly found. Furthermore, the existence of concerted evolution was observed between non-neighboring copies of triplicated 12S and 16S ribosomal RNA regions. Conclusions Phylogenetic analyses based on mitogenomic data support a close relationship between Afrobatrachia and Microhylidae, with their estimated divergence 100 million years ago consistent with present-day endemism of afrobatrachians on the African continent. 
The afrobatrachian mt data supported the first tandem and second non-tandem duplication model for mt gene rearrangements and the recombination-based model for concerted evolution of duplicated mt regions. We also showed that specific nucleotide substitution and compositional patterns expected in duplicated and rearranged mt genes did not occur, suggesting no disadvantage in employing these genes for phylogenetic inference. PMID:24053406
Evolutionary Diversification of Insect Innexins
Hughes, Austin L.
2014-01-01
Phylogenetic analysis of insect innexins supported the hypothesis that six major clades of insect innexins arose by gene duplication prior to the origin of the endopterygote insects. Within one of the six clades (the Zpg Clade), two independent gene duplication events were inferred to have occurred in the lineage of Drosophila, after the most recent common ancestor of the dipteran families Culicidae and Drosophilidae. The relationships among these clades were poorly resolved, except for a sister relationship between ShakB and Ogre. Gene expression data from FlyAtlas supported the hypothesis that the latter gene duplication events gave rise to functional differentiation, with Zpg showing a high level of expression in ovary, and Inx5 and Inx6 showing a high level of expression in testis. Because unduplicated members of this clade in Bombyx mori and Anopheles gambiae showed high levels of expression in both ovary and testis, the expression patterns of the Drosophila members of this clade provide evidence of subdivision of an ancestral gene function after gene duplication. PMID:25502029
Whole-Genome Duplication and the Functional Diversification of Teleost Fish Hemoglobins
Opazo, Juan C.; Butts, G. Tyler; Nery, Mariana F.; Storz, Jay F.; Hoffmann, Federico G.
2013-01-01
Subsequent to the two rounds of whole-genome duplication that occurred in the common ancestor of vertebrates, a third genome duplication occurred in the stem lineage of teleost fishes. This teleost-specific genome duplication (TGD) is thought to have provided genetic raw materials for the physiological, morphological, and behavioral diversification of this highly speciose group. The extreme physiological versatility of teleost fish is manifest in their diversity of blood–gas transport traits, which reflects the myriad solutions that have evolved to maintain tissue O2 delivery in the face of changing metabolic demands and environmental O2 availability during different ontogenetic stages. During the course of development, regulatory changes in blood–O2 transport are mediated by the expression of multiple, functionally distinct hemoglobin (Hb) isoforms that meet the particular O2-transport challenges encountered by the developing embryo or fetus (in viviparous or oviparous species) and in free-swimming larvae and adults. The main objective of the present study was to assess the relative contributions of whole-genome duplication, large-scale segmental duplication, and small-scale gene duplication in producing the extraordinary functional diversity of teleost Hbs. To accomplish this, we integrated phylogenetic reconstructions with analyses of conserved synteny to characterize the genomic organization and evolutionary history of the globin gene clusters of teleosts. These results were then integrated with available experimental data on functional properties and developmental patterns of stage-specific gene expression. Our results indicate that multiple α- and β-globin genes were present in the common ancestor of gars (order Lepisoteiformes) and teleosts. 
The comparative genomic analysis revealed that teleosts possess a dual set of TGD-derived globin gene clusters, each of which has undergone lineage-specific changes in gene content via repeated duplication and deletion events. Phylogenetic reconstructions revealed that paralogous genes convergently evolved similar functional properties in different teleost lineages. Consistent with other recent studies of globin gene family evolution in vertebrates, our results revealed evidence for repeated evolutionary transitions in the developmental regulation of Hb synthesis. PMID:22949522
Chromosome I duplications in Caenorhabditis elegans
DOE Office of Scientific and Technical Information (OSTI.GOV)
McKim, K.S.; Rose, A.M.
1990-01-01
We have isolated and characterized 76 duplications of chromosome I in the genome of Caenorhabditis elegans. The region studied is the 20 map unit left half of the chromosome. Sixty-two duplications were induced with gamma radiation and 14 arose spontaneously. The latter class was apparently the result of spontaneous breaks within the parental duplication. The majority of duplications behave as if they are free. Three duplications are attached to identifiable sequences from other chromosomes. The duplication breakpoints have been mapped by complementation analysis relative to genes on chromosome I. Nineteen duplication breakpoints and seven deficiency breakpoints divide the left half of the chromosome into 24 regions. We have studied the relationship between duplication size and segregational stability. While size is an important determinant of mitotic stability, it is not the only one. We observed clear exceptions to a size-stability correlation. In addition to size, duplication stability may be influenced by specific sequences or chromosome structure. The majority of the duplications were stable enough to be powerful tools for gene mapping. Therefore the duplications described here will be useful in the genetic characterization of chromosome I and the techniques we have developed can be adapted to other regions of the genome.
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
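The figure "97.4% (38 pairs)" in the abstract above can be checked with a short calculation. This is only an illustrative sketch: it assumes each of the 38 pairs contributes two distinct genes and that no gene is counted in more than one pair, which the abstract does not state explicitly. The variable names are hypothetical.

```python
# Values quoted in the abstract.
total_genes = 78        # putative GmDof genes identified
duplicate_pairs = 38    # duplicate paralogous pairs reported

# Assumption: every pair is made of two distinct genes, with no gene shared
# between pairs, so the pairs account for 2 * 38 genes.
genes_in_pairs = duplicate_pairs * 2
share = genes_in_pairs / total_genes * 100

print(f"{genes_in_pairs} of {total_genes} genes = {share:.1f}%")
```

Under that assumption, the 38 pairs cover 76 of the 78 genes, which matches the percentage reported in the abstract.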
Genomic analysis reveals extensive gene duplication within the bovine TRB locus
Connelley, Timothy; Aerts, Jan; Law, Andy; Morrison, W Ivan
2009-01-01
Background Diverse TR and IG repertoires are generated by V(D)J somatic recombination. Genomic studies have been pivotal in cataloguing the V, D, J and C genes present in the various TR/IG loci and describing how duplication events have expanded the number of these genes. Such studies have also provided insights into the evolution of these loci and the complex mechanisms that regulate TR/IG expression. In this study we analyze the sequence of the third bovine genome assembly to characterize the germline repertoire of bovine TRB genes and compare the organization, evolution and regulatory structure of the bovine TRB locus with that of humans and mice. Results The TRB locus in the third bovine genome assembly is distributed over 5 scaffolds, extending to ~730 Kb. The available sequence contains 134 TRBV genes, assigned to 24 subgroups, and 3 clusters of DJC genes, each comprising a single TRBD gene, 5–7 TRBJ genes and a single TRBC gene. Seventy-nine of the TRBV genes are predicted to be functional. Comparison with the human and murine TRB loci shows that the gene order, as well as the sequences of non-coding elements that regulate TRB expression, are highly conserved in the bovine. Dot-plot analyses demonstrate that expansion of the genomic TRBV repertoire has occurred via a complex and extensive series of duplications, predominantly involving DNA blocks containing multiple genes. These duplication events have resulted in massive expansion of several TRBV subgroups, most notably TRBV6, 9 and 21, which contain 40, 35 and 16 members respectively. Similarly, duplication has led to the generation of a third DJC cluster. Analysis of cDNA data confirms the diversity of the TRBV genes and, in addition, identifies a substantial number of TRBV genes, predominantly from the larger subgroups, which are still absent from the genome assembly. 
The observed gene duplication within the bovine TRB locus has created a repertoire of phylogenetically diverse functional TRBV genes, which is substantially larger than that described for humans and mice. Conclusion The analyses completed in this study reveal that, although the gene content and organization of the bovine TRB locus are broadly similar to that of humans and mice, multiple duplication events have led to a marked expansion in the number of TRB genes. Similar expansions in other ruminant TR loci suggest strong evolutionary pressures in this lineage have selected for the development of enlarged sets of TR genes that can contribute to diverse TR repertoires. PMID:19393068
Yerramsetty, Pradeep; Stata, Matt; Siford, Rebecca; Sage, Tammy L; Sage, Rowan F; Wong, Gane Ka-Shu; Albert, Victor A; Berry, James O
2016-06-29
RLSB, an S-1 domain RNA binding protein of Arabidopsis, selectively binds rbcL mRNA and co-localizes with Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) within chloroplasts of C3 and C4 plants. Previous studies using both Arabidopsis (C3) and maize (C4) suggest RLSB homologs are post-transcriptional regulators of plastid-encoded rbcL mRNA. While RLSB accumulates in all Arabidopsis leaf chlorenchyma cells, in C4 leaves RLSB-like proteins accumulate only within Rubisco-containing bundle sheath chloroplasts of Kranz-type species, and only within central compartment chloroplasts in the single cell C4 plant Bienertia. Our recent evidence implicates this mRNA binding protein as a primary determinant of rbcL expression, cellular localization/compartmentalization, and photosynthetic function in all multicellular green plants. This study addresses the hypothesis that RLSB is a highly conserved Rubisco regulatory factor that occurs in the chloroplasts of all higher plants. Phylogenetic analysis has identified RLSB orthologs and paralogs in all major plant groups, from ancient liverworts to recent angiosperms. RLSB homologs were also identified in algae of the division Charophyta, a lineage closely related to land plants. RLSB-like sequences were not identified in any other algae, suggesting that it may be specific to the evolutionary line leading to land plants. The RLSB family occurs in single copy across most angiosperms, although a few species with two copies were identified, seemingly randomly distributed throughout the various taxa, although perhaps correlating in some cases with known ancient whole genome duplications. Monocots of the order Poales (Poaceae and Cyperaceae) were found to contain two copies, designated here as RLSB-a and RLSB-b, with only RLSB-a implicated in the regulation of rbcL across the maize developmental gradient. 
Analysis of microsynteny in angiosperms revealed high levels of conservation across eudicot species and for both paralogs in grasses, highlighting the possible importance of maintaining this gene and its surrounding genomic regions. Findings presented here indicate that the RLSB family originated as a unique gene in land plant evolution, perhaps in the common ancestor of charophytes and higher plants. Purifying selection has maintained this as a highly conserved single- or two-copy gene across most extant species, with several conserved gene duplications. Together with previous findings, this study suggests that RLSB has been sustained as an important regulatory protein throughout the course of land plant evolution. While only RLSB-a has been directly implicated in rbcL regulation in maize, RLSB-b could have an overlapping function in the co-regulation of rbcL, or may have diverged as a regulator of one or more other plastid-encoded mRNAs. This analysis confirms that RLSB is an important and unique photosynthetic regulatory protein that has been continuously expressed in land plants as they emerged and diversified from their ancient common ancestor.
Wang, Peipei; Li, Jing; Gao, Xiaoyang; Zhang, Di; Li, Anlin; Liu, Changning
2018-05-29
Physic nut (Jatropha curcas L.) is a species of flowering plant with great potential for biofuel production and as an emerging model organism for functional genomic analysis, particularly in the Euphorbiaceae family. DNA binding with one finger (Dof) transcription factors play critical roles in numerous biological processes in plants. Nevertheless, knowledge about the members of the Dof gene family in physic nut, and about their evolutionary and functional characteristics, is insufficient. Therefore, we performed a genome-wide screening and characterization of the Dof gene family within the physic nut draft genome. In total, 24 JcDof genes (encoding 33 JcDof proteins) were identified. All the JcDof genes were divided into three major groups based on phylogenetic inference, which was further validated by the subsequent gene structure and motif analysis. Genome comparison revealed that segmental duplication may have played crucial roles in the expansion of the JcDof gene family, and gene expansion was mainly subjected to positive selection. The expression profile demonstrated the broad involvement of JcDof genes in response to various abiotic stresses, hormonal treatments and functional divergence. This study provides valuable information for better understanding the evolution of JcDof genes, and lays a foundation for future functional exploration of JcDof genes.
Dynamic evolution of the GnRH receptor gene family in vertebrates.
Williams, Barry L; Akazome, Yasuhisa; Oka, Yoshitaka; Eisthen, Heather L
2014-10-25
Elucidating the mechanisms underlying coevolution of ligands and receptors is an important challenge in molecular evolutionary biology. Peptide hormones and their receptors are excellent models for such efforts, given the relative ease of examining evolutionary changes in genes encoding both molecules. Most vertebrates possess multiple genes for both the decapeptide gonadotropin releasing hormone (GnRH) and for the GnRH receptor. The evolutionary history of the receptor family, including ancestral copy number and timing of duplications and deletions, has been the subject of controversy. We report here for the first time sequences of three distinct GnRH receptor genes in salamanders (axolotls, Ambystoma mexicanum), which are orthologous to three GnRH receptors from ranid frogs. To understand the origin of these genes within the larger evolutionary context of the gene family, we performed phylogenetic analyses and probabilistic protein homology searches of GnRH receptor genes in vertebrates and their near relatives. Our analyses revealed four points that alter previous views about the evolution of the GnRH receptor gene family. First, the "mammalian" pituitary type GnRH receptor, which is the sole GnRH receptor in humans and previously presumed to be highly derived because it lacks the cytoplasmic C-terminal domain typical of most G-protein coupled receptors, is actually an ancient gene that originated in the common ancestor of jawed vertebrates (Gnathostomata). Second, unlike previous studies, we classify vertebrate GnRH receptors into five subfamilies. Third, the order of subfamily origins is the inverse of previously proposed models. Fourth, the number of GnRH receptor genes has been dynamic in vertebrates and their ancestors, with multiple duplications and losses. Our results provide a novel evolutionary framework for generating hypotheses concerning the functional importance of structural characteristics of vertebrate GnRH receptors. 
We show that five subfamilies of vertebrate GnRH receptors evolved early in the vertebrate phylogeny, followed by several independent instances of gene loss. Chief among cases of gene loss are humans, best described as degenerate with respect to GnRH receptors because we retain only a single, ancient gene.
Felip, Alicia; Zanuy, Silvia; Pineda, Rafael; Pinilla, Leonor; Carrillo, Manuel; Tena-Sempere, Manuel; Gómez, Ana
2009-11-27
Kisspeptins, the products of KiSS-1 gene, have recently emerged as fundamental regulators of reproductive function in different mammalian and, presumably, non-mammalian species. To date, a single form of KiSS-1 has been described in mammals, and recently, in several fish species and Xenopus. We report herein the cloning and characterization of two distinct KiSS-like genes, namely, KiSS-1 and KiSS-2, in the teleost sea bass. While KiSS-1 encodes a peptide identical to rodent kisspeptin-10, the predicted KiSS-2 decapeptide diverges at 4 amino acids (FNFNPFGLRF). Genome database searches showed that both genes are present in non-placental vertebrate genomes. Indeed, phylogenetic and genome mapping analyses suggest that KiSS-1 and KiSS-2 are paralogous genes that originated by duplication of an ancestral gene, although KiSS-2 is lost in placental mammals. KiSS-1 and KiSS-2 mRNAs are present in brain and gonads of sea bass, medaka and zebrafish. Comparative functional studies demonstrated that KiSS-2 decapeptide was significantly more potent than KiSS-1 peptide in inducing LH and FSH secretion in sea bass. In contrast, KiSS-2 decapeptide only weakly elicited LH secretion in rats, whereas KiSS-1 peptide was maximally effective. Our data are the first to provide conclusive evidence for the existence of a second KiSS gene, KiSS-2, in non-placental vertebrates, whose product is likely to play a dominant stimulatory role in the regulation of the gonadotropic axis at least in teleosts.
Li, Lingyun; Li, Qingbo; Rohlin, Lars; Kim, UnMi; Salmon, Kirsty; Rejtar, Tomas; Gunsalus, Robert P.; Karger, Barry L.; Ferry, James G.
2008-01-01
Summary Methanosarcina acetivorans strain C2A is an acetate- and methanol-utilizing methane-producing organism for which the genome, the largest yet sequenced among the Archaea, reveals extensive physiological diversity. LC linear ion trap-FTICR mass spectrometry was employed to analyze acetate- vs. methanol-grown cells metabolically labeled with 14N vs. 15N, respectively, to obtain quantitative protein abundance ratios. DNA microarray analysis of acetate- vs. methanol-grown cells was also performed to determine gene expression ratios. The combined approaches were highly complementary, extending the physiological understanding of growth and methanogenesis. Of the 1081 proteins detected, 255 were ≥ 3-fold differentially abundant. DNA microarray analysis revealed 410 genes that were ≥ 2.5-fold differentially expressed of 1972 genes with detected expression. The ratios of differentially abundant proteins were in good agreement with expression ratios of the encoding genes. Taken together, the results suggest several novel roles for electron transport components specific to acetate-grown cells, including two flavodoxins each specific for growth on acetate or methanol. Protein abundance ratios indicated that duplicate CO dehydrogenase/acetyl-CoA complexes function in the conversion of acetate to methane. Surprisingly, the protein abundance and gene expression ratios indicated a general stress response in acetate- vs. methanol-grown cells that included enzymes specific for polyphosphate accumulation and oxidative stress. The microarray analysis identified transcripts of several genes encoding regulatory proteins with identity to the PhoU, MarR, GlnK, and TetR families commonly found in the Bacteria domain. An analysis of neighboring genes suggested roles in controlling phosphate metabolism (PhoU), ammonia assimilation (GlnK), and molybdopterin cofactor biosynthesis (TetR). 
Finally, the proteomic and microarray results suggested roles for two-component regulatory systems specific for each growth substrate. PMID:17269732
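The fold-change cutoffs quoted in the abstract above (≥ 3-fold for protein abundance, ≥ 2.5-fold for gene expression) can be illustrated with a minimal sketch. It assumes the cutoffs apply symmetrically, so that a ratio of 0.25 counts as 4-fold lower; the abstract does not spell this out. The function name and the ratios are hypothetical and are not data from the study.

```python
import math

def is_differential(ratio: float, fold: float) -> bool:
    """Return True when `ratio` differs from 1 by at least `fold` in either direction.

    Assumption: the cutoff is symmetric on a log scale, so up- and
    down-regulation are treated alike.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return abs(math.log2(ratio)) >= math.log2(fold)

# Hypothetical acetate/methanol abundance ratios (illustrative only).
protein_ratios = {"protA": 4.2, "protB": 0.25, "protC": 1.8}

# Apply the >= 3-fold protein cutoff quoted in the abstract.
hits = sorted(name for name, r in protein_ratios.items() if is_differential(r, 3.0))
print(hits)
```

Working on the log2 scale keeps the test symmetric around a ratio of 1, which is a common convention for ratio data but is an assumption here rather than the authors' stated procedure.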
Origin and Evolution of the Sodium-Pumping NADH:Ubiquinone Oxidoreductase
Reyes-Prieto, Adrian; Barquera, Blanca; Juárez, Oscar
2014-01-01
The sodium-pumping NADH:ubiquinone oxidoreductase (Na+-NQR) is the main ion pump and the primary entry site for electrons into the respiratory chain of many different types of pathogenic bacteria. This enzymatic complex creates a transmembrane gradient of sodium that is used by the cell to sustain ionic homeostasis, nutrient transport, ATP synthesis, flagellum rotation and other essential processes. Comparative genomics data demonstrate that the nqr operon, which encodes all Na+-NQR subunits, is found in a large variety of bacterial lineages with different habitats and metabolic strategies. Here we studied the distribution, origin and evolution of this enzymatic complex. The molecular phylogenetic analyses and the organizations of the nqr operon indicate that Na+-NQR evolved within the Chlorobi/Bacteroidetes group, after the duplication and subsequent neofunctionalization of the operon that encodes the homolog RNF complex. Subsequently, the nqr operon dispersed through multiple horizontal transfer events to other bacterial lineages such as Chlamydiae, Planctomyces and α-, β-, γ- and δ-proteobacteria. Considering the biochemical properties of the Na+-NQR complex and its physiological role in different bacteria, we propose a detailed scenario to explain the molecular mechanisms that gave rise to its novel redox-dependent sodium-pumping activity. Our model postulates that the evolution of the Na+-NQR complex involved a functional divergence from its RNF homolog, following the duplication of the rnf operon, the loss of the rnfB gene and the recruitment of the reductase subunit of an aromatic monooxygenase. PMID:24809444
Elahi, Elahe; Shafaghati, Yousef; Asadi, Sareh; Absalan, Farnaz; Goodarzi, Hani; Gharaii, Nava; Karimi-Nejad, Mohammad Hassan; Shahram, Farhad; Hughes, Anne E
2007-01-01
Familial expansile osteolysis (FEO) is a rare disorder causing bone dysplasia. The clinical features of FEO include early-onset hearing loss, tooth destruction, and progressive lytic expansion within limb bones causing pain, fracture, and deformity. An 18-bp duplication in the first exon of the TNFRSF11A gene encoding RANK has been previously identified in four FEO pedigrees. Despite having the identical mutation, phenotypic variations among affected individuals of the same and different pedigrees were noted. Another 18-bp duplication, one base proximal to the duplication previously reported, was subsequently found in two unrelated FEO patients. Finally, mutations overlapping with the mutations found in the FEO pedigrees have been found in ESH and early-onset PDB pedigrees. An Iranian FEO pedigree that contains six affected individuals dispersed in three generations has previously been introduced; here, the clinical features of the proband are reported in greater detail, and the genetic defect of the pedigree is presented. Direct sequencing of the entire coding region and upstream and downstream noncoding regions of TNFRSF11A in her DNA revealed the same 18-bp duplication mutation as previously found in the four FEO pedigrees. Additionally, eight sequence variations as compared to the TNFRSF11A reference sequence were identified, and a haplotype linked to the mutation based on these variations was defined. Although the mutation in the Iranian and four of the previously described FEO pedigrees was the same, haplotypes based on the intragenic SNPs suggest that the mutations do not share a common descent.
Towers, Rebecca J.; Fagan, Peter K.; Talay, Susanne R.; Currie, Bart J.; Sriprakash, Kadaba S.; Walker, Mark J.; Chhatwal, Gursharan S.
2003-01-01
Streptococcal fibronectin-binding protein is an important virulence factor involved in colonization and invasion of epithelial cells and tissues by Streptococcus pyogenes. In order to investigate the mechanisms involved in the evolution of sfbI, the sfbI genes from 54 strains were sequenced. Thirty-four distinct alleles were identified. Three principal mechanisms appear to have been involved in the evolution of sfbI. The amino-terminal aromatic amino acid-rich domain is the most variable region and is apparently generated by intergenic recombination of horizontally acquired DNA cassettes, resulting in a genetic mosaic in this region. Two distinct and divergent sequence types that shared only 61 to 70% identity were identified in the central proline-rich region, while variation at the 3′ end of the gene is due to deletion or duplication of defined repeat units. Potential antigenic and functional variabilities in SfbI imply significant selective pressure in vivo with direct implications for the microbial pathogenesis of S. pyogenes. PMID:14662917
Plastid genome sequence of an ornamental and edible fruit tree of Rosaceae, Prunus mume.
Wang, Shuo; Gao, Cheng-Wen; Gao, Li-Zhi
2016-11-01
Here we assembled and analyzed the complete chloroplast genome of Prunus mume, a popular ornamental and edible fruit tree of Rosaceae. The cp genome is a circular DNA molecule of 157 712 bp with a typical quadripartite structure consisting of two inverted repeat regions (IRa and IRb) of 26 394 bp each, separated by large (LSC) and small (SSC) single-copy regions of 85 861 and 19 063 bp, respectively. It encoded 112 unique genes, 19 of which were duplicated in the IR regions, giving a total of 131 genes. Eighteen of these genes harbored one or two introns. GC content was 38.9%, and coding regions accounted for 51.3% of the genome. Phylogenetic analysis showed that P. mume clustered with P. persica and P. kansuensis in the genus Prunus. This newly determined chloroplast genome will enhance modern breeding programs for the purpose of genetic improvement of this valuable plant.
Hofberger, Johannes A.; Lyons, Eric; Edger, Patrick P.; Chris Pires, J.; Eric Schranz, M.
2013-01-01
Plants share a common history of successive whole-genome duplication (WGD) events retaining genomic patterns of duplicate gene copies (ohnologs) organized in conserved syntenic blocks. Duplication was often proposed to affect the origin of novel traits during evolution. However, genetic evidence linking WGD to pathway diversification is scarce. We show that WGD and tandem duplication (TD) accelerated genetic versatility of plant secondary metabolism, exemplified with the glucosinolate (GS) pathway in the mustard family. GS biosynthesis is a well-studied trait, employing at least 52 biosynthetic and regulatory genes in the model plant Arabidopsis. In a phylogenomics approach, we identified 67 GS loci in Aethionema arabicum of the tribe Aethionemae, sister group to all mustard family members. All but one of the Arabidopsis GS gene families evolved orthologs in Aethionema and all but one of the orthologous sequence pairs exhibit synteny. The 45% fraction of duplicates among all protein-coding genes in Arabidopsis was increased to 95% and 97% for Arabidopsis and Aethionema GS pathway inventory, respectively. Compared with the 22% average for all protein-coding genes in Arabidopsis, 52% and 56% of Aethionema and Arabidopsis GS loci align to ohnolog copies dating back to the last common WGD event. Although 15% of all Arabidopsis genes are organized in tandem arrays, 45% and 48% of GS loci in Arabidopsis and Aethionema descend from TD, respectively. We describe a sequential combination of TD and WGD events driving gene family extension, thereby expanding the evolutionary playground for functional diversification and thus potential novelty and success. PMID:24171911
Jiang, Hua; Liu, Sha; Zhang, Yong-Ling; Wan, Jun-Hui; Li, Ru; Li, Dong-Zhi
2015-01-01
We describe a new case of a β-thalassemia (β-thal) heterozygote with the mutation IVS-II-654 (C>T) presenting with a transfusion-dependent phenotype. Multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (CGH) analyses of the α-globin gene cluster revealed a full duplication of the α-globin genes including the upstream regulatory element. The duplicated allele and the normal allele in trans resulted in a total of six active α-globin genes. The severe clinical phenotype seemed to be related to the considerable excess of α-globin combined with the β-globin deficit caused by the presence of the β-thal. α-Globin cluster duplication should be considered in patients heterozygous for β-thal who show a more severe phenotype than β-thal trait.
Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes.
Studer, Romain A; Penel, Simon; Duret, Laurent; Robinson-Rechavi, Marc
2008-09-01
A stringent branch-site codon model was used to detect positive selection in vertebrate evolution. We show that the test is robust to the large evolutionary distances involved. Positive selection was detected in 77% of 884 genes studied. Most positive selection concerns a few sites on a single branch of the phylogenetic tree: between 0.9% and 4.7% of sites are affected by positive selection, depending on the branch. No functional category was overrepresented among genes under positive selection. Surprisingly, whole genome duplication had no effect on the prevalence of positive selection, whether the fish-specific genome duplication or the two rounds at the origin of vertebrates. Thus, positive selection has not been limited to a few gene classes, or to specific evolutionary events such as duplication, but has been pervasive during vertebrate evolution.