Studying the genetic basis of speciation in high gene flow marine invertebrates
2016-01-01
A growing number of genes responsible for reproductive incompatibilities between species (barrier loci) exhibit the signals of positive selection. However, the possibility that genes experiencing positive selection diverge early in speciation and commonly cause reproductive incompatibilities has not been systematically investigated on a genome-wide scale. Here, I outline a research program for studying the genetic basis of speciation in broadcast spawning marine invertebrates that uses a priori genome-wide information on a large, unbiased sample of genes tested for positive selection. A targeted sequence capture approach is proposed that scores single-nucleotide polymorphisms (SNPs) in widely separated species populations at an early stage of allopatric divergence. The targeted capture of both coding and non-coding sequences enables SNPs to be characterized at known locations across the genome and at genes with known selective or neutral histories. The neutral coding and non-coding SNPs provide robust background distributions for identifying FST-outliers within genes that can, in principle, identify specific mutations experiencing diversifying selection. If natural hybridization occurs between species, the neutral coding and non-coding SNPs can provide a neutral admixture model for genomic clines analyses aimed at finding genes exhibiting strong blocks to introgression. Strongylocentrotid sea urchins are used as a model system to outline the approach but it can be used for any group that has a complete reference genome available. PMID:29491951
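The FST-outlier idea in the abstract above can be illustrated with a minimal sketch. It assumes the basic two-population definition of FST, (H_T - H_S) / H_T, with no sample-size correction, and an empirical quantile of the neutral SNPs as the cutoff. The function names, the 0.99 default quantile, and the gene labels are hypothetical and are not taken from the paper.

```python
import math

def fst_two_pops(p1, p2):
    """Wright's FST for one biallelic SNP, from the allele frequencies
    p1 and p2 in two equally weighted populations: (H_T - H_S) / H_T."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                       # pooled expected heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population heterozygosity
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

def neutral_threshold(neutral_fst, quantile=0.99):
    """Empirical quantile of FST values from SNPs assumed to be neutral."""
    ranked = sorted(neutral_fst)
    idx = min(len(ranked) - 1, max(0, math.ceil(quantile * len(ranked)) - 1))
    return ranked[idx]

def fst_outliers(candidate_fst, neutral_fst, quantile=0.99):
    """Names of candidate SNPs whose FST exceeds the neutral cutoff."""
    cutoff = neutral_threshold(neutral_fst, quantile)
    return sorted(name for name, fst in candidate_fst.items() if fst > cutoff)
```

In practice the neutral background would come from the captured neutral coding and non-coding SNPs, and a sample-size-corrected estimator would normally replace the plain definition used here.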
Sexual selection drives evolution and rapid turnover of male gene expression.
Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Dean, Rebecca; Montgomery, Stephen H; Pointer, Marie A; Mank, Judith E
2015-04-07
The profound and pervasive differences in gene expression observed between males and females, and the unique evolutionary properties of these genes in many species, have led to the widespread assumption that they are the product of sexual selection and sexual conflict. However, we still lack a clear understanding of the connection between sexual selection and transcriptional dimorphism, often termed sex-biased gene expression. Moreover, the relative contribution of sexual selection vs. drift in shaping broad patterns of expression, divergence, and polymorphism remains unknown. To assess the role of sexual selection in shaping these patterns, we assembled transcriptomes from an avian clade representing the full range of sexual dimorphism and sexual selection. We use these species to test the links between sexual selection and sex-biased gene expression evolution in a comparative framework. Through ancestral reconstruction of sex bias, we demonstrate a rapid turnover of sex bias across this clade driven by sexual selection and show it to be primarily the result of expression changes in males. We use phylogenetically controlled comparative methods to demonstrate that phenotypic measures of sexual selection predict the proportion of male-biased but not female-biased gene expression. Although male-biased genes show elevated rates of coding sequence evolution, consistent with previous reports in a range of taxa, there is no association between sexual selection and rates of coding sequence evolution, suggesting that expression changes may be more important than coding sequence in sexual selection. Taken together, our results highlight the power of sexual selection to act on gene expression differences and shape genome evolution.
Liu, Yangyang; Han, Xiao; Yuan, Junting; Geng, Tuoyu; Chen, Shihao; Hu, Xuming; Cui, Isabelle H; Cui, Hengmi
2017-04-07
The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds
Dean, Rebecca; Harrison, Peter W.; Wright, Alison E.; Zimmer, Fabian; Mank, Judith E.
2015-01-01
The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. PMID:26067773
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.
Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H
2017-12-20
Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
Deschamps, Matthieu; Laval, Guillaume; Fagny, Maud; Itan, Yuval; Abel, Laurent; Casanova, Jean-Laurent; Patin, Etienne; Quintana-Murci, Lluis
2016-01-01
Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000–13,000 years, the period at which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change—containing variation acquired from archaic hominins or adaptive variants in specific populations—improving our understanding of the relative biological importance of innate immunity pathways in natural conditions. PMID:26748513
Evidence of translation efficiency adaptation of the coding regions of the bacteriophage lambda.
Goz, Eli; Mioduser, Oriah; Diament, Alon; Tuller, Tamir
2017-08-01
Deciphering the way gene expression regulatory aspects are encoded in viral genomes is a challenging mission with ramifications related to all biomedical disciplines. Here, we aimed to understand how evolution shapes the bacteriophage lambda genes by performing a high resolution analysis of ribosomal profiling data and gene expression related synonymous/silent information encoded in bacteriophage coding regions. We demonstrated evidence of selection for distinct compositions of synonymous codons in early and late viral genes related to the adaptation of translation efficiency to different bacteriophage developmental stages. Specifically, we showed that evolution of viral coding regions is driven, among others, by selection for codons with higher decoding rates; during the initial/progressive stages of infection the decoding rates in early/late genes were found to be superior to those in late/early genes, respectively. Moreover, we argued that selection for translation efficiency could be partially explained by adaptation to Escherichia coli tRNA pool and the fact that it can change during the bacteriophage life cycle. An analysis of additional aspects related to the expression of viral genes, such as mRNA folding and more complex/longer regulatory signals in the coding regions, is also reported. The reported conclusions are likely to be relevant also to additional viruses. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds.
Dean, Rebecca; Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Mank, Judith E
2015-10-01
The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Development of genetically engineered bacteria for production of selected aromatic compounds
Ward, Thomas E.; Watkins, Carolyn S.; Bulmer, Deborah K.; Johnson, Bruce F.; Amaratunga, Mohan
2001-01-01
The cloning and expression of genes in the common aromatic pathway of E. coli are described. A compound for which chorismate, the final product of the common aromatic pathway, is an anabolic intermediate can be produced by cloning and expressing selected genes of the common aromatic pathway and the genes coding for enzymes necessary to convert chorismate to the selected compound. Plasmids carrying selected genes of the common aromatic pathway are also described.
Dos Reis, Mario
2015-04-01
First principles of population genetics are used to obtain formulae relating the non-synonymous to synonymous substitution rate ratio to the selection coefficients acting at codon sites in protein-coding genes. Two theoretical cases are discussed and two examples from real data (a chloroplast gene and a virus polymerase) are given. The formulae give much insight into the dynamics of non-synonymous substitutions and may inform the development of methods to detect adaptive evolution. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
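As a point of reference for the kind of relation this abstract describes, the sketch below uses the classical single-coefficient case, which is not necessarily the formulae derived in the paper. It assumes synonymous mutations are neutral and every non-synonymous mutation shares one population-scaled selection coefficient S, so the ratio of fixation rates is S / (1 - exp(-S)). The function name and the scaling convention for S (for example 2Ns or 4Ns) are assumptions.

```python
import math

def omega_from_scaled_s(S):
    """dN/dS expected when every non-synonymous mutation has the same
    population-scaled selection coefficient S and synonymous mutations
    are neutral: the fixation-rate ratio S / (1 - exp(-S))."""
    if abs(S) < 1e-8:
        return 1.0                  # neutral limit as S approaches 0
    return S / -math.expm1(-S)      # -expm1(-S) equals 1 - exp(-S)
```

Under these assumptions, S > 0 gives a ratio above 1, S < 0 gives a ratio below 1, and the ratio approaches 1 as S approaches 0.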
Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.
Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M
2010-12-15
Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.
Dimitrieva, Slavica; Anisimova, Maria
2014-01-01
In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore not to be subject to natural selection. Yet increasingly, cases of non-neutral evolution at certain synonymous sites have been reported over the last decade. To evaluate the extent and the nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) and identified gene properties that make SRV more likely in a large database of protein-coding gene families and protein domains. To our knowledge, this is the first study that explores the determinants and patterns of the SRV in real data. We show that the SRV is widespread in the evolution of protein-coding sequences, putting in doubt the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, the SRV appears to play an important role in optimizing the domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection, but are less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, those coding for proteins with a larger number of interactions or forming a larger number of structures, located in intracellular components and those involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented for metabolic pathways and those associated with several genetic diseases, particularly cancers and diabetes.
SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate
Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart
2016-01-01
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...
A large-scale study of the random variability of a coding sequence: a study on the CFTR gene.
Modiano, Guido; Bombieri, Cristina; Ciminelli, Bianca Maria; Belpinati, Francesca; Giorgi, Silvia; Georges, Marie des; Scotet, Virginie; Pompei, Fiorenza; Ciccacci, Cinzia; Guittard, Caroline; Audrézet, Marie Pierre; Begnini, Angela; Toepfer, Michael; Macek, Milan; Ferec, Claude; Claustres, Mireille; Pignatti, Pier Franco
2005-02-01
Coding single nucleotide substitutions (cSNSs) have been studied on hundreds of genes using small samples (n(g) approximately 100-150 genes). In the present investigation, a large random European population sample (average n(g) approximately 1500) was studied for a single gene, the CFTR (Cystic Fibrosis Transmembrane conductance Regulator). The nonsynonymous (NS) substitutions exhibited, in accordance with previous reports, a mean probability of being polymorphic (q > 0.005), much lower than that of the synonymous (S) substitutions, but they showed a similar rate of subpolymorphic (q < 0.005) variability. This indicates that, in autosomal genes that may have harmful recessive alleles (nonduplicated genes with important functions), genetic drift overwhelms selection in the subpolymorphic range of variability, making disadvantageous alleles behave as neutral. These results imply that the majority of the subpolymorphic nonsynonymous alleles of these genes are selectively negative or even pathogenic.
Gene encoding plant asparagine synthetase
Coruzzi, Gloria M.; Tsai, Fong-Ying
1993-10-26
The identification and cloning of the gene(s) for plant asparagine synthetase (AS), an important enzyme involved in the formation of asparagine, a major nitrogen transport compound of higher plants, is described. Expression vectors constructed with the AS coding sequence may be utilized to produce plant AS; to engineer herbicide resistant plants, salt/drought tolerant plants or pathogen resistant plants; as a dominant selectable marker; or to select for novel herbicides or compounds useful as agents that synchronize plant cells in culture. The promoter for plant AS, which directs high levels of gene expression and is induced in an organ specific manner and by darkness, is also described. The AS promoter may be used to direct the expression of heterologous coding sequences in appropriate hosts.
Nishida, I; Sugiura, M; Enju, A; Nakamura, M
2000-12-01
A new isogene for acyl-(acyl-carrier-protein):glycerol-3-phosphate acyltransferase (GPAT; EC 2.3.1.15) in squash has been cloned and the gene product was identified as oleate-selective GPAT. Using PCR primers that could hybridise with exons for a previously cloned squash GPAT, we obtained two PCR products of different size: one coded for a previously cloned squash GPAT corresponding to non-selective isoforms AT2 and AT3, and the other for a new isozyme, probably the oleate-selective isoform AT1. Full-length amino acid sequences of respective isozymes were deduced from the nucleotide sequences of genomic genes and cDNAs, which were cloned by a series of PCR-based methods. Thus, we designated the new gene CmATS1;1 and the other one CmATS1;2. Genome blot analysis revealed that the squash genome contained the two isogenes at non-allelic loci. AT1-active fractions were partially purified, and three polypeptide bands were identified as being AT1 polypeptides, which exhibited relative molecular masses of 39.5-40.5 kDa, pI values of 6.75-7.15, and oleate selectivity over palmitate. Partial amino-terminal sequences obtained from two of these bands verified that the new isogene codes for AT1 polypeptides.
A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.
Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor
2017-08-30
Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.
A Granular Self-Organizing Map for Clustering and Gene Selection in Microarray Data.
Ray, Shubhra Sankar; Ganivada, Avatharam; Pal, Sankar K
2016-09-01
A new granular self-organizing map (GSOM) is developed by integrating the concept of a fuzzy rough set with the SOM. While training the GSOM, the weights of a winning neuron and the neighborhood neurons are updated through a modified learning procedure. The neighborhood is newly defined using the fuzzy rough sets. The clusters (granules) evolved by the GSOM are presented to a decision table as its decision classes. Based on the decision table, a method of gene selection is developed. The effectiveness of the GSOM is shown in both clustering samples and developing an unsupervised fuzzy rough feature selection (UFRFS) method for gene selection in microarray data. While the superior results of the GSOM, as compared with the related clustering methods, are provided in terms of β-index, DB-index, Dunn-index, and fuzzy rough entropy, the genes selected by the UFRFS are not only better in terms of classification accuracy and a feature evaluation index, but also statistically more significant than the related unsupervised methods. The C-codes of the GSOM and UFRFS are available online at http://avatharamg.webs.com/software-code.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Shiyu; Kaeppler, Shawn M.; Vogel, Kenneth P.; ...
2016-11-28
Switchgrass is undergoing development as a dedicated cellulosic bioenergy crop. Fermentation of lignocellulosic biomass to ethanol in a bioenergy system or to volatile fatty acids in a livestock production system is strongly and negatively influenced by lignification of cell walls. This study detects specific loci that exhibit selection signatures across switchgrass breeding populations that differ in in vitro dry matter digestibility (IVDMD), ethanol yield, and lignin concentration. Allele frequency changes in candidate genes were used to detect loci under selection. Out of the 183 polymorphisms identified in the four candidate genes, twenty-five loci in the intron regions and four loci in coding regions were found to display a selection signature. All loci in the coding regions are synonymous substitutions. Selection in both directions was observed on polymorphisms that appeared to be under selection. Genetic diversity and linkage disequilibrium within the candidate genes were low. The recurrent divergent selection caused excessive moderate allele frequencies in the cycle 3 reduced lignin population as compared to the base population. As a result, this study provides valuable insight on genetic changes occurring in short-term selection in the polyploid populations, and discovered potential markers for breeding switchgrass with improved biomass quality.
Gutiérrez, Verónica; Rego, Natalia; Naya, Hugo; García, Graciela
2015-10-28
Among teleosts, the South American genus Austrolebias (Cyprinodontiformes: Rivulidae) includes 42 taxa of annual fishes divided into five different species groups. It is a monophyletic genus, but morphological and molecular data do not resolve the relationship among intrageneric clades and high rates of substitution have been previously described in some mitochondrial genes. In this work, the complete mitogenome of a species of the genus was determined for the first time. We determined its structure, gene order and peculiar evolutionary features, which will allow us to evaluate the performance of mitochondrial genes in the phylogenetic resolution at different taxonomic levels. Regarding gene content and order, the circular mitogenome of A. charrua (17,271 bp) presents the typical pattern of vertebrate mitogenomes. It contains the full complement of 13 protein-coding genes, 22 tRNA, 2 rRNA and one non-coding control region. Notably, the tRNA-Cys was only 57 bp in length and lacks the D-loop arm. In three full sibling individuals, a heteroplasmic condition was detected due to a total of 12 variable sites in seven protein-coding genes. Among cyprinodontiforms, the mitogenome of A. charrua exhibits the lowest G+C content (37 %) and GC skew, as well as the highest strand asymmetry with a net difference of T over A at 1st and 3rd codon positions. Considering the 12 coding genes of the H strand, correspondence analyses of nucleotide composition and codon usage show that A and T at 1st and 3rd codon positions have the highest weight in the first axis, and segregate annual species from the other cyprinodontiforms analyzed. Given the annual life-style, their mitogenomes could be under different selective pressures. All 13 protein-coding genes are under strong purifying selection and we did not find any significant evidence of nucleotide sites showing episodic selection (dN > dS) in annual lineages.
When fast evolving third codon positions were removed from alignments, the "supergene" tree recovers our reference species phylogeny as well as the Cytb, ND4L and ND6 genes. Therefore, third codon positions seem to be saturated in the aforementioned coding regions in intergeneric comparisons among Cyprinodontiformes. The complete mitogenome obtained in the present work offers relevant data for further comparative studies on molecular phylogeny and systematics of this taxonomically controversial endemic genus of annual fishes.
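The composition statistics quoted in this abstract (G+C content, GC skew, and strand asymmetry between A and T) follow standard definitions, sketched below for a single strand. The function name and the returned keys are hypothetical, and the toy sequence in the usage note is not mitogenome data.

```python
from collections import Counter

def base_composition(seq):
    """G+C content, GC skew (G - C)/(G + C) and AT skew (A - T)/(A + T)
    for one strand; characters other than A, C, G, T are ignored."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc, at = g + c, a + t
    return {
        "gc_content": gc / total,
        "gc_skew": (g - c) / gc if gc else 0.0,
        "at_skew": (a - t) / at if at else 0.0,   # negative when T exceeds A
    }
```

For example, `base_composition("GGGCAT")` gives a G+C content of 4/6, a GC skew of 0.5, and an AT skew of 0.0. Restricting the input to 1st or 3rd codon positions would give per-position values like those discussed in the abstract.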
Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc
2017-01-01
Abstract The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. PMID:28981708
Intact coding region of the serotonin transporter gene in obsessive-compulsive disorder
DOE Office of Scientific and Technical Information (OSTI.GOV)
Altemus, M.; Murphy, D.L.; Greenberg, B.
1996-07-26
Epidemiologic studies indicate that obsessive-compulsive disorder is genetically transmitted in some families, although no genetic abnormalities have been identified in individuals with this disorder. The selective response of obsessive-compulsive disorder to treatment with agents which block serotonin reuptake suggests the gene coding for the serotonin transporter as a candidate gene. The primary structure of the serotonin-transporter coding region was sequenced in 22 patients with obsessive-compulsive disorder, using direct PCR sequencing of cDNA synthesized from platelet serotonin-transporter mRNA. No variations in amino acid sequence were found among the obsessive-compulsive disorder patients or healthy controls. These results do not support a role for alteration in the primary structure of the coding region of the serotonin-transporter gene in the pathogenesis of obsessive-compulsive disorder. 27 refs.
Reranking candidate gene models with cross-species comparison for improved gene prediction
Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S
2008-01-01
Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
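The reranking idea described above can be sketched as a post-processing step that adds comparative evidence to the original gene-finder score. This is only an illustration under stated assumptions: the feature names, the linear scoring rule and the weights are hypothetical and are not the scoring rules used by the authors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    base_score: float           # score assigned by the original gene finder
    cds_similarity: float       # 0..1, similarity to a putative ortholog's CDS
    splice_agreement: float     # 0..1, fraction of splice sites shared with the ortholog
    signal_peptide_match: bool  # signal peptide present/absent as in the ortholog


# Hypothetical weights, chosen only for this example.
WEIGHTS = {"cds": 2.0, "splice": 1.0, "sigp": 0.5}


def rerank_score(c: Candidate) -> float:
    """Combine the gene-finder score with cross-species evidence."""
    return (c.base_score
            + WEIGHTS["cds"] * c.cds_similarity
            + WEIGHTS["splice"] * c.splice_agreement
            + WEIGHTS["sigp"] * float(c.signal_peptide_match))


def rerank(candidates: list[Candidate]) -> list[Candidate]:
    """Return the K candidate models ordered from best to worst."""
    return sorted(candidates, key=rerank_score, reverse=True)


if __name__ == "__main__":
    models = [
        Candidate("model_1", -10.0, 0.4, 0.5, False),
        Candidate("model_2", -10.5, 0.9, 1.0, True),
    ]
    print([m.name for m in rerank(models)])
```

In this toy case the model with the lower original score is promoted because it agrees better with the ortholog, which is the behavior the abstract describes.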
Dubey, Bhawna; Meganathan, P R; Haque, Ikramul
2012-07-01
This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17258 bp, comprising 37 genes (13 protein-coding genes, 22 tRNA genes and 2 ribosomal RNA genes) along with duplicate control regions, is described herein. The P. molurus molurus mt genome is relatively similar to other snake mt genomes with respect to gene arrangement, composition, tRNA structures and AT/GC skews. The nucleotide composition of the genome shows more A and C than T and G on the positive strand, as revealed by positive AT and CG skews. Comparison of individual protein-coding genes with those of other snake genomes suggests that the ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference for NNC codons over NNG codons in the mt genome of P. molurus. Also, the ratios of non-synonymous to synonymous substitution rates (ka/ks) suggest that most of the protein-coding genes are under purifying selection. The phylogenetic analyses involving the concatenated 13 protein-coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
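The skew and codon-ending statistics referred to above can be computed along these lines. This is a minimal sketch; the helper names and the toy sequence are assumptions, and the sign convention (AT skew = (A - T)/(A + T), CG skew = (C - G)/(C + G)) is the usual one rather than something stated in the abstract.

```python
from collections import Counter


def skew(seq: str, x: str, y: str) -> float:
    """Generic base skew, (x - y) / (x + y), computed from base counts."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0


def codon_ending_counts(cds: str) -> Counter:
    """Count codons by their third base (NNA, NNC, NNG, NNT) in an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i + 2] for i in range(0, usable, 3))


if __name__ == "__main__":
    toy_cds = "ATGACCAACCTAGCC"  # hypothetical sequence, for illustration only
    print("AT skew:", skew(toy_cds, "A", "T"))
    print("CG skew:", skew(toy_cds, "C", "G"))
    print("codon endings:", dict(codon_ending_counts(toy_cds)))
```

A positive AT skew together with a positive CG skew corresponds to the "more A and C than T and G" pattern reported for the positive strand, and a larger NNC than NNG count corresponds to the reported codon preference.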
Convergent evolution of marine mammals is associated with distinct substitutions in common genes
Zhou, Xuming; Seim, Inge; Gladyshev, Vadim N.
2015-01-01
Phenotypic convergence is thought to be driven by parallel substitutions coupled with natural selection at the sequence level. Multiple independent evolutionary transitions of mammals to an aquatic environment offer an opportunity to test this thesis. Here, whole genome alignment of coding sequences identified widespread parallel amino acid substitutions in marine mammals; however, the majority of these changes were not unique to these animals. Conversely, we report that candidate aquatic adaptation genes, identified by signatures of likelihood convergence and/or elevated ratio of nonsynonymous to synonymous nucleotide substitution rate, are characterized by very few parallel substitutions and exhibit distinct sequence changes in each group. Moreover, no significant positive correlation was found between likelihood convergence and positive selection in all three marine lineages. These results suggest that convergence in protein coding genes associated with aquatic lifestyle is mainly characterized by independent substitutions and relaxed negative selection. PMID:26549748
Biomimetic Artificial Epigenetic Code for Targeted Acetylation of Histones.
Taniguchi, Junichi; Feng, Yihong; Pandian, Ganesh N; Hashiya, Fumitaka; Hidaka, Takuya; Hashiya, Kaori; Park, Soyoung; Bando, Toshikazu; Ito, Shinji; Sugiyama, Hiroshi
2018-06-13
While the central role of locus-specific acetylation of histone proteins in eukaryotic gene expression is well established, the availability of designer tools to regulate acetylation at particular nucleosome sites remains limited. Here, we develop a unique strategy to introduce acetylation by constructing a bifunctional molecule designated Bi-PIP. Bi-PIP has a P300/CBP-selective bromodomain inhibitor (Bi) as a P300/CBP recruiter and a pyrrole-imidazole polyamide (PIP) as a sequence-selective DNA binder. Biochemical assays verified that Bi-PIPs recruit P300 to the nucleosomes having their target DNA sequences and extensively accelerate acetylation. Bi-PIPs also activated transcription of genes that have corresponding cognate DNA sequences inside living cells. Our results demonstrate that Bi-PIPs could act as a synthetic programmable histone code of acetylation, which emulates the bromodomain-mediated natural propagation system of histone acetylation to activate gene expression in a sequence-selective manner.
Bidard, Frédérique; Imbeaud, Sandrine; Reymond, Nancie; Lespinet, Olivier; Silar, Philippe; Clavé, Corinne; Delacroix, Hervé; Berteaux-Lecellier, Véronique; Debuchy, Robert
2010-06-18
The development of new microarray technologies makes custom long oligonucleotide arrays affordable for many experimental applications, notably gene expression analyses. Reliable results depend on the quality of probe design and selection. The probe design strategy should cope with the limited accuracy of de novo gene prediction programs and with annotation updates. We present a novel in silico procedure which addresses these issues and includes experimental screening, as an empirical approach is the best strategy to identify optimal probes among the in silico candidates. We used four criteria for in silico probe selection: cross-hybridization, hairpin stability, probe location relative to the coding sequence end, and intron position. The latter criterion is critical when exon-intron gene structure predictions for intron-rich genes are inaccurate. For each coding sequence (CDS), we selected a subset of four probes. These probes were included in a test microarray, which was used to evaluate the hybridization behavior of each probe. The best probe for each CDS was selected according to three experimental criteria: signal-to-noise ratio, signal reproducibility, and representative signal intensities. This procedure was applied to the development of a gene expression Agilent platform for the filamentous fungus Podospora anserina and the selection of a single 60-mer probe for each of the 10,556 P. anserina CDS. A reliable gene expression microarray version based on the Agilent 44K platform was developed with four spot replicates of each probe to increase the statistical significance of the analysis.
Chakraborty, Supriyo; Uddin, Arif; Mazumder, Tarikul Huda; Choudhury, Monisha Nath; Malakar, Arup Kumar; Paul, Prosenjit; Halder, Binata; Deka, Himangshu; Mazumder, Gulshana Akthar; Barbhuiya, Riazul Ahmed; Barbhuiya, Masuk Ahmed; Devi, Warepam Jesmi
2017-12-02
The study of codon usage coupled with phylogenetic analysis is an important tool to understand the genetic and evolutionary relationship of a gene. The 13 protein coding genes of human mitochondria are involved in electron transport chain for the generation of energy currency (ATP). However, no work has yet been reported on the codon usage of the mitochondrial protein coding genes across six continents. To understand the patterns of codon usage in mitochondrial genes across six different continents, we used bioinformatic analyses to analyze the protein coding genes. The codon usage bias was low as revealed from high ENC value. Correlation between codon usage and GC3 suggested that all the codons ending with G/C were positively correlated with GC3 but vice versa for A/T ending codons with the exception of ND4L and ND5 genes. Neutrality plot revealed that for the genes ATP6, COI, COIII, CYB, ND4 and ND4L, natural selection might have played a major role while mutation pressure might have played a dominant role in the codon usage bias of ATP8, COII, ND1, ND2, ND3, ND5 and ND6 genes. Phylogenetic analysis indicated that evolutionary relationships in each of 13 protein coding genes of human mitochondria were different across six continents and further suggested that geographical distance was an important factor for the origin and evolution of 13 protein coding genes of human mitochondria. Copyright © 2017 Elsevier B.V. and Mitochondria Research Society. All rights reserved.
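The neutrality plot mentioned above regresses the mean G+C content at first and second codon positions (GC12) on the G+C content at third positions (GC3) across genes. A slope near 1 is commonly read as mutation pressure dominating, and a slope near 0 as selection dominating. The sketch below is a generic illustration with hypothetical helper names and toy sequences, not the authors' analysis.

```python
def gc_at_position(cds: str, pos: int) -> float:
    """G+C fraction at codon position pos (1, 2 or 3) of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    bases = cds[pos - 1:usable:3]
    return sum(b in "GC" for b in bases) / len(bases) if bases else 0.0


def neutrality_slope(genes: list[str]) -> float:
    """Least-squares slope of GC12 (y) against GC3 (x) over a set of genes."""
    xs = [gc_at_position(g, 3) for g in genes]
    ys = [(gc_at_position(g, 1) + gc_at_position(g, 2)) / 2 for g in genes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("GC3 does not vary across genes; slope is undefined")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


if __name__ == "__main__":
    # Toy sequences (hypothetical) in which GC12 tracks GC3 exactly.
    toy_genes = ["GGGAAA", "GGGGGGAAA", "GGGAAAAAA"]
    print(neutrality_slope(toy_genes))
```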
Janova, Eva; Matiasovic, Jan; Vahala, Jiri; Vodicka, Roman; Van Dyk, Enette; Horin, Petr
2009-07-01
The major histocompatibility complex genes coding for antigen binding and presenting molecules are the most polymorphic genes in the vertebrate genome. We studied the DRA and DQA gene polymorphism of the family Equidae. In addition to 11 previously reported DRA and 24 DQA alleles, six new DRA sequences and 13 new DQA alleles were identified in the genus Equus. Phylogenetic analysis of both DRA and DQA sequences provided evidence for trans-species polymorphism in the family Equidae. The phylogenetic trees differed from species relationships defined by standard taxonomy of Equidae and from trees based on mitochondrial or neutral gene sequence data. Analysis of selection showed differences between the less variable DRA and more variable DQA genes. DRA alleles were more often shared by more species. The DQA sequences analysed showed strong amongst-species positive selection; the selected amino acid positions mostly corresponded to selected positions in rodent and human DQA genes.
JCoDA: a tool for detecting evolutionary selection.
Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir
2010-05-27
The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda.
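The sliding-window analysis of a codon-delimited alignment described above can be illustrated with a small helper that cuts a pairwise codon alignment into overlapping windows of whole codons. This is a sketch under assumptions: it is not JCoDA's code, the function name is hypothetical, and each window would still have to be passed to an external dN/dS estimator such as PAML's yn00.

```python
def codon_windows(aln_a: str, aln_b: str, size: int, step: int = 1):
    """Split two aligned coding sequences into windows of `size` codons.

    Returns a list of (start_codon_index, window_a, window_b) tuples; window
    boundaries always fall between codons, so the reading frame is preserved.
    """
    if len(aln_a) != len(aln_b) or len(aln_a) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    n_codons = len(aln_a) // 3
    windows = []
    for start in range(0, n_codons - size + 1, step):
        lo, hi = 3 * start, 3 * (start + size)
        windows.append((start, aln_a[lo:hi], aln_b[lo:hi]))
    return windows


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences).
    for start, wa, wb in codon_windows("ATGGCCAAATTT", "ATGGCTAAATTC", size=2):
        print(start, wa, wb)
```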
Gritz, L; Davies, J
1983-11-01
The plasmid-borne gene hph coding for hygromycin B phosphotransferase (HPH) in Escherichia coli has been identified and its nucleotide sequence determined. The hph gene is 1026 nucleotides long, coding for a protein with a predicted Mr of 39 000. The hph gene was placed in a shuttle plasmid vector, downstream from the promoter region of the cyc 1 gene of Saccharomyces cerevisiae, and an hph construction containing a single AUG in the 5' noncoding region allowed direct selection following transformation in yeast and in E. coli. Thus the hph gene can be used in cloning vectors for both pro- and eukaryotes.
Two Perspectives on the Origin of the Standard Genetic Code
NASA Astrophysics Data System (ADS)
Sengupta, Supratim; Aggarwal, Neha; Bandhu, Ashutosh Vishwa
2014-12-01
The origin of a genetic code made it possible to create ordered sequences of amino acids. In this article we provide two perspectives on code origin by carrying out simulations of code-sequence coevolution in finite populations with the aim of examining how the standard genetic code may have evolved from more primitive code(s) encoding a small number of amino acids. We determine the efficacy of the physico-chemical hypothesis of code origin in the absence and presence of horizontal gene transfer (HGT) by allowing a diverse collection of code-sequence sets to compete with each other. We find that in the absence of horizontal gene transfer, natural selection between competing codes distinguished by differences in the degree of physico-chemical optimization is unable to explain the structure of the standard genetic code. However, for certain probabilities of the horizontal transfer events, a universal code emerges having a structure that is consistent with the standard genetic code.
Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping
2013-12-27
With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution in which a particular mutation is favored, causing the allele frequency to shift continuously in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaptation to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, PSP (Positive Selection analysis for Prokaryotic genomes), for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at multiple levels through assignment and enrichment of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes that play key roles in human infection and antibiotic resistance show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to perform genome-scale analysis of evolutionary selection across multiple prokaryotic genomes rapidly and easily, and to identify the genes undergoing positive selection, which may play key roles in host-pathogen interactions and/or environmental adaptation.
Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes.
Studer, Romain A; Penel, Simon; Duret, Laurent; Robinson-Rechavi, Marc
2008-09-01
A stringent branch-site codon model was used to detect positive selection in vertebrate evolution. We show that the test is robust to the large evolutionary distances involved. Positive selection was detected in 77% of 884 genes studied. Most positive selection concerns a few sites on a single branch of the phylogenetic tree: Between 0.9% and 4.7% of sites are affected by positive selection depending on the branches. No functional category was overrepresented among genes under positive selection. Surprisingly, whole genome duplication had no effect on the prevalence of positive selection, whether the fish-specific genome duplication or the two rounds at the origin of vertebrates. Thus positive selection has not been limited to a few gene classes, or to specific evolutionary events such as duplication, but has been pervasive during vertebrate evolution.
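Branch-site tests of this kind are usually evaluated with a likelihood ratio test between a null model (no positive selection on the foreground branch) and an alternative model that allows it. The sketch below assumes the two log-likelihoods have already been obtained from a codon-model program and compares twice their difference to a chi-square distribution with one degree of freedom; the function name and the example values are hypothetical.

```python
import math


def branch_site_lrt(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio statistic and p-value (chi-square, 1 degree of freedom).

    For one degree of freedom the chi-square survival function has the closed
    form erfc(sqrt(x / 2)), so no statistics library is required.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))  # clip small negative values
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = branch_site_lrt(lnl_null=-1000.0, lnl_alt=-998.0)
    print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

When many genes and branches are tested, as in a genome-wide scan, the resulting p-values would normally be corrected for multiple testing before any gene is called positively selected.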
GeneBuilder: interactive in silico prediction of gene structure.
Milanesi, L; D'Angelo, D; Rogozin, I B
1999-01-01
Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated by the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) with the statistical properties of coding sequences (coding potential) and information about homologous proteins, ESTs and repeated elements. We have developed the GeneBuilder system, which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in protein and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested on the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996), achieving 0.89 sensitivity and 0.91 specificity at the nucleotide level, with a total correlation coefficient of 0.88. The GeneBuilder system is implemented as part of WebGene at the URL http://www.itba.mi.cnr.it/webgene and of the TRADAT (TRAnscription Database and Analysis Tools) launcher at the URL http://www.itba.mi.cnr.it/tradat.
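The nucleotide-level accuracy measures quoted above follow the conventions of the cited Burset and Guigo benchmark, in which "specificity" is the fraction of predicted coding nucleotides that are truly coding. A minimal sketch of how they are computed from per-nucleotide coding masks is shown below; the mask encoding and the function name are assumptions made for this example.

```python
import math


def nucleotide_accuracy(actual: str, predicted: str) -> tuple[float, float, float]:
    """Sensitivity, specificity and correlation coefficient at the nucleotide level.

    Both arguments are equal-length strings of '1' (coding) and '0' (non-coding).
    Sn = TP / (TP + FN); Sp = TP / (TP + FP), as in gene-prediction benchmarks.
    """
    if len(actual) != len(predicted):
        raise ValueError("masks must have the same length")
    tp = sum(a == "1" and p == "1" for a, p in zip(actual, predicted))
    tn = sum(a == "0" and p == "0" for a, p in zip(actual, predicted))
    fp = sum(a == "0" and p == "1" for a, p in zip(actual, predicted))
    fn = sum(a == "1" and p == "0" for a, p in zip(actual, predicted))
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fn * fp) / denom if denom else 0.0
    return sn, sp, cc


if __name__ == "__main__":
    # Toy masks (hypothetical): the predicted exon is shifted by one nucleotide.
    print(nucleotide_accuracy("0011110000", "0001111000"))
```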
Zayed, Amro; Whitfield, Charles W.
2008-01-01
Apis mellifera originated in Africa and extended its range into Eurasia in two or more ancient expansions. In 1956, honey bees of African origin were introduced into South America, their descendents admixing with previously introduced European bees, giving rise to the highly invasive and economically devastating “Africanized” honey bee. Here we ask whether the honey bee's out-of-Africa expansions, both ancient and recent (invasive), were associated with a genome-wide signature of positive selection, detected by contrasting genetic differentiation estimates (FST) between coding and noncoding SNPs. In native populations, SNPs in protein-coding regions had significantly higher FST estimates than those in noncoding regions, indicating adaptive evolution in the genome driven by positive selection. This signal of selection was associated with the expansion of honey bees from Africa into Western and Northern Europe, perhaps reflecting adaptation to temperate environments. We estimate that positive selection acted on a minimum of 852–1,371 genes or ≈10% of the bee's coding genome. We also detected positive selection associated with the invasion of African-derived honey bees in the New World. We found that introgression of European-derived alleles into Africanized bees was significantly greater for coding than noncoding regions. Our findings demonstrate that Africanized bees exploited the genetic diversity present from preexisting introductions in an adaptive way. Finally, we found a significant negative correlation between FST estimates and the local GC content surrounding coding SNPs, suggesting that AT-rich genes play an important role in adaptive evolution in the honey bee. PMID:18299560
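The contrast between coding and noncoding differentiation described above can be illustrated with a simple per-SNP FST calculation from allele frequencies in two populations, followed by a comparison of class means. This is a sketch only: it uses the basic heterozygosity-based definition FST = (HT - HS) / HT with equal population weights, which is not necessarily the estimator used by the authors, and all allele frequencies below are made up for illustration.

```python
from statistics import mean


def fst_two_pops(p1: float, p2: float) -> float:
    """FST for one biallelic SNP from allele frequencies in two populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # SNP fixed for the same allele in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def mean_fst(snps: list[tuple[float, float]]) -> float:
    """Average per-SNP FST over a class of SNPs (e.g. coding or noncoding)."""
    return mean(fst_two_pops(p1, p2) for p1, p2 in snps)


if __name__ == "__main__":
    # Hypothetical allele frequencies (population 1, population 2).
    coding = [(0.2, 0.8), (0.1, 0.6)]
    noncoding = [(0.4, 0.5), (0.3, 0.35)]
    print("coding:", mean_fst(coding), "noncoding:", mean_fst(noncoding))
```

A higher mean for coding SNPs than for noncoding SNPs is the kind of signal the abstract interprets as evidence of positive selection; a real analysis would also need a significance test and sample-size corrections.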
Regions of extreme synonymous codon selection in mammalian genes
Schattner, Peter; Diekhans, Mark
2006-01-01
Recently there has been increasing evidence that purifying selection occurs among synonymous codons in mammalian genes. This selection appears to be a consequence of either cis-regulatory motifs, such as exonic splicing enhancers (ESEs), or mRNA secondary structures, being superimposed on the coding sequence of the gene. We have developed a program to identify regions likely to be enriched for such motifs by searching for extended regions of extreme codon conservation between homologous genes of related species. Here we present the results of applying this approach to five mammalian species (human, chimpanzee, mouse, rat and dog). Even with very conservative selection criteria, we find over 200 regions of extreme codon conservation, ranging in length from 60 to 178 codons. The regions are often found within genes involved in DNA-binding, RNA-binding or zinc-ion-binding. They are highly depleted for synonymous single nucleotide polymorphisms (SNPs) but not for non-synonymous SNPs, further indicating that the observed codon conservation is being driven by negative selection. Forty-three percent of the regions overlap conserved alternative transcript isoforms and are enriched for known ESEs. Other regions are enriched for TpA dinucleotides and may contain conserved motifs/structures relating to mRNA stability and/or degradation. We anticipate that this tool will be useful for detecting regions enriched in other classes of coding-sequence motifs and structures as well. PMID:16556911
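The search for extended regions of codon conservation can be illustrated, in its simplest pairwise form, as a scan for runs of identical codons in a codon alignment of two homologous genes. This is a simplified sketch and not the authors' program, which compares several species and applies its own selection criteria; the function name, the gap handling and the toy sequences are assumptions.

```python
def conserved_codon_runs(aln_a: str, aln_b: str, min_len: int):
    """Find runs of at least `min_len` consecutive identical, ungapped codons.

    Returns a list of (start_codon_index, run_length_in_codons) tuples.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("sequences must be aligned to the same length")
    n_codons = len(aln_a) // 3
    runs, start = [], None
    for i in range(n_codons + 1):  # one extra step closes a run at the end
        codon_a = aln_a[3 * i:3 * i + 3]
        same = (i < n_codons
                and codon_a == aln_b[3 * i:3 * i + 3]
                and "-" not in codon_a)
        if same:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    return runs


if __name__ == "__main__":
    # Toy alignment (hypothetical): one synonymous difference at the third codon.
    print(conserved_codon_runs("ATGGCCGCCAAATTT", "ATGGCCGCTAAATTT", min_len=2))
```

Because synonymous differences break a run just as non-synonymous ones do, long runs point to constraint on synonymous sites, which is the signal the abstract attributes to overlapping motifs such as exonic splicing enhancers.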
Theory of prokaryotic genome evolution.
Sela, Itamar; Wolf, Yuri I; Koonin, Eugene V
2016-10-11
Bacteria and archaea typically possess small genomes that are tightly packed with protein-coding genes. The compactness of prokaryotic genomes is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. Here, by fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. These results suggest that the number of genes in prokaryotic genomes reflects the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias (i.e., the rate of deletion of genetic material being slightly greater than the rate of acquisition). Thus, new genes acquired by microbial genomes, on average, appear to be adaptive. The tight spacing of protein-coding genes likely results from a combination of the deletion bias and purifying selection that efficiently eliminates nonfunctional, noncoding sequences.
Cloning and molecular evolution of the aldehyde dehydrogenase 2 gene (Aldh2) in bats (Chiroptera).
Chen, Yao; Shen, Bin; Zhang, Junpeng; Jones, Gareth; He, Guimei
2013-02-01
Old World fruit bats (Pteropodidae) and New World fruit bats (Phyllostomidae) ingest significant quantities of ethanol while foraging. Mitochondrial aldehyde dehydrogenase (ALDH2, encoded by the Aldh2 gene) plays an important role in ethanol metabolism. To test whether the Aldh2 gene has undergone adaptive evolution in frugivorous and nectarivorous bats in relation to ethanol elimination, we sequenced part of the coding region of the gene (1,143 bp, ~73 % coverage) in 14 bat species, including three Old World fruit bats and two New World fruit bats. Our results showed that the Aldh2 coding sequences are highly conserved across all bat species we examined, and no evidence of positive selection was detected in the ancestral branches leading to Old World fruit bats and New World fruit bats. Further research is needed to determine whether other genes involved in ethanol metabolism have been the targets of positive selection in frugivorous and nectarivorous bats.
Opioid system genes in alcoholism: a case-control study in Croatian population.
Cupic, B; Stefulj, J; Zapletal, E; Matosic, A; Bordukalo-Niksic, T; Cicin-Sain, L; Gabrilovac, J
2013-10-01
Due to their involvement in dependence pathways, opioid system genes represent strong candidates for association studies investigating alcoholism. In this study, single nucleotide polymorphisms within the genes for mu (OPRM1) and kappa (OPRK1) opioid receptors and precursors of their ligands, proopiomelanocortin (POMC), coding for beta-endorphin, and prodynorphin (PDYN), coding for dynorphins, were analyzed in a case-control study that included 354 male alcohol-dependent and 357 male control subjects from the Croatian population. Analysis of allele and genotype frequencies of the selected polymorphisms of the genes OPRM1/POMC and OPRK1/PDYN revealed no differences between the tested groups. The same was true when alcohol-dependent persons were subdivided according to Cloninger's criteria into type-1 and type-2 groups, known to differ in the extent of genetic control. Thus, the data obtained suggest no association of the selected polymorphisms of the genes OPRM1/POMC and OPRK1/PDYN with alcoholism in the Croatian population.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene.
Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil
2007-11-29
Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are transcriptionally active, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein-coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene
Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil
2007-01-01
Background Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are transcriptionally active, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein-coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints. PMID:18047649
Simard, Frédéric; Licht, Monica; Besansky, Nora J.; Lehmann, Tovi
2007-01-01
Genetic variation in defensin, a gene encoding a major effector molecule of the insect immune response, was analyzed within and between populations of three members of the Anopheles gambiae complex. The species selected included the two anthropophilic species, An. gambiae and An. arabiensis, and the most zoophilic species of the complex, An. quadriannulatus. The first species was represented by four populations spanning its extreme genetic and geographical ranges, whereas each of the other two species was represented by a single population. We found (i) reduced overall polymorphism in the mature peptide region and in the total coding region, together with specific reductions in rare and moderately frequent mutations (sites) in the coding region compared with noncoding regions, (ii) a markedly reduced rate of nonsynonymous diversity compared with synonymous variation in the mature peptide and a virtually identical mature peptide across the three species, and (iii) increased divergence between species in the mature peptide together with reduced differentiation between populations of An. gambiae in the same DNA region. These patterns suggest strong purifying selection on the mature peptide and probably the whole coding region. Because An. quadriannulatus is not exposed to human pathogens, the identical mature peptide and similar pattern of polymorphism across species imply that human pathogens played no role as selective agents on this peptide. PMID:17161659
2010-01-01
Background The development of new microarray technologies makes custom long oligonucleotide arrays affordable for many experimental applications, notably gene expression analyses. Reliable results depend on probe design quality and selection. A probe design strategy should cope with the limited accuracy of de novo gene prediction programs and with annotation updating. We present a novel in silico procedure which addresses these issues and includes experimental screening, as an empirical approach is the best strategy to identify optimal probes in the in silico outcome. Findings We used four criteria for in silico probe selection: cross-hybridization, hairpin stability, probe location relative to coding sequence end and intron position. This latter criterion is critical when exon-intron gene structure predictions for intron-rich genes are inaccurate. For each coding sequence (CDS), we selected a subset of four probes. These probes were included in a test microarray, which was used to evaluate the hybridization behavior of each probe. The best probe for each CDS was selected according to three experimental criteria: signal-to-noise ratio, signal reproducibility, and representative signal intensities. This procedure was applied for the development of a gene expression Agilent platform for the filamentous fungus Podospora anserina and the selection of a single 60-mer probe for each of the 10,556 P. anserina CDS. Conclusions A reliable gene expression microarray version based on the Agilent 44K platform was developed with four spot replicates of each probe to increase the statistical significance of the analysis. PMID:20565839
Characteristics and significance of intergenic polyadenylated RNA transcription in Arabidopsis.
Moghe, Gaurav D; Lehti-Shiu, Melissa D; Seddon, Alex E; Yin, Shan; Chen, Yani; Juntawong, Piyada; Brandizzi, Federica; Bailey-Serres, Julia; Shiu, Shin-Han
2013-01-01
The Arabidopsis (Arabidopsis thaliana) genome is the most well-annotated plant genome. However, transcriptome sequencing in Arabidopsis continues to suggest the presence of polyadenylated (polyA) transcripts originating from presumed intergenic regions. It is not clear whether these transcripts represent novel noncoding or protein-coding genes. To understand the nature of intergenic polyA transcription, we first assessed its abundance using multiple messenger RNA sequencing data sets. We found 6,545 intergenic transcribed fragments (ITFs) occupying 3.6% of Arabidopsis intergenic space. In contrast to transcribed fragments that map to protein-coding and RNA genes, most ITFs are significantly shorter, are expressed at significantly lower levels, and tend to be more data set specific. A surprisingly large number of ITFs (32.1%) may be protein coding based on evidence of translation. However, our results indicate that these "translated" ITFs tend to be close to and are likely associated with known genes. To investigate if ITFs are under selection and are functional, we assessed ITF conservation through cross-species as well as within-species comparisons. Our analysis reveals that 237 ITFs, including 49 with translation evidence, are under strong selective constraint and relatively distant from annotated features. These ITFs are likely parts of novel genes. However, the selective pressure imposed on most ITFs is similar to that of randomly selected, untranscribed intergenic sequences. Our findings indicate that despite the prevalence of ITFs, apart from the possibility of genomic contamination, many may be background or noisy transcripts derived from "junk" DNA, whose production may be inherent to the process of transcription and which, on rare occasions, may act as catalysts for the creation of novel genes.
Jeukens, Julie; Bernatchez, Louis
2012-01-01
While gene expression divergence is known to be involved in adaptive phenotypic divergence and speciation, the relative importance of regulatory and structural evolution of genes is poorly understood. A recent next-generation sequencing experiment allowed the identification of candidate genes potentially involved in the ongoing speciation of sympatric dwarf and normal lake whitefish (Coregonus clupeaformis), such as cytosolic malate dehydrogenase (MDH1), which showed both significant expression and sequence divergence. The main goal of this study was to investigate in more detail the signatures of natural selection in the regulatory and coding sequences of MDH1 in lake whitefish and to test for parallelism of these signatures with other coregonine species. Sequencing of the two regions in 118 fish from four sympatric pairs of whitefish and two cisco species revealed a total of 35 single nucleotide polymorphisms (SNPs), with more genetic diversity in European compared to North American coregonine species. While the coding region was found to be under purifying selection, an SNP in the proximal promoter exhibited significant allele frequency divergence in a parallel manner among independent sympatric pairs of North American lake whitefish and European whitefish (C. lavaretus). According to transcription factor binding simulation for 22 regulatory haplotypes of MDH1, putative binding profiles were fairly conserved among species, except for the region around this SNP. Moreover, we found evidence for the role of this SNP in the regulation of MDH1 expression level. Overall, these results provide further evidence for the role of natural selection in gene regulation evolution among whitefish species pairs and suggest its possible link with patterns of phenotypic diversity observed in coregonine species. PMID:22408741
MHC class I-associated peptides derive from selective regions of the human genome.
Pearson, Hillary; Daouda, Tariq; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Mader, Sylvie; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude
2016-12-01
MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.
MHC class I–associated peptides derive from selective regions of the human genome
Pearson, Hillary; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Thibault, Pierre
2016-01-01
MHC class I–associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology. PMID:27841757
The importance of immune gene variability (MHC) in evolutionary ecology and conservation
Sommer, Simone
2005-01-01
Genetic studies have typically inferred the effects of human impact by documenting patterns of genetic differentiation and levels of genetic diversity among potentially isolated populations using selectively neutral markers such as mitochondrial control region sequences, microsatellites or single nucleotide polymorphisms (SNPs). However, evolutionarily relevant and adaptive processes within and between populations can only be reflected by coding genes. In vertebrates, growing evidence suggests that genetic diversity is particularly important at the level of the major histocompatibility complex (MHC). MHC variants influence many important biological traits, including immune recognition, susceptibility to infectious and autoimmune diseases, individual odours, mating preferences, kin recognition, cooperation and pregnancy outcome. These diverse functions and characteristics place genes of the MHC among the best candidates for studies of mechanisms and significance of molecular adaptation in vertebrates. MHC variability is believed to be maintained by pathogen-driven selection, mediated either through heterozygote advantage or frequency-dependent selection. Up to now, most of our knowledge has derived from studies in humans or from model organisms under experimental, laboratory conditions. Empirical support for selective mechanisms in free-ranging animal populations in their natural environment is rare. In this review, I first introduce general information about the structure and function of MHC genes, as well as current hypotheses and concepts concerning the role of selection in the maintenance of MHC polymorphism. The evolutionary forces acting on the genetic diversity in coding and non-coding markers are compared. Then, I summarise empirical support for the functional importance of MHC variability in parasite resistance with emphasis on the evidence derived from free-ranging animal populations investigated in their natural habitat. Finally, I discuss the importance of adaptive genetic variability with respect to human impact and conservation, and implications for future studies. PMID:16242022
Whittle, Carrie A.; Extavour, Cassandra G.
2016-01-01
Spiders belong to the Chelicerata, the most basally branching arthropod subphylum. The common house spider, Parasteatoda tepidariorum, is an emerging model and provides a valuable system to address key questions in molecular evolution in an arthropod system that is distinct from traditionally studied insects. Here, we provide evidence suggesting that codon usage, amino acid frequency, and protein lengths are each influenced by expression-mediated selection in P. tepidariorum. First, highly expressed genes exhibited preferential usage of T3 codons in this spider, suggestive of selection. Second, genes with elevated transcription favored amino acids with low or intermediate size/complexity (S/C) scores (glycine and alanine) and disfavored those with large S/C scores (such as cysteine), consistent with the minimization of biosynthesis costs of abundant proteins. Third, we observed a negative correlation between expression level and coding sequence length. Together, we conclude that protein-coding genes exhibit signals of expression-related selection in this emerging, noninsect, arthropod model. PMID:27017527
Origins of De Novo Genes in Human and Chimpanzee.
Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M Mar
2015-12-01
The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence remain unclear. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.
Origins of De Novo Genes in Human and Chimpanzee
Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M.Mar
2015-01-01
The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence remain unclear. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins. PMID:26720152
Yang, Hai-Ling; Liu, Yan-Jing; Wang, Cai-Ling; Zeng, Qing-Yin
2012-01-01
Trehalose-6-phosphate synthase (TPS) plays important roles in trehalose metabolism and signaling. Plant TPS proteins contain both a TPS and a trehalose-6-phosphate phosphatase (TPP) domain, which are coded by a multi-gene family. The plant TPS gene family has been divided into class I and class II. A previous study showed that the Populus, Arabidopsis, and rice genomes have seven class I and 27 class II TPS genes. In this study, we found that all class I TPS genes had 16 introns within the protein-coding region, whereas class II TPS genes had two introns. A significant sequence difference between the two classes of TPS proteins was observed by pairwise sequence comparisons of the 34 TPS proteins. A phylogenetic analysis revealed that at least seven TPS genes were present in the monocot–dicot common ancestor. Segmental duplications contributed significantly to the expansion of this gene family. At least five and three TPS genes were created by segmental duplication events in the Populus and rice genomes, respectively. Both the TPS and TPP domains of 34 TPS genes have evolved under purifying selection, but the selective constraint on the TPP domain was more relaxed than that on the TPS domain. Among 34 TPS genes from Populus, Arabidopsis, and rice, four class I TPS genes (AtTPS1, OsTPS1, PtTPS1, and PtTPS2) were under stronger purifying selection, whereas three Arabidopsis class I TPS genes (AtTPS2, 3, and 4) apparently evolved under relaxed selective constraint. Additionally, a reverse transcription polymerase chain reaction analysis showed the expression divergence of the TPS gene family in Populus, Arabidopsis, and rice under normal growth conditions and in response to stressors. Our findings provide new insights into the mechanisms of gene family expansion and functional evolution. PMID:22905132
Budisan, Liviuta; Gulei, Diana; Zanoaga, Oana Mihaela; Irimie, Alexandra Iulia; Chira, Sergiu; Braicu, Cornelia; Gherman, Claudia Diana; Berindan-Neagoe, Ioana
2017-01-01
Phytochemicals are natural compounds synthesized as secondary metabolites in plants, representing an important source of molecules with a wide range of therapeutic applications. These natural agents are important regulators of key pathological processes/conditions, including cancer, as they are able to modulate the expression of coding and non-coding transcripts with an oncogenic or tumour suppressor role. These natural agents are currently exploited for the development of therapeutic strategies alone or in tandem with conventional treatments for cancer. The aim of this paper is to review the recent studies regarding the role of these natural phytochemicals in different processes related to cancer inhibition, including apoptosis activation, angiogenesis and metastasis suppression. From the large palette of phytochemicals we selected epigallocatechin gallate (EGCG), caffeic acid phenethyl ester (CAPE), genistein, morin and kaempferol, due to their increased activity in modulating multiple coding and non-coding genes, targeting the main hallmarks of cancer. PMID:28587155
Budisan, Liviuta; Gulei, Diana; Zanoaga, Oana Mihaela; Irimie, Alexandra Iulia; Chira, Sergiu; Braicu, Cornelia; Gherman, Claudia Diana; Berindan-Neagoe, Ioana
2017-06-01
Phytochemicals are natural compounds synthesized as secondary metabolites in plants, representing an important source of molecules with a wide range of therapeutic applications. These natural agents are important regulators of key pathological processes/conditions, including cancer, as they are able to modulate the expression of coding and non-coding transcripts with an oncogenic or tumour suppressor role. These natural agents are currently exploited for the development of therapeutic strategies alone or in tandem with conventional treatments for cancer. The aim of this paper is to review the recent studies regarding the role of these natural phytochemicals in different processes related to cancer inhibition, including apoptosis activation, angiogenesis and metastasis suppression. From the large palette of phytochemicals we selected epigallocatechin gallate (EGCG), caffeic acid phenethyl ester (CAPE), genistein, morin and kaempferol, due to their increased activity in modulating multiple coding and non-coding genes, targeting the main hallmarks of cancer.
A genome-wide survey of maternal and embryonic transcripts during Xenopus tropicalis development.
Paranjpe, Sarita S; Jacobi, Ulrike G; van Heeringen, Simon J; Veenstra, Gert Jan C
2013-11-06
Dynamics of polyadenylation vs. deadenylation determine the fate of several developmentally regulated genes. Decay of a subset of maternal mRNAs and new transcription define the maternal-to-zygotic transition, but the full complement of polyadenylated and deadenylated coding and non-coding transcripts has not yet been assessed in Xenopus embryos. To analyze the dynamics and diversity of coding and non-coding transcripts during development, both polyadenylated mRNA and ribosomal RNA-depleted total RNA were harvested across six developmental stages and subjected to high throughput sequencing. The maternally loaded transcriptome is highly diverse and consists of both polyadenylated and deadenylated transcripts. Many maternal genes show peak expression in the oocyte and include genes known to be key regulators of events like oocyte maturation and fertilization. Of all the transcripts that increase in abundance between early blastula and larval stages, about 30% of the embryonic genes are induced by fourfold or more by the late blastula stage and another 35% by late gastrulation. Using a gene model validation and discovery pipeline, we identified novel transcripts and putative long non-coding RNAs (lncRNA). These lncRNA transcripts were stringently selected as spliced transcripts generated from independent promoters, with limited coding potential and a codon bias characteristic of noncoding sequences. Many lncRNAs are conserved and expressed in a developmental stage-specific fashion. These data reveal dynamics of transcriptome polyadenylation and abundance and provide a high-confidence catalogue of novel and long non-coding RNAs.
2011-01-01
Background A gene's position in regulatory, protein interaction or metabolic networks can be predictive of the strength of purifying selection acting on it, but these relationships are neither universal nor invariably strong. Following work in bacteria, fungi and invertebrate animals, we explore the relationship between selective constraint and metabolic function in mammals. Results We measure the association between selective constraint, estimated by the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, and several, primarily metabolic, measures of gene function. We find significant differences between the selective constraints acting on enzyme-coding genes from different cellular compartments, with the nucleus showing higher constraint than genes from either the cytoplasm or the mitochondria. Among metabolic genes, the centrality of an enzyme in the metabolic network is significantly correlated with Ka/Ks. In contrast to yeasts, gene expression magnitude does not appear to be the primary predictor of selective constraint in these organisms. Conclusions Our results imply that the relationship between selective constraint and enzyme centrality is complex: the strength of selective constraint acting on mammalian genes is quite variable and does not appear to exclusively follow patterns seen in other organisms. PMID:21470417
Identification of positive selection in disease response genes within members of the Poaceae.
Rech, Gabriel E; Vargas, Walter A; Sukno, Serenella A; Thon, Michael R
2012-12-01
Millions of years of coevolution between plants and pathogens can leave footprints on their genomes, and genes involved in this interaction are expected to show patterns of positive selection in which novel, beneficial alleles are rapidly fixed within the population. Using information about upregulated genes in maize during Colletotrichum graminicola infection and resources available in the Phytozome database, we looked for evidence of positive selection in the Poaceae lineage, acting on protein coding sequences related to plant defense. We found six genes with evidence of positive selection and another eight with sites showing episodic selection. Some of them have already been described as evolving under positive selection, but others are reported here for the first time, including genes encoding isocitrate lyase, dehydrogenases, a multidrug transporter, a protein containing a putative leucine-rich repeat and other proteins with unknown functions. Mapping positively selected residues onto the predicted 3-D structure of proteins showed that most of them are located on the surface, where proteins are in contact with other molecules. We present here a set of Poaceae genes that are likely to be involved in plant defense mechanisms and have evidence of positive selection. These genes are excellent candidates for future functional validation.
McLysaght, Aoife; Guerzoni, Daniele
2015-09-26
The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an 'RNA-first' or 'ORF-first' pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations. © 2015 The Authors.
Campos, José Luis; Charlesworth, Brian
2017-01-01
We used whole-genome resequencing data from a population of Drosophila melanogaster to investigate the causes of the negative correlation between the within-population synonymous nucleotide site diversity (πS) of a gene and its degree of divergence from related species at nonsynonymous nucleotide sites (KA). By using the estimated distributions of mutational effects on fitness at nonsynonymous and UTR sites, we predicted the effects of background selection at sites within a gene on πS and found that these could account for only part of the observed correlation between πS and KA. We developed a model of the effects of selective sweeps that included gene conversion as well as crossing over. We used this model to estimate the average strength of selection on positively selected mutations in coding sequences and in UTRs, as well as the proportions of new mutations that are selectively advantageous. Genes with high levels of selective constraint on nonsynonymous sites were found to have lower strengths of positive selection and lower proportions of advantageous mutations than genes with low levels of constraint. Overall, background selection and selective sweeps within a typical gene reduce its synonymous diversity to ∼75% of its value in the absence of selection, with larger reductions for genes with high KA. Gene conversion has a major effect on the estimates of the parameters of positive selection, such that the estimated strength of selection on favorable mutations is greatly reduced if it is ignored. PMID:28559322
A Case-by-Case Evolutionary Analysis of Four Imprinted Retrogenes
McCole, Ruth B; Loughran, Noeleen B; Chahal, Mandeep; Fernandes, Luis P; Roberts, Roland G; Fraternali, Franca; O'Connell, Mary J; Oakey, Rebecca J
2011-01-01
Retroposition is a widespread phenomenon resulting in the generation of new genes that are initially related to a parent gene via very high coding sequence similarity. We examine the evolutionary fate of four retrogenes generated by such an event: mouse Inpp5f_v2, Mcts2, Nap1l5, and U2af1-rs1. These genes are all subject to the epigenetic phenomenon of parental imprinting. We first provide new data on the age of these retrogene insertions. Using codon-based models of sequence evolution, we show these retrogenes have diverse evolutionary trajectories, including divergence from the parent coding sequence under positive selection pressure, purifying selection pressure maintaining parent-retrogene similarity, and neutral evolution. Examination of the expression pattern of retrogenes shows an atypical, broad pattern across multiple tissues. Protein 3D structure modeling reveals that a positively selected residue in U2af1-rs1, not shared by its parent, may influence protein conformation. Our case-by-case analysis of the evolution of four imprinted retrogenes reveals that this interesting class of imprinted genes, while similar in regulation and sequence characteristics, follows very varied evolutionary paths. PMID:21166792
Moreira, Viviane S; Soares, Virgínia L F; Silva, Raner J S; Sousa, Aurizangela O; Otoni, Wagner C; Costa, Marcio G C
2018-05-01
Bixa orellana L., popularly known as annatto, produces several secondary metabolites of pharmaceutical and industrial interest, including bixin, whose molecular basis of biosynthesis remains to be determined. Gene expression analysis by quantitative real-time PCR (qPCR) is an important tool to advance such knowledge. However, correct interpretation of qPCR data requires the use of suitable reference genes in order to reduce experimental variations. In the present study, we have selected four different candidates for reference genes in B. orellana, coding for 40S ribosomal protein S9 (RPS9), histone H4 (H4), 60S ribosomal protein L38 (RPL38) and 18S ribosomal RNA (18SrRNA). Their expression stabilities in different tissues (e.g. flower buds, flowers, leaves and seeds at different developmental stages) were analyzed using five statistical tools (NormFinder, geNorm, BestKeeper, ΔCt method and RefFinder). The results indicated that RPL38 is the most stable gene in different tissues and stages of seed development and 18SrRNA is the most unstable among the analyzed genes. In order to validate the candidate reference genes, we have analyzed the relative expression of a target gene coding for carotenoid cleavage dioxygenase 1 (CCD1) using the stable RPL38 and the least stable gene, 18SrRNA, for normalization of the qPCR data. The results demonstrated significant differences in the interpretation of the CCD1 gene expression data, depending on the reference gene used, reinforcing the importance of the correct selection of reference genes for normalization.
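The dependence of qPCR results on the reference gene, as described in the abstract above, can be illustrated with a minimal Python sketch. It assumes the widely used 2^-ΔΔCt relative quantification with roughly 100% amplification efficiency; the abstract does not state which quantification formula the authors applied, and the function name and all Ct values below are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes about 100% amplification efficiency for both genes
    (an assumption of this sketch, not a claim from the abstract).
    """
    # Normalize the target Ct to the reference gene in each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between sample and control.
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target gene is measured identically in
# both calls; only the behaviour of the reference gene differs.
fold_stable_ref = relative_expression(24.0, 18.0, 26.0, 18.0)
fold_unstable_ref = relative_expression(24.0, 16.0, 26.0, 18.0)

print(fold_stable_ref, fold_unstable_ref)
```

With a reference gene whose Ct is constant across conditions, the target appears fourfold induced; when the reference gene itself shifts by two cycles, the same target measurements give no apparent change. This is the kind of discrepancy a poorly chosen reference gene can introduce.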
Adaptive evolution of the matrix extracellular phosphoglycoprotein in mammals
2011-01-01
Background Matrix extracellular phosphoglycoprotein (MEPE) belongs to a family of small integrin-binding ligand N-linked glycoproteins (SIBLINGs) that play a key role in skeleton development, particularly in mineralization, phosphate regulation and osteogenesis. MEPE associated disorders cause various physiological effects, such as loss of bone mass, tumors and disruption of renal function (hypophosphatemia). The study of this developmental gene from an evolutionary perspective could provide valuable insights on the adaptive diversification of morphological phenotypes in vertebrates. Results Here we studied the adaptive evolution of the MEPE gene in 26 Eutherian mammals and three birds. The comparative genomic analyses revealed a high degree of evolutionary conservation of some coding and non-coding regions of the MEPE gene across mammals, indicating a possible regulatory or functional role likely related to mineralization and/or phosphate regulation. However, the majority of the coding region had a fast evolutionary rate, particularly within the largest exon (1467 bp). Rodentia and Scandentia had distinct substitution rates with an increased accumulation of both synonymous and non-synonymous mutations compared with other mammalian lineages. Characteristics of the gene (e.g. biochemical, evolutionary rate, and intronic conservation) differed greatly among lineages of the eight mammalian orders. We identified 20 sites with significant positive selection signatures (codon and protein level) outside the main regulatory motifs (dentonin and ASARM) suggestive of an adaptive role. Conversely, we find three sites under selection in the signal peptide and one in the ASARM motif that were supported by at least one selection model. The MEPE protein tends to accumulate amino acids promoting disorder and potential phosphorylation targets.
Conclusion MEPE shows a high number of selection signatures, revealing the crucial role of positive selection in the evolution of this SIBLING member. The selection signatures were found mainly outside the functional motifs, reinforcing the idea that other regions outside the dentonin and the ASARM might be crucial for the function of the protein and future studies should be undertaken to understand its importance. PMID:22103247
Trotta, Edoardo
2016-05-17
The three stop codons UAA, UAG, and UGA signal the termination of mRNA translation. As a result of a mechanism that is not adequately understood, they are normally used with unequal frequencies. In this work, we showed that selective forces and mutational biases drive stop codon usage in the human genome. We found that, with respect to sense codons, stop codon usage was affected by stronger selective forces but was less influenced by neutral mutational biases. UGA is the most frequent termination codon in the human genome. However, UAA was the preferred stop codon in genes with high breadth of expression, high level of expression, AT-rich coding sequences, housekeeping functions, and in gene ontology categories with the largest deviation from expected stop codon usage. Selective forces associated with the breadth and the level of expression favoured AT-rich sequences in the mRNA region including the stop site and its proximal 3'-UTR, but acted with scarce effects on sense codons, generating two regions, upstream and downstream of the stop codon, with strongly different base composition. By favouring low levels of GC-content, selection promoted labile local secondary structures at the stop site and its proximal 3'-UTR. The compositional and structural context favoured by selection was surprisingly emphasized in the class of ribosomal proteins and was consistent with sequence elements that increase the efficiency of translational termination. Stop codons were also heterogeneously distributed among chromosomes by a mechanism that was strongly correlated with the GC-content of coding sequences. In the human genome, the nucleotide composition and the thermodynamic stability of the stop codon site and its proximal 3'-UTR are correlated with the GC-content of coding sequences and with the breadth and the level of gene expression. In highly expressed genes, stop codon usage is compositionally and structurally consistent with highly efficient translation termination signals.
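Stop codon usage of the kind discussed in the abstract above can be tabulated with a short Python sketch. This is only an illustration, not the authors' pipeline: the function name and the toy sequences are hypothetical, and it assumes each input is an in-frame coding sequence whose last three bases are its termination codon.

```python
from collections import Counter

# DNA spellings of the three stop codons (UAA, UAG, UGA in mRNA).
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_codon_usage(cds_list):
    """Return the relative frequency of each stop codon.

    Sequences that are not a multiple of three in length, or that do
    not end in a stop codon, are skipped.
    """
    counts = Counter()
    for cds in cds_list:
        # Accept RNA or DNA input by mapping U to T.
        seq = cds.upper().replace("U", "T")
        if len(seq) >= 3 and len(seq) % 3 == 0:
            last_codon = seq[-3:]
            if last_codon in STOP_CODONS:
                counts[last_codon] += 1
    total = sum(counts.values())
    return {codon: (counts[codon] / total if total else 0.0)
            for codon in STOP_CODONS}


# Toy coding sequences, invented for illustration only.
example_cds = ["ATGGCCTGA", "ATGAAATAA", "AUGCCCUGA", "ATGTTTTAG"]
usage = stop_codon_usage(example_cds)
print(usage)
```

On real data the same tally could be split by expression breadth or GC-content class to compare groups, which is the type of contrast the abstract reports.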
USDA-ARS's Scientific Manuscript database
Retrograde signalling is a selective process defined by cues generated in chloroplast/mitochondria which traverse membranes and end up regulating nuclear gene expression and protein synthesis. The coding and encoding of organellar message(s) that alter nuclear gene expression and/or cellular metabo...
López-Wilchis, Ricardo; Del Río-Portilla, Miguel Ángel; Guevara-Chumacero, Luis Manuel
2017-02-01
We described the complete mitochondrial genome (mitogenome) of the Wagner's mustached bat, Pteronotus personatus, a species belonging to the family Mormoopidae, and compared it with other published mitogenomes of bats (Chiroptera). The mitogenome of P. personatus was 16,570 bp long and contained a typically conserved structure including 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and one control region (D-loop). Most of the genes were encoded on the H-strand, except for eight tRNA genes and the ND6 gene. The order of protein-coding and rRNA genes was highly conserved in all mitogenomes. All protein-coding genes started with an ATG codon, except for ND2, ND3, and ND5, which initiated with ATA, and terminated with the typical stop codon TAA/TAG or the codon AGA. Phylogenetic trees constructed using Maximum Parsimony, Maximum Likelihood, and Bayesian inference methods showed an identical topology and indicated the monophyly of different families of bats (Mormoopidae, Phyllostomidae, Vespertilionidae, Rhinolophidae, and Pteropodidae) and the existence of two major clades corresponding to the suborders Yangochiroptera and Yinpterochiroptera. The mitogenome sequence provided here will be useful for further phylogenetic analyses and population genetic studies in mormoopid bats.
Hansen, Karina K; Hauser, Frank; Williamson, Michael; Weber, Stine B; Grimmelikhuijzen, Cornelis J P
2011-01-07
Recently, a novel neuropeptide, CCHamide, was discovered in the silkworm Bombyx mori (L. Roller et al., Insect Biochem. Mol. Biol. 38 (2008) 1147-1157). We have now found that all insects with a sequenced genome have two genes, each coding for a different CCHamide, CCHamide-1 and -2. We have also cloned and deorphanized two Drosophila G-protein-coupled receptors (GPCRs) coded for by genes CG14593 and CG30106 that are selectively activated by Drosophila CCHamide-1 (EC50, 2×10^-9 M) and CCHamide-2 (EC50, 5×10^-9 M), respectively. Gene CG30106 (symbol synonym CG14484) has in a previous publication (E.C. Johnson et al., J. Biol. Chem. 278 (2003) 52172-52178) been wrongly assigned to code for an allatostatin-B receptor. This conclusion is based on our findings that the allatostatins-B do not activate the CG30106 receptor and on the recent findings from other research groups that the allatostatins-B activate an unrelated GPCR coded for by gene CG16752. Comparative genomics suggests that a duplication of the CCHamide neuropeptide signalling system occurred after the split of crustaceans and insects, about 410 million years ago, because only one CCHamide neuropeptide gene is found in the water flea Daphnia pulex (Crustacea) and the tick Ixodes scapularis (Chelicerata). Copyright © 2010 Elsevier Inc. All rights reserved.
Stotz, Henrik U; Harvey, Pascoe J; Haddadi, Parham; Mashanova, Alla; Kukol, Andreas; Larkan, Nicholas J; Borhan, M Hossein; Fitt, Bruce D L
2018-01-01
Genes coding for nucleotide-binding leucine-rich repeat (LRR) receptors (NLRs) control resistance against intracellular (cell-penetrating) pathogens. However, evidence for a role of genes coding for proteins with LRR domains in resistance against extracellular (apoplastic) fungal pathogens is limited. Here, the distribution of genes coding for proteins with eLRR domains but lacking kinase domains was determined for the Brassica napus genome. Predictions of signal peptide and transmembrane regions divided these genes into 184 coding for receptor-like proteins (RLPs) and 121 coding for secreted proteins (SPs). Together with previously annotated NLRs, a total of 720 LRR genes were found. Leptosphaeria maculans-induced expression during a compatible interaction with cultivar Topas differed between RLP, SP and NLR gene families; NLR genes were induced relatively late, during the necrotrophic phase of pathogen colonization. Seven RLP, one SP and two NLR genes were found in Rlm1 and Rlm3/Rlm4/Rlm7/Rlm9 loci for resistance against L. maculans on chromosome A07 of B. napus. One NLR gene at the Rlm9 locus was positively selected, as was the RLP gene on chromosome A10 with LepR3 and Rlm2 alleles conferring resistance against L. maculans races with corresponding effectors AvrLm1 and AvrLm2, respectively. Known loci for resistance against L. maculans (extracellular hemi-biotrophic fungus), Sclerotinia sclerotiorum (necrotrophic fungus) and Plasmodiophora brassicae (intracellular, obligate biotrophic protist) were examined for presence of RLPs, SPs and NLRs in these regions. Whereas loci for resistance against P. brassicae were enriched for NLRs, no such signature was observed for the other pathogens. These findings demonstrate involvement of (i) NLR genes in resistance against the intracellular pathogen P. brassicae and (ii) a putative NLR gene in Rlm9-mediated resistance against the extracellular pathogen L. maculans.
Genetics of Cerebellar and Neocortical Expansion in Anthropoid Primates: A Comparative Approach
Harrison, Peter W.; Montgomery, Stephen H.
2017-01-01
What adaptive changes in brain structure and function underpin the evolution of increased cognitive performance in humans and our close relatives? Identifying the genetic basis of brain evolution has become a major tool in answering this question. Numerous cases of positive selection, altered gene expression or gene duplication have been identified that may contribute to the evolution of the neocortex, which is widely assumed to play a predominant role in cognitive evolution. However, the components of the neocortex co-evolve with other functionally interdependent regions of the brain, most notably in the cerebellum. The cerebellum is linked to a range of cognitive tasks and expanded rapidly during hominoid evolution. Here we present data that suggest that, across anthropoid primates, protein-coding genes with known roles in cerebellum development were just as likely to be targeted by selection as genes linked to cortical development. Indeed, based on currently available gene ontology data, protein-coding genes with known roles in cerebellum development are more likely to have evolved adaptively during hominoid evolution. This is consistent with phenotypic data suggesting an accelerated rate of cerebellar expansion in apes that is beyond that predicted from scaling with the neocortex in other primates. Finally, we present evidence that the strength of selection on specific genes is associated with variation in the volume of either the neocortex or the cerebellum, but not both. This result provides preliminary evidence that co-variation between these brain components during anthropoid evolution may be at least partly regulated by selection on independent loci, a conclusion that is consistent with recent intraspecific genetic analyses and a mosaic model of brain evolution that predicts adaptive evolution of brain structure. PMID:28683440
Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun
2010-01-01
A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Voloch, Carolina M; Capellão, Renata T; Mello, Beatriz; Schrago, Carlos G
2014-11-19
Lyssavirus is a diverse genus of viruses that infect a variety of mammalian hosts, typically causing encephalitis. The evolution of this lineage, particularly the rabies virus, has been a focus of research because of the extensive occurrence of cross-species transmission, and the distinctive geographical patterns present throughout the diversification of these viruses. Although numerous studies have examined pattern-related questions concerning Lyssavirus evolution, analyses of the evolutionary processes acting on Lyssavirus diversification are scarce. To clarify the relevance of positive natural selection in Lyssavirus diversification, we conducted a comprehensive scan for episodic diversifying selection across all lineages and codon sites of the five coding regions in lyssavirus genomes. Although the genomes of these viruses are generally conserved, the glycoprotein (G), RNA-dependent RNA polymerase (L) and polymerase (P) genes were frequently targets of adaptive evolution during the diversification of the genus. Adaptive evolution is particularly manifest in the glycoprotein gene, which was inferred to have experienced the highest density of positively selected codon sites along branches. Substitutions in the L gene were found to be associated with the early diversification of phylogroups. A comparison between the numbers of positively selected sites inferred along RABV population branches and Lyssavirus interspecies branches suggested that the occurrence of positive selection was similar across the five coding regions of the genome in both groups.
Voloch, Carolina M.; Capellão, Renata T.; Mello, Beatriz; Schrago, Carlos G.
2014-01-01
Lyssavirus is a diverse genus of viruses that infect a variety of mammalian hosts, typically causing encephalitis. The evolution of this lineage, particularly the rabies virus, has been a focus of research because of the extensive occurrence of cross-species transmission, and the distinctive geographical patterns present throughout the diversification of these viruses. Although numerous studies have examined pattern-related questions concerning Lyssavirus evolution, analyses of the evolutionary processes acting on Lyssavirus diversification are scarce. To clarify the relevance of positive natural selection in Lyssavirus diversification, we conducted a comprehensive scan for episodic diversifying selection across all lineages and codon sites of the five coding regions in lyssavirus genomes. Although the genomes of these viruses are generally conserved, the glycoprotein (G), RNA-dependent RNA polymerase (L) and polymerase (P) genes were frequently targets of adaptive evolution during the diversification of the genus. Adaptive evolution is particularly manifest in the glycoprotein gene, which was inferred to have experienced the highest density of positively selected codon sites along branches. Substitutions in the L gene were found to be associated with the early diversification of phylogroups. A comparison between the numbers of positively selected sites inferred along RABV population branches and Lyssavirus interspecies branches suggested that the occurrence of positive selection was similar across the five coding regions of the genome in both groups. PMID:25415197
Analysis of protein-coding genetic variation in 60,706 humans.
Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G
2016-08-18
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
Jin, Qijiang; Hu, Xin; Li, Xin; Wang, Bei; Wang, Yanjie; Jiang, Hongwei; Mattson, Neil; Xu, Yingchun
2016-01-01
Trehalose-6-phosphate synthase (TPS) plays a key role in plant carbohydrate metabolism and the perception of carbohydrate availability. In the present work, analysis of the publicly available Nelumbo nucifera (lotus) genome sequence database led to the identification of nine lotus TPS genes (NnTPS). At least two introns were found in the coding sequences of NnTPS genes. Analysis of motif composition showed that NnTPS genes generally shared similar motifs, implying that they have similar functions. The dN/dS ratios were always less than 1 for different domains and regions outside domains, suggesting purifying selection on the lotus TPS gene family. The regions outside the TPS domain evolved relatively faster than the NnTPS domains. A phylogenetic tree was constructed using all predicted coding sequences of lotus TPS genes, together with those from Arabidopsis, poplar, soybean, and rice. The result indicated that those TPS genes could be clearly divided into two main subfamilies (I-II), where each subfamily could be further divided into 2 (I) and 5 (II) subgroups. Analyses of divergence and adaptive evolution show that purifying selection may have been the main force driving evolution of plant TPS genes. Some of the critical sites that contributed to divergence may have been under positive selection. Transcriptome data analysis revealed that most NnTPS genes were predominantly expressed in sink tissues. The expression pattern of NnTPS genes under copper and submergence stress indicated that NNU_014679 and NNU_022788 might play important roles in lotus energy metabolism and participate in stress response. Our results can facilitate further functional studies of TPS genes in lotus. PMID:27746792
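The reasoning from dN/dS ratios to a mode of selection, used in the abstract above, can be written down as a small Python sketch. It is a rough heuristic only: the function name and the tolerance `tol` are hypothetical choices made for this illustration, the input values are invented, and published analyses normally assess significance with codon-model likelihood ratio tests and not with a fixed cutoff.

```python
def classify_selection(dn, ds, tol=0.05):
    """Return (omega, label) for a nonsynonymous/synonymous rate pair.

    omega = dN/dS. Values clearly below 1 are read as purifying
    selection, values clearly above 1 as positive selection, and
    values near 1 as consistent with neutrality. `tol` is an
    arbitrary illustrative tolerance, not a statistical test.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1 - tol:
        label = "purifying"
    elif omega > 1 + tol:
        label = "positive"
    else:
        label = "neutral"
    return omega, label


# Invented rate estimates for three hypothetical gene regions.
for name, dn, ds in [("domain", 0.02, 0.20),
                     ("linker", 0.15, 0.15),
                     ("surface_loop", 0.30, 0.10)]:
    omega, label = classify_selection(dn, ds)
    print(f"{name}: dN/dS = {omega:.2f} ({label})")
```

Under this reading, a gene family whose dN/dS ratios are all below 1, as reported for NnTPS, falls in the purifying category even if some regions evolve faster than others.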
Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua
2018-01-01
Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophylies of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.
Detection of Pathways Affected by Positive Selection in Primate Lineages Ancestral to Humans
Moretti, S.; Davydov, I.I.; Excoffier, L.
2017-01-01
Gene set enrichment approaches have been increasingly successful in finding signals of recent polygenic selection in the human genome. In this study, we aim at detecting biological pathways affected by positive selection in more ancient human evolutionary history. Focusing on four branches of the primate tree that lead to modern humans, we tested all available protein coding gene trees of the Primates clade for signals of adaptation in these branches, using the likelihood-based branch site test of positive selection. The results of these locus-specific tests were then used as input for a gene set enrichment test, where whole pathways are globally scored for a signal of positive selection, instead of focusing only on outlier “significant” genes. We identified signals of positive selection in several pathways that are mainly involved in immune response, sensory perception, metabolism, and energy production. These pathway-level results are highly significant, even though there is no functional enrichment when only focusing on top scoring genes. Interestingly, several gene sets are found significant at multiple levels in the phylogeny, but different genes are responsible for the selection signal in the different branches. This suggests that the same function has been optimized in different ways at different times in primate evolution. PMID:28333345
Campos, José Luis; Johnston, Keira; Charlesworth, Brian
2017-12-08
A faster rate of adaptive evolution of X-linked genes compared with autosomal genes (the faster-X effect) can be caused by the fixation of recessive or partially recessive advantageous mutations. This effect should be largest for advantageous mutations that affect only male fitness, and least for mutations that affect only female fitness. We tested these predictions in Drosophila melanogaster by using coding and functionally significant non-coding sequences of genes with different levels of sex-biased expression. Consistent with theory, nonsynonymous substitutions in most male-biased and unbiased genes show faster adaptive evolution on the X. However, genes with very low recombination rates do not show such an effect, possibly as a consequence of Hill-Robertson interference. Contrary to expectation, there was a substantial faster-X effect for female-biased genes. After correcting for recombination rate differences, however, female-biased genes did not show a faster X-effect. Similar analyses of non-coding UTRs and long introns showed a faster-X effect for all groups of genes, other than introns of female-biased genes. Given the strong evidence that deleterious mutations are mostly recessive or partially recessive, we would expect a slower rate of evolution of X-linked genes for slightly deleterious mutations that become fixed by genetic drift. Surprisingly, we found little evidence for this after correcting for recombination rate, implying that weakly deleterious mutations are mostly close to being semidominant. This is consistent with evidence from polymorphism data, which we use to test how models of selection that assume semidominance with no sex-specific fitness effects may bias estimates of purifying selection. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Positive and relaxed selection associated with flight evolution and loss in insect transcriptomes
Mitterboeck, T. Fatima; Liu, Shanlin; Adamowicz, Sarah J.; Fu, Jinzhong; Zhang, Rui; Song, Wenhui; Meusemann, Karen
2017-01-01
The evolution of powered flight is a major innovation that has facilitated the success of insects. Previously, studies of birds, bats, and insects have detected molecular signatures of differing selection regimes in energy-related genes associated with flight evolution and/or loss. Here, using DNA sequences from more than 1000 nuclear and mitochondrial protein-coding genes obtained from insect transcriptomes, we conduct a broader exploration of which gene categories display positive and relaxed selection at the origin of flight as well as with multiple independent losses of flight. We detected a number of categories of nuclear genes more often under positive selection in the lineage leading to the winged insects (Pterygota), related to catabolic processes such as proteases, as well as splicing-related genes. Flight loss was associated with relaxed selection signatures in splicing genes, mirroring the results for flight evolution. Similar to previous studies of flight loss in various animal taxa, we observed consistently higher nonsynonymous-to-synonymous substitution ratios in mitochondrial genes of flightless lineages, indicative of relaxed selection in energy-related genes. While oxidative phosphorylation genes were not detected as being under selection with the origin of flight specifically, they were most often detected as being under positive selection in holometabolous (complete metamorphosis) insects as compared with other insect lineages. This study supports some convergence in gene-specific selection pressures associated with flight ability, and the exploratory analysis provided some new insights into gene categories potentially associated with the gain and loss of flight in insects. PMID:29020740
Positive and relaxed selection associated with flight evolution and loss in insect transcriptomes.
Mitterboeck, T Fatima; Liu, Shanlin; Adamowicz, Sarah J; Fu, Jinzhong; Zhang, Rui; Song, Wenhui; Meusemann, Karen; Zhou, Xin
2017-10-01
The evolution of powered flight is a major innovation that has facilitated the success of insects. Previously, studies of birds, bats, and insects have detected molecular signatures of differing selection regimes in energy-related genes associated with flight evolution and/or loss. Here, using DNA sequences from more than 1000 nuclear and mitochondrial protein-coding genes obtained from insect transcriptomes, we conduct a broader exploration of which gene categories display positive and relaxed selection at the origin of flight as well as with multiple independent losses of flight. We detected a number of categories of nuclear genes more often under positive selection in the lineage leading to the winged insects (Pterygota), related to catabolic processes such as proteases, as well as splicing-related genes. Flight loss was associated with relaxed selection signatures in splicing genes, mirroring the results for flight evolution. Similar to previous studies of flight loss in various animal taxa, we observed consistently higher nonsynonymous-to-synonymous substitution ratios in mitochondrial genes of flightless lineages, indicative of relaxed selection in energy-related genes. While oxidative phosphorylation genes were not detected as being under selection with the origin of flight specifically, they were most often detected as being under positive selection in holometabolous (complete metamorphosis) insects as compared with other insect lineages. This study supports some convergence in gene-specific selection pressures associated with flight ability, and the exploratory analysis provided some new insights into gene categories potentially associated with the gain and loss of flight in insects. © The Authors 2017. Published by Oxford University Press.
Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.
2014-01-01
Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168
Prediction of plant lncRNA by ensemble machine learning classifiers.
Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian
2018-05-02
In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride
Matroudi, S.; Zamani, M.R.; Motallebi, M.
2008-01-01
In this study, Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate, the genomic and cDNA clones encoding chit33 were isolated and sequenced. Comparison of the genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pEt26b(+) expression vector and expressed in E. coli. PMID:24031242
Wildman, Derek E.; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I.; Goodman, Morris
2003-01-01
What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare ≈90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human–chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human–chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee). PMID:12766228
Wildman, Derek E; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I; Goodman, Morris
2003-06-10
What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare approximately 90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human-chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human-chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee).
[Variation of CAG repeats in coding region of ATXN2 gene in different ethnic groups].
Chen, Xiao-Chen; Sun, Hao; Mi, Dong-Qing; Huang, Xiao-Qin; Lin, Ke-Qin; Yi, Wen; Yu, Liang; Shi, Lei; Shi, Li; Yang, Zhao-Qing; Chu, Jia-You
2011-04-01
To investigate CAG repeat variation in the coding region of the ATXN2 gene in six ethnic groups that live in comparatively different environments, to evaluate whether these variations are under positive selection, and to find factors driving selection effects, 291 unrelated healthy individuals were collected from six ethnic groups and their STR genotyping was performed. The frequencies of alleles and genotypes were counted and Slatkin's linearized Fst values were calculated from them. A UPGMA tree for this gene was constructed. An MDS analysis among these groups was carried out as well. The linearized Fst values indicated that there were significant evolutionary differences in the ATXN2 STR between the Hui and Yi groups, but not among the other 4 groups. Further analysis was performed by combining our data with published data obtained from other groups. These results indicated that there were significant differences between Japanese and other groups including Hui, Hani, Yunnan Mongolian, and Inner Mongolian. Both Hui and Mongolian from Inner Mongolia were significantly different from Han. In conclusion, the six ethnic groups had their own distribution characteristics of ATXN2 STR allele frequencies, and the frequency changes in rare alleles could be the consequence of positive selection.
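Slatkin's linearized Fst, the distance measure named in the abstract above, is the transformation Fst / (1 - Fst). The Python sketch below shows only that transformation; the function name is hypothetical, the input value is illustrative, and estimating Fst itself from allele frequencies is not covered.

```python
def slatkin_linearized_fst(fst: float) -> float:
    """Return Slatkin's linearized Fst, defined as Fst / (1 - Fst).

    The transformation is intended to make the distance grow roughly
    linearly with divergence time between two populations.
    """
    if not 0.0 <= fst < 1.0:
        # Fst = 1 would divide by zero; negative estimates are usually
        # set to zero before transforming (an assumption made here).
        raise ValueError("Fst must lie in the interval [0, 1)")
    return fst / (1.0 - fst)


if __name__ == "__main__":
    # Illustrative pairwise Fst, not a value reported in the study.
    print(slatkin_linearized_fst(0.2))
```

A matrix of such pairwise values is the kind of input that distance-based methods like UPGMA clustering and MDS take.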
Sengupta, Subhadipa; Chakraborti, Dipankar; Mondal, Hossain A; Das, Sampa
2010-03-01
Rice, the major food crop of the world, is severely affected by homopteran sucking pests. We introduced the coding sequence of Allium sativum leaf agglutinin, ASAL, into rice cultivar IR64 to develop sustainable resistance against sap-sucking planthoppers, and eliminated the selectable antibiotic-resistance marker gene hygromycin phosphotransferase (hpt) by exploiting the cre/lox site-specific recombination system. An expression vector was constructed containing the coding sequence of ASAL, a potent controlling agent against green leafhoppers (GLH, Nephotettix virescens) and brown planthopper (BPH, Nilaparvata lugens). The selectable marker (hpt) gene cassette was cloned within two lox sites of the same vector. Alongside, another vector was developed with a chimeric cre recombinase gene cassette. Reciprocal crosses were performed between three single-copy T(0) plants with ASAL- lox-hpt-lox T-DNA and three single-copy T(0) plants with cre-bar T-DNA. Marker gene excisions were detected in T(1) hybrids through a hygromycin sensitivity assay. Molecular analysis of T(1) plants exhibited 27.4% recombination efficiency. T(2) progenies of the L03C04(1) hybrid parent showed 25% cre-negative ASAL-expressing plants. Northern blot, western blot and ELISA showed a significant level of ASAL expression in five marker-free T(2) progeny plants. In planta bioassays of GLH and BPH performed on these T(2) progenies exhibited a radical reduction in survivability and fecundity compared with the untransformed control plants.
42 CFR 73.3 - HHS select agents and toxins.
Code of Federal Regulations, 2013 CFR
2013-10-01
... replication competent forms of the 1918 pandemic influenza virus containing any portion of the coding regions of all eight gene segments (Reconstructed 1918 Influenza virus) Ricin Rickettsia prowazekii SARS...
42 CFR 73.3 - HHS select agents and toxins.
Code of Federal Regulations, 2012 CFR
2012-10-01
... virus Monkeypox virus Reconstructed replication competent forms of the 1918 pandemic influenza virus containing any portion of the coding regions of all eight gene segments (Reconstructed 1918 Influenza virus...
42 CFR 73.3 - HHS select agents and toxins.
Code of Federal Regulations, 2014 CFR
2014-10-01
... replication competent forms of the 1918 pandemic influenza virus containing any portion of the coding regions of all eight gene segments (Reconstructed 1918 Influenza virus) Ricin Rickettsia prowazekii SARS...
Third International Meeting on Esterases Reacting with Organophosphorus Compounds
1998-01-01
cassette for negative selection, 884 bp of ACHE including exon 1, 1.6 kb of a Neor gene cassette for positive selection, 5.2 kb of the ACHE Bam HI...fragment including exon 6, and 3 kb of Bluescript. Deletion of exons 2-5 removed 80% of the ACHE coding sequence. The gene targeting vector was...expression due to environmental influences on CYP3A4 and the presence or absence of CYP3A5 which may be under genetic control in man. Plasma
Natural selection in avian protein-coding genes expressed in brain.
Axelsson, Erik; Hultin-Rosenberg, Lina; Brandström, Mikael; Zwahlén, Martin; Clayton, David F; Ellegren, Hans
2008-06-01
The evolution of birds from theropod dinosaurs took place approximately 150 million years ago, and was associated with a number of specific adaptations that are still evident among extant birds, including feathers, song and extravagant secondary sexual characteristics. Knowledge about the molecular evolutionary background to such adaptations is lacking. Here, we analyse the evolution of > 5000 protein-coding gene sequences expressed in zebra finch brain by comparison to orthologous sequences in chicken. Mean d(N)/d(S) is 0.085 and genes with their maximal expression in the eye and central nervous system have the lowest mean d(N)/d(S) value, while those expressed in digestive and reproductive tissues exhibit the highest. We find that fast-evolving genes (those which have higher than expected rate of nonsynonymous substitution, indicative of adaptive evolution) are enriched for biological functions such as fertilization, muscle contraction, defence response, response to stress, wounding and endogenous stimulus, and cell death. After alignment to mammalian orthologues, we identify a catalogue of 228 genes that show a significantly higher rate of protein evolution in the two bird lineages than in mammals. These accelerated bird genes, representing candidates for avian-specific adaptations, include genes implicated in vocal learning and other cognitive processes. Moreover, colouration genes evolve faster in birds than in mammals, which may have been driven by sexual selection for extravagant plumage characteristics.
Fang, Lu; Shen, Bin; Irwin, David M; Zhang, Shuyi
2014-10-01
Glycogen synthase, which catalyzes the synthesis of glycogen, is especially important for Old World (Pteropodidae) and New World (Phyllostomidae) fruit bats that ingest high-carbohydrate diets. Glycogen synthase 1, encoded by the Gys1 gene, is the glycogen synthase isozyme that functions in muscles. To determine whether Gys1 has undergone adaptive evolution in bats with carbohydrate-rich diets, in comparison to insect-eating sister bat taxa, we sequenced the coding region of the Gys1 gene from 10 species of bats, including two Old World fruit bats (Pteropodidae) and a New World fruit bat (Phyllostomidae). Our results show no evidence for positive selection in the Gys1 coding sequence on the ancestral Old World and the New World Artibeus lituratus branches. Tests for convergent evolution indicated convergence of the sequences and one parallel amino acid substitution (T395A) was detected on these branches, which was likely driven by natural selection.
2010-01-01
Background: Ghrelin, an endogenous ligand for the growth hormone secretagogue receptor (GHSR), has two major functions: the stimulation of growth hormone production and the stimulation of food intake. Accumulating evidence also indicates a role of ghrelin in cancer development. Methods: We conducted a case-control study to examine the association of common genetic variants in the genes coding for ghrelin (GHRL) and its receptor (GHSR) with colorectal cancer risk. Pairwise tagging was used to select the 11 polymorphisms included in the study. The selected polymorphisms were genotyped in 680 cases and 593 controls from the Czech Republic. Results: We found two SNPs associated with lower risk of colorectal cancer, namely SNPs rs27647 and rs35683. We replicated the two hits in an additional 569 cases and 726 controls from Germany. Conclusion: A joint analysis of the two populations indicated that the T allele of rs27647 SNP exerted a protective borderline effect (Ptrend = 0.004). PMID:20920174
Campa, Daniele; Pardini, Barbara; Naccarati, Alessio; Vodickova, Ludmila; Novotny, Jan; Steinke, Verena; Rahner, Nils; Holinski-Feder, Elke; Morak, Monika; Schackert, Hans K; Görgens, Heike; Kötting, Judith; Betz, Beate; Kloor, Matthias; Engel, Christoph; Büttner, Reinhard; Propping, Peter; Försti, Asta; Hemminki, Kari; Barale, Roberto; Vodicka, Pavel; Canzian, Federico
2010-09-28
Ghrelin, an endogenous ligand for the growth hormone secretagogue receptor (GHSR), has two major functions: the stimulation of growth hormone production and the stimulation of food intake. Accumulating evidence also indicates a role of ghrelin in cancer development. We conducted a case-control study to examine the association of common genetic variants in the genes coding for ghrelin (GHRL) and its receptor (GHSR) with colorectal cancer risk. Pairwise tagging was used to select the 11 polymorphisms included in the study. The selected polymorphisms were genotyped in 680 cases and 593 controls from the Czech Republic. We found two SNPs associated with lower risk of colorectal cancer, namely SNPs rs27647 and rs35683. We replicated the two hits in an additional 569 cases and 726 controls from Germany. A joint analysis of the two populations indicated that the T allele of rs27647 SNP exerted a protective borderline effect (Ptrend = 0.004).
2010-01-01
Background: Natural accessions of Arabidopsis thaliana are characterized by a high level of phenotypic variation that can be used to investigate the extent and mode of selection on the primary metabolic traits. A collection of 54 A. thaliana natural accession-derived lines were subjected to deep genotyping through Single Feature Polymorphism (SFP) detection via genomic DNA hybridization to Arabidopsis Tiling 1.0 Arrays for the detection of selective sweeps, and identification of associations between sweep regions and growth-related metabolic traits. Results: A total of 1,072,557 high-quality SFPs were detected and indications for 3,943 deletions and 1,007 duplications were obtained. A significantly lower than expected SFP frequency was observed in protein-, rRNA-, and tRNA-coding regions and in non-repetitive intergenic regions, while pseudogenes, transposons, and non-coding RNA genes are enriched with SFPs. Gene families involved in plant defence or in signalling were identified as highly polymorphic, while several other families including transcription factors are depleted of SFPs. 198 significant associations between metabolic genes and 9 metabolic and growth-related phenotypic traits were detected with annotation hinting at the nature of the relationship. Five significant selective sweep regions were also detected of which one associated significantly with a metabolic trait. Conclusions: We generated a high density polymorphism map for 54 A. thaliana accessions that highlights the variability of resistance genes across geographic ranges and used it to identify selective sweeps and associations between metabolic genes and metabolic phenotypes. Several associations show a clear biological relationship, while many remain requiring further investigation. PMID:20302660
Gene disruption in Trichoderma atroviride via Agrobacterium-mediated transformation.
Zeilinger, Susanne
2004-02-01
A modified Agrobacterium-mediated transformation method for the efficient disruption of two genes encoding signaling compounds of the mycoparasite Trichoderma atroviride is described, using the hph gene of Escherichia coli as selection marker. The transformation vectors contained about 1 kb of 5' and 3' non-coding regions from the tmk1 (encoding a MAP kinase) or tga3 (encoding an alpha-subunit of a heterotrimeric G protein) target loci flanking a selection marker. Transformation of fungal conidia and selection on hygromycin-containing media applying an overlay-based procedure, which overcomes the lack of formation of distinct single colonies by the fungus, led to stable clones for both disruption constructs. Southern and PCR analyses proved gene disruption by single-copy homologous integration with a frequency of approximately 60% for both genes; and the loss of tmk1 and tga3 transcript formation in the disruptants was demonstrated by RT-PCR.
Mohandesan, Elmira; Fitak, Robert R; Corander, Jukka; Yadamsuren, Adiya; Chuluunbat, Battsetseg; Abdelhadi, Omer; Raziq, Abdul; Nagy, Peter; Stalder, Gabrielle; Walzer, Chris; Faye, Bernard; Burger, Pamela A
2017-08-30
The genus Camelus is an interesting model to study adaptive evolution in the mitochondrial genome, as the three extant Old World camel species inhabit hot and low-altitude as well as cold and high-altitude deserts. We sequenced 24 camel mitogenomes and combined them with three previously published sequences to study the role of natural selection under different environmental pressures, and to advance our understanding of the evolutionary history of the genus Camelus. We confirmed the heterogeneity of divergence across different components of the electron transport system. Lineage-specific analysis of mitochondrial protein evolution revealed a significant effect of purifying selection in the concatenated protein-coding genes in domestic Bactrian camels. The estimated dN/dS < 1 in the concatenated protein-coding genes suggested purifying selection as the driving force shaping mitogenome diversity in camels. Additional analyses of the functional divergence in amino acid changes between species-specific lineages indicated fixed substitutions in various genes, with radical effects on the physicochemical properties of the protein products. The evolutionary time estimates revealed a divergence between domestic and wild Bactrian camels around 1.1 [0.58-1.8] million years ago (mya). This has major implications for the conservation and management of the critically endangered wild species, Camelus ferus.
Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster
Wang, Wen; Brunet, Frédéric G.; Nevo, Eviatar; Long, Manyuan
2002-01-01
Non-protein-coding RNA genes play an important role in various biological processes. How new RNA genes originated and whether this process is controlled by similar evolutionary mechanisms for the origin of protein-coding genes remains unclear. A young chimeric RNA gene that we term sphinx (spx) provides the first insight into the early stage of evolution of RNA genes. spx originated as an insertion of a retroposed sequence of the ATP synthase chain F gene at the cytological region 60DB since the divergence of Drosophila melanogaster from its sibling species 2–3 million years ago. This retrosequence, which is located at 102F on the fourth chromosome, recruited a nearby exon and intron, thereby evolving a chimeric gene structure. This molecular process suggests that the mechanism of exon shuffling, which can generate protein-coding genes, also plays a role in the origin of RNA genes. The subsequent evolutionary process of spx has been associated with a high nucleotide substitution rate, possibly driven by a continuous positive Darwinian selection for a novel function, as is shown in its sex- and development-specific alternative splicing. To test whether spx has adapted to different environments, we investigated its population genetic structure in the unique “Evolution Canyon” in Israel, revealing a similar haplotype structure in spx, and thus similar evolutionary forces operating on spx between environments. PMID:11904380
A Molecular Portrait of De Novo Genes in Yeasts.
Vakirlis, Nikolaos; Hebert, Alex S; Opulente, Dana A; Achaz, Guillaume; Hittinger, Chris Todd; Fischer, Gilles; Coon, Joshua J; Lafontaine, Ingrid
2018-03-01
New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
Ziegler, Andreas; Dohr, Gottfried; Uchanska-Ziegler, Barbara
2002-07-01
Polymorphic genes of the human major histocompatibility complex [MHC; human leukocyte antigen (HLA)] are probably important in determining resistance to parasites and avoidance of inbreeding. We investigated whether HLA-associated sexual selection could also involve HLA-linked olfactory receptor (OR) genes, which might not only participate in olfaction-guided mate choice, but also in selection processes within the testis. The testicular expression status of HLA class I molecules (by immunohistology) and HLA-linked OR genes (by transcriptional analysis) was determined. Various HLA class I heavy chains, but not beta2-microglobulin (beta2m), were expressed, mainly at the spermatocyte I stage. Of 17 HLA-linked OR genes analyzed, eight were found to be transcribed in the testis. They exhibited varying numbers of 5'- or 3'-non-coding exons as well as differential splicing. We suggest that testis-expressed polymorphic HLA and OR proteins are functionally connected and serve the selection of spermatozoa, enabling them to distinguish 'self' from 'non-self' [the sperm-receptor-selection (SRS) hypothesis].
Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.
Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi
2016-03-01
Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine- and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.
McClelland, Shawn; Brennan, Gary P; Dubé, Celine; Rajpara, Seeta; Iyer, Shruti; Richichi, Cristina; Bernard, Christophe; Baram, Tallie Z
2014-01-01
The mechanisms generating epileptic neuronal networks following insults such as severe seizures are unknown. We have previously shown that interfering with the function of the neuron-restrictive silencer factor (NRSF/REST), an important transcription factor that influences neuronal phenotype, attenuated development of this disorder. In this study, we found that epilepsy-provoking seizures increased the low NRSF levels in mature hippocampus several fold yet surprisingly, provoked repression of only a subset (∼10%) of potential NRSF target genes. Accordingly, the repressed gene-set was rescued when NRSF binding to chromatin was blocked. Unexpectedly, genes selectively repressed by NRSF had mid-range binding frequencies to the repressor, a property that rendered them sensitive to moderate fluctuations of NRSF levels. Genes selectively regulated by NRSF during epileptogenesis coded for ion channels, receptors, and other crucial contributors to neuronal function. Thus, dynamic, selective regulation of NRSF target genes may play a role in influencing neuronal properties in pathological and physiological contexts. DOI: http://dx.doi.org/10.7554/eLife.01267.001 PMID:25117540
Possible Diversifying Selection in the Imprinted Gene, MEDEA, in Arabidopsis
Miyake, Takashi; Takebayashi, Naoki
2009-01-01
Coevolutionary conflict among imprinted genes that influence traits such as offspring growth may arise when maternal and paternal genomes have different evolutionary optima. This conflict is expected in outcrossing taxa with multiple paternity, but not self-fertilizing taxa. MEDEA (MEA) is an imprinted plant gene that influences seed growth. Disagreement exists regarding the type of selection acting on this gene. We present new data and analyses of sequence diversity of MEA in self-fertilizing and outcrossing Arabidopsis and its relatives, to help clarify the form of selection acting on this gene. Codon-based branch analysis among taxa (PAML) suggests that selection on the coding region is changing over time, and nonsynonymous substitution is elevated in at least one outcrossing branch. Codon-based analysis of diversity within outcrossing Arabidopsis lyrata ssp. petraea (OmegaMap) suggests that diversifying selection is acting on a portion of the gene, to cause elevated nonsynonymous polymorphism. Providing further support for balancing selection in A. lyrata, Hudson, Kreitman and Aguadé analysis indicates that diversity/divergence at silent sites in the MEA promoter and genic region is elevated relative to reference genes, and there are deviations from the neutral frequency spectrum. This combination of positive selection as well as balancing and diversifying selection in outcrossing lineages is consistent with other genes influenced by evolutionary conflict, such as disease resistance genes. Consistent with predictions that conflict would be eliminated in self-fertilizing taxa, we found no evidence of positive, balancing, or diversifying selection in A. thaliana promoter or genic region. PMID:19126870
Selective modes determine evolutionary rates, gene compactness and expression patterns in Brassica.
Guo, Yue; Liu, Jing; Zhang, Jiefu; Liu, Shengyi; Du, Jianchang
2017-07-01
It has been well documented that most nuclear protein-coding genes in organisms can be classified into two categories: positively selected genes (PSGs) and negatively selected genes (NSGs). The characteristics and evolutionary fates of different types of genes, however, have been poorly understood. In this study, the rates of nonsynonymous substitution (Ka) and the rates of synonymous substitution (Ks) were investigated by comparing the orthologs between the two sequenced Brassica species, Brassica rapa and Brassica oleracea, and the evolutionary rates, gene structures, expression patterns, and codon bias were compared between PSGs and NSGs. The resulting data show that PSGs have higher protein evolutionary rates, lower synonymous substitution rates, shorter gene length, fewer exons, higher functional specificity, lower expression level, higher tissue-specific expression and stronger codon bias than NSGs. Although the quantities and values are different, the relative features of PSGs and NSGs have been largely verified in the model species Arabidopsis. These data suggest that PSGs and NSGs differ not only under selective pressure (Ka/Ks), but also in their evolutionary, structural and functional properties, indicating that selective modes may serve as a determinant factor for measuring evolutionary rates, gene compactness and expression patterns in Brassica. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
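The PSG/NSG split described in this abstract rests on the ratio of nonsynonymous to synonymous substitution rates. A minimal sketch of such a classification follows, assuming the conventional cutoff of Ka/Ks = 1; the abstract does not state the threshold the authors used, and the function names and the handling of Ks = 0 are illustrative choices rather than the authors' method.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is not positive (ratio undefined)."""
    if ks <= 0:
        return None
    return ka / ks


def classify_gene(ka: float, ks: float, threshold: float = 1.0) -> str:
    """Label an ortholog pair by selective mode.

    Assumption: the threshold of 1.0 is the textbook convention
    (ratio > 1 suggests positive selection, ratio < 1 negative selection);
    it is not taken from the abstract.
    """
    ratio = ka_ks_ratio(ka, ks)
    if ratio is None:
        return "undefined"
    if ratio > threshold:
        return "PSG"
    if ratio < threshold:
        return "NSG"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical rate values, for illustration only.
    for ka, ks in [(0.02, 0.01), (0.01, 0.05), (0.01, 0.0)]:
        print(ka, ks, classify_gene(ka, ks))
```

Returning "undefined" for Ks = 0 keeps saturated or identical synonymous sites from being silently counted in either class.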
Biewer, M; Lechner, S; Hasselmann, M
2016-01-01
Studying the fate of duplicated genes provides informative insight into the evolutionary plasticity of biological pathways to which they belong. In the paralogous sex-determining genes complementary sex determiner (csd) and feminizer (fem) of honey bee species (genus Apis), only heterozygous csd initiates female development. Here, the full-length coding sequences of the genes csd and fem of the phylogenetically basal dwarf honey bee Apis florea are characterized. Compared with other Apis species, remarkable evolutionary changes in the formation and localization of a protein-interacting (coiled-coil) motif and in the amino acids coding for the csd characteristic hypervariable region (HVR) are observed. Furthermore, functionally different csd alleles were isolated as genomic fragments from a random population sample. In the predicted potential specifying domain (PSD), a high ratio of πN/πS=1.6 indicated positive selection, whereas signs of balancing selection, commonly found in other Apis species, are missing. Low nucleotide diversity on synonymous and genome-wide, non-coding sites as well as site frequency analyses indicated a strong impact of genetic drift in A. florea, likely linked to its biology. Along the evolutionary trajectory of ~30 million years of csd evolution, episodic diversifying selection seems to have acted differently among distinct Apis branches. Consistently low amino-acid differences within the PSD among pairs of functional heterozygous csd alleles indicate that the HVR is the most important region for determining allele specificity. We propose that in the early history of the lineage-specific fem duplication giving rise to csd in Apis, A. florea csd stands as a remarkable example for the plasticity of initial sex-determining signals.
Transformation and inheritance of a hygromycin phosphotransferase gene in maize plants.
Walters, D A; Vetsch, C S; Potts, D E; Lundquist, R C
1992-01-01
Embryogenic maize (Zea mays L.) callus cultures were transformed by microprojectile bombardment with a chimeric hygromycin phosphotransferase (HPT) gene and three transformed lines were obtained by selecting for hygromycin resistance. All lines contained one or a few copies of the intact HPT coding sequence. Fertile, transgenic plants were regenerated and the transmission of the chimeric gene was demonstrated through two complete generations. One line inherited the gene in the manner expected for a single, dominant locus, whereas two did not.
Positive selection in the SLC11A1 gene in the family Equidae.
Bayerova, Zuzana; Janova, Eva; Matiasovic, Jan; Orlando, Ludovic; Horin, Petr
2016-05-01
Immunity-related genes are a suitable model for studying effects of selection at the genomic level. Some of them are highly conserved due to functional constraints and purifying selection, while others are variable and change quickly to cope with the variation of pathogens. The SLC11A1 gene encodes a transporter protein mediating antimicrobial activity of macrophages. Little is known about the patterns of selection shaping this gene during evolution. Although it is a typical evolutionarily conserved gene, functionally important polymorphisms associated with various diseases were identified in humans and other species. We analyzed the genomic organization, genetic variation, and evolution of the SLC11A1 gene in the family Equidae to identify patterns of selection within this important gene. Nucleotide SLC11A1 sequences were shown to be highly conserved in ten equid species, with more than 97% sequence identity across the family. Single nucleotide polymorphisms (SNPs) were found in the coding and noncoding regions of the gene. Seven codon sites were identified to be under strong purifying selection. Codons located in three regions, including the glycosylated extracellular loop, were shown to be under diversifying selection. A 3-bp indel resulting in a deletion of amino acid 321 in the predicted protein was observed in all horses, while the codon has been maintained in all other equid species. This codon, which lies within an N-glycosylation site, was found to be under positive selection. Interspecific variation in the presence of predicted N-glycosylation sites was observed.
Singh, Kh Dhanachandra; Karthikeyan, Muthusamy
2014-12-01
The renin-angiotensin-aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS have played a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. The identification of such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Analyzing the huge dataset of SNPs, which contains both functional and neutral SNPs, is challenging when every SNP must be tested experimentally to determine its biological significance. To explore the functional significance of genetic mutations (SNPs), we adopted a combined sequence- and sequence-structure-based SNP analysis algorithm. Out of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region, with the remainder in the non-coding region. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and it shows a high RMSD from the native structure. Based on these analyses, we identified two SNPs in the REN gene, eight in the AGT gene, three in the ACE gene, two in the AT1R gene, three in the CYP11B2 gene and three in the CMA1 gene as deleterious coding-region SNPs. This type of study will help reduce the cost and time of identifying potential SNPs and help select candidate SNPs from the SNP pool for experimental study.
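The reporting rule in this abstract (deleterious only when three or more tools agree and the RMSD from the native structure is high) can be sketched as a simple consensus filter. Everything specific below is an assumption for illustration: the abstract names neither the tools nor a numeric RMSD cutoff, so the tool names, the 2.0 Å threshold and the `SnpCall` record are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SnpCall:
    snp_id: str
    # Tool name -> True if that tool predicts the SNP to be deleterious.
    tool_calls: Dict[str, bool]
    # RMSD of the mutant model from the native structure, in angstroms.
    rmsd: float


def is_deleterious(call: SnpCall, min_tools: int = 3,
                   rmsd_cutoff: float = 2.0) -> bool:
    """Apply the consensus rule: enough tool votes AND a high RMSD.

    Assumption: rmsd_cutoff=2.0 is a placeholder; the abstract only
    says "high RMSD" without giving a value.
    """
    votes = sum(1 for predicted in call.tool_calls.values() if predicted)
    return votes >= min_tools and call.rmsd >= rmsd_cutoff


def filter_deleterious(calls: List[SnpCall]) -> List[str]:
    """Return the IDs of SNPs that pass the consensus rule."""
    return [c.snp_id for c in calls if is_deleterious(c)]


if __name__ == "__main__":
    # Invented records with placeholder tool names.
    calls = [
        SnpCall("snp_1", {"tool_a": True, "tool_b": True, "tool_c": True}, 2.6),
        SnpCall("snp_2", {"tool_a": True, "tool_b": True, "tool_c": False}, 3.1),
        SnpCall("snp_3", {"tool_a": True, "tool_b": True, "tool_c": True}, 0.4),
    ]
    print(filter_deleterious(calls))
```

Requiring both conditions is stricter than either alone, which matches the abstract's wording that only SNPs meeting the tool consensus and the structural criterion are reported.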
The complete mitochondrial genome of the ice pigeon (Columba livia breed ice).
Zhang, Rui-Hua; He, Wen-Xiao
2015-02-01
The ice pigeon is a breed of fancy pigeon developed over many years of selective breeding. In the present work, we report the complete mitochondrial genome sequence of ice pigeon for the first time. The total length of the mitogenome was 17,236 bp with the base composition of 30.2% for A, 24.0% for T, 31.9% for C, and 13.9% for G and an A-T (54.2 %)-rich feature was detected. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region (D-loop region). The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of ice pigeon would serve as an important data set of the germplasm resources for further study.
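Base composition and A-T content of the kind reported here are simple per-base frequencies. The sketch below shows one way to compute them from a sequence string; the function names are illustrative, and the choice to ignore non-ACGT symbols (such as N) is an assumption, not something the abstract specifies.

```python
from collections import Counter
from typing import Dict


def base_composition(seq: str) -> Dict[str, float]:
    """Return the percentage of A, C, G and T in a DNA sequence.

    Assumption: ambiguous symbols (e.g. N) are excluded from the total.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """Return the combined A + T percentage."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


if __name__ == "__main__":
    # Toy sequence, not pigeon mitogenome data.
    toy = "AATTCCGG"
    print(base_composition(toy))
    print(at_content(toy))
```

As a consistency check on the reported figures, the stated A (30.2%) and T (24.0%) values sum to the stated A-T content of 54.2%.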
The complete mitochondrial genome of the Jacobin pigeon (Columba livia breed Jacobin).
He, Wen-Xiao; Jia, Jin-Feng
2015-06-01
The Jacobin is a breed of fancy pigeon developed over many years of selective breeding that originated in Asia. In the present work, we report the complete mitochondrial genome sequence of Jacobin pigeon for the first time. The total length of the mitogenome was 17,245 bp with the base composition of 30.18% for A, 23.98% for T, 31.88% for C, and 13.96% for G and an A-T (54.17 %)-rich feature was detected. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region. The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of Jacobin pigeon would serve as an important data set of the germplasm resources for further study.
The complete mitochondrial genome of the Fancy Pigeon, Columba livia (Columbiformes: Columbidae).
Zhang, Rui-Hua; Xu, Ming-Ju; Wang, Cun-Lian; Xu, Tong; Wei, Dong; Liu, Bao-Jian; Wang, Guo-Hua
2015-02-01
The fancy pigeons are domesticated varieties of the rock pigeon developed over many years of selective breeding. In the present work, we report the complete mitochondrial genome sequence of fancy pigeon for the first time. The total length of the mitogenome was 17,233 bp with the base composition of 30.1% for A, 24.0% for T, 31.9% for C, and 14.0% for G and an A-T (54.2 %)-rich feature was detected. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region (D-loop region). The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of fancy pigeon would serve as an important data set of the germplasm resources for further study.
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Landscape genomics: natural selection drives the evolution of mitogenome in penguins.
Ramos, Barbara; González-Acuña, Daniel; Loyola, David E; Johnson, Warren E; Parker, Patricia G; Massaro, Melanie; Dantas, Gisele P M; Miranda, Marcelo D; Vianna, Juliana A
2018-01-16
Mitochondria play a key role in the balance of energy and heat production, and therefore the mitochondrial genome is under natural selection by environmental temperature and food availability, since starvation can generate more efficient coupling of energy production. However, selection over mitochondrial DNA (mtDNA) genes has usually been evaluated at the population level. We sequenced by NGS 12 mitogenomes and with four published genomes, assessed genetic variation in ten penguin species distributed from the equator to Antarctica. Signatures of selection of 13 mitochondrial protein-coding genes were evaluated by comparing among species within and among genera (Spheniscus, Pygoscelis, Eudyptula, Eudyptes and Aptenodytes). The genetic data were correlated with environmental data obtained through remote sensing (sea surface temperature [SST], chlorophyll levels [Chl] and a combination of SST and Chl [COM]) through the distribution of these species. We identified the complete mtDNA genomes of several penguin species, including ND6 and 8 tRNAs on the light strand and 12 protein coding genes, 14 tRNAs and two rRNAs positioned on the heavy strand. The highest diversity was found in NADH dehydrogenase genes and the lowest in COX genes. The lowest evolutionary divergence among species was between Humboldt (Spheniscus humboldti) and Galapagos (S. mendiculus) penguins (0.004), while the highest was observed between little penguin (Eudyptula minor) and Adélie penguin (Pygoscelis adeliae) (0.097). We identified a signature of purifying selection (Ka/Ks < 1) across the mitochondrial genome, which is consistent with the hypothesis that purifying selection is constraining mitogenome evolution to maintain Oxidative phosphorylation (OXPHOS) proteins and functionality. Pairwise species maximum-likelihood analyses of selection at codon sites suggest positive selection has occurred on ATP8 (Fixed-Effects Likelihood, FEL) and ND4 (Single Likelihood Ancestral Counting, SLAC) in all penguins. In contrast, COX1 had a signature of strong negative selection. ND4 Ka/Ks ratios were highly correlated with SST (Mantel, p-value: 0.0001; GLM, p-value: 0.00001) and thus may be related to climate adaptation throughout penguin speciation. These results identify mtDNA candidate genes under selection which could be involved in broad-scale adaptations of penguins to their environment. Such knowledge may be particularly useful for developing predictive models of how these species may respond to severe climatic changes in the future.
D'Onofrio, Giuseppe; Ghosh, Tapash Chandra
2005-01-17
Fluctuations and increments of both C(3) and G(3) levels along the human coding sequences were investigated comparing two sets of Xenopus/human orthologous genes. The first set of genes shows minor differences of the GC(3) levels, the second shows considerable increments of the GC(3) levels in the human genes. In both data sets, the fluctuations of C(3) and G(3) levels along the coding sequences correlated with the secondary structures of the encoded proteins. The human genes that underwent the compositional transition showed a different increment of the C(3) and G(3) levels within and among the structural units of the proteins. The relative synonymous codon usage (RSCU) of several amino acids were also affected during the compositional transition, showing that there exists a correlation between RSCU and protein secondary structures in human genes. The importance of natural selection for the formation of isochore organization of the human genome has been discussed on the basis of these results.
Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia
2015-01-01
Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020
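The abstract says the combinatorial effects are assessed "by enrichment analysis" without naming the test. One common choice for GO or KEGG enrichment is the one-sided hypergeometric test, sketched below as an assumption rather than as Co-LncRNA's actual method; the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: background protein-coding genes considered
    annotated:  background genes carrying the GO term / KEGG pathway
    sample:     co-expressed genes being tested
    overlap:    co-expressed genes that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")
    max_overlap = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(hypergeom_enrichment_p(population=20000, annotated=150,
                                 sample=300, overlap=10))
```

In practice the p-values from many terms would also need multiple-testing correction, which this sketch leaves out.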
2014-01-01
Background Locating the protein-coding genes in novel genomes is essential to understanding and exploiting the genomic information but it is still difficult to accurately predict all the genes. The recent availability of detailed information about transcript structure from high-throughput sequencing of messenger RNA (RNA-Seq) delineates many expressed genes and promises increased accuracy in gene prediction. Computational gene predictors have been intensively developed for and tested in well-studied animal genomes. Hundreds of fungal genomes are now or will soon be sequenced. The differences of fungal genomes from animal genomes and the phylogenetic sparsity of well-studied fungi call for gene-prediction tools tailored to them. Results SnowyOwl is a new gene prediction pipeline that uses RNA-Seq data to train and provide hints for the generation of Hidden Markov Model (HMM)-based gene predictions and to evaluate the resulting models. The pipeline has been developed and streamlined by comparing its predictions to manually curated gene models in three fungal genomes and validated against the high-quality gene annotation of Neurospora crassa; SnowyOwl predicted N. crassa genes with 83% sensitivity and 65% specificity. SnowyOwl gains sensitivity by repeatedly running the HMM gene predictor Augustus with varied input parameters and selectivity by choosing the models with best homology to known proteins and best agreement with the RNA-Seq data. Conclusions SnowyOwl efficiently uses RNA-Seq data to produce accurate gene models in both well-studied and novel fungal genomes. The source code for the SnowyOwl pipeline (in Python) and a web interface (in PHP) is freely available from http://sourceforge.net/projects/snowyowl/. PMID:24980894
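The sensitivity and specificity figures quoted for N. crassa can be read as set comparisons between predicted and reference gene models. The sketch below assumes the convention often used in gene-prediction evaluation, where "specificity" means the fraction of predicted genes that are correct (that is, precision); the abstract does not say which definition SnowyOwl applies, and the gene identifiers are invented.

```python
from typing import Set, Tuple


def evaluate_predictions(predicted: Set[str],
                         reference: Set[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a set of predicted gene models.

    sensitivity = TP / (TP + FN): share of reference genes recovered.
    specificity = TP / (TP + FP): share of predictions that are correct.
    Assumption: the second definition is precision-style, as is common
    in gene-finder benchmarks; it is not confirmed by the abstract.
    """
    true_pos = len(predicted & reference)
    false_pos = len(predicted - reference)
    false_neg = len(reference - predicted)
    sens = true_pos / (true_pos + false_neg) if reference else 0.0
    spec = true_pos / (true_pos + false_pos) if predicted else 0.0
    return sens, spec


if __name__ == "__main__":
    # Hypothetical gene model identifiers.
    predicted = {"g1", "g2", "g3"}
    reference = {"g2", "g3", "g4", "g5"}
    print(evaluate_predictions(predicted, reference))
```

Treating each gene model as matched or unmatched is a simplification; real evaluations usually score matches at the exon or nucleotide level as well.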
Modulation of Gene Expression in Contextual Fear Conditioning in the Rat
Macchi, Monica; Ciampini, Cristina; Bernardi, Rodolfo; Baldi, Elisabetta; Bucherelli, Corrado; Brunelli, Marcello; Scuri, Rossana
2013-01-01
In contextual fear conditioning (CFC) a single training leads to long-term memory of context-aversive electrical foot-shocks association. Mid-temporal regions of the brain of trained and naive rats were obtained 2 days after conditioning and screened by two-directional suppression subtractive hybridization. A pool of differentially expressed genes was identified and some of them were randomly selected and confirmed with qRT-PCR assay. These transcripts showed high homology for rat gene sequences coding for proteins involved in different cellular processes. The expression of the selected transcripts was also tested in rats which had freely explored the experimental apparatus (exploration) and in rats to which the same number of aversive shocks had been administered in the same apparatus, but temporally compressed so as to make the association between painful stimuli and the apparatus difficult (shock-only). Some genes were differentially expressed only in the rats subjected to CFC, others only in exploration or shock-only rats, whereas the genes coding for translocase of outer mitochondrial membrane 20 protein and nardilysin were differentially expressed in both CFC and exploration rats. For example, the expression of stathmin 1, whose transcripts were up-regulated, was also tested to evaluate the transduction and protein localization after conditioning. PMID:24278235
A new mutation identified in SPATA16 in two globozoospermic patients.
ElInati, Elias; Fossard, Camille; Okutman, Ozlem; Ghédir, Houda; Ibala-Romdhane, Samira; Ray, Pierre F; Saad, Ali; Hennebicq, Sylvianne; Viville, Stéphane
2016-06-01
The aim of this study is to identify potential genes involved in human globozoospermia. Nineteen globozoospermic patients (previously screened for DPY19L2 mutations with no causative mutation) were recruited in this study and screened for mutations in genes implicated in human globozoospermia, SPATA16 and PICK1. Using the candidate gene approach and the determination of Spata16 partners by Glutathione S-transferase (GST) pull-down, four genes were also selected and screened for mutations. We identified a novel mutation of SPATA16: deletion of 22.6 kb encompassing the first coding exon in two unrelated Tunisian patients who presented the same deletion breakpoints. The two patients shared the same haplotype, suggesting a possible ancestral founder effect for this new deletion. Four genes were selected using the candidate gene approach and the GST pull-down (GOPC, PICK1, AGFG1 and IRGC) and were screened for mutation, but no variation was identified. The present study confirms the pathogenicity of the SPATA16 mutations. The fact that no variation was detected in the coding sequence of AGFG1, GOPC, PICK1 and IRGC does not mean that they are not involved in human globozoospermia. A larger globozoospermic cohort must be studied in order to accelerate the process of identifying new genes involved in such phenotypes. Until sufficient numbers of patients have been screened, AGFG1, GOPC, PICK1 and IRGC should still be considered as candidate genes.
Bayesian variable selection for post-analytic interrogation of susceptibility loci.
Chen, Siying; Nunez, Sara; Reilly, Muredach P; Foulkes, Andrea S
2017-06-01
Understanding the complex interplay among protein coding genes and regulatory elements requires rigorous interrogation with analytic tools designed for discerning the relative contributions of overlapping genomic regions. To this aim, we offer a novel application of Bayesian variable selection (BVS) for classifying genomic class level associations using existing large meta-analysis summary level resources. This approach is applied using the expectation maximization variable selection (EMVS) algorithm to typed and imputed SNPs across 502 protein coding genes (PCGs) and 220 long intergenic non-coding RNAs (lncRNAs) that overlap 45 known loci for coronary artery disease (CAD) using publicly available Global Lipids Genetics Consortium (GLGC) (Teslovich et al., 2010; Willer et al., 2013) meta-analysis summary statistics for low-density lipoprotein cholesterol (LDL-C). The analysis reveals 33 PCGs and three lncRNAs across 11 loci with >50% posterior probabilities for inclusion in an additive model of association. The findings are consistent with previous reports, while providing some new insight into the architecture of LDL-cholesterol to be investigated further. As genomic taxonomies continue to evolve, additional classes such as enhancer elements and splicing regions, can easily be layered into the proposed analysis framework. Moreover, application of this approach to alternative publicly available meta-analysis resources, or more generally as a post-analytic strategy to further interrogate regions that are identified through single point analysis, is straightforward. All coding examples are implemented in R version 3.2.1 and provided as supplemental material. © 2016, The International Biometric Society.
Goz, Eli; Zafrir, Zohar; Tuller, Tamir
2018-04-30
Understanding how viruses co-evolve with their hosts and adapt various genomic level strategies in order to ensure their fitness may have essential implications in unveiling the secrets of viral evolution, and in developing new vaccines and therapeutic approaches. Here, based on a novel genomic analysis of 2,625 different viruses and 439 corresponding host organisms, we provide evidence of universal evolutionary selection for high dimensional 'silent' patterns of information hidden in the redundancy of viral genetic code. Our model suggests that long substrings of nucleotides in the coding regions of viruses from all classes, often also repeat in the corresponding viral hosts from all domains of life. Selection for these substrings cannot be explained only by such phenomena as codon usage bias, horizontal gene transfer, and the encoded proteins. Genes encoding structural proteins responsible for building the core of the viral particles were found to include more host-repeating substrings, and these substrings tend to appear in the middle parts of the viral coding regions. In addition, in human viruses these substrings tend to be enriched with motifs related to transcription factors and RNA binding proteins. The host-repeating substrings are possibly related to the evolutionary pressure on the viruses to effectively interact with host's intracellular factors and to efficiently escape from the host's immune system. tamirtul@post.tau.ac.il (TT). Supplementary data are available at Bioinformatics online.
Raju, Hemalatha B.; Tsinoremas, Nicholas F.; Capobianco, Enrico
2016-01-01
Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. 
While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches. PMID:27803687
Identification of Conflicting Selective Effects on Highly Expressed Genes
Higgs, Paul G.; Hao, Weilong; Golding, G. Brian
2007-01-01
Many different selective effects on DNA and proteins influence the frequency of codons and amino acids in coding sequences. Selection is often stronger on highly expressed genes. Hence, by comparing high- and low-expression genes it is possible to distinguish the factors that are selected by evolution. It has been proposed that highly expressed genes should (i) preferentially use codons matching abundant tRNAs (translational efficiency), (ii) preferentially use amino acids with low cost of synthesis, (iii) be under stronger selection to maintain the required amino acid content, and (iv) be selected for translational robustness. These effects act simultaneously and can be contradictory. We develop a model that combines these factors, and use Akaike’s Information Criterion for model selection. We consider pairs of paralogues that arose by whole-genome duplication in Saccharomyces cerevisiae. A codon-based model is used that includes asymmetric effects due to selection on highly expressed genes. The largest effect is translational efficiency, which is found to strongly influence synonymous, but not non-synonymous rates. Minimization of the cost of amino acid synthesis is implicated. However, when a more general measure of selection for amino acid usage is used, the cost minimization effect becomes redundant. Small effects that we attribute to selection for translational robustness can be identified as an improvement in the model fit on top of the effects of translational efficiency and amino acid usage. PMID:19430600
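The model-selection step mentioned in the abstract above can be illustrated with a minimal sketch of Akaike's Information Criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood. The model names, log-likelihoods, and parameter counts below are invented placeholders for illustration only; they are not values from the paper, and the helper names `aic` and `rank_models` are hypothetical.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Rank candidate models by AIC.

    fits maps a model name to (maximized log-likelihood, number of parameters).
    Returns (name, AIC, delta-AIC relative to the best model), best first.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda item: item[1],
    )


# Placeholder fits (assumed numbers, NOT results from the study).
fits = {
    "model_a": (-1050.0, 3),
    "model_b": (-1040.0, 5),
    "model_c": (-1039.5, 7),
}

for name, score, delta in rank_models(fits):
    print(f"{name}: AIC={score:.1f}, delta={delta:.1f}")
```

An added factor is only favoured when the gain in log-likelihood outweighs the penalty of two AIC units per extra parameter, which is how a small effect can still register as a genuine improvement in fit.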
Niu, Ao-lei; Wang, Yin-qiu; Zhang, Hui; Liao, Cheng-hong; Wang, Jin-kai; Zhang, Rui; Che, Jun; Su, Bing
2011-10-12
Homeobox genes are the key regulators during development, and they are in general highly conserved with only a few reported cases of rapid evolution. RHOXF2 is an X-linked homeobox gene in primates. It is highly expressed in the testicle and may play an important role in spermatogenesis. As male reproductive system is often the target of natural and/or sexual selection during evolution, in this study, we aim to dissect the pattern of molecular evolution of RHOXF2 in primates and its potential functional consequence. We studied sequences and copy number variation of RHOXF2 in humans and 16 nonhuman primate species as well as the expression patterns in human, chimpanzee, white-browed gibbon and rhesus macaque. The gene copy number analysis showed that there had been parallel gene duplications/losses in multiple primate lineages. Our evidence suggests that 11 nonhuman primate species have one RHOXF2 copy, and two copies are present in humans and four Old World monkey species, and at least 6 copies in chimpanzees. Further analysis indicated that the gene duplications in primates had likely been mediated by endogenous retrovirus (ERV) sequences flanking the gene regions. In striking contrast to non-human primates, humans appear to have homogenized their two RHOXF2 copies by the ERV-mediated non-allelic recombination mechanism. Coding sequence and phylogenetic analysis suggested multi-lineage strong positive selection on RHOXF2 during primate evolution, especially during the origins of humans and chimpanzees. All the 8 coding region polymorphic sites in human populations are non-synonymous, implying on-going selection. Gene expression analysis demonstrated that besides the preferential expression in the reproductive system, RHOXF2 is also expressed in the brain. The quantitative data suggests expression pattern divergence among primate species. RHOXF2 is a fast-evolving homeobox gene in primates. 
The rapid evolution and copy number changes of RHOXF2 had been driven by Darwinian positive selection acting on the male reproductive system and possibly also on the central nervous system, which sheds light on understanding the role of homeobox genes in adaptive evolution.
LS Bound based gene selection for DNA microarray data.
Zhou, Xin; Mao, K Z
2005-04-15
One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from leave-one-out procedure of LS-SVMs (least squares support vector machines), and as the upper bound for leave-one-out classification results it reflects to some extent the generalization performance of gene subsets. We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9).
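For orientation, the Fisher's ratio named above as a comparison criterion can be sketched as a simple per-gene filter. This is the baseline filter, not the LS Bound measure itself, whose source code is on the companion website. The sketch assumes a two-class problem, uses the common form (mean difference squared over the sum of class variances) with sample variances, and runs on made-up toy data; the function names and gene labels are hypothetical.

```python
from statistics import mean, variance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher's ratio for one gene: (mu_a - mu_b)^2 / (var_a + var_b)."""
    numerator = (mean(class_a) - mean(class_b)) ** 2
    denominator = variance(class_a) + variance(class_b)  # sample variances (assumption)
    if denominator == 0:
        # Degenerate case: no within-class spread at all.
        return 0.0 if numerator == 0 else float("inf")
    return numerator / denominator


def rank_genes(expression, labels):
    """Score every gene and return (gene names best first, scores).

    expression maps gene name -> list of expression values, one per sample.
    labels is a list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        class_a = [v for v, y in zip(values, labels) if y == 0]
        class_b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = fisher_ratio(class_a, class_b)
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy data (invented): gene_x separates the classes, gene_y barely does.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_x": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
    "gene_y": [2.0, 1.0, 3.0, 2.5, 1.5, 3.5],
}

order, scores = rank_genes(expression, labels)
print(order)
```

A filter of this kind scores each gene independently, which keeps it cheap but blind to redundancy among genes; that limitation is the usual motivation for subset-level criteria.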
The Landscape of A-to-I RNA Editome Is Shaped by Both Positive and Purifying Selection
Kong, Yimeng; Pan, Bohu; Chen, Longxian; Wang, Hongbing; Hao, Pei; Li, Xuan
2016-01-01
The hydrolytic deamination of adenosine to inosine (A-to-I editing) in precursor mRNA induces variable gene products at the post-transcription level. How and to what extent A-to-I RNA editing diversifies the transcriptome over the course of evolution is not fully characterized, and very little is known about the selective constraints that drive the evolution of RNA editing events. Here we present a study on A-to-I RNA editing, by generating a global profile of A-to-I editing for a phylogeny of seven Drosophila species, a model system spanning an evolutionary timeframe of approximately 45 million years. Of a total of 9,281 editing events identified, 5,150 (55.5%) are located in the coding sequences (CDS) of 2,734 genes. Phylogenetic analysis places these genes into 1,526 homologous families, about 5% of total gene families in the fly lineages. Based on conservation of the editing sites, the editing events in CDS are categorized into three distinct types, representing events on singleton genes (type I), and events not conserved (type II) or conserved (type III) within multi-gene families. While both type I and II events are subject to purifying selection, notably type III events are positively selected, and highly enriched in the components and functions of the nervous system. The tissue profiles are documented for three editing types, and their critical roles are further implicated by their shifting patterns during holometabolous development and in post-mating response. In conclusion, three A-to-I RNA editing types are found to have distinct evolutionary dynamics. It appears that nervous system functions are mainly tested to determine if an A-to-I editing event is beneficial for an organism. The coding plasticity enabled by A-to-I editing creates a new class of binary variations, which is a superior alternative to maintain heterozygosity of expressed genes in a diploid mating system. PMID:27467689
Identification of coding and non-coding mutational hotspots in cancer genomes.
Piraino, Scott W; Furney, Simon J
2017-01-05
The identification of mutations that play a causal role in tumour development, so called "driver" mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes. To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis. We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. 
This framework identifies known and novel coding and non-coding mutational hotspots and can be used to differentiate candidate driver regions from likely passenger regions susceptible to somatic mutation.
Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.
2015-01-01
Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13 methylation-sensitive microRNAs was identified in temporal lobe epilepsy, including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA was documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. 
PMID:25552301
Characterization of the complete mitochondrial genome of the king pigeon (Columba livia breed king).
Zhang, Rui-Hua; He, Wen-Xiao; Xu, Tong
2015-06-01
The king pigeon is a breed of pigeon developed over many years of selective breeding primarily as a utility breed. In the present work, we report the complete mitochondrial genome sequence of king pigeon for the first time. The total length of the mitogenome was 17,221 bp with the base composition of 30.14% for A, 24.05% for T, 31.82% for C, and 13.99% for G and an A-T (54.22 %)-rich feature was detected. It harbored 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and one non-coding control region (D-loop region). The arrangement of all genes was identical to the typical mitochondrial genomes of pigeon. The complete mitochondrial genome sequence of king pigeon would serve as an important data set of the germplasm resources for further study.
Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars
2017-02-10
The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
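The two sequence statistics discussed above, GC content and relative entropy, can be sketched for a single coding sequence as follows. This is only an illustration under stated assumptions: relative entropy is taken here as the Kullback-Leibler divergence (in bits) between observed overlapping trinucleotide frequencies and the frequencies expected from the product of mononucleotide frequencies. The abstract does not specify these details, so the exact formula, the log base, the use of overlapping windows, and the function names are all assumptions rather than the authors' implementation.

```python
from collections import Counter
from math import log2

BASES = set("ACGT")


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (expects a non-empty sequence)."""
    bases = [b for b in seq.upper() if b in BASES]
    return sum(b in "GC" for b in bases) / len(bases)


def relative_entropy(seq: str) -> float:
    """KL divergence (bits) of observed trinucleotide frequencies from those
    expected under independence of mononucleotides (assumed definition)."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    n_mono = sum(mono.values())
    # Overlapping trinucleotides; windows containing ambiguous bases are skipped.
    tri = Counter(
        seq[i:i + 3] for i in range(len(seq) - 2) if set(seq[i:i + 3]) <= BASES
    )
    n_tri = sum(tri.values())
    divergence = 0.0
    for word, count in tri.items():
        p_obs = count / n_tri
        p_exp = 1.0
        for base in word:
            p_exp *= mono[base] / n_mono
        divergence += p_obs * log2(p_obs / p_exp)
    return divergence


example = "ATGGCGCTGAAAGCGCTGGCGTAA"  # made-up sequence for illustration
print(f"GC content: {gc_content(example):.3f}")
print(f"relative entropy: {relative_entropy(example):.3f} bits")
```

Under this definition a sequence whose trinucleotides are exactly what base composition alone predicts scores zero, and larger values indicate more structure beyond base composition.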
Adaptive response due to changes in gene regulation: a study with Drosophila.
McDonald, J F; Chambers, G K; David, J; Ayala, F J
1977-01-01
In spite of the critical role of the process of adaptation in evolution, there are few detailed studies of the genotypic and molecular basis of the process. Drosophila melanogaster flies selected for increased tolerance to ethanol exhibited higher levels of alcohol dehydrogenase (alcohol:NAD+ oxidoreductase; EC 1.1.1.1) activity than unselected controls. A series of tests (electrophoresis, product inhibition, temperature stability, pH optima, substrate specificity, and Michaelis constants) gave no evidence of structural differences in the enzyme of the selected and the control flies. However, quantitative immunological assays showed that the selected flies contained significantly higher amounts of alcohol dehydrogenase. Adaptation of the selected flies to higher alcohol tolerance has most likely taken place by changes not in the structural gene locus coding for the enzyme, but by regulatory changes affecting the amount of gene product. Images PMID:412190
Bråte, Jon; Adamski, Marcin; Neumann, Ralf S; Shalchian-Tabrizi, Kamran; Adamska, Maja
2015-12-22
Long non-coding RNAs (lncRNAs) play important regulatory roles during animal development, and it has been hypothesized that an RNA-based gene regulation was important for the evolution of developmental complexity in animals. However, most studies of lncRNA gene regulation have been performed using model animal species, and very little is known about this type of gene regulation in non-bilaterians. We have therefore analysed RNA-Seq data derived from a comprehensive set of embryogenesis stages in the calcareous sponge Sycon ciliatum and identified hundreds of developmentally expressed intergenic lncRNAs (lincRNAs) in this species. In situ hybridization of selected lincRNAs revealed dynamic spatial and temporal expression during embryonic development. More than 600 lincRNAs constitute integral parts of differentially expressed gene modules, which also contain known developmental regulatory genes, e.g. transcription factors and signalling molecules. This study provides insights into the non-coding gene repertoire of one of the earliest evolved animal lineages, and suggests that RNA-based gene regulation was probably present in the last common ancestor of animals. © 2015 The Authors.
Yu, Hua; Jiao, Bingke; Lu, Lu; Wang, Pengfei; Chen, Shuangcheng; Liang, Chengzhi; Liu, Wei
2018-01-01
Accurately reconstructing gene co-expression network is of great importance for uncovering the genetic architecture underlying complex and various phenotypes. The recent availability of high-throughput RNA-seq sequencing has made genome-wide detecting and quantifying of the novel, rare and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression network have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building genome-scale and high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed a RNA-seq-based GCN in one species of monocot rice. The quality of network obtained by our method was verified and evaluated by the curated gene functional association data sets, which obviously outperformed each single method. In addition, the powerful capability of network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method to predict the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provided a valuable and highly reliable data source to select key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
2012-01-01
Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's genbank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Bacteriophage 5' untranslated regions for control of plastid transgene expression.
Yang, Huijun; Gray, Benjamin N; Ahner, Beth A; Hanson, Maureen R
2013-02-01
Expression of foreign proteins from transgenes incorporated into plastid genomes requires regulatory sequences that can be recognized by the plastid transcription and translation machinery. Translation signals harbored by the 5' untranslated region (UTR) of plastid transcripts can profoundly affect the level of accumulation of proteins expressed from chimeric transgenes. Both endogenous 5' UTRs and the bacteriophage T7 gene 10 (T7g10) 5' UTR have been found to be effective in combination with particular coding regions to mediate high-level expression of foreign proteins. We investigated whether two other bacteriophage 5' UTRs could be utilized in plastid transgenes by fusing them to the aadA (aminoglycoside-3'-adenyltransferase) coding region that is commonly used as a selectable marker in plastid transformation. Transplastomic plants containing either the T7g1.3 or T4g23 5' UTRs fused to Myc-epitope-tagged aadA were successfully obtained, demonstrating the ability of these 5' UTRs to regulate gene expression in plastids. Placing the Thermobifida fusca cel6A gene under the control of the T7g1.3 or T4g23 5' UTRs, along with a tetC downstream box, resulted in poor expression of the cellulase in contrast with high-level accumulation while using the T7g10 5' UTR. However, transplastomic plants with the bacteriophage 5' UTRs controlling the aadA coding region exhibited fewer undesired recombinant species than plants containing the same marker gene regulated by the Nicotiana tabacum psbA 5' UTR. Furthermore, expression of the T7g1.3 and T4g23 5' UTR::aadA fusions downstream of the cel6A gene provided sufficient spectinomycin resistance to allow selection of homoplasmic transgenic plants and had no effect on Cel6A accumulation.
Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures
Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Park, Seung-Won; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J.; Robson, Bryanne E.; Aerts, Stein; van Helden, Jacques; Hassan, Bassem; Gilbert, Donald G.; Eastman, Deborah A.; Rice, Michael; Weir, Michael; Hahn, Matthew W.; Park, Yongkyu; Dewey, Colin N.; Pachter, Lior; Kent, W. James; Haussler, David; Lai, Eric C.; Bartel, David P.; Hannon, Gregory J.; Kaufman, Thomas C.; Eisen, Michael B.; Clark, Andrew G.; Smith, Douglas; Celniker, Susan E.; Gelbart, William M.; Kellis, Manolis
2008-01-01
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies. PMID:17994088
Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili
2017-01-01
Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45S rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. PMID:28922794
García-Alonso, L; Romani, S; Jiménez, F
2000-12-01
Cell adhesion molecules (CAMs) implement the process of axon guidance by promoting specific selection and attachment to substrates. We show that, in Drosophila, loss-of-function conditions of either the Neuroglian CAM, the FGF receptor coded by the gene heartless, or the EGF receptor coded by DER display a similar phenotype of abnormal substrate selection and axon guidance by peripheral sensory neurons. Moreover, neuroglian loss-of-function phenotype can be suppressed by the expression of gain-of-function conditions of heartless or DER. The results are consistent with a scenario where the activity of these receptor tyrosine kinases is controlled by Neuroglian at choice points where sensory axons select between alternative substrates for extension.
The neutral emergence of error minimized genetic codes superior to the standard genetic code.
Massey, Steven E
2016-11-07
The standard genetic code (SGC) assigns amino acids to codons in such a way that the impact of point mutations is reduced; this is termed 'error minimization' (EM). The occurrence of EM has been attributed to the direct action of selection; however, it is difficult to explain how the searching of alternative codes for an error minimized code can occur via codon reassignments, given that these are likely to be disruptive to the proteome. An alternative scenario is that EM has arisen via the process of genetic code expansion, facilitated by the duplication of genes encoding charging enzymes and adaptor molecules. This is likely to have led to similar amino acids being assigned to similar codons. Strikingly, we show that if during code expansion the most similar amino acid to the parent amino acid, out of the set of unassigned amino acids, is assigned to codons related to those of the parent amino acid, then genetic codes with EM superior to the SGC easily arise. This scheme mimics code expansion via the gene duplication of charging enzymes and adaptors. The result is obtained for a variety of different schemes of genetic code expansion and provides a mechanistically realistic manner in which EM has arisen in the SGC. These observations might be taken as evidence for self-organization in the earliest stages of life. Copyright © 2016 Elsevier Ltd. All rights reserved.
Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.
Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W
2016-08-01
Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Su, Huei-Jiun; Hu, Jer-Ming
2012-01-01
Background and Aims The holoparasitic flowering plant Balanophora displays extreme floral reduction and was previously found to have enormous rate acceleration in the nuclear 18S rDNA region. So far, it remains unclear whether non-ribosomal, protein-coding genes of Balanophora also evolve in an accelerated fashion and whether the genes with high substitution rates retain their functionality. To tackle these issues, six different genes were sequenced from two Balanophora species and their rate variation and expression patterns were examined. Methods Sequences including nuclear PI, euAP3, TM6, LFY and RPB2 and mitochondrial matR were determined from two Balanophora spp. and compared with selected hemiparasitic species of Santalales and autotrophic core eudicots. Gene expression was detected for the six protein-coding genes and the expression patterns of the three B-class genes (PI, AP3 and TM6) were further examined across different organs of B. laxiflora using RT-PCR. Key Results Balanophora mitochondrial matR is highly accelerated in both nonsynonymous (dN) and synonymous (dS) substitution rates, whereas the rate variation of nuclear genes LFY, PI, euAP3, TM6 and RPB2 are less dramatic. Significant dS increases were detected in Balanophora PI, TM6, RPB2 and dN accelerations in euAP3. All of the protein-coding genes are expressed in inflorescences, indicative of their functionality. PI is restrictively expressed in tepals, synandria and floral bracts, whereas AP3 and TM6 are widely expressed in both male and female inflorescences. Conclusions Despite the observation that rates of sequence evolution are generally higher in Balanophora than in hemiparasitic species of Santalales and autotrophic core eudicots, the five nuclear protein-coding genes are functional and are evolving at a much slower rate than 18S rDNA. 
The mechanism or mechanisms responsible for rapid sequence evolution and concomitant rate acceleration for 18S rDNA and matR are currently not well understood and require further study in Balanophora and other holoparasites. PMID:23041381
He, Peng; Huang, Sheng; Xiao, Guanghui; Zhang, Yuzhou; Yu, Jianing
2016-12-01
RNA editing is a posttranscriptional modification process that alters the RNA sequence so that it deviates from the genomic DNA sequence. RNA editing mainly occurs in chloroplast and mitochondrial genomes, and the number of editing sites varies in terrestrial plants. Why and how RNA editing systems evolved remains a mystery. Ginkgo biloba is one of the oldest seed plants and has an important evolutionary position. Determining the patterns and distribution of RNA editing in this ancient plant provides insights into the evolutionary trend of RNA editing and helps us to further understand its biological significance. In this paper, we investigated 82 protein-coding genes in the chloroplast genome of G. biloba and identified 255 editing sites, which is the highest number of RNA editing events reported in a gymnosperm. All of the editing sites were C-to-U conversions, which mainly occurred in the second codon position, were biased towards the U_A context, and caused an increase in hydrophobic amino acids. RNA editing could change the secondary structures of 82 proteins, and create or eliminate a transmembrane region in five proteins as determined in silico. Finally, the evolutionary tendencies of RNA editing in different gene groups were estimated using the nonsynonymous-synonymous substitution rate selection mode. The G. biloba chloroplast genome possesses the highest number of RNA editing events reported so far in a seed plant. Most of the RNA editing sites can restore amino acid conservation, increase hydrophobicity, and even influence protein structures. Similar purifying selections constitute the dominant evolutionary force at the editing sites of essential genes, such as the psa, some psb and pet groups, and a positive selection occurred in the editing sites of nonessential genes, such as most ndh and a few psb genes.
Evolution of the viral hemorrhagic septicemia virus: divergence, selection and origin.
He, Mei; Yan, Xue-Chun; Liang, Yang; Sun, Xiao-Wen; Teng, Chun-Bo
2014-08-01
Viral hemorrhagic septicemia virus (VHSV) is an economically significant rhabdovirus that affects an increasing number of freshwater and marine fish species. Extensive studies have been conducted on the molecular epizootiology, genetic diversity, and phylogeny of VHSV. However, there are discrepancies between the reported estimates of the nucleotide substitution rate for the G gene and the divergence times for the genotypes. Herein, Bayesian coalescent analyses were conducted on the time-stamped entire coding sequences of the six VHSV genes. Rate estimates based on the G gene indicated that the marine genotypes/subtypes might not all evolve slower than their major European freshwater counterpart. Age calculations on the six genes revealed that the first bifurcation event of the analyzed isolates might have taken place within the last 300 years, which was much younger than previously thought. Selection analyses suggested that two codons of the G gene might be positively selected. Surveys of codon usage bias showed that the P, M and NV genes exhibited genotype-specific variations. Furthermore, we proposed that VHSV originated from the Pacific Northwest of North America. Copyright © 2014 Elsevier Inc. All rights reserved.
Rapid Detection of Positive Selection in Genes and Genomes Through Variation Clusters
Wagner, Andreas
2007-01-01
Positive selection in genes and genomes can point to the evolutionary basis for differences among species and among races within a species. The detection of positive selection can also help identify functionally important protein regions and thus guide protein engineering. Many existing tests for positive selection are excessively conservative, vulnerable to artifacts caused by demographic population history, or computationally very intensive. I here propose a simple and rapid test that is complementary to existing tests and that overcomes some of these problems. It relies on the null hypothesis that neutrally evolving DNA regions should show a Poisson distribution of nucleotide substitutions. The test detects significant deviations from this expectation in the form of variation clusters, highly localized groups of amino acid changes in a coding region. In applying this test to several thousand human–chimpanzee gene orthologs, I show that such variation clusters are not generally caused by relaxed selection. They occur in well-defined domains of a protein's tertiary structure and show a large excess of amino acid replacement over silent substitutions. I also identify multiple new human–chimpanzee orthologs subject to positive selection, among them genes that are involved in reproductive functions, immune defense, and the nervous system. PMID:17603100
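The Poisson null hypothesis described in this abstract can be illustrated with a minimal sketch. This is not the author's published procedure: the window-scanning scheme, the function names `poisson_sf` and `densest_window`, and the example positions are all assumptions made for illustration, and no correction for multiple windows is applied. The sketch asks how surprising the densest cluster of substitutions would be if substitutions fell uniformly along the sequence.

```python
import math

def poisson_sf(k: int, mu: float) -> float:
    """Upper tail P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)      # P(X = 0)
    cdf = term
    for i in range(1, k):     # accumulate P(X = 1) .. P(X = k - 1)
        term *= mu / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def densest_window(positions, length, window):
    """Find the window [start, start + window) holding the most substitutions.

    Returns (start, count, p), where p is the Poisson upper-tail probability
    of seeing at least `count` substitutions in one window under a uniform rate.
    Hypothetical helper: uncorrected for the number of windows examined.
    """
    pos = sorted(positions)
    mu = len(pos) * window / length   # expected substitutions per window
    best = (0, 0, 1.0)
    j = 0
    for i, start in enumerate(pos):
        # advance j to the first position outside the current window
        while j < len(pos) and pos[j] < start + window:
            j += 1
        count = j - i
        if count > best[1]:
            best = (start, count, poisson_sf(count, mu))
    return best

# Made-up substitution positions in a 1000-site region
start, count, p = densest_window([10, 12, 15, 17, 300, 700], length=1000, window=10)
print(start, count, p)
```

Only windows that begin at a substitution are examined, which is sufficient to find the maximum count; a real analysis would also need to account for testing many windows.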
Antimicrobial peptide evolution in the Asiatic honey bee Apis cerana.
Xu, Peng; Shi, Min; Chen, Xue-Xin
2009-01-01
The Asiatic honeybee, Apis cerana Fabricius, is an important honeybee species in Asian countries. It is still found in the wild, but is also one of the few bee species that can be domesticated. It has acquired some genetic advantages and significantly different biological characteristics compared with other Apis species. However, it has been less studied, and over the past two decades, has become a threatened species in China. We designed primers for the sequences of the four antimicrobial peptide cDNA gene families (abaecin, defensin, apidaecin, and hymenoptaecin) of the Western honeybee, Apis mellifera L. and identified all the antimicrobial peptide cDNA genes in the Asiatic honeybee for the first time. All the sequences were amplified by reverse transcriptase-polymerase chain reaction (RT-PCR). In all, 29 different defensin cDNA genes coding 7 different defensin peptides, 11 different abaecin cDNA genes coding 2 different abaecin peptides, 13 different apidaecin cDNA genes coding 4 apidaecin peptides and 34 different hymenoptaecin cDNA genes coding 13 different hymenoptaecin peptides were cloned and identified from the Asiatic honeybee adult workers. Detailed comparison of these four antimicrobial peptide gene families with those of the Western honeybee revealed that there are many similarities in the quantity and amino acid components of peptides in the abaecin, defensin and apidaecin families, while many more hymenoptaecin peptides are found in the Asiatic honeybee than those in the Western honeybee (13 versus 1). The results indicated that the Asiatic honeybee adult generated more variable antimicrobial peptides, especially hymenoptaecin peptides than the Western honeybee when stimulated by pathogens or injury. This suggests that, compared to the Western honeybee that has a longer history of domestication, selection on the Asiatic honeybee has favored the generation of more variable antimicrobial peptides as protection against pathogens.
Adaptive Evolution Is Substantially Impeded by Hill–Robertson Interference in Drosophila
Castellano, David; Coronado-Zamora, Marta; Campos, Jose L.; Barbadilla, Antonio; Eyre-Walker, Adam
2016-01-01
Hill–Robertson interference (HRi) is expected to reduce the efficiency of natural selection when two or more linked selected sites do not segregate freely, but no attempt has been made so far to quantify the overall impact of HRi on the rate of adaptive evolution for any given genome. In this work, we estimate how much HRi impedes the rate of adaptive evolution in the coding genome of Drosophila melanogaster. We compiled a data set of 6,141 autosomal protein-coding genes from Drosophila, from which polymorphism levels in D. melanogaster and divergence out to D. yakuba were estimated. The rate of adaptive evolution was calculated using a derivative of the McDonald–Kreitman test that controls for slightly deleterious mutations. We find that the rate of adaptive amino acid substitution at a given position of the genome is positively correlated to both the rate of recombination and the mutation rate, and negatively correlated to the gene density of the region. These correlations are robust to controlling for each other, for synonymous codon bias and for gene functions related to immune response and testes. We show that HRi diminishes the rate of adaptive evolution by approximately 27%. Interestingly, genes with low mutation rates embedded in gene poor regions lose approximately 17% of their adaptive substitutions whereas genes with high mutation rates embedded in gene rich regions lose approximately 60%. We conclude that HRi hampers the rate of adaptive evolution in Drosophila and that the variation in recombination, mutation, and gene density along the genome affects the HRi effect. PMID:26494843
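For orientation, the classic McDonald–Kreitman estimate of the proportion of adaptive substitutions can be sketched as follows. This is the textbook estimator, alpha = 1 - (Ds * Pn) / (Dn * Ps), not the derivative used in the study, which additionally controls for slightly deleterious mutations. The function name `mk_alpha` and the counts in the example are hypothetical.

```python
def mk_alpha(dn: float, ds: float, pn: float, ps: float) -> float:
    """Classic McDonald-Kreitman estimate of the adaptive fraction alpha.

    dn, ds: nonsynonymous and synonymous divergence (fixed differences)
    pn, ps: nonsynonymous and synonymous polymorphism counts
    Note: this simple form ignores slightly deleterious mutations.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)

# Made-up counts: 40 nonsynonymous and 100 synonymous fixed differences,
# 20 nonsynonymous and 100 synonymous polymorphisms.
print(mk_alpha(dn=40, ds=100, pn=20, ps=100))
```

With these illustrative counts, half of the nonsynonymous substitutions would be attributed to adaptive evolution; segregating slightly deleterious variants inflate Pn and bias this simple estimate downward, which is why corrected derivatives exist.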
Herbert, Kristina M.; Nag, Anita
2016-01-01
Viral infection initiates an array of changes in host gene expression. Many viruses dampen host protein expression and attempt to evade the host anti-viral defense machinery. Host gene expression is suppressed at several stages of host messenger RNA (mRNA) formation, including selective degradation of translationally competent mRNAs. Besides mRNAs, host cells also express a variety of noncoding RNAs, including small RNAs, that may also be subject to inhibition upon viral infection. In this review, we focus on the different ways viruses antagonize coding and noncoding RNAs in the host cell to their advantage. PMID:27271653
Crosley, E J; Elliot, M G; Christians, J K; Crespi, B J
2013-02-01
Recent evidence from chimpanzees and gorillas has raised doubts that preeclampsia is a uniquely human disease. The deep extravillous trophoblast (EVT) invasion and spiral artery remodeling that characterizes our placenta (and is abnormal in preeclampsia) is shared within great apes, setting Homininae apart from Hylobatidae and Old World Monkeys, which show much shallower trophoblast invasion and limited spiral artery remodeling. We hypothesize that the evolution of a more invasive placenta in the lineage ancestral to the great apes involved positive selection on genes crucial to EVT invasion and spiral artery remodeling. Furthermore, identification of placentally-expressed genes under selection in this lineage may identify novel genes involved in placental development. We tested for positive selection in approximately 18,000 genes using the ratio of non-synonymous to synonymous amino acid substitution for protein-coding DNA. DAVID Bioinformatics Resources identified biological processes enriched in positively selected genes, including processes related to EVT invasion and spiral artery remodeling. Analyses revealed 295 and 264 genes under significant positive selection on the branches ancestral to Hominidae (Human, Chimp, Gorilla, Orangutan) and Homininae (Human, Chimp, Gorilla), respectively. Gene ontology analysis of these gene sets demonstrated significant enrichments for several functional gene clusters relevant to preeclampsia risk, and sets of placentally-expressed genes that have been linked with preeclampsia and/or trophoblast invasion in other studies. Our study represents a novel approach to the identification of candidate genes and amino acid residues involved in placental pathologies by implicating them in the evolution of highly-invasive placenta. Copyright © 2012 Elsevier Ltd. All rights reserved.
Mitochondrial genome sequence of Egyptian swift Rock Pigeon (Columba livia breed Egyptian swift).
Li, Chun-Hong; Shi, Wei; Shi, Wan-Yu
2015-06-01
The Egyptian swift Rock Pigeon is a breed of fancy pigeon developed over many years of selective breeding. In this work, we report the complete mitochondrial genome sequence of Egyptian swift Rock Pigeon. The total length of the mitogenome was 17,239 bp and its overall base composition was estimated to be 30.2% for A, 24.0% for T, 31.9% for C and 13.9% for G, indicating an A-T (54.2%)-rich feature in the mitogenome. It contained the typical structure of 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a non-coding control region (D-loop region). The complete mitochondrial genome sequence of Egyptian swift Rock Pigeon would serve as an important data set of the germplasm resources for further study.
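The base-composition figures in this abstract are simple arithmetic, and a minimal sketch can show how such percentages are derived and checked. The helper names `base_composition` and `at_content` are hypothetical, and the short test sequence is made up; only the four reported percentages come from the abstract.

```python
from collections import Counter

def base_composition(seq: str) -> dict:
    """Percentage of A, T, C and G in a sequence (other symbols are ignored)."""
    counts = Counter(b for b in seq.upper() if b in "ATCG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/C/G")
    return {b: 100.0 * counts[b] / total for b in "ATCG"}

def at_content(comp: dict) -> float:
    """A+T percentage from a composition dictionary."""
    return comp["A"] + comp["T"]

# Percentages reported in the abstract
reported = {"A": 30.2, "T": 24.0, "C": 31.9, "G": 13.9}
print(round(at_content(reported), 1))       # A+T content
print(round(sum(reported.values()), 1))     # should total 100
print(base_composition("AATC"))             # toy sequence
```

The reported A and T values sum to the 54.2% A-T content stated in the abstract, and the four percentages total 100%.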
Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji
2007-01-01
We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932
Liu, Feiling; Guo, Dianhao; Yuan, Zhuting; Chen, Chen; Xiao, Huamei
2017-11-20
Long non-coding RNA (lncRNA) is a class of noncoding RNA >200 bp in length that has essential roles in regulating a variety of biological processes. Here, we constructed a computational pipeline to identify lncRNA genes in the diamondback moth (Plutella xylostella), a major insect pest of cruciferous vegetables. In total, 3,324 lncRNAs corresponding to 2,475 loci were identified from 13 RNA-Seq datasets, including samples from parasitized, insecticide-resistant strains and different developmental stages. The identified P. xylostella lncRNAs had shorter transcripts and fewer exons than protein-coding genes. Seven out of nine randomly selected lncRNAs were validated by strand-specific RT-PCR. In total, 54-172 lncRNAs were specifically expressed in the insecticide resistant strains, among which one lncRNA was located adjacent to the sodium channel gene. In addition, 63-135 lncRNAs were specifically expressed in different developmental stages, among which three lncRNAs overlapped or were located adjacent to the metamorphosis-associated genes. These lncRNAs were either strongly or weakly co-expressed with their overlapping or neighboring mRNA genes. In summary, we identified thousands of lncRNAs and presented evidence that lncRNAs might have key roles in conferring insecticide resistance and regulating the metamorphosis development in P. xylostella.
Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J
2007-06-01
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
APPRIS 2017: principal isoforms for multiple gene sets
Rodriguez-Rivas, Juan; Di Domenico, Tomás; Vázquez, Jesús; Valencia, Alfonso
2018-01-01
The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the ‘principal’ isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein coding genes and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants. PMID:29069475
The Center for Regenerative Biology and Medicine at Mount Desert Island Biological Laboratory
2013-06-01
system through in vivo disruption of gene function. 15. SUBJECT TERMS limb regeneration Positional Memory Code Axolotl ...another selection factor to identify those genes that are similarly controlled in both Polypterus and axolotl samples. These comparisons revealed a...sequence IDs among Axolotl and Polypterus contigs that were up-regulated and down regulated greater than 2-fold between 0 and 7 dpa. (Left) The
Altered transcription of inflammation-related genes in dental pulp of coeliac children.
Bossù, Maurizio; Montuori, Monica; Casani, Daniela; Di Giorgio, Gianni; Pacifici, Andrea; Ladniak, Barbara; Polimeni, Antonella
2016-09-01
Coeliac disease is a chronic small intestinal immune-mediated enteropathy precipitated by exposure to dietary gluten, and possible relationships between coeliac disease and dental pathogenic conditions during childhood have been poorly investigated. The dental pulp plays a pivotal role in the immune defence against possible entry of pathogens from teeth, and the aim of this work was to investigate quantitative transcription levels of selected genes (IL-9, IL-11, IL-15, IL-18, IL-21, IL-27, MICA, IFN-γ) coding for pro-inflammatory immune innate activities in the pulp of primary teeth from healthy children and children with coeliac disease. The pulp from primary teeth of 10 healthy children and 10 children with coeliac disease was used to extract RNA and prepare cDNA for quantitative PCR transcription analysis employing commercial nucleotide probes for selected genes. In children with coeliac disease, the genes coding for pro-inflammatory cytokines IFN-γ, IL-11, IL-18, and IL-21 were significantly overexpressed, suggesting the possible importance of these cytokines in the relationships between coeliac disease and dental disorders. For the first time, we reported in dental pulp of children possible relationships between coeliac disease and modulation in transcription of cytokine-dependent inflammatory activities. © 2015 BSPD, IAPD and John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Simulation of gene evolution under directional mutational pressure
NASA Astrophysics Data System (ADS)
Dudkiewicz, Małgorzata; Mackiewicz, Paweł; Kowalczuk, Maria; Mackiewicz, Dorota; Nowicka, Aleksandra; Polak, Natalia; Smolarczyk, Kamila; Banaszak, Joanna; R. Dudek, Mirosław; Cebrat, Stanisław
2004-05-01
The two main mechanisms generating genetic diversity, mutation and recombination, have a random character, but they are biased, which affects the generation of asymmetry in the bacterial chromosome structure and in the protein coding sequences. Thus, as in the case of two chiral molecules, the two possible orientations of a gene in relation to the topology of a chromosome are not equivalent. Assuming that the sequence of a gene may oscillate only between certain limits of its structural composition means that the gene could be forced out of these limits by the directional mutation pressure in the course of evolution. The probability of this event depends on the time the gene stays under the same mutation pressure. Inversion of the gene changes the directional mutational pressure to the reciprocal one and hence changes the distance of the gene to the lower and upper bounds of its structural tolerance. Using Monte Carlo methods, we were able to simulate the evolution of genes under experimentally found mutational pressure, assuming simple mechanisms of selection. We found that mutation and recombination should act in accord to lower their negative effects on the function of the products of coding sequences.
Liu, Rui; Jin, Long; Long, Keren; Chai, Jie; Ma, Jideng; Tang, Qianzi; Tian, Shilin; Hu, Yaodong; Lin, Ling; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou
2016-01-10
Domestication and subsequent selective pressures have produced a large variety of pig coat colors in different regions and breeds. The melanocortin 1 receptor (MC1R) gene plays a crucial role in determining coat color of mammals. Here, we investigated genetic diversity and selection at the coding region of the porcine MC1R gene in Tibetan pigs and Landrace pigs. By contrast, genetic variability was much lower in Landrace pigs than in Tibetan pigs. Meanwhile, haplotype analysis showed that Tibetan pigs possessed shared haplotypes, suggesting either a recent introgression event by way of crossbreeding with neighboring domestic pigs or shared ancestral polymorphism. Additionally, we detected positive selection at MC1R in both Tibetan pigs and Landrace pigs through dN/dS analysis. These findings suggested that a novel phenotypic change (dark coat color) caused by novel mutations may help protect Tibetan pigs against intensive solar ultraviolet (UV) radiation and provide camouflage in the wild environment, whereas white coat color in Landrace pigs was intentionally selected by humans after domestication. Furthermore, both the phylogenetic analysis and the network analysis provided clues that MC1R in Asian and European wild boars may have initially experienced different selective pressures, and MC1R alleles diversified in modern domesticated pigs. Copyright © 2015 Elsevier B.V. All rights reserved.
Leung, Wilson; Shaffer, Christopher D; Reed, Laura K; Smith, Sheryl T; Barshop, William; Dirkes, William; Dothager, Matthew; Lee, Paul; Wong, Jeannette; Xiong, David; Yuan, Han; Bedard, James E J; Machone, Joshua F; Patterson, Seantay D; Price, Amber L; Turner, Bryce A; Robic, Srebrenka; Luippold, Erin K; McCartha, Shannon R; Walji, Tezin A; Walker, Chelsea A; Saville, Kenneth; Abrams, Marita K; Armstrong, Andrew R; Armstrong, William; Bailey, Robert J; Barberi, Chelsea R; Beck, Lauren R; Blaker, Amanda L; Blunden, Christopher E; Brand, Jordan P; Brock, Ethan J; Brooks, Dana W; Brown, Marie; Butzler, Sarah C; Clark, Eric M; Clark, Nicole B; Collins, Ashley A; Cotteleer, Rebecca J; Cullimore, Peterson R; Dawson, Seth G; Docking, Carter T; Dorsett, Sasha L; Dougherty, Grace A; Downey, Kaitlyn A; Drake, Andrew P; Earl, Erica K; Floyd, Trevor G; Forsyth, Joshua D; Foust, Jonathan D; Franchi, Spencer L; Geary, James F; Hanson, Cynthia K; Harding, Taylor S; Harris, Cameron B; Heckman, Jonathan M; Holderness, Heather L; Howey, Nicole A; Jacobs, Dontae A; Jewell, Elizabeth S; Kaisler, Maria; Karaska, Elizabeth A; Kehoe, James L; Koaches, Hannah C; Koehler, Jessica; Koenig, Dana; Kujawski, Alexander J; Kus, Jordan E; Lammers, Jennifer A; Leads, Rachel R; Leatherman, Emily C; Lippert, Rachel N; Messenger, Gregory S; Morrow, Adam T; Newcomb, Victoria; Plasman, Haley J; Potocny, Stephanie J; Powers, Michelle K; Reem, Rachel M; Rennhack, Jonathan P; Reynolds, Katherine R; Reynolds, Lyndsey A; Rhee, Dong K; Rivard, Allyson B; Ronk, Adam J; Rooney, Meghan B; Rubin, Lainey S; Salbert, Luke R; Saluja, Rasleen K; Schauder, Taylor; Schneiter, Allison R; Schulz, Robert W; Smith, Karl E; Spencer, Sarah; Swanson, Bryant R; Tache, Melissa A; Tewilliager, Ashley A; Tilot, Amanda K; VanEck, Eve; Villerot, Matthew M; Vylonis, Megan B; Watson, David T; Wurzler, Juliana A; Wysocki, Lauren M; Yalamanchili, Monica; Zaborowicz, Matthew A; Emerson, Julia A; Ortiz, Carlos; Deuschle, Frederic J; 
DiLorenzo, Lauren A; Goeller, Katie L; Macchi, Christopher R; Muller, Sarah E; Pasierb, Brittany D; Sable, Joseph E; Tucci, Jessica M; Tynon, Marykathryn; Dunbar, David A; Beken, Levent H; Conturso, Alaina C; Danner, Benjamin L; DeMichele, Gabriella A; Gonzales, Justin A; Hammond, Maureen S; Kelley, Colleen V; Kelly, Elisabeth A; Kulich, Danielle; Mageeney, Catherine M; McCabe, Nikie L; Newman, Alyssa M; Spaeder, Lindsay A; Tumminello, Richard A; Revie, Dennis; Benson, Jonathon M; Cristostomo, Michael C; DaSilva, Paolo A; Harker, Katherine S; Jarrell, Jenifer N; Jimenez, Luis A; Katz, Brandon M; Kennedy, William R; Kolibas, Kimberly S; LeBlanc, Mark T; Nguyen, Trung T; Nicolas, Daniel S; Patao, Melissa D; Patao, Shane M; Rupley, Bryan J; Sessions, Bridget J; Weaver, Jennifer A; Goodman, Anya L; Alvendia, Erica L; Baldassari, Shana M; Brown, Ashley S; Chase, Ian O; Chen, Maida; Chiang, Scott; Cromwell, Avery B; Custer, Ashley F; DiTommaso, Tia M; El-Adaimi, Jad; Goscinski, Nora C; Grove, Ryan A; Gutierrez, Nestor; Harnoto, Raechel S; Hedeen, Heather; Hong, Emily L; Hopkins, Barbara L; Huerta, Vilma F; Khoshabian, Colin; LaForge, Kristin M; Lee, Cassidy T; Lewis, Benjamin M; Lydon, Anniken M; Maniaci, Brian J; Mitchell, Ryan D; Morlock, Elaine V; Morris, William M; Naik, Priyanka; Olson, Nicole C; Osterloh, Jeannette M; Perez, Marcos A; Presley, Jonathan D; Randazzo, Matt J; Regan, Melanie K; Rossi, Franca G; Smith, Melanie A; Soliterman, Eugenia A; Sparks, Ciani J; Tran, Danny L; Wan, Tiffany; Welker, Anne A; Wong, Jeremy N; Sreenivasan, Aparna; Youngblom, Jim; Adams, Andrew; Alldredge, Justin; Bryant, Ashley; Carranza, David; Cifelli, Alyssa; Coulson, Kevin; Debow, Calise; Delacruz, Noelle; Emerson, Charlene; Farrar, Cassandra; Foret, Don; Garibay, Edgar; Gooch, John; Heslop, Michelle; Kaur, Sukhjit; Khan, Ambreen; Kim, Van; Lamb, Travis; Lindbeck, Peter; Lucas, Gabi; Macias, Elizabeth; Martiniuc, Daniela; Mayorga, Lissett; Medina, Joseph; Membreno, Nelson; 
Messiah, Shady; Neufeld, Lacey; Nguyen, San Francisco; Nichols, Zachary; Odisho, George; Peterson, Daymon; Rodela, Laura; Rodriguez, Priscilla; Rodriguez, Vanessa; Ruiz, Jorge; Sherrill, Will; Silva, Valeria; Sparks, Jeri; Statton, Geeta; Townsend, Ashley; Valdez, Isabel; Waters, Mary; Westphal, Kyle; Winkler, Stacey; Zumkehr, Joannee; DeJong, Randall J; Hoogewerf, Arlene J; Ackerman, Cheri M; Armistead, Isaac O; Baatenburg, Lara; Borr, Matthew J; Brouwer, Lindsay K; Burkhart, Brandon J; Bushhouse, Kelsey T; Cesko, Lejla; Choi, Tiffany Y Y; Cohen, Heather; Damsteegt, Amanda M; Darusz, Jess M; Dauphin, Cory M; Davis, Yelena P; Diekema, Emily J; Drewry, Melissa; Eisen, Michelle E M; Faber, Hayley M; Faber, Katherine J; Feenstra, Elizabeth; Felzer-Kim, Isabella T; Hammond, Brandy L; Hendriksma, Jesse; Herrold, Milton R; Hilbrands, Julia A; Howell, Emily J; Jelgerhuis, Sarah A; Jelsema, Timothy R; Johnson, Benjamin K; Jones, Kelly K; Kim, Anna; Kooienga, Ross D; Menyes, Erika E; Nollet, Eric A; Plescher, Brittany E; Rios, Lindsay; Rose, Jenny L; Schepers, Allison J; Scott, Geoff; Smith, Joshua R; Sterling, Allison M; Tenney, Jenna C; Uitvlugt, Chris; VanDyken, Rachel E; VanderVennen, Marielle; Vue, Samantha; Kokan, Nighat P; Agbley, Kwabea; Boham, Sampson K; Broomfield, Daniel; Chapman, Kayla; Dobbe, Ali; Dobbe, Ian; Harrington, William; Ibrahem, Marwan; Kennedy, Andre; Koplinsky, Chad A; Kubricky, Cassandra; Ladzekpo, Danielle; Pattison, Claire; Ramirez, Roman E; Wande, Lucia; Woehlke, Sarah; Wawersik, Matthew; Kiernan, Elizabeth; Thompson, Jeffrey S; Banker, Roxanne; Bartling, Justina R; Bhatiya, Chinmoy I; Boudoures, Anna L; Christiansen, Lena; Fosselman, Daniel S; French, Kristin M; Gill, Ishwar S; Havill, Jessen T; Johnson, Jaelyn L; Keny, Lauren J; Kerber, John M; Klett, Bethany M; Kufel, Christina N; May, Francis J; Mecoli, Jonathan P; Merry, Callie R; Meyer, Lauren R; Miller, Emily G; Mullen, Gregory J; Palozola, Katherine C; Pfeil, Jacob J; Thomas, Jessica G; 
Verbofsky, Evan M; Spana, Eric P; Agarwalla, Anant; Chapman, Julia; Chlebina, Ben; Chong, Insun; Falk, I N; Fitzgibbons, John D; Friedman, Harrison; Ighile, Osagie; Kim, Andrew J; Knouse, Kristin A; Kung, Faith; Mammo, Danny; Ng, Chun Leung; Nikam, Vinayak S; Norton, Diana; Pham, Philip; Polk, Jessica W; Prasad, Shreya; Rankin, Helen; Ratliff, Camille D; Scala, Victoria; Schwartz, Nicholas U; Shuen, Jessica A; Xu, Amy; Xu, Thomas Q; Zhang, Yi; Rosenwald, Anne G; Burg, Martin G; Adams, Stephanie J; Baker, Morgan; Botsford, Bobbi; Brinkley, Briana; Brown, Carter; Emiah, Shadie; Enoch, Erica; Gier, Chad; Greenwell, Alyson; Hoogenboom, Lindsay; Matthews, Jordan E; McDonald, Mitchell; Mercer, Amanda; Monsma, Nicholaus; Ostby, Kristine; Ramic, Alen; Shallman, Devon; Simon, Matthew; Spencer, Eric; Tomkins, Trisha; Wendland, Pete; Wylie, Anna; Wolyniak, Michael J; Robertson, Gregory M; Smith, Samuel I; DiAngelo, Justin R; Sassu, Eric D; Bhalla, Satish C; Sharif, Karim A; Choeying, Tenzin; Macias, Jason S; Sanusi, Fareed; Torchon, Karvyn; Bednarski, April E; Alvarez, Consuelo J; Davis, Kristen C; Dunham, Carrie A; Grantham, Alaina J; Hare, Amber N; Schottler, Jennifer; Scott, Zackary W; Kuleck, Gary A; Yu, Nicole S; Kaehler, Marian M; Jipp, Jacob; Overvoorde, Paul J; Shoop, Elizabeth; Cyrankowski, Olivia; Hoover, Betsy; Kusner, Matt; Lin, Devry; Martinov, Tijana; Misch, Jonathan; Salzman, Garrett; Schiedermayer, Holly; Snavely, Michael; Zarrasola, Stephanie; Parrish, Susan; Baker, Atlee; Beckett, Alissa; Belella, Carissa; Bryant, Julie; Conrad, Turner; Fearnow, Adam; Gomez, Carolina; Herbstsomer, Robert A; Hirsch, Sarah; Johnson, Christen; Jones, Melissa; Kabaso, Rita; Lemmon, Eric; Vieira, Carolina Marques Dos Santos; McFarland, Darryl; McLaughlin, Christopher; Morgan, Abbie; Musokotwane, Sepo; Neutzling, William; Nietmann, Jana; Paluskievicz, Christina; Penn, Jessica; Peoples, Emily; Pozmanter, Caitlin; Reed, Emily; Rigby, Nichole; Schmidt, Lasse; Shelton, Micah; Shuford, 
Rebecca; Tirasawasdichai, Tiara; Undem, Blair; Urick, Damian; Vondy, Kayla; Yarrington, Bryan; Eckdahl, Todd T; Poet, Jeffrey L; Allen, Alica B; Anderson, John E; Barnett, Jason M; Baumgardner, Jordan S; Brown, Adam D; Carney, Jordan E; Chavez, Ramiro A; Christgen, Shelbi L; Christie, Jordan S; Clary, Andrea N; Conn, Michel A; Cooper, Kristen M; Crowley, Matt J; Crowley, Samuel T; Doty, Jennifer S; Dow, Brian A; Edwards, Curtis R; Elder, Darcie D; Fanning, John P; Janssen, Bridget M; Lambright, Anthony K; Lane, Curtiss E; Limle, Austin B; Mazur, Tammy; McCracken, Marly R; McDonough, Alexa M; Melton, Amy D; Minnick, Phillip J; Musick, Adam E; Newhart, William H; Noynaert, Joseph W; Ogden, Bradley J; Sandusky, Michael W; Schmuecker, Samantha M; Shipman, Anna L; Smith, Anna L; Thomsen, Kristen M; Unzicker, Matthew R; Vernon, William B; Winn, Wesley W; Woyski, Dustin S; Zhu, Xiao; Du, Chunguang; Ament, Caitlin; Aso, Soham; Bisogno, Laura Simone; Caronna, Jason; Fefelova, Nadezhda; Lopez, Lenin; Malkowitz, Lorraine; Marra, Jonathan; Menillo, Daniella; Obiorah, Ifeanyi; Onsarigo, Eric Nyabeta; Primus, Shekerah; Soos, Mahdi; Tare, Archana; Zidan, Ameer; Jones, Christopher J; Aronhalt, Todd; Bellush, James M; Burke, Christa; DeFazio, Steve; Does, Benjamin R; Johnson, Todd D; Keysock, Nicholas; Knudsen, Nelson H; Messler, James; Myirski, Kevin; Rekai, Jade Lea; Rempe, Ryan Michael; Salgado, Michael S; Stagaard, Erica; Starcher, Justin R; Waggoner, Andrew W; Yemelyanova, Anastasia K; Hark, Amy T; Bertolet, Anne; Kuschner, Cyrus E; Parry, Kesley; Quach, Michael; Shantzer, Lindsey; Shaw, Mary E; Smith, Mary A; Glenn, Omolara; Mason, Portia; Williams, Charlotte; Key, S Catherine Silver; Henry, Tyneshia C P; Johnson, Ashlee G; White, Jackie X; Haberman, Adam; Asinof, Sam; Drumm, Kelly; Freeburg, Trip; Safa, Nadia; Schultz, Darrin; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Wellinghoff, Jules; Hoopes, Laura L M; Chau, Kim M; Ward, Alyssa; Regisford, E Gloria C; Augustine, 
LaJerald; Davis-Reyes, Brionna; Echendu, Vivienne; Hales, Jasmine; Ibarra, Sharon; Johnson, Lauriaun; Ovu, Steven; Braverman, John M; Bahr, Thomas J; Caesar, Nicole M; Campana, Christopher; Cassidy, Daniel W; Cognetti, Peter A; English, Johnathan D; Fadus, Matthew C; Fick, Cameron N; Freda, Philip J; Hennessy, Bryan M; Hockenberger, Kelsey; Jones, Jennifer K; King, Jessica E; Knob, Christopher R; Kraftmann, Karen J; Li, Linghui; Lupey, Lena N; Minniti, Carl J; Minton, Thomas F; Moran, Joseph V; Mudumbi, Krishna; Nordman, Elizabeth C; Puetz, William J; Robinson, Lauren M; Rose, Thomas J; Sweeney, Edward P; Timko, Ashley S; Paetkau, Don W; Eisler, Heather L; Aldrup, Megan E; Bodenberg, Jessica M; Cole, Mara G; Deranek, Kelly M; DeShetler, Megan; Dowd, Rose M; Eckardt, Alexandra K; Ehret, Sharon C; Fese, Jessica; Garrett, Amanda D; Kammrath, Anna; Kappes, Michelle L; Light, Morgan R; Meier, Anne C; O'Rouke, Allison; Perella, Mallory; Ramsey, Kimberley; Ramthun, Jennifer R; Reilly, Mary T; Robinett, Deirdre; Rossi, Nadine L; Schueler, Mary Grace; Shoemaker, Emma; Starkey, Kristin M; Vetor, Ashley; Vrable, Abby; Chandrasekaran, Vidya; Beck, Christopher; Hatfield, Kristen R; Herrick, Douglas A; Khoury, Christopher B; Lea, Charlotte; Louie, Christopher A; Lowell, Shannon M; Reynolds, Thomas J; Schibler, Jeanine; Scoma, Alexandra H; Smith-Gee, Maxwell T; Tuberty, Sarah; Smith, Christopher D; Lopilato, Jane E; Hauke, Jeanette; Roecklein-Canfield, Jennifer A; Corrielus, Maureen; Gilman, Hannah; Intriago, Stephanie; Maffa, Amanda; Rauf, Sabya A; Thistle, Katrina; Trieu, Melissa; Winters, Jenifer; Yang, Bib; Hauser, Charles R; Abusheikh, Tariq; Ashrawi, Yara; Benitez, Pedro; Boudreaux, Lauren R; Bourland, Megan; Chavez, Miranda; Cruz, Samantha; Elliott, GiNell; Farek, Jesse R; Flohr, Sarah; Flores, Amanda H; Friedrichs, Chelsey; Fusco, Zach; Goodwin, Zane; Helmreich, Eric; Kiley, John; Knepper, John Mark; Langner, Christine; Martinez, Megan; Mendoza, Carlos; Naik, Monal; 
Ochoa, Andrea; Ragland, Nicolas; Raimey, England; Rathore, Sunil; Reza, Evangelina; Sadovsky, Griffin; Seydoux, Marie-Isabelle B; Smith, Jonathan E; Unruh, Anna K; Velasquez, Vicente; Wolski, Matthew W; Gosser, Yuying; Govind, Shubha; Clarke-Medley, Nicole; Guadron, Leslie; Lau, Dawn; Lu, Alvin; Mazzeo, Cheryl; Meghdari, Mariam; Ng, Simon; Pamnani, Brad; Plante, Olivia; Shum, Yuki Kwan Wa; Song, Roy; Johnson, Diana E; Abdelnabi, Mai; Archambault, Alexi; Chamma, Norma; Gaur, Shailly; Hammett, Deborah; Kandahari, Adrese; Khayrullina, Guzal; Kumar, Sonali; Lawrence, Samantha; Madden, Nigel; Mandelbaum, Max; Milnthorp, Heather; Mohini, Shiv; Patel, Roshni; Peacock, Sarah J; Perling, Emily; Quintana, Amber; Rahimi, Michael; Ramirez, Kristen; Singhal, Rishi; Weeks, Corinne; Wong, Tiffany; Gillis, Aubree T; Moore, Zachary D; Savell, Christopher D; Watson, Reece; Mel, Stephanie F; Anilkumar, Arjun A; Bilinski, Paul; Castillo, Rostislav; Closser, Michael; Cruz, Nathalia M; Dai, Tiffany; Garbagnati, Giancarlo F; Horton, Lanor S; Kim, Dongyeon; Lau, Joyce H; Liu, James Z; Mach, Sandy D; Phan, Thu A; Ren, Yi; Stapleton, Kenneth E; Strelitz, Jean M; Sunjed, Ray; Stamm, Joyce; Anderson, Morgan C; Bonifield, Bethany Grace; Coomes, Daniel; Dillman, Adam; Durchholz, Elaine J; Fafara-Thompson, Antoinette E; Gross, Meleah J; Gygi, Amber M; Jackson, Lesley E; Johnson, Amy; Kocsisova, Zuzana; Manghelli, Joshua L; McNeil, Kylie; Murillo, Michael; Naylor, Kierstin L; Neely, Jessica; Ogawa, Emmy E; Rich, Ashley; Rogers, Anna; Spencer, J Devin; Stemler, Kristina M; Throm, Allison A; Van Camp, Matt; Weihbrecht, Katie; Wiles, T Aaron; Williams, Mallory A; Williams, Matthew; Zoll, Kyle; Bailey, Cheryl; Zhou, Leming; Balthaser, Darla M; Bashiri, Azita; Bower, Mindy E; Florian, Kayla A; Ghavam, Nazanin; Greiner-Sosanko, Elizabeth S; Karim, Helmet; Mullen, Victor W; Pelchen, Carly E; Yenerall, Paul M; Zhang, Jiayu; Rubin, Michael R; Arias-Mejias, Suzette M; Bermudez-Capo, Armando G; Bernal-Vega, 
Gabriela V; Colon-Vazquez, Mariela; Flores-Vazquez, Arelys; Gines-Rosario, Mariela; Llavona-Cartagena, Ivan G; Martinez-Rodriguez, Javier O; Ortiz-Fuentes, Lionel; Perez-Colomba, Eliezer O; Perez-Otero, Joseph; Rivera, Elisandra; Rodriguez-Giron, Luke J; Santiago-Sanabria, Arnaldo J; Senquiz-Gonzalez, Andrea M; delValle, Frank R Soto; Vargas-Franco, Dorianmarie; Velázquez-Soto, Karla I; Zambrana-Burgos, Joan D; Martinez-Cruzado, Juan Carlos; Asencio-Zayas, Lillyann; Babilonia-Figueroa, Kevin; Beauchamp-Pérez, Francis D; Belén-Rodríguez, Juliana; Bracero-Quiñones, Luciann; Burgos-Bula, Andrea P; Collado-Méndez, Xavier A; Colón-Cruz, Luis R; Correa-Muller, Ana I; Crooke-Rosado, Jonathan L; Cruz-García, José M; Defendini-Ávila, Marianna; Delgado-Peraza, Francheska M; Feliciano-Cancela, Alex J; Gónzalez-Pérez, Valerie M; Guiblet, Wilfried; Heredia-Negrón, Aldo; Hernández-Muñiz, Jennifer; Irizarry-González, Lourdes N; Laboy-Corales, Ángel L; Llaurador-Caraballo, Gabriela A; Marín-Maldonado, Frances; Marrero-Llerena, Ulises; Martell-Martínez, Héctor A; Martínez-Traverso, Idaliz M; Medina-Ortega, Kiara N; Méndez-Castellanos, Sonya G; Menéndez-Serrano, Krizia C; Morales-Caraballo, Carol I; Ortiz-DeChoudens, Saryleine; Ortiz-Ortiz, Patricia; Pagán-Torres, Hendrick; Pérez-Afanador, Diana; Quintana-Torres, Enid M; Ramírez-Aponte, Edwin G; Riascos-Cuero, Carolina; Rivera-Llovet, Michelle S; Rivera-Pagán, Ingrid T; Rivera-Vicéns, Ramón E; Robles-Juarbe, Fabiola; Rodríguez-Bonilla, Lorraine; Rodríguez-Echevarría, Brian O; Rodríguez-García, Priscila M; Rodríguez-Laboy, Abneris E; Rodríguez-Santiago, Susana; Rojas-Vargas, Michael L; Rubio-Marrero, Eva N; Santiago-Colón, Albeliz; Santiago-Ortiz, Jorge L; Santos-Ramos, Carlos E; Serrano-González, Joseline; Tamayo-Figueroa, Alina M; Tascón-Peñaranda, Edna P; Torres-Castillo, José L; Valentín-Feliciano, Nelson A; Valentín-Feliciano, Yashira M; Vargas-Barreto, Nadyan M; Vélez-Vázquez, Miguel; Vilanova-Vélez, Luis R; 
Zambrana-Echevarría, Cristina; MacKinnon, Christy; Chung, Hui-Min; Kay, Chris; Pinto, Anthony; Kopp, Olga R; Burkhardt, Joshua; Harward, Chris; Allen, Robert; Bhat, Pavan; Chang, Jimmy Hsiang-Chun; Chen, York; Chesley, Christopher; Cohn, Dara; DuPuis, David; Fasano, Michael; Fazzio, Nicholas; Gavinski, Katherine; Gebreyesus, Heran; Giarla, Thomas; Gostelow, Marcus; Greenstein, Rachel; Gunasinghe, Hashini; Hanson, Casey; Hay, Amanda; He, Tao Jian; Homa, Katie; Howe, Ruth; Howenstein, Jeff; Huang, Henry; Khatri, Aaditya; Kim, Young Lu; Knowles, Olivia; Kong, Sarah; Krock, Rebecca; Kroll, Matt; Kuhn, Julia; Kwong, Matthew; Lee, Brandon; Lee, Ryan; Levine, Kevin; Li, Yedda; Liu, Bo; Liu, Lucy; Liu, Max; Lousararian, Adam; Ma, Jimmy; Mallya, Allyson; Manchee, Charlie; Marcus, Joseph; McDaniel, Stephen; Miller, Michelle L; Molleston, Jerome M; Diez, Cristina Montero; Ng, Patrick; Ngai, Natalie; Nguyen, Hien; Nylander, Andrew; Pollack, Jason; Rastogi, Suchita; Reddy, Himabindu; Regenold, Nathaniel; Sarezky, Jon; Schultz, Michael; Shim, Jien; Skorupa, Tara; Smith, Kenneth; Spencer, Sarah J; Srikanth, Priya; Stancu, Gabriel; Stein, Andrew P; Strother, Marshall; Sudmeier, Lisa; Sun, Mengyang; Sundaram, Varun; Tazudeen, Noor; Tseng, Alan; Tzeng, Albert; Venkat, Rohit; Venkataram, Sandeep; Waldman, Leah; Wang, Tracy; Yang, Hao; Yu, Jack Y; Zheng, Yin; Preuss, Mary L; Garcia, Angelica; Juergens, Matt; Morris, Robert W; Nagengast, Alexis A; Azarewicz, Julie; Carr, Thomas J; Chichearo, Nicole; Colgan, Mike; Donegan, Megan; Gardner, Bob; Kolba, Nik; Krumm, Janice L; Lytle, Stacey; MacMillian, Laurell; Miller, Mary; Montgomery, Andrew; Moretti, Alysha; Offenbacker, Brittney; Polen, Mike; Toth, John; Woytanowski, John; Kadlec, Lisa; Crawford, Justin; Spratt, Mary L; Adams, Ashley L; Barnard, Brianna K; Cheramie, Martin N; Eime, Anne M; Golden, Kathryn L; Hawkins, Allyson P; Hill, Jessica E; Kampmeier, Jessica A; Kern, Cody D; Magnuson, Emily E; Miller, Ashley R; Morrow, Cody M; 
Peairs, Julia C; Pickett, Gentry L; Popelka, Sarah A; Scott, Alexis J; Teepe, Emily J; TerMeer, Katie A; Watchinski, Carmen A; Watson, Lucas A; Weber, Rachel E; Woodard, Kate A; Barnard, Daron C; Appiah, Isaac; Giddens, Michelle M; McNeil, Gerard P; Adebayo, Adeola; Bagaeva, Kate; Chinwong, Justina; Dol, Chrystel; George, Eunice; Haltaufderhyde, Kirk; Haye, Joanna; Kaur, Manpreet; Semon, Max; Serjanov, Dmitri; Toorie, Anika; Wilson, Christopher; Riddle, Nicole C; Buhler, Jeremy; Mardis, Elaine R; Elgin, Sarah C R
2015-03-04
The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25-50%) than euchromatic reference regions (3-11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11-27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4-3.6 vs. 8.4-8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu. Copyright © 2015 Leung et al.
Tarazona-Santos, Eduardo; Fabbri, Cristina; Yeager, Meredith; Magalhaes, Wagner C; Burdett, Laurie; Crenshaw, Andrew; Pettener, Davide; Chanock, Stephen J
2010-03-23
Glucose is an important source of energy for living organisms. In vertebrates it is ingested with the diet and transported into the cells by conserved mechanisms and molecules, such as the trans-membrane Glucose Transporters (GLUTs). Members of this family have tissue-specific expression, biochemical properties and physiologic functions that together regulate glucose levels and distribution. GLUT4, encoded by SLC2A4 (17p13), is an insulin-sensitive transporter with a critical role in glucose homeostasis and diabetes pathogenesis, preferentially expressed in adipose tissue, heart muscle and skeletal muscle. We tested the hypothesis that natural selection acted on SLC2A4. We re-sequenced SLC2A4 and genotyped 104 SNPs along an approximately 1 Mb region flanking this gene in 102 ethnically diverse individuals. Across the studied populations (African, European, Asian and Latin-American), all eight common SNPs are concentrated in the N-terminal region upstream of exon 7 (approximately 3700 bp), while the C-terminal region downstream of intron 6 (approximately 2600 bp) harbors only 6 singletons, a pattern that is not compatible with neutrality for this part of the gene. Tests of neutrality based on comparative genomics suggest that: (1) episodes of natural selection (likely a selective sweep) predating the coalescent of human lineages, within the last 25 million years, account for the observed reduced diversity downstream of intron 6 and, (2) the target of natural selection may not be in the SLC2A4 coding sequence. We propose that the contrast in the pattern of genetic variation between the N-terminal and C-terminal regions is a signature of the action of natural selection and thus follow-up studies should investigate the functional importance of different regions of the SLC2A4 gene.
Kaltner, H; Gabius, H-J
2012-04-01
Lectin histochemistry has revealed cell-type-selective glycosylation. It is under dynamic and spatially controlled regulation. Since their chemical properties allow carbohydrates to reach unsurpassed structural diversity in oligomers, they are ideal for high density information coding. Consequently, the concept of the sugar code assigns a functional dimension to the glycans of cellular glycoconjugates. Indeed, multifarious cell processes depend on specific recognition of glycans by their receptors (lectins), which translate the sugar-encoded information into effects. Duplication of ancestral genes and the following divergence of sequences account for the evolutionary dynamics in lectin families. Differences in gene number can even appear among closely related species. The adhesion/growth-regulatory galectins are selected as an instructive example to trace the phylogenetic diversification in several animals, most of them popular models in developmental and tumor biology. Chicken galectins are identified as a low-level-complexity set, thus singled out for further detailed analysis. The various operative means for establishing protein diversity among the chicken galectins are delineated, and individual characteristics in expression profiles discerned. To apply this galectin-fingerprinting approach in histopathology has potential for refining differential diagnosis and for obtaining prognostic assessments. On the grounds of in vitro work with tumor cells a strategically orchestrated co-regulation of galectin expression with presentation of cognate glycans is detected. This coordination epitomizes the far-reaching physiological significance of sugar coding.
The Evolution and Expression Pattern of Human Overlapping lncRNA and Protein-coding Gene Pairs.
Ning, Qianqian; Li, Yixue; Wang, Zhen; Zhou, Songwen; Sun, Hong; Yu, Guangjun
2017-03-27
A long non-coding RNA overlapping a protein-coding gene (a lncRNA-coding pair) is a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied, and increasing attention has been paid to lncRNAs. By studying lncRNA-coding pairs in the human genome, we showed that lncRNA-coding pairs were more likely to be generated by overprinting and that genes in lncRNA-coding pairs were retained preferentially over non-overlapping genes. Besides, the preference for overlapping configurations preserved during evolution depended on the origin of the lncRNA-coding pairs. Further investigations showed that the promotion of splicing by lncRNAs in their embedded protein-coding partners was a unilateral interaction, whereas the improvement in gene expression associated with having an overlapping partner was bidirectional and decreased with the evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and this expression correlation was associated with their overlapping configurations, local genomic environment and the evolutionary age of the genes. Comparison of the expression correlation of lncRNA-coding pairs between normal and cancer samples found that lineage-specific pairs that include old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic, comprehensive view of the evolution and expression pattern of human lncRNA-coding pairs.
de Santana Lopes, Amanda; Gomes Pacheco, Túlio; Nimz, Tabea; do Nascimento Vieira, Leila; Guerra, Miguel P; Nodari, Rubens O; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Rogalski, Marcelo
2018-04-01
The plastome of macaw palm was sequenced, allowing analyses of evolution and molecular markers. Additionally, we demonstrated that more than half of plastid protein-coding genes in Arecaceae underwent positive selection. Macaw palm is a species native to the tropical and subtropical Americas. It shows high production of oil per hectare, reaching up to 70% oil content in fruits, and a notable plasticity to grow in different ecosystems. Its domestication and breeding are still at an early stage, which makes the development of molecular markers essential to assess natural populations and germplasm collections. Therefore, we sequenced and characterized in detail the plastome of macaw palm. A total of 221 SSR loci were identified in the plastome of macaw palm. Additionally, eight polymorphism hotspots were characterized at the level of subfamily and tribe. Moreover, several events of gain and loss of RNA editing sites were found within the subfamily Arecoideae. Aiming to uncover evolutionary events in Arecaceae, we also extensively analyzed the evolution of plastid genes. The analyses show that highly divergent genes seem to evolve in a species-specific manner, suggesting that gene degeneration events may be occurring within Arecaceae at the level of genus or species. Unexpectedly, we found that more than half of plastid protein-coding genes are under positive selection, including genes for photosynthesis, gene expression machinery and other essential plastid functions. Furthermore, we performed a phylogenomic analysis using whole plastomes of 40 taxa, representing all subfamilies of Arecaceae, which placed the macaw palm within the tribe Cocoseae. Finally, the data shown here are important for genetic studies in macaw palm and provide new insights into the evolution of plastid genes and environmental adaptation in Arecaceae.
Mäkinen, H S; Cano, J M; Merilä, J
2008-08-01
Natural selection is expected to leave an imprint on the neutral polymorphisms at the adjacent genomic regions of a selected gene. While directional selection tends to reduce within-population genetic diversity and increase among-population differentiation, the reverse is expected under balancing selection. To identify targets of natural selection in the three-spined stickleback (Gasterosteus aculeatus) genome, 103 microsatellite and two indel markers including expressed sequence tags (EST) and quantitative trait loci (QTL)-associated loci, were genotyped in four freshwater and three marine populations. The results indicated that a high proportion of loci (14.7%) might be affected by balancing selection and a lower proportion (2.8%) by directional selection. The strongest signatures of directional selection were detected in a microsatellite locus and two indel markers located in the intronic regions of the Eda-gene coding for the number of lateral plates. Yet, other microsatellite loci previously found to be informative in QTL-mapping studies revealed no signatures of selection. Two novel microsatellite loci (Stn12 and Stn90) located in chromosomes I and VIII, respectively, showed signals of directional selection and might be linked to genomic regions containing gene(s) important for adaptive divergence. Although the coverage of the total genomic content was relatively low, the predominance of balancing selection signals is in agreement with the contention that balancing, rather than directional selection is the predominant mode of selection in the wild.
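The expectation stated in this abstract (directional selection lowers within-population diversity and raises among-population differentiation) can be illustrated with a minimal Python sketch. It computes expected heterozygosity and a simple Nei-style G_ST from allele frequencies. The function names and the example frequencies are hypothetical, populations are weighted equally as a simplifying assumption, and this is not the outlier test used in the study.

```python
from statistics import mean

def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one population's allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def gst(populations):
    """Nei-style G_ST = (H_T - H_S) / H_T, with populations weighted equally (assumption)."""
    # H_S: mean within-population expected heterozygosity
    h_s = mean(expected_heterozygosity(f) for f in populations)
    # H_T: expected heterozygosity of the pooled (averaged) allele frequencies
    pooled = [mean(col) for col in zip(*populations)]
    h_t = expected_heterozygosity(pooled)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

# Hypothetical two-allele loci scored in two populations
neutral_like = [[0.5, 0.5], [0.6, 0.4]]     # similar frequencies, high diversity
divergent = [[0.95, 0.05], [0.05, 0.95]]    # near-fixed for different alleles
print(gst(neutral_like), gst(divergent))
```

In this toy comparison the second locus has low diversity within each population and high differentiation between them, which is the pattern the abstract associates with directional selection.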
Prescott, Natalie J.; Lehne, Benjamin; Stone, Kristina; Lee, James C.; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M.; Simpson, Michael A.; Spain, Sarah L.; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J.; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu’Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C.; Mansfield, John C.; Sanderson, Jeremy; Lewis, Cathryn M.; Weale, Michael E.; Schlitt, Thomas; Mathew, Christopher G.
2015-01-01
The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn’s disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65 × 10^−10, OR = 2.3 [95% CI = 1.75–3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1–5%) variants in 3 additional genes showed suggestive association (p < 0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis. PMID:25671699
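As a rough consistency check on the reported BTNL2 association, a Wald-type confidence interval for an odds ratio is symmetric on the log scale, so the point estimate should sit near the geometric mean of the interval bounds. The Python sketch below applies that arithmetic to the published figures (OR = 2.3, 95% CI 1.75–3.04). The helper name is hypothetical, and the sketch assumes the interval was computed on the log scale, which the abstract does not state.

```python
import math

def log_scale_summary(lower, upper, z_crit=1.96):
    """Recover the implied OR and SE of ln(OR) from a 95% CI, assuming a Wald interval."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lo + log_hi) / 2)   # geometric mean of the bounds
    se_log_or = (log_hi - log_lo) / (2 * z_crit)   # log-scale half-width divided by z
    return implied_or, se_log_or

implied_or, se = log_scale_summary(1.75, 3.04)
print(round(implied_or, 2), round(se, 3))
```

Under that assumption the implied odds ratio lands close to the reported 2.3, so the point estimate and interval in the abstract are mutually consistent.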
Das, Pranab J.; McCarthy, Fiona; Vishnoi, Monika; Paria, Nandina; Gresham, Cathy; Li, Gang; Kachroo, Priyanka; Sudderth, A. Kendrick; Teague, Sheila; Love, Charles C.; Varner, Dickson D.; Chowdhary, Bhanu P.; Raudsepp, Terje
2013-01-01
Mature mammalian sperm contain a complex population of RNAs some of which might regulate spermatogenesis while others probably play a role in fertilization and early development. Due to this limited knowledge, the biological functions of sperm RNAs remain enigmatic. Here we report the first characterization of the global transcriptome of the sperm of fertile stallions. The findings improved understanding of the biological significance of sperm RNAs which in turn will allow the discovery of sperm-based biomarkers for stallion fertility. The stallion sperm transcriptome was interrogated by analyzing sperm and testes RNA on a 21,000-element equine whole-genome oligoarray and by RNA-seq. Microarray analysis revealed 6,761 transcripts in the sperm, of which 165 were sperm-enriched, and 155 were differentially expressed between the sperm and testes. Next, 70 million raw reads were generated by RNA-seq of which 50% could be aligned with the horse reference genome. A total of 19,257 sequence tags were mapped to all horse chromosomes and the mitochondrial genome. The highest density of mapped transcripts was in gene-rich ECA11, 12 and 13, and the lowest in gene-poor ECA9 and X; 7 gene transcripts originated from ECAY. Structural annotation aligned sperm transcripts with 4,504 known horse and/or human genes, rRNAs and 82 miRNAs, whereas 13,354 sequence tags remained anonymous. The data were aligned with selected equine gene models to identify additional exons and splice variants. Gene Ontology annotations showed that sperm transcripts were associated with molecular processes (chemoattractant-activated signal transduction, ion transport) and cellular components (membranes and vesicles) related to known sperm functions at fertilization, while some messenger and micro RNAs might be critical for early development. 
The findings suggest that the rich repertoire of coding and non-coding RNAs in stallion sperm is not a random remnant from spermatogenesis in testes but a selectively retained and functionally coherent collection of RNAs. PMID:23409192
Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie
2017-11-01
Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The mechanism underlying tumor progression is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding genes, long noncoding RNAs, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNAs were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that the identified genes were enriched in cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 was associated with poor prognosis in lung SCC. The protein-coding genes AURKA and BIRC5 and the long noncoding RNA LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia
2015-01-01
Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying the protein-coding genes co-expressed with lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by the co-expressed protein-coding genes of a single lncRNA or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/. © The Author(s) 2015. Published by Oxford University Press.
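The enrichment analysis mentioned in the abstract above can be illustrated with a one-sided hypergeometric test, which is a common way to ask whether a gene set is over-represented in a GO term or KEGG pathway. This is only a minimal sketch under that assumption; the abstract does not state which test Co-LncRNA uses, and the function name and the counts in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided enrichment p-value, P(X >= hits).

    population: all genes considered (background)
    annotated:  background genes carrying the GO/KEGG term
    sample:     co-expressed genes being tested
    hits:       co-expressed genes that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for k = hits .. upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(hypergeom_enrichment_p(population=20000, annotated=150, sample=300, hits=12))
```

In practice the p-values from many terms would also need a multiple-testing correction, which is left out of this sketch.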
Engsontia, Patamarerk; Sangket, Unitsa; Chotigeat, Wilaiwan; Satasook, Chutamas
2014-08-01
Lepidoptera (comprised of butterflies and moths) is one of the largest groups of insects, including more than 160,000 described species. Chemoreception plays important roles in the adaptation of these species to a wide range of niches, e.g., plant hosts, egg-laying sites, and mates. This study investigated the molecular evolution of the lepidopteran odorant (Or) and gustatory receptor (Gr) genes using recently identified genes from Bombyx mori, Danaus plexippus, Heliconius melpomene, Plutella xylostella, Heliothis virescens, Manduca sexta, Cydia pomonella, and Spodoptera littoralis. A limited number of cases of large lineage-specific gene expansion are observed (except in the P. xylostella lineage), possibly due to selection against tandem gene duplication. There has been strong purifying selection during the evolution of both lepidopteran odorant and gustatory genes, as shown by the low ω values estimated through CodeML analysis, ranging from 0.0093 to 0.3926. However, purifying selection has been relaxed on some amino acid sites in these receptors, leading to sequence divergence, which is a precursor of positive selection on these sequences. Signatures of positive selection were detected only in a few loci from the lineage-specific analysis. Estimation of gene gains and losses suggests that the common ancestor of the Lepidoptera had fewer Or genes compared to extant species and an even more reduced number of Gr genes, particularly within the bitter receptor clade. Multiple gene gains and a few gene losses occurred during the evolution of Lepidoptera. Gene family expansion may be associated with the adaptation of lepidopteran species to plant hosts, especially after angiosperm radiation. Phylogenetic analysis of the moth sex pheromone receptor genes suggested that chromosomal translocations have occurred several times. New sex pheromone receptors have arisen through tandem gene duplication. 
Positive selection was detected at some amino acid sites predicted to be in the extracellular and transmembrane regions of the newly duplicated genes, which might be associated with the evolution of the new pheromone receptors.
Cobbin, Joanna C. A.; Ong, Chi; Verity, Erin; Gilbertson, Brad P.; Rockman, Steven P.
2014-01-01
ABSTRACT Egg-grown influenza vaccine yields are maximized by infection with a seed virus produced by “classical reassortment” of a seasonal isolate with a highly egg-adapted strain. Seed viruses are selected based on a high-growth phenotype and the presence of the seasonal hemagglutinin (HA) and neuraminidase (NA) surface antigens. Retrospective analysis of H3N2 vaccine seed viruses indicated that, unlike other internal proteins that were predominantly derived from the high-growth parent A/Puerto Rico/8/34 (PR8), the polymerase subunit PB1 could be derived from either parent depending on the seasonal strain. We have recently shown that A/Udorn/307/72 (Udorn) models a seasonal isolate that yields reassortants bearing the seasonal PB1 gene. This is despite the fact that the reverse genetics-derived virus that includes Udorn PB1 with Udorn HA and NA on a PR8 background has inferior growth compared to the corresponding virus with PR8 PB1. Here we use competitive plasmid transfections to investigate the mechanisms driving selection of a less fit virus and show that the Udorn PB1 gene segment cosegregates with the Udorn NA gene segment. Analysis of chimeric PB1 genes revealed that the coselection of NA and PB1 segments was not directed through the previously identified packaging sequences but through interactions involving the internal coding region of the PB1 gene. This study identifies associations between viral genes that can direct selection in classical reassortment for vaccine production and which may also be of relevance to the gene constellations observed in past antigenic shift events where creation of a pandemic virus has involved reassortment. IMPORTANCE Influenza vaccine must be produced and administered in a timely manner in order to provide protection during the winter season, and poor-growing vaccine seed viruses can compromise this process. 
To maximize vaccine yields, manufacturers create hybrid influenza viruses with gene segments encoding the surface antigens from a seasonal virus isolate, important for immunity, and others from a virus with high growth properties. This involves coinfection of cells with both parent viruses and selection of dominant progeny bearing the seasonal antigens. We show that this method of creating hybrid viruses does not necessarily select for the best yielding virus because preferential pairing of gene segments when progeny viruses are produced determines the genetic makeup of the hybrids. This not only has implications for how hybrid viruses are selected for vaccine production but also sheds light on what drives and limits hybrid gene combinations that arise in nature, leading to pandemics. PMID:24872588
Cobbin, Joanna C A; Ong, Chi; Verity, Erin; Gilbertson, Brad P; Rockman, Steven P; Brown, Lorena E
2014-08-01
Egg-grown influenza vaccine yields are maximized by infection with a seed virus produced by "classical reassortment" of a seasonal isolate with a highly egg-adapted strain. Seed viruses are selected based on a high-growth phenotype and the presence of the seasonal hemagglutinin (HA) and neuraminidase (NA) surface antigens. Retrospective analysis of H3N2 vaccine seed viruses indicated that, unlike other internal proteins that were predominantly derived from the high-growth parent A/Puerto Rico/8/34 (PR8), the polymerase subunit PB1 could be derived from either parent depending on the seasonal strain. We have recently shown that A/Udorn/307/72 (Udorn) models a seasonal isolate that yields reassortants bearing the seasonal PB1 gene. This is despite the fact that the reverse genetics-derived virus that includes Udorn PB1 with Udorn HA and NA on a PR8 background has inferior growth compared to the corresponding virus with PR8 PB1. Here we use competitive plasmid transfections to investigate the mechanisms driving selection of a less fit virus and show that the Udorn PB1 gene segment cosegregates with the Udorn NA gene segment. Analysis of chimeric PB1 genes revealed that the coselection of NA and PB1 segments was not directed through the previously identified packaging sequences but through interactions involving the internal coding region of the PB1 gene. This study identifies associations between viral genes that can direct selection in classical reassortment for vaccine production and which may also be of relevance to the gene constellations observed in past antigenic shift events where creation of a pandemic virus has involved reassortment. Influenza vaccine must be produced and administered in a timely manner in order to provide protection during the winter season, and poor-growing vaccine seed viruses can compromise this process. 
To maximize vaccine yields, manufacturers create hybrid influenza viruses with gene segments encoding the surface antigens from a seasonal virus isolate, important for immunity, and others from a virus with high growth properties. This involves coinfection of cells with both parent viruses and selection of dominant progeny bearing the seasonal antigens. We show that this method of creating hybrid viruses does not necessarily select for the best yielding virus because preferential pairing of gene segments when progeny viruses are produced determines the genetic makeup of the hybrids. This not only has implications for how hybrid viruses are selected for vaccine production but also sheds light on what drives and limits hybrid gene combinations that arise in nature, leading to pandemics. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Pereira, Joana; Johnson, Warren E.; O’Brien, Stephen J.; Jarvis, Erich D.; Zhang, Guojie; Gilbert, M. Thomas P.; Vasconcelos, Vitor; Antunes, Agostinho
2014-01-01
The Hedgehog (Hh) gene family codes for a class of secreted proteins composed of two active domains that act as signalling molecules during embryo development, namely for the development of the nervous and skeletal systems and the formation of the testis cord. While only one Hh gene is typically found in invertebrate genomes, most vertebrate species have three (Sonic hedgehog – Shh; Indian hedgehog – Ihh; and Desert hedgehog – Dhh), each with different expression patterns and functions, which likely helped promote the increasing complexity of vertebrates and their successful diversification. In this study, we used comparative genomic and adaptive evolutionary analyses to characterize the evolution of the Hh genes in vertebrates following the two major whole genome duplication (WGD) events. To overcome the lack of Hh-coding sequences in publicly available avian databases, we used an extensive dataset of 45 avian and three non-avian reptilian genomes to show that birds have all three Hh paralogs. We find indications that, following the WGD events, vertebrate Hh paralogous genes evolved independently within similar linkage groups and under different evolutionary rates, especially within the catalytic domain. The structural regions around the ion-binding site were identified to be under positive selection in the signaling domain. These findings contrast with those observed in invertebrates, where different lineages that experienced gene duplication retained similar selective constraints in the Hh orthologs. Our results provide new insights on the evolutionary history of the Hh gene family, the functional roles of these paralogs in vertebrate species, and on the location of mutational hotspots. PMID:25549322
The evolution of small insertions and deletions in the coding genes of Drosophila melanogaster.
Chong, Zechen; Zhai, Weiwei; Li, Chunyan; Gao, Min; Gong, Qiang; Ruan, Jue; Li, Juan; Jiang, Lan; Lv, Xuemei; Hungate, Eric; Wu, Chung-I
2013-12-01
Studies of protein evolution have focused on amino acid substitutions, with much less systematic analysis of insertions and deletions (indels) in protein-coding genes. We hence surveyed 7,500 genes between Drosophila melanogaster and D. simulans, using D. yakuba as an outgroup. The evolutionary rate of coding indels is indeed low, at only 3% of that of nonsynonymous substitutions. As coding indels follow a geometric distribution in size and tend to fall in low-complexity regions of proteins, it is unclear whether selection or mutation underlies this low rate. To resolve the issue, we collected genomic sequences from an isogenic African line of D. melanogaster (ZS30) at a high coverage of 70× and analyzed indel polymorphism between ZS30 and the reference genome. In comparing polymorphism and divergence, we found that the divergence to polymorphism ratio (i.e., fixation index) for smaller indels (size ≤ 10 bp) is very similar to that for synonymous changes, suggesting that most of the within-species polymorphism and between-species divergence for indels are selectively neutral. Interestingly, deletions of larger sizes (size ≥ 11 bp and ≤ 30 bp) have a much higher fixation index than synonymous mutations, and 44.4% of fixed middle-sized deletions are estimated to be adaptive. To our surprise, this pattern is not found for insertions. Protein indel evolution appears to be in a dynamic flux of neutrally driven expansion (insertions) together with adaptively driven contraction (deletions), and these observations provide important insights for understanding the fitness of new mutations as well as the evolutionary driving forces for genomic evolution in Drosophila species.
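The comparison of divergence and polymorphism described in the abstract above can be sketched as follows. The fixation index is taken directly from the abstract's definition (divergence divided by polymorphism). The adaptive proportion is computed with a standard McDonald-Kreitman-style estimator, which is an assumption here because the abstract does not say which estimator the authors used. All counts in the usage note are made up for illustration and are not data from the study.

```python
def fixation_index(divergence: int, polymorphism: int) -> float:
    """Divergence-to-polymorphism ratio, as defined in the abstract."""
    return divergence / polymorphism


def proportion_adaptive(d_test: int, p_test: int, d_neutral: int, p_neutral: int) -> float:
    """McDonald-Kreitman-style estimate (assumed estimator, not confirmed by the source).

    alpha = 1 - (D_neutral * P_test) / (D_test * P_neutral)
    which equals 1 - fixation_index(neutral) / fixation_index(test).
    """
    return 1.0 - (d_neutral * p_test) / (d_test * p_neutral)


if __name__ == "__main__":
    # Hypothetical counts: test class = mid-sized deletions, neutral class = synonymous changes.
    print(fixation_index(40, 10))                 # test class
    print(fixation_index(200, 100))               # neutral reference class
    print(proportion_adaptive(40, 10, 200, 100))  # share of fixed test-class changes inferred adaptive
```

When the test class and the neutral class have the same fixation index, the estimate is zero, which matches the abstract's reading of small indels as selectively neutral.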
Naval-Sanchez, Marina; Nguyen, Quan; McWilliam, Sean; Porto-Neto, Laercio R; Tellam, Ross; Vuocolo, Tony; Reverter, Antonio; Perez-Enciso, Miguel; Brauning, Rudiger; Clarke, Shannon; McCulloch, Alan; Zamani, Wahid; Naderi, Saeid; Rezaei, Hamid Reza; Pompanon, Francois; Taberlet, Pierre; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Jhangiani, Shalini N; Cockett, Noelle; Daetwyler, Hans; Kijas, James
2018-02-28
Domestication fundamentally reshaped animal morphology, physiology and behaviour, offering the opportunity to investigate the molecular processes driving evolutionary change. Here we assess sheep domestication and artificial selection by comparing genome sequence from 43 modern breeds (Ovis aries) and their Asian mouflon ancestor (O. orientalis) to identify selection sweeps. Next, we provide a comparative functional annotation of the sheep genome, validated using experimental ChIP-Seq of sheep tissue. Using these annotations, we evaluate the impact of selection and domestication on regulatory sequences and find that sweeps are significantly enriched for protein coding genes, proximal regulatory elements of genes and genome features associated with active transcription. Finally, we find individual sites displaying strong allele frequency divergence are enriched for the same regulatory features. Our data demonstrate that remodelling of gene expression is likely to have been one of the evolutionary forces that drove phenotypic diversification of this common livestock species.
Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili; Liu, Bao; Li, Lin-Feng
2017-09-01
Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Ferrás, Cristina; Oude Vrielink, Joachim AF; Verspuy, Johan WA; te Riele, Hein; Tsaalbi-Shtylik, Anastasia; de Wind, Niels
2009-01-01
A substantial fraction of sporadic and inherited colorectal and endometrial cancers in humans is deficient in DNA mismatch repair (MMR). These cancers are characterized by length alterations in ubiquitous simple sequence repeats, a phenotype called microsatellite instability. Here we have exploited this phenotype by developing a novel approach for the highly selective gene therapy of MMR-deficient tumors. To achieve this selectivity, we mutated the VP22FCU1 suicide gene by inserting an out-of-frame microsatellite within its coding region. We show that in a significant fraction of microsatellite-instable (MSI) cells carrying the mutated suicide gene, full-length protein becomes expressed within a few cell doublings, presumably resulting from a reverting frameshift within the inserted microsatellite. Treatment of these cells with the innocuous prodrug 5-fluorocytosine (5-FC) induces strong cytotoxicity and we demonstrate that this owes to multiple bystander effects conferred by the suicide gene/prodrug combination. In a mouse model, MMR-deficient tumors that contained the out-of-frame VP22FCU1 gene displayed strong remission after treatment with 5-FC, without any obvious adverse systemic effects to the mouse. By virtue of its high selectivity and potency, this conditional enzyme/prodrug combination may hold promise for the treatment or prevention of MMR-deficient cancer in humans. PMID:19471249
Changes in Gene Expression Associated with Reproductive Maturation in Wild Female Baboons
Babbitt, Courtney C.; Tung, Jenny; Wray, Gregory A.; Alberts, Susan C.
2012-01-01
Changes in gene expression during development play an important role in shaping morphological and behavioral differences, including between humans and nonhuman primates. Although many of the most striking developmental changes occur during early development, reproductive maturation represents another critical window in primate life history. However, this process is difficult to study at the molecular level in natural primate populations. Here, we took advantage of ovarian samples made available through an unusual episode of human–wildlife conflict to identify genes that are important in this process. Specifically, we used RNA sequencing (RNA-Seq) to compare genome-wide gene expression patterns in the ovarian tissue of juvenile and adult female baboons from Amboseli National Park, Kenya. We combined this information with prior evidence of selection occurring on two primate lineages (human and chimpanzee). We found that in cases in which genes were both differentially expressed over the course of ovarian maturation and also linked to lineage-specific selection this selective signature was much more likely to occur in regulatory regions than in coding regions. These results suggest that adaptive change in the development of the primate ovary may be largely driven at the mechanistic level by selection on gene regulation, potentially in relationship to the physiology or timing of female reproductive maturation. PMID:22155733
New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.
Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja
2017-02-01
Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. The selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50% in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for a protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60% of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, which is present in the protein complex binding to VNTR3. Our study highlighted two novel motifs in the PAH gene, a promoter KLF1 motif and a 3'-region C/EBPalpha motif, which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to a better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.
Hirose, K; Kawasaki, Y; Kotani, K; Abiko, K; Sato, H
2004-05-01
Quinolone-resistant (QR) mutants of Mycoplasma bovirhinis strain PG43 (type strain) were generated by stepwise selection in increasing concentrations of enrofloxacin (ENR). In these mutants, an alteration was found in the quinolone resistance-determining region (QRDR) of the parC gene coding for the ParC subunit of topoisomerase IV, but not in the gyrA, gyrB, and parE genes coding for the GyrA and GyrB subunits of DNA gyrase and the ParE subunit of topoisomerase IV. Similarly, such an alteration in the QRDR of parC was found in field isolates of M. bovirhinis, which possessed various levels of QR. The substitution of leucine (Leu) by serine (Ser) at position 80 of the QRDR of ParC was observed in both QR-mutants and QR-isolates. This is the first report of QR based on a point mutation of the parC gene in M. bovirhinis.
Patterns of microsatellite evolution inferred from the Helianthus annuus (Asteraceae) transcriptome.
Pramod, Sreepriya; Perkins, Andy D; Welch, Mark E
2014-08-01
The distribution of microsatellites in exons, and their association with gene ontology (GO) terms is explored to elucidate patterns of microsatellite evolution in the common sunflower, Helianthus annuus. The relative position, motif, size and level of impurity were estimated for each microsatellite in the unigene database available from the Compositae Genome Project (CGP), and statistical analyses were performed to determine if differences in microsatellite distributions and enrichment within certain GO terms were significant. There are more translated than untranslated microsatellites, implying that many bring about structural changes in proteins. However, the greatest density is observed within the UTRs, particularly 5'UTRs. Further, UTR microsatellites are purer and longer than coding region microsatellites. This suggests that UTR microsatellites are either younger and under more relaxed constraints, or that purifying selection limits impurities, and directional selection favours their expansion. GOs associated with response to various environmental stimuli including water deprivation and salt stress were significantly enriched with microsatellites. This may suggest that these GOs are more labile in plant genomes, or that selection has favoured the maintenance of microsatellites in these genes over others. This study shows that the distribution of transcribed microsatellites in H. annuus is nonrandom, the coding region microsatellites are under greater constraint compared to the UTR microsatellites, and that these sequences are enriched within genes that regulate plant responses to environmental stress and stimuli.
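As an illustration of how microsatellites can be located in transcript sequences and assigned to a UTR or the coding region, the sketch below finds perfect (pure) repeats with a regular expression. It is a simplified stand-in, not the pipeline used in the study above: the function names, the motif-length limit of 6 bp, the minimum of four repeats, and the example sequence are all assumptions, and imperfect repeats are not handled.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4, max_motif: int = 6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex prefer the shortest motif that
    satisfies the repeat requirement at each position.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def region_of(position: int, cds_start: int, cds_end: int) -> str:
    """Classify a 0-based position relative to a CDS spanning [cds_start, cds_end)."""
    if position < cds_start:
        return "5'UTR"
    if position < cds_end:
        return "CDS"
    return "3'UTR"


if __name__ == "__main__":
    transcript = "GGCATATATATCC"  # hypothetical sequence
    for start, motif, count in find_microsatellites(transcript):
        print(start, motif, count, region_of(start, cds_start=10, cds_end=13))
```

Counting hits per region and dividing by region length would give the kind of density comparison the abstract describes between UTRs and coding sequence.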
Diehl, William E.; Johnson, Welkin E.; Hunter, Eric
2013-01-01
All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500
Kou, Shu-Jun; Wu, Xiao-Meng; Liu, Zheng; Liu, Yuan-Long; Xu, Qiang; Guo, Wen-Wu
2012-12-01
miRNAs have recently been reported to modulate somatic embryogenesis (SE), a key pathway of plant regeneration in vitro. For expression level detection and subsequent function dissection of miRNAs in certain biological processes, qRT-PCR is one of the most effective and sensitive techniques, for which suitable reference gene selection is a prerequisite. In this study, three miRNAs and eight non-coding RNAs (ncRNA) were selected as reference candidates, and their expression stability was inspected in developing citrus SE tissues cultured at 20, 25, and 30 °C. Stability of the eight non-miRNA ncRNAs was further validated in five adult tissues without temperature treatment. The best single reference gene for SE tissues was snoR14 or snoRD25, while for the adult tissues the best one was U4; although they were not as stable as the optimal multiple references snoR14 + U6 for SE tissues and snoR14 + U5 for adult tissues. For expression normalization of less abundant miRNAs in SE tissues, miR3954 was assessed as a viable reference. Single reference gene snoR14 outperformed multiple references for the overall SE and adult tissues. As one of the pioneer systematic studies on reference gene identification for plant miRNA normalization, this study benefits future exploration on miRNA function in citrus and provides valuable information for similar studies in other higher plants. Three miRNAs and eight non-coding RNAs were tested as reference candidates on developing citrus SE tissues. Best single references snoR14 or snoRD25 and optimal multiple references snoR14 + U6, snoR14 + U5 were identified.
Chen, Shanyuan; Gomes, Rui; Costa, Vânia; Santos, Pedro; Charneca, Rui; Zhang, Ya-ping; Liu, Xue-hong; Wang, Shao-qing; Bento, Pedro; Nunes, Jose-Luis; Buzgó, József; Varga, Gyula; Anton, István; Zsolnai, Attila; Beja-Pereira, Albano
2013-10-01
The coexistence of wild boars and domestic pigs across Eurasia makes it feasible to conduct comparative genetic or genomic analyses for addressing how genetically different a domestic species is from its wild ancestor. To test whether there are differences in patterns of genetic variability between wild and domestic pigs at immunity-related genes and to detect outlier loci putatively under selection that may underlie differences in immune responses, we analyzed 54 single-nucleotide polymorphisms (SNPs) of 19 immunity-related candidate genes on 11 autosomes in three pairs of wild boar and domestic pig populations from China, the Iberian Peninsula, and Hungary. Our results showed no statistically significant differences in allele frequency and heterozygosity across SNPs between the three pairs of wild and domestic populations. This observation was most likely due to the widespread and long-lasting gene flow between wild boars and domestic pigs across Eurasia. In addition, three outlier tests (BayeScan2.1, FDIST2, and Arlequin3.5) consistently detected eight coding SNPs from six genes as outliers under selection. Among four non-synonymous outlier SNPs, one from the TLR4 gene was identified as being subject to positive (diversifying) selection, and three (one each from the CD36, IFNW1, and IL1B genes) were suggested to be under balancing selection. All four of these non-synonymous variants were predicted to be benign by PolyPhen-2. Our results were supported by other independent lines of evidence for positive selection or balancing selection acting on these four immune genes (CD36, IFNW1, IL1B, and TLR4). Our study provides an example of applying a candidate gene approach to identify functionally important mutations (i.e., outlier loci) in wild and domestic pigs for subsequent functional experiments.
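The outlier tests named in this abstract all build on a measure of allele-frequency differentiation between populations. As a minimal sketch only, the basic Wright's F_ST for a single biallelic SNP can be computed from per-population allele frequencies as (H_T - H_S) / H_T. The function name, the equal weighting of populations, and the example frequencies are assumptions for illustration; BayeScan, FDIST2, and Arlequin use more elaborate models that are not reproduced here.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population
           (populations are weighted equally, an assumption of this sketch).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity of the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_t - h_s) / h_t


# Hypothetical frequencies for a wild and a domestic population.
print(fst_biallelic([0.5, 0.5]))  # identical frequencies
print(fst_biallelic([0.2, 0.8]))  # strongly differentiated
```

A SNP whose F_ST is unusually high relative to the other loci is a candidate for diversifying selection, and an unusually low value is a candidate for balancing selection; the significance thresholds depend on the specific outlier method.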
Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong
2017-01-01
Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. 
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no less than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria. PMID:29038765
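The quadripartite structure described above implies that the total genome length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short check below uses only the figures quoted in the abstract; the variable names are illustrative.

```python
# Region lengths in bp, as quoted in the abstract.
lsc_bp = 85_531   # large single-copy region
ssc_bp = 18_146   # small single-copy region
ir_bp = 25_936    # one inverted repeat; the genome carries a pair

# Quadripartite structure: LSC + SSC + 2 x IR.
total_bp = lsc_bp + ssc_bp + 2 * ir_bp
print(total_bp)  # compare with the reported genome length of 155,549 bp
```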
A model of directional selection applied to the evolution of drug resistance in HIV-1.
Seoighe, Cathal; Ketwaroo, Farahnaz; Pillay, Visva; Scheffler, Konrad; Wood, Natasha; Duffet, Rodger; Zvelebil, Marketa; Martinson, Neil; McIntyre, James; Morris, Lynn; Hide, Winston
2007-04-01
Understanding how pathogens acquire resistance to drugs is important for the design of treatment strategies, particularly for rapidly evolving viruses such as HIV-1. Drug treatment can exert strong selective pressures and sites within targeted genes that confer resistance frequently evolve far more rapidly than the neutral rate. Rapid evolution at sites that confer resistance to drugs can be used to help elucidate the mechanisms of evolution of drug resistance and to discover or corroborate novel resistance mutations. We have implemented standard maximum likelihood methods that are used to detect diversifying selection and adapted them for use with serially sampled reverse transcriptase (RT) coding sequences isolated from a group of 300 HIV-1 subtype C-infected women before and after single-dose nevirapine (sdNVP) to prevent mother-to-child transmission. We have also extended the standard models of codon evolution for application to the detection of directional selection. Through simulation, we show that the directional selection model can provide a substantial improvement in sensitivity over models of diversifying selection. Five of the sites within the RT gene that are known to harbor mutations that confer resistance to nevirapine (NVP) strongly supported the directional selection model. There was no evidence that other mutations that are known to confer NVP resistance were selected in this cohort. The directional selection model, applied to serially sampled sequences, also had more power than the diversifying selection model to detect selection resulting from factors other than drug resistance. Because inference of selection from serial samples is unlikely to be adversely affected by recombination, the methods we describe may have general applicability to the analysis of positive selection affecting recombining coding sequences when serially sampled data are available.
HERC1 polymorphisms: population-specific variations in haplotype composition.
Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen
2009-08-01
Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. (c) 2009 John Wiley & Sons, Ltd.
Inverting the parameters of an earthquake-ruptured fault with a genetic algorithm
NASA Astrophysics Data System (ADS)
Yu, Ting-To; Fernàndez, Josè; Rundle, John B.
1998-03-01
Natural selection is the spirit of the genetic algorithm (GA): good genes in the current generation are kept so that better offspring are produced during evolution. The crossover function ensures the inheritance of good genes from parent to offspring. Meanwhile, the process of mutation creates a special gene, the character of which does not exist in the parent generation. A program based on genetic algorithms and written in C is constructed to invert the parameters of an earthquake-ruptured fault. The verification and application of this code are shown to demonstrate its capabilities. It is determined that this code is able to find the global extremum and can be used to solve more practical problems with constraints gathered from other sources. It is shown that the GA is superior to other inversion schemes in many aspects. This easy-to-handle yet powerful algorithm should have many suitable applications in the field of geosciences.
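The three ingredients named in this abstract (selection, crossover, mutation) can be sketched in a few lines. The following is a generic real-coded GA written in Python for illustration, not the authors' C program; the function names, the toy misfit, the parameter bounds, and the "true" parameter values are all hypothetical stand-ins for a real fault-inversion misfit.

```python
import random


def genetic_minimize(misfit, bounds, pop_size=40, generations=60,
                     mutation_rate=0.1, seed=0):
    """Minimize misfit(params) over box bounds with a simple GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best misfit at the start of each generation
    for _ in range(generations):
        population.sort(key=misfit)
        history.append(misfit(population[0]))
        # Selection: only the better half may reproduce.
        parents = population[: pop_size // 2]
        # Elitism: the best individual survives unchanged.
        children = [population[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each gene is inherited from one parent.
            child = [m if rng.random() < 0.5 else f
                     for m, f in zip(mother, father)]
            # Mutation: occasionally draw a gene value absent from both parents.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = children
    best = min(population, key=misfit)
    return best, misfit(best), history


# Hypothetical two-parameter problem (e.g. fault length in km, dip in degrees).
TRUE_PARAMS = [10.0, 45.0]


def toy_misfit(params):
    return sum((p - t) ** 2 for p, t in zip(params, TRUE_PARAMS))


best, value, history = genetic_minimize(toy_misfit, [(0.0, 50.0), (0.0, 90.0)])
print(best, value)
```

Because the best individual is always carried over, the best misfit never gets worse from one generation to the next; whether the search reaches the global extremum for a real fault model depends on the misfit surface and the settings chosen.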
Sex-biased transcriptome divergence along a latitudinal gradient.
Allen, Scott L; Bonduriansky, Russell; Sgro, Carla M; Chenoweth, Stephen F
2017-03-01
Sex-dependent gene expression is likely an important genomic mechanism that allows sex-specific adaptation to environmental changes. Among Drosophila species, sex-biased genes display remarkably consistent evolutionary patterns; male-biased genes evolve faster than unbiased genes in both coding sequence and expression level, suggesting sex differences in selection through time. However, comparatively little is known of the evolutionary process shaping sex-biased expression within species. Latitudinal clines offer an opportunity to examine how changes in key ecological parameters also influence sex-specific selection and the evolution of sex-biased gene expression. We assayed male and female gene expression in Drosophila serrata along a latitudinal gradient in eastern Australia spanning most of its endemic distribution. Analysis of 11 631 genes across eight populations revealed strong sex differences in the frequency, mode and strength of divergence. Divergence was far stronger in males than females and while latitudinal clines were evident in both sexes, male divergence was often population specific, suggesting responses to localized selection pressures that do not covary predictably with latitude. While divergence was enriched for male-biased genes, there was no overrepresentation of X-linked genes in males. By contrast, X-linked divergence was elevated in females, especially for female-biased genes. Many genes that diverged in D. serrata have homologs also showing latitudinal divergence in Drosophila simulans and Drosophila melanogaster on other continents, likely indicating parallel adaptation in these distantly related species. Our results suggest that sex differences in selection play an important role in shaping the evolution of gene expression over macro- and micro-ecological spatial scales. © 2017 John Wiley & Sons Ltd.
Subramanian, Sankar; Lingala, Syamala Gowri; Swaminathan, Siva; Huynen, Leon; Lambert, David
2014-08-01
The complete mitochondrial genome of the Chinstrap penguin (Pygoscelis antarcticus) was sequenced and compared with other penguin mitogenomes. The genome is 15,972 bp in length with the number and order of protein coding genes and RNAs being very similar to that of other known penguin mitogenomes. Comparative nucleotide analysis showed the Chinstrap mitogenome shares 94% homology with the mitogenome of its sister species, Pygoscelis adelie (Adélie penguin). Divergence at nonsynonymous nucleotide positions was found to be up to 23 times less than that observed in synonymous positions of protein coding genes, suggesting high selection constraints. The complete mitogenome data will be useful for genetic and evolutionary studies of penguins.
Adaptive Evolution Is Substantially Impeded by Hill-Robertson Interference in Drosophila.
Castellano, David; Coronado-Zamora, Marta; Campos, Jose L; Barbadilla, Antonio; Eyre-Walker, Adam
2016-02-01
Hill-Robertson interference (HRi) is expected to reduce the efficiency of natural selection when two or more linked selected sites do not segregate freely, but no attempt has been made so far to quantify the overall impact of HRi on the rate of adaptive evolution for any given genome. In this work, we estimate how much HRi impedes the rate of adaptive evolution in the coding genome of Drosophila melanogaster. We compiled a data set of 6,141 autosomal protein-coding genes from Drosophila, from which polymorphism levels in D. melanogaster and divergence out to D. yakuba were estimated. The rate of adaptive evolution was calculated using a derivative of the McDonald-Kreitman test that controls for slightly deleterious mutations. We find that the rate of adaptive amino acid substitution at a given position of the genome is positively correlated to both the rate of recombination and the mutation rate, and negatively correlated to the gene density of the region. These correlations are robust to controlling for each other, for synonymous codon bias and for gene functions related to immune response and testes. We show that HRi diminishes the rate of adaptive evolution by approximately 27%. Interestingly, genes with low mutation rates embedded in gene poor regions lose approximately 17% of their adaptive substitutions whereas genes with high mutation rates embedded in gene rich regions lose approximately 60%. We conclude that HRi hampers the rate of adaptive evolution in Drosophila and that the variation in recombination, mutation, and gene density along the genome affects the HRi effect. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
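For orientation, the basic McDonald-Kreitman estimator of the proportion of adaptive amino acid substitutions is alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are nonsynonymous and synonymous divergence counts and Pn and Ps the corresponding polymorphism counts. The sketch below implements only this basic form; the study used a derivative that also controls for slightly deleterious mutations, which is not reproduced here. The function name and the example counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the adaptive substitution fraction.

    dn, ds: nonsynonymous / synonymous fixed differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts: an excess of nonsynonymous divergence gives alpha > 0.
print(mk_alpha(dn=40, ds=100, pn=20, ps=100))
# Equal ratios (dn/ds == pn/ps) are the neutral expectation, alpha = 0.
print(mk_alpha(dn=10, ds=50, pn=10, ps=50))
```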
Blochlinger, K; Diggelmann, H
1984-12-01
The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells.
Blochlinger, K; Diggelmann, H
1984-01-01
The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells. PMID:6098829
Genes and Junk in Plant Mitochondria—Repair Mechanisms and Selection
Christensen, Alan C.
2014-01-01
Plant mitochondrial genomes have very low mutation rates. In contrast, they also rearrange and expand frequently. This is easily understood if DNA repair in genes is accomplished by accurate mechanisms, whereas less accurate mechanisms including nonhomologous end joining or break-induced replication are used in nongenes. An important question is how different mechanisms of repair predominate in coding and noncoding DNA, although one possible mechanism is transcription-coupled repair (TCR). This work tests the predictions of TCR and finds no support for it. Examination of the mutation spectra and rates in genes and junk reveals what DNA repair mechanisms are available to plant mitochondria, and what selective forces act on the repair products. A model is proposed that mismatches and other DNA damages are repaired by converting them into double-strand breaks (DSBs). These can then be repaired by any of the DSB repair mechanisms, both accurate and inaccurate. Natural selection will eliminate coding regions repaired by inaccurate mechanisms, accounting for the low mutation rates in genes, whereas mutations, rearrangements, and expansions generated by inaccurate repair in noncoding regions will persist. Support for this model includes the structure of the mitochondrial mutS homolog in plants, which is fused to a double-strand endonuclease. The model proposes that plant mitochondria do not distinguish a damaged or mismatched DNA strand from the undamaged strand, they simply cut both strands and perform homology-based DSB repair. This plant-specific strategy for protecting future generations from mitochondrial DNA damage has the side effect of genome expansions and rearrangements. PMID:24904012
Improved Sparse Multi-Class SVM and Its Application for Gene Selection in Cancer Classification
Huang, Lingkang; Zhang, Hao Helen; Zeng, Zhao-Bang; Bushel, Pierre R.
2013-01-01
Background: Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables versus the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high dimensional low sample size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure utilizes all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties in learning to enforce solution sparsity. Results: The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improved the method by introducing soft-thresholding type penalties to incorporate variable selection into multi-class classification for high dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among different variable selection schemes. Conclusions: High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and defining targets of therapeutic intervention. Availability: The MATLAB source code is available from http://math.arizona.edu/~hzhang/software.html. PMID:23966761
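To show why soft-thresholding type penalties lead to gene selection, the sketch below applies the standard soft-thresholding operator, S(w, lam) = sign(w) * max(|w| - lam, 0), to a small table of per-class gene weights. This is a generic illustration in Python, not the authors' MATLAB code or their exact penalty; the gene names, weights, and threshold are hypothetical.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; values within [-lam, lam] become exactly 0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


def select_genes(weights, lam):
    """Return shrunken weights and the genes that keep a nonzero weight.

    weights: dict mapping gene name -> list of per-class weights.
    """
    shrunk = {gene: [soft_threshold(w, lam) for w in ws]
              for gene, ws in weights.items()}
    selected = [gene for gene, ws in shrunk.items() if any(w != 0.0 for w in ws)]
    return shrunk, selected


# Hypothetical weights for two genes across two tumor classes.
weights = {"geneA": [1.5, -2.0], "geneB": [0.1, -0.2]}
shrunk, selected = select_genes(weights, lam=0.5)
print(shrunk)
print(selected)
```

Genes whose weights are all shrunk to exactly zero drop out of the classification rule, which is how a shrinkage penalty performs variable selection while the classifier is being fitted.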
Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants.
Fu, Wenqing; O'Connor, Timothy D; Jun, Goo; Kang, Hyun Min; Abecasis, Goncalo; Leal, Suzanne M; Gabriel, Stacey; Rieder, Mark J; Altshuler, David; Shendure, Jay; Nickerson, Deborah A; Bamshad, Michael J; Akey, Joshua M
2013-01-10
Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.
The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.
Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R
1982-01-01
The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
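The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the 1251 bp reading frame includes the 3 bp stop codon (416 codons x 3 = 1248 bp) and that the transcript spans from the transcription start to the transcription stop; both are inferences from the quoted figures rather than statements in the source, and the variable names are illustrative.

```python
orf_bp = 1251            # reading frame length quoted in the abstract
stop_codon_bp = 3        # assumption: the reading frame includes the stop codon
five_prime_utr = 36      # transcription starts 36 nt upstream of the start codon
three_prime_utr = (86, 93)  # transcription stops 86-93 nt downstream of the stop

# Number of encoded amino acids.
amino_acids = (orf_bp - stop_codon_bp) // 3
print(amino_acids)

# Shortest and longest non-polyadenylated transcript.
mrna_range = tuple(five_prime_utr + orf_bp + tail for tail in three_prime_utr)
print(mrna_range)
```

Under these assumptions the results agree with the 416 amino acids and the 1373 to 1380 nucleotide mRNA length given in the abstract.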
del Val, Coral; Rivas, Elena; Torres-Quesada, Omar; Toro, Nicolás; Jiménez-Zurdo, José I
2007-01-01
Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related α-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5′-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of α-proteobacteria with their eukaryotic hosts. PMID:17971083
Lassa-Vesicular Stomatitis Chimeric Virus Safely Destroys Brain Tumors
Wollmann, Guido; Drokhlyansky, Eugene; Davis, John N.; Cepko, Connie
2015-01-01
ABSTRACT High-grade tumors in the brain are among the deadliest of cancers. Here, we took a promising oncolytic virus, vesicular stomatitis virus (VSV), and tested the hypothesis that the neurotoxicity associated with the virus could be eliminated without blocking its oncolytic potential in the brain by replacing the neurotropic VSV glycoprotein with the glycoprotein from one of five different viruses, including Ebola virus, Marburg virus, lymphocytic choriomeningitis virus (LCMV), rabies virus, and Lassa virus. Based on in vitro infections of normal and tumor cells, we selected two viruses to test in vivo. Wild-type VSV was lethal when injected directly into the brain. In contrast, a novel chimeric virus (VSV-LASV-GPC) containing genes from both the Lassa virus glycoprotein precursor (GPC) and VSV showed no adverse actions within or outside the brain and targeted and completely destroyed brain cancer, including high-grade glioblastoma and melanoma, even in metastatic cancer models. When mice had two brain tumors, intratumoral VSV-LASV-GPC injection in one tumor (glioma or melanoma) led to complete tumor destruction; importantly, the virus moved contralaterally within the brain to selectively infect the second noninjected tumor. A chimeric virus combining VSV genes with the gene coding for the Ebola virus glycoprotein was safe in the brain and also selectively targeted brain tumors but was substantially less effective in destroying brain tumors and prolonging survival of tumor-bearing mice. A tropism for multiple cancer types combined with an exquisite tumor specificity opens a new door to widespread application of VSV-LASV-GPC as a safe and efficacious oncolytic chimeric virus within the brain. IMPORTANCE Many viruses have been tested for their ability to target and kill cancer cells. Vesicular stomatitis virus (VSV) has shown substantial promise, but a key problem is that if it enters the brain, it can generate adverse neurologic consequences, including death. 
We tested a series of chimeric viruses containing genes coding for VSV, together with a gene coding for the glycoprotein from other viruses, including Ebola virus, Lassa virus, LCMV, rabies virus, and Marburg virus, which was substituted for the VSV glycoprotein gene. Ebola and Lassa chimeric viruses were safe in the brain and targeted brain tumors. Lassa-VSV was particularly effective, showed no adverse side effects even when injected directly into the brain, and targeted and destroyed two different types of deadly brain cancer, including glioblastoma and melanoma. PMID:25878115
Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua
2017-10-01
Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detection of the signature of selection in genomic regions can provide insights for understanding the function of specific phenotypes. It is generally assumed that laboratory mice experience intense artificial selection, whereas wild mice experience more natural selection. However, the differences in selection signatures in the mouse genome and in the underlying genes between wild and laboratory mice remain unclear. In this study, we used two mouse populations, chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and mouse genome project (MGP) sequenced inbred strains, and two selection detection statistics, Fst and Tajima's D, to identify signatures of selection on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions, which encompass 7.215 Mb, were identified in the C1SLs by the Tajima's D approach. For the MGP, we identified nearly twice as many selection regions (243) as in the C1SLs, accounting for 13.27 Mb of Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment, including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems, which align well with the phenotypic differences between wild and laboratory mice. Therefore, the findings should be helpful for understanding the phenotypic differences between wild and laboratory mice and for applying this new mouse resource (C1SLs) in further genetic studies.
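Tajima's D, one of the two statistics used in this study, compares the mean number of pairwise differences (pi) with the number of segregating sites S scaled by a1 = sum(1/i for i = 1..n-1), and normalizes the difference by its expected standard deviation. The following is a minimal sketch of the standard formula for a single window of aligned haplotypes; the function name and the toy haplotypes are hypothetical, missing data are not handled, and the window scanning used in the study is not reproduced.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for aligned, equal-length haplotype strings.

    Returns None when no site is variable. Needs at least 4 sequences,
    because the variance term is zero for smaller samples.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if any(len(h) != len(haplotypes[0]) for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    # S: number of segregating (variable) sites.
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy windows: only rare variants versus only intermediate-frequency variants.
rare = ["1000", "0100", "0010", "0001"]
balanced = ["1111", "1111", "0000", "0000"]
print(tajimas_d(rare), tajimas_d(balanced))
```

An excess of rare variants gives negative D, as expected after a selective sweep or population expansion, while an excess of intermediate-frequency variants gives positive D, as expected under balancing selection or population structure.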
Genome-Wide Analyses Reveal Genes Subject to Positive Selection in Pasteurella multocida
Cao, Peili; Guo, Dongchun; Liu, Jiasen; Jiang, Qian; Xu, Zhuofei; Qu, Liandong
2017-01-01
Pasteurella multocida, a Gram-negative opportunistic pathogen, causes a broad range of diseases in mammals and birds, including fowl cholera in poultry, pneumonia and atrophic rhinitis in swine and rabbit, hemorrhagic septicemia in cattle, and bite infections in humans. In order to better interpret the genetic diversity and adaptive evolution of this pathogen, seven genomes of P. multocida strains isolated from fowls, rabbit and pigs were determined using a high-throughput sequencing approach. Together with publicly available P. multocida genomes, evolutionary features were systematically analyzed in this study. Clustering of 70,565 protein-coding genes showed that the pangenome of 33 P. multocida strains was composed of 1,602 core genes, 1,364 dispensable genes, and 1,070 strain-specific genes. Of these, we identified a full spectrum of genes related to virulence factors and revealed genetic diversity of these potential virulence markers across P. multocida strains, e.g., bcbAB, fcbC, lipA, bexDCA, ctrCD, lgtA, lgtC, lic2A involved in biogenesis of surface polysaccharides, hsf encoding autotransporter adhesin, and fhaB encoding filamentous haemagglutinin. Furthermore, based on genome-wide positive selection scanning, a total of 35 genes were subject to strong selection pressure. Extensive analyses of protein subcellular location indicated that membrane-associated genes were highly abundant among all positively selected genes. The detected amino acid sites undergoing adaptive selection were preferentially located in extracellular space, perhaps associated with bacterial evasion of host immune responses. Our findings shed more light on conservation and distribution of virulence-associated genes across P. multocida strains. Meanwhile, this study provides a genetic context for future research on the mechanism of adaptive evolution in P. multocida. PMID:28611758
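The split of a pangenome into core, dispensable, and strain-specific genes can be sketched from a presence table of gene clusters. The definitions used below (core = present in every strain, strain-specific = present in exactly one, dispensable = everything in between) are the conventional ones and are an assumption here, since the abstract does not state its thresholds; the function name, cluster names, and strain names are hypothetical.

```python
def partition_pangenome(presence, n_strains):
    """Count core, dispensable, and strain-specific gene clusters.

    presence: dict mapping cluster id -> set of strains carrying that cluster.
    """
    counts = {"core": 0, "dispensable": 0, "strain_specific": 0}
    for strains in presence.values():
        k = len(strains)
        if k == n_strains:
            counts["core"] += 1             # found in every strain
        elif k == 1:
            counts["strain_specific"] += 1  # found in a single strain
        else:
            counts["dispensable"] += 1      # found in some but not all strains
    return counts


# Toy presence table for three hypothetical strains.
presence = {
    "c1": {"s1", "s2", "s3"},
    "c2": {"s1", "s2"},
    "c3": {"s3"},
    "c4": {"s1", "s2", "s3"},
}
toy_counts = partition_pangenome(presence, n_strains=3)
print(toy_counts)

# Sum of the three categories reported in the abstract.
reported_total = 1602 + 1364 + 1070
print(reported_total)
```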
Zhang, Bo; Zhang, Yan-Hong; Wang, Xin; Zhang, Hui-Xian; Lin, Qiang
2017-07-01
The deep sea is one of the most extensive ecosystems on earth. Organisms living there survive in an extremely harsh environment, and their mitochondrial energy metabolism might be a result of evolution. As one of the most important organelles, mitochondria generate energy through energy metabolism and play an important role in almost all biological activities. In this study, the mitogenome of a deep-sea sea anemone (Bolocera sp.) was sequenced and characterized. Like other metazoans, it contained 13 energy pathway protein-coding genes and two ribosomal RNAs. However, it also exhibited some unique features: just two transfer RNA genes, two group I introns, two transposon-like noncanonical open reading frames (ORFs), and a control region-like (CR-like) element. All of the mitochondrial genes were coded by the same strand (the H-strand). The genetic order and orientation were identical to those of most sequenced actiniarians. Phylogenetic analyses showed that this species was closely related to Bolocera tuediae. Positive selection analysis showed that three residues (31 L and 42 N in ATP6, 570 S in ND5) of Bolocera sp. were positively selected sites. By comparing these features with those of shallow sea anemone species, we deduced that these novel gene features may influence the activity of mitochondrial genes. This study may provide some clues regarding the adaptation of Bolocera sp. to the deep-sea environment.
Qian, Yamin; Fang, Tao; Shen, Bin; Zhang, Shuyi
2014-01-01
Frugivorous and nectarivorous bats rely largely on hepatic glycogenesis and glycogenolysis for postprandial blood glucose disposal and maintenance of glucose homeostasis during short-term starvation, respectively. The glycogen synthase 2 encoded by the Gys2 gene plays a critical role in liver glycogen synthesis. To test whether the Gys2 gene has undergone adaptive evolution in bats with carbohydrate-rich diets in relation to their insect-eating sister taxa, we sequenced the coding region of the Gys2 gene in a number of bat species, including three Old World fruit bats (OWFBs) (Pteropodidae) and two New World fruit bats (NWFBs) (Phyllostomidae). Our results showed that the Gys2 coding sequences are highly conserved across all bat species we examined, and no evidence of positive selection was detected in the ancestral branches leading to OWFBs and NWFBs. Our explicit convergence test showed that posterior probabilities of convergence between several branches of OWFBs and the NWFBs were markedly higher than those of divergence. Three parallel amino acid substitutions (Q72H, K371Q, and E666D) were detected among branches of OWFBs and NWFBs. Tests for parallel evolution showed that two parallel substitutions (Q72H and E666D) were driven by natural selection, while K371Q was more likely to have been fixed randomly. Thus, our results suggest that the Gys2 gene has undergone parallel evolution at the amino acid level between OWFBs and NWFBs in relation to their carbohydrate metabolism.
Leung, Wilson; Shaffer, Christopher D.; Reed, Laura K.; Smith, Sheryl T.; Barshop, William; Dirkes, William; Dothager, Matthew; Lee, Paul; Wong, Jeannette; Xiong, David; Yuan, Han; Bedard, James E. J.; Machone, Joshua F.; Patterson, Seantay D.; Price, Amber L.; Turner, Bryce A.; Robic, Srebrenka; Luippold, Erin K.; McCartha, Shannon R.; Walji, Tezin A.; Walker, Chelsea A.; Saville, Kenneth; Abrams, Marita K.; Armstrong, Andrew R.; Armstrong, William; Bailey, Robert J.; Barberi, Chelsea R.; Beck, Lauren R.; Blaker, Amanda L.; Blunden, Christopher E.; Brand, Jordan P.; Brock, Ethan J.; Brooks, Dana W.; Brown, Marie; Butzler, Sarah C.; Clark, Eric M.; Clark, Nicole B.; Collins, Ashley A.; Cotteleer, Rebecca J.; Cullimore, Peterson R.; Dawson, Seth G.; Docking, Carter T.; Dorsett, Sasha L.; Dougherty, Grace A.; Downey, Kaitlyn A.; Drake, Andrew P.; Earl, Erica K.; Floyd, Trevor G.; Forsyth, Joshua D.; Foust, Jonathan D.; Franchi, Spencer L.; Geary, James F.; Hanson, Cynthia K.; Harding, Taylor S.; Harris, Cameron B.; Heckman, Jonathan M.; Holderness, Heather L.; Howey, Nicole A.; Jacobs, Dontae A.; Jewell, Elizabeth S.; Kaisler, Maria; Karaska, Elizabeth A.; Kehoe, James L.; Koaches, Hannah C.; Koehler, Jessica; Koenig, Dana; Kujawski, Alexander J.; Kus, Jordan E.; Lammers, Jennifer A.; Leads, Rachel R.; Leatherman, Emily C.; Lippert, Rachel N.; Messenger, Gregory S.; Morrow, Adam T.; Newcomb, Victoria; Plasman, Haley J.; Potocny, Stephanie J.; Powers, Michelle K.; Reem, Rachel M.; Rennhack, Jonathan P.; Reynolds, Katherine R.; Reynolds, Lyndsey A.; Rhee, Dong K.; Rivard, Allyson B.; Ronk, Adam J.; Rooney, Meghan B.; Rubin, Lainey S.; Salbert, Luke R.; Saluja, Rasleen K.; Schauder, Taylor; Schneiter, Allison R.; Schulz, Robert W.; Smith, Karl E.; Spencer, Sarah; Swanson, Bryant R.; Tache, Melissa A.; Tewilliager, Ashley A.; Tilot, Amanda K.; VanEck, Eve; Villerot, Matthew M.; Vylonis, Megan B.; Watson, David T.; Wurzler, Juliana A.; Wysocki, Lauren M.; Yalamanchili, 
Monica; Zaborowicz, Matthew A.; Emerson, Julia A.; Ortiz, Carlos; Deuschle, Frederic J.; DiLorenzo, Lauren A.; Goeller, Katie L.; Macchi, Christopher R.; Muller, Sarah E.; Pasierb, Brittany D.; Sable, Joseph E.; Tucci, Jessica M.; Tynon, Marykathryn; Dunbar, David A.; Beken, Levent H.; Conturso, Alaina C.; Danner, Benjamin L.; DeMichele, Gabriella A.; Gonzales, Justin A.; Hammond, Maureen S.; Kelley, Colleen V.; Kelly, Elisabeth A.; Kulich, Danielle; Mageeney, Catherine M.; McCabe, Nikie L.; Newman, Alyssa M.; Spaeder, Lindsay A.; Tumminello, Richard A.; Revie, Dennis; Benson, Jonathon M.; Cristostomo, Michael C.; DaSilva, Paolo A.; Harker, Katherine S.; Jarrell, Jenifer N.; Jimenez, Luis A.; Katz, Brandon M.; Kennedy, William R.; Kolibas, Kimberly S.; LeBlanc, Mark T.; Nguyen, Trung T.; Nicolas, Daniel S.; Patao, Melissa D.; Patao, Shane M.; Rupley, Bryan J.; Sessions, Bridget J.; Weaver, Jennifer A.; Goodman, Anya L.; Alvendia, Erica L.; Baldassari, Shana M.; Brown, Ashley S.; Chase, Ian O.; Chen, Maida; Chiang, Scott; Cromwell, Avery B.; Custer, Ashley F.; DiTommaso, Tia M.; El-Adaimi, Jad; Goscinski, Nora C.; Grove, Ryan A.; Gutierrez, Nestor; Harnoto, Raechel S.; Hedeen, Heather; Hong, Emily L.; Hopkins, Barbara L.; Huerta, Vilma F.; Khoshabian, Colin; LaForge, Kristin M.; Lee, Cassidy T.; Lewis, Benjamin M.; Lydon, Anniken M.; Maniaci, Brian J.; Mitchell, Ryan D.; Morlock, Elaine V.; Morris, William M.; Naik, Priyanka; Olson, Nicole C.; Osterloh, Jeannette M.; Perez, Marcos A.; Presley, Jonathan D.; Randazzo, Matt J.; Regan, Melanie K.; Rossi, Franca G.; Smith, Melanie A.; Soliterman, Eugenia A.; Sparks, Ciani J.; Tran, Danny L.; Wan, Tiffany; Welker, Anne A.; Wong, Jeremy N.; Sreenivasan, Aparna; Youngblom, Jim; Adams, Andrew; Alldredge, Justin; Bryant, Ashley; Carranza, David; Cifelli, Alyssa; Coulson, Kevin; Debow, Calise; Delacruz, Noelle; Emerson, Charlene; Farrar, Cassandra; Foret, Don; Garibay, Edgar; Gooch, John; Heslop, Michelle; Kaur, Sukhjit; Khan, 
Ambreen; Kim, Van; Lamb, Travis; Lindbeck, Peter; Lucas, Gabi; Macias, Elizabeth; Martiniuc, Daniela; Mayorga, Lissett; Medina, Joseph; Membreno, Nelson; Messiah, Shady; Neufeld, Lacey; Nguyen, San Francisco; Nichols, Zachary; Odisho, George; Peterson, Daymon; Rodela, Laura; Rodriguez, Priscilla; Rodriguez, Vanessa; Ruiz, Jorge; Sherrill, Will; Silva, Valeria; Sparks, Jeri; Statton, Geeta; Townsend, Ashley; Valdez, Isabel; Waters, Mary; Westphal, Kyle; Winkler, Stacey; Zumkehr, Joannee; DeJong, Randall J.; Hoogewerf, Arlene J.; Ackerman, Cheri M.; Armistead, Isaac O.; Baatenburg, Lara; Borr, Matthew J.; Brouwer, Lindsay K.; Burkhart, Brandon J.; Bushhouse, Kelsey T.; Cesko, Lejla; Choi, Tiffany Y. Y.; Cohen, Heather; Damsteegt, Amanda M.; Darusz, Jess M.; Dauphin, Cory M.; Davis, Yelena P.; Diekema, Emily J.; Drewry, Melissa; Eisen, Michelle E. M.; Faber, Hayley M.; Faber, Katherine J.; Feenstra, Elizabeth; Felzer-Kim, Isabella T.; Hammond, Brandy L.; Hendriksma, Jesse; Herrold, Milton R.; Hilbrands, Julia A.; Howell, Emily J.; Jelgerhuis, Sarah A.; Jelsema, Timothy R.; Johnson, Benjamin K.; Jones, Kelly K.; Kim, Anna; Kooienga, Ross D.; Menyes, Erika E.; Nollet, Eric A.; Plescher, Brittany E.; Rios, Lindsay; Rose, Jenny L.; Schepers, Allison J.; Scott, Geoff; Smith, Joshua R.; Sterling, Allison M.; Tenney, Jenna C.; Uitvlugt, Chris; VanDyken, Rachel E.; VanderVennen, Marielle; Vue, Samantha; Kokan, Nighat P.; Agbley, Kwabea; Boham, Sampson K.; Broomfield, Daniel; Chapman, Kayla; Dobbe, Ali; Dobbe, Ian; Harrington, William; Ibrahem, Marwan; Kennedy, Andre; Koplinsky, Chad A.; Kubricky, Cassandra; Ladzekpo, Danielle; Pattison, Claire; Ramirez, Roman E.; Wande, Lucia; Woehlke, Sarah; Wawersik, Matthew; Kiernan, Elizabeth; Thompson, Jeffrey S.; Banker, Roxanne; Bartling, Justina R.; Bhatiya, Chinmoy I.; Boudoures, Anna L.; Christiansen, Lena; Fosselman, Daniel S.; French, Kristin M.; Gill, Ishwar S.; Havill, Jessen T.; Johnson, Jaelyn L.; Keny, Lauren J.; Kerber, John 
M.; Klett, Bethany M.; Kufel, Christina N.; May, Francis J.; Mecoli, Jonathan P.; Merry, Callie R.; Meyer, Lauren R.; Miller, Emily G.; Mullen, Gregory J.; Palozola, Katherine C.; Pfeil, Jacob J.; Thomas, Jessica G.; Verbofsky, Evan M.; Spana, Eric P.; Agarwalla, Anant; Chapman, Julia; Chlebina, Ben; Chong, Insun; Falk, I.N.; Fitzgibbons, John D.; Friedman, Harrison; Ighile, Osagie; Kim, Andrew J.; Knouse, Kristin A.; Kung, Faith; Mammo, Danny; Ng, Chun Leung; Nikam, Vinayak S.; Norton, Diana; Pham, Philip; Polk, Jessica W.; Prasad, Shreya; Rankin, Helen; Ratliff, Camille D.; Scala, Victoria; Schwartz, Nicholas U.; Shuen, Jessica A.; Xu, Amy; Xu, Thomas Q.; Zhang, Yi; Rosenwald, Anne G.; Burg, Martin G.; Adams, Stephanie J.; Baker, Morgan; Botsford, Bobbi; Brinkley, Briana; Brown, Carter; Emiah, Shadie; Enoch, Erica; Gier, Chad; Greenwell, Alyson; Hoogenboom, Lindsay; Matthews, Jordan E.; McDonald, Mitchell; Mercer, Amanda; Monsma, Nicholaus; Ostby, Kristine; Ramic, Alen; Shallman, Devon; Simon, Matthew; Spencer, Eric; Tomkins, Trisha; Wendland, Pete; Wylie, Anna; Wolyniak, Michael J.; Robertson, Gregory M.; Smith, Samuel I.; DiAngelo, Justin R.; Sassu, Eric D.; Bhalla, Satish C.; Sharif, Karim A.; Choeying, Tenzin; Macias, Jason S.; Sanusi, Fareed; Torchon, Karvyn; Bednarski, April E.; Alvarez, Consuelo J.; Davis, Kristen C.; Dunham, Carrie A.; Grantham, Alaina J.; Hare, Amber N.; Schottler, Jennifer; Scott, Zackary W.; Kuleck, Gary A.; Yu, Nicole S.; Kaehler, Marian M.; Jipp, Jacob; Overvoorde, Paul J.; Shoop, Elizabeth; Cyrankowski, Olivia; Hoover, Betsy; Kusner, Matt; Lin, Devry; Martinov, Tijana; Misch, Jonathan; Salzman, Garrett; Schiedermayer, Holly; Snavely, Michael; Zarrasola, Stephanie; Parrish, Susan; Baker, Atlee; Beckett, Alissa; Belella, Carissa; Bryant, Julie; Conrad, Turner; Fearnow, Adam; Gomez, Carolina; Herbstsomer, Robert A.; Hirsch, Sarah; Johnson, Christen; Jones, Melissa; Kabaso, Rita; Lemmon, Eric; Vieira, Carolina Marques dos Santos; 
McFarland, Darryl; McLaughlin, Christopher; Morgan, Abbie; Musokotwane, Sepo; Neutzling, William; Nietmann, Jana; Paluskievicz, Christina; Penn, Jessica; Peoples, Emily; Pozmanter, Caitlin; Reed, Emily; Rigby, Nichole; Schmidt, Lasse; Shelton, Micah; Shuford, Rebecca; Tirasawasdichai, Tiara; Undem, Blair; Urick, Damian; Vondy, Kayla; Yarrington, Bryan; Eckdahl, Todd T.; Poet, Jeffrey L.; Allen, Alica B.; Anderson, John E.; Barnett, Jason M.; Baumgardner, Jordan S.; Brown, Adam D.; Carney, Jordan E.; Chavez, Ramiro A.; Christgen, Shelbi L.; Christie, Jordan S.; Clary, Andrea N.; Conn, Michel A.; Cooper, Kristen M.; Crowley, Matt J.; Crowley, Samuel T.; Doty, Jennifer S.; Dow, Brian A.; Edwards, Curtis R.; Elder, Darcie D.; Fanning, John P.; Janssen, Bridget M.; Lambright, Anthony K.; Lane, Curtiss E.; Limle, Austin B.; Mazur, Tammy; McCracken, Marly R.; McDonough, Alexa M.; Melton, Amy D.; Minnick, Phillip J.; Musick, Adam E.; Newhart, William H.; Noynaert, Joseph W.; Ogden, Bradley J.; Sandusky, Michael W.; Schmuecker, Samantha M.; Shipman, Anna L.; Smith, Anna L.; Thomsen, Kristen M.; Unzicker, Matthew R.; Vernon, William B.; Winn, Wesley W.; Woyski, Dustin S.; Zhu, Xiao; Du, Chunguang; Ament, Caitlin; Aso, Soham; Bisogno, Laura Simone; Caronna, Jason; Fefelova, Nadezhda; Lopez, Lenin; Malkowitz, Lorraine; Marra, Jonathan; Menillo, Daniella; Obiorah, Ifeanyi; Onsarigo, Eric Nyabeta; Primus, Shekerah; Soos, Mahdi; Tare, Archana; Zidan, Ameer; Jones, Christopher J.; Aronhalt, Todd; Bellush, James M.; Burke, Christa; DeFazio, Steve; Does, Benjamin R.; Johnson, Todd D.; Keysock, Nicholas; Knudsen, Nelson H.; Messler, James; Myirski, Kevin; Rekai, Jade Lea; Rempe, Ryan Michael; Salgado, Michael S.; Stagaard, Erica; Starcher, Justin R.; Waggoner, Andrew W.; Yemelyanova, Anastasia K.; Hark, Amy T.; Bertolet, Anne; Kuschner, Cyrus E.; Parry, Kesley; Quach, Michael; Shantzer, Lindsey; Shaw, Mary E.; Smith, Mary A.; Glenn, Omolara; Mason, Portia; Williams, Charlotte; Key, 
S. Catherine Silver; Henry, Tyneshia C. P.; Johnson, Ashlee G.; White, Jackie X.; Haberman, Adam; Asinof, Sam; Drumm, Kelly; Freeburg, Trip; Safa, Nadia; Schultz, Darrin; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Wellinghoff, Jules; Hoopes, Laura L. M.; Chau, Kim M.; Ward, Alyssa; Regisford, E. Gloria C.; Augustine, LaJerald; Davis-Reyes, Brionna; Echendu, Vivienne; Hales, Jasmine; Ibarra, Sharon; Johnson, Lauriaun; Ovu, Steven; Braverman, John M.; Bahr, Thomas J.; Caesar, Nicole M.; Campana, Christopher; Cassidy, Daniel W.; Cognetti, Peter A.; English, Johnathan D.; Fadus, Matthew C.; Fick, Cameron N.; Freda, Philip J.; Hennessy, Bryan M.; Hockenberger, Kelsey; Jones, Jennifer K.; King, Jessica E.; Knob, Christopher R.; Kraftmann, Karen J.; Li, Linghui; Lupey, Lena N.; Minniti, Carl J.; Minton, Thomas F.; Moran, Joseph V.; Mudumbi, Krishna; Nordman, Elizabeth C.; Puetz, William J.; Robinson, Lauren M.; Rose, Thomas J.; Sweeney, Edward P.; Timko, Ashley S.; Paetkau, Don W.; Eisler, Heather L.; Aldrup, Megan E.; Bodenberg, Jessica M.; Cole, Mara G.; Deranek, Kelly M.; DeShetler, Megan; Dowd, Rose M.; Eckardt, Alexandra K.; Ehret, Sharon C.; Fese, Jessica; Garrett, Amanda D.; Kammrath, Anna; Kappes, Michelle L.; Light, Morgan R.; Meier, Anne C.; O’Rouke, Allison; Perella, Mallory; Ramsey, Kimberley; Ramthun, Jennifer R.; Reilly, Mary T.; Robinett, Deirdre; Rossi, Nadine L.; Schueler, Mary Grace; Shoemaker, Emma; Starkey, Kristin M.; Vetor, Ashley; Vrable, Abby; Chandrasekaran, Vidya; Beck, Christopher; Hatfield, Kristen R.; Herrick, Douglas A.; Khoury, Christopher B.; Lea, Charlotte; Louie, Christopher A.; Lowell, Shannon M.; Reynolds, Thomas J.; Schibler, Jeanine; Scoma, Alexandra H.; Smith-Gee, Maxwell T.; Tuberty, Sarah; Smith, Christopher D.; Lopilato, Jane E.; Hauke, Jeanette; Roecklein-Canfield, Jennifer A.; Corrielus, Maureen; Gilman, Hannah; Intriago, Stephanie; Maffa, Amanda; Rauf, Sabya A.; Thistle, Katrina; Trieu, Melissa; Winters, Jenifer; Yang, Bib; 
Hauser, Charles R.; Abusheikh, Tariq; Ashrawi, Yara; Benitez, Pedro; Boudreaux, Lauren R.; Bourland, Megan; Chavez, Miranda; Cruz, Samantha; Elliott, GiNell; Farek, Jesse R.; Flohr, Sarah; Flores, Amanda H.; Friedrichs, Chelsey; Fusco, Zach; Goodwin, Zane; Helmreich, Eric; Kiley, John; Knepper, John Mark; Langner, Christine; Martinez, Megan; Mendoza, Carlos; Naik, Monal; Ochoa, Andrea; Ragland, Nicolas; Raimey, England; Rathore, Sunil; Reza, Evangelina; Sadovsky, Griffin; Seydoux, Marie-Isabelle B.; Smith, Jonathan E.; Unruh, Anna K.; Velasquez, Vicente; Wolski, Matthew W.; Gosser, Yuying; Govind, Shubha; Clarke-Medley, Nicole; Guadron, Leslie; Lau, Dawn; Lu, Alvin; Mazzeo, Cheryl; Meghdari, Mariam; Ng, Simon; Pamnani, Brad; Plante, Olivia; Shum, Yuki Kwan Wa; Song, Roy; Johnson, Diana E.; Abdelnabi, Mai; Archambault, Alexi; Chamma, Norma; Gaur, Shailly; Hammett, Deborah; Kandahari, Adrese; Khayrullina, Guzal; Kumar, Sonali; Lawrence, Samantha; Madden, Nigel; Mandelbaum, Max; Milnthorp, Heather; Mohini, Shiv; Patel, Roshni; Peacock, Sarah J.; Perling, Emily; Quintana, Amber; Rahimi, Michael; Ramirez, Kristen; Singhal, Rishi; Weeks, Corinne; Wong, Tiffany; Gillis, Aubree T.; Moore, Zachary D.; Savell, Christopher D.; Watson, Reece; Mel, Stephanie F.; Anilkumar, Arjun A.; Bilinski, Paul; Castillo, Rostislav; Closser, Michael; Cruz, Nathalia M.; Dai, Tiffany; Garbagnati, Giancarlo F.; Horton, Lanor S.; Kim, Dongyeon; Lau, Joyce H.; Liu, James Z.; Mach, Sandy D.; Phan, Thu A.; Ren, Yi; Stapleton, Kenneth E.; Strelitz, Jean M.; Sunjed, Ray; Stamm, Joyce; Anderson, Morgan C.; Bonifield, Bethany Grace; Coomes, Daniel; Dillman, Adam; Durchholz, Elaine J.; Fafara-Thompson, Antoinette E.; Gross, Meleah J.; Gygi, Amber M.; Jackson, Lesley E.; Johnson, Amy; Kocsisova, Zuzana; Manghelli, Joshua L.; McNeil, Kylie; Murillo, Michael; Naylor, Kierstin L.; Neely, Jessica; Ogawa, Emmy E.; Rich, Ashley; Rogers, Anna; Spencer, J. 
Devin; Stemler, Kristina M.; Throm, Allison A.; Van Camp, Matt; Weihbrecht, Katie; Wiles, T. Aaron; Williams, Mallory A.; Williams, Matthew; Zoll, Kyle; Bailey, Cheryl; Zhou, Leming; Balthaser, Darla M.; Bashiri, Azita; Bower, Mindy E.; Florian, Kayla A.; Ghavam, Nazanin; Greiner-Sosanko, Elizabeth S.; Karim, Helmet; Mullen, Victor W.; Pelchen, Carly E.; Yenerall, Paul M.; Zhang, Jiayu; Rubin, Michael R.; Arias-Mejias, Suzette M.; Bermudez-Capo, Armando G.; Bernal-Vega, Gabriela V.; Colon-Vazquez, Mariela; Flores-Vazquez, Arelys; Gines-Rosario, Mariela; Llavona-Cartagena, Ivan G.; Martinez-Rodriguez, Javier O.; Ortiz-Fuentes, Lionel; Perez-Colomba, Eliezer O.; Perez-Otero, Joseph; Rivera, Elisandra; Rodriguez-Giron, Luke J.; Santiago-Sanabria, Arnaldo J.; Senquiz-Gonzalez, Andrea M.; delValle, Frank R. Soto; Vargas-Franco, Dorianmarie; Velázquez-Soto, Karla I.; Zambrana-Burgos, Joan D.; Martinez-Cruzado, Juan Carlos; Asencio-Zayas, Lillyann; Babilonia-Figueroa, Kevin; Beauchamp-Pérez, Francis D.; Belén-Rodríguez, Juliana; Bracero-Quiñones, Luciann; Burgos-Bula, Andrea P.; Collado-Méndez, Xavier A.; Colón-Cruz, Luis R.; Correa-Muller, Ana I.; Crooke-Rosado, Jonathan L.; Cruz-García, José M.; Defendini-Ávila, Marianna; Delgado-Peraza, Francheska M.; Feliciano-Cancela, Alex J.; Gónzalez-Pérez, Valerie M.; Guiblet, Wilfried; Heredia-Negrón, Aldo; Hernández-Muñiz, Jennifer; Irizarry-González, Lourdes N.; Laboy-Corales, Ángel L.; Llaurador-Caraballo, Gabriela A.; Marín-Maldonado, Frances; Marrero-Llerena, Ulises; Martell-Martínez, Héctor A.; Martínez-Traverso, Idaliz M.; Medina-Ortega, Kiara N.; Méndez-Castellanos, Sonya G.; Menéndez-Serrano, Krizia C.; Morales-Caraballo, Carol I.; Ortiz-DeChoudens, Saryleine; Ortiz-Ortiz, Patricia; Pagán-Torres, Hendrick; Pérez-Afanador, Diana; Quintana-Torres, Enid M.; Ramírez-Aponte, Edwin G.; Riascos-Cuero, Carolina; Rivera-Llovet, Michelle S.; Rivera-Pagán, Ingrid T.; Rivera-Vicéns, Ramón E.; Robles-Juarbe, Fabiola; 
Rodríguez-Bonilla, Lorraine; Rodríguez-Echevarría, Brian O.; Rodríguez-García, Priscila M.; Rodríguez-Laboy, Abneris E.; Rodríguez-Santiago, Susana; Rojas-Vargas, Michael L.; Rubio-Marrero, Eva N.; Santiago-Colón, Albeliz; Santiago-Ortiz, Jorge L.; Santos-Ramos, Carlos E.; Serrano-González, Joseline; Tamayo-Figueroa, Alina M.; Tascón-Peñaranda, Edna P.; Torres-Castillo, José L.; Valentín-Feliciano, Nelson A.; Valentín-Feliciano, Yashira M.; Vargas-Barreto, Nadyan M.; Vélez-Vázquez, Miguel; Vilanova-Vélez, Luis R.; Zambrana-Echevarría, Cristina; MacKinnon, Christy; Chung, Hui-Min; Kay, Chris; Pinto, Anthony; Kopp, Olga R.; Burkhardt, Joshua; Harward, Chris; Allen, Robert; Bhat, Pavan; Chang, Jimmy Hsiang-Chun; Chen, York; Chesley, Christopher; Cohn, Dara; DuPuis, David; Fasano, Michael; Fazzio, Nicholas; Gavinski, Katherine; Gebreyesus, Heran; Giarla, Thomas; Gostelow, Marcus; Greenstein, Rachel; Gunasinghe, Hashini; Hanson, Casey; Hay, Amanda; He, Tao Jian; Homa, Katie; Howe, Ruth; Howenstein, Jeff; Huang, Henry; Khatri, Aaditya; Kim, Young Lu; Knowles, Olivia; Kong, Sarah; Krock, Rebecca; Kroll, Matt; Kuhn, Julia; Kwong, Matthew; Lee, Brandon; Lee, Ryan; Levine, Kevin; Li, Yedda; Liu, Bo; Liu, Lucy; Liu, Max; Lousararian, Adam; Ma, Jimmy; Mallya, Allyson; Manchee, Charlie; Marcus, Joseph; McDaniel, Stephen; Miller, Michelle L.; Molleston, Jerome M.; Diez, Cristina Montero; Ng, Patrick; Ngai, Natalie; Nguyen, Hien; Nylander, Andrew; Pollack, Jason; Rastogi, Suchita; Reddy, Himabindu; Regenold, Nathaniel; Sarezky, Jon; Schultz, Michael; Shim, Jien; Skorupa, Tara; Smith, Kenneth; Spencer, Sarah J.; Srikanth, Priya; Stancu, Gabriel; Stein, Andrew P.; Strother, Marshall; Sudmeier, Lisa; Sun, Mengyang; Sundaram, Varun; Tazudeen, Noor; Tseng, Alan; Tzeng, Albert; Venkat, Rohit; Venkataram, Sandeep; Waldman, Leah; Wang, Tracy; Yang, Hao; Yu, Jack Y.; Zheng, Yin; Preuss, Mary L.; Garcia, Angelica; Juergens, Matt; Morris, Robert W.; Nagengast, Alexis A.; Azarewicz, Julie; 
Carr, Thomas J.; Chichearo, Nicole; Colgan, Mike; Donegan, Megan; Gardner, Bob; Kolba, Nik; Krumm, Janice L.; Lytle, Stacey; MacMillian, Laurell; Miller, Mary; Montgomery, Andrew; Moretti, Alysha; Offenbacker, Brittney; Polen, Mike; Toth, John; Woytanowski, John; Kadlec, Lisa; Crawford, Justin; Spratt, Mary L.; Adams, Ashley L.; Barnard, Brianna K.; Cheramie, Martin N.; Eime, Anne M.; Golden, Kathryn L.; Hawkins, Allyson P.; Hill, Jessica E.; Kampmeier, Jessica A.; Kern, Cody D.; Magnuson, Emily E.; Miller, Ashley R.; Morrow, Cody M.; Peairs, Julia C.; Pickett, Gentry L.; Popelka, Sarah A.; Scott, Alexis J.; Teepe, Emily J.; TerMeer, Katie A.; Watchinski, Carmen A.; Watson, Lucas A.; Weber, Rachel E.; Woodard, Kate A.; Barnard, Daron C.; Appiah, Isaac; Giddens, Michelle M.; McNeil, Gerard P.; Adebayo, Adeola; Bagaeva, Kate; Chinwong, Justina; Dol, Chrystel; George, Eunice; Haltaufderhyde, Kirk; Haye, Joanna; Kaur, Manpreet; Semon, Max; Serjanov, Dmitri; Toorie, Anika; Wilson, Christopher; Riddle, Nicole C.; Buhler, Jeremy; Mardis, Elaine R.
2015-01-01
The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25–50%) than euchromatic reference regions (3–11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11–27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4–3.6 vs. 8.4–8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu. PMID:25740935
Shaffer, Christopher D.; Chen, Elizabeth J.; Quisenberry, Thomas J.; Ko, Kevin; Braverman, John M.; Giarla, Thomas C.; Mortimer, Nathan T.; Reed, Laura K.; Smith, Sheryl T.; Robic, Srebrenka; McCartha, Shannon R.; Perry, Danielle R.; Prescod, Lindsay M.; Sheppard, Zenyth A.; Saville, Ken J.; McClish, Allison; Morlock, Emily A.; Sochor, Victoria R.; Stanton, Brittney; Veysey-White, Isaac C.; Revie, Dennis; Jimenez, Luis A.; Palomino, Jennifer J.; Patao, Melissa D.; Patao, Shane M.; Himelblau, Edward T.; Campbell, Jaclyn D.; Hertz, Alexandra L.; McEvilly, Maddison F.; Wagner, Allison R.; Youngblom, James; Bedi, Baljit; Bettincourt, Jeffery; Duso, Erin; Her, Maiye; Hilton, William; House, Samantha; Karimi, Masud; Kumimoto, Kevin; Lee, Rebekah; Lopez, Darryl; Odisho, George; Prasad, Ricky; Robbins, Holly Lyn; Sandhu, Tanveer; Selfridge, Tracy; Tsukashima, Kara; Yosif, Hani; Kokan, Nighat P.; Britt, Latia; Zoellner, Alycia; Spana, Eric P.; Chlebina, Ben T.; Chong, Insun; Friedman, Harrison; Mammo, Danny A.; Ng, Chun L.; Nikam, Vinayak S.; Schwartz, Nicholas U.; Xu, Thomas Q.; Burg, Martin G.; Batten, Spencer M.; Corbeill, Lindsay M.; Enoch, Erica; Ensign, Jesse J.; Franks, Mary E.; Haiker, Breanna; Ingles, Judith A.; Kirkland, Lyndsay D.; Lorenz-Guertin, Joshua M.; Matthews, Jordan; Mittig, Cody M.; Monsma, Nicholaus; Olson, Katherine J.; Perez-Aragon, Guillermo; Ramic, Alen; Ramirez, Jordan R.; Scheiber, Christopher; Schneider, Patrick A.; Schultz, Devon E.; Simon, Matthew; Spencer, Eric; Wernette, Adam C.; Wykle, Maxine E.; Zavala-Arellano, Elizabeth; McDonald, Mitchell J.; Ostby, Kristine; Wendland, Peter; DiAngelo, Justin R.; Ceasrine, Alexis M.; Cox, Amanda H.; Docherty, James E.B.; Gingras, Robert M.; Grieb, Stephanie M.; Pavia, Michael J.; Personius, Casey L.; Polak, Grzegorz L.; Beach, Dale L.; Cerritos, Heaven L.; Horansky, Edward A.; Sharif, Karim A.; Moran, Ryan; Parrish, Susan; Bickford, Kirsten; Bland, Jennifer; Broussard, Juliana; Campbell, Kerry; Deibel, 
Katelynn E.; Forka, Richard; Lemke, Monika C.; Nelson, Marlee B.; O'Keeffe, Catherine; Ramey, S. Mariel; Schmidt, Luke; Villegas, Paola; Jones, Christopher J.; Christ, Stephanie L.; Mamari, Sami; Rinaldi, Adam S.; Stity, Ghazal; Hark, Amy T.; Scheuerman, Mark; Silver Key, S. Catherine; McRae, Briana D.; Haberman, Adam S.; Asinof, Sam; Carrington, Harriette; Drumm, Kelly; Embry, Terrance; McGuire, Richard; Miller-Foreman, Drew; Rosen, Stella; Safa, Nadia; Schultz, Darrin; Segal, Matt; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Skuse, Gary; Paetkau, Don W.; Bridgman, Rachael K.; Brown, Charlotte M.; Carroll, Alicia R.; Gifford, Francesca M.; Gillespie, Julie Beth; Herman, Susan E.; Holtcamp, Krystal L.; Host, Misha A.; Hussey, Gabrielle; Kramer, Danielle M.; Lawrence, Joan Q.; Martin, Madeline M.; Niemiec, Ellen N.; O'Reilly, Ashleigh P.; Pahl, Olivia A.; Quintana, Guadalupe; Rettie, Elizabeth A.S.; Richardson, Torie L.; Rodriguez, Arianne E.; Rodriguez, Mona O.; Schiraldi, Laura; Smith, Joanna J.; Sugrue, Kelsey F.; Suriano, Lindsey J.; Takach, Kaitlyn E.; Vasquez, Arielle M.; Velez, Ximena; Villafuerte, Elizabeth J.; Vives, Laura T.; Zellmer, Victoria R.; Hauke, Jeanette; Hauser, Charles R.; Barker, Karolyn; Cannon, Laurie; Parsamian, Perouza; Parsons, Samantha; Wichman, Zachariah; Bazinet, Christopher W.; Johnson, Diana E.; Bangura, Abubakarr; Black, Jordan A.; Chevee, Victoria; Einsteen, Sarah A.; Hilton, Sarah K.; Kollmer, Max; Nadendla, Rahul; Stamm, Joyce; Fafara-Thompson, Antoinette E.; Gygi, Amber M.; Ogawa, Emmy E.; Van Camp, Matt; Kocsisova, Zuzana; Leatherman, Judith L.; Modahl, Cassie M.; Rubin, Michael R.; Apiz-Saab, Susana S.; Arias-Mejias, Suzette M.; Carrion-Ortiz, Carlos F.; Claudio-Vazquez, Patricia N.; Espada-Green, Debbie M.; Feliciano-Camacho, Marium; Gonzalez-Bonilla, Karina M.; Taboas-Arroyo, Mariela; Vargas-Franco, Dorianmarie; Montañez-Gonzalez, Raquel; Perez-Otero, Joseph; Rivera-Burgos, Myrielis; Rivera-Rosario, Francisco J.; Eisler, 
Heather L.; Alexander, Jackie; Begley, Samatha K.; Gabbard, Deana; Allen, Robert J.; Aung, Wint Yan; Barshop, William D.; Boozalis, Amanda; Chu, Vanessa P.; Davis, Jeremy S.; Duggal, Ryan N.; Franklin, Robert; Gavinski, Katherine; Gebreyesus, Heran; Gong, Henry Z.; Greenstein, Rachel A.; Guo, Averill D.; Hanson, Casey; Homa, Kaitlin E.; Hsu, Simon C.; Huang, Yi; Huo, Lucy; Jacobs, Sarah; Jia, Sasha; Jung, Kyle L.; Wai-Chee Kong, Sarah; Kroll, Matthew R.; Lee, Brandon M.; Lee, Paul F.; Levine, Kevin M.; Li, Amy S.; Liu, Chengyu; Liu, Max Mian; Lousararian, Adam P.; Lowery, Peter B.; Mallya, Allyson P.; Marcus, Joseph E.; Ng, Patrick C.; Nguyen, Hien P.; Patel, Ruchik; Precht, Hashini; Rastogi, Suchita; Sarezky, Jonathan M.; Schefkind, Adam; Schultz, Michael B.; Shen, Delia; Skorupa, Tara; Spies, Nicholas C.; Stancu, Gabriel; Vivian Tsang, Hiu Man; Turski, Alice L.; Venkat, Rohit; Waldman, Leah E.; Wang, Kaidi; Wang, Tracy; Wei, Jeffrey W.; Wu, Dennis Y.; Xiong, David D.; Yu, Jack; Zhou, Karen; McNeil, Gerard P.; Fernandez, Robert W.; Menzies, Patrick Gomez; Gu, Tingting; Buhler, Jeremy; Mardis, Elaine R.; Elgin, Sarah C.R.
2017-01-01
The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (∼5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. PMID:28667019
TargetCompare: A web interface to compare simultaneous miRNAs targets
Moreira, Fabiano Cordeiro; Dustan, Bruno; Hamoy, Igor G; Ribeiro-dos-Santos, André M; dos Santos, Ândrea Ribeiro
2014-01-01
MicroRNAs (miRNAs) are small non-coding nucleotide sequences between 17 and 25 nucleotides in length that primarily function in the regulation of gene expression. A single miRNA has thousands of predicted targets in a complex regulatory cell signaling network. Therefore, it is of interest to study multiple target genes simultaneously. Hence, we describe a web tool (developed using the Java programming language and a MySQL database server) to analyse multiple targets of pre-selected miRNAs. We cross-validated the tool on the eight most highly expressed miRNAs in the antrum region of the stomach. This helped to identify 43 potential genes that are targets of at least six of these miRNAs. The tool aims to reduce randomness and increase the chance of selecting strong candidate target genes and miRNAs that play important roles in the studied tissue. Availability: http://lghm.ufpa.br/targetcompare PMID:25352731
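The selection step described in the abstract above (keeping genes targeted by at least six of eight miRNAs) amounts to counting, for each gene, how many of the chosen miRNAs list it as a predicted target. The sketch below illustrates that idea only; it is not TargetCompare's implementation (the tool is written in Java with MySQL), and the function name, miRNA labels, and gene labels are hypothetical.

```python
from collections import Counter


def shared_targets(targets_by_mirna, min_mirnas):
    """Return genes predicted as targets of at least `min_mirnas` miRNAs.

    targets_by_mirna: dict mapping a miRNA name to an iterable of its
                      predicted target genes (hypothetical input format).
    """
    hits = Counter()
    for genes in targets_by_mirna.values():
        hits.update(set(genes))  # count each gene once per miRNA
    return sorted(gene for gene, n in hits.items() if n >= min_mirnas)


# Toy data with placeholder names; real predictions would come from a
# target-prediction database.
toy = {
    "miR-a": ["GENE1", "GENE2", "GENE3"],
    "miR-b": ["GENE1", "GENE3"],
    "miR-c": ["GENE1", "GENE2", "GENE2"],
}
common = shared_targets(toy, min_mirnas=2)
```

Raising `min_mirnas` narrows the candidate list to genes supported by more of the selected miRNAs, which is the trade-off the abstract describes.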
Proudhon, D; Wei, J; Briat, J; Theil, E C
1996-03-01
Ferritin, a protein widespread in nature, concentrates iron approximately 10^11 to 10^12-fold above the solubility within a spherical shell of 24 subunits; it derives in plants and animals from a common ancestor (based on sequence) but displays a cytoplasmic location in animals compared to the plastid in contemporary plants. Ferritin gene regulation in plants and animals is altered by development, hormones, and excess iron; iron signals target DNA in plants but mRNA in animals. Evolution has thus conserved the two end points of ferritin gene expression, the physiological signals and the protein structure, while allowing some divergence of the genetic mechanisms. Comparison of ferritin gene organization in plants and animals, made possible by the cloning of a dicot (soybean) ferritin gene presented here and the recent cloning of two monocot (maize) ferritin genes, shows evolutionary divergence in ferritin gene organization between plants and animals but conservation among plants or among animals; divergence in the genetic mechanism for iron regulation is reflected by the absence in all three plant genes of the IRE, a highly conserved, noncoding sequence in vertebrate animal ferritin mRNA. First, in plant ferritin genes, the number of introns (n = 7) is higher than in animals (n = 3). Second, no intron positions are conserved when ferritin genes of plants and animals are compared, although all ferritin gene introns are in the coding region; within kingdoms, the intron positions in ferritin genes are conserved. Finally, secondary protein structure has no apparent relationship to intron/exon boundaries in plant ferritin genes, whereas in animal ferritin genes the correspondence is high. 
The structural differences in introns/exons among phylogenetically related ferritin coding sequences and the high conservation of the gene structure within plant or animal kingdoms suggest that kingdom-specific functional constraints may exist to maintain a particular intron/exon pattern within ferritin genes. In the case of plants, where ferritin gene intron placement is unrelated to triplet codons or protein structure, and where ferritin is targeted to the plastid, the selection pressure on gene organization may relate to RNA function and plastid/nuclear signaling.
Goto, Hiroki; Peng, Lei; Makova, Kateryna D
2009-02-01
Compared with the X chromosome, the mammalian Y chromosome is considerably diminished in size and has lost most of its ancestral genes during evolution. Interestingly, for the X-degenerate region on the Y chromosome, human has retained all 16 genes, while chimpanzee has lost 4 of the 16 genes since the divergence of the two species. To uncover the evolutionary forces governing ape Y chromosome degeneration, we determined the complete sequences of the coding exons and splice sites for 16 gorilla Y chromosome genes of the X-degenerate region. We discovered that all studied reading frames and splice sites were intact, and thus, this genomic region experienced no gene loss in the gorilla lineage. Higher nucleotide divergence was observed in the chimpanzee than the human lineage, particularly for genes with disruptive mutations, suggesting a lack of functional constraints for these genes in chimpanzee. Surprisingly, our results indicate that the human and gorilla orthologues of the genes disrupted in chimpanzee evolve under relaxed functional constraints and might not be essential. Taking mating patterns and effective population sizes of ape species into account, we conclude that genetic hitchhiking associated with positive selection due to sperm competition might explain the rapid decline in the Y chromosome gene number in chimpanzee. As we found no evidence of positive selection acting on the X-degenerate genes, such selection likely targets other genes on the chimpanzee Y chromosome.
Bowen, J K; Templeton, M D; Sharrock, K R; Crowhurst, R N; Rikkerink, E H
1995-01-20
The feasibility of performing routine transformation-mediated mutagenesis in Glomerella cingulata was analysed by adopting three one-step gene disruption strategies targeted at the pectin lyase gene pnlA. The efficiencies of disruption following transformation with gene replacement- or gene truncation-disruption vectors were compared. To effect replacement-disruption, G. cingulata was transformed with a vector carrying DNA from the pnlA locus in which the majority of the coding sequence had been replaced by the gene for hygromycin B resistance. Two of the five transformants investigated contained an inactivated pnlA gene (pnlA-); both also contained ectopically integrated vector sequences. The efficacy of gene disruption by transformation with two gene truncation-disruption vectors was also assessed. Both vectors carried a 5'- and 3'-truncated copy of the pnlA coding sequence, adjacent to the gene for hygromycin B resistance. The promoter sequences controlling the selectable marker differed in the two vectors. In one vector the homologous G. cingulata gpdA promoter controlled hygromycin B phosphotransferase expression (homologous truncation vector), whereas in the second vector promoter elements were from the Aspergillus nidulans gpdA gene (heterologous truncation vector). Following transformation with the homologous truncation vector, nine transformants were analysed by Southern hybridisation; no transformants contained a disrupted pnlA gene. Of nineteen heterologous truncation vector transformants, three contained a disrupted pnlA gene; Southern analysis revealed single integrations of vector sequence at pnlA in two of these transformants. pnlA mRNA was not detected by Northern hybridisation in pnlA- transformants. pnlA- transformants failed to produce a PNLA protein with a pI identical to one normally detected in wild-type isolates by silver and activity staining of isoelectric focussing gels. 
Pathogenesis on Capsicum and apple was unaffected by disruption of the pnlA gene, indicating that the corresponding gene product, PNLA, is not essential for pathogenicity. Gene disruption is a feasible method for selectively mutating defined loci in G. cingulata for functional analysis of the corresponding gene products.
Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M
2014-06-01
It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. 
The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
Colville, A. M.; Iancu, O. D.; Oberbeck, D. L.; Darakjian, P.; Zheng, C. L.; Walter, N. A. R.; Harrington, C. A.; Searles, R. P.; McWeeney, S.; Hitzemann, R. J.
2017-01-01
Previous studies on changes in murine brain gene expression associated with the selection for ethanol preference have used F2 intercross or heterogeneous stock (HS) founders, derived from standard laboratory strains. However, these populations represent only a small proportion of the genetic variance available in Mus musculus. To investigate a wider range of genetic diversity, we selected mice for ethanol preference using an HS derived from the eight strains of the collaborative cross. These HS mice were selectively bred (four generations) for high and low ethanol preference. The nucleus accumbens shell of naive S4 mice was interrogated using RNA sequencing (RNA-Seq). Gene networks were constructed using the weighted gene coexpression network analysis assessing both coexpression and cosplicing. Selection targeted one of the network coexpression modules (greenyellow) that was significantly enriched in genes associated with receptor signaling activity including Chrna7, Grin2a, Htr2a and Oprd1. Connectivity in the module as measured by changes in the hub nodes was significantly reduced in the low preference line. Of particular interest was the observation that selection had marked effects on a large number of cell adhesion molecules, including cadherins and protocadherins. In addition, the coexpression data showed that selection had marked effects on long non-coding RNA hub nodes. Analysis of the cosplicing network data showed a significant effect of selection on a large cluster of Ras GTPase-binding genes including Cdkl5, Cyfip1, Ndrg1, Sod1 and Stxbp5. These data in part support the earlier observation that preference is linked to Ras/Mapk pathways. PMID:28058793
Substitution rate and natural selection in parvovirus B19
Stamenković, Gorana G.; Ćirković, Valentina S.; Šiljić, Marina M.; Blagojević, Jelena V.; Knežević, Aleksandra M.; Joksić, Ivana D.; Stanojević, Maja P.
2016-01-01
The aim of this study was to estimate the substitution rate and imprints of natural selection on parvovirus B19 genotype 1. Studied datasets included 137 near-complete coding B19 genomes (positions 665 to 4851) for phylogenetic and substitution rate analysis and 146 and 214 partial genomes for selection analyses in open reading frames ORF1 and ORF2, respectively, collected 1973–2012 and including 9 newly sequenced isolates from Serbia. Phylogenetic clustering assigned the majority of studied isolates to G1A. The nucleotide substitution rate for total coding DNA was 1.03 (0.6–1.27) × 10^-4 substitutions/site/year, with higher values for analyzed genome partitions. In spite of the highest evolutionary rate, VP2 codons were found to be under purifying selection with rare episodic positive selection, whereas codons under diversifying selection were found in the unique part of VP1, known to contain B19 immune epitopes important in persistent infection. Analyses of overlapping gene regions identified nucleotide positions under opposite selective pressure in different ORFs, suggesting complex evolutionary mechanisms of nucleotide changes in B19 viral genomes. PMID:27775080
Jiang, Shu-Ye; Sevugan, Mayalagu; Ramachandran, Srinivasan
2018-05-09
Valine-glutamine (VQ) motif-containing proteins play important roles in abiotic and biotic stress responses in plants. However, little is known about the origin and evolution of the VQ gene family or about the comprehensive regulation of its expression. In this study, we systematically surveyed this gene family in 50 plant genomes from algae, moss, gymnosperms and angiosperms and explored its presence in other species from animals, bacteria, fungi and viruses. No VQs were detected in any of the tested algal genomes, whereas all genomes from moss, gymnosperms and angiosperms encode varying numbers of VQs. Interestingly, some fungi, lower animals and bacteria also encode one to a few VQs. Thus, VQs are not plant-specific and should be regarded as an ancient family. Family expansion was mainly due to segmental duplication, followed by tandem duplication and mobile elements. Gene conversion made only a limited contribution to the evolution of the family. Generally, VQs were highly conserved in their motif coding region and were under purifying selection. However, positive selection was also observed during species divergence. Many VQs were up- or down-regulated by various abiotic/biotic stresses and phytohormones in rice and Arabidopsis. They were also co-expressed with some other stress-related genes. All of the expression data suggest a comprehensive expression regulation of the VQ gene family. We provide new insights into the expansion, divergence, evolution and expression regulation of the VQ family. VQs were detectable not only in plants but also in some fungi, lower animals and bacteria, suggesting evolutionary conservation and an ancient origin. Overall, VQs are not plant-specific and play roles in abiotic/biotic responses or other biological processes through comprehensive expression regulation.
de Freitas, Michele C R; Resende, Juliana A; Ferreira-Machado, Alessandra B; Saji, Guadalupe D R Q; de Vasconcelos, Ana T R; da Silva, Vânia L; Nicolás, Marisa F; Diniz, Cláudio G
2016-01-01
Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic approach with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From wild-type B. fragilis ATCC 43859, four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology (GO) term, indicating that most known cellular functions were covered. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes that act in several metabolic pathways involved in the metronidazole response, such as drug activation, defense mechanisms against superoxide ions, high expression of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure leads to drastic, persistent changes in B. fragilis gene expression patterns. These results may help to elucidate the B. 
fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment.
Nonneutral GC3 and retroelement codon mimicry in Phytophthora.
Jiang, Rays H Y; Govers, Francine
2006-10-01
Phytophthora is a genus entirely comprised of destructive plant pathogens. It belongs to the Stramenopila, a unique branch of eukaryotes, phylogenetically distinct from plants, animals, or fungi. Phytophthora genes show a strong preference for usage of codons ending with G or C (high GC3). The presence of high GC3 in genes can be utilized to differentiate coding regions from noncoding regions in the genome. We found that both selective pressure and mutation bias drive codon bias in Phytophthora. Indicative for selection pressure is the higher GC3 value of highly expressed genes in different Phytophthora species. Lineage specific GC increase of noncoding regions is reminiscent of whole-genome mutation bias, whereas the elevated Phytophthora GC3 is primarily a result of translation efficiency-driven selection. Heterogeneous retrotransposons exist in Phytophthora genomes and many of them vary in their GC content. Interestingly, the most widespread groups of retroelements in Phytophthora show high GC3 and a codon bias that is similar to host genes. Apparently, selection pressure has been exerted on the retroelement's codon usage, and such mimicry of host codon bias might be beneficial for the propagation of retrotransposons.
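GC3, the quantity this abstract relies on, is the fraction of codons whose third position is G or C. A minimal Python sketch of that calculation follows; the function name `gc3` and the example sequences are hypothetical, and the sketch assumes the input is an in-frame coding sequence starting at the first codon position.

```python
def gc3(cds):
    """Return the fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_positions = cds[2:usable:3]
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical toy sequences (not Phytophthora data).
print(gc3("ATGGCCAAG"))  # codons ATG, GCC, AAG
print(gc3("ATGAAATTT"))  # codons ATG, AAA, TTT
```

Comparing this value between annotated genes and flanking noncoding windows is one simple way to see the coding/noncoding contrast the abstract mentions.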
Genetic information transfer promotes cooperation in bacteria
Dimitriu, Tatiana; Lotton, Chantal; Bénard-Capelle, Julien; Misevic, Dusan; Brown, Sam P.; Lindner, Ariel B.; Taddei, François
2014-01-01
Many bacterial species are social, producing costly secreted “public good” molecules that enhance the growth of neighboring cells. The genes coding for these cooperative traits are often propagated via mobile genetic elements and can be virulence factors from a biomedical perspective. Here, we present an experimental framework that links genetic information exchange and the selection of cooperative traits. Using simulations and experiments based on a synthetic bacterial system to control public good secretion and plasmid conjugation, we demonstrate that horizontal gene transfer can favor cooperation. In a well-mixed environment, horizontal transfer brings a direct infectious advantage to any gene, regardless of its cooperation properties. However, in a structured population transfer selects specifically for cooperation by increasing the assortment among cooperative alleles. Conjugation allows cooperative alleles to overcome rarity thresholds and invade bacterial populations structured purely by stochastic dilution effects. Our results provide an explanation for the prevalence of cooperative genes on mobile elements, and suggest a previously unidentified benefit of horizontal gene transfer for bacteria. PMID:25024219
Sorimachi, Kenji; Okayasu, Teiji
2015-01-01
The complete vertebrate mitochondrial genome consists of 13 coding genes. We used this genome to investigate the existence of natural selection in vertebrate evolution. From the complete mitochondrial genomes, we predicted nucleotide contents and then separated these values into coding and non-coding regions. When nucleotide contents of a coding or non-coding region were plotted against the nucleotide content of the complete mitochondrial genomes, we obtained linear regression lines only between homonucleotides and their analogs. On every plot using purine (G or A) content, G content in aquatic vertebrates was higher than that in terrestrial vertebrates, while A content in aquatic vertebrates was lower than that in terrestrial vertebrates. Based on these relationships, vertebrates were separated into two groups, terrestrial and aquatic. However, using pyrimidine (C or T) content, clear separation between these two groups was not obtained. The hagfish (Eptatretus burgeri) was further separated from both terrestrial and aquatic vertebrates. Based on these results, nucleotide content relationships predicted from the complete vertebrate mitochondrial genomes reveal the existence of natural selection based on evolutionary separation between terrestrial and aquatic vertebrate groups. In addition, we propose that separation of the two groups might be linked to ammonia detoxification based on high G and low A contents, which encode Glu-rich and Lys-poor proteins.
Biederman, Michelle K; Nelson, Megan M; Asalone, Kathryn C; Pedersen, Alyssa L; Saldanha, Colin J; Bracht, John R
2018-05-21
Developmentally programmed genome rearrangements are rare in vertebrates, but have been reported in scattered lineages including the bandicoot, hagfish, lamprey, and zebra finch (Taeniopygia guttata) [1]. In the finch, a well-studied animal model for neuroendocrinology and vocal learning [2], one such programmed genome rearrangement involves a germline-restricted chromosome, or GRC, which is found in germlines of both sexes but eliminated from mature sperm [3, 4]. Transmitted only through the oocyte, it displays uniparental female-driven inheritance, and early in embryonic development is apparently eliminated from all somatic tissue in both sexes [3, 4]. The GRC comprises the longest finch chromosome at over 120 million base pairs [3], and previously the only known GRC-derived sequence was repetitive and non-coding [5]. Because the zebra finch genome project was sourced from male muscle (somatic) tissue [6], the remaining genomic sequence and protein-coding content of the GRC remain unknown. Here we report the first protein-coding gene from the GRC: a member of the α-soluble N-ethylmaleimide sensitive fusion protein (NSF) attachment protein (α-SNAP) family hitherto missing from zebra finch gene annotations. In addition to the GRC-encoded α-SNAP, we find an additional paralogous α-SNAP residing in the somatic genome (a somatolog), making the zebra finch the first example in which α-SNAP is not a single-copy gene. We show divergent, sex-biased expression for the paralogs and also that positive selection is detectable across the bird α-SNAP lineage, including the GRC-encoded α-SNAP. This study presents the identification and evolutionary characterization of the first protein-coding GRC gene in any organism. Copyright © 2018 Elsevier Ltd. All rights reserved.
Ikonomidis, Alexandros; Grapsa, Anastasia; Pavlioglou, Charikleia; Demiri, Antonia; Batarli, Alexandra; Panopoulou, Maria
2016-12-01
Fifty-six Staphylococcus epidermidis clinical isolates, showing high-level linezolid resistance and causing bacteremia in critically ill patients, were studied. All isolates belonged to the ST22 clone and carried the T2504A and C2534T mutations in the gene coding for 23S rRNA as well as the C189A, G208A, C209T and G384C missense mutations in the L3 protein gene, which resulted in Asp159Tyr, Gly152Asp and Leu94Val substitutions. Other silent mutations were also detected in genes coding for ribosomal proteins L3 and L22. In silico analysis of missense mutations showed that although the L3 protein retained the sequence of secondary motifs, the tertiary structure was influenced. The observed alteration in L3 protein folding provides an indication of the putative role of L3-coding gene mutations in high-level linezolid resistance. Furthermore, linezolid pressure in health care settings where linezolid consumption is high might lead to the selection of resistant mutants possessing L3 mutations that might confer high-level linezolid resistance.
Delaye, Luis; Ruiz-Ruiz, Susana; Calderon, Enrique; Tarazona, Sonia; Conesa, Ana; Moya, Andrés
2018-06-01
Pneumocystis species are ascomycete fungi adapted to live inside the lungs of mammals. These ascomycetes show extensive stenoxenism, meaning that each species of Pneumocystis infects a single species of host. Here, we study the effect exerted by natural selection on gene evolution in the genomes of three Pneumocystis species. We show that genes involved in host interaction evolve under positive selection. In the first place, we found strong evidence of episodic diversifying selection in Major surface glycoproteins (Msg). These proteins are located on the surface of Pneumocystis and are used for host attachment and probably for immune system evasion. Consistent with their function as antigens, most sites under diversifying selection in Msg code for residues with large relative surface accessibility areas. We also found evidence of positive selection in part of the cell machinery used to export Msg to the cell surface. Specifically, we found that genes participating in glycosylphosphatidylinositol (GPI) biosynthesis show an increased rate of nonsynonymous substitutions (dN) versus synonymous substitutions (dS). GPI is a molecule synthesized in the endoplasmic reticulum that is used to anchor proteins to membranes. We interpret the aforementioned findings as evidence of selective pressure exerted by the host immune system on Pneumocystis species, shaping the evolution of Msg and several proteins involved in GPI biosynthesis. We suggest that genome evolution in Pneumocystis is well described by the Red-Queen hypothesis whereby genes relevant for biotic interactions show accelerated rates of evolution.
Michel, Christian J
2017-04-18
In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property, as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. Here, we extend this definition to the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated with each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes.
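The frame comparison underlying this abstract, counting how often trinucleotides from a given set occur in the reading frame versus the two shifted frames of a single gene, can be illustrated with a short Python sketch. It is a simplified illustration and not the statistical method of the paper; the function name `frame_counts`, the two-element toy set, and the toy gene are hypothetical, and the actual 20-trinucleotide set X is deliberately not reproduced here.

```python
def frame_counts(gene, code):
    """Count trinucleotides belonging to `code` in frames 0, 1 and 2.

    Frame 0 is the reading frame; frames 1 and 2 are shifted by one and
    two nucleotides. Returns a list of three counts.
    """
    gene = gene.upper()
    counts = []
    for shift in range(3):
        hits = 0
        # Step through non-overlapping trinucleotides of this frame.
        for i in range(shift, len(gene) - 2, 3):
            if gene[i:i + 3] in code:
                hits += 1
        counts.append(hits)
    return counts


# Hypothetical toy set and gene, for illustration only.
toy_code = {"AAC", "GTT"}
toy_gene = "AACGTTAAC"

print(frame_counts(toy_gene, toy_code))
```

Applying the function gene by gene, and then asking in what share of genes frame 0 has the highest count, gives every gene the same weight regardless of length, which is the spirit of the gene-level definition described above.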
Comparative architecture of silks, fibrous proteins and their encoding genes in insects and spiders.
Craig, Catherine L; Riekel, Christian
2002-12-01
The known silk fibroins and fibrous glues are thought to be encoded by members of the same gene family. All silk fibroins sequenced to date contain regions of long-range order (crystalline regions) and/or short-range order (non-crystalline regions). All of the sequenced fibroin silks (Flag or silk from flagelliform gland in spiders; Fhc or heavy chain fibroin silks produced by Lepidoptera larvae) are made up of hierarchically organized, repetitive arrays of amino acids. Fhc fibroin genes are characterized by a similar molecular genetic architecture of two exons and one intron, but the organization and size of these units differs. The Flag, Ser (sericin gene) and BR (Balbiani ring genes; both fibrous proteins) genes are made up of multiple exons and introns. Sequences coding for crystalline and non-crystalline protein domains are integrated in the repetitive regions of Fhc and MA exons, but not in the protein glues Ser1 and BR-1. Genetic 'hot-spots' promote recombination errors in Fhc, MA, and Flag. Codon bias, structural constraint, point mutations, and shortened coding arrays may be alternative means of stabilizing precursor mRNA transcripts. Differential regulation of gene expression and selective splicing of the mRNA transcript may allow rapid adaptation of silk functional properties to different physical environments.
Rabara, Roel C; Tripathi, Prateek; Lin, Jun; Rushton, Paul J
2013-02-15
Drought is one of the important environmental factors affecting crop production worldwide, and therefore understanding the molecular response of plants to stress is an important step in crop improvement. WRKY transcription factors are one of the 10 largest transcription factor families across the green lineage. In this study, highly upregulated dehydration-induced WRKY and enzyme-coding genes from tobacco and soybean were selected from microarray data for promoter analyses. Putative stress-related cis-regulatory elements such as the TGACG motif, ABRE-like elements, and W and G-like sequences were identified by in silico analysis of the promoter regions of the selected genes. GFP quantification in transgenic BY-2 cell cultures showed that these promoters direct higher expression in response to 100 μM JA treatment than to 100 μM ABA, 10% PEG and 85 mM NaCl treatments. Thus, promoter activity upon JA treatment and enrichment of MeJA-responsive elements in the promoters of the selected genes suggest that these genes are jasmonic acid responsive, with the potential to mediate cross-talk during dehydration responses. Copyright © 2013 Elsevier Inc. All rights reserved.
Evolutionary genomics of animal personality.
van Oers, Kees; Mueller, Jakob C
2010-12-27
Research on animal personality can be approached from both a phenotypic and a genetic perspective. Using a phenotypic approach, one can measure present selection on personality traits and their combinations. However, this approach cannot reconstruct the historical trajectory that was taken by evolution. Therefore, it is essential for our understanding of the causes and consequences of personality diversity to link phenotypic variation in personality traits with polymorphisms in genomic regions that code for this trait variation. Identifying genes or genome regions that underlie personality traits will open exciting possibilities to study natural selection at the molecular level, gene-gene and gene-environment interactions, pleiotropic effects and how gene expression shapes personality phenotypes. In this paper, we will discuss how genome information revealed by already established approaches and some more recent techniques such as high-throughput sequencing of genomic regions in a large number of individuals can be used to infer micro-evolutionary processes, historical selection and finally the maintenance of personality trait variation. We will do this by reviewing recent advances in molecular genetics of animal personality, but will also use advanced human personality studies as case studies of how molecular information may be used in animal personality research in the near future.
Redwan, R M; Saidin, A; Kumar, S V
2015-08-12
Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple, sequenced using PacBio sequencing technology. In this study, the high error rate of PacBio long sequence reads of A. comosus total genomic DNA was reduced by leveraging high-accuracy but short Illumina reads for error correction via the latest error correction module from Novocraft. Error-corrected long PacBio reads were assembled using a single tool to produce a contig representing the pineapple chloroplast genome. The genome, 159,636 bp in length, features the conserved quadripartite structure of chloroplasts, containing a large single copy region (LSC) of 87,482 bp, a small single copy region (SSC) of 18,622 bp and two inverted repeat regions (IRA and IRB) of 26,766 bp each. Overall, the genome contained 117 unique coding regions, 30 of which were repeated in the IR region, with gene content, structure and arrangement similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected in both the coding and non-coding regions, the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparison of chloroplast genomes from the subclass Commelinidae revealed a conserved protein-coding gene, albeit located in a highly divergent region. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection exerted on the rps7 gene of the pineapple chloroplast (P < 0.05). 
Phylogenetic analysis confirmed the recent taxonomic relationships among the members of the commelinids, supporting a monophyletic relationship between Arecales and Dasypogonaceae and between Zingiberales and Poales, the latter of which includes A. comosus. The complete sequence of the pineapple chloroplast provides insights into the divergence of genic chloroplast sequences among the members of the subclass Commelinidae. The complete pineapple chloroplast will serve as a reference for in-depth taxonomic studies in the Bromeliaceae family when more species in the family are sequenced in the future. The genetic sequence information will also make feasible other molecular applications of the pineapple chloroplast for plant genetic improvement.
Xie, Weilong; Perry, Gregory; Martin, C Joe; Shim, Youn-Seb; Navabi, Alireza; Pauls, K Peter
2017-07-01
Common beans (Phaseolus vulgaris) are excellent sources of dietary folates, but different varieties contain different amounts of these compounds. Genes coding for dihydroneopterin aldolase (DHNA) and aminodeoxychorismate synthase (ADCS) of the folate synthesis pathway were characterized by PCR amplification, BAC clone sequencing, and whole genome sequencing. All DHNA and ADCS genes in the Mesoamerican cultivar OAC Rex were isolated and compared with those genes in the genome of Andean genotype G19833. Both genotypes have two functional DHNA genes and one pseudo gene. PvDHNA1 and PvDHNA2 proteins have similar secondary structures and conserved residues as DHNA homologs in Staphylococcus aureus and Arabidopsis. Sequence analysis and synteny mapping indicated that PvDHNA1 might be a duplicated and transposed copy of PvDHNA2. There is only one ADCS gene (PvADCS) identified in the bean genome and it is identical in OAC Rex and G19833. PvADCS has the conserved motifs required for catalytic activity similar to other plant ADCS homologs. DHNA and ADCS gene-specific markers were developed, mapped, and compared to their physical locations on chromosomes 1 and 7, respectively. The gene-specific markers developed in this study should be useful for detection and selection of varieties with enhanced folate contents in bean breeding programs.
Santo, Evan E; Paik, Jihye
2018-06-17
The rapid development of CRISPR technology is revolutionizing molecular approaches to the dissection of complex biological phenomena. Here we describe an alternative generally applicable implementation of the CRISPR-Cas9 system that allows for selective knockdown of extremely homologous genes. This strategy employs the lentiviral delivery of paired sgRNAs and nickase Cas9 (Cas9D10A) to achieve targeted deletion of splice junctions. This general strategy offers several advantages over standard single-guide exon-targeting CRISPR-Cas9 such as greatly reduced off-target effects, more restricted genomic editing, routine disruption of target gene mRNA expression and the ability to differentiate between closely related genes. Here we demonstrate the utility of this strategy by achieving selective knockdown of the highly homologous human genes FOXO3A and suspected pseudogene FOXO3B. We find the spJCRISPR strategy to efficiently and selectively disrupt FOXO3A and FOXO3B mRNA and protein expression; thus revealing that the human FOXO3B locus encodes a bona fide human gene. Unlike FOXO3A, we find the FOXO3B protein to be cytosolically localized in both the presence and absence of active Akt. The ability to selectively target and efficiently disrupt the expression of the closely-related FOXO3A and FOXO3B genes demonstrates the efficacy of the spJCRISPR approach.
Relaxed selection is a precursor to the evolution of phenotypic plasticity.
Hunt, Brendan G; Ometto, Lino; Wurm, Yannick; Shoemaker, DeWayne; Yi, Soojin V; Keller, Laurent; Goodisman, Michael A D
2011-09-20
Phenotypic plasticity allows organisms to produce alternative phenotypes under different conditions and represents one of the most important ways by which organisms adaptively respond to the environment. However, the relationship between phenotypic plasticity and molecular evolution remains poorly understood. We addressed this issue by investigating the evolution of genes associated with phenotypically plastic castes, sexes, and developmental stages of the fire ant Solenopsis invicta. We first determined if genes associated with phenotypic plasticity in S. invicta evolved at a rapid rate, as predicted under theoretical models. We found that genes differentially expressed between S. invicta castes, sexes, and developmental stages all exhibited elevated rates of evolution compared with ubiquitously expressed genes. We next investigated the evolutionary history of genes associated with the production of castes. Surprisingly, we found that orthologs of caste-biased genes in S. invicta and the social bee Apis mellifera evolved rapidly in lineages without castes. Thus, in contrast to some theoretical predictions, our results suggest that rapid rates of molecular evolution may not arise primarily as a consequence of phenotypic plasticity. Instead, genes evolving under relaxed purifying selection may more readily adopt new forms of biased expression during the evolution of alternate phenotypes. These results suggest that relaxed selective constraint on protein-coding genes is an important and underappreciated element in the evolutionary origin of phenotypic plasticity.
Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta
Whittle, C. A.; Sun, Y.; Johannesson, H.
2011-01-01
Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
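The relative synonymous codon usage (RSCU) measure mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the common definition (observed codon count divided by the mean count across that amino acid's synonymous codons); the three-family codon table and the toy sequence are illustrative assumptions, not data from the study.

```python
from collections import Counter

# Toy synonymous families (an assumption for illustration; a real analysis
# would use the complete standard genetic code).
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def rscu(cds):
    """Return RSCU per codon: observed count / mean count in its synonymous family."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values

# Hypothetical coding sequence: TTC x3, TTT x1, AAG x2, GGT x3, GGC x1
example = rscu("TTCTTCTTCTTTAAGAAGGGTGGTGGTGGC")
print(example)
```

Values above 1 mark codons used more often than expected under equal synonymous usage. Comparing such values between highly and lowly expressed genes is one way candidate optimal codons could be flagged.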
Intriguing Balancing Selection on the Intron 5 Region of LMBR1 in Human Population
He, Fang; Wu, Dong-Dong; Kong, Qing-Peng; Zhang, Ya-Ping
2008-01-01
Background: Intron 5 of the LMBR1 gene is the cis-acting regulatory module for the sonic hedgehog (SHH) gene. Mutation in this non-coding region is associated with preaxial polydactyly, and may play crucial roles in the evolution of the limb and skeletal system. Methodology/Principal Findings: We sequenced a region of LMBR1 intron 5 in an East Asian human population and found a significant deviation of Tajima's D statistic from neutrality, taking human population growth into account. Data from HapMap also demonstrated extended linkage disequilibrium in the region in East Asian and European populations, and a significantly low degree of genetic differentiation among human populations. Conclusion/Significance: We propose that intron 5 of LMBR1 was presumably subject to balancing selection during the evolution of modern humans. PMID:18698406
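For readers unfamiliar with Tajima's D, the following Python sketch computes the standard statistic from a set of aligned sequences. It is a plain textbook calculation and does not include the correction for population growth that the abstract mentions; the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences; None if no site varies."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for i in range(length) if len({seq[i] for seq in seqs}) > 1)
    if s == 0:
        return None
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Standard constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical alignments: rare variants only vs. two balanced haplotypes
singletons = ["AAAA", "AAAT", "AATA", "ATAA"]
balanced = ["AAA", "AAA", "TTT", "TTT"]
d_rare = tajimas_d(singletons)
d_balanced = tajimas_d(balanced)
print(d_rare, d_balanced)
```

An excess of intermediate-frequency variants pushes D above zero, which is the direction usually associated with balancing selection, whereas an excess of rare variants pushes it below zero.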
USDA-ARS's Scientific Manuscript database
Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examinati...
Han, Zhenyun; Hu, Yanan; Lv, Yuanda; Sun, Yaqiang; Shen, Fei; Wang, Yi; Zhang, Xinzhong; Xu, Xuefeng
2018-01-01
Through natural or human selection, many fleshy fruits have evolved vivid external or internal coloration, which often develops during ripening. Such developmental changes in color are associated with the biosynthesis of pigments as well as with degreening through chlorophyll degradation. Here, we demonstrated that natural variation in the coding region of the gene ETHYLENE RESPONSE FACTOR17 (ERF17) contributes to apple (Malus domestica) fruit peel degreening. Specifically, ERF17 mutant alleles with different serine (Ser) repeat insertions in the coding region exhibited enhanced transcriptional regulation activity in a dual-luciferase reporter assay when more Ser repeats were present. Notably, surface plasmon resonance analysis showed that the number of Ser repeats affected the binding activity of ERF17 to the promoter sequences of chlorophyll degradation-related genes. In addition, overexpression of ERF17 in evergreen apples altered the accumulation of chlorophyll. Furthermore, we demonstrated that ERF17 has been under selection since the origin of apple tree cultivation. Taken together, these results reveal allelic variation underlying an important fruit quality trait and a molecular genetic mechanism associated with apple domestication. PMID:29431631
Insights into HLA-G Genetics Provided by Worldwide Haplotype Diversity
Castelli, Erick C.; Ramalho, Jaqueline; Porto, Iane O. P.; Lima, Thálitta H. A.; Felício, Leandro P.; Sabbagh, Audrey; Donadi, Eduardo A.; Mendes-Junior, Celso T.
2014-01-01
Human leukocyte antigen G (HLA-G) belongs to the family of non-classical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I non-classical genes. The main features that distinguish HLA-G from classical class I genes are (a) limited protein variability, (b) alternative splicing generating several membrane bound and soluble isoforms, (c) short cytoplasmic tail, (d) modulation of immune response (immune tolerance), and (e) restricted expression to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments [promoter, coding, and 3′ untranslated region (UTR)]. For this purpose, we developed a pipeline to reevaluate the 1000Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variation sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding, and 3′UTRs are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability has originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures. PMID:25339953
Parallel or convergent evolution in human population genomic data revealed by genotype networks.
R Vahdati, Ali; Wagner, Andreas
2016-08-02
Genotype networks are representations of genetic variation data that are complementary to phylogenetic trees. A genotype network is a graph whose nodes are genotypes (DNA sequences) with the same broadly defined phenotype. Two nodes are connected if they differ in some minimal way, e.g., in a single nucleotide. We analyze human genome variation data from the 1,000 genomes project, and construct haploid genotype (haplotype) networks for 12,235 protein coding genes. The structure of these networks varies widely among genes, indicating different patterns of variation despite a shared evolutionary history. We focus on those genes whose genotype networks show many cycles, which can indicate homoplasy, i.e., parallel or convergent evolution, on the sequence level. For 42 genes, the observed number of cycles is so large that it cannot be explained by either chance homoplasy or recombination. When analyzing possible explanations, we discovered evidence for positive selection in 21 of these genes and, in addition, a potential role for constrained variation and purifying selection. Balancing selection plays at most a small role. The 42 genes with excess cycles are enriched in functions related to immunity and response to pathogens. Genotype networks are representations of genetic variation data that can help understand unusual patterns of genomic variation.
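The network construction described in this abstract (nodes are haplotypes, edges join haplotypes that differ at a single nucleotide) can be sketched in Python as follows. The cycle rank used here, edges minus nodes plus connected components, is only one simple proxy for "many cycles"; the abstract does not state how cycles were counted, so that choice is an assumption.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def genotype_network(haplotypes):
    """Nodes are distinct haplotypes; edges link pairs differing at one site."""
    nodes = sorted(set(haplotypes))
    edges = [(a, b) for a, b in combinations(nodes, 2) if hamming(a, b) == 1]
    return nodes, edges

def cycle_rank(nodes, edges):
    """Number of independent cycles: E - V + C, with C from union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + components

# Hypothetical two-site haplotypes: all four combinations form a square
square = cycle_rank(*genotype_network(["AA", "AT", "TA", "TT"]))
chain = cycle_rank(*genotype_network(["AA", "AT", "TT"]))
print(square, chain)
```

The four-haplotype square is the smallest cycle: reaching "TT" from "AA" requires the same two changes in either order, a pattern that can arise through recurrent mutation (homoplasy) or recombination.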
Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J
2014-12-19
The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. 
Specifically, high levels of sequence divergence were found within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.
Mansourian, Robert; Mutch, David M; Antille, Nicolas; Aubert, Jerome; Fogel, Paul; Le Goff, Jean-Marc; Moulin, Julie; Petrov, Anton; Rytz, Andreas; Voegel, Johannes J; Roberts, Matthew-Alan
2004-11-01
Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). Under circumstances where k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. Other more advanced statistical tests have been developed; however, their use and interpretation often remain difficult to implement in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining as an intuitive and computationally efficient procedure. The present study describes a Global Error Assessment (GEA) methodology to select differentially expressed genes in microarray datasets, and was developed using an in vitro experiment that compared control and interferon-gamma treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results of a similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates variability of gene expression in each bin to absolute expression levels and uses this in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates an increased stability, robustness and confidence in gene selection. A subset of the selected genes were validated by real-time reverse transcription-polymerase chain reaction (RT-PCR). All these results suggest that GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. 
The GEA code for R software is freely available upon request to authors.
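The binning idea in this abstract can be illustrated with a rough Python sketch: genes of similar mean expression share one pooled within-condition error estimate, which then serves as the denominator of an ANOVA-style F statistic. This is not the authors' R code. The published method models variability against expression level, whereas this sketch simply averages within bins, and all gene names and values are hypothetical.

```python
from statistics import mean, variance

def gene_stats(control, treated):
    """Return (overall mean, within-condition MSE, between-condition mean square)."""
    k1, k2 = len(control), len(treated)
    m1, m2 = mean(control), mean(treated)
    grand = mean(control + treated)
    mse = ((k1 - 1) * variance(control) + (k2 - 1) * variance(treated)) / (k1 + k2 - 2)
    msb = k1 * (m1 - grand) ** 2 + k2 * (m2 - grand) ** 2  # two groups: 1 d.f.
    return grand, mse, msb

def binned_f(genes, n_bins=2):
    """genes: {name: (control_values, treated_values)} -> {name: F-like statistic}.

    Assumes each bin has a non-zero pooled MSE.
    """
    stats = {g: gene_stats(c, t) for g, (c, t) in genes.items()}
    ordered = sorted(stats, key=lambda g: stats[g][0])  # sort by mean expression
    size = -(-len(ordered) // n_bins)  # ceiling division
    result = {}
    for start in range(0, len(ordered), size):
        bin_genes = ordered[start:start + size]
        pooled = mean(stats[g][1] for g in bin_genes)  # local error estimate
        for g in bin_genes:
            result[g] = stats[g][2] / pooled
    return result

# Hypothetical data with only k = 2 replicates per condition
genes = {
    "g1": ([1.0, 1.2], [1.1, 0.9]),
    "g2": ([1.0, 1.1], [3.0, 3.1]),
    "g3": ([10.0, 10.5], [10.2, 10.4]),
    "g4": ([10.0, 10.4], [14.0, 14.4]),
}
f_values = binned_f(genes)
print(f_values)
```

With so few replicates, a per-gene variance is very noisy. Borrowing the error estimate from genes of similar expression is what gives the test its stability.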
An ensemble rank learning approach for gene prioritization.
Lee, Po-Feng; Soo, Von-Wun
2013-01-01
Several different computational approaches have been developed to solve the gene prioritization problem. We intend to use ensemble boosting learning techniques to combine various computational approaches for gene prioritization in order to improve the overall performance. In particular, we add a heuristic weighting function to the Rankboost algorithm according to: 1) the absolute ranks generated by the adopted methods for a certain gene, and 2) the ranking relationship between all gene pairs from each prioritization result. We select 13 known prostate cancer genes in the OMIM database as the training set and protein-coding gene data in the HGNC database as the test set. We adopt the leave-one-out strategy for the ensemble rank boosting learning. The experimental results show that our ensemble learning approach outperforms the four gene-prioritization methods in the ToppGene suite in ranking the 13 known genes, in terms of mean average precision, ROC and AUC measures.
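Mean average precision, one of the evaluation measures named in this abstract, can be sketched in a few lines of Python. The gene identifiers below are placeholders, and the exact evaluation protocol of the study is not given in the abstract, so this shows only the generic definition.

```python
def average_precision(ranked, relevant):
    """Average of precision values taken at the rank of each relevant item."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, e.g. one per held-out gene."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical ranking with two known disease genes at ranks 1 and 3
ap = average_precision(["G1", "G2", "G3", "G4"], {"G1", "G3"})

# Leave-one-out style: each run has a single held-out gene, so AP = 1 / rank
map_score = mean_average_precision([
    (["G1", "G2", "G3"], {"G1"}),  # held-out gene ranked 1st -> 1.0
    (["G2", "G1", "G3"], {"G1"}),  # held-out gene ranked 2nd -> 0.5
])
print(ap, map_score)
```

In a leave-one-out setting each run contains a single relevant gene, so the measure reduces to the mean reciprocal rank of the held-out genes.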
Michel, Christian J.
2017-01-01
In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes. PMID:28420220
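The frame-based counting that underlies this approach can be illustrated with a small Python sketch. It tallies trinucleotides in the reading frame and in the two shifted frames of a single gene, and checks whether a set of trinucleotides is closed under reverse complementation. The per-gene statistic actually used in the paper is not described in the abstract, and the toy gene and toy set below are hypothetical rather than the published code X.

```python
from collections import Counter

def frame_counts(gene):
    """Trinucleotide counts in frame 0 (reading frame) and shifted frames 1 and 2."""
    return [
        Counter(gene[i:i + 3] for i in range(shift, len(gene) - 2, 3))
        for shift in (0, 1, 2)
    ]

def preferred_frame(gene, trinucleotide):
    """Frame in which the trinucleotide occurs most often (ties go to the lowest frame)."""
    counts = frame_counts(gene)
    return max((0, 1, 2), key=lambda f: counts[f][trinucleotide])

def is_self_complementary(code):
    """True if the reverse complement of every trinucleotide is also in the set."""
    complement = str.maketrans("ACGT", "TGCA")
    return all(t.translate(complement)[::-1] in code for t in code)

# Hypothetical 9-nt gene: frame 0 reads GAA GAA GAC
toy_gene = "GAAGAAGAC"
counts = frame_counts(toy_gene)
best = preferred_frame(toy_gene, "GAA")
print(counts, best)
print(is_self_complementary({"GAA", "TTC"}))
```

Applying such counts gene by gene, rather than pooling all genes, gives short and long genes the same weight, which is the change of definition the abstract describes.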
The landscape of cancer genes and mutational processes in breast cancer
Stephens, Philip J.; Tarpey, Patrick S.; Davies, Helen; Loo, Peter Van; Greenman, Chris; Wedge, David C.; Nik-Zainal, Serena; Martin, Sancha; Varela, Ignacio; Bignell, Graham R.; Yates, Lucy R.; Papaemmanuil, Elli; Beare, David; Butler, Adam; Cheverton, Angela; Gamble, John; Hinton, Jonathan; Jia, Mingming; Jayakumar, Alagu; Jones, David; Latimer, Calli; Lau, King Wai; McLaren, Stuart; McBride, David J.; Menzies, Andrew; Mudie, Laura; Raine, Keiran; Rad, Roland; Chapman, Michael Spencer; Teague, Jon; Easton, Douglas; Langerød, Anita; OSBREAC; Lee, Ming Ta Michael; Shen, Chen-Yang; Tee, Benita Tan Kiat; Huimin, Bernice Wong; Broeks, Annegien; Vargas, Ana Cristina; Turashvili, Gulisa; Martens, John; Fatima, Aquila; Miron, Penelope; Chin, Suet-Feung; Thomas, Gilles; Boyault, Sandrine; Mariani, Odette; Lakhani, Sunil R.; van de Vijver, Marc; van ’t Veer, Laura; Foekens, John; Desmedt, Christine; Sotiriou, Christos; Tutt, Andrew; Caldas, Carlos; Reis-Filho, Jorge S.; Aparicio, Samuel A. J. R.; Salomon, Anne Vincent; Børresen-Dale, Anne-Lise; Richardson, Andrea L.; Campbell, Peter J.; Futreal, P. Andrew; Stratton, Michael R.
2012-01-01
All cancers carry somatic mutations in their genomes. A subset, known as driver mutations, confer clonal selective advantage on cancer cells and are causally implicated in oncogenesis, and the remainder are passenger mutations. The driver mutations and mutational processes operative in breast cancer have not yet been comprehensively explored. Here we examine the genomes of 100 tumours for somatic copy number changes and mutations in the coding exons of protein-coding genes. The number of somatic mutations varied markedly between individual tumours. We found strong correlations between mutation number, age at which cancer was diagnosed and cancer histological grade, and observed multiple mutational signatures, including one present in about ten per cent of tumours characterized by numerous mutations of cytosine at TpC dinucleotides. Driver mutations were identified in several new cancer genes including AKT2, ARID1B, CASP8, CDKN1B, MAP3K1, MAP3K13, NCOR1, SMARCD1 and TBX3. Among the 100 tumours, we found driver mutations in at least 40 cancer genes and 73 different combinations of mutated cancer genes. The results highlight the substantial genetic diversity underlying this common disease. PMID:22722201
New Genes and Functional Innovation in Mammals
Luis Villanueva-Cañas, José; Ruiz-Orera, Jorge; Agea, M. Isabel; Gallo, Maria; Andreu, David
2017-01-01
The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes. PMID:28854603
2014-01-01
Background: Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. (‘intermediate form’) is unclear. Methods: Single specimens inferred to represent Fasciola sp. (‘intermediate form’; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). Results: The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. Conclusions: The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries. PMID:24685294
Liu, Guo-Hua; Gasser, Robin B; Young, Neil D; Song, Hui-Qun; Ai, Lin; Zhu, Xing-Quan
2014-03-31
Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. ('intermediate form') is unclear. Single specimens inferred to represent Fasciola sp. ('intermediate form'; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries.
Gene-culture coevolution in the age of genomics
Richerson, Peter J.; Boyd, Robert; Henrich, Joseph
2010-01-01
The use of socially learned information (culture) is central to human adaptations. We investigate the hypothesis that the process of cultural evolution has played an active, leading role in the evolution of genes. Culture normally evolves more rapidly than genes, creating novel environments that expose genes to new selective pressures. Many human genes that have been shown to be under recent or current selection are changing as a result of new environments created by cultural innovations. Some changed in response to the development of agricultural subsistence systems in the Early and Middle Holocene. Alleles coding for adaptations to diets rich in plant starch (e.g., amylase copy number) and to epidemic diseases evolved as human populations expanded (e.g., sickle cell and G6PD deficiency alleles that provide protection against malaria). Large-scale scans using patterns of linkage disequilibrium to detect recent selection suggest that many more genes evolved in response to agriculture. Genetic change in response to the novel social environment of contemporary modern societies is also likely to be occurring. The functional effects of most of the alleles under selection during the last 10,000 years are currently unknown. Also unknown is the role of paleoenvironmental change in regulating the tempo of hominin evolution. Although the full extent of culture-driven gene-culture coevolution is thus far unknown for the deeper history of the human lineage, theory and some evidence suggest that such effects were profound. Genomic methods promise to have a major impact on our understanding of gene-culture coevolution over the span of hominin evolutionary history. PMID:20445092
Poly(A) code analyses reveal key determinants for tissue-specific mRNA alternative polyadenylation
Weng, Lingjie; Li, Yi; Xie, Xiaohui; Shi, Yongsheng
2016-01-01
mRNA alternative polyadenylation (APA) is a critical mechanism for post-transcriptional gene regulation and is often regulated in a tissue- and/or developmental stage-specific manner. An ultimate goal for the APA field has been to be able to computationally predict APA profiles under different physiological or pathological conditions. As a first step toward this goal, we have assembled a poly(A) code for predicting tissue-specific poly(A) sites (PASs). Based on a compendium of over 600 features that have known or potential roles in PAS selection, we have generated and refined a machine-learning algorithm using multiple high-throughput sequencing-based data sets of tissue-specific and constitutive PASs. This code can predict tissue-specific PASs with >85% accuracy. Importantly, by analyzing the prediction performance based on different RNA features, we found that PAS context, including the distance between alternative PASs and the relative position of a PAS within the gene, is a key feature for determining the susceptibility of a PAS to tissue-specific regulation. Our poly(A) code provides a useful tool for not only predicting tissue-specific APA regulation, but also for studying its underlying molecular mechanisms. PMID:27095026
Batagov, Arsen O; Yarmishyn, Aliaksandr A; Jenjaroenpun, Piroon; Tan, Jovina Z; Nishida, Yuichiro; Kurochkin, Igor V
2013-10-16
Mammalian genomes are extensively transcribed, producing thousands of long non-protein-coding RNAs (lncRNAs). The biological significance and function of the vast majority of lncRNAs remain unclear. Recent studies have implicated several lncRNAs as playing important roles in embryonic development and cancer progression. LncRNAs are characterized by different genomic architectures in relation to their associated protein-coding genes. Our study aimed at linking lncRNA architecture with the dynamic patterns of their expression, using a model of differentiating human neuroblastoma cells. LncRNA expression was studied in a 120-hour time course of differentiation of human neuroblastoma SH-SY5Y cells into neurons upon treatment with retinoic acid (RA), a compound used for the treatment of neuroblastoma. A custom microarray chip was utilized to interrogate expression levels of 9,267 lncRNAs in the course of differentiation. We categorized lncRNAs into 19 architecture classes according to their position relative to protein-coding genes. For each architecture class, the expression dynamics of lncRNAs were studied in association with their protein-coding partners. This allowed us to demonstrate positive correlation of lncRNAs with their associated protein-coding genes at bidirectional promoters and for sense-antisense transcript pairs. In contrast, lncRNAs located in the introns and downstream of the protein-coding genes were characterized by negative correlation modes. We further classified the lncRNAs by the temporal patterns of their expression dynamics. We found that intronic and bidirectional promoter architectures are associated with rapid RA-dependent induction or repression of the corresponding lncRNAs, followed by their constant expression. At the same time, lncRNAs expressed downstream of protein-coding genes are characterized by rapid induction, followed by transcriptional repression.
Quantitative RT-PCR analysis confirmed the discovered functional modes for several selected lncRNAs associated with proteins involved in cancer and embryonic development. This is the first report detailing dynamical changes of multiple lncRNAs during RA-induced neuroblastoma differentiation. Integration of genomic and transcriptomic levels of information allowed us to demonstrate specific behavior of lncRNAs organized in different genomic architectures. This study also provides a list of lncRNAs with possible roles in neuroblastoma.
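The positive and negative "correlation modes" described in this abstract can be illustrated with a minimal Python sketch. It assumes that each lncRNA and its protein-coding partner are represented by expression values at the same time points, and that a Pearson coefficient with an arbitrary cutoff of 0.5 decides the mode. The abstract does not state which correlation statistic or cutoff the authors used, so the function names, the cutoff, and the example values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_mode(lncrna_expr, gene_expr, cutoff=0.5):
    """Label a lncRNA/gene pair by the sign of its time-course correlation.

    The cutoff is a hypothetical choice made for this illustration only.
    """
    r = pearson(lncrna_expr, gene_expr)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "uncorrelated"


# Hypothetical expression values at four shared time points.
print(correlation_mode([1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0]))  # co-induced pair
print(correlation_mode([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]))  # opposing pair
```

In practice a time course with few points gives noisy coefficients, so any cutoff of this kind would need to be checked against a permutation or replicate-based null.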
Smura, Teemu; Blomqvist, Soile; Vuorinen, Tytti; Ivanova, Olga; Samoilovich, Elena; Al-Hello, Haider; Savolainen-Kopra, Carita; Hovi, Tapani; Roivainen, Merja
2014-01-01
Genus Enterovirus (family Picornaviridae) consists of twelve species divided into genetically diverse types by their capsid protein VP1 coding sequences. Each enterovirus type can further be divided into intra-typic sub-clusters (genotypes). The aim of this study was to elucidate what leads to the emergence of novel enterovirus clades (types and genotypes). An evolutionary analysis was conducted for a sub-group of Enterovirus C species that contains types Coxsackievirus A21 (CVA-21), CVA-24, Enterovirus C95 (EV-C95), EV-C96 and EV-C99. VP1 gene datasets were collected and analysed to infer the phylogeny, rate of evolution, nucleotide and amino acid substitution patterns and signs of selection. In the VP1 coding gene, high intra-typic sequence diversities and robust grouping into distinct genotypes within each type were detected. Within each type the majority of nucleotide substitutions were synonymous and the non-synonymous substitutions tended to cluster in distinct highly polymorphic sites. Signs of positive selection were detected in some of these highly polymorphic sites, while strong negative selection was indicated in most of the codons. Despite robust clustering into intra-typic genotypes, only a few genotype-specific ‘signature’ amino acids were detected. In contrast, when different enterovirus types were compared, there was a clear tendency towards fixation of type-specific ‘signature’ amino acids. The results suggest that permanent fixation of type-specific amino acids is a hallmark associated with evolution of different enterovirus types, whereas neutral evolution and/or (frequency-dependent) positive selection in a few highly polymorphic amino acid sites are the dominant forms of evolution when strains within an enterovirus type are compared. PMID:24695547
Rodrigue, Nicolas; Lartillot, Nicolas
2017-01-01
Codon substitution models have traditionally attempted to uncover signatures of adaptation within protein-coding genes by contrasting the rates of synonymous and non-synonymous substitutions. Another modeling approach, known as the mutation-selection framework, attempts to explicitly account for selective patterns at the amino acid level, with some approaches allowing for heterogeneity in these patterns across codon sites. Under such a model, substitutions at a given position occur at the neutral or nearly neutral rate when they are synonymous, or when they correspond to replacements between amino acids of similar fitness; substitutions from high to low (low to high) fitness amino acids have comparatively low (high) rates. Here, we study the use of such a mutation-selection framework as a null model for the detection of adaptation. Following previous works in this direction, we include a deviation parameter that has the effect of capturing the surplus, or deficit, in non-synonymous rates, relative to what would be expected under a mutation-selection modeling framework that includes a Dirichlet process approach to account for across-codon-site variation in amino acid fitness profiles. We use simulations, along with a few real data sets, to study the behavior of the approach, and find it to have good power with a low false-positive rate. Altogether, we emphasize the potential of recent mutation-selection models in the detection of adaptation, calling for further model refinements as well as large-scale applications. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Basu, Swaraj; Larsson, Erik
2018-05-31
Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity
Shabalina, Svetlana A.; Spiridonov, Nikolay A.; Kashina, Anna
2013-01-01
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. The RNA level selection is acting on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions. PMID:23293005
SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate
Roffler, Gretchen H.; Amish, Stephen J.; Smith, Seth; Cosart, Ted F.; Kardos, Marty; Schwartz, Michael K.; Luikart, Gordon
2016-01-01
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.
Chloroplast DNA codon use: evidence for selection at the psb A locus based on tRNA availability.
Morton, B R
1993-09-01
Codon use in the three sequenced chloroplast genomes (Marchantia, Oryza, and Nicotiana) is examined. The chloroplast has a bias in that codons NNA and NNT are favored over synonymous NNC and NNG codons. This appears to be a consequence of an overall high A + T content of the genome. This pattern of codon use is not followed by the psb A gene of all three genomes and other psb A sequences examined. In this gene, the codon use favors NNC over NNT for twofold degenerate amino acids. In each case the only tRNA coded by the genome is complementary to the NNC codon. This codon use is similar to the codon use by chloroplast genes examined from Chlamydomonas reinhardtii. Since psb A is the major translation product of the chloroplast, this suggests that selection is acting on the codon use of this gene to adapt codons to tRNA availability, as previously suggested for unicellular organisms.
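The NNA/NNT versus NNC/NNG bias described in this abstract can be measured by tallying the third base of each codon in a coding sequence. The Python sketch below is a minimal illustration, not the author's procedure: it counts all codons, including Met, Trp and stop codons, rather than restricting the tally to synonymous codon families, and the function names and example sequence are hypothetical.

```python
from collections import Counter


def third_position_counts(cds):
    """Count the base at the third position of each complete codon.

    Codons containing anything other than A, C, G or T are skipped,
    and a trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter()
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if set(codon) <= set("ACGT"):
            counts[codon[2]] += 1
    return counts


def at3_fraction(cds):
    """Fraction of codons that end in A or T (NNA + NNT over all codons)."""
    counts = third_position_counts(cds)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete unambiguous codons in sequence")
    return (counts["A"] + counts["T"]) / total


# Hypothetical sequence: the four alanine codons GCA, GCT, GCC, GCG.
print(at3_fraction("GCAGCTGCCGCG"))
```

A gene such as psb A, which the abstract reports as favoring NNC over NNT for twofold degenerate amino acids, would be expected to stand out from the genome-wide pattern once the tally is restricted to those amino acids.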
Plastome Evolution in Hemiparasitic Mistletoes
Petersen, Gitte; Cuenca, Argelia; Seberg, Ole
2015-01-01
Santalales is an order of plants consisting almost entirely of parasites. Some, such as Osyris, are facultative root parasites whereas others, such as Viscum, are obligate stem parasitic mistletoes. Here, we report the complete plastome sequences of one species of Osyris and three species of Viscum, and we investigate the evolutionary aspects of structural changes and changes in gene content in relation to parasitism. Compared with typical angiosperm plastomes, the four Santalales plastomes are all reduced in size (10–22% compared with Vitis), and they have experienced rearrangements, mostly but not exclusively in the border areas of the inverted repeats. Additionally, a number of protein-coding genes (matK, infA, ccsA, rpl33, and all 11 ndh genes) as well as two transfer RNA genes (trnG-UCC and trnV-UAC) have been pseudogenized or completely lost. Most of the remaining plastid genes have a significantly changed selection pattern compared with other dicots, and the relaxed selection of photosynthesis genes is noteworthy. Although gene loss obviously reduces plastome size, intergenic regions were also shortened. As plastome modifications are generally most prominent in Viscum, they are most likely correlated with the increased nutritional dependence on the host compared with Osyris. PMID:26319577
Salanti, Ali; Lavstsen, Thomas; Nielsen, Morten A.; Theander, Thor G.; Leke, Rose G. F.; Lo, Yeung Y.; Bobbili, Naveen; Arnot, David E.; Taylor, Diane W.
2011-01-01
Placental malaria infections are caused by Plasmodium falciparum–infected red blood cells sequestering in the placenta by binding to chondroitin sulfate A, mediated by VAR2CSA, a variant of the PfEMP1 family of adhesion antigens. Recent studies have shown that many P. falciparum genomes have multiple genes coding for different VAR2CSA proteins, and parasites with >1 var2csa gene appear to be more common in pregnant women with placental malaria than in nonpregnant individuals. We present evidence that, in pregnant women, parasites containing multiple var2csa-type genes possess a selective advantage over parasites with a single var2csa gene. Accumulation of parasites with multiple copies of the var2csa gene during the course of pregnancy was also correlated with the development of antibodies involved in blocking VAR2CSA adhesion. The data suggest that multiplicity of var2csa-type genes enables P. falciparum parasites to persist for a longer period of time during placental infections, probably because of their greater capacity for antigenic variation and evasion of variant-specific immune responses. PMID:21592998
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes
Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J
2008-01-01
Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at . PMID:18366802
CMCpy: Genetic Code-Message Coevolution Models in Python
Becich, Peter J.; Stark, Brian P.; Bhat, Harish S.; Ardell, David H.
2013-01-01
Code-message coevolution (CMC) models represent coevolution of a genetic code and a population of protein-coding genes (“messages”). Formally, CMC models are sets of quasispecies coupled together for fitness through a shared genetic code. Although CMC models display plausible explanations for the origin of multiple genetic code traits by natural selection, useful modern implementations of CMC models are not currently available. To meet this need we present CMCpy, an object-oriented Python API and command-line executable front-end that can reproduce all published results of CMC models. CMCpy implements multiple solvers for leading eigenpairs of quasispecies models. We also present novel analytical results that extend and generalize applications of perturbation theory to quasispecies models and pioneer the application of a homotopy method for quasispecies with non-unique maximally fit genotypes. Our results therefore facilitate the computational and analytical study of a variety of evolutionary systems. CMCpy is free open-source software available from http://pypi.python.org/pypi/CMCpy/. PMID:23532367
Heat-inducible hygromycin resistance in transgenic tobacco.
Severin, K; Schöffl, F
1990-12-01
We have constructed a chimaeric gene consisting of the promoter of the soybean heat shock (hs) gene Gmhsp17, 6-L, the coding region of a hygromycin phosphotransferase (hpt) gene, and the termination sequence of the nopaline synthase (nos) gene. This gene fusion was introduced into tobacco by Agrobacterium-mediated gene transfer. Heat-inducible synthesis of mRNA was shown by northern hybridization, and translation of this RNA into a functional protein was indicated by plant growth on hygromycin-containing media in a temperature-dependent fashion. One hour incubation at 40 degrees C per day, applied for several weeks, was sufficient to express the resistant phenotype in transgenic plants containing the chimaeric hs-hpt gene. These data suggest that the hygromycin resistance gene is functional and faithfully controlled by the soybean hs promoter. The suitability of these transgenic plants for selection of mutations that alter the hs response is discussed.
Rajter, Ľubomír; Vďačný, Peter
2018-05-12
The class Litostomatea represents a highly diverse but monophyletic group, uniting both free-living and endosymbiotic ciliates. Ribosomal RNA genes and ITS-region sequences helped to recognize and define the main litostomatean lineages, but did not provide enough phylogenetic signal to unambiguously resolve their interrelationships. In this study, we attempted to improve the resolution among the main free-living predatory lineages by adding the gene coding for alpha-tubulin. However, our phylogenetic analyses challenged the performance of alpha-tubulin in reconstructing the evolutionary history of free-living litostomateans. We identified several mutually interconnected problems associated with the ciliate alpha-tubulin gene: the paucity of phylogenetic signal, molecular homoplasies and non-neutral evolution. Positive selection may generate molecular homoplasies (parallel evolution), while negative selection may cause a small number of changes and hence little phylogenetic informativeness. Both problems were encountered in nucleotide and amino acid alpha-tubulin alignments, indicating the action of various selective pressures. Taking into account the involvement of alpha-tubulin in many essential biological processes, this protein could be so strongly affected by purifying selection that it might even have become an inappropriate molecular marker for reconstruction of phylogenetic relationships. Therefore, great caution should be exercised when tubulin genes are included in phylogenetic and/or phylogenomic analyses. Copyright © 2018 Elsevier Inc. All rights reserved.
Sinha, Pallavi; Pazhamala, Lekha T.; Singh, Vikas K.; Saxena, Rachit K.; Krishnamurthy, L.; Azam, Sarwar; Khan, Aamir W.; Varshney, Rajeev K.
2016-01-01
Pigeonpea is a resilient crop, which is relatively more drought tolerant than many other legume crops. To understand the molecular mechanisms of this unique feature of pigeonpea, 51 genes that code for proteins with close similarity to the universal stress protein domain were selected using Hidden Markov Models (HMMs). Validation of these genes was conducted on three pigeonpea genotypes (ICPL 151, ICPL 8755, and ICPL 227) having different levels of drought tolerance. Gene expression analysis using qRT-PCR revealed 6, 8, and 18 genes to be ≥2-fold differentially expressed in ICPL 151, ICPL 8755, and ICPL 227, respectively. A total of 10 differentially expressed genes showed ≥2-fold up-regulation in the more drought tolerant genotype; these genes encoded four different classes of proteins: plant U-box protein (four genes), universal stress protein A-like protein (four genes), cation/H(+) antiporter protein (one gene) and an uncharacterized protein (one gene). Genes C.cajan_29830 and C.cajan_33874, belonging to uspA, were found to be significantly expressed in all three genotypes with ≥2-fold expression variations. Expression profiling of these two genes in four other legume crops revealed their specific role in pigeonpea. Therefore, these genes seem to be promising candidates for conferring drought tolerance specifically to pigeonpea. PMID:26779199
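The ≥2-fold criterion used in this abstract can be illustrated with a short Python sketch. It assumes the widely used 2^(-ΔΔCt) calculation for relative quantification from qRT-PCR cycle thresholds. The abstract does not say which quantification method the authors applied, so this choice, the function names, and the Ct values are assumptions made for illustration only.

```python
def fold_change(ct_target_stress, ct_ref_stress, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumed, not from the abstract).

    Each Ct of the target gene is first normalized to a reference gene in the
    same sample, then the stressed sample is compared with the control.
    """
    delta_stress = ct_target_stress - ct_ref_stress
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_stress - delta_control)


def is_differential(fc, min_fold=2.0):
    """True when expression changes at least min_fold up or down."""
    return fc >= min_fold or fc <= 1 / min_fold


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# under stress, with the reference gene unchanged.
fc = fold_change(22.0, 18.0, 24.0, 18.0)
print(fc, is_differential(fc))
```

The calculation assumes roughly 100% amplification efficiency for both target and reference; with measured efficiencies an efficiency-corrected formula would be the usual substitute.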
Hayes, Michael L; Giang, Karolyn; Mulligan, R Michael
2012-05-14
Pentatricopeptide repeat (PPR) proteins are required for numerous RNA processing events in plant organelles including C-to-U editing, splicing, stabilization, and cleavage. Fifteen PPR proteins are known to be required for RNA editing at 21 sites in Arabidopsis chloroplasts, and belong to the PLS class of PPR proteins. In this study, we investigate the co-evolution of four PPR genes (CRR4, CRR21, CLB19, and OTP82) and their six editing targets in Brassicaceae species. PPR genes are composed of approximately 10 to 20 tandem repeats and each repeat has two α-helical regions, helix A and helix B, that are separated by short coil regions. Each repeat and structural feature was examined to determine the selective pressures on these regions. All of the PPR genes examined are under strong negative selection. Multiple independent losses of editing site targets are observed for both CRR21 and OTP82. In several species lacking the known editing target for CRR21, PPR genes are truncated near the 17th PPR repeat. The coding sequences of the truncated CRR21 genes are maintained under strong negative selection; however, the 3' UTR sequences beyond the truncation site have substantially diverged. Phylogenetic analyses of four PPR genes show that sequences corresponding to helix A are high compared to helix B sequences. Differential evolutionary selection of helix A versus helix B is observed in both plant and mammalian PPR genes. PPR genes and their cognate editing sites are mutually constrained in evolution. Editing sites are frequently lost by replacement of an edited C with a genomic T. After the loss of an editing site, the PPR genes are observed with three outcomes: first, few changes are detected in some cases; second, the PPR gene is present as a pseudogene; and third, the PPR gene is present but truncated in the C-terminal region. 
The retention of truncated forms of CRR21 that are maintained under strong negative selection even in the absence of an editing site target suggests that unrecognized function(s) might exist for this PPR protein. PPR gene sequences that encode helix A are under strong selection, and could be involved in RNA substrate recognition.
2010-01-01
Background Various enzyme inhibitors act on key insect gut digestive hydrolases, including alpha-amylases and proteinases. Alpha-amylase inhibitors have been widely investigated for their possible use in strengthening a plant's defense against insects that are highly dependent on starch as an energy source. We attempted to unravel the diversity of monomeric alpha-amylase inhibitor genes of Israeli and Golan Heights' wild emmer wheat with different ecological factors (e.g., geography, water, and temperature). Population methods that analyze the nature and frequency of allele diversity within a species and the codon analysis method (comparing patterns of synonymous and non-synonymous changes in protein coding sequences) were used to detect natural selection. Results Three hundred and forty-eight sequences encoding monomeric alpha-amylase inhibitors (WMAI) were obtained from 14 populations of wild emmer wheat. The frequency of SNPs in WMAI genes was 1 out of 16.3 bases, where 28 SNPs were detected in the coding sequence. The results of purifying and the positive selection hypothesis (p < 0.05) showed that the sequences of WMAI were contributed by both natural selection and co-evolution, which ensured conservation of protein function and inhibition against diverse insect amylases. The majority of amino acid substitutions occurred at the C-terminal (positive selection domain), which ensured the stability of WMAI. SNPs in this gene could be classified into several categories associated with water, temperature, and geographic factors, respectively. Conclusions Great diversity at the WMAI locus, both between and within populations, was detected in the populations of wild emmer wheat. It was revealed that WMAI were naturally selected for across populations by a ratio of dN/dS as expected. Ecological factors, singly or in combination, explained a significant proportion of the variations in the SNPs. 
A sharp genetic divergence over very short geographic distances compared to a small genetic divergence between large geographic distances also suggested that the SNPs were subjected to natural selection, and ecological factors had an important evolutionary role in polymorphisms at this locus. According to population and codon analysis, these results suggested that monomeric alpha-amylase inhibitors are adaptively selected under different environmental conditions. PMID:20534122
The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project.
Peng, Ting; Wang, Li; Li, Guisen
2017-08-11
APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races using data from the 1000 Genomes Project. Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory region or coding region were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and admixed populations. Tag SNPs were selected to evaluate the haplotype diversity in the four groups using HaploStats software. The APOL1 gene is surrounded by some of the most polymorphic genes in the human genome, and variation in APOL1 was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR. A total of 12 SNPs in the URR and 24 SNPs in the 3'UTR were considered common variants with MAF ≥ 1%. Notably, URR-1 was present at lower frequencies in European populations, while the other three haplotypes showed the opposite pattern. The 3'UTR contains several high-frequency variant sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 were much more frequent in the African population, while UTR-2, UTR-3 and UTR-4 were much less frequent. In the APOL1 coding region, the two higher-frequency SNPs of G1 lowered the frequency of haplotype H-1 when all populations were pooled, and the two G1 mutations widened the diversity among the four populations (P1 = 3.33E-4 vs P2 = 3.61E-30). The distributions of APOL1 gene variants and haplotypes differed significantly among the populations, in both regulatory and coding regions. These findings could provide clues for future genetic studies of APOL1-related diseases.
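The MAF ≥ 1% threshold used in this abstract to separate common from rare variants can be sketched in a few lines of Python. The sketch assumes biallelic SNPs and 5008 sampled chromosomes (2504 diploid individuals, as stated above). The SNP names and allele counts are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Frequency of the rarer allele at a biallelic site."""
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("allele counts out of range")
    alt_freq = alt_count / total_alleles
    return min(alt_freq, 1 - alt_freq)


def common_variants(snps, threshold=0.01):
    """Return names of SNPs whose minor allele frequency meets the threshold.

    `snps` is an iterable of (name, alt_allele_count, total_alleles) tuples.
    """
    return [
        name
        for name, alt_count, total in snps
        if minor_allele_frequency(alt_count, total) >= threshold
    ]


# Hypothetical counts out of 5008 chromosomes (2504 diploid individuals).
example = [
    ("snp_a", 51, 5008),    # alternate allele just above 1%
    ("snp_b", 10, 5008),    # rare alternate allele
    ("snp_c", 4990, 5008),  # alternate allele is the major allele; minor is rare
    ("snp_d", 2504, 5008),  # both alleles at 50%
]
print(common_variants(example))
```

Taking the minimum of the two allele frequencies matters for sites like the third example, where the alternate allele is nearly fixed and a filter on alternate-allele frequency alone would misclassify the site as common.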
Selection on the human bitter taste gene, TAS2R16, in Eurasian populations.
Li, Hui; Pakstis, Andrew J; Kidd, Judith R; Kidd, Kenneth K
2011-06-01
Bitter taste is one of the most important senses alerting humans to noxious foods. In gatherer communities, sensitivity to bitterness is presumably advantageous because of various noxious plants. TAS2R16 is the gene coding the taste receptor molecules for some of the most common toxins in plants. A previous study of this gene indicated selection has increased the frequency of a derived allele in this gene that arose before the human expansion out of Africa. We have applied a different methodology for detecting selection, the Long Range Haplotype (LRH) analysis, to TAS2R16 in a larger sampling of populations from around the world. The haplotype with the derived alleles at both the functional polymorphism and a polymorphism in the regulatory region of TAS2R16 showed evidence for recent positive selection in most of the Eurasian populations, though the highest selection signal occurs in Mbuti Pygmies, an African hunter-gatherer group. In Eurasia, only populations of Mesopotamia and the southeast coast of China have no signals of selection. The evidence of recent selection found in most Eurasian populations differs from the geographic pattern seen in the earlier study of selection. One can speculate that the difference may result from a gathering lifestyle extending into the most recent 10,000 yrs and the need to recognize newly encountered bitter natural toxins as populations expanded into new environments and the biota changes with the ending of the most recent ice age. Alternatively, the promoter region variant may be a marker for altered function beyond what the derived amino acid allele conferred.
Nicolas, Laura; Cols, Montserrat; Choi, Jee Eun; Chaudhuri, Jayanta; Vuong, Bao
2018-01-01
Adaptive immune responses require the generation of a diverse repertoire of immunoglobulins (Igs) that can recognize and neutralize a seemingly infinite number of antigens. V(D)J recombination creates the primary Ig repertoire, which subsequently is modified by somatic hypermutation (SHM) and class switch recombination (CSR). SHM promotes Ig affinity maturation whereas CSR alters the effector function of the Ig. Both SHM and CSR require activation-induced cytidine deaminase (AID) to produce dU:dG mismatches in the Ig locus that are transformed into untemplated mutations in variable coding segments during SHM or DNA double-strand breaks (DSBs) in switch regions during CSR. Within the Ig locus, DNA repair pathways are diverted from their canonical role in maintaining genomic integrity to permit AID-directed mutation and deletion of gene coding segments. Recently identified proteins, genes, and regulatory networks have provided new insights into the temporally and spatially coordinated molecular interactions that control the formation and repair of DSBs within the Ig locus. Unravelling the genetic program that allows B cells to selectively alter the Ig coding regions while protecting non-Ig genes from DNA damage advances our understanding of the molecular processes that maintain genomic integrity as well as humoral immunity. PMID:29744038
Duellman, Tyler; Warren, Christopher; Yang, Jay
2014-01-01
Microribonucleic acids (miRNAs) work with exquisite specificity and are able to distinguish a target from a non-target based on a single nucleotide mismatch in the core nucleotide domain. We questioned whether miRNA regulation of gene expression could occur in a single nucleotide polymorphism (SNP)-specific manner, manifesting as a post-transcriptional control of expression of genetic polymorphisms. In our recent study of the functional consequences of matrix metalloproteinase (MMP)-9 SNPs, we discovered that expression of a coding exon SNP in the pro-domain of the protein resulted in a profound decrease in the secreted protein. This missense SNP results in the N38S amino acid change and a loss of an N-glycosylation site. A systematic study demonstrated that the loss of secreted protein was due not to the loss of an N-glycosylation site, but rather to SNP-specific targeting by miR-671-3p and miR-657. Bioinformatics analysis identified 41 SNP-specific miRNAs targeting MMP-9 SNPs, mostly in the coding exon, and an extension of the analysis to chromosome 20, where the MMP-9 gene is located, suggested that SNP-specific miRNAs targeting the coding exon are prevalent. This selective post-transcriptional regulation of a target messenger RNA harboring genetic polymorphisms by miRNAs offers an SNP-dependent post-transcriptional regulatory mechanism, allowing for polymorphism-specific differential gene regulation. PMID:24627221
Transformable Rhodobacter strains, method for producing transformable Rhodobacter strains
Laible, Philip D.; Hanson, Deborah K.
2018-05-08
The invention provides an organism for expressing foreign DNA, the organism engineered to accept standard DNA carriers. The genome of the organism codes for intracytoplasmic membranes and features an interruption in at least one of the genes coding for restriction enzymes. Further provided is a system for producing biological materials comprising: selecting a vehicle to carry DNA which codes for the biological materials; determining sites on the vehicle's DNA sequence susceptible to restriction enzyme cleavage; choosing an organism to accept the vehicle based on that organism not acting upon at least one of said vehicle's sites; engineering said vehicle to contain said DNA; thereby creating a synthetic vector; and causing the synthetic vector to enter the organism so as to cause expression of said DNA.
Zavala, Eduardo; Reyes, Daniela; Deerenberg, Robert; Vidal, Rodrigo
2017-05-11
MicroRNAs are key non-coding RNA molecules that play a relevant role in the regulation of gene expression through translational repression and/or transcript cleavage during normal development and physiological adaptation processes like stress. Quantitative reverse transcription polymerase chain reaction (RT-qPCR) has become the approach normally used to determine the levels of microRNAs. However, this approach requires an endogenous reference. An improper selection of endogenous references can result in confusing interpretation of data. The aim of this study was to identify and validate appropriate endogenous reference miRNA genes for normalizing RT-qPCR surveys of miRNA expression in four different tissues of Atlantic salmon, under handling and confinement stress conditions associated with the early or primary stress response. Nine candidate reference normalizers, including microRNAs and nuclear genes, normally used in vertebrate microRNA expression studies were selected from the literature, validated by RT-qPCR and analyzed by the algorithms geNorm and NormFinder. The results revealed that the ssa-miR-99-5p gene was the most stable overall and that ssa-miR-99-5p and ssa-miR-23a-5p genes were the best combination. Moreover, the suitability of ssa-miR-99-5p and ssa-miR-23a-5p as endogenous reference genes was demonstrated by the expression analysis of the ssa-miR-193-5p gene.
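The geNorm algorithm named above ranks candidate reference genes by a stability measure M: for each gene, the average over all other candidates of the standard deviation, across samples, of the log2 expression ratio between the two genes. Lower M means more stable expression. The Python sketch below illustrates that calculation under stated assumptions: the function name, the gene labels and the input layout (relative quantities, not raw Cq values) are hypothetical, and it does not reproduce the stepwise exclusion of the least stable gene that the full procedure performs.

```python
import math
from statistics import stdev


def genorm_m(expr):
    """Stability measure M per gene (lower = more stable).

    expr: dict mapping gene name -> list of relative expression
    quantities, one value per sample, same sample order for every gene.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in each sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_variation.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


# Hypothetical quantities: ref_A and ref_B co-vary, ref_C does not.
toy = {
    "ref_A": [1.0, 2.0, 4.0],
    "ref_B": [2.0, 4.0, 8.0],
    "ref_C": [1.0, 8.0, 1.0],
}
for gene, m in genorm_m(toy).items():
    print(gene, round(m, 3))
```

In this toy input the ratio ref_A/ref_B is constant across samples, so both genes receive a lower M than ref_C.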
Sun, Zichen; Stack, Colin; Šlapeta, Jan
2012-05-25
In order to investigate the genetic variation between Tritrichomonas foetus from bovine and feline origins, cysteine protease 8 (CP8) coding sequence was selected as the polymorphic DNA marker. Direct sequencing of CP8 coding sequence of T. foetus from four feline isolates and two bovine isolates with polymerase chain reaction successfully revealed conserved nucleotide polymorphisms between feline and bovine isolates. These results provide useful information for CP8-based molecular differentiation of T. foetus genotypes. Copyright © 2011 Elsevier B.V. All rights reserved.
Dashtban, M; Balafar, Mohammadali
2017-03-01
Gene selection is a demanding task for microarray data analysis. The diverse complexity of different cancers makes this issue still challenging. In this study, a novel evolutionary method based on genetic algorithms and artificial intelligence is proposed to identify predictive genes for cancer classification. A filter method was first applied to reduce the dimensionality of feature space followed by employing an integer-coded genetic algorithm with dynamic-length genotype, intelligent parameter settings, and modified operators. The algorithmic behaviors including convergence trends, mutation and crossover rate changes, and running time were studied, conceptually discussed, and shown to be coherent with literature findings. Two well-known filter methods, Laplacian and Fisher score, were examined considering similarities, the quality of selected genes, and their influences on the evolutionary approach. Several statistical tests concerning choice of classifier, choice of dataset, and choice of filter method were performed, and they revealed some significant differences between the performance of different classifiers and filter methods over datasets. The proposed method was benchmarked upon five popular high-dimensional cancer datasets; for each, top explored genes were reported. Comparing the experimental results with several state-of-the-art methods revealed that the proposed method outperforms previous methods in DLBCL dataset. Copyright © 2017 Elsevier Inc. All rights reserved.
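The Fisher score filter mentioned above can be illustrated with a short Python sketch. It uses the common textbook form of the score (class-size-weighted spread of class means divided by class-size-weighted within-class variance); whether the study used exactly this variant is an assumption, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X: list of samples, each a list of expression values.
    y: list of class labels, one per sample.
    """
    n_features = len(X[0])
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class-size-weighted spread of class means
        within = 0.0   # class-size-weighted within-class variance
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(X, y, k):
    """Indices of the k highest-scoring features, best first."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Hypothetical data: gene 0 separates the two classes, gene 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.2, 0.8]]
y = [0, 0, 1, 1]
print(top_k_features(X, y, 1))
```

A filter of this kind only shrinks the candidate set; the genetic algorithm described in the abstract would then search over subsets of the retained genes.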
CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Congrains, Ada; Kamide, Kei; Katsuya, Tomohiro
Highlights: ► ANRIL maps in the strongest susceptibility locus for cardiovascular disease. ► Silencing of ANRIL leads to altered expression of tissue remodeling-related genes. ► The effects of ANRIL on gene expression are splicing variant specific. ► ANRIL affects progression of cardiovascular disease by regulating proliferation and apoptosis pathways. -- Abstract: ANRIL is a newly discovered non-coding RNA lying on the strongest genetic susceptibility locus for cardiovascular disease (CVD) in the chromosome 9p21 region. Genome-wide association studies have been linking polymorphisms in this locus with CVD and several other major diseases such as diabetes and cancer. The role of this non-coding RNA in atherosclerosis progression is still poorly understood. In this study, we investigated the implication of ANRIL in the modulation of gene sets directly involved in atherosclerosis. We designed and tested siRNA sequences to selectively target two exons (exon 1 and exon 19) of the transcript and successfully knocked down expression of ANRIL in human aortic vascular smooth muscle cells (HuAoVSMC). We used a pathway-focused RT-PCR array to profile gene expression changes caused by ANRIL knock down. Notably, the genes affected by each of the siRNAs were different, suggesting that different splicing variants of ANRIL might have distinct roles in cell physiology. Our results suggest that ANRIL splicing variants play a role in coordinating tissue remodeling, by modulating the expression of genes involved in cell proliferation, apoptosis, extra-cellular matrix remodeling and inflammatory response to finally impact in the risk of cardiovascular disease and other pathologies.
Pujolar, J M; Jacobsen, M W; Bekkevold, D; Lobón-Cervià, J; Jónsson, B; Bernatchez, L; Hansen, M M
2015-08-13
Species showing complex life cycles provide excellent opportunities to study the genetic associations between life cycle stages, as selective pressures may differ before and after metamorphosis. The European eel presents a complex life cycle with two metamorphoses, a first metamorphosis from larvae into glass eels (juvenile stage) and a second metamorphosis into silver eels (adult stage). We tested the hypothesis that different genes and gene pathways will be under selection at different life stages when comparing the genetic associations between glass eels and silver eels. We used two sets of markers to test for selection: first, we genotyped individuals using a panel of 80 coding-gene single nucleotide polymorphisms (SNPs) developed in American eel; second, we investigated selection at the genome level using a total of 153,423 RAD-sequencing generated SNPs widely distributed across the genome. Using the RAD approach, outlier tests identified a total of 2413 (1.57%) potentially selected SNPs. Functional annotation analysis identified signal transduction pathways as the most over-represented group of genes, including MAPK/Erk signalling, calcium signalling and GnRH (gonadotropin-releasing hormone) signalling. Many of the over-represented pathways were related to growth, while others could result from the different conditions that eels inhabit during their life cycle. The observation of different genes and gene pathways under selection when comparing glass eels vs. silver eels supports the adaptive decoupling hypothesis for the benefits of metamorphosis. Partitioning the life cycle into discrete morphological phases may be overall beneficial since it allows the different life stages to respond independently to their unique selection pressures. This might translate into a more effective use of food and niche resources and/or performance of phase-specific tasks (e.g. feeding in the case of glass eels, migrating and reproducing in the case of silver eels).
D'Angeli, Simone; Matteucci, Maya; Fattorini, Laura; Gismondi, Angelo; Ludovici, Matteo; Canini, Antonella; Altamura, Maria Maddalena
2016-05-01
Cold-acclimation genes in woody dicots without winter-dormancy, e.g., olive-tree, need investigation. Positive relationships between OeFAD8, OeOSM, and OeLIP19 and olive-tree cold-acclimation exist, and couple with increased lipid unsaturation and cutinisation. Olive-tree is a woody species with no winter-dormancy and low frost-tolerance. However, cold-tolerant genotypes were empirically selected, highlighting that cold-acclimation might be acquired. Proteins needed for olive-tree cold-acclimation are unknown, even if roles for osmotin (OeOSM) as leaf cryoprotectant, and seed lipid-transfer protein for endosperm cutinisation under cold, were demonstrated. In other species, FAD8, coding a desaturase producing α-linolenic acid, is activated by temperature-lowering, concomitantly with bZIP-LIP19 genes. The research focussed on finding OeLIP19 gene(s) in the olive-tree genome and analyzing its/their expression, and that of OeFAD8 and OeOSM, in drupes and leaves under different cold-conditions/developmental stages/genotypes, in comparison with changes in unsaturated lipids and cell wall cutinisation. Cold-induced cytosolic calcium transients always occurred in leaves/drupes of some genotypes, e.g., Moraiolo, but ceased in others, e.g., Canino, at specific drupe stages/cold-treatments, suggesting cold-acclimation acquisition only in the latter genotypes. Canino and Moraiolo were selected for further analyses. Cold-acclimation in Canino was confirmed by electrolyte leakage from leaf/drupe membranes that was highly reduced in comparison with Moraiolo. Strong increases in fruit-epicarp/leaf-epidermis cutinisation characterized cold-acclimated Canino, and positively coupled with OeOSM expression, and immunolocalization of the coded protein. OeFAD8 expression increased with cold-acclimation, as did the production of α-linolenic acid and related compounds. An OeLIP19 gene was isolated. Its levels changed with a trend similar to OeFAD8.
All together, results sustain a positive relationship between OeFAD8, OeOSM and OeLIP19 expression in olive-tree cold-acclimation. The parallel changes in unsaturated lipids and cutinisation concur to suggest orchestrated roles of the coded proteins in the process.
Coding of Class I and II aminoacyl-tRNA synthetases
Carter, Charles W.
2018-01-01
SUMMARY The aminoacyl-tRNA synthetases and their cognate transfer RNAs translate the universal genetic code. The twenty canonical amino acids are sufficiently diverse to create a selective advantage for dividing amino acid activation between two distinct, apparently unrelated superfamilies of synthetases, Class I amino acids being generally larger and less polar, Class II amino acids smaller and more polar. Biochemical, bioinformatic, and protein engineering experiments support the hypothesis that the two Classes descended from opposite strands of the same ancestral gene. Parallel experimental deconstructions of Class I and II synthetases reveal parallel losses in catalytic proficiency at two novel modular levels—protozymes and Urzymes—associated with the evolution of catalytic activity. Bi-directional coding supports an important unification of the proteome; affords a genetic relatedness metric—middle base-pairing frequencies in sense/antisense alignments—that probes more deeply into the evolutionary history of translation than do single multiple sequence alignments; and has facilitated the analysis of hitherto unknown coding relationships in tRNA sequences. Reconstruction of native synthetases by modular thermodynamic cycles facilitated by domain engineering emphasizes the subtlety associated with achieving high specificity, shedding new light on allosteric relationships in contemporary synthetases. Synthetase Urzyme structural biology suggests that they are catalytically active molten globules, broadening the potential manifold of polypeptide catalysts accessible to primitive genetic coding and motivating revisions of the origins of catalysis. Finally, bi-directional genetic coding of some of the oldest genes in the proteome places major limitations on the likelihood that any RNA World preceded the origins of coded proteins. PMID:28828732
Moon, Jiyun M; Aronoff, David M; Capra, John A; Abbot, Patrick; Rokas, Antonis
2018-03-28
Sialic acids are nine carbon sugars ubiquitously found on the surfaces of vertebrate cells and are involved in various immune response-related processes. In humans, at least 58 genes spanning diverse functions, from biosynthesis and activation to recycling and degradation, are involved in sialic acid biology. Because of their role in immunity, sialic acid biology genes have been hypothesized to exhibit elevated rates of evolutionary change. Consistent with this hypothesis, several genes involved in sialic acid biology have experienced higher rates of non-synonymous substitutions in the human lineage than their counterparts in other great apes, perhaps in response to ancient pathogens that infected hominins millions of years ago (paleopathogens). To test whether sialic acid biology genes have also experienced more recent positive selection during the evolution of the modern human lineage, reflecting adaptation to contemporary cosmopolitan or geographically-restricted pathogens, we examined whether their protein-coding regions showed evidence of recent hard and soft selective sweeps. This examination involved the calculation of four measures that quantify changes in allele frequency spectra, extent of population differentiation, and haplotype homozygosity caused by recent hard and soft selective sweeps for 55 sialic acid biology genes using publicly available whole genome sequencing data from 1,668 humans from three ethnic groups. To disentangle evidence for selection from confounding demographic effects, we compared the observed patterns in sialic acid biology genes to simulated sequences of the same length under a model of neutral evolution that takes into account human demographic history. We found that the patterns of genetic variation of most sialic acid biology genes did not significantly deviate from neutral expectations and were not significantly different among genes belonging to different functional categories. 
Those few sialic acid biology genes that significantly deviated from neutrality either experienced soft sweeps or population-specific hard sweeps. Interestingly, while most hard sweeps occurred on genes involved in sialic acid recognition, most soft sweeps involved genes associated with recycling, degradation and activation, transport, and transfer functions. We propose that the lack of signatures of recent positive selection for the majority of the sialic acid biology genes is consistent with the view that these genes regulate immune responses against ancient rather than contemporary cosmopolitan or geographically restricted pathogens. Copyright © 2018 Moon et al.
New Genes and Functional Innovation in Mammals.
Luis Villanueva-Cañas, José; Ruiz-Orera, Jorge; Agea, M Isabel; Gallo, Maria; Andreu, David; Albà, M Mar
2017-07-01
The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Theory of microbial genome evolution
NASA Astrophysics Data System (ADS)
Koonin, Eugene
Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% `ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-10-03
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the disease-gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274
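The abstract does not spell out how each of the 45 entropy elements is defined, so the following Python sketch is only one plausible reading, stated as an assumption: each element is the Shannon-entropy term -p log2 p, where p is the fraction of a disease's known genes that fall in a given chromosome substructure. The function name and the toy counts are hypothetical.

```python
import math


def entropy_terms(counts):
    """Per-substructure Shannon-entropy terms for one disease.

    counts: number of the disease's genes in each chromosome substructure
    (the abstract uses 45 substructures; any length works here).
    Returns a list of -p * log2(p) values; their sum is the Shannon entropy
    of the gene distribution. ASSUMPTION: this per-element definition is an
    illustration, not the published formula.
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    terms = []
    for c in counts:
        p = c / total
        terms.append(-p * math.log2(p) if p > 0 else 0.0)
    return terms


# Hypothetical disease with genes spread evenly over two of four substructures.
vec = entropy_terms([2, 2, 0, 0])
print(vec, sum(vec))
```

Under this reading, a disease whose genes concentrate in one substructure yields a vector close to all zeros, while an evenly spread disease yields many small positive entries.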
Lyons, Brendan M; McHenry, Monique A; Barrington, David S
2017-07-01
Cytosolic phosphoglucose isomerase (pgiC) is an enzyme essential to glycolysis found universally in eukaryotes, but broad understanding of variation in the gene coding for pgiC is lacking for ferns. We used a substantially expanded representation of the gene for Andean species of the fern genus Polystichum to characterize pgiC in ferns relative to angiosperms, insects, and an amoebozoan; assess the impact of selection versus neutral evolutionary processes on pgiC; and explore evolutionary relationships of selected Andean species. The dataset of complete sequences comprised nine accessions representing seven species and one hybrid from the Andes and Serra do Mar. The aligned sequences of the full data set comprised 3376 base pairs (70% of the entire gene) including 17 exons and 15 introns from two central areas of the gene. The exons are highly conserved relative to angiosperms and retain substantial homology to insect pgiC, but intron length and structure are unique to the ferns. Average intron size is similar to angiosperms; intron number and location in insects are unlike those of the plants we considered. The introns included an array of indels and, in intron 7, an extensive microsatellite array with potential utility in analyzing population-level histories. Bayesian and maximum-parsimony analysis of 129 variable nucleotides in the Andean polystichums revealed that 59 (1.7% of the 3376 total) were phylogenetically informative; most of these united sister accessions. The phylogenetic trees for the Andean polystichums were incongruent with previously published cpDNA trees for the same taxa, likely the result of rapid evolutionary change in the introns and contrasting stability in the exons. The exons code a total of seven amino-acid substitutions. Comparison of non-synonymous to synonymous substitutions did not suggest that the pgiC gene is under selection in the Andes. 
Variation in pgiC including two additional accessions represented by incomplete sequences provided new insights into reticulate relationships among Andean taxa. Copyright © 2017 Elsevier Inc. All rights reserved.
de Freitas, Michele C. R.; Resende, Juliana A.; Ferreira-Machado, Alessandra B.; Saji, Guadalupe D. R. Q.; de Vasconcelos, Ana T. R.; da Silva, Vânia L.; Nicolás, Marisa F.; Diniz, Cláudio G.
2016-01-01
Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic analysis with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From a wild type B. fragilis ATCC 43859, four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology term (GO), indicating that most known cellular functions were represented. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes which act in several metabolic pathways involved in metronidazole response such as drug activation, defense mechanisms against superoxide ions, high expression level of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure leads to drastic persistent changes in the B. fragilis gene expression patterns. These results may help to elucidate the B. fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment. PMID:27703449
Domestic animals as models for biomedical research
Andersson, Leif
2016-01-01
Domestic animals are unique models for biomedical research due to their long history (thousands of years) of strong phenotypic selection. This process has enriched for novel mutations that have contributed to phenotype evolution in domestic animals. The characterization of such mutations provides insights into gene function and biological mechanisms. This review summarizes genetic dissection of about 50 genetic variants affecting pigmentation, behaviour, metabolic regulation, and the pattern of locomotion. The variants are controlled by mutations in about 30 different genes, and for 10 of these our group was the first to report an association between the gene and a phenotype. Almost half of the reported mutations occur in non-coding sequences, suggesting that this is the most common type of polymorphism underlying phenotypic variation since this is a biased list where the proportion of coding mutations is inflated as they are easier to find. The review documents that structural changes (duplications, deletions, and inversions) have contributed significantly to the evolution of phenotypic diversity in domestic animals. Finally, we describe five examples of evolution of alleles, which means that alleles have evolved by the accumulation of several consecutive mutations affecting the function of the same gene. PMID:26479863
Winkler, Isaac S; Blaschke, Jeremy D; Davis, Daniel J; Stireman, John O; O'Hara, James E; Cerretti, Pierfilippo; Moulton, John K
2015-07-01
Molecular phylogenetic studies at all taxonomic levels often infer rapid radiation events based on short, poorly resolved internodes. While such rapid episodes of diversification are an important and widespread evolutionary phenomenon, much of this poor phylogenetic resolution may be attributed to the continuing widespread use of "traditional" markers (mitochondrial, ribosomal, and some nuclear protein-coding genes) that are often poorly suited to resolve difficult, higher-level phylogenetic problems. Here we reconstruct phylogenetic relationships among a representative set of taxa of the parasitoid fly family Tachinidae and related outgroups of the superfamily Oestroidea. The Tachinidae are one of the most species rich, yet evolutionarily recent families of Diptera, providing an ideal case study for examining the differential performance of loci in resolving phylogenetic relationships and the benefits of adding more loci to phylogenetic analyses. We assess the phylogenetic utility of nine genes including both traditional genes (e.g., CO1 mtDNA, 28S rDNA) and nuclear protein-coding genes newly developed for phylogenetic analysis. Our phylogenetic findings, based on a limited set of taxa, include: a close relationship between Tachinidae and the calliphorid subfamily Polleninae, monophyly of Tachinidae and the subfamilies Exoristinae and Dexiinae, subfamily groupings of Dexiinae+Phasiinae and Tachininae+Exoristinae, and robust phylogenetic placement of the somewhat enigmatic genera Strongygaster, Euthera, and Ceracia. In contrast to poor resolution and phylogenetic incongruence of "traditional genes," we find that a more selective set of highly informative genes is able to more precisely identify regions of the phylogeny that experienced rapid radiation of lineages, while more accurately depicting their phylogenetic context. 
Although much expanded taxon sampling is necessary to effectively assess the monophyly of and relationships among major tachinid lineages and their relatives, we show that a small number of well-chosen nuclear protein-coding genes can successfully resolve even difficult phylogenetic problems. Copyright © 2015 Elsevier Inc. All rights reserved.
2010-01-01
Background Molecular characterization of collagen-VI related myopathies currently relies on standard sequencing, which yields a detection rate approximating 75-79% in Ullrich congenital muscular dystrophy (UCMD) and 60-65% in Bethlem myopathy (BM) patients as PCR-based techniques tend to miss gross genomic rearrangements as well as copy number variations (CNVs) in both the coding sequence and intronic regions. Methods We have designed a custom oligonucleotide CGH array in order to investigate the presence of CNVs in the coding and non-coding regions of COL6A1, A2, A3, A5 and A6 genes and a group of genes functionally related to collagen VI. A cohort of 12 patients with UCMD/BM negative at sequencing analysis and 2 subjects carrying a single COL6 mutation whose clinical phenotype was not explicable by inheritance were selected and the occurrence of allelic and genetic heterogeneity explored. Results A deletion within intron 1A of the COL6A2 gene, occurring in compound heterozygosity with a small deletion in exon 28, previously detected by routine sequencing, was identified in a BM patient. RNA studies showed monoallelic transcription of the COL6A2 gene, thus elucidating the functional effect of the intronic deletion. No pathogenic mutations were identified in the remaining analyzed patients, either within COL6A genes, or in genes functionally related to collagen VI. Conclusions Our custom CGH array may represent a useful complementary diagnostic tool, especially in recessive forms of the disease, when only one mutant allele is detected by standard sequencing. The intronic deletion we identified represents the first example of a pure intronic mutation in COL6A genes. PMID:20302629
Insights into hominid evolution from the gorilla genome sequence
Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard
2012-01-01
Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555
Dominant genetics using a yeast genomic library under the control of a strong inducible promoter.
Ramer, S W; Elledge, S J; Davis, R W
1992-12-01
In Saccharomyces cerevisiae, numerous genes have been identified by selection from high-copy-number libraries based on "multicopy suppression" or other phenotypic consequences of overexpression. Although fruitful, this approach suffers from two major drawbacks. First, high copy number alone may not permit high-level expression of tightly regulated genes. Conversely, other genes expressed in proportion to dosage cannot be identified if their products are toxic at elevated levels. This work reports construction of a genomic DNA expression library for S. cerevisiae that circumvents both limitations by fusing randomly sheared genomic DNA to the strong, inducible yeast GAL1 promoter, which can be regulated by carbon source. The library obtained contains 5 x 10(7) independent recombinants, representing a breakpoint at every base in the yeast genome. This library was used to examine aberrant gene expression in S. cerevisiae. A screen for dominant activators of yeast mating response identified eight genes that activate the pathway in the absence of exogenous mating pheromone, including one previously unidentified gene. One activator was a truncated STE11 gene lacking approximately 1000 base pairs of amino-terminal coding sequence. In two different clones, the same GAL1 promoter-proximal ATG is in-frame with the coding sequence of STE11, suggesting that internal initiation of translation there results in production of a biologically active, truncated STE11 protein. Thus this library allows isolation based on dominant phenotypes of genes that might have been difficult or impossible to isolate from high-copy-number libraries.
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.
Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K
2013-12-17
Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.
Boldogköi, Zsolt
2004-09-01
Population genetics, the mathematical theory of modern evolutionary biology, defines evolution as the alteration over time of the frequency of distinct gene variants (alleles) differing in fitness. The major problem with this view is that in gene and protein sequences we can find little evidence concerning the molecular basis of phenotypic variance, especially variance that would confer adaptive benefit to the bearers. Some novel data, however, suggest that a large amount of genetic variation exists in the regulatory region of genes within populations. In addition, comparison of homologous DNA sequences of various species shows that evolution appears to depend more strongly on gene expression than on the genes themselves. Furthermore, it has been demonstrated in several systems that genes form functional networks, whose products exhibit interrelated expression profiles. Finally, it has been found that regulatory circuits of development behave as evolutionary units. These data demonstrate that our view of evolution calls for a new synthesis. In this article I propose a novel concept, termed the selfish gene network hypothesis, which is based on an overall consideration of the above findings. The major statements of this hypothesis are as follows. (1) Instead of individual genes, gene networks (GNs) are responsible for the determination of traits and behaviors. (2) The primary source of microevolution is the intraspecific polymorphism in GNs and not the allelic variation in either the coding or the regulatory sequences of individual genes. (3) GN polymorphism is generated by the variation in the regulatory regions of the component genes and not by the variance in their coding sequences. (4) Evolution proceeds through continuous restructuring of the composition of GNs rather than fixing of specific alleles or GN variants.
Small, Clayton M; Harlin-Cognato, April D; Jones, Adam G
2013-01-01
Evolutionary studies have revealed that reproductive proteins in animals and plants often evolve more rapidly than the genome-wide average. The causes of this pattern, which may include relaxed purifying selection, sexual selection, sexual conflict, pathogen resistance, reinforcement, or gene duplication, remain elusive. Investigative expansions to additional taxa and reproductive tissues have the potential to shed new light on this unresolved problem. Here, we embark on such an expansion, in a comparison of the brood-pouch transcriptome between two male-pregnant species of the pipefish genus Syngnathus. Male brooding tissues in syngnathid fishes represent a novel, nonurogenital reproductive trait, heretofore mostly uncharacterized from a molecular perspective. We leveraged next-generation sequencing (Roche 454 pyrosequencing) to compare transcript abundance in the male brooding tissues of pregnant with nonpregnant samples from Gulf (S. scovelli) and dusky (S. floridae) pipefish. A core set of protein-coding genes, including multiple members of astacin metalloprotease and c-type lectin gene families, is consistent between species in both the direction and magnitude of expression bias. As predicted, coding DNA sequence analysis of these putative “male pregnancy proteins” suggests rapid evolution relative to nondifferentially expressed genes and reflects signatures of adaptation similar in magnitude to those reported from Drosophila male accessory gland proteins. Although the precise drivers of male pregnancy protein divergence remain unknown, we argue that the male pregnancy transcriptome in syngnathid fishes, a clade diverse with respect to brooding morphology and mating system, represents a unique and promising object of study for understanding the perplexing evolutionary nature of reproductive molecules. PMID:24324861
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.
Eernisse, D J
1992-04-01
DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator can convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data are compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
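The translation feature described above (alternate genetic codes plus ambiguous nucleotide symbols) can be illustrated with a minimal Python sketch. This is not the HyperCard implementation, which the abstract does not describe; the names `translate`, `STANDARD`, `VERT_MITO` and `IUPAC` are hypothetical, and only the forward direction (nucleotide to protein) is shown. An ambiguous codon is resolved to an amino acid only when every possible expansion encodes the same residue; otherwise it is reported as `X`.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Vertebrate mitochondrial code, expressed as overrides of the standard code.
VERT_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}

# IUPAC nucleotide ambiguity symbols and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(seq, code=STANDARD):
    """Translate a nucleotide sequence in frame 1; trailing partial codons are ignored."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        # Expand ambiguity symbols into every concrete codon they could represent.
        residues = {code["".join(c)] for c in product(*(IUPAC[b] for b in codon))}
        # Emit the residue only if all expansions agree; otherwise mark as unknown.
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


print(translate("ATGGCNTGA"))             # standard code
print(translate("ATGGCNTGA", VERT_MITO))  # vertebrate mitochondrial code
```

Passing the code table as a dictionary keeps alternate genetic codes down to a few overrides of the standard table. Characters outside the IUPAC alphabet (for example alignment gaps) raise a `KeyError` in this sketch rather than being silently skipped.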
Rieseberg, Loren H.; Blackman, Benjamin K.
2010-01-01
Background Analyses of speciation genes – genes that contribute to the cessation of gene flow between populations – can offer clues regarding the ecological settings, evolutionary forces and molecular mechanisms that drive the divergence of populations and species. This review discusses the identities and attributes of genes that contribute to reproductive isolation (RI) in plants, compares them with animal speciation genes and investigates what these genes can tell us about speciation. Scope Forty-one candidate speciation genes were identified in the plant literature. Of these, seven contributed to pre-pollination RI, one to post-pollination, prezygotic RI, eight to hybrid inviability, and 25 to hybrid sterility. Genes, gene families and genetic pathways that were frequently found to underlie the evolution of RI in different plant groups include the anthocyanin pathway and its regulators (pollinator isolation), S RNase-SI genes (unilateral incompatibility), disease resistance genes (hybrid necrosis), chimeric mitochondrial genes (cytoplasmic male sterility), and pentatricopeptide repeat family genes (cytoplasmic male sterility). Conclusions The most surprising conclusion from this review is that identities of genes underlying both prezygotic and postzygotic RI are often predictable in a broad sense from the phenotype of the reproductive barrier. Regulatory changes (both cis and trans) dominate the evolution of pre-pollination RI in plants, whereas a mix of regulatory mutations and changes in protein-coding genes underlie intrinsic postzygotic barriers. Also, loss-of-function mutations and copy number variation frequently contribute to RI. Although direct evidence of positive selection on speciation genes is surprisingly scarce in plants, analyses of gene family evolution, along with theoretical considerations, imply an important role for diversifying selection and genetic conflict in the evolution of RI. 
Unlike in animals, however, most candidate speciation genes in plants exhibit intraspecific polymorphism, consistent with an important role for stochastic forces and/or balancing selection in development of RI in plants. PMID:20576737
[Construction of the superantigen SEA transfected laryngocarcinoma cells].
Ji, Xiaobin; Jingli, J V; Liu, Qicai; Xie, Jinghua
2013-04-01
To construct a eukaryotic expression vector containing the superantigen staphylococcal enterotoxin A (SEA) gene, and to identify its expression in laryngeal squamous carcinoma cells. The full-length SEA gene fragment was obtained from the genome of staphylococcus strain ATCC13565, a standard reference strain that produces SEA. The coding sequence of SEA was artificially synthesized. Then, the SEA gene fragment was subcloned into the eukaryotic expression vector pIRES-EGFP. The recombinant plasmid pSEA-IRES-EGFP was constructed and transfected into laryngocarcinoma Hep-2 cells. Resistant clones were screened with G418. The expression of SEA in laryngocarcinoma cells was identified by ELISA and RT-PCR. The artificially synthesized SEA gene was subcloned into the eukaryotic expression vector pIRES-EGFP. Flanking sequence analysis confirmed that the SEA sequence was fully identical to the coding sequence of the standard staphylococcus strain ATCC13565 in GenBank. After the recombinant plasmid was transfected into laryngocarcinoma cells, resistant clones were obtained after two weeks of screening and were selected. The specific gene fragment was obtained by RT-PCR amplification. ELISA confirmed that the content of SEA protein in the cell culture supernatant reached approximately the pg level. The recombinant eukaryotic expression vector containing the superantigen SEA gene was successfully constructed, and is capable of effective expression and continued secretion of SEA protein in laryngocarcinoma Hep-2 cells after transfection of the recombinant plasmid.
Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.
2013-01-01
How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629
De Novo Origin of Human Protein-Coding Genes
Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping
2011-01-01
The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831
Singh, Pratichi; Dass, J Febin Prabhu
2018-05-07
The IFNL3 gene plays a crucial role in immune defense against viruses. It induces the interferon stimulated genes (ISGs) with antiviral properties by activating the JAK-STAT pathway. In this study, we investigated the evolutionary force involved in shaping the IFNL3 gene to perform its downstream function as a regulatory gene in HCV clearance. We selected 25 IFNL3 coding sequences, with the human gene as the reference sequence, and constructed a phylogeny. Furthermore, rate of variation, substitution saturation, phylogenetic informativeness and differential selection were also analysed. The codon evolution result suggests that nearly neutral mutation is the key pattern in shaping IFNL3 evolution. The results were validated by comparing the human IFNL3 protein variants with the native protein in a molecular dynamics simulation study. The molecular dynamics simulation clearly depicts the negative impact of the reported variants on human IFNL3 protein. However, these detrimental mutations (R157Q and R157W) were shown to be negatively selected in the evolutionary study of the mammals. Hence, the variation revealed a mild impact on IFNL3 function and may be removed from the population through negative selection due to its high functional constraints. In a nutshell, our study may contribute to the overall evidence on phylotyping and on the structural transformation that takes place with non-synonymous substitutions in IFNL3 protein. Our theoretical findings should pave the way for experimental validation in HCV clearance. Copyright © 2018 Elsevier Ltd. All rights reserved.
Amambua-Ngwa, Alfred; Tetteh, Kevin K A; Manske, Magnus; Gomez-Escobar, Natalia; Stewart, Lindsay B; Deerhake, M Elizabeth; Cheeseman, Ian H; Newbold, Christopher I; Holder, Anthony A; Knuepfer, Ellen; Janha, Omar; Jallow, Muminatou; Campino, Susana; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Conway, David J
2012-01-01
Acquired immunity in vertebrates maintains polymorphisms in endemic pathogens, leading to identifiable signatures of balancing selection. To comprehensively survey for genes under such selection in the human malaria parasite Plasmodium falciparum, we generated paired-end short-read sequences of parasites in clinical isolates from an endemic Gambian population, which were mapped to the 3D7 strain reference genome to yield high-quality genome-wide coding sequence data for 65 isolates. A minority of genes did not map reliably, including the hypervariable var, rifin, and stevor families, but 5,056 genes (90.9% of all in the genome) had >70% sequence coverage with minimum read depth of 5 for at least 50 isolates, of which 2,853 genes contained 3 or more single nucleotide polymorphisms (SNPs) for analysis of polymorphic site frequency spectra. Against an overall background of negatively skewed frequencies, as expected from historical population expansion combined with purifying selection, the outlying minority of genes with signatures indicating exceptionally intermediate frequencies were identified. Comparing genes with different stage-specificity, such signatures were most common in those with peak expression at the merozoite stage that invades erythrocytes. Members of clag, PfMC-2TM, surfin, and msp3-like gene families were highly represented, the strongest signature being in the msp3-like gene PF10_0355. Analysis of msp3-like transcripts in 45 clinical and 11 laboratory adapted isolates grown to merozoite-containing schizont stages revealed surprisingly low expression of PF10_0355. In diverse clonal parasite lines the protein product was expressed in a minority of mature schizonts (<1% in most lines and ∼10% in clone HB3), and eight sub-clones of HB3 cultured separately had an intermediate spectrum of positive frequencies (0.9 to 7.5%), indicating phase variable expression of this polymorphic antigen. 
This and other identified targets of balancing selection are now prioritized for functional study.
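The gene-filtering thresholds quoted in this abstract (more than 70% sequence coverage at a minimum read depth of 5 in at least 50 isolates, then 3 or more SNPs) can be sketched in Python as below. This only illustrates the stated criteria and is not the authors' pipeline; the `GeneStats` record, its field names and the example genes are hypothetical, and the per-isolate coverage fractions are assumed to have already been computed at read depth 5 or greater.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneStats:
    name: str
    # Hypothetical input: for each isolate, the fraction of the gene's coding
    # positions covered at read depth >= 5.
    coverage_by_isolate: List[float]
    n_snps: int


def passes_coverage(gene, min_cov=0.70, min_isolates=50):
    """True if the gene has > min_cov coverage in at least min_isolates isolates."""
    return sum(c > min_cov for c in gene.coverage_by_isolate) >= min_isolates


def analysable(gene, min_snps=3):
    """True if the gene passes the coverage filter and has enough SNPs
    for a site frequency spectrum analysis."""
    return passes_coverage(gene) and gene.n_snps >= min_snps


# Made-up examples with 65 isolates each, matching the sample size in the abstract.
genes = [
    GeneStats("geneA", [0.95] * 60 + [0.40] * 5, n_snps=7),   # passes both filters
    GeneStats("geneB", [0.95] * 60 + [0.40] * 5, n_snps=2),   # too few SNPs
    GeneStats("geneC", [0.95] * 45 + [0.40] * 20, n_snps=9),  # too few covered isolates
]
kept = [g.name for g in genes if analysable(g)]
print(kept)
```

Keeping the coverage filter and the SNP-count filter as separate functions mirrors the two counts reported in the abstract: genes that map reliably, and the subset of those with enough polymorphism to analyse.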
Network perturbation by recurrent regulatory variants in cancer
Cho, Ara; Lee, Insuk; Choi, Jung Kyoon
2017-01-01
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
Saturation of recognition elements blocks evolution of new tRNA identities
Saint-Léger, Adélaïde; Bello, Carla; Dans, Pablo D.; Torres, Adrian Gabriel; Novoa, Eva Maria; Camacho, Noelia; Orozco, Modesto; Kondrashov, Fyodor A.; Ribas de Pouplana, Lluís
2016-01-01
Understanding the principles that led to the current complexity of the genetic code is a central question in evolution. Expansion of the genetic code required the selection of new transfer RNAs (tRNAs) with specific recognition signals that allowed them to be matured, modified, aminoacylated, and processed by the ribosome without compromising the fidelity or efficiency of protein synthesis. We show that saturation of recognition signals blocks the emergence of new tRNA identities and that the rate of nucleotide substitutions in tRNAs is higher in species with fewer tRNA genes. We propose that the growth of the genetic code stalled because a limit was reached in the number of identity elements that can be effectively used in the tRNA structure. PMID:27386510
Mielczarek, M; Frąszczak, M; Giannico, R; Minozzi, G; Williams, John L; Wojdak-Maksymiec, K; Szyda, J
2017-07-01
Thirty-two whole genome DNA sequences of cows were analyzed to evaluate inter-individual variability in the distribution and length of copy number variations (CNV) and to functionally annotate CNV breakpoints. The total number of deletions per individual varied between 9,731 and 15,051, whereas the number of duplications was between 1,694 and 5,187. Most of the deletions (81%) and duplications (86%) were unique to a single cow. No relation between the pattern of variant sharing and a family relationship or disease status was found. The animal-averaged length of deletions was from 5,234 to 9,145 bp and the average length of duplications was between 7,254 and 8,843 bp. Highly significant inter-individual variation in length and number of CNV was detected for both deletions and duplications. The majority of deletion and duplication breakpoints were located in intergenic regions and introns, whereas fewer were identified in noncoding transcripts and splice regions. Only 1.35 and 0.79% of the deletion and duplication breakpoints were observed within coding regions. A gene with the highest number of deletion breakpoints codes for protein kinase cGMP-dependent type I, whereas the T-cell receptor α constant gene had the most duplication breakpoints. The functional annotation of genes with the largest incidence of deletion/duplication breakpoints identified 87/112 Kyoto Encyclopedia of Genes and Genomes pathways, but none of the pathways were significantly enriched or depleted with breakpoints. The analysis of Gene Ontology (GO) terms revealed that a cluster with the highest enrichment score among genes with many deletion breakpoints was represented by GO terms related to ion transport, whereas the GO term cluster mostly enriched among the genes with many duplication breakpoints was related to binding of macromolecules. 
Furthermore, when considering the number of deletion breakpoints per gene functional category, no significant differences were observed between the "housekeeping" and "strong selection" categories, but genes representing the "low selection pressure" group showed a significantly higher number of breakpoints. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements
Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.
2008-01-01
X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally, we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625
Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui; ...
2014-10-02
Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui
Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptionalmore » regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.« less
Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred
2014-11-20
Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With the help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes, with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition to this, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.
Gene-Auto: Automatic Software Code Generation for Real-Time Embedded Systems
NASA Astrophysics Data System (ADS)
Rugina, A.-E.; Thomas, D.; Olive, X.; Veran, G.
2008-08-01
This paper gives an overview of the Gene-Auto ITEA European project, which aims at building a qualified C code generator from mathematical models under Matlab-Simulink and Scilab-Scicos. The project is driven by major European industry partners, active in the real-time embedded systems domains. The Gene-Auto code generator will significantly improve the current development processes in such domains by shortening the time to market and by guaranteeing the quality of the generated code through the use of formal methods. The first version of the Gene-Auto code generator has already been released and has gone through a validation phase on real-life case studies defined by each project partner. The validation results are taken into account in the implementation of the second version of the code generator. The partners aim at introducing the Gene-Auto results into industrial development by 2010.
Genomic Footprints in Selected and Unselected Beef Cattle Breeds in Korea.
Lim, Dajeong; Strucken, Eva M; Choi, Bong Hwan; Chai, Han Ha; Cho, Yong Min; Jang, Gul Won; Kim, Tae-Hun; Gondro, Cedric; Lee, Seung Hwan
2016-01-01
Korean Hanwoo cattle have been subjected to intensive artificial selection over the past four decades to improve meat production traits. Another three cattle varieties very closely related to Hanwoo reside in Korea (Jeju Black and Brindle) and in China (Yanbian). These breeds have not been part of a breeding scheme to improve production traits. Here, we compare the selected Hanwoo against these similar but presumed to be unselected populations to identify genomic regions that have been under recent selection pressure due to the breeding program. Rsb statistics were used to contrast the genomes of Hanwoo versus a pooled sample of the three unselected populations (UN). We identified 37 significant SNPs (FDR corrected) in the HW/UN comparison, and 21 known protein-coding genes were within 1 Mb of the identified SNPs. These genes were previously reported to affect traits important for meat production (14 genes), reproduction including mammary gland development (3 genes), coat color (2 genes), and genes affecting behavioral traits in a broader sense (2 genes). We subsequently sequenced (Illumina HiSeq 2000 platform) 10 individuals of the brown Hanwoo and the Chinese Yanbian to identify SNPs within the candidate genomic regions. Based on allele frequency differences, haplotype structures, and literature research, we singled out one non-synonymous SNP in the APP gene (APP: c.569C>T, Ala199Val) and predicted the mutational effect on the protein structure. We found that protein-protein interactions might be impaired due to increased exposed hydrophobic surfaces of the mutated protein. The APP gene has also been reported to affect meat tenderness in pigs and obesity in humans. Meat tenderness has been linked to intramuscular fat content, which is one of the main breeding goals for brown Hanwoo, potentially supporting a causal influence of the herein described nsSNP in the APP gene.
Abdali, Narges; Younas, Farhan; Mafakheri, Samaneh; Pothula, Karunakar R; Kleinekathöfer, Ulrich; Tauch, Andreas; Benz, Roland
2018-05-09
Corynebacterium urealyticum, a pathogenic, multidrug resistant member of the mycolata, is known as causative agent of urinary tract infections although it is a bacterium of the skin flora. This pathogenic bacterium shares with the mycolata the property of having an unusual cell envelope composition and architecture, typical for the genus Corynebacterium. The cell wall of members of the mycolata contains channel-forming proteins for the uptake of solutes. In this study, we provide novel information on the identification and characterization of a pore-forming protein in the cell wall of C. urealyticum DSM 7109. Detergent extracts of whole C. urealyticum cultures formed in lipid bilayer membranes slightly cation-selective pores with a single-channel conductance of 1.75 nS in 1 M KCl. Experiments with different salts and non-electrolytes suggested that the cell wall pore of C. urealyticum is wide and water-filled and has a diameter of about 1.8 nm. Molecular modelling and dynamics has been performed to obtain a model of the pore. For the search of the gene coding for the cell wall pore of C. urealyticum we looked in the known genome of C. urealyticum for a similar chromosomal localization of the porin gene to known porH and porA genes of other Corynebacterium strains. Three genes are located between the genes coding for GroEL2 and polyphosphate kinase (PKK2). Two of the genes (cur_1714 and cur_1715) were expressed in different constructs in C. glutamicum ΔporAΔporH and in porin-deficient BL21 DE3 Omp8 E. coli strains. The results suggested that the gene cur_1714 codes alone for the cell wall channel. The cell wall porin of C. urealyticum termed PorACur was purified to homogeneity using different biochemical methods and had an apparent molecular mass of about 4 kDa on tricine-containing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). 
Biophysical characterization of the purified protein (PorACur) suggested indeed that cur_1714 is the gene coding for the pore-forming protein in C. urealyticum because the protein formed in lipid bilayer experiments the same pores as the detergent extract of whole cells. The study is the first report of a cell wall channel in the pathogenic C. urealyticum.
Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.
Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean
2012-12-01
Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine if the coding region of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation-domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
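The divergence-time estimate in the abstract above rests on a standard molecular-clock relation: two lineages that split T years ago accumulate synonymous differences along both branches, so T ≈ Ks / (2r). The sketch below shows only that arithmetic. The function name and the input values are hypothetical, chosen so the result lands near the reported 1 million years; they are not taken from the paper.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Estimate divergence time from synonymous divergence under a strict clock.

    ks: synonymous substitutions per synonymous site between the two haplotypes.
    rate_per_site_per_year: assumed synonymous substitution rate per lineage.
    The factor 2 accounts for substitutions accumulating on both lineages.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2 * rate_per_site_per_year)


# Hypothetical inputs for illustration only (not values from the study).
estimate = divergence_time_years(ks=0.009, rate_per_site_per_year=4.5e-9)
print(f"{estimate:.3g} years")
```

The estimate scales inversely with the assumed rate, so any uncertainty in the clock calibration carries over directly to the divergence time.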
Schwizer, Sarah; Tasara, Taurai; Zurfluh, Katrin; Stephan, Roger; Lehner, Angelika
2013-02-15
Cronobacter spp. are opportunistic pathogens that can cause septicemia and infections of the central nervous system primarily in premature, low-birth weight and/or immune-compromised neonates. Serum resistance is a crucial virulence factor for the development of systemic infections, including bacteremia. It was the aim of the current study to identify genes involved in serum tolerance in a selected Cronobacter sakazakii strain of clinical origin. Screening of 2749 random transposon knock out mutants of a C. sakazakii ES5 library for modified serum tolerance (compared to wild type) revealed 10 mutants showing significantly increased/reduced resistance to serum killing. Identification of the affected sites in mutants displaying reduced serum resistance revealed genes encoding surface and membrane proteins as well as regulatory elements or chaperones. By this approach, the involvement of the yet undescribed Wzy_C superfamily domain containing coding region in serum tolerance was observed and experimentally confirmed. Additionally, knock out mutants with enhanced serum tolerance were observed. Examination of respective transposon insertion loci revealed regulatory (repressor) elements, coding regions for chaperones and efflux systems as well as the coding region for the protein YbaJ. Real time expression analysis experiments revealed that knock out of the gene for this protein negatively affects the expression of the fimA gene, which is a key structural component of the formation of fimbriae. Fimbriae are structures of high immunogenic potential and it is likely that absence/truncation of the ybaJ gene resulted in a non-fimbriated phenotype accounting for the enhanced survival of this mutant in human serum. By using a transposon knock out approach we were able to identify genes involved in both increased and reduced serum tolerance in Cronobacter sakazakii ES5. This study reveals first insights into the complex nature of serum tolerance of Cronobacter spp.
Tu, Ying; Xu, Dan; Feng, Jiaqi; He, Li
2017-01-01
Sensitive skin (SS) is a condition of subjective cutaneous hyper-reactivity. The role of long non-coding RNAs (lncRNAs) in subjects with SS is unclear. Therefore, the aim of the present study was to provide a comprehensive profile of the mRNAs and lncRNAs in subjects with SS. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis presented the characteristics of associated protein-coding genes. In addition, a co-expression network of lncRNA and mRNA was constructed to identify potential underlying regulation targets; the results were verified by quantitative real-time PCR (qRT-PCR) and RNA-seq analyses in patients with SS and normal samples. Compared with the normal skin group, 266 novel lncRNAs and 6750 annotated lncRNAs were identified in the SS group. A total of 71 lncRNA transcripts and 2615 mRNA transcripts were differentially expressed (P < 0.05). The heat signature of the SS samples could be distinguished from the normal skin samples, whereas the majority of the genes that were present in enriched pathways were those that participated in focal adhesion, PI3K-Akt signaling, and cancer-related pathways. Five transcripts were selected for qRT-PCR analysis and the results were consistent with RNA-seq. The results suggested that LNC_000265 may play a role in the epidermal barrier structure of patients with SS. The data suggest novel genes and pathways that may be involved in the pathogenesis of SS and highlight potential targets that could be used for individualized treatment applications. PMID:29383128
Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred
2017-06-29
The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. 
This investigation identified known genomic signatures of selection from drug pressure and host immunity. This is evidence that P. falciparum populations explore common adaptive strategies that can be targeted for the development of new interventions.
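For readers unfamiliar with the allele-frequency-based neutrality test named above, the sketch below computes the classic Tajima's D from a small set of aligned haplotypes. It is a minimal illustration only: the study used pooled sequencing, which needs estimators adapted to pooled read counts, and the function name and toy sequences here are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Classic Tajima's D for aligned, equal-length haplotypes; None if undefined."""
    n = len(sequences)
    if n < 4:
        return None  # the variance term vanishes for n <= 3
    # S: number of segregating (polymorphic) alignment columns
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        return None  # no variation, so D is undefined
    # pi: mean number of pairwise differences between haplotypes
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Normalising constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: intermediate-frequency variants versus singleton variants.
balanced = ["AA", "AA", "TT", "TT"]
singletons = ["TA", "AT", "AA", "AA"]
print(tajimas_d(balanced), tajimas_d(singletons))
```

An excess of intermediate-frequency alleles pushes D above zero, which is commonly read as a hint of balancing selection, while an excess of rare alleles pushes it below zero. The positive values reported for the antigen genes fit the first pattern, though demography can produce similar signals.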
Transcription in space--environmental vs. genetic effects on differential immune gene expression.
Lenz, Tobias L
2015-09-01
Understanding how organisms adapt to their local environment is one of the key goals in molecular ecology. Adaptation can be achieved through qualitative changes in the coding sequence and/or quantitative changes in gene expression, where the optimal dosage of a gene's product in a given environment is being selected for. Differences in gene expression among populations inhabiting distinct environments can be suggestive of locally adapted gene regulation and have thus been studied in different species (Whitehead & Crawford ; Hodgins-Davis & Townsend ). However, in contrast to a gene's coding sequence, its expression level at a given point in time may depend on various factors, including the current environment. Although critical for understanding the extent of local adaptation, it is usually difficult to disentangle the heritable differences in gene regulation from environmental effects. In this issue of Molecular Ecology, Stutz et al. () describe an experiment in which they reciprocally transplanted three-spined sticklebacks (Gasterosteus aculeatus) between independent pairs of small and large lakes. Their experimental design allows them to attribute differences in gene expression among sticklebacks either to lake of origin or destination lake. Interestingly, they find that translocated sticklebacks show a pattern of gene expression more similar to individuals from the destination lake than to individuals from the lake of origin, suggesting that expression of the targeted genes is more strongly regulated by environmental effects than by genetics. The environmental effect by itself is not entirely surprising; however, the relative extent of it is. Especially when put in the context of local adaptation and population differentiation, as done here, these findings cast a new light onto the heritability of differential gene expression and specifically its relative importance during population divergence and ultimately ecological speciation. © 2015 John Wiley & Sons Ltd.
The Blueprint of a Minimal Cell: MiniBacillus
Reuß, Daniel R.; Commichau, Fabian M.; Gundlach, Jan; Zhu, Bingyao
2016-01-01
SUMMARY Bacillus subtilis is one of the best-studied organisms. Due to the broad knowledge and annotation and the well-developed genetic system, this bacterium is an excellent starting point for genome minimization with the aim of constructing a minimal cell. We have analyzed the genome of B. subtilis and selected all genes that are required to allow life in complex medium at 37°C. This selection is based on the known information on essential genes and functions as well as on gene and protein expression data and gene conservation. The list presented here includes 523 and 119 genes coding for proteins and RNAs, respectively. These proteins and RNAs are required for the basic functions of life in information processing (replication and chromosome maintenance, transcription, translation, protein folding, and secretion), metabolism, cell division, and the integrity of the minimal cell. The completeness of the selected metabolic pathways, reactions, and enzymes was verified by the development of a model of metabolism of the minimal cell. A comparison of the MiniBacillus genome to the recently reported designed minimal genome of Mycoplasma mycoides JCVI-syn3.0 indicates excellent agreement in the information-processing pathways, whereas each species has a metabolism that reflects specific evolution and adaptation. The blueprint of MiniBacillus presented here serves as the starting point for a successive reduction of the B. subtilis genome. PMID:27681641
Kelsen, Judith R.; Dawany, Noor; Moran, Christopher J.; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S.; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F.; Daly, Mark; Sullivan, Kathleen E.; Baldassano, Robert N.; Devoto, Marcella
2016-01-01
Background & Aims Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed ≤5 y of age, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Methods Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (ages 3 weeks to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by post-processing and variant calling. Following functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency <0.1%, and scaled combined annotation dependent depletion scores ≤10. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n=45) or adult-onset Crohn's disease (n=20) and healthy individuals (controls, n=145) were obtained from the University of Kiel, Germany and used as control groups. Results Four-hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling > 1 Mbp of coding sequence, were selected from the whole exome data. Our analysis revealed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. Conclusions In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. 
Our analysis could lead to the identification of previously unidentified IBD-associated variants. PMID:26193622
Grötzinger, Stefan W.; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B.; Stingl, Ulrich; Eppinger, Jörg
2014-01-01
Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophile genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxa of bacteria and archaea were selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable, if at least two relevant descriptors (GO-terms and/or consensus patterns) are present. 
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website. PMID:24778629
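The reliability rule stated in the abstract (a hit counts as reliable when at least two relevant descriptors, GO-terms and/or consensus patterns, are present) can be sketched as a simple set filter. This is an illustrative reading of that one sentence, not the authors' PPM scripts; the function name, the data layout and the descriptor IDs are placeholders.

```python
def reliable_hits(gene_descriptors, relevant_descriptors, min_descriptors=2):
    """Return genes annotated with at least `min_descriptors` relevant descriptors.

    gene_descriptors: dict mapping a gene ID to the set of descriptor IDs
        (GO-terms and/or PROSITE pattern IDs) attached to that gene.
    relevant_descriptors: set of descriptor IDs tied to the enzyme function sought.
    """
    reliable = {}
    for gene, descriptors in gene_descriptors.items():
        matched = descriptors & relevant_descriptors
        if len(matched) >= min_descriptors:
            reliable[gene] = matched
    return reliable


# Placeholder IDs for illustration; they are not real GO or PROSITE accessions.
relevant = {"GO:profile_1", "GO:profile_2", "PS:pattern_1"}
annotations = {
    "gene_a": {"GO:profile_1", "PS:pattern_1"},   # two relevant descriptors
    "gene_b": {"GO:profile_2"},                   # only one, so filtered out
    "gene_c": {"GO:unrelated", "PS:unrelated"},   # none relevant
}
print(sorted(reliable_hits(annotations, relevant)))
```

Requiring agreement between two independent descriptors trades sensitivity for specificity, which matches the stated aim of removing false positives before costly expression work.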
Evolving targeted therapies for right ventricular failure.
Di Salvo, Thomas G
2015-01-01
Although right and left ventricular embryological origins, morphology and cardiodynamics differ, the notion of selectively targeted right ventricular therapies remains controversial. This review focuses on both the currently evolving pharmacologic agents targeting right ventricular failure (metabolic modulators, phosphodiesterase type V inhibitors) and future therapeutic approaches including epigenetic modulation by miRNAs, chromatin binding complexes, long non-coding RNAs, genomic editing, adoptive gene transfer and gene therapy, cell regeneration via cell transplantation and cell reprogramming and cardiac tissue engineering. Strategies for adult right ventricular regeneration will require a more holistic approach than strategies for adult left ventricular failure. For instances of right ventricular failure requiring global reconstitution of right ventricular myocardium, attractive approaches include: i) myocardial patches seeded with cardiac fibroblasts reprogrammed into cardiomyocytes in vivo by small molecules, miRNAs or other epigenetic modifiers; and ii) administration of miRNAs, lncRNAs or small molecules by non-viral vector delivery systems targeted to fibroblasts (e.g., episomes) to stimulate in vivo reprogramming of fibroblasts into cardiomyocytes. For selected heritable genetic myocardial diseases, genomic editing affords exciting opportunities for allele-specific silencing by site-specific directed silencing, mutagenesis or gene excision. Genomic editing by adoptive gene transfer affords similarly exciting opportunities for restoration of myocardial gene expression.
Averina, O V; Nezametdinova, V Z; Alekseeva, M G; Danilenko, V N
2012-11-01
The stability of inheriting several genes in the Russian commercial strain Bifidobacterium longum subsp. longum B379M during cultivation and maintenance under laboratory conditions has been studied. The examined genes code for probiotic characteristics, such as utilization of several sugars (lacA2 gene, encoding beta-galactosidase; ara gene, encoding arabinosidase; and galA gene, encoding arabinogalactan endo-beta-galactosidase); synthesis of bacteriocins (lans gene, encoding lanthionine synthetase); and mobile gene tet(W), conferring resistance to the antibiotic tetracycline. The other gene families studied include the genes responsible for signal transduction and adaptation to stress conditions in the majority of bacteria (serine/threonine protein kinases and the toxin-antitoxin systems of MazEF and RelBE types) and transcription regulators (genes encoding WhiB family proteins). Genomic DNA was analyzed by PCR using specially selected primers. A loss of the genes galA and tet(W) has been shown. It is proposed to expand the requirements on probiotic strains, namely, to control retention of the key probiotic genes using molecular biological methods.
Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong
2018-01-01
Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. 
The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
Selection of reference genes for expression analyses of red-fleshed sweet orange (Citrus sinensis).
Pinheiro, T T; Nishimura, D S; De Nadai, F B; Figueira, A; Latado, R R
2015-12-28
Red-fleshed oranges (Citrus sinensis) contain high levels of carotenoids and lycopene. The growing consumer demand for products with health benefits has increased interest in these types of Citrus cultivars as a potential source of nutraceuticals. However, little is known about the physiology of these cultivars under Brazilian conditions. Transcriptome and gene expression analyses are important tools in the breeding and management of red-fleshed sweet orange cultivars. Reverse transcription quantitative polymerase chain reaction is a method of quantifying gene expression, but various standardizations are required to obtain precise, accurate, and specific results. Among the standardizations required, the choice of suitable stable reference genes is fundamental. The objective of this study was to evaluate the stability of 11 candidate genes using various tissue and organ samples from healthy plants or leaves from citrus greening disease (Huanglongbing)-symptomatic plants of a Brazilian red-fleshed cultivar ('Sanguínea de Mombuca'), in order to select the most suitable reference gene for investigating gene expression under these conditions. geNorm and NormFinder identified genes that encoded translation initiation factor 3, ribosomal protein L35, and translation initiation factor 5A as the most stable genes under the biological conditions tested, and genes coding actin (ACT) and the subunit of the PSI reaction center subunit III were the least stable. Phosphatase, malate dehydrogenase, and ACT were the most stable genes in the leaf samples of infected plants.
Pearson, J L; Pintel, D J
2000-03-30
Recombination within the coding region of the nonstructural genes of minute virus of mice (MVM), which generates functional levels of wild-type NS1, was observed in the absence of selective pressure following cotransfection of nonreplicating plasmids. P38 activity was used as a measure of recombinant NS1 production, which, together with direct detection of recombinant-generated products by RT-PCR, allowed an estimation of recombination efficiency. In addition, we show that very low levels of wild-type NS1 were able to significantly transactivate P38. Given that recombination following cotransfection can generate NS1 at these levels, our observations have implications for the study of parvoviral genetics, the construction of recombinant parvoviral vectors for gene therapy applications, and perhaps other systems using cotransfection of plasmids that share homologous sequences. Copyright 2000 Academic Press.
Evolutionary dynamics of Newcastle disease virus
Miller, P.J.; Kim, L.M.; Ip, Hon S.; Afonso, C.L.
2009-01-01
A comprehensive dataset of NDV genome sequences was evaluated using bioinformatics to characterize the evolutionary forces affecting NDV genomes. Despite evidence of recombination in most genes, only one event in the fusion gene of genotype V viruses produced evolutionarily viable progenies. The codon-associated rate of change for the six NDV proteins revealed that the highest rate of change occurred at the fusion protein. All proteins were under strong purifying (negative) selection; the fusion protein displayed the highest number of amino acids under positive selection. Regardless of the phylogenetic grouping or the level of virulence, the cleavage site motif was highly conserved implying that mutations at this site that result in changes of virulence may not be favored. The coding sequence of the fusion gene and the genomes of viruses from wild birds displayed higher yearly rates of change in virulent viruses than in viruses of low virulence, suggesting that an increase in virulence may accelerate the rate of NDV evolution. © 2009 Elsevier Inc.
Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.
2013-01-01
Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175
Decoding the disease-associated proteins encoded in the human chromosome 4.
Chen, Lien-Chin; Liu, Mei-Ying; Hsiao, Yung-Chin; Choong, Wai-Kok; Wu, Hsin-Yi; Hsu, Wen-Lian; Liao, Pao-Chi; Sung, Ting-Yi; Tsai, Shih-Feng; Yu, Jau-Song; Chen, Yu-Ju
2013-01-04
Chromosome 4 is the fourth largest chromosome, containing approximately 191 megabases (~6.4% of the human genome) with 757 protein-coding genes. A number of marker genes for many diseases have been found in this chromosome, including genetic diseases (e.g., hepatocellular carcinoma) and biomedical research (cardiac system, aging, metabolic disorders, immune system, cancer and stem cell) related genes (e.g., oncogenes, growth factors). As a pilot study for the chromosome 4-centric human proteome project (Chr 4-HPP), we present here a systematic analysis of the disease association, protein isoforms, coding single nucleotide polymorphisms of these 757 protein-coding genes and their experimental evidence at the protein level. We also describe how the findings from the chromosome 4 project might be used to drive the biomarker discovery and validation study in disease-oriented projects, using the examples of secretomic and membrane proteomic approaches in cancer research. By integrating with cancer cell secretomes and several other existing databases in the public domain, we identified 141 chromosome 4-encoded proteins as cancer cell-secretable/shedable proteins. Additionally, we also identified 54 chromosome 4-encoded proteins that have been classified as cancer-associated proteins with successful selected or multiple reaction monitoring (SRM/MRM) assays developed. From literature annotation and topology analysis, 271 proteins were recognized as membrane proteins while 27.9% of the 757 proteins do not have any experimental evidence at the protein-level. In summary, the analysis revealed that the chromosome 4 is a rich resource for cancer-associated proteins for biomarker verification projects and for drug target discovery projects.
APADB: a database for alternative polyadenylation and microRNA regulation events
Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn
2014-01-01
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703
Maier, Uwe-G; Zauner, Stefan; Woehle, Christian; Bolte, Kathrin; Hempel, Franziska; Allen, John F.; Martin, William F.
2013-01-01
Plastid and mitochondrial genomes have undergone parallel evolution to encode the same functional set of genes. These encode conserved protein components of the electron transport chain in their respective bioenergetic membranes and genes for the ribosomes that express them. This highly convergent aspect of organelle genome evolution is partly explained by the redox regulation hypothesis, which predicts a separate plastid or mitochondrial location for genes encoding bioenergetic membrane proteins of either photosynthesis or respiration. Here we show that convergence in organelle genome evolution is far stronger than previously recognized, because the same set of genes for ribosomal proteins is independently retained by both plastid and mitochondrial genomes. A hitherto unrecognized selective pressure retains genes for the same ribosomal proteins in both organelles. On the Escherichia coli ribosome assembly map, the retained proteins are implicated in 30S and 50S ribosomal subunit assembly and initial rRNA binding. We suggest that ribosomal assembly imposes functional constraints that govern the retention of ribosomal protein coding genes in organelles. These constraints are subordinate to redox regulation for electron transport chain components, which anchor the ribosome to the organelle genome in the first place. As organelle genomes undergo reduction, the rRNAs also become smaller. Below size thresholds of approximately 1,300 nucleotides (16S rRNA) and 2,100 nucleotides (26S rRNA), all ribosomal protein coding genes are lost from organelles, while electron transport chain components remain organelle encoded as long as the organelles use redox chemistry to generate a proton motive force. PMID:24259312
Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard
2014-09-01
To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.
De novo transcriptome assembly and positive selection analysis of an individual deep-sea fish.
Lan, Yi; Sun, Jin; Xu, Ting; Chen, Chong; Tian, Renmao; Qiu, Jian-Wen; Qian, Pei-Yuan
2018-05-24
High hydrostatic pressure and low temperatures make the deep sea a harsh environment for life forms. Actin organization and microtubules assembly, which are essential for intracellular transport and cell motility, can be disrupted by high hydrostatic pressure. High hydrostatic pressure can also damage DNA. Nucleic acids exposed to low temperatures can form secondary structures that hinder genetic information processing. To study how deep-sea creatures adapt to such a hostile environment, one of the most straightforward ways is to sequence and compare their genes with those of their shallow-water relatives. We captured an individual of the fish species Aldrovandia affinis, which is a typical deep-sea inhabitant, from the Okinawa Trough at a depth of 1550 m using a remotely operated vehicle (ROV). We sequenced its transcriptome and analyzed its molecular adaptation. We obtained 27,633 protein coding sequences using an Illumina platform and compared them with those of several shallow-water fish species. Analysis of 4918 single-copy orthologs identified 138 positively selected genes in A. affinis, including genes involved in microtubule regulation. Particularly, functional domains related to cold shock as well as DNA repair are exposed to positive selection pressure in both deep-sea fish and hadal amphipod. Overall, we have identified a set of positively selected genes related to cytoskeleton structures, DNA repair and genetic information processing, which shed light on molecular adaptation to the deep sea. These results suggest that amino acid substitutions of these positively selected genes may contribute crucially to the adaptation of deep-sea animals. Additionally, we provide a high-quality transcriptome of a deep-sea fish for future deep-sea studies.
Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.
2014-01-01
Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain
2011-01-01
cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system
Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Heimberg, Alysha M.; Jansen, Hans J.; McCleary, Ryan J. R.; Kerkkamp, Harald M. E.; Vos, Rutger A.; Guerreiro, Isabel; Calvete, Juan J.; Wüster, Wolfgang; Woods, Anthony E.; Logan, Jessica M.; Harrison, Robert A.; Castoe, Todd A.; de Koning, A. P. Jason; Pollock, David D.; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B.; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S.; Ribeiro, José M. C.; Arntzen, Jan W.; van den Thillart, Guido E. E. J. M.; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P.; Spaink, Herman P.; Duboule, Denis; McGlinn, Edwina; Kini, R. Manjunatha; Richardson, Michael K.
2013-01-01
Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection. PMID:24297900
Martínez-Castilla, León Patricio; Alvarez-Buylla, Elena R.
2003-01-01
Gene duplication is a substrate of evolution. However, the relative importance of positive selection versus relaxation of constraints in the functional divergence of gene copies is still under debate. Plant MADS-box genes encode transcriptional regulators key in various aspects of development and have undergone extensive duplications to form a large family. We recovered 104 MADS sequences from the Arabidopsis genome. Bayesian phylogenetic trees recover type II lineage as a monophyletic group and resolve a branching sequence of monophyletic groups within this lineage. The type I lineage is comprised of several divergent groups. However, contrasting gene structure and patterns of chromosomal distribution between type I and II sequences suggest that they had different evolutionary histories and support the placement of the root of the gene family between these two groups. Site-specific and site-branch analyses of positive Darwinian selection (PDS) suggest that different selection regimes could have affected the evolution of these lineages. We found evidence for PDS along the branch leading to flowering time genes that have a direct impact on plant fitness. Sites with high probabilities of having been under PDS were found in the MADS and K domains, suggesting that these played important roles in the acquisition of novel functions during MADS-box diversification. Detected sites are targets for further experimental analyses. We argue that adaptive changes in MADS-domain protein sequences have been important for their functional divergence, suggesting that changes within coding regions of transcriptional regulators have influenced phenotypic evolution of plants. PMID:14597714
Fan, SiGang; Hu, ChaoQun; Wen, Jing; Zhang, LvPing
2011-05-01
The complete mitochondrial DNA sequence contains useful information for phylogenetic analyses of metazoa. In this study, the complete mitochondrial DNA sequence of sea cucumber Stichopus horrens (Holothuroidea: Stichopodidae: Stichopus) is presented. The complete sequence was determined using normal and long PCRs. The mitochondrial genome of Stichopus horrens is a circular molecule 16,257 bp long, composed of 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes. Most of these genes are coded on the heavy strand except for one protein-coding gene (nad6) and five tRNA genes (tRNA(Ser(UCN)), tRNA(Gln), tRNA(Ala), tRNA(Val), tRNA(Asp)) which are coded on the light strand. The composition of the heavy strand is 30.8% A, 23.7% C, 16.2% G, and 29.3% T bases (AT skew = 0.025; GC skew = -0.188). A non-coding region of 675 bp was identified as a putative control region because of its location and AT richness. The intergenic spacers range from 1 to 50 bp in size, totaling 227 bp. A total of 25 overlapping nucleotides, ranging from 1 to 10 bp in size, exist among 11 genes. All 13 protein-coding genes are initiated with an ATG. The TAA codon is used as the stop codon in all the protein coding genes except nad3 and nad4 that use TAG as their termination codon. The most frequently used amino acids are Leu (16.29%), Ser (10.34%) and Phe (8.37%). All of the tRNA genes have the potential to fold into typical cloverleaf secondary structures. We also compared the order of the genes in the mitochondrial DNA from the five holothurians that are now available and found a novel gene arrangement in the mitochondrial DNA of Stichopus horrens.
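Note: the skew values quoted in the Stichopus horrens abstract can be checked from the reported base composition. The short Python sketch below assumes the conventional definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the abstract does not state the formula, and the function name `skew` is illustrative only.

```python
def skew(x: float, y: float) -> float:
    """Return the normalized skew (x - y) / (x + y)."""
    if x + y == 0:
        raise ValueError("skew is undefined when both values are zero")
    return (x - y) / (x + y)


# Heavy-strand base composition (percent), as reported in the abstract.
composition = {"A": 30.8, "C": 23.7, "G": 16.2, "T": 29.3}

at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"AT skew = {at_skew:.3f}")  # AT skew = 0.025
print(f"GC skew = {gc_skew:.3f}")  # GC skew = -0.188
```

Under these assumed definitions, both values agree with the figures given in the abstract.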
Chen, Ye; Li, Xinxin; Chi, Xiaojuan; Wang, Song; Ma, Yanmei; Chen, Jilong
2017-01-01
The classical swine fever virus (CSFV), circulating worldwide, is a highly contagious virus. Since the emergence of CSFV, it has caused great economic loss in swine industry. The envelope glycoprotein E2 gene of the CSFV is an immunoprotective antigen that induces the immune system to produce neutralizing antibodies. Therefore, it is essential to study the codon usage of the E2 gene of the CSFV. In this study, 140 coding sequences of the E2 gene were analyzed. The value of effective number of codons (ENC) showed low codon usage bias in the E2 gene. Our study showed that codon usage could be described mainly by mutation pressure ENC plot analysis combined with principal component analysis (PCA) and translational selection-correlation analysis between the general average hydropathicity (Gravy) and aromaticity (Aroma), and nucleotides at the third position of codons (A3s, T3s, G3s, C3s and GC3s). Furthermore, the neutrality analysis, which explained the relationship between GC12s and GC3s, revealed that natural selection had a key role compared with mutational bias during the evolution of the E2 gene. These results lay a foundation for further research on the molecular evolution of CSFV. PMID:28880881
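Note: the neutrality analysis mentioned in the CSFV abstract compares GC content at codon positions 1 and 2 (GC12) with GC content at position 3 (GC3) across sequences. The Python sketch below shows one way these two quantities could be computed for a single in-frame coding sequence. It is a simplification and not the authors' pipeline: it counts every third position, whereas GC3s in codon-usage tools is normally restricted to synonymous third positions. The function name and the toy sequence are hypothetical.

```python
from typing import Tuple


def gc_by_codon_position(cds: str) -> Tuple[float, float]:
    """Return (GC12, GC3) as fractions for an in-frame coding sequence.

    Simplifying assumption: all codons are counted, including Met, Trp
    and stop codons, so GC3 here is not strictly the synonymous GC3s.
    """
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("expected a non-empty, in-frame coding sequence")

    gc_counts = [0, 0, 0]  # G+C counts at codon positions 1, 2 and 3
    for i, base in enumerate(seq):
        if base in "GC":
            gc_counts[i % 3] += 1

    n_codons = len(seq) // 3
    gc12 = (gc_counts[0] + gc_counts[1]) / (2 * n_codons)
    gc3 = gc_counts[2] / n_codons
    return gc12, gc3


# Toy sequence (not CSFV data): codons ATG GCC AAG TTC
gc12, gc3 = gc_by_codon_position("ATGGCCAAGTTC")
print(gc12, gc3)  # 0.25 1.0
```

In a neutrality plot, one (GC3, GC12) point per sequence is plotted and a regression line is fitted; a slope near zero is usually read as selection dominating, and a slope near one as mutational bias dominating.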
Amy2B copy number variation reveals starch diet adaptations in ancient European dogs.
Ollivier, Morgane; Tresset, Anne; Bastian, Fabiola; Lagoutte, Laetitia; Axelsson, Erik; Arendt, Maja-Louise; Bălăşescu, Adrian; Marshour, Marjan; Sablin, Mikhail V; Salanova, Laure; Vigne, Jean-Denis; Hitte, Christophe; Hänni, Catherine
2016-11-01
Extant dog and wolf DNA indicates that dog domestication was accompanied by the selection of a series of duplications on the Amy2B gene coding for pancreatic amylase. In this study, we used a palaeogenetic approach to investigate the timing and expansion of the Amy2B gene in the ancient dog populations of Western and Eastern Europe and Southwest Asia. Quantitative polymerase chain reaction was used to estimate the copy numbers of this gene for 13 ancient dog samples, dated to between 15 000 and 4000 years before present (cal. BP). This evidenced an increase of Amy2B copies in ancient dogs from as early as the 7th millennium cal. BP in Southeastern Europe. We found that the gene expansion was not fixed across all dogs within this early farming context, with ancient dogs bearing between 2 and 20 diploid copies of the gene. The results also suggested that selection for the increased Amy2B copy number started 7000 years cal. BP, at the latest. This expansion reflects a local adaptation that allowed dogs to thrive on a starch rich diet, especially within early farming societies, and suggests a biocultural coevolution of dog genes and human culture. PMID:28018628
Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi
2011-01-01
Background GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates. Results We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews at the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that skews cannot be accounted by selection for protein function or translation. Conclusions We propose two explanations, the mutational pressure hypothesis, and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions. PMID:22208287
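The skew pattern described above can be made concrete with a small calculation. The sketch below uses the standard definition GC skew = (G - C) / (G + C) and compares a window at the 5' end of a coding strand with a window at the 3' end. The function names, the window size, and the toy sequence are assumptions for illustration and are not taken from the study.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for a DNA string; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def end_skews(coding_strand, window=100):
    """Return (skew of the first `window` bases, skew of the last `window` bases)."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return gc_skew(coding_strand[:window]), gc_skew(coding_strand[-window:])


if __name__ == "__main__":
    # Toy sequence built to be C-biased at the 5' end and G-biased at the 3' end.
    toy_gene = "CCCG" + "ATAT" + "GGGC"
    five_prime, three_prime = end_skews(toy_gene, window=4)
    print(five_prime, three_prime)
```

A negative value indicates a C-biased window and a positive value a G-biased one, which matches the 5' versus 3' contrast the abstract reports for the coding strand.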
Co-expression analysis and identification of fecundity-related long non-coding RNAs in sheep ovaries
Miao, Xiangyang; Luo, Qingmiao; Zhao, Huijing; Qin, Xiaoyu
2016-01-01
Small Tail Han sheep, including the FecBBFecBB (Han BB) and FecB+ FecB+ (Han++) genotypes, and Dorset sheep exhibit different fecundities. To identify novel long non-coding RNAs (lncRNAs) associated with sheep fecundity to better understand their molecular mechanisms, a genome-wide analysis of mRNAs and lncRNAs from Han BB, Han++ and Dorset sheep was performed. After the identification of differentially expressed mRNAs and lncRNAs, 16 significant modules were explored by using weighted gene coexpression network analysis (WGCNA) followed by functional enrichment analysis of the genes and lncRNAs in significant modules. Among these selected modules, the yellow and brown modules were significantly related to sheep fecundity. lncRNAs (e.g., NR0B1, XLOC_041882, and MYH15) in the yellow module were mainly involved in the TGF-β signalling pathway, and NYAP1 and BCORL1 were significantly associated with the oxytocin signalling pathway, which regulates several genes in the coexpression network of the brown module. Overall, we identified several gene modules associated with sheep fecundity, as well as networks consisting of hub genes and lncRNAs that may contribute to sheep prolificacy by regulating the target mRNAs related to the TGF-β and oxytocin signalling pathways. This study provides an alternative strategy for the identification of potential candidate regulatory lncRNAs. PMID:27982099
Miao, Xiangyang; Luo, Qingmiao; Zhao, Huijing; Qin, Xiaoyu
2016-12-16
Small Tail Han sheep, including the FecB B FecB B (Han BB) and FecB + FecB + (Han++) genotypes, and Dorset sheep exhibit different fecundities. To identify novel long non-coding RNAs (lncRNAs) associated with sheep fecundity to better understand their molecular mechanisms, a genome-wide analysis of mRNAs and lncRNAs from Han BB, Han++ and Dorset sheep was performed. After the identification of differentially expressed mRNAs and lncRNAs, 16 significant modules were explored by using weighted gene coexpression network analysis (WGCNA) followed by functional enrichment analysis of the genes and lncRNAs in significant modules. Among these selected modules, the yellow and brown modules were significantly related to sheep fecundity. lncRNAs (e.g., NR0B1, XLOC_041882, and MYH15) in the yellow module were mainly involved in the TGF-β signalling pathway, and NYAP1 and BCORL1 were significantly associated with the oxytocin signalling pathway, which regulates several genes in the coexpression network of the brown module. Overall, we identified several gene modules associated with sheep fecundity, as well as networks consisting of hub genes and lncRNAs that may contribute to sheep prolificacy by regulating the target mRNAs related to the TGF-β and oxytocin signalling pathways. This study provides an alternative strategy for the identification of potential candidate regulatory lncRNAs.
The D4 receptor gene and mood disorders: An association study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Macciardi, F.; Cavalini, M.C.; Petronis, A.
1994-09-01
The problem of a gene-disease association is of major relevance in the current research of Psychiatric Disorders, mostly because of the lack of unequivocal results obtained with the linkage approach. However, some points of an association study must also be carefully considered, namely the statistical methodology and the strategy to select a gene to be tested. The gene coding for the D4 receptor (DRD4) might be theoretically relevant as a component of the genetic susceptibility for mood disorders. We now know that DRD4 has at least 2 functional polymorphisms in the coding regions of the gene, in exon 3 and exon 1, thus conferring etiologic relevance to a potentially positive association. In our work, we investigated the DRD4 genotypes of the 3rd and 1st exon for 93 patients with bipolar disorder and 57 patients with major depression, recurrent disorder. Patients have been diagnosed either by traditional DSMIII-R criteria or by clustering their lifetime psychopathological symptomatology. A random control group consisted of 151 subjects. A significant association has been found with DRD4 exon 3 genotypes, revealing an increase of genotypes 2-4 in Bipolar patients (chi-square=23.07, df=12, p=0.02). Even though a definitive confirmation of our finding requires an independent replication of the study, this result emphasizes the importance of DRD4 in mood disorders.
Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard
2014-01-01
Objective The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. 
Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions We provide for the first time evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098
Evolution of major histocompatibility complex class I and class II genes in the brown bear
2012-01-01
Background Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. Results We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Conclusions Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South–north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia. PMID:23031405
Evolution of major histocompatibility complex class I and class II genes in the brown bear.
Kuduk, Katarzyna; Babik, Wiesław; Bojarska, Katarzyna; Sliwińska, Ewa B; Kindberg, Jonas; Taberlet, Pierre; Swenson, Jon E; Radwan, Jacek
2012-10-02
Major histocompatibility complex (MHC) proteins constitute an essential component of the vertebrate immune response, and are coded by the most polymorphic of the vertebrate genes. Here, we investigated sequence variation and evolution of MHC class I and class II DRB, DQA and DQB genes in the brown bear Ursus arctos to characterise the level of polymorphism, estimate the strength of positive selection acting on them, and assess the extent of gene orthology and trans-species polymorphism in Ursidae. We found 37 MHC class I, 16 MHC class II DRB, four DQB and two DQA alleles. We confirmed the expression of several loci: three MHC class I, two DRB, two DQB and one DQA. MHC class I also contained two clusters of non-expressed sequences. MHC class I and DRB allele frequencies differed between northern and southern populations of the Scandinavian brown bear. The rate of nonsynonymous substitutions (dN) exceeded the rate of synonymous substitutions (dS) at putative antigen binding sites of DRB and DQB loci and, marginally significantly, at MHC class I loci. Models of codon evolution supported positive selection at DRB and MHC class I loci. Both MHC class I and MHC class II sequences showed orthology to gene clusters found in the giant panda Ailuropoda melanoleuca. Historical positive selection has acted on MHC class I, class II DRB and DQB, but not on the DQA locus. The signal of historical positive selection on the DRB locus was particularly strong, which may be a general feature of caniforms. The presence of MHC class I pseudogenes may indicate faster gene turnover in this class through the birth-and-death process. South-north population structure at MHC loci probably reflects origin of the populations from separate glacial refugia.
Cheng, X; Sardana, R; Kaplan, H; Altosaar, I
1998-03-17
Over 2,600 transgenic rice plants in nine strains were regenerated from >500 independently selected hygromycin-resistant calli after Agrobacterium-mediated transformation. The plants were transformed with fully modified (plant codon optimized) versions of two synthetic cryIA(b) and cryIA(c) coding sequences from Bacillus thuringiensis as well as the hph and gus genes, coding for hygromycin phosphotransferase and beta-glucuronidase, respectively. These sequences were placed under control of the maize ubiquitin promoter, the CaMV35S promoter, and the Brassica Bp10 gene promoter to achieve high and tissue-specific expression of the lepidopteran-specific delta-endotoxins. The integration, expression, and inheritance of these genes were demonstrated in R0 and R1 generations by Southern, Northern, and Western analyses and by other techniques. Accumulation of high levels (up to 3% of soluble proteins) of CryIA(b) and CryIA(c) proteins was detected in R0 plants. Bioassays with R1 transgenic plants indicated that the transgenic plants were highly toxic to two major rice insect pests, striped stem borer (Chilo suppressalis) and yellow stem borer (Scirpophaga incertulas), with mortalities of 97-100% within 5 days after infestation, thus offering a potential for effective insect resistance in transgenic rice plants.
Omeire, Destiny; Abdin, Shaunte; Brooks, Daniel M; Miranda, Hector C
2015-04-01
The Germain's Peacock-Pheasant Polyplectron germaini (Aves, Galliformes, Phasianidae) is classified as Near Threatened on the IUCN Red List. The complete mitochondrial genome of P. germaini is 16,699 bp, consisting of 13 protein-coding genes, 2 rRNA, 22 tRNA genes and 1 control region. All of the 13 protein-coding genes have ATG as start codon. Eight of the 13 protein-coding genes have TAA as stop codon.
Dietary Variation and Evolution of Gene Copy Number among Dog Breeds
Reiter, Taylor; Jagoda, Evelyn; Capellini, Terence D.
2016-01-01
Prolonged human interactions and artificial selection have influenced the genotypic and phenotypic diversity among dog breeds. Because humans and dogs occupy diverse habitats, ecological contexts have likely contributed to breed-specific positive selection. Prior to the advent of modern dog-feeding practices, there was likely substantial variation in dietary landscapes among disparate dog breeds. As such, we investigated one type of genetic variant, copy number variation, in three metabolic genes: glucokinase regulatory protein (GCKR), phytanoyl-CoA 2-hydroxylase (PHYH), and pancreatic α-amylase 2B (AMY2B). These genes code for proteins that are responsible for metabolizing dietary products that originate from distinctly different food types: sugar, meat, and starch, respectively. After surveying copy number variation among dogs with diverse dietary histories, we found no correlation between diet and positive selection in either GCKR or PHYH. Although it has been previously demonstrated that dogs experienced a copy number increase in AMY2B relative to wolves during or after the dog domestication process, we demonstrate that positive selection continued to act on amylase copy number in dog breeds that consumed starch-rich diets in time periods after domestication. Furthermore, we found that introgression with wolves is not responsible for deterioration of positive selection on AMY2B among diverse dog breeds. Together, this supports the hypothesis that the amylase copy number expansion is found universally in dogs. PMID:26863414
Dietary Variation and Evolution of Gene Copy Number among Dog Breeds.
Reiter, Taylor; Jagoda, Evelyn; Capellini, Terence D
2016-01-01
Prolonged human interactions and artificial selection have influenced the genotypic and phenotypic diversity among dog breeds. Because humans and dogs occupy diverse habitats, ecological contexts have likely contributed to breed-specific positive selection. Prior to the advent of modern dog-feeding practices, there was likely substantial variation in dietary landscapes among disparate dog breeds. As such, we investigated one type of genetic variant, copy number variation, in three metabolic genes: glucokinase regulatory protein (GCKR), phytanoyl-CoA 2-hydroxylase (PHYH), and pancreatic α-amylase 2B (AMY2B). These genes code for proteins that are responsible for metabolizing dietary products that originate from distinctly different food types: sugar, meat, and starch, respectively. After surveying copy number variation among dogs with diverse dietary histories, we found no correlation between diet and positive selection in either GCKR or PHYH. Although it has been previously demonstrated that dogs experienced a copy number increase in AMY2B relative to wolves during or after the dog domestication process, we demonstrate that positive selection continued to act on amylase copy number in dog breeds that consumed starch-rich diets in time periods after domestication. Furthermore, we found that introgression with wolves is not responsible for deterioration of positive selection on AMY2B among diverse dog breeds. Together, this supports the hypothesis that the amylase copy number expansion is found universally in dogs.
Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution
Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M.; Imumorin, Ikhide G.; Peters, Sunday O.; Zhang, Jiajin; Dong, Yang; Wang, Wen
2016-01-01
The immune systems are fundamentally vital for evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family control many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take active part in very many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goats’ populations. Fu and Li tests were significantly positive but Tajima’s D test was significantly negative, suggesting its deviation from neutrality. Neighbor joining tree of IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats have a closer relationship than with the wild goat and sheep. Maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor and suggest single origin. Four unique haplotypes were observed across all the sequences, of which, one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the evolution mode of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in IRF3 gene. 
Positive diversifying selection inferred with recent evolutionary changes in domesticated goat IRF3 led us to conclude that the gene evolution may have been influenced by domestication processes in goats. PMID:27598391
Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution.
Okpeku, Moses; Esmailizadeh, Ali; Adeola, Adeniyi C; Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M; Imumorin, Ikhide G; Peters, Sunday O; Zhang, Jiajin; Dong, Yang; Wang, Wen
2016-01-01
The immune systems are fundamentally vital for evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family control many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take active part in very many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goats' populations. Fu and Li tests were significantly positive but Tajima's D test was significantly negative, suggesting its deviation from neutrality. Neighbor joining tree of IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats have a closer relationship than with the wild goat and sheep. Maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor and suggest single origin. Four unique haplotypes were observed across all the sequences, of which, one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the evolution mode of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in IRF3 gene. 
Positive diversifying selection inferred with recent evolutionary changes in domesticated goat IRF3 led us to conclude that the gene evolution may have been influenced by domestication processes in goats.
Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv
2010-01-01
RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
Generate Optimized Genetic Rhythm for Enzyme Expression in Non-native systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-11-03
Most amino acids are represented by more than one codon, resulting in redundancy in the genetic code. Silent codon substitutions that do not alter the amino acid sequence still have an effect on protein expression. We have developed an algorithm, GoGREEN, to enhance the expression of foreign proteins in a host organism. GoGREEN selects codons according to frequency patterns seen in the gene of interest using the codon usage table from the host organism. GoGREEN is also designed to accommodate gaps in the sequence. This software takes as input (1) the aligned protein sequences for genes the user wishes to express, (2) the codon usage table for the host organism, and (3) the DNA sequence for the target protein found in the host organism. The program will select codons based on codon usage patterns for the target DNA sequence. The program will also select codons for “gaps” found in the aligned protein sequences using the codon usage table from the host organism.
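One way to picture pattern-based codon selection is a rank-matching scheme, sketched below. This is not the GoGREEN implementation, whose internals the abstract does not describe. It is a minimal sketch that assumes "frequency pattern" means the frequency rank of each codon among its synonymous codons in the host usage table, and that gap positions simply receive the most frequent host codon. The function names, the `None` convention for gaps, and the tiny `USAGE` table are hypothetical.

```python
# Hypothetical host codon usage table: amino acid -> {codon: relative frequency}.
USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "E": {"GAA": 0.6, "GAG": 0.4},
}


def ranked_codons(usage, amino_acid):
    """Codons for one amino acid, most frequent in the host first."""
    table = usage[amino_acid]
    return sorted(table, key=table.get, reverse=True)


def choose_codons(protein, host_codons, usage):
    """Pick one codon per residue of `protein`.

    host_codons[i] is the codon the host's own gene uses at the aligned
    position i, or None where the alignment has a gap in the host gene.
    """
    if len(protein) != len(host_codons):
        raise ValueError("protein and host_codons must have equal length")
    codon_to_aa = {c: aa for aa, codons in usage.items() for c in codons}
    chosen = []
    for amino_acid, host_codon in zip(protein, host_codons):
        options = ranked_codons(usage, amino_acid)
        if host_codon is None:
            rank = 0  # gap: fall back to the most frequent host codon
        else:
            host_aa = codon_to_aa[host_codon]
            rank = ranked_codons(usage, host_aa).index(host_codon)
        # Clamp the rank when the new residue has fewer synonymous codons.
        chosen.append(options[min(rank, len(options) - 1)])
    return "".join(chosen)


if __name__ == "__main__":
    # The host gene reads ATG GAG at the first two aligned positions;
    # the third position is a gap in the host gene.
    print(choose_codons("MKE", ["ATG", "GAG", None], USAGE))
```

In this toy case the lysine inherits the second-ranked codon because the host gene used a second-ranked codon at that position, while the gap position falls back on the usage table alone.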
AP1 Keeps Chromatin Poised for Action | Center for Cancer Research
The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins called chromatin that compacts the DNA in the nucleus, strongly restricting access to DNA sequences. As a result, regulatory factors only interact with a small subset of their potential binding elements in a given cell to regulate genes. How factors recognize and select sites in chromatin across the genome is not well understood -- but several discoveries in CCR’s Laboratory of Receptor Biology and Gene Expression (LRBGE) have shed light on the mechanisms that direct factors to DNA.
Brancaccio, Mariarita; Coretti, Lorena; Florio, Ermanno; Pezone, Antonio; Calabrò, Viola; Falco, Geppino; Keller, Simona; Lembo, Francesca; Avvedimento, Vittorio Enrico; Chiariotti, Lorenzo
2016-01-01
Bacterial lipopolysaccharide (LPS) induces release of inflammatory mediators both in immune and epithelial cells. We investigated whether changes of epigenetic marks, including selected histone modification and DNA methylation, may drive or accompany the activation of COX-2 gene in HT-29 human intestinal epithelial cells upon exposure to LPS. Here we describe cyclical histone acetylation (H3), methylation (H3K4, H3K9, H3K27) and DNA methylation changes occurring at COX-2 gene promoter over time after LPS stimulation. Histone K27 methylation changes are carried out by the H3 demethylase JMJD3 and are essential for COX-2 induction by LPS. The changes of the histone code are associated with cyclical methylation signatures at the promoter and gene body of COX-2 gene. PMID:27253528
Rapid Evolution of Ovarian-Biased Genes in the Yellow Fever Mosquito (Aedes aegypti).
Whittle, Carrie A; Extavour, Cassandra G
2017-08-01
Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female-female competition or male-mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male-male sperm competition. Copyright © 2017 by the Genetics Society of America.
Parallel evolution of auditory genes for echolocation in bats and toothed whales.
Shen, Yong-Yi; Liang, Lu; Li, Gui-Sheng; Murphy, Robert W; Zhang, Ya-Ping
2012-06-01
The ability of bats and toothed whales to echolocate is a remarkable case of convergent evolution. Previous genetic studies have documented parallel evolution of nucleotide sequences in Prestin and KCNQ4, both of which are associated with voltage motility during the cochlear amplification of signals. Echolocation involves complex mechanisms. The most important factors include cochlear amplification, nerve transmission, and signal re-coding. Herein, we screen three genes that play different roles in this auditory system. Cadherin 23 (Cdh23) and its ligand, protocadherin 15 (Pcdh15), are essential for bundling motility in the sensory hair. Otoferlin (Otof) responds to nerve signal transmission in the auditory inner hair cell. Signals of parallel evolution occur in all three genes in the three groups of echolocators--two groups of bats (Yangochiroptera and Rhinolophoidea) plus the dolphin. Significant signals of positive selection also occur in Cdh23 in the Rhinolophoidea and dolphin, and Pcdh15 in Yangochiroptera. In addition, adult echolocating bats have higher levels of Otof expression in the auditory cortex than do their embryos and non-echolocation bats. Cdh23 and Pcdh15 encode the upper and lower parts of tip-links, and both genes show signals of convergent evolution and positive selection in echolocators, implying that they may co-evolve to optimize cochlear amplification. Convergent evolution and expression patterns of Otof suggest the potential role of nerve and brain in echolocation. Our synthesis of gene sequence and gene expression analyses reveals that positive selection, parallel evolution, and perhaps co-evolution and gene expression affect multiple hearing genes that play different roles in audition, including voltage and bundle motility in cochlear amplification, nerve transmission, and brain function.
Adaptive evolution of the Hox gene family for development in bats and dolphins.
Liang, Lu; Shen, Yong-Yi; Pan, Xiao-Wei; Zhou, Tai-Cheng; Yang, Chao; Irwin, David M; Zhang, Ya-Ping
2013-01-01
Bats and cetaceans (i.e., whales, dolphins, porpoises) are two kinds of mammals with unique locomotive styles and occupy novel niches. Bats are the only mammals capable of sustained flight in the sky, while cetaceans have returned to the aquatic environment and are specialized for swimming. Associated with these novel adaptations to their environment, various development changes have occurred to their body plans and associated structures. Given the importance of Hox genes in many aspects of embryonic development, we conducted an analysis of the coding regions of all Hox gene family members from bats (represented by Pteropus vampyrus, Pteropus alecto, Myotis lucifugus and Myotis davidii) and cetaceans (represented by Tursiops truncatus) for adaptive evolution using the available draft genome sequences. Differences in the selective pressures acting on many Hox genes in bats and cetaceans were found compared to other mammals. Positive selection, however, was not found to act on any of the Hox genes in the common ancestor of bats and only upon Hoxb9 in cetaceans. PCR amplification data from additional bat and cetacean species, and application of the branch-site test 2, showed that the Hoxb2 gene within bats had significant evidence of positive selection. Thus, our study, with genomic and newly sequenced Hox genes, identifies two candidate Hox genes that may be closely linked with developmental changes in bats and cetaceans, such as those associated with the pancreatic, neuronal, thymus shape and forelimb. In addition, the difference in our results from the genome-wide scan and newly sequenced data reveals that great care must be taken in interpreting results from draft genome data from a limited number of species, and deep genetic sampling of a particular clade is a powerful tool for generating complementary data to address this limitation. PMID:23825528
Johnson, Katherine A; Barry, Edwina; Lambert, David; Fitzgerald, Michael; McNicholas, Fiona; Kirley, Aiveen; Gill, Michael; Bellgrove, Mark A; Hawi, Ziarih
2013-12-01
A naturalistic, prospective study of the influence of genetic variation on dose prescribed, clinical response, and side effects related to stimulant medication in 77 children with attention-deficit/hyperactivity disorder (ADHD) was undertaken. The influence of genetic variation of the CES1 gene coding for carboxylesterase 1A1 (CES1A1), the major enzyme responsible for the first-pass, stereoselective metabolism of methylphenidate, was investigated. Parent- and teacher-rated behavioral questionnaires were collected at baseline when the children were medication naïve, and again at 6 weeks while they were on medication. Medication dose, prescribed at the discretion of the treating clinician, and side effects, were recorded at week 6. Blood and saliva samples were collected for genotyping. Single nucleotide polymorphisms (SNPs) were selected in the coding, non-coding and the 3' flanking region of the CES1 gene. Genetic association between CES1 variants and ADHD was investigated in an expanded sample of 265 Irish ADHD families. Analyses were conducted using analysis of covariance (ANCOVA) and logistic regression models. None of the CES1 gene variants were associated with the dose of methylphenidate provided or the clinical response recorded at the 6 week time point. An association between two CES1 SNP markers and the occurrence of sadness as a side effect of short-acting methylphenidate was found. The two associated CES1 markers were in linkage disequilibrium and were significantly associated with ADHD in a larger sample of ADHD trios. The associated CES1 markers were also in linkage disequilibrium with two SNP markers of the noradrenaline transporter gene (SLC6A2). This study found an association between two CES1 SNP markers and the occurrence of sadness as a side effect of short-acting methylphenidate. These markers were in linkage disequilibrium together and with two SNP markers of the noradrenaline transporter gene.
Vascular tone pathway polymorphisms in relation to primary open-angle glaucoma.
Kang, J H; Loomis, S J; Yaspan, B L; Bailey, J C; Weinreb, R N; Lee, R K; Lichter, P R; Budenz, D L; Liu, Y; Realini, T; Gaasterland, D; Gaasterland, T; Friedman, D S; McCarty, C A; Moroi, S E; Olson, L; Schuman, J S; Singh, K; Vollrath, D; Wollstein, G; Zack, D J; Brilliant, M; Sit, A J; Christen, W G; Fingert, J; Forman, J P; Buys, E S; Kraft, P; Zhang, K; Allingham, R R; Pericak-Vance, M A; Richards, J E; Hauser, M A; Haines, J L; Wiggs, J L; Pasquale, L R
2014-06-01
Vascular perfusion may be impaired in primary open-angle glaucoma (POAG); thus, we evaluated a panel of markers in vascular tone-regulating genes in relation to POAG. We used Illumina 660W-Quad array genotype data and pooled P-values from 3108 POAG cases and 3430 controls from the combined National Eye Institute Glaucoma Human Genetics Collaboration consortium and Glaucoma Genes and Environment studies. Using information from previous literature and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, we compiled single-nucleotide polymorphisms (SNPs) in 186 vascular tone-regulating genes. We used the 'Pathway Analysis by Randomization Incorporating Structure' analysis software, which performed 1000 permutations to compare the overall pathway and selected genes with comparable randomly generated pathways and genes in their association with POAG. The vascular tone pathway was not associated with POAG overall or POAG subtypes, defined by the type of visual field loss (early paracentral loss (n=224 cases) or only peripheral loss (n=993 cases)) (permuted P≥0.20). In gene-based analyses, eight were associated with POAG overall at permuted P<0.001: PRKAA1, CAV1, ITPR3, EDNRB, GNB2, DNM2, HFE, and MYL9. Notably, six of these eight (the first six listed) code for factors involved in the endothelial nitric oxide synthase activity, and three of these six (CAV1, ITPR3, and EDNRB) were also associated with early paracentral loss at P<0.001, whereas none of the six genes reached P<0.001 for peripheral loss only. Although the assembled vascular tone SNP set was not associated with POAG, genes that code for local factors involved in setting vascular tone were associated with POAG.
Boyd, David A.; Thevenot, Tracy; Gumbmann, Markus; Honeyman, Allen L.; Hamilton, Ian R.
2000-01-01
Transposon mutagenesis and marker rescue were used to isolate and identify an 8.5-kb contiguous region containing six open reading frames constituting the operon for the sorbitol P-enolpyruvate phosphotransferase transport system (PTS) of Streptococcus mutans LT11. The first gene, srlD, codes for sorbitol-6-phosphate dehydrogenase, followed downstream by srlR, coding for a transcriptional regulator; srlM, coding for a putative activator; and the srlA, srlE, and srlB genes, coding for the EIIC, EIIBC, and EIIA components of the sorbitol PTS, respectively. Among all sorbitol PTS operons characterized to date, the srlD gene is found after the genes coding for the EII components; thus, the location of the gene in S. mutans is unique. The SrlR protein is similar to several transcriptional regulators found in Bacillus spp. that contain PTS regulator domains (J. Stülke, M. Arnaud, G. Rapoport, and I. Martin-Verstraete, Mol. Microbiol. 28:865–874, 1998), and its gene overlaps the srlM gene by 1 bp. The arrangement of these two regulatory genes is unique, having not been reported for other bacteria. PMID:10639465
Long non-coding RNAs and mRNAs profiling during spleen development in pig.
Che, Tiandong; Li, Diyan; Jin, Long; Fu, Yuhua; Liu, Yingkai; Liu, Pengliang; Wang, Yixin; Tang, Qianzi; Ma, Jideng; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou
2018-01-01
Genome-wide transcriptomic studies in humans and mice have become extensive and mature. However, a comprehensive and systematic understanding of protein-coding genes and long non-coding RNAs (lncRNAs) expressed during pig spleen development has not been achieved. LncRNAs are known to participate in regulatory networks for an array of biological processes. Here, we constructed 18 RNA libraries from developing fetal pig spleen (55 days before birth), postnatal pig spleens (0, 30, 180 days and 2 years after birth), and the samples from the 2-year-old Wild Boar. A total of 15,040 lncRNA transcripts were identified among these samples. We found that the temporal expression pattern of lncRNAs was more restricted than observed for protein-coding genes. Time-series analysis showed two large modules for protein-coding genes and lncRNAs. The up-regulated module was enriched for genes related to immune and inflammatory function, while the down-regulated module was enriched for cell proliferation processes such as cell division and DNA replication. Co-expression networks indicated the functional relatedness between protein-coding genes and lncRNAs, which were enriched for similar functions over the series of time points examined. We identified numerous differentially expressed protein-coding genes and lncRNAs in all five developmental stages. Notably, ceruloplasmin precursor (CP), a protein-coding gene participating in antioxidant and iron transport processes, was differentially expressed in all stages. This study provides the first catalog of the developing pig spleen, and contributes to a fuller understanding of the molecular mechanisms underpinning mammalian spleen development.
Fischer, Iris; Steige, Kim A.; Stephan, Wolfgang; Mboup, Mamadou
2013-01-01
The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149
MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit
Kultima, Jens Roat; Sunagawa, Shinichi; Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer
2012-01-01
MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/. PMID:23082188
Pompei, Fiorenza; Ciminelli, Bianca Maria; Bombieri, Cristina; Ciccacci, Cinzia; Koudova, Monika; Giorgi, Silvia; Belpinati, Francesca; Begnini, Angela; Cerny, Milos; Des Georges, Marie; Claustres, Mireille; Ferec, Claude; Macek, Milan; Modiano, Guido; Pignatti, Pier Franco
2006-01-01
An average of about 1700 CFTR (cystic fibrosis transmembrane conductance regulator) alleles from normal individuals from different European populations were extensively screened for DNA sequence variation. A total of 80 variants were observed: 61 coding SNSs (results already published), 13 noncoding SNSs, three STRs, two short deletions, and one nucleotide insertion. Eight DNA variants were classified as non-CF causing due to their high frequency of occurrence. Through this survey the CFTR has become the most exhaustively studied gene for its coding sequence variability and, though to a lesser extent, for its noncoding sequence variability as well. Interestingly, most variation was associated with the M470 allele, while the V470 allele showed an 'extended haplotype homozygosity' (EHH). These findings lead us to suggest a role for selection acting either on M470V itself or through a hitchhiking mechanism involving a second site. The possible ancient origin of the V allele in an 'out of Africa' time frame is discussed.
Divergent transcription is associated with promoters of transcriptional regulators
2013-01-01
Background: Divergent transcription is a wide-spread phenomenon in mammals. For instance, short bidirectional transcripts are a hallmark of active promoters, while longer transcripts can be detected antisense from active genes in conditions where the RNA degradation machinery is inhibited. Moreover, many described long non-coding RNAs (lncRNAs) are transcribed antisense from coding gene promoters. However, the general significance of divergent lncRNA/mRNA gene pair transcription is still poorly understood. Here, we used strand-specific RNA-seq with high sequencing depth to thoroughly identify antisense transcripts from coding gene promoters in primary mouse tissues. Results: We found that a substantial fraction of coding-gene promoters sustain divergent transcription of long non-coding RNA (lncRNA)/mRNA gene pairs. Strikingly, upstream antisense transcription is significantly associated with genes related to transcriptional regulation and development. Their promoters share several characteristics with those of transcriptional developmental genes, including very large CpG islands, high degree of conservation and epigenetic regulation in ES cells. In-depth analysis revealed a unique GC skew profile at these promoter regions, while the associated coding genes were found to have large first exons, two genomic features that might enforce bidirectional transcription. Finally, genes associated with antisense transcription harbor specific H3K79me2 epigenetic marking and RNA polymerase II enrichment profiles linked to an intensified rate of early transcriptional elongation. Conclusions: We concluded that promoters of a class of transcription regulators are characterized by a specialized transcriptional control mechanism, which is directly coupled to relaxed bidirectional transcription. PMID:24365181
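The GC skew profile mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition of GC skew, (G - C) / (G + C), computed in fixed windows along a sequence; the abstract does not state the window size or exact formula the authors used, so the function names, window, step, and toy sequence below are illustrative assumptions only.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_gc_skew(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (window_start, skew) pairs for windows taken every `step` bases."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (hypothetical, not from the study): a G-rich block,
# a C-rich block, then a balanced block.
profile = sliding_gc_skew("GGGGCCCCGGCC", window=4, step=4)
for start, skew in profile:
    print(start, skew)
```

A positive value marks a window with more G than C on the strand being read, a negative value the reverse; real analyses would use much larger windows centered on annotated promoters.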
Natural selection drove metabolic specialization of the chromatophore in Paulinella chromatophora.
Valadez-Cano, Cecilio; Olivares-Hernández, Roberto; Resendis-Antonio, Osbaldo; DeLuna, Alexander; Delaye, Luis
2017-04-14
Genome degradation of host-restricted mutualistic endosymbionts has been attributed to inactivating mutations and genetic drift, while genes coding for host-relevant functions are conserved by purifying selection. Unlike that of their free-living relatives, the metabolism of mutualistic endosymbionts and endosymbiont-originated organelles is specialized in the production of metabolites which are released to the host. This specialization suggests that natural selection crafted these metabolic adaptations. In this work, we analyzed the evolution of the metabolism of the chromatophore of Paulinella chromatophora by in silico modeling. We asked whether genome reduction is driven by metabolic engineering strategies resulting from the interaction with the host. As is widely known, the loss of enzyme-coding genes leads to metabolic network restructuring that sometimes improves production rates; in this case, the production rate of reduced-carbon in the metabolism of the chromatophore. We reconstructed the metabolic networks of the chromatophore of P. chromatophora CCAC 0185 and a close free-living relative, the cyanobacterium Synechococcus sp. WH 5701. We found that the evolution from a free-living to a host-restricted lifestyle rendered a fragile metabolic network where >80% of genes in the chromatophore are essential for metabolic functionality. Despite the lack of experimental information, the metabolic reconstruction of the chromatophore suggests that the host provides several metabolites to the endosymbiont. By using these metabolites as intracellular conditions, in silico simulations of genome evolution by gene loss recover with 77% accuracy the actual metabolic gene content of the chromatophore. Also, the metabolic model of the chromatophore allowed us to predict by flux balance analysis a maximum rate of reduced-carbon released by the endosymbiont to the host.
By inspecting the central metabolism of the chromatophore and the free-living cyanobacteria, we found that, through improvements in the gluconeogenic pathway, the metabolism of the endosymbiont uses the carbon source more efficiently for reduced-carbon production. In addition, our in silico simulations of the evolutionary process leading to the reduced metabolic network of the chromatophore showed that the predicted rate of released reduced-carbon is obtained in fewer than 5% of cases under a process guided by random gene deletion and genetic drift. We interpret these findings as evidence that natural selection at the holobiont level shaped the rate at which reduced-carbon is exported to the host. Finally, our model also predicts that the ABC phosphate transporter (pstSACB), which is conserved in the genome of the chromatophore of P. chromatophora strain CCAC 0185, is a necessary component to release reduced-carbon molecules to the host. Our evolutionary analysis suggests that in the case of Paulinella chromatophora natural selection at the holobiont level played a prominent role in shaping the metabolic specialization of the chromatophore. We propose that natural selection acted as a "metabolic engineer" by favoring metabolic restructurings that led to an increased release of reduced-carbon to the host.
Origin and evolution of the long non-coding genes in the X-inactivation center.
Romito, Antonio; Rougeulle, Claire
2011-11-01
Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst in the eutherian lineage it provided the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The newly arising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.
A Molecular Phylogeny of Hemiptera Inferred from Mitochondrial Genome Sequences
Song, Nan; Liang, Ai-Ping; Bu, Cui-Ping
2012-01-01
Classically, Hemiptera is comprised of two suborders: Homoptera and Heteroptera. Homoptera includes Cicadomorpha, Fulgoromorpha and Sternorrhyncha. However, according to previous molecular phylogenetic studies based on 18S rDNA, Fulgoromorpha has a closer relationship to Heteroptera than to other hemipterans, leaving Homoptera as paraphyletic. Therefore, the position of Fulgoromorpha is important for studying phylogenetic structure of Hemiptera. We inferred the evolutionary affiliations of twenty-five superfamilies of Hemiptera using mitochondrial protein-coding genes and rRNAs. We sequenced three mitogenomes, from Pyrops candelaria, Lycorma delicatula and Ricania marginalis, representing two additional families in Fulgoromorpha. Pyrops and Lycorma are representatives of an additional major family Fulgoridae in Fulgoromorpha, whereas Ricania is a second representative of the highly derived clade Ricaniidae. The organization and size of these mitogenomes are similar to those of the sequenced fulgoroid species. Our consensus phylogeny of Hemiptera largely supported the relationships (((Fulgoromorpha,Sternorrhyncha),Cicadomorpha),Heteroptera), and thus supported the classic phylogeny of Hemiptera. Selection of optimal evolutionary models (exclusion and inclusion of two rRNA genes or of third codon positions of protein-coding genes) demonstrated that rapidly evolving and saturated sites should be removed from the analyses. PMID:23144967
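The exclusion of third codon positions described in this abstract can be sketched as a simple alignment filter. This is a minimal illustration, assuming each protein-coding sequence is already in frame and codon-aligned; the function name and the two toy sequences are hypothetical and do not come from the study.

```python
def drop_third_positions(cds: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame coding sequence.

    Any trailing partial codon is discarded.
    """
    usable = len(cds) - len(cds) % 3
    return "".join(cds[i:i + 2] for i in range(0, usable, 3))


# Toy codon-aligned sequences (hypothetical taxa and bases).
alignment = {
    "taxon_A": "ATGGCCTTA",
    "taxon_B": "ATGGCTTTG",
}
filtered = {name: drop_third_positions(seq) for name, seq in alignment.items()}
print(filtered)
```

In this toy case the two sequences differ only at third positions, so they become identical after filtering, which is the intended effect when those sites are saturated and carry more noise than signal.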
Lobmann, M.; Delem, A.; Jovanovic, D.; Peetermans, J.
1981-01-01
Two recombinants (R22 and R75) of the attenuated B/USSR/69 strain Brigit and the virulent B/Hong Kong/5/72 and one recombinant (R5) of Brigit and the virulent B/Hong Kong/8/73 were selected for genotypic and phenotypic characterization. All three recombinants had the growth property of the attenuated parent Brigit. Analysis of their RNAs by polyacrylamide gel electrophoresis revealed that the strains R22 and R75 had derived all their genes from Brigit except those coding for haemagglutinin. These recombinants were clinically evaluated and found to be attenuated and immunogenic. The recombinant R5, which derived, besides the gene coding for the haemagglutinin, several other genes from B/Hong Kong/8/73, was only partly attenuated since it induced influenza-like symptoms in one out of three volunteers. It is concluded that the strain Brigit can be used as a donor of genes for the attenuation of the B/Hong Kong/5/72 virus and that recombinants of influenza type B can be identified, like influenza type A recombinants, by their RNA pattern. PMID:7019320
Autoimmunity as a Driving Force of Cognitive Evolution
Nataf, Serge
2017-01-01
In the last decades, increasingly robust experimental approaches have formally demonstrated that autoimmunity is a physiological process involved in a large range of functions including cognition. On this basis, the recently enunciated “brain superautoantigens” theory proposes that autoimmunity has been a driving force of cognitive evolution. It is notably suggested that the immune and nervous systems have somehow co-evolved and exerted a mutual selection pressure benefiting both systems. In this two-way process, the evolutionarily determined emergence of neurons expressing specific immunogenic antigens (brain superautoantigens) has exerted a selection pressure on immune genes shaping the T-cell repertoire. Such a selection pressure on immune genes has translated into the emergence of a finely tuned autoimmune T-cell repertoire that promotes cognition. On the other hand, the evolutionarily determined emergence of brain-autoreactive T-cells has exerted a selection pressure on neural genes coding for brain superautoantigens. Such a selection pressure has translated into the emergence of a neural repertoire (defined here as the whole of neurons, synapses and non-neuronal cells involved in cognitive functions) expressing brain superautoantigens. Overall, the brain superautoantigens theory suggests that cognitive evolution might have been primarily driven by internal cues rather than external environmental conditions. Importantly, while providing a unique molecular connection between neural and T-cell repertoires under physiological conditions, brain superautoantigens may also constitute an Achilles heel responsible for the particular susceptibility of Homo sapiens to “neuroimmune co-pathologies”, i.e., disorders affecting both neural and T-cell repertoires. These may notably include paraneoplastic syndromes, multiple sclerosis as well as autism, schizophrenia and neurodegenerative diseases. 
In the context of this theoretical frame, a specific emphasis is given here to the potential evolutionary role exerted by two families of genes, namely the MHC class II genes, involved in antigen presentation to T-cells, and the Foxp genes, which play crucial roles in language (Foxp2) and the regulation of autoimmunity (Foxp3). PMID:29123465
Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.
Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D
2017-12-03
A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence. 
These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
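The idea of locating motifs built from a trinucleotide code in each of the three frames can be sketched as follows. This is only an illustration: the abstract does not list the 20 trinucleotides of X, so a small hypothetical four-trinucleotide code stands in for it, and a "motif" is simplified here to a maximal run of consecutive code trinucleotides, with a minimum length in trinucleotides rather than the cardinality criterion the authors describe. All names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(code: set[str]) -> bool:
    """True if the reverse complement of every trinucleotide is also in the code."""
    return all(reverse_complement(t) in code for t in code)


def find_motifs(seq: str, code: set[str], frame: int = 0, min_len: int = 2):
    """Return (start, n_trinucleotides) for each maximal run of code
    trinucleotides read in the given frame (0, 1 or 2)."""
    motifs, run_start, run_len = [], 0, 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in code:
            if run_len == 0:
                run_start = i
            run_len += 1
        else:
            if run_len >= min_len:
                motifs.append((run_start, run_len))
            run_len = 0
    if run_len >= min_len:
        motifs.append((run_start, run_len))
    return motifs


# Hypothetical toy code and sequence; NOT the circular code X itself.
toy_code = {"AAC", "GTT", "GAG", "CTC"}
toy_seq = "AACGAGCTCTTTGTTAAC"
by_frame = {f: find_motifs(toy_seq, toy_code, frame=f) for f in range(3)}
print(is_self_complementary(toy_code), by_frame)
```

On the toy sequence, motifs appear only in frame 0, which mirrors in miniature the reading-frame enrichment that the abstract reports for X motifs in yeast genes.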
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences
Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya
2016-01-01
Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096
Duffy, A; Turecki, G; Grof, P; Cavazzoni, P; Grof, E; Joober, R; Ahrens, B; Berghöfer, A; Müller-Oerlinghausen, B; Dvoráková, M; Libigerová, E; Vojtĕchovský, M; Zvolský, P; Nilsson, A; Licht, R W; Rasmussen, N A; Schou, M; Vestergaard, P; Holzinger, A; Schumann, C; Thau, K; Robertson, C; Rouleau, G A; Alda, M
2000-01-01
OBJECTIVE: To test for genetic linkage and association with GABAergic candidate genes in lithium-responsive bipolar disorder. DESIGN: Polymorphisms located in genes that code for GABRA3, GABRA5 and GABRB3 subunits of the GABAA receptor were investigated using association and linkage strategies. PARTICIPANTS: A total of 138 patients with bipolar 1 disorder with a clear response to lithium prophylaxis, selected from specialized lithium clinics in Canada and Europe that are part of the International Group for the Study of Lithium-Treated Patients, and 108 psychiatrically healthy controls. Families of 24 probands were suitable for linkage analysis. OUTCOME MEASURES: The association between the candidate genes and patients with bipolar disorder versus that of controls and genetic linkage within families. RESULTS: There was no significant association or linkage found between lithium-responsive bipolar disorder and the GABAergic candidate genes investigated. CONCLUSIONS: This study does not support a major role for the GABAergic candidate genes tested in lithium-responsive bipolar disorder. PMID:11022400
Decoding the genome beyond sequencing: the new phase of genomic research.
Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J
2011-10-01
While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems: genes code for parts such as proteins and RNA, but the genome codes for the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory, which calls for a departure from gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing.
Laqueyrerie, A; Militzer, P; Romain, F; Eiglmeier, K; Cole, S; Marchal, G
1995-10-01
Effective protection against a virulent challenge with Mycobacterium tuberculosis is induced mainly by previous immunization with living attenuated mycobacteria, and it has been hypothesized that secreted proteins serve as major targets in the specific immune response. To identify and purify molecules present in culture medium filtrate which are dominant antigens during effective vaccination, a two-step selection procedure was used to select antigens able to interact with T lymphocytes and/or antibodies induced by immunization with living bacteria and to counterselect antigens interacting with the immune effectors induced by immunization with dead bacteria. A Mycobacterium bovis BCG 45/47-kDa antigen complex, present in BCG culture filtrate, has been previously identified and isolated (F. Romain, A. Laqueyrerie, P. Militzer, P. Pescher, P. Chavarot, M. Lagranderie, G. Auregan, M. Gheorghiu, and G. Marchal, Infect. Immun. 61:742-750, 1993). Since the cognate antibodies recognize the very same antigens present in M. tuberculosis culture medium filtrates, a project was undertaken to clone, express, and sequence the corresponding gene of M. tuberculosis. An M. tuberculosis shuttle cosmid library was transferred in Mycobacterium smegmatis and screened with a competitive enzyme-linked immunosorbent assay to detect the clones expressing the proteins. A clone containing a 40-kb DNA insert was selected, and by means of subcloning in Escherichia coli, a 2-kb fragment that coded for the molecules was identified. An open reading frame in the 2,061-nucleotide sequence codes for a secreted protein with a consensus signal peptide of 39 amino acids and a predicted molecular mass of 28,779 Da. The gene was referred to as apa because of the high percentages of proline (21.7%) and alanine (19%) in the purified protein. Southern hybridization analysis of digested total genomic DNA from M. tuberculosis (reference strains H37Rv and H37Ra) indicated that the apa gene was present as a single copy on the genome. The N-terminal identity or homology of the M. tuberculosis and M. bovis BCG purified molecules and their similar global and deduced amino acid compositions demonstrated the perfect correspondence between the molecular and chemical analyses. The presence of a high percentage of proline (21.7%) was confirmed and explained the apparent higher molecular mass (45/47 kDa) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis resulting from the increased rigidity of molecules due to proline residues. (ABSTRACT TRUNCATED AT 400 WORDS)
The genomic substrate for adaptive radiation in African cichlid fish.
Brawand, David; Wagner, Catherine E; Li, Yang I; Malinsky, Milan; Keller, Irene; Fan, Shaohua; Simakov, Oleg; Ng, Alvin Y; Lim, Zhi Wei; Bezault, Etienne; Turner-Maier, Jason; Johnson, Jeremy; Alcazar, Rosa; Noh, Hyun Ji; Russell, Pamela; Aken, Bronwen; Alföldi, Jessica; Amemiya, Chris; Azzouzi, Naoual; Baroiller, Jean-François; Barloy-Hubler, Frederique; Berlin, Aaron; Bloomquist, Ryan; Carleton, Karen L; Conte, Matthew A; D'Cotta, Helena; Eshel, Orly; Gaffney, Leslie; Galibert, Francis; Gante, Hugo F; Gnerre, Sante; Greuter, Lucie; Guyon, Richard; Haddad, Natalie S; Haerty, Wilfried; Harris, Rayna M; Hofmann, Hans A; Hourlier, Thibaut; Hulata, Gideon; Jaffe, David B; Lara, Marcia; Lee, Alison P; MacCallum, Iain; Mwaiko, Salome; Nikaido, Masato; Nishihara, Hidenori; Ozouf-Costaz, Catherine; Penman, David J; Przybylski, Dariusz; Rakotomanga, Michaelle; Renn, Suzy C P; Ribeiro, Filipe J; Ron, Micha; Salzburger, Walter; Sanchez-Pulido, Luis; Santos, M Emilia; Searle, Steve; Sharpe, Ted; Swofford, Ross; Tan, Frederick J; Williams, Louise; Young, Sarah; Yin, Shuangye; Okada, Norihiro; Kocher, Thomas D; Miska, Eric A; Lander, Eric S; Venkatesh, Byrappa; Fernald, Russell D; Meyer, Axel; Ponting, Chris P; Streelman, J Todd; Lindblad-Toh, Kerstin; Seehausen, Ole; Di Palma, Federica
2014-09-18
Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification. PMID:25186727
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mark A. Eiteman, PhD; Elliot Altman, PhD
2009-02-11
As part of preliminary research efforts, we have completed several experiments which demonstrate 'proof of concept.' These experiments addressed the following three questions: (1) Can a synthetic mixed sugar solution of glucose and xylose be efficiently consumed using the multi-organism approach? (2) Can this approach be used to accumulate a model product? (3) Can this approach be applied to the removal of an inhibitor, acetate, selectively from mixtures of xylose and glucose? To answer the question of whether this multi-organism approach can effectively consume synthetic mixed sugar solutions, we first tested substrate-selective uptake using two strains, one unable to consume glucose and one unable to consume xylose. The xylose-selective strain ALS998 has mutations in the three genes involved in glucose uptake, rendering it unable to consume glucose: ptsG codes for the Enzyme IICB^Glc of the phosphotransferase system (PTS) for carbohydrate transport (Postma et al., 1993), manZ codes for the IID^Man domain of the mannose PTS permease (Huber, 1996), and glk codes for glucokinase (Curtis and Epstein 1975). We also constructed strain ALS1008, which has a knockout in the xylA gene encoding xylose isomerase, rendering ALS1008 unable to consume xylose. Two batch experiments and one continuous bioprocess were completed. In the first experiment, each strain was grown separately in a defined medium of 8 g/L xylose and 15 g/L glucose, which represented xylose and glucose concentrations that can be generated by actual biomass. In the second experiment, the two strains were grown together in batch in the same defined, mixed-sugar medium. In a third experiment, we grew the strains continuously in a 'chemostat', except that we shifted the concentrations of glucose and xylose periodically to observe how the system would respond. (For example, we shifted the glucose concentration suddenly from 15 g/L to 30 g/L in the feed.)
Primer development to obtain complete coding sequence of HA and NA genes of influenza A/H3N2 virus.
Agustiningsih, Agustiningsih; Trimarsanto, Hidayat; Setiawaty, Vivi; Artika, I Made; Muljono, David Handojo
2016-08-30
Influenza is an acute respiratory illness and has become a serious public health problem worldwide. The need to study the HA and NA genes in influenza A virus is essential since these genes frequently undergo mutations. This study describes the development of primer sets for RT-PCR to obtain complete coding sequence of Hemagglutinin (HA) and Neuraminidase (NA) genes of influenza A/H3N2 virus from Indonesia. The primers were developed based on influenza A/H3N2 sequence worldwide from Global Initiative on Sharing All Influenza Data (GISAID) and further tested using Indonesian influenza A/H3N2 archived samples of influenza-like illness (ILI) surveillance from 2008 to 2009. An optimum RT-PCR condition was acquired for all HA and NA fragments designed to cover complete coding sequence of HA and NA genes. A total of 71 samples were successfully sequenced for complete coding sequence both of HA and NA genes out of 145 samples of influenza A/H3N2 tested. The developed primer sets were suitable for obtaining complete coding sequences of HA and NA genes of Indonesian samples from 2008 to 2009.
Usein, C R; Damian, M; Tatu-Chitoiu, D; Capusa, C; Fagaras, R; Tudorache, D; Nica, M; Le Bouguénec, C
2001-01-01
A total of 78 E. coli strains isolated from adults with different types of urinary tract infections were screened by polymerase chain reaction for prevalence of genetic regions coding for virulence factors. The targeted genetic determinants were those coding for type 1 fimbriae (fimH), pili associated with pyelonephritis (pap), S and F1C fimbriae (sfa and foc), afimbrial adhesins (afa), hemolysin (hly), cytotoxic necrotizing factor (cnf), and aerobactin (aer). Among the studied strains, the prevalence of genes coding for fimbrial adhesive systems was 86%, 36%, and 23% for fimH, pap, and sfa/foc, respectively. The operons coding for Afa afimbrial adhesins were identified in 14% of strains. The hly and cnf genes coding for toxins were amplified in 23% and 13% of strains, respectively. A prevalence of 54% was found for the aer gene. The various combinations of detected genes were designated as virulence patterns. The strains isolated from the hospitalized patients displayed a greater number of virulence genes and a diversity of gene associations compared to the strains isolated from the ambulatory subjects. A rapid assessment of the bacterial pathogenicity characteristics may contribute to a better medical approach to patients with urinary tract infections.
Molecular codes for neuronal individuality and cell assembly in the brain
Yagi, Takeshi
2012-01-01
The brain contains an enormous, but finite, number of neurons. The ability of this limited number of neurons to produce nearly limitless neural information over a lifetime is typically explained by combinatorial explosion; that is, by the exponential amplification of each neuron's contribution through its incorporation into “cell assemblies” and neural networks. In development, each neuron expresses diverse cellular recognition molecules that permit the formation of the appropriate neural cell assemblies to elicit various brain functions. The mechanism for generating neuronal assemblies and networks must involve molecular codes that give neurons individuality and allow them to recognize one another and join appropriate networks. The extensive molecular diversity of cell-surface proteins on neurons is likely to contribute to their individual identities. The clustered protocadherins (Pcdh) are a large subfamily within the diverse cadherin superfamily. The clustered Pcdh genes are encoded in tandem by three gene clusters, and are present in all known vertebrate genomes. The set of clustered Pcdh genes is expressed in a random and combinatorial manner in each neuron. In addition, cis-tetramers composed of heteromultimeric clustered Pcdh isoforms represent selective binding units for cell-cell interactions. Here I present the mathematical probabilities for neuronal individuality based on the random and combinatorial expression of clustered Pcdh isoforms and their formation of cis-tetramers in each neuron. Notably, clustered Pcdh gene products are known to play crucial roles in correct axonal projections, synaptic formation, and neuronal survival. Their molecular and biological features suggest the hypothesis that the diverse clustered Pcdh molecules provide the molecular code by which neuronal individuality and cell assembly permit the combinatorial explosion of networks that supports the enormous processing capability and plasticity of the brain. PMID:22518100
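The kind of counting argument this abstract refers to can be illustrated with a minimal Python sketch. The abstract gives no numbers, so the isoform count (50) and the number expressed per neuron (10) below are purely hypothetical placeholders, as are the function names. Treating a cis-tetramer as an unordered set of four isoforms with repeats allowed is also an assumption made only for illustration.

```python
from math import comb

def expression_combinations(n_isoforms: int, k_expressed: int) -> int:
    """Number of distinct isoform sets a neuron could express if it picks
    k_expressed isoforms at random, without repeats, from n_isoforms."""
    return comb(n_isoforms, k_expressed)

def cis_tetramer_types(k_expressed: int) -> int:
    """Number of distinct tetramers that can be assembled from the expressed
    isoforms, counted as multisets of size 4 (order ignored, repeats allowed)."""
    return comb(k_expressed + 3, 4)

def same_identity_probability(n_isoforms: int, k_expressed: int) -> float:
    """Chance that two neurons choosing independently and uniformly
    end up with exactly the same isoform set."""
    return 1.0 / expression_combinations(n_isoforms, k_expressed)

# Hypothetical values, not taken from the abstract.
N_ISOFORMS = 50
K_EXPRESSED = 10

print(expression_combinations(N_ISOFORMS, K_EXPRESSED))
print(cis_tetramer_types(K_EXPRESSED))
print(same_identity_probability(N_ISOFORMS, K_EXPRESSED))
```

Even with these small placeholder values, the number of possible expression sets is in the billions, which is the sense in which random combinatorial expression could give each neuron a near-unique identity.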
Exceptionally long 5' UTR short tandem repeats specifically linked to primates.
Namdar-Aligoodarzi, P; Mohammadparast, S; Zaker-Kandjani, B; Talebi Kakroodi, S; Jafari Vesiehsari, M; Ohadi, M
2015-09-10
We have previously reported genome-scale short tandem repeats (STRs) in the core promoter interval (i.e. -120 to +1 to the transcription start site) of protein-coding genes that have evolved identically in primates vs. non-primates. Those STRs may function as evolutionary switch codes for primate speciation. In the current study, we used the Ensembl database to analyze the 5' untranslated region (5' UTR) between +1 and +60 of the transcription start site of all human protein-coding genes annotated in the GeneCards database, in order to identify "exceptionally long" STRs (≥5-repeats), which may be of selective/adaptive advantage. The importance of this critical interval is its function as core promoter, and its effect on transcription and translation. In order to minimize ascertainment bias, we analyzed the evolutionary status of the human 5' UTR STRs of ≥5-repeats in several species encompassing six major orders and superorders across mammals, including primates, rodents, Scandentia, Laurasiatheria, Afrotheria, and Xenarthra. We introduce primate-specific STRs, and STRs which have expanded from mouse to primates. Identical co-occurrence of the identified STRs of rare average frequency between 0.006 and 0.0001 in primates supports a role for those motifs in processes that diverged primates from other mammals, such as neuronal differentiation (e.g. APOD and FGF4), and craniofacial development (e.g. FILIP1L). A number of the identified STRs of ≥5-repeats may be human-specific (e.g. ZMYM3 and DAZAP1). Future work is warranted to examine the importance of the listed genes in primate/human evolution, development, and disease.
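As a rough illustration of what scanning a sequence window for STRs of ≥5 repeats can look like, the following Python sketch uses a regular expression with a backreference. It is not the authors' pipeline. The function name, the 1 to 6 bp limit on repeat-unit length, and the example sequence are assumptions introduced here.

```python
import re

def find_strs(seq: str, min_repeats: int = 5, max_unit: int = 6):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    The unit is matched lazily, so the shortest repeating unit is reported
    (e.g. 'A' x 6 rather than 'AA' x 3). max_unit is an assumed cutoff.
    """
    seq = seq.upper()
    # group 1 = repeat unit; \1{n,} = at least n further copies of that unit
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical test window: a (GC)5 repeat and an A7 homopolymer.
window = "TT" + "GC" * 5 + "TT" + "A" * 7 + "C"
print(find_strs(window))
```

The sketch only finds perfect repeats. Interrupted or imperfect STRs would need a more tolerant matcher.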
Pietan, Lucas L.; Spradling, Theresa A.
2016-01-01
In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589
Yatabe, Yoko; Kane, Nolan C.; Scotti-Saintagne, Caroline; Rieseberg, Loren H.
2007-01-01
Plant species may remain morphologically distinct despite gene exchange with congeners, yet little is known about the genomewide pattern of introgression among species. Here we analyze the effects of persistent gene flow on genomic differentiation between the sympatric sunflower species Helianthus annuus and H. petiolaris. While the species are strongly isolated in testcrosses, genetic distances at 108 microsatellite loci and 14 sequenced genes are highly variable and much lower (on average) than for more closely related but historically allopatric congeners. Our analyses failed to detect a positive association between levels of genetic differentiation and chromosomal rearrangements (as reported in a prior publication) or proximity to QTL for morphological differences or hybrid sterility. However, a significant increase in differentiation was observed for markers within 5 cM of chromosomal breakpoints. Together, these results suggest that islands of differentiation between these two species are small, except in areas of low recombination. Furthermore, only microsatellites associated with ESTs were identified as outlier loci in tests for selection, which might indicate that the ESTs themselves are the targets of selection rather than linked genes (or that coding regions are not randomly distributed). In general, these results indicate that even strong and genetically complex reproductive barriers cannot prevent widespread introgression. PMID:17277373
Cellular miR-2909 RNomics governs the genes that ensure immune checkpoint regulation.
Kaul, Deepak; Malik, Deepti; Wani, Sameena
2018-06-20
Cross-talk between coding RNAs and regulatory non-coding microRNAs, within human genome, has provided compelling evidence for the existence of flexible checkpoint control of T-Cell activation. The present study attempts to demonstrate that the interplay between miR-2909 and its effector KLF4 gene has the inherent capacity to regulate genes coding for CTLA4, CD28, CD40, CD134, PDL1, CD80, CD86, IL-6 and IL-10 within normal human peripheral blood mononuclear cells (PBMCs). Based upon these findings, we propose a pathway that links miR-2909 RNomics with the genes coding for immune checkpoint regulators required for the maintenance of immune homeostasis.
2010-01-01
Background: Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results: Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions: A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
The Escherichia coli supX locus is topA, the structural gene for DNA topoisomerase I.
Margolin, P; Zumstein, L; Sternglanz, R; Wang, J C
1985-01-01
Mutations in the supX locus, which result in the absence of DNA topoisomerase I enzyme activity in both Salmonella typhimurium and Escherichia coli, are all selected as suppressors of the leu-500 promoter mutation in S. typhimurium. To determine whether the supX locus is the structural gene topA for the DNA topoisomerase I enzyme or is a positive-acting regulator/activator gene for a nearby topA structural gene, nonsense mutations were selected in the E. coli supX gene carried on an F' episome in S. typhimurium cells. The cysB-topA region of the episomes with nonsense-mutant supX alleles were then cloned onto plasmid pBR322 and transformed into E. coli cells lacking a chromosomal supX gene. Three such E. coli strains, each carrying cloned DNA from episomes with different nonsense-mutant supX alleles, all lacked DNA topoisomerase I activity but expressed antigenic determinants specific to the enzyme; control cells lacked both enzyme activity and antigenic determinants. Maxicell studies of plasmid-coded proteins demonstrated the absence of the DNA topoisomerase I protein (100 kDa) in the three strains but the appearance of a new smaller peptide in each (36, 47, and 64 kDa). These new peptides must represent fragments of the enzyme resulting from translation termination at the supX nonsense codons and confirm the interpretation that the supX gene is topA, the structural gene for DNA topoisomerase I. PMID:2991925
What makes up plant genomes: The vanishing line between transposable elements and genes.
Zhao, Dongyan; Ferguson, Ann A; Jiang, Ning
2016-02-01
The ultimate source of evolution is mutation. As the largest component in plant genomes, transposable elements (TEs) create numerous types of mutations that cannot be mimicked by other genetic mechanisms. When TEs insert into genomic sequences, they influence the expression of nearby genes as well as genes unlinked to the insertion. TEs can duplicate, mobilize, and recombine normal genes or gene fragments, with the potential to generate new genes or modify the structure of existing genes. TEs also donate their transposase coding regions for cellular functions in a process called TE domestication. Despite the host defense against TE activity, a subset of TEs survived and thrived through discreet selection of transposition activity, target site, element size, and the internal sequence. Finally, TEs have established strategies to reduce the efficacy of host defense system by increasing the cost of silencing TEs. This review discusses the recent progress in the area of plant TEs with a focus on the interaction between TEs and genes.
Positive selection on the killer whale mitogenome.
Foote, Andrew D; Morin, Phillip A; Durban, John W; Pitman, Robert L; Wade, Paul; Willerslev, Eske; Gilbert, M Thomas P; da Fonseca, Rute R
2011-02-23
Mitochondria produce up to 95 per cent of the eukaryotic cell's energy. The coding genes of the mitochondrial DNA may therefore evolve under selection owing to metabolic requirements. The killer whale, Orcinus orca, is polymorphic, has a global distribution and occupies a range of ecological niches. It is therefore a suitable organism for testing this hypothesis. We compared a global dataset of the complete mitochondrial genomes of 139 individuals for amino acid changes that were associated with radical physico-chemical property changes and were influenced by positive selection. Two such selected non-synonymous amino acid changes were found; one in each of two ecotypes that inhabit the Antarctic pack ice. Both substitutions were associated with changes in local polarity, increased steric constraints and α-helical tendencies that could influence overall metabolic performance, suggesting a functional change.
Autism-like behavioral phenotypes in BTBR T+tf/J mice.
McFarlane, H G; Kusek, G K; Yang, M; Phoenix, J L; Bolivar, V J; Crawley, J N
2008-03-01
Autism is a behaviorally defined neurodevelopmental disorder of unknown etiology. Mouse models with face validity to the core symptoms offer an experimental approach to test hypotheses about the causes of autism and translational tools to evaluate potential treatments. We discovered that the inbred mouse strain BTBR T+tf/J (BTBR) incorporates multiple behavioral phenotypes relevant to all three diagnostic symptoms of autism. BTBR displayed selectively reduced social approach, low reciprocal social interactions and impaired juvenile play, as compared with C57BL/6J (B6) controls. Impaired social transmission of food preference in BTBR suggests communication deficits. Repetitive behaviors appeared as high levels of self-grooming by juvenile and adult BTBR mice. Comprehensive analyses of procedural abilities confirmed that social recognition and olfactory abilities were normal in BTBR, with no evidence for high anxiety-like traits or motor impairments, supporting an interpretation of highly specific social deficits. Database comparisons between BTBR and B6 on 124 putative autism candidate genes showed several interesting single nucleotide polymorphisms (SNPs) in the BTBR genetic background, including a nonsynonymous coding region polymorphism in Kmo. The Kmo gene encodes kynurenine 3-hydroxylase, an enzyme-regulating metabolism of kynurenic acid, a glutamate antagonist with neuroprotective actions. Sequencing confirmed this coding SNP in Kmo, supporting further investigation into the contribution of this polymorphism to autism-like behavioral phenotypes. Robust and selective social deficits, repetitive self-grooming, genetic stability and commercial availability of the BTBR inbred strain encourage its use as a research tool to search for background genes relevant to the etiology of autism, and to explore therapeutics to treat the core symptoms.
Maruyama, Atsushi; Mimura, Junsei; Itoh, Ken
2014-01-01
Recent studies have disclosed the function of enhancer RNAs (eRNAs), which are long non-coding RNAs transcribed from gene enhancer regions, in transcriptional regulation. However, it remains unclear whether eRNAs are involved in the regulation of human heme oxygenase-1 gene (HO-1) induction. Here, we report that multiple nuclear-enriched eRNAs are transcribed from the regions adjacent to two human HO-1 enhancers (i.e. the distal E2 and proximal E1 enhancers), and some of these eRNAs are induced by the oxidative stress-causing reagent diethyl maleate (DEM). We demonstrated that the expression of one forward direction (5′ to 3′) eRNA transcribed from the human HO-1 E2 enhancer region (named human HO-1 enhancer RNA E2-3; hereafter called eRNA E2-3) was induced by DEM in an NRF2-dependent manner in HeLa cells. Conversely, knockdown of BACH1, a repressor of HO-1 transcription, further increased DEM-inducible eRNA E2-3 transcription as well as HO-1 expression. In addition, we showed that knockdown of eRNA E2-3 selectively down-regulated DEM-induced HO-1 expression. Furthermore, eRNA E2-3 knockdown attenuated DEM-induced Pol II binding to the promoter and E2 enhancer regions of HO-1 without affecting NRF2 recruitment to the E2 enhancer. These findings indicate that eRNA E2-3 is functional and is required for HO-1 induction. PMID:25404134
Repressor-mediated tissue-specific gene expression in plants
Meagher, Richard B [Athens, GA; Balish, Rebecca S [Oxford, OH; Tehryung, Kim [Athens, GA; McKinney, Elizabeth C [Athens, GA
2009-02-17
Plant tissue-specific gene expression by way of repressor-operator complexes has enabled outcomes including, without limitation, male sterility and engineered plants having root-specific gene expression of relevant proteins to clean environmental pollutants from soil and water. A mercury hyperaccumulation strategy requires that the mercuric ion reductase coding sequence be strongly expressed. The actin promoter vector, A2pot, engineered to contain bacterial lac operator sequences, directed strong expression in all plant vegetative organs and tissues. In contrast, the expression from the A2pot construct was restricted primarily to root tissues when a modified bacterial repressor (LacIn) was coexpressed from the light-regulated rubisco small subunit promoter in above-ground tissues. Also provided are analogous repressor-operator complexes for selective expression in other plant tissues, for example, to produce male sterile plants.
flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection
Stanley, Craig E.; Kulathinal, Rob J.
2016-01-01
With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster’s breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1–1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. 
We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of biologists to creatively use such genomic big data resources in an integrative manner. PMID:27226167
flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection.
Stanley, Craig E; Kulathinal, Rob J
2016-08-09
With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster's breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1-1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. 
flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of biologists to creatively use such genomic big data resources in an integrative manner. Copyright © 2016 Stanley and Kulathinal.
Vouille, V; Amiche, M; Nicolas, P
1997-09-01
We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, wherein exon 1 encoded a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contained the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibian.
Wu, Shengru; Liu, Yanli; Guo, Wei; Cheng, Xi; Ren, Xiaochun; Chen, Si; Li, Xueyuan; Duan, Yongle; Sun, Qingzhu; Yang, Xiaojun
2018-06-27
The liver is mainly hematopoietic in the embryo, and converts into a major metabolic organ in the adult. Therefore, it is intensively remodeled after birth to adapt and perform adult functions. Long non-coding RNAs (lncRNAs) are involved in organ development and cell differentiation, and they likely have roles in regulating postnatal liver development. Herein, in order to understand the roles of lncRNAs in postnatal liver maturation, we analyzed the lncRNA and mRNA expression profiles in immature and mature livers from one-day-old and adult (40 weeks of age) breeder roosters by Ribo-Zero RNA-Sequencing. Around 21,939 protein-coding genes and 2220 predicted lncRNAs were expressed in livers of breeder roosters. Compared to protein-coding genes, the identified chicken lncRNAs had fewer exons, shorter transcript length, and significantly lower expression levels. Notably, in comparison between the livers of newborn and adult breeder roosters, a total of 1570 mRNAs and 214 lncRNAs were differentially expressed with the criteria of log2 fold change > 1 or < -1 and P values < 0.05, which were validated by qPCR using five randomly selected mRNAs and five lncRNAs. Further GO and KEGG analyses revealed that the differentially expressed mRNAs were involved in hepatic metabolic and immune functional changes, as well as some biological processes and pathways, including cell proliferation, apoptosis and cell cycle, that are implicated in the development of the liver. We also investigated the cis- and trans-regulatory effects of differentially expressed lncRNAs on their target genes. GO and KEGG analyses indicated that the neighboring protein-coding genes and trans-regulated genes of these lncRNAs were associated with adaptation to adult hepatic functions, as well as some pathways involved in liver development, such as the cell cycle pathway, Notch signaling pathway, Hedgehog signaling pathway, and Wnt signaling pathway. 
This study provides a catalog of mRNAs and lncRNAs related to postnatal liver maturation in the chicken, and will contribute to a fuller understanding of the biological processes and signaling pathways in which differentially expressed genes and lncRNAs could take part during the significant functional transition of postnatal liver development.
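The differential-expression criteria quoted in the abstract above (log2 fold change > 1 or < -1, and P < 0.05) can be illustrated with a minimal Python sketch. The function name, the argument names, and the example values are hypothetical and are not taken from the study; the sketch also assumes strictly positive mean expression values and an already-computed P value.

```python
import math

def is_differentially_expressed(mean_newborn, mean_adult, p_value,
                                lfc_cutoff=1.0, p_cutoff=0.05):
    """Apply the |log2 fold change| > 1 and P < 0.05 criteria.

    All inputs are illustrative: mean expression in each group
    (must be > 0) and a P value computed elsewhere.
    """
    log2_fc = math.log2(mean_adult / mean_newborn)
    return abs(log2_fc) > lfc_cutoff and p_value < p_cutoff

# Hypothetical transcripts: (newborn mean, adult mean, P value)
examples = {
    "transcript_up": (10.0, 40.0, 0.01),     # log2 FC = 2, small P
    "transcript_flat": (10.0, 15.0, 0.001),  # fold change too small
    "transcript_noisy": (10.0, 2.0, 0.2),    # large change, P too high
}
calls = {name: is_differentially_expressed(*vals)
         for name, vals in examples.items()}
```

Both conditions must hold at once, so a large fold change with a weak P value is not called differentially expressed.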
Abend, M; Pfeiffer, R M; Ruf, C; Hatch, M; Bogdanova, T I; Tronko, M D; Hartmann, J; Meineke, V; Mabuchi, K; Brenner, A V
2013-10-15
A strong, consistent association between childhood irradiation and subsequent thyroid cancer provides an excellent model for studying radiation carcinogenesis. We evaluated gene expression in 63 paired RNA specimens from frozen normal and tumour thyroid tissues with individual iodine-131 (I-131) doses (0.008-8.6 Gy, no unirradiated controls) received from Chernobyl fallout during childhood (Ukrainian-American cohort). Approximately half of these randomly selected samples (32 tumour/normal tissue RNA specimens) were hybridised on 64 whole-genome microarrays (Agilent, 4 × 44 K). Associations between I-131 dose and gene expression were assessed separately in normal and tumour tissues using Kruskal-Wallis and linear trend tests. Of 155 genes significantly associated with I-131 after Bonferroni correction and with ≥2-fold increase per dose category, we selected 95 genes. On the remaining 31 RNA samples these genes were used for validation purposes using qRT-PCR. Expression of eight genes (ABCC3, C1orf9, C6orf62, FGFR1OP2, HEY2, NDOR1, STAT3, and UCP3) in normal tissue and six genes (ANKRD46, CD47, HNRNPH1, NDOR1, SCEL, and SERPINA1) in tumour tissue was significantly associated with I-131. PANTHER/DAVID pathway analyses demonstrated significant over-representation of genes coding for nucleic acid binding in normal and tumour tissues, and for p53, EGF, and FGF signalling pathways in tumour tissue. The multistep process of radiation carcinogenesis begins in histologically normal thyroid tissue and may involve dose-dependent gene expression changes.
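As a rough illustration of the Bonferroni correction mentioned in the abstract above, the following Python sketch compares each P value against the significance level divided by the number of tests. The significance level of 0.05 and the example P values are assumptions for illustration only; the abstract does not state the level used.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return one boolean per test: True if P < alpha / number of tests.

    alpha=0.05 is an assumed family-wise significance level.
    """
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical P values for three tests; the threshold is 0.05 / 3.
flags = bonferroni_significant([0.001, 0.02, 0.04])
```

With thousands of probes on a whole-genome array the per-test threshold becomes very small, which is why only a modest number of genes survives the correction.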
Li, Jia; Hu, Erliang; Chen, Xueying; Xu, Jie; Lan, Hai; Li, Chuan; Hu, Yaodong; Lu, Yanli
2016-05-01
Proteins of the DUF1313 family contain a highly conserved domain and are only found in plants; they play important roles in most plant functions. In this study, 269 DUF1313 genes from 81 photoautotrophic species were identified; they were classified into three major types based on the amino acid substitutions in the conserved region: IARV, I(S/T/F)(K/R)V, and IRRV. Phylogenic tree constructed from 51 DUF1313 genes from graminoids revealed three clades: A, B1, and B2. Clade B1 was found to have undergone episodic positive selection after a gene duplication event and included four amino acid sites under positive selection. The association between DUF1313 family members and traits investigated in maize indicated that three of four genes (GRMZM2G025646, GRMZM5G877647, GRMZM2G359322, and GRMZM2G382774) were associated with the target traits such as days to silking, days to tasselling, and plant height. The nucleotide diversity of the most primitive and highly conserved DUF1313 gene, ELF4-like4, was the highest in Tripsacum and the lowest in maize. Tajima's D and Fu and Li's D tests revealed that significant purifying selection had occurred in the coding sequence region of this DUF1313 gene in teosinte and maize. No significant signal was detected in the 5'-untranslated region of this gene in each of the three species (maize, teosinte, and Tripsacum) or in any gene regions of Tripsacum. Phylogenetic analyses revealed that the 103 accessions of maize, teosinte, and Tripsacum can be grouped into four clades based on the ELF4-like4 gene sequence similarity. Thus, this gene can be used to determine the relationships between maize and its relatives, and the DUF1313 family members and alleles identified in this study might be valuable genetic resources for molecular marker-assisted breeding in maize. Copyright © 2016 Elsevier Inc. All rights reserved.
Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong
2017-01-30
A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. In total, 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, respectively, which formed 878 haplotype blocks. There were more blocks in the B genome, but their average block size was significantly (P < 0.05) smaller than that in the A genome. Intense selection (domestication and breeding) had a stronger effect on the A than on the B genome chromosomes. Based on the genetic pedigrees, many blocks can be traced back to a well-known Strampelli cross, which was made one century ago. Furthermore, polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker-assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.
Lyu, Yuping; Wu, Xiaoqing; Ren, He; Zhou, Fangyuan; Zhou, Hongzi; Zhang, Xinjian; Yang, Hetong
2017-10-01
An appropriate reference gene is required to get reliable results from gene expression analysis by quantitative real-time reverse transcription PCR (qRT-PCR). In order to identify stable and reliable reference genes in Trichoderma afroharzianum under oxalic acid (OA) stress, six commonly used housekeeping genes, i.e., elongation factor 1, ubiquitin, ubiquitin-conjugating enzyme, glyceraldehyde-3-phosphate dehydrogenase, α-tubulin, actin, from the effective biocontrol isolate T. afroharzianum strain LTR-2 were tested for their expression during growth in liquid culture amended with OA. Four in silico programs (comparative ΔCt, NormFinder, geNorm and BestKeeper) were used to evaluate the expression stabilities of six candidate reference genes. The elongation factor 1 gene EF-1 was identified as the most stably expressed reference gene, and was used as the normalizer to quantify the expression level of the oxalate decarboxylase coding gene OXDC in T. afroharzianum strain LTR-2 under OA stress. The result showed that the expression of OXDC was significantly up-regulated as expected. This study provides an effective method to quantify expression changes of target genes in T. afroharzianum under OA stress. Copyright © 2017 Elsevier B.V. All rights reserved.
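One of the four stability measures named in the abstract above, the comparative ΔCt approach, can be sketched in Python as follows. This is a simplified reading of the idea (for each candidate, average the standard deviation of its Ct difference against every other candidate across samples; lower means more stable), not the authors' pipeline. The gene labels and Ct values are hypothetical.

```python
from statistics import mean, stdev

def delta_ct_stability(ct_values):
    """Mean SD of pairwise delta-Ct for each candidate gene.

    ct_values maps a gene label to its Ct values, one per sample,
    with samples in the same order for every gene.
    """
    scores = {}
    for gene, cts in ct_values.items():
        sds = []
        for other, other_cts in ct_values.items():
            if other == gene:
                continue
            deltas = [a - b for a, b in zip(cts, other_cts)]
            sds.append(stdev(deltas))
        scores[gene] = mean(sds)
    return scores

# Hypothetical Ct values for three candidates over four samples.
ct = {
    "gene_a": [20.0, 20.1, 19.9, 20.0],
    "gene_b": [18.0, 18.4, 17.7, 18.1],
    "gene_c": [22.0, 23.5, 21.0, 24.0],
}
scores = delta_ct_stability(ct)
ranking = sorted(scores, key=scores.get)  # most stable first
```

A candidate whose Ct drifts independently of the others, such as gene_c here, accumulates large pairwise standard deviations and ranks last.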
Zhang, Rui; Deng, Patricia; Jacobson, Dionna; Li, Jin Billy
2017-02-01
Adenosine-to-inosine RNA editing diversifies the transcriptome and promotes functional diversity, particularly in the brain. A plethora of editing sites has been recently identified; however, how they are selected and regulated and which are functionally important are largely unknown. Here we show the cis-regulation and stepwise selection of RNA editing during Drosophila evolution and pinpoint a large number of functional editing sites. We found that the establishment of editing and variation in editing levels across Drosophila species are largely explained and predicted by cis-regulatory elements. Furthermore, editing events that arose early in the species tree tend to be more highly edited in clusters and enriched in slowly-evolved neuronal genes, thus suggesting that the main role of RNA editing is for fine-tuning neurological functions. While nonsynonymous editing events have been long recognized as playing a functional role, in addition to nonsynonymous editing sites, a large fraction of 3'UTR editing sites is evolutionarily constrained, highly edited, and thus likely functional. We find that these 3'UTR editing events can alter mRNA stability and affect miRNA binding and thus highlight the functional roles of noncoding RNA editing. Our work, through evolutionary analyses of RNA editing in Drosophila, uncovers novel insights of RNA editing regulation as well as its functions in both coding and non-coding regions.
Jacobson, Dionna
2017-01-01
Adenosine-to-inosine RNA editing diversifies the transcriptome and promotes functional diversity, particularly in the brain. A plethora of editing sites has been recently identified; however, how they are selected and regulated and which are functionally important are largely unknown. Here we show the cis-regulation and stepwise selection of RNA editing during Drosophila evolution and pinpoint a large number of functional editing sites. We found that the establishment of editing and variation in editing levels across Drosophila species are largely explained and predicted by cis-regulatory elements. Furthermore, editing events that arose early in the species tree tend to be more highly edited in clusters and enriched in slowly-evolved neuronal genes, thus suggesting that the main role of RNA editing is for fine-tuning neurological functions. While nonsynonymous editing events have been long recognized as playing a functional role, in addition to nonsynonymous editing sites, a large fraction of 3’UTR editing sites is evolutionarily constrained, highly edited, and thus likely functional. We find that these 3’UTR editing events can alter mRNA stability and affect miRNA binding and thus highlight the functional roles of noncoding RNA editing. Our work, through evolutionary analyses of RNA editing in Drosophila, uncovers novel insights of RNA editing regulation as well as its functions in both coding and non-coding regions. PMID:28166241
XGC developments for a more efficient XGC-GENE code coupling
NASA Astrophysics Data System (ADS)
Dominski, Julien; Hager, Robert; Ku, Seung-Hoe; Chang, Cs
2017-10-01
In the Exascale Computing Program, the High-Fidelity Whole Device Modeling project initially aims at delivering a tightly-coupled simulation of plasma neoclassical and turbulence dynamics from the core to the edge of the tokamak. To permit such simulations, the gyrokinetic codes GENE and XGC will be coupled together. Numerical efforts are made to improve the agreement of the numerical schemes in the coupling region. One of the difficulties of coupling these codes is the incompatibility of their grids. GENE is a continuum grid-based code and XGC is a Particle-In-Cell code using an unstructured triangular mesh. A field-aligned filter is thus implemented in XGC. Even though XGC originally had an approximately field-following mesh, this field-aligned filter yields a perturbation discretization closer to the one solved in the field-aligned code GENE. Additionally, new XGC gyro-averaging matrices are implemented on a velocity grid adapted to the plasma properties, thus ensuring the same accuracy from the core to the edge regions.
Khan, Haseeb Ahmad
2004-01-01
The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform.
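To make the Wilcoxon signed-rank test discussed in the abstract above concrete, here is a minimal Python sketch. It is not ArraySolver's implementation (which the abstract describes as Excel-based); it is an illustrative version under stated assumptions: zero differences are dropped, tied absolute differences receive average ranks, and the two-sided P value is computed exactly by counting sign assignments, which is practical for the small sample sizes the abstract highlights. The example data are hypothetical.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, exact two-sided P) for paired samples x and y."""
    # Paired differences; zero differences are discarded.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Exact null distribution of W+: every difference is equally likely
    # to be positive or negative. Ranks are doubled so they are integers.
    doubled = [int(round(2 * r)) for r in ranks]
    total = sum(doubled)
    counts = [1] + [0] * total  # counts[s] = sign assignments with 2*W+ == s
    for r in doubled:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    observed = int(round(2 * min(w_plus, w_minus)))
    extreme = sum(c for s, c in enumerate(counts)
                  if min(s, total - s) <= observed)
    return w_plus, w_minus, extreme / 2 ** n

# Hypothetical paired expression values for ten genes.
before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
w_plus, w_minus, p_exact = wilcoxon_signed_rank(before, after)
```

The exact enumeration avoids the normal approximation, which is one plausible reason a dedicated tool could differ from a general package on small samples; the abstract itself does not state the cause.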
2004-01-01
The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036
JavaGenes: Evolving Graphs with Crossover
NASA Technical Reports Server (NTRS)
Globus, Al; Atsatt, Sean; Lawton, John; Wipke, Todd
2000-01-01
Genetic algorithms usually use string or tree representations. We have developed a novel crossover operator for a directed and undirected graph representation, and used this operator to evolve molecules and circuits. Unlike strings or trees, a single point in the representation cannot divide every possible graph into two parts, because graphs may contain cycles. Thus, the crossover operator is non-trivial. A steady-state, tournament selection genetic algorithm code (JavaGenes) was written to implement and test the graph crossover operator. All runs were executed by cycle-scavenging on networked workstations using the Condor batch processing system. The JavaGenes code has evolved pharmaceutical drug molecules and simple digital circuits. Results to date suggest that JavaGenes can evolve moderate sized drug molecules and very small circuits in reasonable time. The algorithm has greater difficulty with somewhat larger circuits, suggesting that directed graphs (circuits) are more difficult to evolve than undirected graphs (molecules), although necessary differences in the crossover operator may also explain the results. In principle, JavaGenes should be able to evolve other graph-representable systems, such as transportation networks, metabolic pathways, and computer networks. However, large graphs evolve significantly slower than smaller graphs, presumably because the space-of-all-graphs explodes combinatorially with graph size. Since the representation strongly affects genetic algorithm performance, adding graphs to the evolutionary programmer's bag-of-tricks should be beneficial. Also, since graph evolution operates directly on the phenotype, the genotype-phenotype translation step, common in genetic algorithm work, is eliminated.
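The "steady-state, tournament selection" loop named in the abstract above can be sketched in Python on a toy problem. This sketch deliberately uses bit strings with one-point crossover, so it does not reproduce the graph crossover operator that is the abstract's actual contribution, and it is not the JavaGenes code (which, per its name, is presumably Java). The fitness function (count of 1 bits), population size, tournament size, and mutation rate are all illustrative assumptions.

```python
import random

def fitness(bits):
    """Toy fitness: number of 1 bits (an assumed stand-in objective)."""
    return sum(bits)

def tournament(population, rng, size, pick_best=True):
    """Return the index of the best (or worst) of `size` random individuals."""
    contenders = rng.sample(range(len(population)), size)
    chooser = max if pick_best else min
    return chooser(contenders, key=lambda i: fitness(population[i]))

def evolve(pop_size=30, length=20, steps=2000, size=3, seed=0):
    """Steady-state GA: each step creates one child and replaces one loser."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(steps):
        parent1 = population[tournament(population, rng, size)]
        parent2 = population[tournament(population, rng, size)]
        cut = rng.randrange(1, length)       # one-point crossover on strings
        child = parent1[:cut] + parent2[cut:]
        if rng.random() < 0.2:               # occasional single-bit mutation
            child[rng.randrange(length)] ^= 1
        # Steady state: the child overwrites the loser of another tournament,
        # so there are no discrete generations.
        population[tournament(population, rng, size, pick_best=False)] = child
    return initial_best, max(fitness(ind) for ind in population)

initial_best, final_best = evolve()
```

Because only tournament losers are overwritten, the best fitness in the population cannot decrease from one step to the next in this sketch.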
Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.
2006-01-01
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307
Optimization of algorithm of coding of genetic information of Chlamydia
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Ulyanov, Sergey S.; Zaytsev, Sergey S.; Saltykov, Yury V.; Ulianova, Onega V.
2018-04-01
A new method of coding genetic information using coherent optical fields is developed. A universal technique for transforming the nucleotide sequences of a bacterial gene into a laser speckle pattern is suggested. Reference speckle patterns are generated for the nucleotide sequences of the omp1 gene of typical wild strains of Chlamydia trachomatis genovars D, E, F, G, J and K, as well as Chlamydia psittaci serovar I. The algorithm for coding gene information into a speckle pattern is optimized. The formation of fully developed speckles with Gaussian statistics in the gene-based speckle patterns was used as the optimization criterion.
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).
Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai
2014-12-01
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes each consisting of a partial copy of COI, and terminal sequences at the two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI, and terminal sequences at the two ends of the linear mtDNA. The COI gene begins with GTG as its start codon, whereas the other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as the stop codon.
Maeda, Yasuhiro; Yamaguchi, Terufumi; Ueda, Satomi; Matsuo, Koki; Morita, Yasuyoshi; Naiki, Yoshito; Miyazato, Hajime; Shimada, Takahiro; Miyatake, Jun-Ichi; Matsuda, Mitsuhiro; Kanamaru, Akihisa
2003-07-01
In this study, we observed the expression of the GSTT-1 gene in patients with myelodysplastic syndrome (MDS) at the messenger RNA level. Reverse transcription-polymerase chain reaction (RT-PCR) for GSTT-1 was performed with a pair of primers complementary to the 5' coding section and the 3' coding section of the GSTT-1 cDNA for amplifying the 623-bp band. Among 20 patients with MDS, 8 patients showed the expected 623-bp band on RT-PCR, and 12 patients showed a 500-bp band on RT-PCR, indicating that a 123-bp sequence was deleted as a mutant of the GSTT-1 gene. Furthermore, a BLAST DNA search showed that the deletion of a 123 bp sequence creates a sequence that is 63% homologous to human FKBP-rapamycin associated protein (FRAP); this protein has been termed a mammalian target of rapamycin (mTOR). We respectively transfected the wild type and the mutant type GSTT-1 gene in an expression vector to two cell lines (K562 and HL-60). The stable transformants for the wild type and the mutant type GSTT-1 genes were made by G418 selection. Interestingly, rapamycin could induce significant growth inhibition of the stable transformants for mutant type GSTT-1, which was indicative of apoptosis, but not that of those for wild type GSTT-1. These results suggest that rapamycin could be included in the therapeutic modality for the patients with MDS who have the mTOR sequences in GSTT-1 gene.
2009-01-01
Background: In soybean somatic embryo transformation, the standard selection agent currently used is hygromycin. It may be preferable to avoid use of antibiotic resistance genes in foods. The objective of these experiments was to develop a selection system for producing transgenic soybean somatic embryos without the use of antibiotics such as hygromycin. Results: When tested against different alternate selection agents our studies show that 0.16 μg/mL glufosinate, 40 mg/L isopropylamine-glyphosate, 0.5 mg/mL (S-(2 aminoethyl)-L-cysteine) (AEC) and the acetolactate synthase (ALS) inhibitors Exceed® and Synchrony® both at 150 μg/mL inhibited soybean somatic embryo growth. Even at the concentration of 2 mg/mL, lysine+threonine (LT) were poor selection agents. The use of AEC may be preferable since it is a natural compound. Unlike the plant enzyme, dihydrodipicolinate synthase (DHPS) from E. coli is not feed-back inhibited by physiological concentrations of lysine. The dapA gene which codes for E. coli DHPS was expressed in soybean somatic embryos under the control of the CaMV 35S promoter. Following introduction of the construct into embryogenic tissue of soybean, transgenic events were recovered by incubating the tissue in liquid medium containing AEC at a concentration of 5 mM. Only transgenic soybeans were able to grow at this concentration of AEC; no escapes were observed. Conclusion: Genetically engineered soybeans expressing a lysine insensitive DHPS gene can be selected with the non-antibiotic selection agent AEC. We also report here the inhibitory effects of glufosinate, (isopropylamine-glyphosate) (Roundup®), AEC and the ALS inhibitors Exceed® and Synchrony® against different tissues of soybean. PMID:19922622
Seo, Hogyu David; Lee, Daeyoup
2018-05-15
Random mutagenesis of a target gene is commonly used to identify mutations that yield the desired phenotype. Of the methods that may be used to achieve random mutagenesis, error-prone PCR is a convenient and efficient strategy for generating a diverse pool of mutants (i.e., a mutant library). Error-prone PCR is the method of choice when a researcher seeks to mutate a pre-defined region, such as the coding region of a gene, while leaving other genomic regions unaffected. After the mutant library is amplified by error-prone PCR, it must be cloned into a suitable plasmid. The size of the library generated by error-prone PCR is constrained by the efficiency of the cloning step. However, in the fission yeast Schizosaccharomyces pombe, the cloning step can be replaced by the use of a highly efficient one-step fusion PCR to generate constructs for transformation. Mutants of desired phenotypes may then be selected using appropriate reporters. Here, we describe this strategy in detail, taking as an example a reporter inserted at centromeric heterochromatin.
Fourie, Gerda; van der Merwe, Nicolaas A; Wingfield, Brenda D; Bogale, Mesfin; Tudzynski, Bettina; Wingfield, Michael J; Steenkamp, Emma T
2013-09-08
The availability of mitochondrial genomes has allowed for the resolution of numerous questions regarding the evolutionary history of fungi and other eukaryotes. In the Gibberella fujikuroi species complex, the exact relationships among the so-called "African", "Asian" and "American" Clades remain largely unresolved, irrespective of the markers employed. In this study, we considered the feasibility of using mitochondrial genes to infer the phylogenetic relationships among Fusarium species in this complex. The mitochondrial genomes of representatives of the three Clades (Fusarium circinatum, F. verticillioides and F. fujikuroi) were characterized and we determined whether or not the mitochondrial genomes of these fungi have value in resolving the higher level evolutionary relationships in the complex. Overall, the mitochondrial genomes of the three species displayed a high degree of synteny, with all the genes (protein coding genes, unique ORFs, ribosomal RNA and tRNA genes) in identical order and orientation, as well as introns that share similar positions within genes. The intergenic regions and introns generally contributed significantly to the size differences and diversity observed among these genomes. Phylogenetic analysis of the concatenated protein-coding dataset separated members of the Gibberella fujikuroi complex from other Fusarium species and suggested that F. fujikuroi ("Asian" Clade) is basal in the complex. However, individual mitochondrial gene trees were largely incongruent with one another and with the concatenated gene tree, because six distinct phylogenetic trees were recovered from the various single gene datasets. The mitochondrial genomes of Fusarium species in the Gibberella fujikuroi complex are remarkably similar to those of the previously characterized Fusarium species and Sordariomycetes. 
Despite apparently representing a single replicative unit, all of the genes encoded on the mitochondrial genomes of these fungi do not share the same evolutionary history. This incongruence could be due to biased selection on some genes or recombination among mitochondrial genomes. The results thus suggest that the use of individual mitochondrial genes for phylogenetic inference could mask the true relationships between species in this complex.
Gundersen, Gregory W; Jones, Matthew R; Rouillard, Andrew D; Kou, Yan; Monteiro, Caroline D; Feldmann, Axel S; Hu, Kevin S; Ma'ayan, Avi
2015-09-15
Identification of differentially expressed genes is an important step in extracting knowledge from gene expression profiling studies. The raw expression data from microarray and other high-throughput technologies is deposited into the Gene Expression Omnibus (GEO) and served as Simple Omnibus Format in Text (SOFT) files. However, extracting and analyzing differentially expressed genes from GEO requires significant computational skills. Here we introduce GEO2Enrichr, a browser extension for extracting differentially expressed gene sets from GEO and analyzing those sets with Enrichr, an independent gene set enrichment analysis tool containing over 70 000 annotated gene sets organized into 75 gene-set libraries. GEO2Enrichr adds JavaScript code to GEO web-pages; this code scrapes user selected accession numbers and metadata, and then, with one click, users can submit this information to a web-server application that downloads the SOFT files, parses, cleans and normalizes the data, identifies the differentially expressed genes, and then pipes the resulting gene lists to Enrichr for downstream functional analysis. GEO2Enrichr opens a new avenue for adding functionality to major bioinformatics resources such as GEO by integrating tools and resources without the need for a plug-in architecture. Importantly, GEO2Enrichr helps researchers to quickly explore hypotheses with little technical overhead, lowering the barrier of entry for biologists by automating data processing steps needed for knowledge extraction from the major repository GEO. GEO2Enrichr is an open source tool, freely available for installation as browser extensions at the Chrome Web Store and FireFox Add-ons. Documentation and a browser independent web application can be found at http://amp.pharm.mssm.edu/g2e/. Contact: avi.maayan@mssm.edu. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Van, K; Onoda, S; Kim, M Y; Kim, K D; Lee, S-H
2008-03-01
The Waxy (Wx) gene product controls the formation of a straight chain polymer of amylose in the starch pathway. Dominance/recessiveness of the Wx allele is associated with amylose content, leading to non-waxy/waxy phenotypes. For a total of 113 foxtail millet accessions, agronomic traits and the molecular differences of the Wx gene were surveyed to evaluate genetic diversities. Molecular types were associated with phenotypes determined by four specific primer sets (non-waxy, Type I; low amylose, Type VI; waxy, Type IV or V). Additionally, the insertion of a transposable element in waxy was confirmed by ex1/TSI2R, TSI2F/ex2, ex2int2/TSI7R and TSI7F/ex4r. Seventeen single nucleotide polymorphisms (SNPs) were observed in non-coding regions, while three SNPs in coding regions were non-synonymous. Interestingly, the phenotype of No. 88 was still non-waxy, although a seven-nucleotide (AATTGGT) insertion at 2,993 bp led to a product 78 amino acids shorter. The rapid decline of r² in the sequenced region (exon 1-intron 1-exon 2) suggested a low level of linkage disequilibrium and limited haplotype structure. Ks values and estimation of evolutionary events indicate early divergence of S. italica among cereal crops. This study suggested the Wx gene was one of the targets in the selection process during domestication.
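The linkage disequilibrium statistic referred to in the abstract above (r squared between pairs of sites) can be illustrated with a minimal sketch. It assumes phased haplotypes at two biallelic sites, with alleles coded 0/1; the function name `ld_r2` and the toy haplotypes are hypothetical and are not taken from the study.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic sites.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    where a and b are 0/1 allele codes at site A and site B.
    (Hypothetical helper for illustration only.)
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Toy data (assumed, not from the paper).
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]   # alleles always co-occur
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]         # alleles are independent
print(ld_r2(complete_ld), ld_r2(no_ld))
```

Computing this for every pair of sites and plotting it against the distance between them gives the kind of decay curve from which a "rapid decline" would be read.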
Mikhailov, Alexander T; Torrado, Mario
2018-05-12
There is growing evidence that putative gene regulatory networks including cardio-enriched transcription factors, such as PITX2, TBX5, ZFHX3, and SHOX2, and their effector/target genes along with downstream non-coding RNAs can play a potentially important role in the process of adaptive and maladaptive atrial rhythm remodeling. In turn, expression of atrial fibrillation-associated transcription factors is under the control of upstream regulatory non-coding RNAs. This review broadly explores gene regulatory mechanisms associated with susceptibility to atrial fibrillation, with key examples from both animal models and patients, within the context of both cardiac transcription factors and non-coding RNAs. These two systems appear to have multiple levels of cross-regulation and act coordinately to achieve effective control of atrial rhythm effector gene expression. Perturbations of a dynamic expression balance between transcription factors and corresponding non-coding RNAs can provoke the development or promote the progression of atrial fibrillation. We also outline deficiencies in current models and discuss ongoing studies to clarify remaining mechanistic questions. An understanding of the function of transcription factors and non-coding RNAs in gene regulatory networks associated with atrial fibrillation risk will enable the development of innovative therapeutic strategies.
Methylation of miRNA genes and oncogenesis.
Loginov, V I; Rykov, S V; Fridman, M V; Braga, E A
2015-02-01
Interaction between microRNA (miRNA) and messenger RNA of target genes at the posttranscriptional level provides fine-tuned dynamic regulation of cell signaling pathways. Each miRNA can be involved in regulating hundreds of protein-coding genes, and, conversely, a number of different miRNAs usually target a structural gene. Epigenetic gene inactivation associated with methylation of promoter CpG-islands is common to both protein-coding genes and miRNA genes. Here, data on functions of miRNAs in development of tumor-cell phenotype are reviewed. Genomic organization of promoter CpG-islands of the miRNA genes located in inter- and intragenic areas is discussed. The literature and our own results on frequency of CpG-island methylation in miRNA genes from tumors are summarized, and data regarding a link between such modification and changed activity of miRNA genes and, consequently, protein-coding target genes are presented. Moreover, the impact of miRNA gene methylation on key oncogenetic processes as well as affected signaling pathways is discussed.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Abdelrahman, Hisham; ElHady, Mohamed; Alcivar-Warren, Acacia; Allen, Standish; Al-Tobasei, Rafet; Bao, Lisui; Beck, Ben; Blackburn, Harvey; Bosworth, Brian; Buchanan, John; Chappell, Jesse; Daniels, William; Dong, Sheng; Dunham, Rex; Durland, Evan; Elaswad, Ahmed; Gomez-Chiarri, Marta; Gosh, Kamal; Guo, Ximing; Hackett, Perry; Hanson, Terry; Hedgecock, Dennis; Howard, Tiffany; Holland, Leigh; Jackson, Molly; Jin, Yulin; Khalil, Karim; Kocher, Thomas; Leeds, Tim; Li, Ning; Lindsey, Lauren; Liu, Shikai; Liu, Zhanjiang; Martin, Kyle; Novriadi, Romi; Odin, Ramjie; Palti, Yniv; Peatman, Eric; Proestou, Dina; Qin, Guyu; Reading, Benjamin; Rexroad, Caird; Roberts, Steven; Salem, Mohamed; Severin, Andrew; Shi, Huitong; Shoemaker, Craig; Stiles, Sheila; Tan, Suxu; Tang, Kathy F J; Thongda, Wilawan; Tiersch, Terrence; Tomasso, Joseph; Prabowo, Wendy Tri; Vallejo, Roger; van der Steen, Hein; Vo, Khoi; Waldbieser, Geoff; Wang, Hanping; Wang, Xiaozhu; Xiang, Jianhai; Yang, Yujia; Yant, Roger; Yuan, Zihao; Zeng, Qifan; Zhou, Tao
2017-02-20
Advancing the production efficiency and profitability of aquaculture is dependent upon the ability to utilize a diverse array of genetic resources. The ultimate goals of aquaculture genomics, genetics and breeding research are to enhance aquaculture production efficiency, sustainability, product quality, and profitability in support of the commercial sector and for the benefit of consumers. In order to achieve these goals, it is important to understand the genomic structure and organization of aquaculture species, and their genomic and phenomic variations, as well as the genetic basis of traits and their interrelationships. In addition, it is also important to understand the mechanisms of regulation and evolutionary conservation at the levels of genome, transcriptome, proteome, epigenome, and systems biology. With genomic information and information between the genomes and phenomes, technologies for marker/causal mutation-assisted selection, genome selection, and genome editing can be developed for applications in aquaculture. A set of genomic tools and resources must be made available including reference genome sequences and their annotations (including coding and non-coding regulatory elements), genome-wide polymorphic markers, efficient genotyping platforms, high-density and high-resolution linkage maps, and transcriptome resources including non-coding transcripts. Genomic and genetic control of important performance and production traits, such as disease resistance, feed conversion efficiency, growth rate, processing yield, behaviour, reproductive characteristics, and tolerance to environmental stressors like low dissolved oxygen, high or low water temperature and salinity, must be understood. QTL need to be identified, validated across strains, lines and populations, and their mechanisms of control understood. Causal gene(s) need to be identified. 
Genetic and epigenetic regulation of important aquaculture traits needs to be determined, and technologies for marker-assisted selection, causal gene/mutation-assisted selection, genome selection, and genome editing using CRISPR and other technologies must be developed, demonstrated to be applicable, and applied to aquaculture industries. Major progress has been made in aquaculture genomics for dozens of fish and shellfish species including the development of genetic linkage maps, physical maps, microarrays, single nucleotide polymorphism (SNP) arrays, transcriptome databases and various stages of genome reference sequences. This paper provides a general review of the current status, challenges and future research needs of aquaculture genomics, genetics, and breeding, with a focus on major aquaculture species in the United States: catfish, rainbow trout, Atlantic salmon, tilapia, striped bass, oysters, and shrimp. While the overall research priorities and the practical goals are similar across various aquaculture species, the current status in each species should dictate the next priority areas within the species. This paper is an output of the USDA Workshop for Aquaculture Genomics, Genetics, and Breeding held in late March 2016 in Auburn, Alabama, with participants from all parts of the United States.
Toren, Dmitri; Barzilay, Thomer; Tacutu, Robi; Lehmann, Gilad; Muradian, Khachik K; Fraifeld, Vadim E
2016-01-04
Mitochondria are the only organelles in the animal cells that have their own genome. Due to a key role in energy production, generation of damaging factors (ROS, heat), and apoptosis, mitochondria and mtDNA in particular have long been considered one of the major players in the mechanisms of aging, longevity and age-related diseases. The rapidly increasing number of species with fully sequenced mtDNA, together with accumulated data on longevity records, provides a new fascinating basis for comparative analysis of the links between mtDNA features and animal longevity. To facilitate such analyses and to support the scientific community in carrying these out, we developed the MitoAge database containing calculated mtDNA compositional features of the entire mitochondrial genome, mtDNA coding (tRNA, rRNA, protein-coding genes) and non-coding (D-loop) regions, and codon usage/amino acids frequency for each protein-coding gene. MitoAge includes 922 species with fully sequenced mtDNA and maximum lifespan records. The database is available through the MitoAge website (www.mitoage.org or www.mitoage.info), which provides the necessary tools for searching, browsing, comparing and downloading the data sets of interest for selected taxonomic groups across the Kingdom Animalia. The MitoAge website assists in statistical analysis of different features of the mtDNA and their correlative links to longevity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Kelsen, Judith R; Dawany, Noor; Moran, Christopher J; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F; Daly, Mark; Sullivan, Kathleen E; Baldassano, Robert N; Devoto, Marcella
2015-11-01
Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed at 5 years of age or younger, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (age, 3 wk to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by postprocessing and variant calling. After functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency less than 0.1%, and scaled combined annotation-dependent depletion scores of 10 or less. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n = 45) or adult-onset Crohn's disease (n = 20) and healthy individuals (controls, n = 145) were obtained from the University of Kiel, Germany, and used as control groups. Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling more than 1 Mbp of coding sequence, were selected from the whole-exome data. Our analysis showed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. 
Our analysis could lead to the identification of previously unidentified IBD-associated variants. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
Guo, Changjiang; Sun, Xiaoguang; Chen, Xiao; Yang, Sihai; Li, Jing; Wang, Long; Zhang, Xiaohui
2016-01-01
Most rice blast resistance genes (R-genes) encode proteins with nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domains. Our previous study has shown that more rice blast R-genes can be cloned in rapidly evolving NBS-LRR gene families. In the present study, two rapidly evolving R-gene families in rice were selected for cloning a subset of genes from their paralogs in three resistant rice lines. A total of eight functional blast R-genes were identified among nine NBS-LRR genes, and some of these showed resistance to three or more blast strains. Evolutionary analysis indicated that high nucleotide diversity of coding regions served as an important parameter in the determination of gene resistance. We also observed that amino-acid variants (nonsynonymous mutations, insertions, or deletions) in essential motifs of the NBS domain contribute to the blast resistance capacity of NBS-LRR genes. These results suggested that the NBS regions might also play an important role in resistance specificity determination. On the other hand, different splicing patterns of introns were commonly observed in R-genes. The results of the present study help improve the effectiveness of R-gene identification through evolutionary analysis and the acquisition of novel blast resistance genes.
Dey, Avishek; Samanta, Milan Kumar; Gayen, Srimonta; Sen, Soumitra K.; Maiti, Mrinal K.
2016-01-01
Drought is one of the major limiting factors for productivity of crops including rice (Oryza sativa L.). Understanding the role of allelic variations of key regulatory genes involved in stress-tolerance is essential for developing an effective strategy to combat drought. The bZIP transcription factors play a crucial role in abiotic-stress adaptation in plants via the abscisic acid (ABA) signaling pathway. The present study aimed to search for allelic polymorphism in the OsbZIP23 gene across selected drought-tolerant and drought-sensitive rice genotypes, and to characterize the new allele through overexpression (OE) and gene-silencing (RNAi). Analyses of the coding DNA sequence (CDS) of the cloned OsbZIP23 gene revealed single nucleotide polymorphism at four places and a 15-nucleotide deletion at one place. The single-copy OsbZIP23 gene is expressed at a relatively higher level in leaf tissues of drought-tolerant genotypes, and it is more abundant at the reproductive stage. Cloning and sequence analyses of the OsbZIP23-promoter from drought-tolerant O. rufipogon and drought-sensitive IR20 cultivar showed variation in the number of stress-responsive cis-elements and a 35-nucleotide deletion at the 5’-UTR in IR20. Analysis of the GFP reporter gene function revealed that the promoter activity of O. rufipogon is comparatively higher than that of IR20. The overexpression of either of the two polymorphic forms (1083 bp and 1068 bp CDS) of OsbZIP23 improved drought tolerance and yield-related traits significantly by retaining higher content of cellular water, soluble sugar and proline, and decreased membrane lipid peroxidation in comparison to RNAi lines and non-transgenic plants. The OE lines showed higher expression of target genes (OsRab16B, OsRab21 and OsLEA3-1) and increased ABA sensitivity, indicating that OsbZIP23 is a positive transcriptional-regulator of the ABA-signaling pathway. 
Taken together, the present study concludes that the enhanced gene expression rather than natural polymorphism in coding sequence of OsbZIP23 is accountable for improved drought tolerance and yield performance in rice genotypes. PMID:26959651
Silar, Philippe; Barreau, Christian; Debuchy, Robert; Kicka, Sébastien; Turcq, Béatrice; Sainsard-Chanet, Annie; Sellem, Carole H; Billault, Alain; Cattolico, Laurence; Duprat, Simone; Weissenbach, Jean
2003-08-01
A Podospora anserina BAC library of 4800 clones has been constructed in the vector pBHYG allowing direct selection in fungi. Screening of the BAC collection for centromeric sequences of chromosome V allowed the recovery of clones localized on either side of the centromere, but no BAC clone was found to contain the centromere. Seven BAC clones containing 322,195 and 156,244 bp from either side of the centromeric region were sequenced and annotated. One 5S rRNA gene, 5 tRNA genes, and 163 putative coding sequences (CDS) were identified. Among these, only six CDS seem specific to P. anserina. The gene density in the centromeric region is approximately one gene every 2.8 kb. Extrapolation of this gene density to the whole genome of P. anserina suggests that the genome contains about 11,000 genes. Synteny analyses between P. anserina and Neurospora crassa show that co-linearity extends at the most to a few genes, suggesting rapid genome rearrangements between these two species.
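The gene-density arithmetic in the abstract above can be checked with a short sketch. The sequence lengths and feature counts are those quoted in the abstract; treating all 169 annotated features (rRNA, tRNA and CDS) as "genes" is an assumption, chosen because it reproduces the reported figure of roughly one gene per 2.8 kb. The genome size is not stated in the abstract, so the sketch only back-calculates the size implied by the reported estimate of about 11,000 genes.

```python
# Figures quoted in the abstract.
SEQUENCED_BP = 322_195 + 156_244          # both sides of the centromeric region
FEATURES = {"5S rRNA": 1, "tRNA": 5, "CDS": 163}
REPORTED_GENOME_GENES = 11_000            # the abstract's extrapolated estimate


def bp_per_gene(total_bp, n_genes):
    """Average spacing between genes, in base pairs."""
    return total_bp / n_genes


# Assumption: every annotated feature counts as one gene.
n_genes = sum(FEATURES.values())
density = bp_per_gene(SEQUENCED_BP, n_genes)

# Genome size implied by the reported gene count at this density
# (a back-calculation, not a value given in the abstract).
implied_genome_bp = REPORTED_GENOME_GENES * density

print(f"{n_genes} genes in {SEQUENCED_BP} bp: one gene per {density / 1000:.1f} kb")
print(f"implied genome size: {implied_genome_bp / 1e6:.1f} Mb")
```

Counting only the 163 CDS instead would give a slightly wider spacing (about 2.9 kb), so the extrapolation is somewhat sensitive to what is counted as a gene.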
Cytokinin oxidase/dehydrogenase genes in barley and wheat: cloning and heterologous expression.
Galuszka, Petr; Frébortová, Jitka; Werner, Tomás; Yamada, Mamoru; Strnad, Miroslav; Schmülling, Thomas; Frébort, Ivo
2004-10-01
The cloning of two novel genes that encode cytokinin oxidase/dehydrogenase (CKX) in barley is described in this work. Transformation of both genes into Arabidopsis and tobacco showed that at least one of the genes codes for a functional enzyme, as its expression caused a cytokinin-deficient phenotype in the heterologous host plants. Additional cloning of two gene fragments, and an in silico search in the public expressed sequence tag clone databases, revealed the presence of at least 13 more members of the CKX gene family in barley and wheat. The expression of three selected barley genes was analyzed by RT-PCR and found to be organ-specific with peak expression in mature kernels. One barley CKX (HvCKX2) was characterized in detail after heterologous expression in tobacco. Interestingly, this enzyme shows a pH optimum at 4.5 and a preference for cytokinin ribosides as substrates, which may indicate its vacuolar targeting. Different substrate specificities, and the pH profiles of cytokinin-degrading enzymes extracted from different barley tissues, are also presented.
Zhang, M; Bai, X J
2015-05-25
The polymerase chain reaction-single-strand conformation polymorphism technique was employed to measure mononucleotide diversity in the coding region of the leptin and leptin receptor genes in the Arctic fox. The relationships between specific genetic mutations and reproductive performance in Arctic foxes were determined to improve breeding strategies. We found that a leptin gene polymorphism was significantly associated with body weight (P < 0.01), abdominal circumference (P < 0.01), and fur length (P < 0.01). Furthermore, a polymorphism in the leptin receptor gene was associated with carcass weight and guard hair length (P < 0.01). Leptin and leptin receptor gene combinatorial genotypes were significantly associated with abdominal circumference, fur length (P < 0.01), and body weight (P < 0.05). The leptin gene is thus a key gene affecting body weight, abdominal circumference, and fur length in Arctic foxes, whereas variations in the leptin receptor mainly affect carcass weight and guard hair. The marker loci identified in this study can be used to assist in the selection of Arctic foxes for breeding to raise the production performance of this species.
Widespread signatures of local mRNA folding structure selection in four Dengue virus serotypes
2015-01-01
Background: It is known that mRNA folding can affect and regulate various gene expression steps both in living organisms and in viruses. Previous studies have recognized functional RNA structures in the genome of the Dengue virus. However, these studies usually focused either on the viral untranslated regions or on very specific and limited regions at the beginning of the coding sequences, in a limited number of strains, and without considering evolutionary selection. Results: Here we performed the first large scale comprehensive genomics analysis of selection for local mRNA folding strength in the Dengue virus coding sequences, based on a total of 1,670 genomes and 4 serotypes. Our analysis identified clusters of positions along the coding regions that may undergo a conserved evolutionary selection for strong or weak local folding maintained across different viral variants. Specifically, 53-66 clusters for strong folding and 49-73 clusters for weak folding (depending on serotype), each an aggregate of positions with significantly conserved folding energy signals (related to partially overlapping local genomic regions), were recognized. In addition, up to 7% of these positions were found to be conserved in more than 90% of the viral genomes. Although some of the identified positions undergo frequent synonymous / non-synonymous substitutions, the selection for folding strength therein is preserved, and thus cannot be trivially explained based on sequence conservation alone. Conclusions: The fact that many of the positions with significant folding related signals are conserved among different Dengue variants suggests that a better understanding of the mRNA structures in the corresponding regions may promote the development of prospective anti-Dengue vaccination strategies. The comparative genomics approach described here can be employed in the future for detecting functional regions in other pathogens with very high mutation rates. PMID:26449467
Mahardika, Gusti N
2018-01-01
To expand our capacity to discover venom sequences from the genomes of venomous organisms, we applied targeted sequencing techniques to selectively recover venom gene superfamilies and nontoxin loci from the genomes of 32 cone snail species (family Conidae), a diverse group of marine gastropods that capture their prey using a cocktail of neurotoxic peptides (conotoxins). We were able to successfully recover conotoxin gene superfamilies across all species with high confidence (> 100× coverage) and used these data to provide new insights into conotoxin evolution. First, we found that conotoxin gene superfamilies are composed of one to six exons and are typically short in length (mean = ∼85 bp). Second, we expanded our understanding of the following genetic features of conotoxin evolution: 1) positive selection, where exons coding the mature toxin region were often three times more divergent than their adjacent noncoding regions, 2) expression regulation, with comparisons to transcriptome data showing that cone snails only express a fraction of the genes available in their genome (24–63%), and 3) extensive gene turnover, where Conidae species varied from 120 to 859 conotoxin gene copies. Finally, using comparative phylogenetic methods, we found that while diet specificity did not predict patterns of conotoxin evolution, dietary breadth was positively correlated with total conotoxin gene diversity. Overall, the targeted sequencing technique demonstrated here has the potential to radically increase the pace at which venom gene families are sequenced and studied, reshaping our ability to understand the impact of genetic changes on ecologically relevant phenotypes and subsequent diversification. PMID:29514313
Hawkins, Angela K; Garza, Elyssa R; Dietz, Valerie A; Hernandez, Oscar J; Hawkins, W Daryl; Burrell, A Millie
2017-01-01
Plants on serpentine soils provide extreme examples of adaptation to environment, and thus offer excellent models for the study of evolution at the molecular and genomic level. Serpentine outcrops are derived from ultramafic rock and have extremely low levels of essential plant nutrients (e.g., N, P, K, and Ca), as well as toxic levels of heavy metals (e.g., Ni, Cr, and Co) and low moisture availability. These outcrops provide habitat to a number of endemic plant species, including the annual mustard Caulanthus amplexicaulis var. barbarae (Cab) (Brassicaceae). Its sister taxon, C. amplexicaulis var. amplexicaulis (Caa), is intolerant to serpentine soils. Here, we assembled and annotated comprehensive reference transcriptomes of both Caa and Cab for use in protein coding sequence comparisons. A set of 29,443 reciprocal best BLAST hit (RBH) orthologs between Caa and Cab was compared to identify coding sequence variants, revealing a high genome-wide dN/dS ratio between the two taxa (mean = 0.346). We show that elevated dN/dS likely results from the composite effects of genetic drift, positive selection, and the relaxation of negative selection. Further, analysis of paralogs within each taxon revealed the signature of a period of elevated gene duplication (∼10 Ma) that is shared with other species of the tribe Thelypodieae, and may have played a role in the striking morphological and ecological diversity of this tribe. In addition, distribution of the synonymous substitution rate, dS, is strongly bimodal, indicating a history of reticulate evolution that may have contributed to serpentine adaptation. PMID:29220486
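For readers unfamiliar with the dN/dS ratio reported above, the following is a rough sketch of how such a ratio can be obtained for one ortholog pair. It assumes that the numbers of nonsynonymous and synonymous differences and sites have already been counted (for example by a Nei-Gojobori-style procedure) and applies a Jukes-Cantor correction. The abstract does not state which estimator the authors used, so this is an illustrative stand-in, and the function names and input counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of nonsynonymous to synonymous substitution rates.

    All four inputs are pre-computed counts for one aligned gene pair
    (hypothetical values here, not data from the study).
    """
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Made-up counts: 6 nonsynonymous differences over 700 sites,
# 8 synonymous differences over 300 sites.
print(round(dn_ds(6, 700, 8, 300), 3))
```

Values well below 1 are usually read as purifying selection, values near 1 as neutrality, and values above 1 as a possible signal of positive selection; a genome-wide mean is an average over many such per-gene ratios.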
Sequence divergence of the red and green visual pigments in great apes and humans.
Deeb, S S; Jorgensen, A L; Battisti, L; Iwasaki, L; Motulsky, A G
1994-01-01
We have determined the coding sequences of red and green visual pigment genes of the chimpanzee, gorilla, and orangutan. The deduced amino acid sequences of these pigments are highly homologous to the equivalent human pigments. None of the amino acid differences occurred at sites that were previously shown to influence pigment absorption characteristics. Therefore, we predict the spectra of red and green pigments of the apes to have wavelengths of maximum absorption that differ by < 2 nm from the equivalent human pigments and that color vision in these nonhuman primates will be very similar, if not identical, to that in humans. A total of 14 within-species polymorphisms (6 involving silent substitutions) were observed in the coding sequences of the red and green pigment genes of the great apes. Remarkably, the polymorphisms at 6 of these sites had been observed in human populations, suggesting that they predated the evolution of higher primates. Alleles at polymorphic sites were often shared between the red and green pigment genes. The average synonymous rate of divergence of red from green sequences was approximately 1/10th that estimated for other proteins of higher primates, indicating the involvement of gene conversion in generating these polymorphisms. The high degree of homology and juxtaposition of these two genes on the X chromosome has promoted unequal recombination and/or gene conversion that led to sequence homogenization. However, natural selection operated to maintain the degree of separation in peak absorbance between the red and green pigments that resulted in optimal chromatic discrimination. This represents a unique case of molecular coevolution between two homologous genes that functionally interact at the behavioral level. PMID:8041777
Numerical classification of coding sequences
NASA Technical Reports Server (NTRS)
Collins, D. W.; Liu, C. C.; Jukes, T. H.
1992-01-01
DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
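As an illustration of the two descriptors proposed above, the following sketch computes a base count label (in the A..C..G..T.. form of the example) and a 64-entry codon table for a reading frame. The function names and the short test sequence are assumptions made for illustration; the original report does not specify an implementation.

```python
from collections import Counter
from itertools import product


def base_count_label(cds):
    """Abbreviate a reading frame by its base counts, e.g. 'A6C3G4T5'."""
    counts = Counter(cds.upper())
    return "".join(f"{base}{counts.get(base, 0)}" for base in "ACGT")


def codon_table(cds):
    """Count every codon in frame 1; all 64 codons appear, ordered AAA..TTT."""
    seq = cds.upper()
    table = {"".join(codon): 0 for codon in product("ACGT", repeat=3)}
    # Ignore a trailing partial codon and any codon with non-ACGT symbols.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in table:
            table[codon] += 1
    return table


def codon_label(cds):
    """Render the codon table as '(AAA)n(AAC)n...(TTT)n'."""
    return "".join(f"({codon}){n}" for codon, n in codon_table(cds).items())


example = "ATGAAATTTGGGCCCTAA"  # made-up 18 bp reading frame
label = base_count_label(example)
table = codon_table(example)
```

Because both descriptors are plain strings or small integer tables, two database entries can be compared by equality before any sequence alignment is attempted, which is the cross-referencing use suggested in the abstract.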
NASA Astrophysics Data System (ADS)
He, Feng; Wen, Haishen; Yu, Dahui; Li, Jifang; Shi, Bao; Chen, Caifang; Zhang, Jiaren; Jin, Guoxiong; Chen, Xiaoyan; Shi, Dan; Yang, Yanping
2010-12-01
Follicle stimulating hormone β (FSHβ) of Japanese flounder (Paralichthys olivaceus) plays a key role in the regulation of gonadal development. This study aimed to investigate molecular genetic characteristics of the FSHβ gene and elucidate the effects of single nucleotide polymorphisms (SNPs) of FSHβ on reproductive traits in Japanese flounder. We used polymerase chain reaction single-strand conformation polymorphism (PCR-SSCP) and sequencing of the FSHβ gene in 60 individuals. We identified only one SNP (T/C), in the coding region of exon 3 of FSHβ. This SNP, at position 340 bp of the FSHβ gene, did not lead to an amino acid change. Statistical analysis showed that the SNP was significantly associated with testosterone (T) level and gonadosomatic index (GSI) (P < 0.05). Individuals with genotype TC had significantly higher serum T levels and GSI (P < 0.05) than those with genotype CC. Therefore, the FSHβ gene could be a useful molecular marker in selection for prominent reproductive traits in Japanese flounder.
Tatarenkov, Andrey; Ayala, Francisco J
2007-08-01
We studied nucleotide sequence variation at the gene coding for dopa decarboxylase (Ddc) in seven populations of Drosophila melanogaster. Strength and pattern of linkage disequilibrium are somewhat distinct in the extensively sampled Spanish and Raleigh populations. In the Spanish population, a few sites are in strong positive association, whereas a large number of sites in the Raleigh population are associated nonrandomly but the association is not strong. Linkage disequilibrium analysis shows presence of two groups of haplotypes in the populations, each of which is fairly diverged, suggesting epistasis or inversion polymorphism. There is evidence of two forms of natural selection acting on Ddc. The McDonald-Kreitman test indicates a deficit of fixed amino acid differences between D. melanogaster and D. simulans, which may be due to negative selection. An excess of derived alleles at high frequency, significant according to the H-test, is consistent with the effect of hitchhiking. The hitchhiking may have been caused by directional selection downstream of the locus studied, as suggested by a gradual decrease of the polymorphism-to-divergence ratio. Altogether, the Ddc locus exhibits a complicated pattern of variation apparently due to several evolutionary forces. Such a complex pattern may be a result of an unusually high density of functionally important genes.
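The McDonald-Kreitman comparison referred to above contrasts fixed and polymorphic differences at replacement and synonymous sites in a 2 x 2 table. The sketch below computes the neutrality index and a two-sided Fisher exact P value using only the standard library. The counts are hypothetical and are not the Ddc data; they are chosen only so that the index exceeds 1, the direction expected for a deficit of fixed amino acid differences.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); NI > 1 means a deficit of fixed replacements."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: fixed (Dn, Ds) and polymorphic (Pn, Ps) differences.
dn, ds, pn, ps = 2, 20, 10, 25
ni = neutrality_index(dn, ds, pn, ps)
p_value = fisher_exact_two_sided(dn, ds, pn, ps)
```

With these made-up counts the neutrality index is 4, meaning replacement changes are four times more common among polymorphisms than among fixed differences, relative to synonymous changes.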
Chen, Angela; Kelley, Lauren D S; Janušonis, Skirmantas
2012-06-12
The serotonin 5-HT(4) receptor (5-HT(4)R) is coded by a complex gene that produces four mRNA splice variants in mice (5-HT(4(a))R, 5-HT(4(b))R, 5-HT(4(e))R, 5-HT(4(f))R). This receptor has highly dynamic expression in brain development and its splice variants differ in their developmental trajectories. Since 5-HT(4)Rs are important in forebrain function (including forebrain control of serotonergic activity in the brainstem), we investigated the susceptibility of 5-HT(4)R expression in the mouse embryonic telencephalon to prenatal maternal stress and altered serotonin (5-hydroxytryptamine, 5-HT) levels. Because the gene coding the adrenergic β(2) receptor (β(2)AR) is embedded in the 5-HT(4)R gene, we also investigated whether 5-HT(4)R mRNA levels were modulated by selective β(2)AR agents. Timed-pregnant C57BL/6 mice were treated beginning at embryonic day (E) 14 and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) was used to assess the mRNA levels of all 5-HT(4)R splice variants and β(2)AR in the embryonic telencephalon at E17. Maternal prenatal stress and 5-HT depletion with pCPA, a tryptophan hydroxylase inhibitor, reduced the levels of the 5-HT(4(b))R splice variant. Terbutaline (a selective β(2)AR agonist) and ICI 118,551 (a selective β(2)AR antagonist) had no effect on β(2)AR and 5-HT(4)R mRNA levels. These results show that prenatal stress and reduced 5-HT levels can alter 5-HT(4)R expression in the developing forebrain and that some 5-HT(4)R splice variants may be more susceptible than others. Copyright © 2012 Elsevier B.V. All rights reserved.
Giraldo-Calderón, Gloria I; Zanis, Michael J; Hill, Catherine A
2017-03-21
Opsins are light sensitive receptors associated with visual processes. Insects typically possess opsins that are stimulated by ultraviolet, short and long wavelength (LW) radiation. Six putative LW-sensitive opsins predicted in the yellow fever mosquito, Aedes aegypti and malaria mosquito, Anopheles gambiae, and eight in the southern house mosquito, Culex quinquefasciatus, suggest gene expansion in the Family Culicidae (mosquitoes) relative to other insects. Here we report the first detailed molecular and evolutionary analyses of LW opsins in three mosquito vectors, with a goal to understanding the molecular basis of opsin-mediated visual processes that could be exploited for mosquito control. Time of divergence estimates suggest that the mosquito LW opsins originated from 18 or 19 duplication events between 166.9/197.5 to 1.07/0.94 million years ago (MY) and that these likely occurred following the predicted divergence of the lineages Anophelinae and Culicinae 145-226 MY. Fitmodel analyses identified nine amino acid residues in the LW opsins that may be under positive selection. Of these, eight amino acids occur in the N and C termini and are shared among all three species, and one residue in TMIII was unique to culicine species. Alignment of 5' non-coding regions revealed potential Conserved Non-coding Sequences (CNS) and transcription factor binding sites (TFBS) in seven pairs of LW opsin paralogs. Our analyses suggest opsin gene duplication and residues possibly associated with spectral tuning of LW-sensitive photoreceptors. We explore two mechanisms - positive selection and differential expression mediated by regulatory units in CNS - that may have contributed to the retention of LW opsin genes in Culicinae and Anophelinae. 
We discuss the evolution of mosquito LW opsins in the context of major Earth events and possible adaptation of mosquitoes to LW-dominated photo environments, and implications for mosquito control strategies based on disrupting vision-mediated behaviors.
Bester-Van Der Merwe, Aletta; Blaauw, Sonja; Du Plessis, Jana; Roodt-Wilding, Rouvay
2013-09-23
Haliotis midae is one of the most valuable commercial abalone species in the world, but is highly vulnerable, due to exploitation, habitat destruction and predation. In order to preserve wild and cultured stocks, genetic management and improvement of the species has become crucial. Fundamental to this is the availability and employment of molecular markers, such as microsatellites and single nucleotide polymorphisms (SNPs). Transcriptome sequences generated through sequencing-by-synthesis technology were utilized for the in vitro and in silico identification of 505 putative SNPs from a total of 316 selected contigs. A subset of 234 SNPs were further validated and characterized in wild and cultured abalone using two Illumina GoldenGate genotyping assays. Combined with VeraCode technology, this genotyping platform yielded a 65%-69% conversion rate (percentage polymorphic markers) with a global genotyping success rate of 76%-85% and provided a viable means for validating SNP markers in a non-model species. The utility of 31 of the validated SNPs in population structure analysis was confirmed, while a large number of SNPs (174) were shown to be informative and are, thus, good candidates for linkage map construction. The non-synonymous SNPs (50) located in coding regions of genes that showed similarities with known proteins will also be useful for genetic applications, such as the marker-assisted selection of genes of relevance to abalone aquaculture.
MicroRNAs in CAG trinucleotide repeat expansion disorders: an integrated review of the literature.
Dumitrescu, Laura; Popescu, Bogdan O
2015-01-01
MicroRNAs are small RNAs involved in gene silencing. They play important roles in transcriptional regulation and are selectively and abundantly expressed in the central nervous system. A considerable amount of the human genome is comprised of tandem repeating nucleotide streams. Several diseases are caused by above-threshold expansion of certain trinucleotide repeats occurring in a protein-coding or non-coding region. Though monogenic, CAG trinucleotide repeat expansion disorders have a complex pathogenesis, various combinations of multiple coexisting pathways resulting in one common final consequence: selective neurodegeneration. Mutant protein and mutant transcript gain of toxic function are considered to be the core pathogenic mechanisms. The profile of microRNAs in CAG trinucleotide repeat disorders is scarcely described; however, microRNA dysregulation has been identified in these diseases and microRNA-related interference with gene expression is considered to be involved in their pathogenesis. Better understanding of microRNA functions and means of manipulation promises to offer further insights into the pathogenic pathways of CAG repeat expansion disorders, to point out new potential targets for drug intervention and to provide some of the much needed etiopathogenic therapeutic agents. A number of disease-modifying microRNA silencing strategies are under development, but several implementation impediments still have to be resolved. CAG targeting seems feasible and efficient in animal models and is an appealing approach for clinical practice. Preliminary human trials are just beginning.
Seligmann, Hervé
2013-03-01
Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Chen, Geng; Yin, Kangping; Shi, Leming; Fang, Yuanzhang; Qi, Ya; Li, Peng; Luo, Jian; He, Bing; Liu, Mingyao; Shi, Tieliu
2011-01-01
In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and the expression levels of the isoforms of known human genes, and the conservation and disease association of long noncoding RNAs (ncRNAs), with two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines. Comparative analysis revealed that about two-thirds of the genes expressed between brain and cell lines are the same, but less than one-third of their isoforms are identical. Besides those genes specifically expressed in brain and cell lines, about 66% of genes expressed in common encoded different isoforms. Moreover, most genes dominantly expressed one isoform and some genes only generated protein-coding (or noncoding) RNAs in one sample but not in another. We found 282 human genes could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, and most of those long ncRNAs contain conserved elements across either 46 vertebrates or 33 placental mammals or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer, and several of these were validated by RT-PCR. In addition, those validated differentially expressed long ncRNAs were found to be significantly correlated with certain breast cancer or lung cancer related genes, indicating the important biological relevance between long ncRNAs and human cancers. Our findings reveal that the differences of gene expression profile between samples mainly result from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level for completely illustrating the intricate transcriptome.
Levine, Mia T; Holloway, Alisha K; Arshad, Umbreen; Begun, David J
2007-11-01
Dosage compensation refers to the equalization of X-linked gene transcription among heterogametic and homogametic sexes. In Drosophila, the dosage compensation complex (DCC) mediates the twofold hypertranscription of the single male X chromosome. Loss-of-function mutations at any DCC protein-coding gene are male lethal. Here we report a population genetic analysis suggesting that four of the five core DCC proteins--MSL1, MSL2, MSL3, and MOF--are evolving under positive selection in D. melanogaster. Within these four proteins, several domains that range in function from X chromosome localization to protein-protein interactions have elevated, D. melanogaster-specific, amino acid divergence.
Yasukochi, Yoshiki; Satta, Yoko
2015-03-25
The human cytochrome P450 (CYP) 2D6 gene is a member of the CYP2D gene subfamily, along with the CYP2D7P and CYP2D8P pseudogenes. Although the CYP2D6 enzyme has been studied extensively because of its clinical importance, the evolution of the CYP2D subfamily has not yet been fully understood. Therefore, the goal of this study was to reveal the evolutionary process of the human drug metabolic system. Here, we investigate molecular evolution of the CYP2D subfamily in primates by comparing 14 CYP2D sequences from humans to New World monkey genomes. Window analysis and statistical tests revealed that entire genomic sequences of paralogous genes were extensively homogenized by gene conversion during molecular evolution of CYP2D genes in primates. A neighbor-joining tree based on genomic sequences at the nonsubstrate recognition sites showed that CYP2D6 and CYP2D8 genes were clustered together due to gene conversion. In contrast, a phylogenetic tree using amino acid sequences at substrate recognition sites did not cluster the CYP2D6 and CYP2D8 genes, suggesting that the functional constraint on substrate specificity is one of the causes for purifying selection at the substrate recognition sites. Our results suggest that the CYP2D gene subfamily in primates has evolved to maintain the regioselectivity for a substrate hydroxylation activity between individual enzymes, even though extensive gene conversion has occurred across CYP2D coding sequences. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. PMID:25808902
Multi-targeted priming for genome-wide gene expression assays.
Adomas, Aleksandra B; Lopez-Giraldez, Francesc; Clark, Travis A; Wang, Zheng; Townsend, Jeffrey P
2010-08-17
Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and precise assay of the transcribed sequences within the genome.
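The motif search described above (motifs shared by most protein-coding genes but absent from rRNA and tRNA genes) can be approximated by a simple k-mer filter. This is a simplified sketch and not the authors' algorithm: it uses exact, non-degenerate k-mers, ignores strand, and the function names, toy sequences and thresholds are all assumptions.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_motifs(coding, excluded, k, min_fraction):
    """k-mers found in at least min_fraction of coding sequences
    and in none of the excluded (e.g. rRNA/tRNA) sequences."""
    banned = set()
    for seq in excluded:
        banned |= kmers(seq, k)
    counts = Counter()
    for seq in coding:
        # Each gene contributes at most once per motif.
        counts.update(kmers(seq, k) - banned)
    n = len(coding)
    keep = [m for m, c in counts.items() if c / n >= min_fraction]
    # Most widely shared motifs first, ties broken alphabetically.
    return sorted(keep, key=lambda m: (-counts[m], m))


# Made-up toy sequences, far shorter than real genes.
coding_genes = ["ATGCCA", "TTGCCA", "GGGCCA"]
rna_genes = ["ATGAAA"]
strict = candidate_motifs(coding_genes, rna_genes, k=3, min_fraction=1.0)
relaxed = candidate_motifs(coding_genes, rna_genes, k=3, min_fraction=0.6)
```

In the toy input, ATG occurs in a coding gene but is rejected because it also occurs in the excluded sequence, which mirrors the counter-selection against undesirable targets described in the abstract.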
Zhou, Xia; Tambo, Ernest; Su, Jing; Fang, Qiang; Ruan, Wei; Chen, Jun-Hu; Yin, Ming-Bo; Zhou, Xiao-Nong
2017-10-01
Plasmodium vivax merozoite surface protein-1 (PvMSP1) gene codes for a major malaria vaccine candidate antigen. However, its polymorphic nature represents an obstacle to the design of a protective vaccine. In this study, we analyzed the genetic polymorphism and natural selection of the C-terminal 42 kDa fragment within the PvMSP1 gene (PvMSP142) from 77 P. vivax isolates, collected from imported cases of China-Myanmar border (CMB) areas in Yunnan province and the inland cases from Anhui, Yunnan, and Zhejiang provinces in China during 2009-2012. In total, 41 haplotypes were identified and 30 of them were new haplotypes. The differences between the rates of non-synonymous and synonymous mutations suggest that PvMSP142 has evolved under natural selection, and a high selective pressure preferentially acted on regions identified of PvMSP133. Our results also demonstrated that PvMSP142 of P. vivax isolates collected in China-Myanmar border areas displays higher genetic polymorphism than that of isolates collected from inland China. Such results have significant implications for understanding the dynamics of the P. vivax population and may be useful information for China's malaria elimination campaign strategies.
McTavish, H; LaQuier, F; Arciero, D; Logan, M; Mundfrom, G; Fuchs, J A; Hooper, A B
1993-04-01
The genome of Nitrosomonas europaea contains at least three copies each of the genes coding for hydroxylamine oxidoreductase (HAO) and cytochrome c554. A copy of an HAO gene is always located within 2.7 kb of a copy of a cytochrome c554 gene. Cytochrome P-460, a protein that shares very unusual spectral features with HAO, was found to be encoded by a gene separate from the HAO genes.
Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population.
Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P
2004-10-01
The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 x 10(-4) and polymorphism parameter of 11.5 x 10(-4) are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
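The two diversity statistics quoted above can be illustrated on a toy alignment. The sketch assumes that "nucleotide diversity" is the average pairwise difference per site (pi) and that the "polymorphism parameter" is Watterson's estimator based on the number of segregating sites; the second reading is an assumption, since the abstract does not name the estimator. Sequences are made up and must be aligned and of equal length.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """theta_W per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * length)


# Made-up alignment of four chromosomes, four sites.
alignment = ["AAAA", "AAAT", "AATT", "AAAA"]
pi = nucleotide_diversity(alignment)
theta = watterson_theta(alignment)
```

Under neutrality and constant population size the two estimators have the same expectation, so a theta that clearly exceeds pi, as in the abstract's overall values, points to an excess of low-frequency alleles.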
McGuire, Austen B; Rafi, Syed K; Manzardo, Ann M; Butler, Merlin G
2016-05-05
Mammalian chromosomes are comprised of complex chromatin architecture with the specific assembly and configuration of each chromosome influencing gene expression and function in yet undefined ways by varying degrees of heterochromatinization that result in Giemsa (G) negative euchromatic (light) bands and G-positive heterochromatic (dark) bands. We carried out morphometric measurements of high-resolution chromosome ideograms for the first time to characterize the total euchromatic and heterochromatic chromosome band length, distribution and localization of 20,145 known protein-coding genes, 790 recognized autism spectrum disorder (ASD) genes and 365 obesity genes. The individual lengths of G-negative euchromatin and G-positive heterochromatin chromosome bands were measured in millimeters and recorded from scaled and stacked digital images of 850-band high-resolution ideograms supplied by the International Society of Chromosome Nomenclature (ISCN) 2013. Our overall measurements followed established banding patterns based on chromosome size. G-negative euchromatic band regions contained 60% of protein-coding genes while the remaining 40% were distributed across the four heterochromatic dark band sub-types. ASD genes were disproportionately overrepresented in the darker heterochromatic sub-bands, while the obesity gene distribution pattern did not significantly differ from protein-coding genes. Our study supports recent trends implicating genes located in heterochromatin regions playing a role in biological processes including neurodevelopment and function, specifically genes associated with ASD.
Gene and genon concept: coding versus regulation
2007-01-01
We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. 
Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon. PMID:18087760
Yang, Lei; Neme, Rafik; Wichman, Holly A.; Malik, Harmit S.
2014-01-01
Mammalian genomes comprise many active and fossilized retroelements. The obligate requirement for retroelement integration affords host genomes an opportunity to ‘domesticate’ retroelement genes for their own purpose, leading to important innovations in genome defense and placentation. While many such exaptations involve retroviruses, the L1TD1 gene is the only known domesticated gene whose protein-coding sequence is almost entirely derived from a LINE-1 (L1) retroelement. Human L1TD1 has been shown to play an important role in pluripotency maintenance. To investigate how this role was acquired, we traced the origin and evolution of L1TD1. We find that L1TD1 originated in the common ancestor of eutherian mammals, but was lost or pseudogenized multiple times during mammalian evolution. We also find that L1TD1 has evolved under positive selection during primate and mouse evolution, and that one prosimian L1TD1 has ‘replenished’ itself with a more recent L1 ORF1 from the prosimian genome. These data suggest that L1TD1 has been recurrently selected for functional novelty, perhaps for a role in genome defense. L1TD1 loss is associated with L1 extinction in several megabat lineages, but not in sigmodontine rodents. We hypothesize that L1TD1 could have originally evolved for genome defense against L1 elements. Later, L1TD1 may have become incorporated into pluripotency maintenance in some lineages. Our study highlights the role of retroelement gene domestication in fundamental aspects of mammalian biology, and that such domesticated genes can adopt different functions in different lineages. PMID:25211013
Zhen, Ying; Ungerer, Mark C
2008-12-01
Elucidating the molecular basis of adaptive phenotypic variation represents a central aim in evolutionary biology. Traits exhibiting patterns of clinal variation represent excellent models for studies of molecular adaptation, especially when variation in phenotype can be linked to organismal fitness in different environments. Natural accessions of the model plant species Arabidopsis thaliana exhibit clinal variation in freezing tolerance that follows a gradient of temperature variability across the species' native range (Zhen Y, Ungerer MC. 2008. Clinal variation in freezing tolerance among natural accessions of A. thaliana. New Phytol. 177:419-427). Here, we report that this pattern of variation is attributable, at least in part, to relaxed purifying selection on members of a small family of transcriptional activators (the CBF/DREB1s) in the species' southern range. These regulatory genes play a critical role in the ability of A. thaliana plants to undergo cold acclimation and thereby achieve maximum freezing tolerance. Relative to accessions from northern regions, accessions of A. thaliana from the southern part of their geographic range exhibit levels of nonsynonymous nucleotide polymorphism that are approximately 2.8-fold higher across this small gene subfamily. Relaxed selection on the CBF/DREB1s in southern accessions also has resulted in multiple mutations in regulatory regions resulting in abrogated expression of particular subfamily members in particular accessions. These coding-region and regulatory mutations compromise the ability of these genes to act as efficient transcriptional activators during the cold acclimation process, as determined by reductions in rates of induction and maximum levels of expression in the downstream genes they regulate. This study highlights the potential role of regulatory genes in underlying adaptive phenotypic variation in nature.
2013-01-01
Background A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin’s (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2–3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris. Results 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages, suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin’s finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia related functions, and may be examples of adaptively evolving reproductive proteins. Conclusions These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin’s finches. PMID:23402223
Genome-wide signatures of convergent evolution in echolocating mammals
Parker, Joe; Tsagkogeorga, Georgia; Cotton, James A.; Liu, Yuan; Provero, Paolo; Stupka, Elia; Rossiter, Stephen J.
2013-01-01
Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution, although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised. PMID:24005325
The Evolutionary Tempo of Sex Chromosome Degradation in Carica papaya.
Wu, Meng; Moore, Richard C
2015-06-01
Genes on non-recombining heterogametic sex chromosomes may degrade over time through the irreversible accumulation of deleterious mutations. In papaya, the non-recombining male-specific region of the Y (MSY) consists of two evolutionary strata corresponding to chromosomal inversions occurring approximately 7.0 and 1.9 MYA. The step-wise recombination suppression between the papaya X and Y allows for a temporal examination of the degeneration progress of the young Y chromosome. Comparative evolutionary analyses of 55 X/Y gene pairs showed that Y-linked genes have more unfavorable substitutions than X-linked genes. However, this asymmetric evolutionary pattern is confined to the oldest stratum, and is only observed when recently evolved pseudogenes are included in the analysis, indicating a slow degeneration tempo of the papaya Y chromosome. Population genetic analyses of coding sequence variation of six Y-linked focal loci in the oldest evolutionary stratum detected an excess of nonsynonymous polymorphism and reduced codon bias relative to autosomal loci. However, this pattern was also observed for corresponding X-linked loci. Both the MSY and its corresponding X-specific region are pericentromeric where recombination has been shown to be greatly reduced. Like the MSY region, overall selective efficacy on the X-specific region may be reduced due to the interference of selective forces between highly linked loci, or the Hill-Robertson effect, that is accentuated in regions of low or suppressed recombination. Thus, a pattern of gene decay on the X-specific region may be explained by relaxed purifying selection and widespread genetic hitchhiking due to its pericentromeric location.
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.
Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya
2016-07-12
Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Non-coding variants contribute to the clinical heterogeneity of TTR amyloidosis.
Iorio, Andrea; De Lillo, Antonella; De Angelis, Flavio; Di Girolamo, Marco; Luigetti, Marco; Sabatelli, Mario; Pradotto, Luca; Mauro, Alessandro; Mazzeo, Anna; Stancanelli, Claudia; Perfetto, Federico; Frusconi, Sabrina; My, Filomena; Manfellotto, Dario; Fuciarelli, Maria; Polimanti, Renato
2017-09-01
Coding mutations in TTR gene cause a rare hereditary form of systemic amyloidosis, which has a complex genotype-phenotype correlation. We investigated the role of non-coding variants in regulating TTR gene expression and consequently amyloidosis symptoms. We evaluated the genotype-phenotype correlation considering the clinical information of 129 Italian patients with TTR amyloidosis. Then, we conducted a re-sequencing of TTR gene to investigate how non-coding variants affect TTR expression and, consequently, phenotypic presentation in carriers of amyloidogenic mutations. Polygenic scores for genetically determined TTR expression were constructed using data from our re-sequencing analysis and the GTEx (Genotype-Tissue Expression) project. We confirmed a strong phenotypic heterogeneity across coding mutations causing TTR amyloidosis. Considering the effects of non-coding variants on TTR expression, we identified three patient clusters with specific expression patterns associated with certain phenotypic presentations, including late onset, autonomic neurological involvement, and gastrointestinal symptoms. This study provides novel data regarding the role of non-coding variation and the gene expression profiles in patients affected by TTR amyloidosis, also putting forth an approach that could be used to investigate the mechanisms at the basis of the genotype-phenotype correlation of the disease.
Maize GO annotation—methods, evaluation, and review (maize-GAMER)
USDA-ARS's Scientific Manuscript database
We created a new high-coverage, robust, and reproducible functional annotation of maize protein-coding genes based on Gene Ontology (GO) term assignments. Whereas the existing Phytozome and Gramene maize GO annotation sets only cover 41% and 56% of maize protein-coding genes, respectively, this stu...
Burke, Sean V.; Wysocki, William P.; Clark, Lynn G.
2018-01-01
The systematics of grasses has advanced through applications of plastome phylogenomics, although studies have been largely limited to subfamilies or other subgroups of Poaceae. Here we present a plastome phylogenomic analysis of 250 complete plastomes (179 genera) sampled from 44 of the 52 tribes of Poaceae. Plastome sequences were determined from high throughput sequencing libraries and the assemblies represent over 28.7 Mbases of sequence data. Phylogenetic signal was characterized in 14 partitions, including (1) complete plastomes; (2) protein coding regions; (3) noncoding regions; and (4) three loci commonly used in single and multi-gene studies of grasses. Each of the four main partitions was further refined, alternatively including or excluding positively selected codons and also the gaps introduced by the alignment. All 76 protein coding plastome loci were found to be predominantly under purifying selection, but specific codons were found to be under positive selection in 65 loci. The loci that have been widely used in multi-gene phylogenetic studies had among the highest proportions of positively selected codons, suggesting caution in the interpretation of these earlier results. Plastome phylogenomic analyses confirmed the backbone topology for Poaceae with maximum bootstrap support (BP). Among the 14 analyses, 82 clades out of 309 resolved were maximally supported in all trees. Analyses of newly sequenced plastomes were in agreement with current classifications. Five of seven partitions in which alignment gaps were removed retrieved Panicoideae as sister to the remaining PACMAD subfamilies. Alternative topologies were recovered in trees from partitions that included alignment gaps. This suggests that ambiguities in aligning these uncertain regions might introduce a false signal. 
Resolution of these and other critical branch points in the phylogeny of Poaceae will help to better understand the selective forces that drove the radiation of the BOP and PACMAD clades comprising more than 99.9% of grass diversity. PMID:29416954
Stress-Driven Selection of Novel Phenotypes
NASA Technical Reports Server (NTRS)
Fox, George E.; Stepanov, Victor G.; Liu, Yamei
2011-01-01
A process has been developed that can confer novel properties, such as metal resistance, to a host bacterium. This same process can also be used to produce RNAs and peptides that have novel properties, such as the ability to bind particular compounds. It is inherent in the method that the peptide or RNA will behave as expected in the target organism. Plasmid-borne mini-gene libraries coding for either a population of combinatorial peptides or stable, artificial RNAs carrying random inserts are produced. These libraries, which have no bias towards any biological function, are used to transform the organism of interest and to serve as an initial source of genetic variation for stress-driven evolution. The transformed bacteria are propagated under selective pressure in order to obtain variants with the desired properties. The process is highly distinct from in vitro methods because the variants are selected in the context of the cell while it is experiencing stress. Hence, the selected peptide or RNA will, by definition, work as expected in the target cell as the cell adapts to its presence during the selection process. Once the novel gene, which produces the sought phenotype, is obtained, it can be transferred to the main genome to increase the genetic stability in the organism. Alternatively, the cell line can be used to produce novel RNAs or peptides with selectable properties in large quantity for separate purposes. The system allows for easy, large-scale purification of the RNAs or peptide products. The process has been reduced to practice by imposing sub-inhibitory concentrations of NiCl2 on cells of the bacterium Escherichia coli that were transformed separately with the peptide library and RNA library. The evolved resistant clones were isolated, and sequences of the selected mini-gene variants were established.
Clones resistant to NiCl2 were found to carry identical plasmid variants with a functional mini-gene that specifically conferred significant nickel tolerance on the host cells. Sequencing of the selected mini-gene revealed a propensity of the encoded peptide to bind transient metal ions. Expression of the mini-gene markedly improved growth parameters of the evolved clones at sub-inhibitory concentrations of NiCl2 while being slightly detrimental in the absence of stress. Similar results have been obtained with the RNA libraries. Overall, the results demonstrate a very natural outcome of the selection experiments in which the mini-genes were expected to be either successfully integrated into bacterial genetic networks, or rejected depending upon their effect on host fitness. This described approach can be useful as a laboratory model to study the dynamics of bacterial adaptive evolution on the molecular level. It can also provide a strategy for screening expressed DNA libraries in search of novel genes with desirable properties.
Pilot, Małgorzata; Malewski, Tadeusz; Moura, Andre E; Grzybowski, Tomasz; Oleński, Kamil; Kamiński, Stanisław; Fadel, Fernanda Ruiz; Alagaili, Abdulaziz N; Mohammed, Osama B; Bogdanowicz, Wiesław
2016-08-09
Domesticated species are often composed of distinct populations differing in the character and strength of artificial and natural selection pressures, providing a valuable model to study adaptation. In contrast to pure-breed dogs that constitute artificially maintained inbred lines, free-ranging dogs are typically free-breeding, i.e., unrestrained in mate choice. Many traits in free-breeding dogs (FBDs) may be under similar natural and sexual selection conditions to wild canids, while relaxation of sexual selection is expected in pure-breed dogs. We used a Bayesian approach with strict false-positive control criteria to identify FST-outlier SNPs between FBDs and either European or East Asian breeds, based on 167,989 autosomal SNPs. By identifying outlier SNPs located within coding genes, we found four candidate genes under diversifying selection shared by these two comparisons. Three of them are associated with the Hedgehog (HH) signaling pathway regulating vertebrate morphogenesis. A comparison between FBDs and East Asian breeds also revealed diversifying selection on the BBS6 gene, which was earlier shown to cause snout shortening and dental crowding via disrupted HH signaling. Our results suggest that relaxation of natural and sexual selection in pure-breed dogs as opposed to FBDs could have led to mild changes in regulation of the HH signaling pathway. HH inhibits adhesion and the migration of neural crest cells from the neural tube, and minor deficits of these cells during embryonic development have been proposed as the underlying cause of "domestication syndrome." This suggests that the process of breed formation involved the same genetic and developmental pathways as the process of domestication. Copyright © 2016 Pilot et al.
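As a rough illustration of the quantity behind an FST-outlier scan, the sketch below computes a per-SNP FST from allele frequencies in two populations with the simple heterozygosity-based estimator FST = (HT - HS) / HT, and then lists the SNPs above a cutoff. It is not the Bayesian method used in the study: the equal weighting of the two populations, the function names, the allele frequencies and the cutoff are all hypothetical choices made for illustration.

```python
def fst_two_pops(p1, p2):
    """FST for one biallelic SNP from allele frequencies p1 and p2
    in two equally weighted populations: (HT - HS) / HT."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    if h_t == 0.0:  # SNP fixed for the same allele in both populations
        return 0.0
    return (h_t - h_s) / h_t


def outlier_indices(freq_pairs, cutoff):
    """Indices of SNPs whose FST exceeds a fixed (hypothetical) cutoff."""
    return [i for i, (p1, p2) in enumerate(freq_pairs)
            if fst_two_pops(p1, p2) > cutoff]


# Hypothetical allele frequencies: (free-breeding dogs, breed dogs)
snps = [(0.50, 0.50), (0.45, 0.55), (0.20, 0.80), (1.00, 0.00)]
candidates = outlier_indices(snps, cutoff=0.3)
```

In a real scan the cutoff would come from a neutral background distribution or a model-based test rather than a fixed number, which is the role the Bayesian false-positive control plays in the abstract.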
Positive selection on the killer whale mitogenome
Foote, Andrew D.; Morin, Phillip A.; Durban, John W.; Pitman, Robert L.; Wade, Paul; Willerslev, Eske; Gilbert, M. Thomas P.; da Fonseca, Rute R.
2011-01-01
Mitochondria produce up to 95 per cent of the eukaryotic cell's energy. The coding genes of the mitochondrial DNA may therefore evolve under selection owing to metabolic requirements. The killer whale, Orcinus orca, is polymorphic, has a global distribution and occupies a range of ecological niches. It is therefore a suitable organism for testing this hypothesis. We compared a global dataset of the complete mitochondrial genomes of 139 individuals for amino acid changes that were associated with radical physico-chemical property changes and were influenced by positive selection. Two such selected non-synonymous amino acid changes were found; one in each of two ecotypes that inhabit the Antarctic pack ice. Both substitutions were associated with changes in local polarity, increased steric constraints and α-helical tendencies that could influence overall metabolic performance, suggesting a functional change. PMID:20810427
Essential RNA-Based Technologies and Their Applications in Plant Functional Genomics.
Teotia, Sachin; Singh, Deepali; Tang, Xiaoqing; Tang, Guiliang
2016-02-01
Genome sequencing has not only extended our understanding of the blueprints of many plant species but has also revealed the secrets of coding and non-coding genes. We present here a brief introduction to and personal account of key RNA-based technologies, as well as their development and applications for functional genomics of plant coding and non-coding genes, with a focus on short tandem target mimics (STTMs), artificial microRNAs (amiRNAs), and CRISPR/Cas9. In addition, their use in multiplex technologies for the functional dissection of gene networks is discussed. Copyright © 2015 Elsevier Ltd. All rights reserved.
Premzl, Marko
2015-01-01
Using the eutherian comparative genomic analysis protocol and public genomic sequence data sets, the present work attempted to update and revise two gene data sets. The most comprehensive third party annotation gene data sets of eutherian adenohypophysis cystine-knot genes (128 complete coding sequences), and D-dopachrome tautomerases and macrophage migration inhibitory factor genes (30 complete coding sequences) were annotated. For example, the present study first described primate-specific cystine-knot Prometheus genes, as well as differential gene expansions of D-dopachrome tautomerase genes. Furthermore, new frameworks of future experiments of two eutherian gene data sets were proposed. PMID:25941635
Two rapidly evolving genes contribute to male fitness in Drosophila
Reinhardt, Josephine A; Jones, Corbin D
2013-01-01
Purifying selection often results in conservation of gene sequence and function. The most functionally conserved genes are also thought to be among the most biologically essential. These observations have led to the use of sequence conservation as a proxy for functional conservation. Here we describe two genes that are exceptions to this pattern. We show that lack of sequence conservation among orthologs of CG15460 and CG15323 – herein named jean-baptiste (jb) and karr respectively – does not necessarily predict lack of functional conservation. These two Drosophila melanogaster genes are among the most rapidly evolving protein-coding genes in this species, being nearly as diverged from their D. yakuba orthologs as random sequences are. jb and karr are both expressed at an elevated level in larval males and adult testes, but they are not accessory gland proteins and their loss does not affect male fertility. Instead, knockdown of these genes in D. melanogaster via RNA interference caused male-biased viability defects. These viability effects occur prior to the third instar for jb and during late pupation for karr. We show that putative orthologs to jb and karr are also expressed strongly in the testes of other Drosophila species and have similar gene structure across species despite low levels of sequence conservation. While standard molecular evolution tests could not reject neutrality, other data hint at a role for natural selection. Together these data provide a clear case where a lack of sequence conservation does not imply a lack of conservation of expression or function. PMID:24221639
Impact of DRD2/ANKK1 and COMT Polymorphisms on Attention and Cognitive Functions in Schizophrenia.
Nkam, Irene; Ramoz, Nicolas; Breton, Florence; Mallet, Jasmina; Gorwood, Philip; Dubertret, Caroline
2017-01-01
Cognitive deficits such as poor selective attention and a decline in executive functions have been reported in patients with schizophrenia. Many studies have emphasized the role of dopamine in regulating cognitive functions in the general population as well as in schizophrenia. However, studying the relationship between cognitive processes, schizophrenia and dopaminergic candidate genes is an original approach that has given interesting results. The purpose of the current exploratory study was to examine the interaction of dopaminergic genes (coding for dopamine receptor D2, DRD2, and for Catecholamine-O-Methyl-Transferase, COMT) with the diagnosis of schizophrenia in (i) the executive control of attention, (ii) selective attention, and (iii) executive functions. We recruited 52 patients with schizophrenia and 53 healthy controls who performed the Stroop Color-Word Test, the Attention Network Test and the Wisconsin Card Sorting test. Four single nucleotide polymorphisms (SNPs) in the DRD2 gene (rs6275, rs6277, rs2242592 and rs1800497) and two SNPs in the COMT gene (rs4680 and rs165599) were genotyped. Patients with schizophrenia performed significantly worse than controls on all cognitive measures, taking into account demographic variables. A significant gene by disease interaction was found for the Stroop interference (p = 0.002) for rs6275 of the DRD2 gene. The COMT Val/Val genotype and schizophrenia were associated with an increased number of perseverative errors (p = 0.01). In our study, the DRD2 gene is involved in attention while the COMT gene is implicated in executive functions in patients with schizophrenia.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-02-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Tao, Junjie; Feng, Chao; Ai, Bin; Kang, Ming
2016-01-01
Background and Aims Limestone karst areas possess high floral diversity and endemism. The genus Primulina, which contributes to the unique calcicole flora, has high species richness and exhibits specific soil-based habitat associations that are mainly distributed on calcareous karst soils. The adaptive molecular evolutionary mechanism of the genus to karst calcium-rich environments is still not well understood. The Ca2+-permeable channel TPC1 was used in this study to test whether its gene is involved in the local adaptation of Primulina to karst high-calcium soil environments. Methods Specific amplification and sequencing primers were designed and used to amplify the full-length coding sequences of TPC1 from cDNA of 76 Primulina species. The sequence alignment without recombination and the corresponding reconstructed phylogeny tree were used in molecular evolutionary analyses at the nucleic acid level and amino acid level, respectively. Finally, the identified sites under positive selection were labelled on the predicted secondary structure of TPC1. Key Results Seventy-six full-length coding sequences of Primulina TPC1 were obtained. The length of the sequences varied between 2220 and 2286 bp and the insertion/deletion was located at the 5′ end of the sequences. No signal of substitution saturation was detected in the sequences, while significant recombination breakpoints were detected. The molecular evolutionary analyses showed that TPC1 was dominated by purifying selection and the selective pressures were not significantly different among species lineages. However, significant signals of positive selection were detected at both TPC1 codon level and amino acid level, and five sites under positive selective pressure were identified by at least three different methods. Conclusions The Ca2+-permeable channel TPC1 may be involved in the local adaptation of Primulina to karst Ca2+-rich environments.
Different species lineages suffered similar selective pressure associated with calcium in karst environments, and episodic diversifying selection at a few sites may play a major role in the molecular evolution of Primulina TPC1. PMID:27582362
Nedelcu, Aurora M.; Lee, Robert W.; Lemieux, Claude; Gray, Michael W.; Burger, Gertraud
2000-01-01
Two distinct mitochondrial genome types have been described among the green algal lineages investigated to date: a reduced–derived, Chlamydomonas-like type and an ancestral, Prototheca-like type. To determine if this unexpected dichotomy is real or is due to insufficient or biased sampling and to define trends in the evolution of the green algal mitochondrial genome, we sequenced and analyzed the mitochondrial DNA (mtDNA) of Scenedesmus obliquus. This genome is 42,919 bp in size and encodes 42 conserved genes (i.e., large and small subunit rRNA genes, 27 tRNA and 13 respiratory protein-coding genes), four additional free-standing open reading frames with no known homologs, and an intronic reading frame with endonuclease/maturase similarity. No 5S rRNA or ribosomal protein-coding genes have been identified in Scenedesmus mtDNA. The standard protein-coding genes feature a deviant genetic code characterized by the use of UAG (normally a stop codon) to specify leucine, and the unprecedented use of UCA (normally a serine codon) as a signal for termination of translation. The mitochondrial genome of Scenedesmus combines features of both green algal mitochondrial genome types: the presence of a more complex set of protein-coding and tRNA genes is shared with the ancestral type, whereas the lack of 5S rRNA and ribosomal protein-coding genes as well as the presence of fragmented and scrambled rRNA genes are shared with the reduced–derived type of mitochondrial genome organization. Furthermore, the gene content and the fragmentation pattern of the rRNA genes suggest that this genome represents an intermediate stage in the evolutionary process of mitochondrial genome streamlining in green algae. [The sequence data described in this paper have been submitted to the GenBank data library under accession no. AF204057.] PMID:10854413
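The deviant genetic code described above can be sketched as a small change to a standard codon table. The example below is a minimal sketch that reassigns only the two codons named in the abstract (UAG to leucine, UCA to a stop signal); treating every other codon as standard is an assumption, and the function name and the test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Two reassignments reported for Scenedesmus obliquus mtDNA;
# all remaining codons are assumed to follow the standard code.
SCENEDESMUS_MT_CODE = dict(STANDARD_CODE)
SCENEDESMUS_MT_CODE["TAG"] = "L"  # UAG: stop -> leucine
SCENEDESMUS_MT_CODE["TCA"] = "*"  # UCA: serine -> termination


def translate(dna, code):
    """Translate in frame 1 until the first stop codon or the end of the sequence."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = code[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequence: ATG TAG GCT TCA GGG
seq = "ATGTAGGCTTCAGGG"
standard_peptide = translate(seq, STANDARD_CODE)
scenedesmus_peptide = translate(seq, SCENEDESMUS_MT_CODE)
```

Under the standard table the hypothetical sequence stops at the second codon, whereas under the modified table it reads through TAG and terminates at TCA, which is why annotating such a genome with the wrong table would misplace gene boundaries.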
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.
Zhang, Chun-Ting; Wang, Ju; Zhang, Ren
2002-02-01
The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is less than or equal to 5579, about 3.8-7.0% less than the currently widely accepted 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
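The idea of classifying ORFs by the single-nucleotide frequencies at the three codon positions can be sketched as follows. The abstract does not give the exact decision rule, so the nearest-centroid rule used here is an assumption, and the function names (`position_frequencies`, `centroid`, `classify`) are hypothetical.

```python
import math


def position_frequencies(orf):
    """12 features: frequency of A, C, G, T at codon positions 1, 2 and 3."""
    orf = orf.upper()
    n_codons = len(orf) // 3
    if n_codons == 0:
        raise ValueError("ORF must contain at least one codon")
    features = []
    for pos in range(3):
        column = orf[pos:n_codons * 3:3]  # every base at this codon position
        features.extend(column.count(base) / n_codons for base in "ACGT")
    return features


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def euclid(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(orf, coding_centroid, noncoding_centroid):
    """Assumed rule: assign the ORF to the class with the nearer centroid."""
    x = position_frequencies(orf)
    if euclid(x, coding_centroid) < euclid(x, noncoding_centroid):
        return "coding"
    return "non-coding"
```

In use, the two centroids would be computed from training sets of known coding ORFs and intergenic "ORFs"; any real application would need far larger training sets than a toy example can show.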
Zhao, Yi; Tang, Liang; Li, Zhe; Jin, Jinpu; Luo, Jingchu; Gao, Ge
2015-04-18
Long-established protein-coding genes may lose their coding potential during evolution ("unitary gene loss"). Members of the Poaceae family are a major food source and represent an ideal model clade for plant evolution research. However, the global pattern of unitary gene loss in Poaceae genomes and the evolutionary fate of lost genes remain little investigated and largely elusive. Using a locally developed pipeline, we identified 129 unitary gene loss events for long-established protein-coding genes from four representative species of Poaceae, i.e. brachypodium, rice, sorghum and maize. Functional annotation suggested that the genes lost in all or most Poaceae species are enriched for genes involved in development and response to endogenous stimulus. We also found that 44 mutated genomic loci of lost genes, which we refer to as relics, were still actively transcribed, and that 84% of them (37 of 44) showed significantly differential expression across different tissues. More interestingly, we found that a total of five expressed relics may function as competitive endogenous RNAs in the brachypodium, rice and sorghum genomes. Based on comparative genomics and transcriptome data, we compiled, for the first time, a comprehensive catalogue of unitary gene loss events in Poaceae species, characterized a statistically significant functional preference for these lost genes, and showed the potential of relics to function as competitive endogenous RNAs in Poaceae genomes.
Hirosawa, I; Aritomi, K; Hoshida, H; Kashiwagi, S; Nishizawa, Y; Akada, R
2004-07-01
The commercial application of genetically modified industrial microorganisms has been problematic due to public concerns. We constructed a "self-cloning" sake yeast strain that overexpresses the ATF1 gene encoding alcohol acetyltransferase, to improve the flavor profile of Japanese sake. A constitutive yeast overexpression promoter, TDH3p, derived from the glyceraldehyde-3-phosphate dehydrogenase gene from sake yeast was fused to ATF1; and the 5' upstream non-coding sequence of ATF1 was further fused to TDH3p-ATF1. The fragment was placed on a binary vector, pGG119, containing a drug-resistance marker for transformation and a counter-selection marker for excision of unwanted DNA. The plasmid was integrated into the ATF1 locus of a sake yeast strain. This integration constructed tandem repeats of ATF1 and TDH3p-ATF1 sequences, between which the plasmid was inserted. Loss of the plasmid, which occurs through homologous recombination between either the TDH3p downstream ATF1 repeats or the TDH3p upstream repeat sequences, was selected by growing transformants on counter-selective medium. Recombination between the downstream repeats led to reversion to a wild type strain, but that between the upstream repeats resulted in a strain that possessed TDH3p-ATF1 without the extraneous DNA sequences. The self-cloning TDH3p-ATF1 yeast strain produced a higher amount of isoamyl acetate. This is the first expression-controlled self-cloning industrial yeast.
Zhang, Yu; Yao, Youlin; Jiang, Siyuan; Lu, Yilu; Liu, Yunqiang; Tao, Dachang; Zhang, Sizhong; Ma, Yongxin
2015-04-01
To identify protein-protein interaction partners of PER1 (period circadian protein homolog 1), a key component of the molecular oscillation system of the circadian rhythm, in tumors using the bacterial two-hybrid system technique. A human cervical carcinoma HeLa cell library was adopted. Recombinant bait plasmid pBT-PER1 and the pTRG cDNA plasmid library were cotransformed into the two-hybrid system reporter strain cultured in a special selective medium. Target clones were screened. After isolating the positive clones, the target clones were sequenced and analyzed. Fourteen protein coding genes were identified, 4 of which were found to contain whole coding regions of genes, which included optic atrophy 3 protein (OPA3) associated with mitochondrial dynamics and Homo sapiens cutA divalent cation tolerance homolog of E. coli (CUTA) associated with copper metabolism. There were also proteins related to cellular events, biochemical reactions and signal transduction. Identification of potential interacting proteins with PER1 in tumors may provide us new insights into the functions of the circadian clock protein PER1 during tumorigenesis.
Early development of Moniliophthora perniciosa basidiomata and developmentally regulated genes
2009-01-01
Background The hemibiotrophic fungus Moniliophthora perniciosa is the causal agent of Witches' broom, a disease of Theobroma cacao. The pathogen life cycle ends with the production of basidiocarps in dead tissues of the infected host. This structure generates millions of basidiospores that reinfect young tissues of the same or other plants. A deeper understanding of the mechanisms underlying the sexual phase of this fungus may help develop chemical, biological or genetic strategies to control the disease. Results Mycelium was morphologically analyzed prior to emergence of basidiomata by stereomicroscopy, light microscopy and scanning electron microscopy. The morphological changes in the mycelium before fructification show a pattern similar to other members of the order Agaricales. Changes and appearance of hyphae forming a surface layer by fusion were correlated with primordia emergence. The stages of hyphal nodules, aggregation, initial primordium and differentiated primordium were detected. The morphological analysis also allowed conclusions on morphogenetic aspects. To analyze the genes involved in basidiomata development, the expression of some selected EST genes from a non-normalized cDNA library, representative of the fruiting stage of M. perniciosa, was evaluated. A macroarray analysis was performed with 192 selected clones and hybridized with two distinct RNA pools extracted from mycelium in different phases of basidiomata formation. This analysis showed two groups of up and down-regulated genes in primordial phases of mycelia. Hydrophobin coding, glucose transporter, Rho-GEF, Rheb, extensin precursor and cytochrome p450 monooxygenase genes were grouped among the up-regulated. In the down-regulated group relevant genes clustered coding calmodulin, lanosterol 14 alpha demethylase and PIM1. In addition, 12 genes with more detailed expression profiles were analyzed by RT-qPCR. 
One aegerolysin gene had a peak of expression in mycelium with primordia and a second in basidiomata, confirming their distinctiveness. The number of transcripts of the gene for pleurotolysin B increased in reddish-pink mycelium and indicated an activation of the initial basidiomata production even at this culturing stage. Expression of the glucose transporter gene increased in mycelium after the stress, coinciding with a decrease of adenylate cyclase gene transcription. This indicated that nutrient uptake can be an important signal to trigger fruiting in this fungus. Conclusion The identification of genes with increased expression in this phase of the life cycle of M. perniciosa opens up new possibilities of controlling fungus spread as well as of genetic studies of biological processes that lead to basidiomycete fruiting. This is the first comparative morphologic study of the early development both in vivo and in vitro of M. perniciosa basidiomata and the first description of genes expressed at this stage of the fungal life cycle. PMID:19653910
Kim, Jae Yoon; Moon, Jun-Cheol; Kim, Hyo Chul; Shin, Seungho; Song, Kitae; Kim, Kyung-Hee; Lee, Byung-Moo
2017-01-01
Premise of the study: Positional cloning in combination with phenotyping is a general approach to identify disease-resistance gene candidates in plants; however, it requires several time-consuming steps including population or fine mapping. Therefore, in the present study, we suggest a new combined strategy to improve the identification of disease-resistance gene candidates. Methods and Results: Downy mildew (DM)–resistant maize was selected from five cultivars using a spreader row technique. Positional cloning and bioinformatics tools were used to identify the DM-resistance quantitative trait locus marker (bnlg1702) and 47 protein-coding gene annotations. Eventually, five DM-resistance gene candidates, including bZIP34, Bak1, and Ppr, were identified by quantitative reverse-transcription PCR (RT-PCR) without fine mapping of the bnlg1702 locus. Conclusions: The combined protocol with the spreader row technique, quantitative trait locus positional cloning, and quantitative RT-PCR was effective for identifying DM-resistance candidate genes. This cloning approach may be applied to other whole-genome-sequenced crops or resistance to other diseases. PMID:28224059
GeneXplorer: an interactive web application for microarray data visualization and analysis.
Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin
2004-10-01
When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.
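The nearest-neighbor display mentioned above ranks genes by the Pearson correlation of their expression vectors with a query gene. The following is a minimal sketch of that ranking only; it is not GeneXplorer's own code (which is distributed via CPAN), and the function names and the dictionary-based data layout are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector has no defined correlation; treat as 0
    return cov / (sd_x * sd_y)


def nearest_neighbors(expression, query, k=5):
    """Return the k genes most positively correlated with `query`.

    `expression` maps gene name -> list of expression values (hypothetical layout).
    """
    target = expression[query]
    scored = [
        (pearson(target, vector), gene)
        for gene, vector in expression.items()
        if gene != query
    ]
    scored.sort(reverse=True)  # highest correlation first
    return [(gene, r) for r, gene in scored[:k]]
```

Real microarray data usually contain missing values, which this sketch does not handle; a production version would need to restrict each comparison to the arrays where both genes have measurements.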
Archaebacterial rhodopsin sequences: Implications for evolution
NASA Technical Reports Server (NTRS)
Lanyi, J. K.
1991-01-01
It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Orexin gene therapy restores the timing and maintenance of wakefulness in narcoleptic mice.
Kantor, Sandor; Mochizuki, Takatoshi; Lops, Stefan N; Ko, Brian; Clain, Elizabeth; Clark, Erika; Yamamoto, Mihoko; Scammell, Thomas E
2013-08-01
Narcolepsy is caused by selective loss of the orexin/hypocretin-producing neurons of the hypothalamus. For patients with narcolepsy, chronic sleepiness is often the most disabling symptom, but current therapies rarely normalize alertness and do not address the underlying orexin deficiency. We hypothesized that the sleepiness of narcolepsy would substantially improve if orexin signaling were restored in specific brain regions at appropriate times of day. We used gene therapy to restore orexin signaling in a mouse model of narcolepsy. In these Atx mice, expression of a toxic protein (ataxin-3) selectively kills the orexin neurons. To induce ectopic expression of the orexin neuropeptides, we microinjected an adeno-associated viral vector coding for prepro-orexin plus a red fluorescence protein (AAV-orexin) into the mediobasal hypothalamus of Atx and wild-type mice. Control mice received an AAV coding only for red fluorescence protein. Two weeks later, we recorded sleep/wake behavior, locomotor activity, and body temperature and examined the patterns of orexin expression. Atx mice rescued with AAV-orexin produced long bouts of wakefulness and had a normal diurnal pattern of arousal, with the longest bouts of wake and the highest amounts of locomotor activity in the first hours of the night. In addition, AAV-orexin improved the timing of rapid eye movement sleep and the consolidation of nonrapid eye movement sleep in Atx mice. These substantial improvements in sleepiness and other symptoms of narcolepsy demonstrate the effectiveness of orexin gene therapy in a mouse model of narcolepsy. Additional work is needed to optimize this approach, but in time, AAV-orexin could become a useful therapeutic option for patients with narcolepsy.
Qi, Weiwei; Zhu, Tong; Tian, Zhongrui; Li, Chaobin; Zhang, Wei; Song, Rentao
2016-08-11
CRISPR/Cas9 genome editing strategy has been applied to a variety of species and the tRNA-processing system has been used to compact multiple gRNAs into one synthetic gene for manipulating multiple genes in rice. We optimized and introduced the multiplex gene editing strategy based on the tRNA-processing system into maize. Maize glycine-tRNA was selected to design multiple tRNA-gRNA units for the simultaneous production of numerous gRNAs under the control of one maize U6 promoter. We designed three gRNAs for simplex editing and three multiple tRNA-gRNA units for multiplex editing. The results indicate that this system not only increased the number of targeted sites but also enhanced mutagenesis efficiency in maize. Additionally, we propose an advanced sequence selection of gRNA spacers for relatively more efficient and accurate chromosomal fragment deletion, which is important for complete abolishment of gene function especially long non-coding RNAs (lncRNAs). Our results also indicated that up to four tRNA-gRNA units in one expression cassette design can still work in maize. The examples reported here demonstrate the utility of the tRNA-processing system-based strategy as an efficient multiplex genome editing tool to enhance maize genetic research and breeding.
Fedrigo, Olivier; Babbitt, Courtney C.; Wortham, Matthew; Tewari, Alok K.; London, Darin; Song, Lingyun; Lee, Bum-Kyu; Iyer, Vishwanath R.; Parker, Stephen C. J.; Margulies, Elliott H.; Wray, Gregory A.; Furey, Terrence S.; Crawford, Gregory E.
2012-01-01
Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species. PMID:22761590
Orexin Gene Therapy Restores the Timing and Maintenance of Wakefulness in Narcoleptic Mice
Kantor, Sandor; Mochizuki, Takatoshi; Lops, Stefan N.; Ko, Brian; Clain, Elizabeth; Clark, Erika; Yamamoto, Mihoko; Scammell, Thomas E.
2013-01-01
Study Objectives: Narcolepsy is caused by selective loss of the orexin/hypocretin-producing neurons of the hypothalamus. For patients with narcolepsy, chronic sleepiness is often the most disabling symptom, but current therapies rarely normalize alertness and do not address the underlying orexin deficiency. We hypothesized that the sleepiness of narcolepsy would substantially improve if orexin signaling were restored in specific brain regions at appropriate times of day. Design: We used gene therapy to restore orexin signaling in a mouse model of narcolepsy. In these Atx mice, expression of a toxic protein (ataxin-3) selectively kills the orexin neurons. Interventions: To induce ectopic expression of the orexin neuropeptides, we microinjected an adeno-associated viral vector coding for prepro-orexin plus a red fluorescence protein (AAV-orexin) into the mediobasal hypothalamus of Atx and wild-type mice. Control mice received an AAV coding only for red fluorescence protein. Two weeks later, we recorded sleep/wake behavior, locomotor activity, and body temperature and examined the patterns of orexin expression. Measurements and Results: Atx mice rescued with AAV-orexin produced long bouts of wakefulness and had a normal diurnal pattern of arousal, with the longest bouts of wake and the highest amounts of locomotor activity in the first hours of the night. In addition, AAV-orexin improved the timing of rapid eye movement sleep and the consolidation of nonrapid eye movement sleep in Atx mice. Conclusions: These substantial improvements in sleepiness and other symptoms of narcolepsy demonstrate the effectiveness of orexin gene therapy in a mouse model of narcolepsy. Additional work is needed to optimize this approach, but in time, AAV-orexin could become a useful therapeutic option for patients with narcolepsy. Citation: Kantor S; Mochizuki T; Lops SN; Ko B; Clain E; Clark E; Yamamoto M; Scammell TE. 
Orexin gene therapy restores the timing and maintenance of wakefulness in narcoleptic mice. SLEEP 2013;36(8):1129–1138. PMID:23904672
Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History
He, Yang; Deng, Cao; Fan, Gang; Qin, Shishang
2017-01-01
The Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separated the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species. PMID:28698879
A Third Approach to Gene Prediction Suggests Thousands of Additional Human Transcribed Regions
Glusman, Gustavo; Qin, Shizhen; El-Gewely, M. Raafat; Siegel, Andrew F; Roach, Jared C; Hood, Leroy; Smit, Arian F. A
2006-01-01
The identification and characterization of the complete ensemble of genes is a main goal of deciphering the digital information stored in the human genome. Many algorithms for computational gene prediction have been described, ultimately derived from two basic concepts: (1) modeling gene structure and (2) recognizing sequence similarity. Successful hybrid methods combining these two concepts have also been developed. We present a third orthogonal approach to gene prediction, based on detecting the genomic signatures of transcription, accumulated over evolutionary time. We discuss four algorithms based on this third concept: Greens and CHOWDER, which quantify mutational strand biases caused by transcription-coupled DNA repair, and ROAST and PASTA, which are based on strand-specific selection against polyadenylation signals. We combined these algorithms into an integrated method called FEAST, which we used to predict the location and orientation of thousands of putative transcription units not overlapping known genes. Many of the newly predicted transcriptional units do not appear to code for proteins. The new algorithms are particularly apt at detecting genes with long introns and lacking sequence conservation. They therefore complement existing gene prediction methods and will help identify functional transcripts within many apparent “genomic deserts.” PMID:16543943
Peng, Rui; Zeng, Bo; Meng, Xiuxiang; Yue, Bisong; Zhang, Zhihe; Zou, Fangdong
2007-08-01
The complete mitochondrial genome sequence of the giant panda, Ailuropoda melanoleuca, was determined by the long and accurate polymerase chain reaction (LA-PCR) with conserved primers and primer walking sequencing methods. The complete mitochondrial DNA is 16,805 nucleotides in length and contains two ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and one control region. The total length of the 13 protein-coding genes is longer than in the American black bear, brown bear and polar bear by 3 amino acids at the end of the ND5 gene. The codon usage also followed the typical vertebrate pattern except for an unusual ATT start codon, which initiates the NADH dehydrogenase subunit 5 (ND5) gene. The molecular phylogenetic analysis was performed on the sequences of 12 concatenated heavy-strand encoded protein-coding genes, and suggested that the giant panda is most closely related to bears.
Signatures of selection in tilapia revealed by whole genome resequencing.
Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua
2015-09-16
Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can help accelerate genetic improvement. However, selection signatures, including those of artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10 to 100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.
Hierarchical Parallelization of Gene Differential Association Analysis
2011-01-01
Background Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Results Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. Conclusions The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels. PMID:21936916
Hierarchical parallelization of gene differential association analysis.
Needham, Mark; Hu, Rui; Dwarkadas, Sandhya; Qiu, Xing
2011-09-21
Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.
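The sizing principle stated above (pick a number of threads per MPI process such that the working sets of the processes on a multicore node fit in cache) can be turned into a small back-of-the-envelope calculation. The model below is an assumption made for illustration, not the paper's procedure: it supposes that threads within one process share a single working set, that every core is used, and that the function name and parameters are hypothetical.

```python
def threads_per_process(cores, cache_bytes, working_set_bytes):
    """Smallest thread count t (a divisor of `cores`) such that the
    cores // t processes on one node fit their working sets in cache.

    Assumed model: one working set per process, shared by its threads.
    Returns None if even a single process does not fit.
    """
    for threads in range(1, cores + 1):
        if cores % threads:
            continue  # keep every core busy: threads must divide cores
        processes = cores // threads
        if processes * working_set_bytes <= cache_bytes:
            return threads
    return None


MIB = 1024 * 1024
# Example: 8 cores sharing an 8 MiB cache, 3 MiB working set per process.
suggested = threads_per_process(8, 8 * MIB, 3 * MIB)
```

Under these assumptions, fewer and larger processes are preferred as the per-process working set grows; actual tuning would still need measurement on the target cluster, since cache hierarchies are rarely a single shared pool.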
Cui, Peng; Liu, Huitao; Lin, Qiang; Ding, Feng; Zhuo, Guoyin; Hu, Songnian; Liu, Dongcheng; Yang, Wenlong; Zhan, Kehui; Zhang, Aimin; Yu, Jun
2009-12-01
Plant mitochondrial genomes, encoding necessary proteins involved in the system of energy production, play an important role in the development and reproduction of the plant. They follow a specific evolutionary pattern relative to their nuclear counterparts. Here, we determined the winter wheat (Triticum aestivum cv. Chinese Yumai) mitochondrial genome, with a length of 452,526 bp, by shotgun sequencing its BAC library. It contains 202 genes, including 35 known protein-coding genes, three rRNA and 17 tRNA genes, as well as 149 open reading frames (ORFs; greater than 300 bp in length). The sequence is almost identical to the previously reported sequence of the spring wheat (T. aestivum cv. Chinese Spring); we only identified seven SNPs (three transitions and four transversions) and 10 indels (insertions and deletions) between the two independently acquired sequences, and all variations were found in non-coding regions. This result confirmed the accuracy of the previously reported mitochondrial sequence of the Chinese Spring wheat. The nucleotide frequency and codon usage of wheat are common among the lineage of higher plants, with a high AT-content of 58%. Molecular evolutionary analysis demonstrated that plant mitochondrial genomes evolved at different rates, which may correlate with substantial variations in metabolic rate and generation time among plant lineages. In addition, through the estimation of the ratio of non-synonymous to synonymous substitution rates between orthologous mitochondrion-encoded genes of higher plants, we found an accelerated evolutionary rate that seems to be the result of relaxed selection.
Solov'ev, V V; Kel', A E; Kolchanov, N A
1989-01-01
The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. An interesting property of the genetic code has been revealed in the analysis of symmetrical repeats: the pairs of symmetrical codons corresponded to pairs of amino acids with mostly similar physical-chemical parameters. This property may explain the presence of symmetrical repeats and palindromes only in genes coding for beta-structural proteins-polypeptides, where amino acids with similar physical-chemical properties occupy symmetrical positions. A stochastic model of evolution of polynucleotide sequences has been used for analysis of inverted repeats. The modelling demonstrated that a constraint on the sequences (uneven frequencies of used codons) is by itself enough for nonrandom inverted repeats to arise in genes.
Lack of pathogenic mutations in SOS1 gene in phenytoin-induced gingival overgrowth patients.
Margiotti, Katia; Pascolini, Giulia; Consoli, Federica; Guida, Valentina; Di Bonaventura, Carlo; Giallonardo, Anna Teresa; Pizzuti, Antonio; De Luca, Alessandro
2017-08-01
Gingival overgrowth is a side effect associated with several distinct classes of drugs, such as anticonvulsants, immunosuppressants, and calcium channel blockers. One of the main drugs associated with gingival overgrowth is the antiepileptic phenytoin, which affects gingival tissues by altering extracellular matrix metabolism. Mutation of the human SOS1 gene has been shown to be responsible for the rare hereditary gingival fibromatosis type 1, a benign gingival overgrowth. The aim of the present study is to evaluate the possible contribution of SOS1 mutation to the gingival overgrowth-related phenotype. We selected a group of 24 epileptic patients who experienced significant gingival overgrowth following phenytoin therapy and screened them for mutations. Mutation scanning was carried out by denaturing high-performance liquid chromatography analysis of the entire coding region of the SOS1 gene. Newly identified variants were analyzed in silico using Alamut Visual mutation interpretation software and compared with a normal control group. Mutation scanning of the entire coding sequence of the SOS1 gene identified seven intronic variants and one new exonic substitution (c.138G>A). The seven common intronic variants were not considered to be of pathogenic importance. The exonic substitution c.138G>A was absent from 100 ethnically matched normal control chromosomes, but was not expected to have functional significance based on bioinformatic prediction tools. This study represents the first mutation analysis of the SOS1 gene in epileptic patients with phenytoin-induced gingival overgrowth. The present results suggest that obvious pathogenic mutations in the SOS1 gene do not represent a common mechanism underlying phenytoin-induced gingival overgrowth in epileptic patients; other mechanisms are likely to be involved in the pathogenesis of this drug-induced phenotype.
Heimann, Louisa; Horst, Ina; Perduns, Renke; Dreesen, Björn; Offermann, Sascha; Peterhansel, Christoph
2013-05-01
C4 photosynthesis evolved more than 60 times independently in different plant lineages. Each time, multiple genes were recruited into C4 metabolism. The corresponding promoters acquired new regulatory features such as high expression, light induction, or cell type-specific expression in mesophyll or bundle sheath cells. We have previously shown that histone modifications contribute to the regulation of the model C4 phosphoenolpyruvate carboxylase (C4-Pepc) promoter in maize (Zea mays). Here, we tested the light- and cell type-specific responses of three selected histone acetylations and two histone methylations on five additional C4 genes (C4-Ca, C4-Ppdk, C4-Me, C4-Pepck, and C4-RbcS2) in maize. Histone acetylation and nucleosome occupancy assays indicated extended promoter regions, with regulatory upstream regions more than 1,000 bp from the transcription initiation site, for most of these genes. Despite the absence of any detectable homology between the promoters at the primary sequence level, histone modification patterns were highly coregulated. Specifically, H3K9ac was regulated by illumination, whereas H3K4me3 was regulated in a cell type-specific manner. We further compared histone modifications on the C4-Pepc and C4-Me genes from maize and the homologous genes from sorghum (Sorghum bicolor) and Setaria italica. Whereas sorghum and maize share a common C4 origin, C4 metabolism evolved independently in S. italica. The distribution of histone modifications over the promoters differed between the species, but differential regulation of light-induced histone acetylation and cell type-specific histone methylation was evident in all three species. We propose that a preexisting histone code was recruited into C4 promoter control during the evolution of C4 metabolism.
Buhler, Stéphane; Sanchez-Mazas, Alicia
2011-01-01
Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. 
We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used to explore the genetic history of human populations, and that their analysis allows a more thorough investigation of human MHC molecular evolution. PMID:21408106
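The Tajima's test mentioned in this abstract compares two estimates of nucleotide diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The following Python sketch implements the standard Tajima's D formula under stated assumptions; it is not the authors' software. The function name `tajimas_d` is hypothetical, the input is assumed to be a list of aligned, equal-length sequences without gaps, and at least four sequences are required because the variance term is zero for smaller samples.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, gap-free, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are required")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one allele.
    s_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))
```

A sample dominated by rare variants (for example, three singletons among four sequences) gives a negative D, whereas two divergent haplotypes at equal frequency give a positive D, the pattern expected when balancing selection maintains very divergent alleles.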