Differentially-Expressed Pseudogenes in HIV-1 Infection.
Gupta, Aditi; Brown, C Titus; Zheng, Yong-Hui; Adami, Christoph
2015-09-29
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit.
PMID:26426037
Pseudogene redux with new biological significance.
Salmena, Leonardo
2014-01-01
The study of pseudogenes, originally dismissed as genomic relics of evolutionary selection, has seen a resurgence in scientific literature, in addition to being a peculiar topic of discussion in theological debates. For a long time, pseudogenes have been touted as a beacon of natural selection and a definitive proof of evolution due to the slow mutation rate that differentiated them from their parental genes and ultimately caused their genetic demise as functional genes. It now seems that "creationists" have co-opted some recent reports attributing previously unrecognized biological functions to pseudogenes and other noncoding RNAs as evidence to undermine the existence of evolution and to support intelligent design. This issue of Methods in Molecular Biology focused on pseudogenes will certainly neither end nor enter this debate; however, scientists who are also genomics and pseudogene enthusiasts will certainly appreciate that many scientists are thinking about these particular genetic elements in new and interesting ways. With this new interest in a biological significance and "non-junk" role for pseudogenes and other noncoding RNAs, new methods and approaches are being developed to unlock the mystery of these ancient artifacts we know as pseudogenes. In this brief introductory chapter we highlight the renewed interest in pseudogenes and review a rationale for intensification of pseudogene-related research.
Pavlícek, Adam; Paces, Jan; Elleder, Daniel; Hejnar, Jirí
2002-03-01
We report here the presence of numerous processed pseudogenes derived from the W family of endogenous retroviruses in the human genome. These pseudogenes are structurally colinear with the retroviral mRNA followed by a poly(A) tail. Our analysis of insertion sites of HERV-W processed pseudogenes shows a strong preference for the insertion motif of long interspersed nuclear element (LINE) retrotransposons. The genomic distribution, stability during evolution, and frequent truncations at the 5' end resemble those of the pseudogenes generated by LINEs. We therefore suggest that HERV-W processed pseudogenes arose by multiple and independent LINE-mediated retrotransposition of retroviral mRNA. These data document that the majority of HERV-W copies are actually nontranscribed promoterless pseudogenes. The current search for HERV-Ws associated with several human diseases should concentrate on a small subset of transcriptionally competent elements.
2010-01-01
Background Unitary pseudogenes are a class of unprocessed pseudogenes without functioning counterparts in the genome. They constitute only a small fraction of annotated pseudogenes in the human genome. However, as they represent distinct functional losses over time, they shed light on the unique features of humans in primate evolution. Results We have developed a pipeline to detect human unitary pseudogenes through analyzing the global inventory of orthologs between the human genome and its mammalian relatives. We focus on gene losses along the human lineage after the divergence from rodents about 75 million years ago. In total, we identify 76 unitary pseudogenes, including previously annotated ones, and many novel ones. By comparing each of these to its functioning ortholog in other mammals, we can approximately date the creation of each unitary pseudogene (that is, the gene 'death date') and show that for our group of 76, the functional genes appear to be disabled at a fairly uniform rate throughout primate evolution - not all at once, correlated, for instance, with the 'Alu burst'. Furthermore, we identify 11 unitary pseudogenes that are polymorphic - that is, they have both nonfunctional and functional alleles currently segregating in the human population. Comparing them with their orthologs in other primates, we find that two of them are in fact pseudogenes in non-human primates, suggesting that they represent cases of a gene being resurrected in the human lineage. Conclusions This analysis of unitary pseudogenes provides insights into the evolutionary constraints faced by different organisms and the timescales of functional gene loss in humans. PMID:20210993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tayebi, N.; Cushner, S.; Sidransky, E.
1996-09-01
We describe the use of long-template PCR to differentiate the glucocerebrosidase gene from its pseudogene, which will simplify molecular diagnostic testing and the detection of known and new mutations in patients with Gaucher disease. Gaucher disease results from the inherited deficiency of the lysosomal enzyme, glucocerebrosidase. Sixteen kilobases downstream of the glucocerebrosidase gene is a pseudogene, which is approximately 2 kb shorter and has >96% identity to the coding regions of the functional gene. Many mutations encountered in Gaucher patients are identical to sequences ordinarily found only in the pseudogene, and some result from recombination between the gene and pseudogene. Thus, for diagnostic purposes it is essential to differentiate between sequences from the gene and pseudogene. 9 refs., 1 fig.
A mammalian pseudogene lncRNA at the interface of inflammation and anti-inflammatory therapeutics
Rapicavoli, Nicole A; Qu, Kun; Zhang, Jiajing; Mikhail, Megan; Laberge, Remi-Martin; Chang, Howard Y
2013-01-01
Pseudogenes are thought to be inactive gene sequences, but recent evidence of extensive pseudogene transcription raised the question of potential function. Here we discover and characterize the sets of mouse lncRNAs induced by inflammatory signaling via TNFα. TNFα regulates hundreds of lncRNAs, including 54 pseudogene lncRNAs, several of which show exquisitely selective expression in response to specific cytokines and microbial components in a NF-κB-dependent manner. Lethe, a pseudogene lncRNA, is selectively induced by proinflammatory cytokines via NF-κB or glucocorticoid receptor agonist, and functions in negative feedback signaling to NF-κB. Lethe interacts with NF-κB subunit RelA to inhibit RelA DNA binding and target gene activation. Lethe level decreases with organismal age, a physiological state associated with increased NF-κB activity. These findings suggest that expression of pseudogene lncRNAs is actively regulated and that these lncRNAs constitute functional regulators of inflammatory signaling. DOI: http://dx.doi.org/10.7554/eLife.00762.001 PMID:23898399
Harpke, Doerte; Peterson, Angela
2008-05-01
The internal transcribed spacer (ITS) region (ITS1, 5.8S rDNA, ITS2) represents the most widely applied nuclear marker in eukaryotic phylogenetics. Although this region has been assumed to evolve in concert, the number of investigations revealing high degrees of intra-individual polymorphism connected with the presence of pseudogenes has risen. The 5.8S rDNA is the most important diagnostic marker for functionality of the ITS region. In Mammillaria, intra-individual 5.8S rDNA polymorphisms of up to 36% and up to nine different types have been found. Twenty-eight of 30 cloned genomic Mammillaria sequences were identified as putative pseudogenes. For the identification of pseudogenic ITS regions, in addition to formal tests based on substitution rates, we attempted to focus on functional features of the 5.8S rDNA (5.8S motif, secondary structure). The importance of functional data for the identification of pseudogenes is outlined and discussed. The identification of pseudogenes is essential, because they may cause erroneous phylogenies and taxonomic problems.
Zuriaga, María Angeles; Mas-Coma, Santiago; Bargues, María Dolores
2015-05-01
A pseudogene, designated as "ps(5.8S+ITS-2)", paralogous to the 5.8S gene and internal transcribed spacer (ITS)-2 of the nuclear ribosomal DNA (rDNA), has been recently found in many triatomine species distributed throughout North America, Central America and northern South America. Among characteristics used as criteria for pseudogene verification, secondary structures and free energy are highlighted, showing a lower fit between minimum free energy, partition function and centroid structures, although in given cases the fit only appeared to be slightly lower. The unique characteristics of "ps(5.8S+ITS-2)" as a processed or retrotransposed pseudogenic unit of the ghost type are reviewed, with emphasis on its potential functionality compared to the functionality of genes and spacers of the normal rDNA operon. Besides the technical problem of the risk for erroneous sequence results, the usefulness of "ps(5.8S+ITS-2)" for specimen classification, phylogenetic analyses and systematic/taxonomic studies should be highlighted, based on consistency and retention index values, which in pseudogenic sequence trees were higher than in functional sequence trees. Additionally, intraindividual, interpopulational and interspecific differences in pseudogene amount and the fact that it is a pseudogene in the nuclear rDNA suggest potential relationships with fitness, behaviour and adaptability of triatomine vectors and consequently its potential utility in Chagas disease epidemiology and control.
2013-01-01
Background Pseudogenes are traditionally considered “dead” genes, therefore lacking biological functions. This view has however been challenged during the last decade. This is the case of the Protein phosphatase 1 regulatory subunit 2 (PPP1R2) or inhibitor-2 gene family, for which several incomplete copies exist scattered throughout the genome. Results In this study, the pseudogenization process of PPP1R2 was analyzed. Ten PPP1R2-related pseudogenes (PPP1R2P1-P10), highly similar to PPP1R2, were retrieved from the human genome assembly present in the databases. The phylogenetic analysis of mammalian PPP1R2 and related pseudogenes suggested that PPP1R2P7 and PPP1R2P9 retroposons appeared before the great mammalian radiation, while the remaining pseudogenes are primate-specific and retroposed at different times during Primate evolution. Although considered inactive, four of these pseudogenes seem to be transcribed and possibly possess biological functions. Given the role of PPP1R2 in sperm motility, the presence of these proteins was assessed in human sperm, and two PPP1R2-related proteins were detected, PPP1R2P3 and PPP1R2P9. Signatures of negative and positive selection were also detected in PPP1R2P9, further suggesting a role as a functional protein. Conclusions The results show that contrary to initial observations PPP1R2-related pseudogenes are not simple bystanders of the evolutionary process but may rather be at the origin of genes with novel functions. PMID:24195737
Molecular analysis of the glucocerebrosidase gene locus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Winfield, S.L.; Martin, B.M.; Fandino, A.
1994-09-01
Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing approximately 360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5', central, or 3' sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5' end of the functional gene and 16 kb of 5' flanking region and approximately 15 kb of 3' flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3' flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5' flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.
De Martino, Marco; Azzariti, Amalia; Arra, Claudio; Fusco, Alfredo; Esposito, Francesco
2017-01-01
Several studies have established that pseudogene mRNAs can work as competing endogenous RNAs and, when deregulated, play a key role in the onset of human neoplasias. Recently, we have isolated two HMGA1 pseudogenes, HMGA1P6 and HMGA1P7. These pseudogenes have a critical role in cancer progression, acting as micro RNA (miRNA) sponges for HMGA1 and other cancer-related genes. HMGA1 pseudogenes were found overexpressed in several human carcinomas, and their expression levels positively correlate with an advanced cancer stage and a poor prognosis. In order to investigate the molecular alterations following HMGA1 pseudogene 7 overexpression, we carried out miRNA sequencing analysis on HMGA1P7 overexpressing mouse embryonic fibroblasts. Intriguingly, the most upregulated miRNAs were miR-483 and miR-675 that have been described as key regulators in cancer progression. Here, we report that HMGA1P7 upregulates miR-483 and miR-675 through a competing endogenous RNA mechanism with Egr1, a transcriptional factor that positively regulates miR-483 and miR-675 expression. PMID:29149041
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.
Wimmer, Katharina; Wernstedt, Annekatrin
2014-01-01
The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage that it allows effective identification of deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.
Bargues, M Dolores; Zuriaga, M Angeles; Mas-Coma, Santiago
2014-01-01
A pseudogene, paralogous to rDNA 5.8S and ITS-2, is described in Meccus dimidiata dimidiata, M. d. capitata, M. d. maculipennis, M. d. hegneri, M. sp. aff. dimidiata, M. p. phyllosoma, M. p. longipennis, M. p. pallidipennis, M. p. picturata, M. p. mazzottii, Triatoma mexicana, Triatoma nitida and Triatoma sanguisuga, covering North America, Central America and northern South America. Such a nuclear rDNA pseudogene is very rare. In the 5.8S gene, criteria for pseudogene identification included length variability, lower GC content, mutations regarding the functional uniform sequence, and relatively high base substitutions in evolutionary conserved sites. At ITS-2 level, criteria were the shorter sequence and large proportion of insertions and deletions (indels). Pseudogenic 5.8S and ITS-2 secondary structures were different from the functional foldings and different from one another, showing less negative values for minimum free energy (mfe) and centroid predictions, and lower fit between mfe, partition function, and centroid structures. A complete characterization indicated a processed pseudogenic unit of the ghost type, escaping from rDNA concerted evolution and with functionality subject to constraints instead of evolving freely by neutral drift. Despite a high indel number, low mutation number and an evolutionary rate similar to the functional ITS-2, that pseudogene distinguishes different taxa and furnishes coherent phylogenetic topologies with resolution similar to the functional ITS-2. The discovery of a pseudogene in many phylogenetically related species is unique in animals and allowed for an estimation of its palaeobiogeographical origin based on molecular clock data, inheritance pathways, evolutionary rate and pattern, and geographical spread. In addition to the technical risk to be considered henceforth, this relict pseudogene, designated as "ps(5.8S+ITS-2)", proves to be a valuable marker for specimen classification, phylogenetic analyses, and systematic/taxonomic studies. It opens a new research field, Chagas disease epidemiology and control included, given its potential relationships with triatomine fitness, behaviour and adaptability. Copyright © 2013 Elsevier B.V. All rights reserved.
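One of the criteria listed in the abstract above, lower GC content in the pseudogenic copy, is simple to compute. The following is only a minimal sketch of that single criterion, not the authors' procedure; the function name gc_content and both toy sequences are hypothetical and are not real triatomine data.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


# Hypothetical toy sequences used only to illustrate the comparison.
functional = "GCGCGGATCCGCGTACGCGG"
candidate = "GCATTAATCCATATACGAAT"

print(f"functional GC fraction: {gc_content(functional):.2f}")
print(f"candidate GC fraction:  {gc_content(candidate):.2f}")
```

A lower GC fraction on its own would not establish pseudogene status; the abstract combines it with length variability, substitutions at conserved sites and secondary-structure comparisons.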
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bozza, M.; Gerard, C.; Kolakowski, L.F. Jr.
1995-06-10
Macrophage migration inhibitory factor, MIF, is a cytokine released by T-lymphocytes, macrophages, and the pituitary gland that serves to integrate peripheral and central inflammatory responses. Ubiquitous expression and developmental regulation suggest that MIF may have additional roles outside of the immune system. Here we report the structure and chromosomal location of the mouse Mif gene and the partial characterization of five Mif pseudogenes. The mouse Mif gene spans less than 0.7 kb of chromosomal DNA and is composed of three exons. A comparison between the mouse and the human genes shows a similar gene structure and common regulatory elements in both promoter regions. The mouse Mif gene maps to the middle region of chromosome 10, between Bcr and S100b, which have been mapped to human chromosomes 22q11 and 21q22.3, respectively. The entire sequence of two pseudogenes demonstrates the absence of introns, the presence of the 5' untranslated region of the cDNA, a 3' poly(A) tail, and the lack of sequence similarity with untranscribed regions of the gene. The five pseudogenes are highly homologous to the cDNA, but contain a variable number of mutations that would produce mutated or truncated MIF-like proteins. Phylogenetic analyses of MIF genes and pseudogenes indicate several independent genetic events that can account for multiple genomic integrations. Three of the Mif pseudogenes were also mapped by interspecific backcross to chromosomes 1, 9, and 17. These results suggest that Mif pseudogenes originated by retrotransposition. 46 refs., 5 figs., 1 tab.
Stem cell regulatory function mediated by expression of a novel mouse Oct4 pseudogene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Huey; Shabbir, Arsalan; Molnar, Merced
2007-03-30
Multiple pseudogenes have been proposed for embryonic stem (ES) cell-specific genes, and their abundance suggests that some of these potential pseudogenes may be functional. ES cell-specific expression of Oct4 regulates stem cell pluripotency and self-renewing state. Although Oct4 expression has been reported in adult tissues during gene reprogramming, the detected Oct4 signal might be contributed by Oct4 pseudogenes. Among the multiple Oct4 transcripts characterized here is an approximately 1 kb clone derived from P19 embryonal carcinoma stem cells, which shares approximately 87% sequence homology with the parent Oct4 gene, and has the potential of encoding an 80-amino acid product (designated as Oct4P1). Adenoviral expression of Oct4P1 in mesenchymal stem cells promotes their proliferation and inhibits their osteochondral differentiation. These dual effects of Oct4P1 are reminiscent of the stem cell regulatory function of the parent Oct4, and suggest that Oct4P1 may be a functional pseudogene or a novel Oct4-related gene with a unique function in stem cells.
Detection of two distinct forms of apoC-I in great apes.
Puppione, Donald L; Ryan, Christopher M; Bassilian, Sara; Souda, Puneet; Xiao, Xinshu; Ryder, Oliver A; Whitelegge, Julian P
2010-03-01
ApoC-I, the smallest of the soluble apolipoproteins, associates with both TG-rich lipoproteins and HDL. Mass spectral analyses of human apoC-I previously had demonstrated that in the circulation there are two forms, either a 57 amino acid protein or a 55 amino acid protein, due to the loss of two amino acids from the N-terminus. In our analyses of the apolipoproteins of the other great apes by mass spectrometry, four forms of apoC-I were detected. Two of these showed a high degree of identity to the mature and truncated forms of human apoC-I. The other two were homologous to the virtual protein and its truncated form that are encoded by a human pseudogene. In humans, the genes for apoC-I and its pseudogene are located on chromosome 19, the pseudogene being 2.5 kb downstream from the apoC-I gene. Based on the similarity between the apoC-I gene and the pseudogene, it has been concluded that the latter arose from the former as a result of gene duplication approximately 35 million years ago. Interestingly, the virtual protein encoded by the pseudogene is acidic, not basic like apoC-I. In the chimpanzee, there also are two genes for apoC-I, the one upstream encodes a basic protein and the downstream gene, rather than being a pseudogene, encodes an acidic protein (P86336). In addition to reporting on the molecular masses of great ape apoC-I, we were able to clearly demonstrate by "Top-down" sequencing that the acidic form arose from a separate gene. In our analyses, we have measured the molecular masses of apoC-I associated with the HDL of the following great apes: bonobo (Pan paniscus), chimpanzee (Pan troglodytes), and the Sumatran orangutan (Pongo abelii). Genomic variations in chromosome 19 among great apes, baboons and macaques as they relate to both genes for apoC-I and the pseudogene are compared and discussed.
Noise-induced multistability in the regulation of cancer by genes and pseudogenes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petrosyan, K. G., E-mail: pkaren@phys.sinica.edu.tw; Hu, Chin-Kun, E-mail: huck@phys.sinica.edu.tw; National Center for Theoretical Sciences, National Tsing Hua University, Hsinchu 30013, Taiwan
2016-07-28
We extend a previously introduced model of stochastic gene regulation of cancer to a nonlinear case having both gene and pseudogene messenger RNAs (mRNAs) self-regulated. The model consists of stochastic Boolean genetic elements and possesses noise-induced multistability (multimodality). We obtain analytical expressions for probabilities for the case of a constant but finite number of microRNA molecules which act as a noise source for the competing gene and pseudogene mRNAs. The probability distribution functions display both the global bistability regime as well as even-odd number oscillations for a certain range of model parameters. Statistical characteristics of the fluctuations in mRNA levels are evaluated. The obtained results of the extended model advance our understanding of the process of stochastic gene and pseudogene expression that is crucial in regulation of cancer.
Crainey, James Lee; Marín, Michel Abanto; Silva, Túllio Romão Ribeiro da; de Medeiros, Jansen Fernandes; Pessoa, Felipe Arley Costa; Santos, Yago Vinícius; Vicente, Ana Carolina Paulo; Luz, Sérgio Luiz Bessa
2018-04-18
Despite the broad distribution of M. ozzardi in Latin America and the Caribbean, there is still very little DNA sequence data available to study this neglected parasite's epidemiology. Mitochondrial DNA (mtDNA) sequences, especially the cytochrome oxidase (CO1) gene's barcoding region, have been targeted successfully for filarial diagnostics and for epidemiological, ecological and evolutionary studies. MtDNA-based studies can, however, be compromised by unrecognised mitochondrial pseudogenes, such as Numts. Here, we have used shot-gun Illumina-HiSeq sequencing to recover the first complete Mansonella genus mitogenome and to identify several mitochondrial-origin pseudogenes. Mitogenome phylogenetic analysis placed M. ozzardi in the Onchocercidae "ONC5" clade and suggested that Mansonella parasites are more closely related to Wuchereria and Brugia genera parasites than they are to Loa genus parasites. DNA sequence alignments, BLAST searches and conceptual translations have been used to complement phylogenetic analysis showing that M. ozzardi from the Amazon and Caribbean regions are near-identical and that previously reported Peruvian M. ozzardi CO1 reference sequences are probably of pseudogene origin. In addition to adding a much-needed resource to the Mansonella genus's molecular tool-kit and providing evidence that some M. ozzardi CO1 sequence deposits are pseudogenes, our results suggest that all Neotropical M. ozzardi parasites are closely related.
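The conceptual translations mentioned in the abstract above are commonly used to flag candidate pseudogenes by looking for stop codons inside what should be an open reading frame. The following is a minimal sketch of that idea only, assuming the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are stops and TGA encodes tryptophan. The function name internal_stops and the toy sequences are hypothetical, and this is not the authors' pipeline.

```python
# Stop codons in the invertebrate mitochondrial code (NCBI translation table 5).
# TGA is deliberately absent: in this code it encodes tryptophan.
MITO_STOPS = {"TAA", "TAG"}


def internal_stops(seq: str, frame: int = 0) -> list[int]:
    """Return 0-based nucleotide positions of in-frame stop codons.

    The final complete codon is skipped so that a legitimate terminal
    stop is not reported as a disruption.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [
        frame + 3 * index
        for index, codon in enumerate(codons[:-1])
        if codon in MITO_STOPS
    ]


# Hypothetical toy sequences: the first reads through TGA, the second
# carries a premature TAG at nucleotide position 3.
print(internal_stops("ATGTGATTTTAA"))
print(internal_stops("ATGTAGTTTTAA"))
```

An in-frame stop is only suggestive. A partial barcode amplicon may not start in frame 0, so in practice each candidate frame would need to be checked before a sequence is called a pseudogene.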
Collery, Mark M; Smyth, Cyril J
2007-02-01
The egc locus of Staphylococcus aureus harbours two enterotoxin genes (seg and sei) and three enterotoxin-like genes (selm, seln and selo). Between the sei and seln genes are located two pseudogenes, psient1 and psient2, or the selu or seluv gene. While these two alternative sei-seln intergenic regions can be distinguished by PCR, to date, DNA sequencing has been the only confirmatory option because of the very high degree of sequence similarity between egc loci bearing the pseudogenes and the selu or seluv gene. In silico restriction enzyme digestion of genomic regions encompassing the egc locus from the 3' end of the sei gene through the 5' first quarter of the seln gene allowed pseudogene- and selu- or seluv-bearing egc loci to be distinguished by PCR-RFLP. Experimental application of these findings demonstrated that endonuclease HindIII cleaved PCR amplimers bearing pseudogenes but not those with a selu or seluv gene, while selu- or seluv-bearing amplimers were susceptible to cleavage by endonuclease HphI, but not by endonuclease HindIII. The restriction enzyme BccI cleaved selu- or seluv-harbouring amplimers at a unique restriction site created by their signature 15 bp insertion compared with pseudogene-bearing amplimers, thereby allowing distinction of these egc loci. PCR-RFLP analysis using these restriction enzymes provides a rapid, easy to interpret alternative to DNA sequencing for verification of PCR findings on the nature of an egc locus type, and can also be used for the primary identification of the intergenic sei-seln egc locus type.
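The in silico digestion step described above can be illustrated with a short sketch. It models only HindIII, whose palindromic recognition site AAGCTT is cut after the first A, on a linear amplimer. The function name digest and the toy sequences are hypothetical and are not egc locus sequences.

```python
def digest(seq: str, site: str = "AAGCTT", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site.

    The defaults describe HindIII (A^AGCTT). Only the given strand is
    searched, which is sufficient for a palindromic site.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical toy amplimers: one carries a HindIII site, one does not.
print(digest("GGGAAGCTTCCC"))
print(digest("GGGCCCGGGCCC"))
```

An amplimer returned as a single fragment is predicted to be uncut, which is the pattern the abstract reports for selu- or seluv-bearing amplimers with HindIII. HphI and BccI are not modelled here: their recognition sites are not palindromic and they cut outside the site, so both strands would have to be searched.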
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ginns, E.I.; Winfield, S.; Sidransky, E.
1994-09-01
The human GC locus on chromosome 1q21 encompasses a 7 kb functional gene encoding the enzyme deficient in Gaucher disease, and a highly homologous sequence 16 kb downstream that has the properties of a pseudogene. A novel gene, gene X, spanning the 6 kb region between the pseudogene and TSP3 has been identified and characterized in the mouse, and appears to be critical for normal embryonic development. As in the mouse, the human gene X is located 5' to the TSP3 gene and the two genes are transcribed divergently from a bidirectional promoter; the direction of transcription of gene X and GC is convergent. However, in the human, gene X and GC are separated by gene X and GC pseudogenes that are the consequence of a gene duplication. The gene X pseudogene lacks the first exon and part of the second exon of the functional gene and may not be transcribed. Northern blot analyses indicate that gene X is transcribed in both normal individuals and in patients with Gaucher disease, but the function of this gene is still unknown. The possibility that mutations in gene X could account for some of the diversity of symptoms encountered in individuals with the more atypical presentations of Gaucher disease is under investigation.
Avoidance of pseudogene interference in the detection of 3' deletions in PMS2.
Vaughn, Cecily P; Hart, Kimberly J; Samowitz, Wade S; Swensen, Jeffrey J
2011-09-01
Lynch syndrome is characterized by mutations in the mismatch repair genes MLH1, MSH2, MSH6, and PMS2. In PMS2, detection of mutations is confounded by numerous pseudogenes. Detection of 3' deletions is particularly complicated by the pseudogene PMS2CL, which has strong similarity to PMS2 exons 9 and 11-15, due to extensive gene conversion. A newly designed multiplex ligation-dependent probe amplification (MLPA) kit incorporates probes for variants found in both PMS2 and PMS2CL. This provides detection of deletions, but does not allow localization of deletions to the gene or pseudogene. To address this, we have developed a methodology incorporating reference samples with known copy numbers of variants, and paired MLPA results with sequencing of PMS2 and PMS2CL. We tested a subset of clinically indicated samples for which mutations were either unidentified or not fully characterized using existing methods. We identified eight unrelated patients with deletions encompassing exons 9-15, 11-15, 13-15, 14-15, and 15. By incorporating specific, characterized reference samples and sequencing the gene and pseudogene it is possible to identify deletions in this region of PMS2 and provide clinically relevant results. This methodology represents a significant advance in the diagnosis of patients with Lynch syndrome caused by PMS2 mutations. © 2011 Wiley-Liss, Inc.
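The use of reference samples with known copy numbers can be illustrated with the usual MLPA dosage-quotient arithmetic. This is a minimal sketch under stated assumptions, not the authors' method: the function names and peak values are hypothetical, and the example assumes a probe that hybridizes to both PMS2 and PMS2CL, so that a reference sample with two copies of each carries four target copies in total.

```python
def dosage_quotient(sample_probe: float, sample_ref: float,
                    control_probe: float, control_ref: float) -> float:
    """Ratio of the normalized probe signal in the test sample to the
    normalized probe signal in a reference sample."""
    return (sample_probe / sample_ref) / (control_probe / control_ref)


def estimated_copies(dq: float, control_copies: int) -> float:
    """Scale the dosage quotient by the known copy number of the reference."""
    return dq * control_copies


# Hypothetical peak heights: the target probe signal is reduced by a
# quarter relative to the reference sample.
dq = dosage_quotient(sample_probe=750.0, sample_ref=1000.0,
                     control_probe=1000.0, control_ref=1000.0)
print(dq, estimated_copies(dq, control_copies=4))
```

A result near three copies indicates that one copy has been lost but cannot say whether the loss is in the gene or the pseudogene, which is why the abstract pairs MLPA with sequencing of both PMS2 and PMS2CL.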
Mycobacterium leprae: genes, pseudogenes and genetic diversity
Singh, Pushpendra; Cole, Stewart T
2011-01-01
Leprosy, which has afflicted human populations for millennia, results from infection with Mycobacterium leprae, an unculturable pathogen with an exceptionally long generation time. Considerable insight into the biology and drug resistance of the leprosy bacillus has been obtained from genomics. M. leprae has undergone reductive evolution and pseudogenes now occupy half of its genome. Comparative genomics of four different strains revealed remarkable conservation of the genome (99.995% identity) yet uncovered 215 polymorphic sites, mainly single nucleotide polymorphisms, and a handful of new pseudogenes. Mapping these polymorphisms in a large panel of strains defined 16 single nucleotide polymorphism-subtypes that showed strong geographical associations and helped retrace the evolution of M. leprae. PMID:21162636
Lin, Ya-Ying
2017-01-01
A portion of the mitochondrial cytochrome c oxidase I (CO1) gene was sequenced using both genomic DNA (gDNA) and complementary DNA (cDNA) from three planktonic copepod Neocalanus species (N. cristatus, N. plumchrus, and N. flemingeri). Small but critical sequence differences in CO1 were observed between gDNA and cDNA from N. plumchrus. Furthermore, careful observation revealed the presence of recombination between sequences in gDNA from N. plumchrus. Moreover, a chimera of the N. cristatus and N. plumchrus sequences was obtained from N. plumchrus gDNA. The observed phenomena can be best explained by the preferential amplification of the nuclear mitochondrial pseudogenes from gDNA of N. plumchrus. Two conclusions can be drawn from the observations. First, nuclear mitochondrial pseudogenes are pervasive in N. plumchrus. Second, a mating between a female N. cristatus and a male N. plumchrus produced viable offspring, which further backcrossed to a N. plumchrus individual. These observations not only demonstrate intriguing mating behavior in these species, but also emphasize the importance of careful interpretation of species marker sequences amplified from gDNA. PMID:28231343
Antigenic variation of Anaplasma marginale msp2 occurs by combinatorial gene conversion.
Brayton, Kelly A; Palmer, Guy H; Lundgren, Anna; Yi, Jooyoung; Barbet, Anthony F
2002-03-01
The rickettsial pathogen Anaplasma marginale establishes lifelong persistent infection in the mammalian reservoir host, during which time immune escape variants continually arise in part because of variation in the expressed copy of the immunodominant outer membrane protein MSP2. A key question is how the small 1.2 Mb A. marginale genome generates sufficient variants to allow long-term persistence in an immunocompetent reservoir host. The recombination of whole pseudogenes into the single msp2 expression site has been previously identified as one method of generating variants, but is inadequate to generate the number of variants required for persistent infection. In the present study, we demonstrate that recombination of a whole pseudogene is followed by a second level of variation in which small segments of pseudogenes recombine into the expression site by gene conversion. Evidence for four short sequential changes in the hypervariable region of msp2 coupled with the identification of nine pseudogenes from a single strain of A. marginale provides for a combinatorial number of possible expressed MSP2 variants sufficient for lifelong persistence.
The repertoire of bitter taste receptor genes in canids.
Shang, Shuai; Wu, Xiaoyang; Chen, Jun; Zhang, Huanxin; Zhong, Huaming; Wei, Qinguo; Yan, Jiakuo; Li, Haotian; Liu, Guangshuai; Sha, Weilai; Zhang, Honghai
2017-07-01
Bitter taste receptors (Tas2rs) play important roles in mammalian defense mechanisms by helping animals detect and avoid toxins in food. Although Tas2r genes have been widely studied in several mammals, minimal research has been performed in canids. To analyze the genetic basis of Tas2r genes in canids, we first identified Tas2r genes in the wolf, maned wolf, red fox, corsac fox, Tibetan fox, fennec fox, dhole and African hunting dog. A total of 183 Tas2r genes, consisting of 118 intact genes, 6 partial genes and 59 pseudogenes, were detected. Differences in the pseudogenes were observed among nine canid species. For example, Tas2r4 was a pseudogene in the dog but might play a functional role in other canid species. The Tas2r42 and Tas2r10 genes were pseudogenes in the maned wolf and dhole, respectively, and the Tas2r5 and Tas2r34 genes were pseudogenes in the African hunting dog; however, these genes were intact genes in other canid species. The differences in Tas2r pseudogenes among canids might suggest that the loss of intact Tas2r genes in canid species is species-dependent. We further compared the 183 Tas2r genes identified in this study with Tas2r genes from ten additional carnivorous species to evaluate the potential influence of diet on the evolution of the Tas2r gene repertoire. Phylogenetic analysis revealed that most of the Tas2r genes from the 18 species intermingled across the tree, suggesting that Tas2r genes are conserved among carnivores. Within canids, we found that some Tas2r genes corresponded to the traditional taxonomic groupings, while some did not. PIC analysis showed that the number of Tas2r genes in carnivores exhibited no positive correlation with diet composition, which might be due to the limited number of carnivores included in our study.
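The category totals reported in the abstract above can be checked with a few lines of arithmetic. This Python sketch uses only the three counts stated in the abstract; the variable names are illustrative.

```python
from collections import Counter

# Category totals as reported in the abstract.
tas2r_counts = Counter(intact=118, partial=6, pseudogene=59)

total = sum(tas2r_counts.values())
pseudogene_fraction = tas2r_counts["pseudogene"] / total
print(total, round(pseudogene_fraction, 3))
```

The three categories sum to the stated 183 genes, of which roughly a third are pseudogenes.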
Han, Xiang Y; Sizer, Kurt C; Thompson, Erika J; Kabanja, Juma; Li, Jun; Hu, Peter; Gómez-Valero, Laura; Silva, Francisco J
2009-10-01
Mycobacterium lepromatosis is a newly discovered leprosy-causing organism. Preliminary phylogenetic analysis of its 16S rRNA gene and a few other gene segments revealed significant divergence from Mycobacterium leprae, a well-known cause of leprosy, that justifies the status of M. lepromatosis as a new species. In this study we analyzed the sequences of 20 genes and pseudogenes (22,814 nucleotides). Overall, the level of matching of these sequences with M. leprae sequences was 90.9%, which substantiated the species-level difference; the levels of matching for the 16S rRNA genes and 14 protein-encoding genes were 98.0% and 93.1%, respectively, but the level of matching for five pseudogenes was only 79.1%. Five conserved protein-encoding genes were selected to construct phylogenetic trees and to calculate the numbers of synonymous substitutions (dS values) and nonsynonymous substitutions (dN values) in the two species. Robust phylogenetic trees constructed using concatenated alignment of these genes placed M. lepromatosis and M. leprae in a tight cluster with long terminal branches, implying that the divergence occurred long ago. The dS and dN values were also much higher than those for other closest pairs of mycobacteria. The dS values were 14 to 28% of the dS values for M. leprae and Mycobacterium tuberculosis, a more divergent pair of species. These results thus indicate that M. lepromatosis and M. leprae diverged approximately 10 million years ago. The M. lepromatosis pseudogenes analyzed that were also pseudogenes in M. leprae showed nearly neutral evolution, and their relative ages were similar to those of M. leprae pseudogenes, suggesting that they were pseudogenes before divergence. Taken together, the results described above indicate that M. lepromatosis and M. leprae diverged from a common ancestor after the massive gene inactivation event described previously for M. leprae.
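The dS and dN values mentioned above distinguish substitutions that leave the encoded amino acid unchanged from those that alter it. The Python sketch below illustrates only that distinction, by classifying differing codons between two aligned, gap-free coding sequences under the standard genetic code. It is a crude per-codon count on hypothetical toy sequences: real dS and dN estimates also normalize by the number of synonymous and nonsynonymous sites and correct for multiple hits, and nothing here reproduces the study's method.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, gap-free, and a multiple of 3 long")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy sequences (hypothetical): CTG->CTA keeps Leu, AAA->AGA changes Lys to Arg.
print(count_codon_differences("ATGCTGAAA", "ATGCTAAGA"))
```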
Immunoglobulin Genomics in the Guinea Pig (Cavia porcellus)
Guo, Yongchen; Bao, Yonghua; Meng, Qingwen; Hu, Xiaoxiang; Meng, Qingyong; Ren, Liming; Li, Ning; Zhao, Yaofeng
2012-01-01
In science, the guinea pig is known as one of the gold standards for modeling human disease. It is especially important as a molecular and cellular biology model for studying the human immune system, as its immunological genes are more similar to human genes than are those of mice. The utility of the guinea pig as a model organism can be enhanced by further characterization of the genes encoding components of the immune system. Here, we report the genomic organization of the guinea pig immunoglobulin (Ig) heavy and light chain genes. The guinea pig IgH locus is located in genomic scaffolds 54 and 75, and spans approximately 6,480 kb. 507 VH segments (94 potentially functional genes and 413 pseudogenes), 41 DH segments, six JH segments, four constant region genes (μ, γ, ε, and α), and one reverse δ remnant fragment were identified within the two scaffolds. Many VH pseudogenes were found within the guinea pig, and likely constituted a potential donor pool for gene conversion during evolution. The Igκ locus maps to a 4,029 kb region of scaffolds 37 and 24 and is composed of 349 Vκ (111 potentially functional genes and 238 pseudogenes), three Jκ and one Cκ genes. The Igλ locus spans 1,642 kb in scaffold 4 and consists of 142 Vλ (58 potentially functional genes and 84 pseudogenes) and 11 Jλ-Cλ clusters. Phylogenetic analysis suggested that the guinea pig’s large germline VH gene segments appear to form limited gene families. Therefore, this species may generate antibody diversity via a gene conversion-like mechanism associated with its pseudogene reserves. PMID:22761756
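The V-segment counts in the abstract above make it easy to compare how much of each locus consists of pseudogenes. This Python sketch uses only the counts stated in the abstract; the dictionary keys and variable names are illustrative.

```python
# (potentially functional, pseudogene) V-segment counts from the abstract.
v_segments = {"IgH": (94, 413), "Igkappa": (111, 238), "Iglambda": (58, 84)}

pseudogene_share = {}
for locus, (functional, pseudo) in v_segments.items():
    total = functional + pseudo
    pseudogene_share[locus] = pseudo / total
    print(f"{locus}: {total} V segments, {pseudogene_share[locus]:.1%} pseudogenes")
```

The heavy-chain locus has the largest pseudogene share, which is consistent with the abstract's suggestion that VH pseudogenes could act as a donor pool for gene conversion.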
Are Synonymous Substitutions in Flowering Plant Mitochondria Neutral?
Wynn, Emily L; Christensen, Alan C
2015-10-01
Angiosperm mitochondrial genes appear to have very low mutation rates, while non-gene regions expand, diverge, and rearrange quickly. One possible explanation for this disparity is that synonymous substitutions in plant mitochondrial genes are not truly neutral and selection keeps their occurrence low. If this were true, the explanation for the disparity in mutation rates in genes and non-genes needs to consider selection as well as mechanisms of DNA repair. Rps14 is co-transcribed with cob and rpl5 in most plant mitochondrial genomes, but in some genomes, rps14 has been duplicated to the nucleus leaving a pseudogene in the mitochondria. This provides an opportunity to compare neutral substitution rates in pseudogenes with synonymous substitution rates in the orthologs. Genes and pseudogenes of rps14 have been aligned among different species and the mutation rates have been calculated. Neutral substitution rates in pseudogenes and synonymous substitution rates in genes are significantly different, providing evidence that synonymous substitutions in plant mitochondrial genes are not completely neutral. The non-neutrality is not sufficient to completely explain the exceptionally low mutation rates in land plant mitochondrial genomes, but selective forces appear to play a small role.
2012-01-01
Background: Francisella is a genus of gram-negative bacteria that are highly virulent in fish and humans; F. tularensis causes the serious human disease tularaemia. Recently, Francisella species have been reported to cause mortality in aquaculture species such as Atlantic cod and tilapia. We have completed the sequencing and draft assembly of the Francisella noatunensis subsp. orientalis Toba04 strain isolated from farmed tilapia. Compared to other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia, a free-living bacterium not virulent to humans. Results: The genome is rearranged compared to the available Francisella genomes even though we found no IS elements in the genome. Nearly 16% of the predicted ORFs are pseudogenes. Computational pathway analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the novel genome with other available Francisella genomes, we found that around 2.5% of genes are unique to Francisella noatunensis subsp. orientalis Toba04, and we identified a list of genes uniquely present in the human-pathogenic Francisella subspecies. Most of these genes might have been acquired from other bacterial species through horizontal gene transfer. Comparative analysis between the human and fish pathogens also provides insights into genes responsible for pathogenicity. Our analysis of pseudogenes indicates that pseudogene formation in the Francisella subspecies from tilapia is old, with a large number of pseudogenes having more than one inactivating mutation. Conclusions: The fish pathogen lost non-essential genes some time ago. Evolutionary analysis of the Francisella genomes strongly suggests that human and fish pathogenic Francisella species have evolved independently from free-living, metabolically competent Francisella species. These findings will contribute to understanding the evolution and pathogenesis of Francisella species. PMID:23131096
Genomic gigantism: DNA loss is slow in mountain grasshoppers.
Bensasson, D; Petrov, D A; Zhang, D X; Hartl, D L; Hewitt, G M
2001-02-01
Several studies have shown DNA loss to be inversely correlated with genome size in animals. These studies include a comparison between Drosophila and the cricket, Laupala, but there has been no assessment of DNA loss in insects with very large genomes. Podisma pedestris, the brown mountain grasshopper, has a genome over 100 times as large as that of Drosophila and 10 times as large as that of Laupala. We used 58 paralogous nuclear pseudogenes of mitochondrial origin to study the characteristics of insertion, deletion, and point substitution in P. pedestris and Italopodisma. In animals, these pseudogenes are "dead on arrival"; they are abundant in many different eukaryotes, and their mitochondrial origin simplifies the identification of point substitutions accumulated in nuclear pseudogene lineages. There appears to be a mononucleotide repeat within the 643-bp pseudogene sequence studied that acts as a strong hot spot for insertions or deletions (indels). Because the data for other insect species did not contain such an unusual region, hot spots were excluded from species comparisons. The rate of DNA loss relative to point substitution appears to be considerably and significantly lower in the grasshoppers studied than in Drosophila or Laupala. This suggests that the inverse correlation between genome size and the rate of DNA loss can be extended to comparisons between insects with large or gigantic genomes (i.e., Laupala and Podisma). The low rate of DNA loss implies that in grasshoppers, the accumulation of point mutations is a more potent force for obscuring ancient pseudogenes than their loss through indel accumulation, whereas the reverse is true for Drosophila. The main factor contributing to the difference in the rates of DNA loss estimated for grasshoppers, crickets, and Drosophila appears to be deletion size. Large deletions are relatively rare in Podisma and Italopodisma.
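The quantity compared across species in the abstract above is the rate of DNA loss relative to point substitution. The Python sketch below shows one simple way to express such a ratio: net base pairs lost through indels divided by the number of point substitutions. The function name and the indel sizes are hypothetical, and the study's actual estimator may differ.

```python
def dna_loss_per_substitution(deletion_sizes, insertion_sizes, n_substitutions):
    """Net base pairs lost (deleted minus inserted) per point substitution."""
    if n_substitutions <= 0:
        raise ValueError("need at least one point substitution to scale by")
    net_loss = sum(deletion_sizes) - sum(insertion_sizes)
    return net_loss / n_substitutions


# Hypothetical indel sizes (bp) pooled across a set of pseudogene lineages.
rate = dna_loss_per_substitution(deletion_sizes=[1, 2, 1, 5],
                                 insertion_sizes=[1, 2],
                                 n_substitutions=60)
print(rate)
```

Because the numerator sums deletion sizes, a few large deletions can dominate the ratio, which fits the abstract's point that deletion size is the main factor separating the species.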
2010-01-01
Background: Genome reduction is a common evolutionary process in symbiotic and pathogenic bacteria. This process has been extensively characterized in bacterial endosymbionts of insects, where primary mutualistic bacteria represent the most extreme cases of genome reduction as a consequence of a massive process of gene inactivation and loss during their evolution from free-living ancestors. Sodalis glossinidius, the secondary endosymbiont of tsetse flies, has one of the few complete genomes of bacteria at the very beginning of the symbiotic association, making it possible to evaluate the relative impact of mobile genetic element proliferation and gene inactivation on the structure and functional capabilities of this bacterial endosymbiont during the transition to a host-dependent lifestyle. Results: A detailed characterization of mobile genetic elements and pseudogenes reveals a massive presence of different types of prophage elements together with five different families of IS elements that have proliferated across the genome of Sodalis glossinidius at different levels. In addition, a detailed survey of intergenic regions allowed the characterization of 1501 pseudogenes, a much higher number than the 972 pseudogenes described in the original annotation. Pseudogene structure reveals a minor impact of mobile genetic element proliferation on the process of gene inactivation, with most pseudogenes originating from multiple frameshift mutations and premature stop codons. The comparison of metabolic profiles of Sodalis glossinidius and the tsetse fly primary endosymbiont Wigglesworthia glossinidia based on their whole gene and pseudogene repertoires revealed a novel case of pathway inactivation, arginine biosynthesis, in Sodalis glossinidius together with a possible case of metabolic complementation with Wigglesworthia glossinidia for thiamine biosynthesis.
Conclusions: The complete re-analysis of the genome sequence of Sodalis glossinidius provides novel insights into the evolutionary transition from a free-living ancestor to a host-dependent lifestyle, with a massive proliferation of mobile genetic elements, mainly of phage origin, although with minor impact on the process of gene inactivation that is taking place in this bacterial genome. The metabolic analysis of the whole endosymbiotic consortium of tsetse flies has revealed a possible phenomenon of metabolic complementation between primary and secondary endosymbionts that may help explain the co-existence of both bacterial endosymbionts in the context of the tsetse host. PMID:20649993
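The abstract above attributes most pseudogenes to frameshift mutations and premature stop codons. As a rough illustration of how a frameshift exposes an in-frame stop, the Python sketch below scans a reading frame for stop codons that occur before the last complete codon. The sequences and the function name are hypothetical, and this is not the annotation pipeline used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(orf):
    """Return 0-based codon indices of in-frame stops before the last complete codon."""
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3].upper() for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


intact = "ATGGCTGAACGTTAA"        # ATG GCT GAA CGT TAA
# Hypothetical single-base insertion (T) after the start codon shifts the frame.
frameshifted = "ATGTGCTGAACGTTAA"  # ATG TGC TGA ACG TTA (A)
print(premature_stops(intact), premature_stops(frameshifted))
```

In the toy example the insertion creates an in-frame TGA at the third codon, truncating the product long before the original stop.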
DOE Office of Scientific and Technical Information (OSTI.GOV)
Popp, R.A.; Lalley, P.A.; Whitney, J.B.
A genetic polymorphism for a Bgl I endonuclease site near the α-globin-like pseudogene α-4 of C57BL/6 and C3H/HeN mice was used to show that α-4 was not affected by three independent mutations in which the adult globin genes α-1 and α-2 were deleted. These results indicated that α-4 might not be located adjacent to the adult α-globin genes on chromosome 11. Restriction endonuclease analysis of DNA of a primary clone of a Chinese hamster-mouse somatic cell hybrid that had lost mouse chromosomes 11 and 18 showed that this clone lacked the adult murine globin genes α-1 and α-2 but it did contain the α-globin-like pseudogenes α-3 and α-4. These results indicated that the adult α-globin genes and α-globin-like pseudogenes are not located on the same chromosome. Similar analyses of several other Chinese hamster-mouse somatic cell hybrids that had segregated other mouse chromosomes indicated that the α-globin-like pseudogenes α-3 and α-4 are located on mouse chromosomes 15 and 17, respectively. These data explain why α-3 and α-4 were not affected by the three independently induced deletion-type mutations that cause α-thalassemia in the mouse.
Pseudogenization of the tooth gene enamelysin (MMP20) in the common ancestor of extant baleen whales
Meredith, Robert W.; Gatesy, John; Cheng, Joyce; Springer, Mark S.
2011-01-01
Whales in the suborder Mysticeti are filter feeders that use baleen to sift zooplankton and small fish from ocean waters. Adult mysticetes lack teeth, although tooth buds are present in foetal stages. Cladistic analyses suggest that functional teeth were lost in the common ancestor of crown-group Mysticeti. DNA sequences for the tooth-specific genes, ameloblastin (AMBN), enamelin (ENAM) and amelogenin (AMEL), have frameshift mutations and/or stop codons in this taxon, but none of these molecular cavities are shared by all extant mysticetes. Here, we provide the first evidence for pseudogenization of a tooth gene, enamelysin (MMP20), in the common ancestor of living baleen whales. Specifically, pseudogenization resulted from the insertion of a CHR-2 SINE retroposon in exon 2 of MMP20. Genomic and palaeontological data now provide congruent support for the loss of enamel-capped teeth on the common ancestral branch of crown-group mysticetes. The new data for MMP20 also document a polymorphic stop codon in exon 2 of the pygmy sperm whale (Kogia breviceps), which has enamel-less teeth. These results, in conjunction with the evidence for pseudogenization of MMP20 in Hoffmann's two-toed sloth (Choloepus hoffmanni), another enamel-less species, support the hypothesis that the only unique, non-overlapping function of the MMP20 gene is in enamel formation. PMID:20861053
Straub, Shannon C.K.; Cronn, Richard C.; Edwards, Christopher; Fishbein, Mark; Liston, Aaron
2013-01-01
Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae]) and found evidence of intracellular HGT for a 2.4-kb segment of mitochondrial DNA to the rps2–rpoC2 intergenic spacer of the plastome. The transferred region contains an rpl2 pseudogene and is flanked by plastid sequence in the mitochondrial genome, including an rpoC2 pseudogene, which likely provided the mechanism for HGT back to the plastome through double-strand break repair involving homologous recombination. The plastome insertion is restricted to tribe Asclepiadeae of subfamily Asclepiadoideae, whereas the mitochondrial rpoC2 pseudogene is present throughout the subfamily, which confirms that the plastid to mitochondrial HGT event preceded the HGT to the plastome. Although the plastome insertion has been maintained in all lineages of Asclepiadoideae, it shows minimal evidence of transcription in A. syriaca and is likely nonfunctional. Furthermore, we found recent gene conversion of the mitochondrial rpoC2 pseudogene in Asclepias by the plastid gene, which reflects continued interaction of these genomes. PMID:24029811
Clinical analysis of PMS2: mutation detection and avoidance of pseudogenes.
Vaughn, Cecily P; Robles, Jorge; Swensen, Jeffrey J; Miller, Christine E; Lyon, Elaine; Mao, Rong; Bayrak-Toydemir, Pinar; Samowitz, Wade S
2010-05-01
Germline mutation detection in PMS2, one of four mismatch repair genes associated with Lynch syndrome, is greatly complicated by the presence of numerous pseudogenes. We used a modification of a long-range PCR method to evaluate PMS2 in 145 clinical samples. This modification avoids potential interference from the pseudogene PMS2CL by utilizing a long-range product spanning exons 11-15, with the forward primer anchored in exon 10, an exon not shared by PMS2CL. Large deletions were identified by MLPA. Pathogenic PMS2 mutations were identified in 22 of 59 patients whose tumors showed isolated loss of PMS2 by immunohistochemistry (IHC), the IHC profile most commonly associated with a germline PMS2 mutation. Three additional patients with pathogenic mutations were identified from 53 samples without IHC data. Thirty-seven percent of the identified mutations were large deletions encompassing one or more exons. In 27 patients whose tumors showed absence of either another protein or combination of proteins, no pathogenic mutations were identified. We conclude that modified long-range PCR can be used to preferentially amplify the PMS2 gene and avoid pseudogene interference, thus providing a clinically useful germline analysis of PMS2. Our data also support the use of IHC screening to direct germline testing of PMS2. (c) 2010 Wiley-Liss, Inc.
Groth-Malonek, Milena; Wahrmund, Ute; Polsakiewicz, Monika; Knoop, Volker
2007-04-01
Gene transfer from the mitochondrion into the nucleus is a corollary of the endosymbiont hypothesis. The frequent and independent transfer of genes for mitochondrial ribosomal proteins is well documented with many examples in angiosperms, whereas transfer of genes for components of the respiratory chain is a rarity. A notable exception is the nad7 gene, encoding subunit 7 of complex I, in the liverwort Marchantia polymorpha, which resides as a full-length, intron-carrying and transcribed, but nonspliced pseudogene in the chondriome, whereas its functional counterpart is nuclear encoded. To elucidate the patterns of pseudogene degeneration, we have investigated the mitochondrial nad7 locus in 12 other liverworts of broad phylogenetic distribution. We find that the mitochondrial nad7 gene is nonfunctional in 11 of them. However, the modes of pseudogene degeneration vary: whereas point mutations, accompanied by single-nucleotide indels, predominantly introduce stop codons into the reading frame in marchantiid liverworts, larger indels introduce frameshifts in the simple thalloid and leafy jungermanniid taxa. Most notably, however, the mitochondrial nad7 reading frame appears to be intact in the isolated liverwort genus Haplomitrium. Its functional expression is shown by cDNA analysis identifying typical RNA-editing events to reconstitute conserved codon identities and also confirming functional splicing of the 2 liverwort-specific group II introns. 
We interpret our results 1) to indicate the presence of a functional mitochondrial nad7 gene in the earliest land plants and strongly supporting a basal placement of Haplomitrium among the liverworts, 2) to indicate different modes of pseudogene degeneration and chondriome evolution in the later branching liverwort clades, 3) to suggest a surprisingly long maintenance of a nonfunctional gene in the presumed oldest group of land plants, and 4) to support the model of a secondary loss of RNA-editing activity in marchantiid liverworts.
Ruggiero, Maria Valeria; Procaccini, Gabriele
2004-01-01
Halophila stipulacea is a dioecious marine angiosperm, widely distributed along the western coasts of the Indian Ocean and the Red Sea. This species is thought to be a Lessepsian immigrant that entered the Mediterranean Sea from the Red Sea after the opening of the Suez Canal (1869). Previous studies have revealed both high phenotypic and genetic variability in Halophila stipulacea populations from the western Mediterranean basin. In order to test the hypothesis of a Lessepsian introduction, we compare genetic polymorphism between putative native (Red Sea) and introduced (Mediterranean) populations through rDNA ITS region (ITS1-5.8S-ITS2) sequence analysis. A high degree of intraindividual variability of ITS sequences was found. Most of the intragenomic polymorphism was due to pseudogenic sequences, present in almost all individuals. Features of ITS functional sequences and pseudogenes are described. Possible causes for the lack of homogenization of ITS paralogues within individuals are discussed.
Human Nanog pseudogene8 promotes the proliferation of gastrointestinal cancer cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uchino, Keita; Hirano, Gen; Hirahashi, Minako
2012-09-10
There is emerging evidence that human solid tumor cells originate from cancer stem cells (CSCs). In cancer cell lines, tumor-initiating CSCs are mainly found in the side population (SP) that has the capacity to extrude dyes such as Hoechst 33342. We found that Nanog is expressed specifically in SP cells of human gastrointestinal (GI) cancer cells. Nucleotide sequencing revealed that NanogP8 but not Nanog was expressed in GI cancer cells. Transfection of NanogP8 into GI cancer cell lines promoted cell proliferation, while its inhibition by anti-Nanog siRNA suppressed the proliferation. Immunohistochemical staining of primary GI cancer tissues revealed NanogP8 protein to be strongly expressed in 3 out of 60 cases. In these cases, NanogP8 was found especially in an infiltrative part of the tumor, in proliferating cells with Ki67 expression. These data suggest that NanogP8 is involved in GI cancer development in a fraction of patients, in whom it presumably acts by supporting CSC proliferation. Highlights: Nanog maintains pluripotency by regulating embryonic stem cell differentiation. Nanog is expressed in cancer stem cells of human gastrointestinal cancer cells. Nucleotide sequencing revealed that Nanog pseudogene8 but not Nanog was expressed. Nanog pseudogene8 promotes cancer stem cell proliferation. Nanog pseudogene8 is involved in gastrointestinal cancer development.
Differential expression of Oct4 variants and pseudogenes in normal urothelium and urothelial cancer.
Wezel, Felix; Pearson, Joanna; Kirkwood, Lisa A; Southgate, Jennifer
2013-10-01
The transcription factor octamer-binding protein 4 (Oct4; encoded by POU5F1) has a key role in maintaining embryonic stem cell pluripotency during early embryonic development and it is required for generation of induced pluripotent stem cells. Controversy exists concerning Oct4 expression in somatic tissues, with reports that Oct4 is expressed in normal and in neoplastic urothelium carrying implications for a bladder cancer stem cell phenotype. Here, we show that the pluripotency-associated Oct4A transcript was absent from cultures of highly regenerative normal human urothelial cells and from low-grade to high-grade urothelial carcinoma cell lines, whereas alternatively spliced variants and transcribed pseudogenes were expressed in abundance. Immunolabeling and immunoblotting studies confirmed the absence of Oct4A in normal and neoplastic urothelial cells and tissues, but indicated the presence of alternative isoforms or potentially translated pseudogenes. The stable forced expression of Oct4A in normal human urothelial cells in vitro profoundly inhibited growth and affected morphology, but protein expression was rapidly down-regulated. Our findings demonstrate that pluripotency-associated isoform Oct4A is not expressed by normal or malignant human urothelium and therefore is unlikely to play a role in a cancer stem cell phenotype. However, our findings also indicate that urothelium expresses a variety of other Oct4 splice-variant isoforms and transcribed pseudogenes that warrant further study. Copyright © 2013 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Ogawa, Yuko; Tsujimoto, Masafumi; Yanoshita, Ryohei
2016-01-01
Exosomes are small extracellular vesicles containing microRNAs and mRNAs that are produced by various types of cells. We previously used ultrafiltration and size-exclusion chromatography to isolate two types of human salivary exosomes (exosomes I, II) that are different in size and proteomes. We showed that salivary exosomes contain large repertoires of small RNAs. However, precise information regarding long RNAs in salivary exosomes has not been fully determined. In this study, we investigated the compositions of protein-coding RNAs (pcRNAs) and long non-protein-coding RNAs (lncRNAs) of exosome I, exosome II and whole saliva (WS) by next-generation sequencing technology. Although 11% of all RNAs were commonly detected among the three samples, the compositions of reads mapping to known RNAs were similar. The most abundant pcRNA is ribosomal RNA protein, and pcRNAs of some salivary proteins such as S100 calcium-binding protein A8 (protein S100-A8) were present in salivary exosomes. Interestingly, lncRNAs of pseudogenes (presumably, processed pseudogenes) were abundant in exosome I, exosome II and WS. Translationally controlled tumor protein gene, which plays an important role in cell proliferation, cell death and immune responses, was highly expressed as pcRNA and pseudogenes in salivary exosomes. Our results show that salivary exosomes contain various types of RNAs such as pseudogenes and small RNAs, and may mediate intercellular communication by transferring these RNAs to target cells as gene expression regulators.
The human cytochrome P450 3A locus. Gene evolution by capture of downstream exons.
Finta, C; Zaphiropoulos, P G
2000-12-30
Using a bacterial artificial chromosome (BAC) clone, we have mapped the human cytochrome P450 3A (CYP3A) locus containing the genes encoding CYP3A4, CYP3A5 and CYP3A7. The genes lie in a head-to-tail orientation in the order of 3A4, 3A7 and 3A5. In both intergenic regions (3A4-3A7 and 3A7-3A5), we have detected several additional cytochrome P450 3A exons, forming two CYP3A pseudogenes. These pseudogenes have the same orientation as the CYP3A genes. To our surprise, a 3A7 mRNA species has been detected in which exons 2 and 13 of one of the pseudogenes (the one that is downstream of 3A7) are spliced after the 3A7 terminal exon. This results in an mRNA molecule that consists of the 13 3A7 exons and two additional exons at the 3' end. The two additional exons originating from the pseudogene are in an altered reading frame and can consequently encode a completely different amino acid sequence than the canonical CYP3A exons 2 and 13. These findings may represent a generalized evolutionary process with genes having the potential to capture neighboring sequences and use them as functional exons.
Differences in selection drive olfactory receptor genes in different directions in dogs and wolf.
Chen, Rui; Irwin, David M; Zhang, Ya-Ping
2012-11-01
The olfactory receptor (OR) gene family is the largest gene family found in mammalian genomes. It is known to evolve through a birth-and-death process. Here, we characterized the sequences of 16 segregating OR pseudogenes in the samples of the wolf and the Chinese village dog (CVD) and compared them with the sequences from dogs of different breeds. Our results show that the segregating OR pseudogenes in breed dogs are under strong purifying selection, while evolving neutrally in the CVD, and show a more complicated pattern in the wolf. In the wolf, we found a trend to remove deleterious polymorphisms and accumulate nondeleterious polymorphisms. On the basis of protein structure of the ORs, we found that the distribution of different types of polymorphisms (synonymous, nonsynonymous, tolerated, and untolerated) varied greatly between the wolf and the breed dogs. In summary, our results suggest that different forms of selection have acted on the segregating OR pseudogenes in the CVD since domestication, breed dogs after breed formation, and ancestral wolf population, which has driven the evolution of these genes in different directions.
Schuster, W; Brennicke, A
1991-01-01
An intact gene for the ribosomal protein S19 (rps19) is absent from Oenothera mitochondria. The conserved rps19 reading frame found in the mitochondrial genome is interrupted by a termination codon. This rps19 pseudogene is cotranscribed with the downstream rps3 gene and is edited on both sides of the translational stop. Editing, however, changes the amino acid sequence at positions that were well conserved before editing. Other unusual editing events create translational stops in open reading frames coding for functional proteins. In coxI and rps3 mRNAs, CGA codons are edited to UGA stop codons only five and three codons, respectively, downstream of the initiation codon. These aberrant editings in essential open reading frames and in the rps19 pseudogene appear to have been shifted to these positions from other editing sites. These observations suggest a requirement for a continuous evolutionary constraint on the editing specificities in plant mitochondria. PMID:1762921
Wiebe, Victor; Przeworski, Molly; Lancet, Doron; Pääbo, Svante
2004-01-01
Olfactory receptor (OR) genes constitute the molecular basis for the sense of smell and are encoded by the largest gene family in mammalian genomes. Previous studies suggested that the proportion of pseudogenes in the OR gene family is significantly larger in humans than in other apes and significantly larger in apes than in the mouse. To investigate the process of degeneration of the olfactory repertoire in primates, we estimated the proportion of OR pseudogenes in 19 primate species by surveying randomly chosen subsets of 100 OR genes from each species. We find that apes, Old World monkeys and one New World monkey, the howler monkey, have a significantly higher proportion of OR pseudogenes than do other New World monkeys or the lemur (a prosimian). Strikingly, the howler monkey is also the only New World monkey to possess full trichromatic vision, along with Old World monkeys and apes. Our findings suggest that the deterioration of the olfactory repertoire occurred concomitant with the acquisition of full trichromatic color vision in primates. PMID:14737185
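The sampling design described above (a subset of 100 OR genes per species) yields a binomial estimate of the pseudogene proportion. The following is a minimal sketch of how the uncertainty of such an estimate could be expressed with a Wilson score interval. The species labels and counts are hypothetical and are not taken from the study, and the interval method is an illustrative choice, not necessarily the one the authors used.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical pseudogene counts among 100 sampled OR genes per species.
# These numbers are placeholders for illustration only.
sampled = 100
hypothetical_counts = {"species_A": 30, "species_B": 15}

for name, k in hypothetical_counts.items():
    low, high = wilson_interval(k, sampled)
    print(f"{name}: proportion {k / sampled:.2f}, 95% CI {low:.3f} to {high:.3f}")
```

With only 100 genes sampled per species the intervals are fairly wide, which is why differences between groups of species need a formal test rather than a comparison of point estimates.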
Tay, Wee Tek; Elfekih, Samia; Court, Leon N; Gordon, Karl H J; Delatte, Hélène; De Barro, Paul J
2017-10-01
Molecular species identification using suboptimal PCR primers can over-estimate species diversity due to coamplification of nuclear mitochondrial (NUMT) DNA/pseudogenes. For the agriculturally important whitefly Bemisia tabaci cryptic pest species complex, species identification depends primarily on characterization of the mitochondrial DNA cytochrome oxidase I (mtDNA COI) gene. The lack of robust PCR primers for the mtDNA COI gene can undermine correct species identification which in turn compromises management strategies. This problem is identified in the B. tabaci Africa/Middle East/Asia Minor clade which comprises the globally invasive Mediterranean (MED) and Middle East Asia Minor I (MEAM1) species, Middle East Asia Minor 2 (MEAM2), and the Indian Ocean (IO) species. Initially identified from the Indian Ocean island of Réunion, MEAM2 has since been reported from Japan, Peru, Turkey and Iraq. We identified MEAM2 individuals from a Peruvian population via Sanger sequencing of the mtDNA COI gene. In attempting to characterize the MEAM2 mitogenome, we instead characterized mitogenomes of MEAM1. We also report on the mitogenomes of MED, AUS, and IO thereby increasing genomic resources for members of this complex. Gene synteny (i.e., same gene composition and orientation) was observed with published B. tabaci cryptic species mitogenomes. Pseudogene fragments matching MEAM2 partial mtDNA COI gene exhibited low frequency single nucleotide polymorphisms that matched low copy number DNA fragments (<3%) of MEAM1 genomes, whereas presence of internal stop codons, loss of expected stop codons and poor primer annealing sites, all suggested MEAM2 as a pseudogene artifact and so not a real species. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Major taste loss in carnivorous mammals
Jiang, Peihua; Josue, Jesusa; Li, Xia; Glaser, Dieter; Li, Weihua; Brand, Joseph G.; Margolskee, Robert F.; Reed, Danielle R.; Beauchamp, Gary K.
2012-01-01
Mammalian sweet taste is primarily mediated by the type 1 taste receptor Tas1r2/Tas1r3, whereas Tas1r1/Tas1r3 act as the principal umami taste receptor. Bitter taste is mediated by a different group of G protein-coupled receptors, the Tas2rs, numbering 3 to ∼66, depending on the species. We showed previously that the behavioral indifference of cats toward sweet-tasting compounds can be explained by the pseudogenization of the Tas1r2 gene, which encodes the Tas1r2 receptor. To examine the generality of this finding, we sequenced the entire coding region of Tas1r2 from 12 species in the order Carnivora. Seven of these nonfeline species, all of which are exclusive meat eaters, also have independently pseudogenized Tas1r2 caused by ORF-disrupting mutations. Fittingly, the purifying selection pressure is markedly relaxed in these species with a pseudogenized Tas1r2. In behavioral tests, the Asian otter (defective Tas1r2) showed no preference for sweet compounds, but the spectacled bear (intact Tas1r2) did. In addition to the inactivation of Tas1r2, we found that sea lion Tas1r1 and Tas1r3 are also pseudogenized, consistent with their unique feeding behavior, which entails swallowing food whole without chewing. The extensive loss of Tas1r receptor function is not restricted to the sea lion: the bottlenose dolphin, which evolved independently from the sea lion but displays similar feeding behavior, also has all three Tas1rs inactivated, and may also lack functional bitter receptors. These data provide strong support for the view that loss of taste receptor function in mammals is widespread and directly related to feeding specializations. PMID:22411809
McGowen, Michael R; Clark, Clay; Gatesy, John
2008-08-01
The macroevolutionary transition of whales (cetaceans) from a terrestrial quadruped to an obligate aquatic form involved major changes in sensory abilities. Compared to terrestrial mammals, the olfactory system of baleen whales is dramatically reduced, and in toothed whales is completely absent. We sampled the olfactory receptor (OR) subgenomes of eight cetacean species from four families. A multigene tree of 115 newly characterized OR sequences from these eight species and published data for Bos taurus revealed a diverse array of class II OR paralogues in Cetacea. Evolution of the OR gene superfamily in toothed whales (Odontoceti) featured a multitude of independent pseudogenization events, supporting anatomical evidence that odontocetes have lost their olfactory sense. We explored the phylogenetic utility of OR pseudogenes in Cetacea, concentrating on delphinids (oceanic dolphins), the product of a rapid evolutionary radiation that has been difficult to resolve in previous studies of mitochondrial DNA sequences. Phylogenetic analyses of OR pseudogenes using both gene-tree reconciliation and supermatrix methods yielded fully resolved, consistently supported relationships among members of four delphinid subfamilies. Alternative minimizations of gene duplications, gene duplications plus gene losses, deep coalescence events, and nucleotide substitutions plus indels returned highly congruent phylogenetic hypotheses. Novel DNA sequence data for six single-copy nuclear loci and three mitochondrial genes (> 5000 aligned nucleotides) provided an independent test of the OR trees. Nucleotide substitutions and indels in OR pseudogenes showed a very low degree of homoplasy in comparison to mitochondrial DNA and, on average, provided more variation than single-copy nuclear DNA. Our results suggest that phylogenetic analysis of the large OR superfamily will be effective for resolving relationships within Cetacea whether supermatrix or gene-tree reconciliation procedures are used.
Deeb, Kristin K; Metcalf, James D; Sesock, Kaitlin M; Shen, Junqing; Wensel, Christine A; Rippel, Larisa I; Smith, Michelle; Chapman, Mark S; Zhang, Shulin
2015-07-01
Cystic fibrosis (CF) is one of the most common recessive conditions among whites, with an estimated carrier frequency of 1 in 25 in the United States. Population-based CF carrier screening was implemented in the United States in 2001. The number of mutations screened by each laboratory may vary; however, the 23 most common CF mutations recommended for screening by the American College of Medical Genetics and American College of Obstetricians and Gynecologists are included in all platforms. The c.1364C>A (p.A455E) mutation located in exon 10 of the CFTR gene is one of the 23 mutations. Because CFTR exon 10 and its flanking intronic regions are duplicated and transposed onto several other chromosomes of the human genome during evolution and function as unprocessed pseudogenes, variations in the CFTR pseudogenes may confound CF screening results for mutations located in exon 10 of the CFTR gene. We report an incorrectly identified carrier status for the c.1364C>A (p.A455E) mutation in a healthy individual using the Hologic InPlex CF assay. Further analysis revealed that the mutation resides in one of the CFTR pseudogenes. Because most commercial kits and laboratory-developed tests for CF carrier screening involve a short amplicon encompassing this mutation, this finding suggests that individuals with the c.1364C>A (p.A455E) mutation may require further investigation to avoid a false assignment of CF carrier status. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
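The carrier frequency quoted above (1 in 25) can be turned into an expected frequency of affected births with a short calculation. This is a minimal sketch that assumes random mating and simple autosomal recessive inheritance, and it ignores affected individuals as parents. The function name is hypothetical and the calculation is a textbook approximation, not a result reported in the abstract.

```python
from fractions import Fraction

def affected_birth_rate(carrier_freq):
    """Chance that both parents are carriers times the 1-in-4 chance
    that their child inherits a mutant allele from each of them."""
    return carrier_freq * carrier_freq * Fraction(1, 4)

carrier_freq = Fraction(1, 25)  # carrier frequency stated in the abstract
print(affected_birth_rate(carrier_freq))  # 1/2500
```

The same arithmetic shows why a false carrier call matters: a carrier result for one partner raises the couple's prior risk from roughly 1 in 2500 to 1 in 100 before the second partner is tested.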
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; ...
2015-06-03
The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. As a result, the loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars.
Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; Busch, Julia; Cassman, Noriko; Dutilh, Bas E.; Green, Dawn; Matlock, Brian; Heffernan, Brian; Olsen, Gary J.; Farris Hanna, Leigh; Schifferli, Dieter M.; Maloy, Stanley; Dinsdale, Elizabeth A.; Edwards, Robert A.
2015-01-01
The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. The loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars. PMID:26039056
Ndika, Joseph D T; Lusink, Vera; Beaubrun, Claudine; Kanhai, Warsha; Martinez-Munoz, Cristina; Jakobs, Cornelis; Salomons, Gajja S
2014-01-10
Interconversion between phosphocreatine and creatine, catalyzed by creatine kinase, is crucial in the supply of ATP to tissues with high energy demand. Creatine's importance has been established by its use as an ergogenic aid in sport, as well as by the development of intellectual disability in patients with congenital creatine deficiency. Creatine biosynthesis is complemented by dietary creatine uptake. Intracellular transport of creatine is carried out by a creatine transporter protein (CT1/CRT/CRTR) encoded by the SLC6A8 gene. Most tissues express this gene, with highest levels detected in skeletal muscle and kidney. Lower levels of expression are detected in colon, brain, heart, testis and prostate. The mechanism(s) by which this regulation occurs is still poorly understood. A duplicated unprocessed pseudogene of SLC6A8, SLC6A10P, has been mapped to chromosome 16p11.2 (it contains the entire SLC6A8 gene, plus 2293 bp of 5' flanking sequence and its entire 3' UTR). Expression of SLC6A10P has so far only been shown in human testis and brain. The function of SLC6A10P is still unclear. In a patient with autism, a chromosomal breakpoint that intersects the 5' flanking region of SLC6A10P was identified, suggesting that SLC6A10P is a non-coding RNA involved in autism. Our aim was to investigate the presence of cis-acting factor(s) that regulate expression of the creatine transporter, as well as to determine if these factors are functionally conserved upstream of the creatine transporter pseudogene. Via gene-specific PCR, cloning and functional luciferase assays we identified a 1104 bp sequence proximal to the mRNA start site of the SLC6A8 gene with promoter activity in five cell types. The corresponding 5' flanking sequence (1050 bp) on the pseudogene also had promoter activity in all five cell lines. Surprisingly, the pseudogene promoter was stronger than that of its parent gene in four of the cell lines tested. To the best of our knowledge, this is the first experimental evidence of a pseudogene with stronger promoter activity than its parental gene.
Yang, W; Du, W W; Li, X; Yee, A J; Yang, B B
2016-07-28
It has recently been shown that the upregulation of a pseudogene specific to a protein-coding gene could function as a sponge to bind multiple potential targeting microRNAs (miRNAs), resulting in increased gene expression. Similarly, it was recently demonstrated that circular RNAs can function as sponges for miRNAs, and could upregulate expression of mRNAs containing an identical sequence. Furthermore, some mRNAs are now known to not only translate protein, but also function to sponge miRNA binding, facilitating gene expression. Collectively, these appear to be effective mechanisms to ensure gene expression and protein activity. Here we show that expression of a member of the forkhead family of transcription factors, Foxo3, is regulated by the Foxo3 pseudogene (Foxo3P), and Foxo3 circular RNA, both of which bind to eight miRNAs. We found that the ectopic expression of the Foxo3P, Foxo3 circular RNA and Foxo3 mRNA could all suppress tumor growth and cancer cell proliferation and survival. Our results showed that at least three mechanisms are used to ensure protein translation of Foxo3, which reflects an essential role of Foxo3 and its corresponding non-coding RNAs.
Bardella, Vanessa Bellini; Cabral-de-Mello, Diogo Cavalcanti
2018-03-10
One cluster of 5S rDNA per haploid genome is the most common pattern among Heteroptera. However, in Chariesterus armatus, highly scattered signals were noticed. We isolated and characterized the entire 5S rDNA unit of C. armatus, aiming at a deeper knowledge of the molecular organization of the 5S rDNA among Heteroptera and at understanding possible causes and consequences of 5S rDNA chromosomal spreading. For a comparative analysis, we applied the same approach to Holymenia histrio, in which 5S rDNA is restricted to one bivalent. Multiple 5S rDNA variants were observed in both species, though they were more variable in C. armatus, with some variants corresponding to pseudogenes. These pseudogenes suggest a birth-and-death mechanism, though homogenization was also observed (concerted evolution), indicating evolution through a mixed model. Association between transposable elements and 5S rDNA was not observed, suggesting spreading of 5S rDNA through other mechanisms, such as ectopic recombination. Scattered organization is rare for 5S rDNA, and such organization in the C. armatus genome could have led to the high diversification of sequences, favoring their pseudogenization. Copyright © 2017. Published by Elsevier B.V.
New steroid 5alpha-reductase type I (SRD5A1) homologous sequences on human chromosomes 6 and 8.
Eminović, I; Liović, M; Prezelj, J; Kocijancic, A; Rozman, D; Komel, R
2001-01-01
To date, two genes encoding 5alpha-reductase isoenzymes are known (type I, type II), and one type I pseudogene. The divergent localization of these genes, the still not fully understood function of the encoded enzymes, and the perplexing results we obtained after sequencing PCR-amplified SRD5A1 gene fragments (from genomic DNA) led us to assume that, in addition to the known SRD5A1 gene, one or more different human 5alpha-reductase type I coding genes may exist. Our research provides the first evidence for the existence of two new SRD5A1-related, previously unidentified sequences in the human genome. These sequences, which were localized to chromosomes 6 and 8, are highly homologous (> 99%) to SRD5A1 and do not contain any of the deletions or insertions that are otherwise characteristic of the SRD5AP1 pseudogene. Our results imply that these sequences may be coding parts of yet unknown, active SRD5A1 genes and/or of previously unidentified pseudogenes. These findings additionally support data of Chen et al., who confirmed the existence of various SRD5A1 proteins in cultured human skin cells.
Sex bias in copy number variation of olfactory receptor gene family depends on ethnicity.
Shadravan, Farideh
2013-01-01
Gender plays a pivotal role in human genetic identity and is also manifested in many genetic disorders, particularly mental retardation. In this study, its effect on copy number variation (CNV), which is known to cause genetic disorders, was explored. As the olfactory receptor (OR) repertoire comprises the largest human gene family, it was selected for this study, which was carried out within and between three populations, derived from 150 individuals from the 1000 Genomes Project. Analysis of 3872 CNVs detected among 791 OR loci, in which 307 loci showed CNV, revealed the following novel findings: sex bias in CNV was significantly more prevalent in uncommon than in common CNV variants of OR pseudogenes, in which the male genome showed more CNVs; and in one-copy number loss compared to complete deletion of OR pseudogenes; both findings imply a more recent evolutionary role for gender. Sex bias in copy number gain was also detected. Another novel finding was that the observed sex bias was largely dependent on ethnicity and was in general absent in East Asians. Using a public CNV database for sick children (International Standard Cytogenomic Array Consortium), the application of these findings for improving clinical molecular diagnostics is discussed by showing an example of sex bias in CNV among children with autism. Additional clinical relevance is discussed, as the most polymorphic CNV-enriched OR cluster in the human genome, located on chr 15q11.2, is found near the Prader-Willi syndrome/Angelman syndrome bi-directionally imprinted region associated with two well-known mental retardation syndromes. As olfaction represents the primitive cognition in most mammals, arguably in competition with the development of a larger brain, the extensive retention of OR pseudogenes in the females of this study might point to a parent-of-origin indirect regulatory role for OR pseudogenes in the embryonic development of the human brain. Thus any perturbation in the temporal regulation of the olfactory system could lead to developmental delay disorders including mental retardation.
Are Synonymous Sites in Primates and Rodents Functionally Constrained?
Price, Nicholas; Graur, Dan
2016-01-01
It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.
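A common way to express selective constraint against a neutral reference is as the fraction of substitutions that are "missing" relative to the neutral rate. The sketch below illustrates that idea under the assumption that constraint is defined as one minus the ratio of the observed rate to the pseudogene (neutral) rate. The function name and the rate values are hypothetical, and the abstract does not state that this exact formula was used.

```python
def fraction_constrained(rate_at_sites, neutral_rate):
    """Fraction of sites inferred to be under purifying selection,
    assuming constraint = 1 - (observed rate / neutral rate)."""
    if neutral_rate <= 0:
        raise ValueError("neutral_rate must be positive")
    return 1.0 - rate_at_sites / neutral_rate

# Hypothetical substitution rates per site, chosen for illustration only.
synonymous_rate = 0.076        # rate at synonymous sites in functional genes
pseudosynonymous_rate = 0.100  # rate at matching sites in processed pseudogenes

constraint = fraction_constrained(synonymous_rate, pseudosynonymous_rate)
print(f"{constraint:.2f}")  # 0.24
```

Because the estimate is a ratio, any mutational bias that changes the rate in pseudogenes but not in their parent genes (such as a shift in GC content) changes the apparent constraint even when no selection is acting, which is the artifact the abstract describes.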
Šmajs, David; Zobaníková, Marie; Strouhal, Michal; Čejková, Darina; Dugan-Rocha, Shannon; Pospíšilová, Petra; Norris, Steven J.; Albert, Tom; Qin, Xiang; Hallsworth-Pepin, Kym; Buhay, Christian; Muzny, Donna M.; Chen, Lei; Gibbs, Richard A.; Weinstock, George M.
2011-01-01
Treponema paraluiscuniculi is the causative agent of rabbit venereal spirochetosis. It is not infectious to humans, although its genome structure is very closely related to other pathogenic Treponema species including Treponema pallidum subspecies pallidum, the etiological agent of syphilis. In this study, the genome sequence of Treponema paraluiscuniculi, strain Cuniculi A, was determined by a combination of several high-throughput sequencing strategies. Whereas the overall size (1,133,390 bp), arrangement, and gene content of the Cuniculi A genome closely resembled those of the T. pallidum genome, the T. paraluiscuniculi genome contained a markedly higher number of pseudogenes and gene fragments (51). In addition to pseudogenes, 33 divergent genes were also found in the T. paraluiscuniculi genome. A set of 32 (out of 84) affected genes encoded proteins of known or predicted function in the Nichols genome. These proteins included virulence factors, gene regulators and components of DNA repair and recombination. The majority (52 or 61.9%) of the Cuniculi A pseudogenes and divergent genes were of unknown function. Our results indicate that T. paraluiscuniculi has evolved from a T. pallidum-like ancestor and adapted to a specialized host-associated niche (rabbits) during loss of infectivity to humans. The genes that are inactivated or altered in T. paraluiscuniculi are candidates for virulence factors important in the infectivity and pathogenesis of T. pallidum subspecies. PMID:21655244
Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth
2013-01-01
RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed a comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we showed that TPGs can act as targets for miRNAs that regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/.
Harnessing Gene Conversion in Chicken B Cells to Create a Human Antibody Sequence Repertoire
Schusser, Benjamin; Yi, Henry; Collarini, Ellen J.; Izquierdo, Shelley Mettler; Harriman, William D.; Etches, Robert J.; Leighton, Philip A.
2013-01-01
Transgenic chickens expressing human sequence antibodies would be a powerful tool to access human targets and epitopes that have been intractable in mammalian hosts because of tolerance to conserved proteins. To foster the development of the chicken platform, it is beneficial to validate transgene constructs using a rapid, cell culture-based method prior to generating fully transgenic birds. We describe a method for the expression of human immunoglobulin variable regions in the chicken DT40 B cell line and the further diversification of these genes by gene conversion. Chicken VL and VH loci were knocked out in DT40 cells and replaced with human VK and VH genes. To achieve gene conversion of human genes in chicken B cells, synthetic human pseudogene arrays were inserted upstream of the functional human VK and VH regions. Proper expression of chimeric IgM comprised of human variable regions and chicken constant regions is shown. Most importantly, sequencing of DT40 genetic variants confirmed that the human pseudogene arrays contributed to the generation of diversity through gene conversion at both the Igl and Igh loci. These data show that engineered pseudogene arrays produce a diverse pool of human antibody sequences in chicken B cells, and suggest that these constructs will express a functional repertoire of chimeric antibodies in transgenic chickens. PMID:24278246
Long-range PCR facilitates the identification of PMS2-specific mutations.
Clendenning, Mark; Hampel, Heather; LaJeunesse, Jennifer; Lindblom, Annika; Lockman, Jan; Nilbert, Mef; Senter, Leigha; Sotamaa, Kaisa; de la Chapelle, Albert
2006-05-01
Mutations within the DNA mismatch repair gene, "postmeiotic segregation increased 2" (PMS2), have been associated with a predisposition to hereditary nonpolyposis colorectal cancer (HNPCC; Lynch syndrome). The presence of a large family of highly homologous PMS2 pseudogenes has made previous attempts to sequence PMS2 very difficult. Here, we describe a novel method that utilizes long-range PCR as a way to preferentially amplify PMS2 and not the pseudogenes. A second, exon-specific, amplification from diluted long-range products enables us to obtain a clean sequence that shows no evidence of pseudogene contamination. This method has been used to screen a cohort of patients whose tumors were negative for the PMS2 protein by immunohistochemistry and had not shown any mutations within the MLH1 gene. Sequencing of the PMS2 gene from 30 colorectal and 11 endometrial cancer patients identified 10 novel sequence changes as well as 17 sequence changes that had previously been identified. In total, putative pathologic mutations were detected in 11 of the 41 families. Among these were five novel mutations, c.705+1G>T, c.736_741del6ins11, c.862_863del, c.1688G>T, and c.2007-1G>A. We conclude that PMS2 mutation detection in selected Lynch syndrome and Lynch syndrome-like patients is both feasible and desirable. Published 2006 Wiley-Liss, Inc.
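Two of the novel variants listed above change the coding sequence length, and a short calculation shows whether such a change preserves the reading frame. This is a minimal sketch of the length arithmetic only; the helper names are hypothetical, and the abstract itself does not state the mechanism by which any variant is pathogenic.

```python
def net_length_change(deleted: int, inserted: int = 0) -> int:
    """Net change in sequence length for a deletion, insertion, or delins."""
    return inserted - deleted


def is_frameshift(deleted: int, inserted: int = 0) -> bool:
    """A coding change shifts the frame when its net length is not a multiple of 3."""
    return net_length_change(deleted, inserted) % 3 != 0


# c.736_741del6ins11: six bases removed, eleven inserted (net +5).
print(is_frameshift(deleted=6, inserted=11))
# c.862_863del: two bases removed (net -2).
print(is_frameshift(deleted=2))
# A hypothetical in-frame 3 bp deletion, for contrast.
print(is_frameshift(deleted=3))
```

Splice-site variants such as c.705+1G>T lie outside the coding sequence, so this length check does not apply to them.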
Concolino, Paola; Mello, Enrica; Minucci, Angelo; Giardina, Emiliano; Zuppi, Cecilia; Toscano, Vincenzo; Capoluongo, Ettore
2009-01-01
Background: More than 90% of Congenital Adrenal Hyperplasia (CAH) cases are associated with mutations in the 21-hydroxylase gene (CYP21A2) in the HLA class III area on the short arm of chromosome 6p21.3. In this region, a 30 kb deletion produces a non-functional chimeric gene with its 5' and 3' ends corresponding to CYP21A1P pseudogene and CYP21A2, respectively. To date, five different CYP21A1P/CYP21A2 chimeric genes have been found and characterized in recent studies. In this paper, we describe a new CYP21A1P/CYP21A2 chimera (CH-6) found in an Italian CAH patient. Methods: Southern blot analysis and CYP21A2 sequencing were performed on the patient. In addition, in order to isolate the new CH-6 chimeric gene, two different strategies were used. Results: The CYP21A2 sequencing analysis showed that the patient was homozygote for the g.655C/A>G mutation and heterozygote for the p.P30L missense mutation. In addition, the promoter sequence revealed the presence, in heterozygosis, of 13 SNPs generally produced by microconversion events between gene and pseudogene. Southern blot analysis showed that the woman was heterozygote for the classic 30-kb deletion producing a new CYP21A1P/CYP21A2 chimeric gene (CH-6). The hybrid junction site was located between the end of intron 2 pseudogene, after the g.656C/A>G mutation, and the beginning of exon 3, before the 8 bp deletion. Consequently, CH-6 carries three mutations: the weak pseudogene promoter region, the p.P30L and the g.655C/A>G splice mutation. Conclusion: We describe a new CYP21A1P/CYP21A2 chimera (CH-6), associated with the HLA-B15, DR13 haplotype, in a young Italian CAH patient. PMID:19624807
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).
Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai
2014-12-01
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. COI gene begins with GTG as start codon, whereas other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as stop codon.
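The start and stop codon observations above can be reproduced for any coding sequence by reading its first and last three bases. The sketch below uses invented toy sequences, not the Hydra vulgaris genes, and the function name is hypothetical.

```python
from typing import Tuple


def start_and_stop_codons(cds: str) -> Tuple[str, str]:
    """Return the first and last codon of a coding sequence given 5' to 3'."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons (>= 2)")
    return cds[:3], cds[-3:]


# Toy sequences for illustration only.
print(start_and_stop_codons("GTGAAACCCTAA"))  # a GTG-initiated example
print(start_and_stop_codons("atggggtaa"))     # an ATG-initiated example
```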
Correa, Fernanda A; França, Marcela M; Fang, Qing; Ma, Qianyi; Bachega, Tania A; Rodrigues, Andresa; Ozel, Bilge A; Li, Jun Z; Mendonca, Berenice B; Jorge, Alexander A L; Carvalho, Luciani R; Camper, Sally A; Arnhold, Ivo J P
2017-12-01
Isolated growth hormone deficiency (IGHD) is the most common pituitary hormone deficiency and, clinically, patients have delayed bone age. High sequence similarity between CYP21A2 gene and CYP21A1P pseudogene poses difficulties for exome sequencing interpretation. A 7.5 year-old boy born to second-degree cousins presented with severe short stature (height SDS -3.7) and bone age of 6 years. Clonidine and combined pituitary stimulation tests revealed GH deficiency. Pituitary MRI was normal. The patient was successfully treated with rGH. Surprisingly, at 10.8 years, his bone age had advanced to 13 years, but physical exam, LH and testosterone levels remained prepubertal. An ACTH stimulation test disclosed a non-classic congenital adrenal hyperplasia due to 21-hydroxylase deficiency explaining the bone age advancement and, therefore, treatment with cortisone acetate was added. The genetic diagnosis of a homozygous mutation in GHRHR (p.Leu144His), a homozygous CYP21A2 mutation (p.Val282Leu) and CYP21A1P pseudogene duplication was established by Sanger sequencing, MLPA and whole-exome sequencing. We report the unusual clinical presentation of a patient born to consanguineous parents with two recessive endocrine diseases: non-classic congenital adrenal hyperplasia modifying the classical GH deficiency phenotype. We used a method of paired read mapping aided by neighbouring mis-matches to overcome the challenges of exome-sequencing in the presence of a pseudogene.
Loss or major reduction of umami taste sensation in pinnipeds
NASA Astrophysics Data System (ADS)
Sato, Jun J.; Wolsan, Mieczyslaw
2012-08-01
Umami is one of the basic tastes that humans and other vertebrates can perceive. This taste is elicited by L-amino acids and thus has a special role of detecting nutritious, protein-rich food. The T1R1 + T1R3 heterodimer acts as the principal umami receptor. The T1R1 protein is encoded by the Tas1r1 gene. We report multiple inactivating (pseudogenizing) mutations in exon 3 of this gene from four phocid and two otariid species (Pinnipedia). Jiang et al. (Proc Natl Acad Sci U S A 109:4956-4961, 2012) reported two inactivating mutations in exons 2 and 6 of this gene from another otariid species. These findings suggest lost or greatly reduced umami sensory capabilities in these species. The widespread occurrence of a nonfunctional Tas1r1 pseudogene in this clade of strictly carnivorous mammals is surprising. We hypothesize that factors underlying the pseudogenization of Tas1r1 in pinnipeds may be driven by the marine environment to which these carnivorans (Carnivora) have adapted and may include: the evolutionary change in diet from tetrapod prey to fish and cephalopods (because cephalopods and living fish contain little or no synergistic inosine 5'-monophosphate that greatly enhances umami taste), the feeding behavior of swallowing food whole without mastication (because the T1R1 + T1R3 receptor is distributed on the tongue and palate), and the saltiness of sea water (because a high concentration of sodium chloride masks umami taste).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steinkasserer, A.; Koettnitz, K.; Hauber, J.
1995-02-10
The eukaryotic initiation factor 5A (eIF-5A) has been identified as an essential cofactor for the HIV-1 transactivator protein Rev. Rev plays a key role in the complex regulation of HIV-1 gene expression and thereby in the generation of infectious virus particles. Expression of eIF-5A is vital for Rev function, and inhibition of this interaction leads to a block of the viral replication cycle. In humans, four different eIF-5A genes have been identified. One codes for the eIF-5A protein and the other three are pseudogenes. Using a panel of somatic rodent-human cell hybrids in combination with fluorescence in situ hybridization analysis, we show that the four genes map to three different chromosomes. The coding eIF-5A gene (EIF5A) maps to 17p12-p13, and the three pseudogenes EIF5AP1, EIF5AP2, and EIF5AP3 map to 10q23.3, 17q25, and 19q13.2, respectively. This is the first localization report for a eukaryotic cofactor for a regulatory HIV-1 protein. 16 refs., 1 fig.
Evolutionary history and metabolic insights of ancient mammalian uricases
Kratzer, James T.; Lanaspa, Miguel A.; Murphy, Michael N.; Cicerchi, Christina; Graves, Christina L.; Tipton, Peter A.; Ortlund, Eric A.; Johnson, Richard J.; Gaucher, Eric A.
2014-01-01
Uricase is an enzyme involved in purine catabolism and is found in all three domains of life. Curiously, uricase is not functional in some organisms despite its role in converting highly insoluble uric acid into 5-hydroxyisourate. Of particular interest is the observation that apes, including humans, cannot oxidize uric acid, and it appears that multiple, independent evolutionary events led to the silencing or pseudogenization of the uricase gene in ancestral apes. Various arguments have been made to suggest why natural selection would allow the accumulation of uric acid despite the physiological consequences of crystallized monosodium urate acutely causing liver/kidney damage or chronically causing gout. We have applied evolutionary models to understand the history of primate uricases by resurrecting ancestral mammalian intermediates before the pseudogenization events of this gene family. Resurrected proteins reveal that ancestral uricases have steadily decreased in activity since the last common ancestor of mammals gave rise to descendent primate lineages. We were also able to determine the 3D distribution of amino acid replacements as they accumulated during evolutionary history by crystallizing a mammalian uricase protein. Further, ancient and modern uricases were stably transfected into HepG2 liver cells to test one hypothesis that uricase pseudogenization allowed ancient frugivorous apes to rapidly convert fructose into fat. Finally, pharmacokinetics of an ancient uricase injected in rodents suggest that our integrated approach provides the foundation for an evolutionarily-engineered enzyme capable of treating gout and preventing tumor lysis syndrome in human patients. PMID:24550457
Using secondary structure to identify ribosomal numts: cautionary examples from the human genome.
Olson, Link E; Yoder, Anne D
2002-01-01
The identification of inadvertently sequenced mitochondrial pseudogenes (numts) is critical to any study employing mitochondrial DNA sequence data. Failure to discriminate numts correctly can confound phylogenetic reconstruction and studies of molecular evolution. This is especially problematic for ribosomal mtDNA genes. Unlike protein-coding loci, whose pseudogenes tend to accumulate diagnostic frameshift or premature stop mutations, functional ribosomal genes are not constrained to maintain a reading frame and can accumulate insertion-deletion events of varying length, particularly in nonpairing regions. Several authors have advocated using structural features of the transcribed rRNA molecule to differentiate functional mitochondrial rRNA genes from their nuclear paralogs. We explored this approach using the mitochondrial 12S rRNA gene and three known 12S numts from the human genome in the context of anthropoid phylogeny and the inferred secondary structure of primate 12S rRNA. Contrary to expectation, each of the three human numts exhibits striking concordance with secondary structure models, with little, if any, indication of their pseudogene status, and would likely escape detection based on structural criteria alone. Furthermore, we show that the unwitting inclusion of a particularly ancient (18-25 Myr old) and surprisingly cryptic human numt in a phylogenetic analysis would yield a well-supported but dramatically incorrect conclusion regarding anthropoid relationships. Though we endorse the use of secondary structure models for inferring positional homology wholeheartedly, we caution against reliance on structural criteria for the discrimination of rRNA numts, given the potential fallibility of this approach.
A case of early onset rectal cancer of Lynch syndrome with a novel deleterious PMS2 mutation.
Nomura, Sachio; Fujimoto, Yoshiya; Yamamoto, Noriko; Sato, Yuri; Ashihara, Yuumi; Kita, Mizuho; Yamaguchi, Junya; Ishikawa, Yuichi; Ueno, Masashi; Arai, Masami
2015-10-01
Heterozygous deleterious mutation of the PMS2 gene is a cause of Lynch syndrome, an autosomal dominant cancer disease. However, PMS2 mutation is rare compared with mutation of the other causative genes: MSH2, MLH1 and MSH6. PMS2 mutation has so far only been reported once from a Japanese facility. Detection of PMS2 mutation is relatively complicated due to the existence of 15 highly homologous pseudogenes, and its gene conversion event with the pseudogene PMS2CL. Therefore, for PMS2 mutation analysis, it is crucial to clearly distinguish PMS2 from its pseudogenes. We report here a novel deleterious 11 bp deletion mutation of exon 11 of PMS2 distinguished from PMS2CL in a 34-year-old Japanese female with rectal cancer. PMS2 mutated at c.1492del11 results in a truncated 500 amino acid protein rather than the wild-type protein of 862 amino acids. This is supported by the fact that, although there is usually concordance between MLH1 and PMS2 expression, cells were immunohistochemically positive for MLH1, whereas PMS2 could not be immunohistochemically stained using an anti-C-terminal PMS2 antibody, or effective PMS2 mRNA degradation with NMD caused by the frameshift mutation. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The Sequence and Analysis of Duplication Rich Human Chromosome 16
DOE R&D Accomplishments Database
Martin, Joel; Han, Cliff; Gordon, Laurie A.; Terry, Astrid; Prabhakar, Shyam; She, Xinwei; Xie, Gary; Hellsten, Uffe; Man Chan, Yee; Altherr, Michael; Couronne, Olivier; Aerts, Andrea; Bajorek, Eva; Black, Stacey; Blumer, Heather; Branscomb, Elbert; Brown, Nancy C.; Bruno, William J.; Buckingham, Judith M.; Callen, David F.; Campbell, Connie S.; Campbell, Mary L.; Campbell, Evelyn W.; Caoile, Chenier; Challacombe, Jean F.; Chasteen, Leslie A.; Chertkov, Olga; Chi, Han C.; Christensen, Mari; Clark, Lynn M.; Cohn, Judith D.; Denys, Mirian; Detter, John C.; Dickson, Mark; Dimitrijevic-Bussod, Mira; Escobar, Julio; Fawcett, Joseph J.; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Goodwin, Lynne A.; Grady, Deborah L.; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Hildebrand, Carl E.; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Jewett, Phillip E.; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Krawczyk, Marie-Claude; Leyba, Tina; Longmire, Jonathan L.; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Ludeman, Thom; Mark, Graham A.; Mcmurray, Kimberly L.; Meincke, Linda J.; Morgan, Jenna; Moyzis, Robert K.; Mundt, Mark O.; Munk, A. Christine; Nandkeshwar, Richard D.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Parson-Quintana, Beverly; Ramirez, Lucia; Rash, Sam; Retterer, James; Ricke, Darryl O.; Robinson, Donna L.; Rodriguez, Alex; Salamov, Asaf; Saunders, Elizabeth H.; Scott, Duncan; Shough, Timothy; Stallings, Raymond L.; Stalvey, Malinda; Sutherland, Robert D.; Tapia, Roxanne; Tesmer, Judith G.; Thayer, Nina; Thompson, Linda S.; Tice, Hope; Torney, David C.; Tran-Gyamfi, Mary; Tsai, Ming; Ulanovsky, Levy E.; Ustaszewska, Anna; Vo, Nu; White, P. Scott; Williams, Albert L.; Wills, Patricia L.; Wu, Jung-Rung; Wu, Kevin; Yang, Joan; DeJong, Pieter; Bruce, David; Doggett, Norman; Deaven, Larry; Schmutz, Jeremy; Grimwood, Jane; Richardson, Paul; et al.
2004-01-01
We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobasepairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.
Mapping of aldose reductase gene sequences to human chromosomes 1, 3, 7, 9, 11, and 13
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bateman, J.B.; Kojis, T.; Heinzmann, C.
1993-09-01
Aldose reductase (alditol:NAD(P)+ 1-oxidoreductase; EC 1.1.1.21) (AR) catalyzes the reduction of several aldehydes, including that of glucose, to the corresponding sugar alcohol. Using a complementary DNA clone encoding human AR, the authors mapped the gene sequences to human chromosomes 1, 3, 7, 9, 11, 13, 14, and 18 by somatic cell hybridization. By in situ hybridization analysis, sequences were localized to human chromosomes 1q32-q43, 3p12, 7q31-q35, 9q22, 11p14-p15, and 13q14-q21. As a putative functional AR gene has been mapped to chromosome 7 and a putative pseudogene to chromosome 3, the sequences on the other seven chromosomes may represent other active genes, non-aldose reductase homologous sequences, or pseudogenes. 24 refs., 3 figs., 2 tabs.
Palmer, Guy H; Futse, James E; Knowles, Donald P; Brayton, Kelly A
2006-10-01
Persistence of Anaplasma spp. in the animal reservoir host is required for efficient tick-borne transmission of these pathogens to animals and humans. Using A. marginale infection of its natural reservoir host as a model, persistent infection has been shown to reflect sequential cycles in which antigenic variants emerge, replicate, and are controlled by the immune system. Variation in the immunodominant outer-membrane protein MSP2 is generated by a process of gene conversion, in which unique hypervariable region sequences (HVRs) located in pseudogenes are recombined into a single operon-linked msp2 expression site. Although organisms expressing whole HVRs derived from pseudogenes emerge early in infection, long-term persistent infection is dependent on the generation of complex mosaics in which segments from different HVRs recombine into the expression site. The resulting combinatorial diversity generates the number of variants both predicted and shown to emerge during persistence.
A pseudogene long noncoding RNA network regulates PTEN transcription and translation in human cells
Johnsson, Per; Ackley, Amanda; Vidarsdottir, Linda; Lui, Weng-Onn; Corcoran, Martin; Grandér, Dan; Morris, Kevin V.
2013-01-01
PTEN is a tumor suppressor gene that has been shown to be under the regulatory control of a PTEN pseudogene expressed noncoding RNA, PTENpg1. Here, we characterize a previously unidentified PTENpg1 encoded antisense RNA (asRNA), which regulates PTEN transcription and PTEN mRNA stability. We find two PTENpg1 asRNA isoforms, alpha and beta. The alpha isoform functions in trans, localizes to the PTEN promoter, and epigenetically modulates PTEN transcription by the recruitment of DNMT3a and EZH2. In contrast, the beta isoform interacts with PTENpg1 through an RNA:RNA pairing interaction, which affects PTEN protein output via changes of PTENpg1 stability and microRNA sponge activity. Disruption of this asRNA-regulated network induces cell cycle arrest and sensitizes cells to doxorubicin, suggesting a biological function for the respective PTENpg1 expressed asRNAs. PMID:23435381
A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells.
Johnsson, Per; Ackley, Amanda; Vidarsdottir, Linda; Lui, Weng-Onn; Corcoran, Martin; Grandér, Dan; Morris, Kevin V
2013-04-01
PTEN is a tumor-suppressor gene that has been shown to be under the regulatory control of a PTEN pseudogene expressed noncoding RNA, PTENpg1. Here, we characterize a previously unidentified PTENpg1-encoded antisense RNA (asRNA), which regulates PTEN transcription and PTEN mRNA stability. We find two PTENpg1 asRNA isoforms, α and β. The α isoform functions in trans, localizes to the PTEN promoter and epigenetically modulates PTEN transcription by the recruitment of DNA methyltransferase 3a and Enhancer of Zeste. In contrast, the β isoform interacts with PTENpg1 through an RNA-RNA pairing interaction, which affects PTEN protein output through changes of PTENpg1 stability and microRNA sponge activity. Disruption of this asRNA-regulated network induces cell-cycle arrest and sensitizes cells to doxorubicin, which suggests a biological function for the respective PTENpg1 expressed asRNAs.
Zheng, Deyou
2008-01-01
Background: Sequencing and annotation of several mammalian genomes have revealed that segmental duplications are a common architectural feature of primate genomes; in fact, about 5% of the human genome is composed of large blocks of interspersed segmental duplications. These segmental duplications have been implicated in genomic copy-number variation, gene novelty, and various genomic disorders. However, the molecular processes involved in the evolution and regulation of duplicated sequences remain largely unexplored. Results: In this study, the profile of about 20 histone modifications within human segmental duplications was characterized using high-resolution, genome-wide data derived from a ChIP-Seq study. The analysis demonstrates that derivative loci of segmental duplications often differ significantly from the original with respect to many histone methylations. Further investigation showed that genes are present three times more frequently in the original than in the derivative, whereas pseudogenes exhibit the opposite trend. These asymmetries tend to increase with the age of segmental duplications. The uneven distribution of genes and pseudogenes does not, however, fully account for the asymmetry in the profile of histone modifications. Conclusion: The first systematic analysis of histone modifications between segmental duplications demonstrates that two seemingly 'identical' genomic copies are distinct in their epigenomic properties. Results here suggest that local chromatin environments may be implicated in the discrimination of derived copies of segmental duplications from their originals, leading to a biased pseudogenization of the new duplicates. The data also indicate that further exploration of the interactions between histone modification and sequence degeneration is necessary in order to understand the divergence of duplicated sequences. PMID:18598352
Singh, Pushpendra; Benjak, Andrej; Schuenemann, Verena J.; Herbig, Alexander; Avanzi, Charlotte; Busso, Philippe; Nieselt, Kay; Krause, Johannes; Vera-Cabrera, Lucio; Cole, Stewart T.
2015-01-01
Mycobacterium lepromatosis is an uncultured human pathogen associated with diffuse lepromatous leprosy and a reactional state known as Lucio's phenomenon. By using deep sequencing with and without DNA enrichment, we obtained the near-complete genome sequence of M. lepromatosis present in a skin biopsy from a Mexican patient, and compared it with that of Mycobacterium leprae, which has undergone extensive reductive evolution. The genomes display extensive synteny and are similar in size (∼3.27 Mb). Protein-coding genes share 93% nucleotide sequence identity, whereas pseudogenes are only 82% identical. The events that led to pseudogenization of 50% of the genome likely occurred before divergence from their most recent common ancestor (MRCA), and both M. lepromatosis and M. leprae have since accumulated new pseudogenes or acquired specific deletions. Functional comparisons suggest that M. lepromatosis has lost several enzymes required for amino acid synthesis whereas M. leprae has a defective heme pathway. M. lepromatosis has retained all functions required to infect the Schwann cells of the peripheral nervous system and therefore may also be neuropathogenic. A phylogeographic survey of 227 leprosy biopsies by differential PCR revealed that 221 contained M. leprae whereas only six, all from Mexico, harbored M. lepromatosis. Phylogenetic comparisons indicate that M. lepromatosis is closer than M. leprae to the MRCA, and a Bayesian dating analysis suggests that they diverged from their MRCA approximately 13.9 Mya. Thus, despite their ancient separation, the two leprosy bacilli are remarkably conserved and still cause similar pathologic conditions. PMID:25831531
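Figures such as "93% nucleotide sequence identity" are typically derived from pairwise alignments. The abstract does not say how identity was computed, so the sketch below is only one common convention, assumed here for illustration: count matching positions over aligned columns where neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment: one mismatch in ten compared columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Other conventions, such as counting gaps as mismatches, give lower values for the same alignment, so the chosen definition should always be reported alongside the number.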
Durand-Dubief, Mickaël; Absalon, Sabrina; Menzer, Linda; Ngwabyt, Sandra; Ersfeld, Klaus; Bastin, Philippe
2007-12-01
The protist Trypanosoma brucei possesses a single Argonaute gene called TbAGO1 that is necessary for RNAi silencing. We previously showed that in strain 427, TbAGO1 knock-out leads to a slow growth phenotype and to chromosome segregation defects. Here we report that the slow growth phenotype is linked to defects in segregation of both large and mini-chromosome populations, with large chromosomes being the most affected. These phenotypes are completely reversed upon inducible re-expression of TbAGO1 fused to GFP, demonstrating their link with TbAGO1. Trypanosomes that do not express TbAGO1 show a general increase in the abundance of transcripts derived from the short retroposon RIME (Ribosomal Interspersed Mobile Element). Supplementary large RIME transcripts emerge in the absence of RNAi, a phenomenon coupled to the disappearance of short transcripts. These fluctuations are reversed by inducible expression of GFP::TbAGO1. Furthermore, we use a combination of Northern blots, RT-PCR and sequencing to reveal that RNAi controls expression of transcripts derived from RHS (Retrotransposon Hot Spot) pseudogenes (RHS genes with retro-element(s) integrated within their coding sequence). Absence of RNAi also leads to an increase of steady-state transcripts from regular RHS genes (those without retro-element), indicating a role for pseudogene in control of gene expression. However, analysis of retroposon abundance and arrangement in the genome of multiple clonal cell lines of TbAGO1-/- failed to reveal movement of mobile elements despite the increased amounts of retroposon transcripts.
Efficient Detection of Copy Number Mutations in PMS2 Exons with a Close Homolog.
Herman, Daniel S; Smith, Christina; Liu, Chang; Vaughn, Cecily P; Palaniappan, Selvi; Pritchard, Colin C; Shirts, Brian H
2018-07-01
Detection of 3' PMS2 copy-number mutations that cause Lynch syndrome is difficult because of highly homologous pseudogenes. To improve the accuracy and efficiency of clinical screening for these mutations, we developed a new method to analyze standard capture-based, next-generation sequencing data to identify deletions and duplications in PMS2 exons 9 to 15. The approach captures sequences using PMS2 targets, maps sequences randomly among regions with equal mapping quality, counts reads aligned to homologous exons and introns, and flags read count ratios outside of empirically derived reference ranges. The method was trained on 1352 samples, including 8 known positives, and tested on 719 samples, including 17 known positives. Clinical implementation of the first version of this method detected new mutations in the training (N = 7) and test (N = 2) sets that had not been identified by our initial clinical testing pipeline. The described final method showed complete sensitivity in both sample sets and false-positive rates of 5% (training) and 7% (test), dramatically decreasing the number of cases needing additional mutation evaluation. This approach leveraged the differences between gene and pseudogene to distinguish between PMS2 and PMS2CL copy-number mutations. These methods enable efficient and sensitive Lynch syndrome screening for 3' PMS2 copy-number mutations and may be applied similarly to other genomic regions with highly homologous pseudogenes. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
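The flagging step described above (read count ratios compared against empirically derived reference ranges) can be sketched as follows. This is not the authors' pipeline: the choice of a per-region control count as the denominator, the function names, the region labels, and all numbers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def read_count_ratios(target: Dict[str, int], control: Dict[str, int]) -> Dict[str, float]:
    """Ratio of target reads to control reads for each region (assumed normalisation)."""
    return {region: target[region] / control[region] for region in target}


def flag_regions(ratios: Dict[str, float],
                 reference_ranges: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return regions whose ratio falls outside its (low, high) reference range."""
    flagged = []
    for region, ratio in ratios.items():
        low, high = reference_ranges[region]
        if ratio < low or ratio > high:
            flagged.append(region)
    return flagged


# Hypothetical counts and ranges; exon13 has roughly half the expected reads.
target_counts = {"exon13": 50, "exon14": 100}
control_counts = {"exon13": 100, "exon14": 100}
ranges = {"exon13": (0.8, 1.2), "exon14": (0.8, 1.2)}

ratios = read_count_ratios(target_counts, control_counts)
print(flag_regions(ratios, ranges))
```

A flagged region is only a candidate deletion or duplication; as the abstract notes, such cases still need additional mutation evaluation.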
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) these rDNA sequences have a propensity for being centromere proximal, and 3) sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggest that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Izawa, Kazuki; Kuwahara, Hirokazu; Kihara, Kumiko; Yuki, Masahiro; Lo, Nathan; Itoh, Takehiko; Ohkuma, Moriya; Hongoh, Yuichi
2016-10-13
"Candidatus Endomicrobium trichonymphae" (Bacteria; Elusimicrobia) is an obligate intracellular symbiont of the cellulolytic protist genus Trichonympha in the termite gut. A previous genome analysis of "Ca Endomicrobium trichonymphae" phylotype Rs-D17 (genomovar Ri2008), obtained from a Trichonympha agilis cell in the gut of the termite Reticulitermes speratus, revealed that its genome is small (1.1 Mb) and contains many pseudogenes; it is in the course of reductive genome evolution. Here we report the complete genome sequence of another Rs-D17 genomovar, Ti2015, obtained from a different T. agilis cell present in an R. speratus gut. These two genomovars share most intact protein-coding genes and pseudogenes, showing 98.6% chromosome sequence similarity. However, characteristic differences were found in their defense systems, which comprised restriction-modification and CRISPR/Cas systems. The repertoire of intact restriction-modification systems differed between the genomovars, and two of the three CRISPR/Cas loci in genomovar Ri2008 are pseudogenized or missing in genomovar Ti2015. These results suggest relaxed selection pressure for maintaining these defense systems. Nevertheless, the remaining CRISPR/Cas system in each genomovar appears to be active; none of the "spacer" sequences (112 in Ri2008 and 128 in Ti2015) were shared whereas the "repeat" sequences were identical. Furthermore, we obtained draft genomes of three additional endosymbiotic Endomicrobium phylotypes from different host protist species, and discovered multiple, intact CRISPR/Cas systems in each genome. Collectively, unlike bacteriome endosymbionts in insects, the Endomicrobium endosymbionts of termite-gut protists appear to require defense against foreign DNA, although the required level of defense has likely been reduced during their intracellular lives. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Criado, A; Martinez, J; Buling, A; Barba, J C; Merino, S; Jefferies, R; Irwin, P J
2006-12-20
As a continuation of our studies on molecular epizootiology of piroplasmosis in Spain and other countries, we present in this contribution the finding of new hosts for some piroplasms, as well as information on their 18S rRNA gene sequences. Genetic data were complemented with sequences of apocytochrome b gene (whenever possible). The following conclusions were drawn from these molecular studies: Theileria annulata is capable of infecting dogs, since it was diagnosed in a symptomatic animal. According to cytochrome b sequences, isolates from cows and dog present slight differences. The same isolates showed, however, identical sequence in the 18S rRNA gene. This exemplifies well the usefulness of the mitochondrial gene for examining infra-specific variation. Babesia bovis is an occasional parasite of equines, since it was detected in two symptomatic horses. We found evidence of genetic polymorphism occurring in the 18S rRNA gene of Spanish T. equi-like and B. ovis isolates. B. bennetti from Spanish seagull is loosely related to B. ovis, and might represent a genetically distinct branch of babesids. A partial sequence of a cytochrome b pseudogene was obtained for the first time in Babesia canis rossi from South Africa. The pseudogene is distantly related to B. bigemina cytochrome b gene. These new findings confirm the ability of some piroplasms to infect multiple hosts, as well as the existence of relatively wide genetic polymorphism with respect to the cytochrome b gene. On the other hand, the existence of mtDNA-like pseudogenes of possible nuclear location in piroplasms is interesting due to their possible impact on molecular phylogeny studies.
Zhao, Huabin; Yang, Jian-Rong; Xu, Huailiang; Zhang, Jianzhi
2010-12-01
Although it belongs to the order Carnivora, the giant panda is a vegetarian with 99% of its diet being bamboo. The draft genome sequence of the giant panda shows that its umami taste receptor gene Tas1r1 is a pseudogene, prompting the proposal that the loss of the umami perception explains why the giant panda is herbivorous. To test this hypothesis, we sequenced all six exons of Tas1r1 in another individual of the giant panda and five other carnivores. We found that the open reading frame (ORF) of Tas1r1 is intact in all these carnivores except the giant panda. The rate ratio (ω) of nonsynonymous to synonymous substitutions in Tas1r1 is significantly higher for the giant panda lineage than for other carnivore lineages. Based on the ω change and the observed number of ORF-disrupting substitutions, we estimated that the functional constraint on the giant panda Tas1r1 was relaxed ∼ 4.2 Ma, with its 95% confidence interval between 1.3 and 10 Ma. Our estimate matches the approximate date of the giant panda's dietary switch inferred from fossil records. It is probable that the giant panda's decreased reliance on meat resulted in the dispensability of the umami taste, leading to Tas1r1 pseudogenization, which in turn reinforced its herbivorous life style because of the diminished attraction of returning to meat eating in the absence of Tas1r1. Nonetheless, additional factors are likely involved because herbivores such as cow and horse still retain an intact Tas1r1.
Zhao, Huabin; Yang, Jian-Rong; Xu, Huailiang; Zhang, Jianzhi
2010-01-01
Although it belongs to the order Carnivora, the giant panda is a vegetarian with 99% of its diet being bamboo. The draft genome sequence of the giant panda shows that its umami taste receptor gene Tas1r1 is a pseudogene, prompting the proposal that the loss of the umami perception explains why the giant panda is herbivorous. To test this hypothesis, we sequenced all six exons of Tas1r1 in another individual of the giant panda and five other carnivores. We found that the open reading frame (ORF) of Tas1r1 is intact in all these carnivores except the giant panda. The rate ratio (ω) of nonsynonymous to synonymous substitutions in Tas1r1 is significantly higher for the giant panda lineage than for other carnivore lineages. Based on the ω change and the observed number of ORF-disrupting substitutions, we estimated that the functional constraint on the giant panda Tas1r1 was relaxed ∼4.2 Ma, with its 95% confidence interval between 1.3 and 10 Ma. Our estimate matches the approximate date of the giant panda's dietary switch inferred from fossil records. It is probable that the giant panda's decreased reliance on meat resulted in the dispensability of the umami taste, leading to Tas1r1 pseudogenization, which in turn reinforced its herbivorous life style because of the diminished attraction of returning to meat eating in the absence of Tas1r1. Nonetheless, additional factors are likely involved because herbivores such as cow and horse still retain an intact Tas1r1. PMID:20573776
HnRNP A3 genes and pseudogenes in the vertebrate genomes.
Makeyev, Aleksandr V; Kim, Chang Bae; Ruddle, Frank H; Enkhmandakh, Badam; Erdenechimeg, Lkhamsuren; Bayarsaihan, Dashzeveg
2005-04-01
The hnRNP A/B type proteins are abundant nuclear factors that bind to Pol II transcripts and are involved in numerous RNA-related activities. To date most data on the hnRNP A/B family have been obtained with recombinant proteins and cell cultures. Further characterization can result from an examination of the impact of various modifications in intact functional loci; however, such characterization is hampered by the presence of numerous and widely dispersed hnRNP A/B-related sequences in the mammalian genome. We have found hnRNP A3, a poorly recognized member of the hnRNP A/B family, among candidate transcription factors that interact with the regulatory region of the Hoxc8 gene and screened the human and mouse genomes for genes that encode hnRNP A3. We demonstrate that the sequence reported previously as the human hnRNP A3 gene (Accession number S63912) and located on 10p11.1 belongs to a processed pseudogene of the functional intron-containing locus HNRPA3, which we have identified on 2q31.2. We have also identified its murine orthologs on mouse chromosome 2D and rat chromosome 3q23. Alternative splices were revealed at the N-terminus and in the middle of hnRNP A3. 14 and 28 additional loci in the human and mouse genome, respectively, were mapped and identified as hnRNP A3 processed pseudogenes. In addition, we have found and compared hnRNP A3 orthologous genes in Gallus gallus, Xenopus tropicalis, and Danio rerio. The present in silico analysis serves as a necessary step toward a further functional characterization of hnRNP A3. (c) 2005 Wiley-Liss, Inc.
Etzler, J; Peyrl, A; Zatkova, A; Schildhaus, H-U; Ficek, A; Merkelbach-Bruse, S; Kratz, C P; Attarbaschi, A; Hainfellner, J A; Yao, S; Messiaen, L; Slavc, I; Wimmer, K
2008-02-01
Heterozygous germline mutations in one of the mismatch repair (MMR) genes MLH1, MSH2, MSH6, and PMS2 cause hereditary nonpolyposis colorectal cancer (HNPCC) or Lynch syndrome, a dominantly inherited cancer susceptibility syndrome. Recent reports provide evidence for a novel recessively inherited cancer syndrome with constitutive MMR deficiency due to biallelic germline mutations in one of the MMR genes. MMR-deficiency (MMR-D) syndrome is characterized by childhood brain tumors, hematological and/or gastrointestinal malignancies, and signs of neurofibromatosis type 1 (NF1). We established an RNA-based mutation detection assay for the four MMR genes, since 1) a number of splicing defects may escape detection by the analysis of genomic DNA, and 2) DNA-based mutation detection in the PMS2 gene is severely hampered by the presence of multiple highly similar pseudogenes, including PMS2CL. Using this assay, which is based on direct cDNA sequencing of RT-PCR products, we investigated two families with children suspected to suffer from MMR-D syndrome. We identified a homozygous complex MSH6 splicing alteration in the index patients of the first family and a novel homozygous PMS2 mutation (c.182delA) in the index patient of the second family. Furthermore, we demonstrate, by the analysis of a PMS2/PMS2CL "hybrid" allele carrier, that RNA-based PMS2 testing effectively avoids the caveats of genomic DNA amplification approaches; i.e., pseudogene coamplification as well as allelic dropout, and will, thus, allow more sensitive mutation analysis in MMR deficiency and in HNPCC patients with PMS2 defects. (c) 2007 Wiley-Liss, Inc.
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1987-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1990-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1988-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1989-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
The complete plastid genome sequence of Eustrephus latifolius (Asparagaceae: Lomandroideae).
Kim, Hyoung Tae; Kim, Jung Sung; Kim, Joo-Hwan
2016-01-01
The complete chloroplast (cp) genome sequence of Eustrephus latifolius was determined, the first for subfamily Lomandroideae of family Asparagaceae. It was 159,736 bp long and contained a large single copy region (82,403 bp) and a small single copy region (13,607 bp), which were separated by two inverted repeat regions (31,863 bp each). In total, 132 genes were identified, consisting of 83 coding genes, 8 rRNA genes, 38 tRNA genes, and 3 pseudogenes. rpl23 and clpP were pseudogenes due to sequence deletions. Among 23 genes containing introns, rps12 and ycf3 contained two introns and the rest had just one intron. The intact ycf68 was identified within an intron of trnI-GAU. Its amino acid sequence was almost identical to that of Phoenix dactylifera in Arecales. Ycf1 of E. latifolius was located entirely in the IR. This is similar to the cp genome structure of Lemna minor, Spirodela polyrhiza, Wolffiella lingulata, and Wolffia australiana in Alismatales.
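As a quick consistency check on the figures in the abstract above, the reported region lengths and gene counts can be summed. The sketch below assumes the usual quadripartite plastome layout (one large single copy region, one small single copy region, and two inverted repeat copies of equal length); the variable and function names are illustrative only.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 82_403   # large single copy region
SSC = 13_607   # small single copy region
IR = 31_863    # one inverted repeat copy


def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Gene categories as reported in the abstract.
gene_counts = {"coding": 83, "rRNA": 8, "tRNA": 38, "pseudogene": 3}
total_genes = sum(gene_counts.values())

print(plastome_length(LSC, SSC, IR))  # 159736
print(total_genes)                    # 132
```

Both sums agree with the totals stated in the abstract (159,736 bp and 132 genes).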
Allison, Andrew B; Mead, Daniel G; Palacios, Gustavo F; Tesh, Robert B; Holmes, Edward C
2014-01-05
Flanders virus (FLAV) and Hart Park virus (HPV) are rhabdoviruses that circulate in mosquito-bird cycles in the eastern and western United States, respectively, and constitute the only two North American representatives of the Hart Park serogroup. Previously, it was suggested that FLAV is unique among the rhabdoviruses in that it contains two pseudogenes located between the P and M genes, while the cognate sequence for HPV has been lacking. Herein, we demonstrate that FLAV and HPV do not contain pseudogenes in this region, but encode three small functional proteins designated as U1-U3 that apparently arose by gene duplication. To further investigate the U1-U3 region, we conducted the first large-scale evolutionary analysis of a member of the Hart Park serogroup by analyzing over 100 spatially and temporally distinct FLAV isolates. Our phylogeographic analysis demonstrates that although FLAV appears to be slowly evolving, phylogenetically divergent lineages co-circulate sympatrically. © 2013 Published by Elsevier Inc.
Non-concerted ITS evolution in Mammillaria (Cactaceae).
Harpke, Doerte; Peterson, Angela
2006-12-01
Molecular studies of 21 species of the large Cactaceae genus Mammillaria representing a variety of intrageneric taxonomic levels revealed a high degree of intra-individual polymorphism of the internal transcribed spacer region (ITS1, 5.8S rDNA, ITS2). Only a few of these ITS copies belong to apparently functional genes, whereas most are probably non-functional (pseudogenes). As a multiple gene family, the ITS region is subjected to concerted evolution. However, the high degree of intra-individual polymorphism of up to 36% in ITS1 and up to 35% in ITS2 suggests a non-concerted evolution of these loci in Mammillaria. Conserved angiosperm motifs of ITS1 and ITS2 were compared between genomic and cDNA ITS clones of Mammillaria. Some of these motifs (e.g., ITS1 motif 1, 'TGGT' within ITS2) in combination with the determination of GC-content, length comparisons of the spacers and ITS2 secondary structure (helices II and III) are helpful in the identification of pseudogene rDNA regions.
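GC content is one of the indicators the abstract above mentions for recognizing pseudogene rDNA regions. A minimal Python sketch of that calculation follows; the function name and the example sequences are hypothetical, and no threshold separating functional from pseudogene ITS copies is implied, since the abstract does not state one.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T/U) in seq.

    Gaps and ambiguity codes are ignored; case is ignored.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "ATU")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Hypothetical ITS fragments, for illustration only.
clones = {"clone_a": "GGCCGCGATCGG", "clone_b": "ATTATGCATTAA"}
for name, seq in clones.items():
    print(name, round(gc_content(seq), 3))
```

In practice such values would be compared across genomic and cDNA clones alongside the motif, length, and secondary-structure criteria listed in the abstract.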
Feng, Feiyue; Qiu, Bin; Zang, Ruochuan; Song, Peng; Gao, Shugeng
2017-04-25
Natural antisense transcripts (NATs), one of the most diverse classes of long noncoding RNAs (lncRNAs), have been shown to be involved in fundamental biological processes in humans. Here, we report that human prohibitin gene pseudogene 1 (PHBP1) was upregulated in esophageal squamous cell carcinoma (ESCC), and that increased PHBP1 expression in ESCC was associated with advanced clinical stage. Functional experiments showed that PHBP1 knockdown inhibited ESCC cell proliferation, colony formation, and xenograft tumor growth in vitro and in vivo by causing cell-cycle arrest at the G1-G0 phase. Mechanistic analysis revealed that the PHBP1 transcript, as an antisense transcript of PHB, is partially complementary to PHB mRNA and forms an RNA-RNA hybrid with PHB, consequently inducing an increase in PHB expression at both the mRNA and protein levels. Furthermore, PHBP1 expression is strongly correlated with PHB expression in ESCC tissues. Collectively, this study elucidates an important role of PHBP1 in promoting ESCC, partly via increasing PHB expression.
Chen, Chunxia; Cui, Xiaoying; Yu, Jun; Xiao, Jingfa; Kan, Biao
2012-01-01
Salmonella Paratyphi A (S. Paratyphi A) is a highly adapted, human-specific pathogen that causes paratyphoid fever. Cases of paratyphoid fever have recently been increasing, and the disease is becoming a major public health concern, especially in Eastern and Southern Asia. To investigate the genomic variation and evolution of S. Paratyphi A, a pan-genomic analysis was performed on five newly sequenced S. Paratyphi A strains and two other reference strains. A whole genome comparison revealed that the seven genomes are collinear and that their organization is highly conserved. The high rate of substitutions in part of the core genome indicates that there are frequent homologous recombination events. Based on the changes in the pan-genome size and cluster number (both in the core functional genes and core pseudogenes), it can be inferred that the sharply increasing number of pseudogene clusters may have strong correlation with the inactivation of functional genes, and indicates that the S. Paratyphi A genome is being degraded. PMID:23028950
The molecular dynamics of long noncoding RNA control of transcription in PTEN and its pseudogene
Lister, Nicholas; Shevchenko, Galina; Walshe, James L.; Groen, Jessica; Johnsson, Per; Vidarsdóttir, Linda; Grander, Dan; Ataide, Sandro F.; Morris, Kevin V.
2017-01-01
RNA has been found to interact with chromatin and modulate gene transcription. In human cells, little is known about how long noncoding RNAs (lncRNAs) interact with target loci in the context of chromatin. We find here, using the phosphatase and tensin homolog (PTEN) pseudogene as a model system, that antisense lncRNAs interact first with a 5′ UTR-containing promoter-spanning transcript, which is then followed by the recruitment of DNA methyltransferase 3a (DNMT3a), ultimately resulting in the transcriptional and epigenetic control of gene expression. Moreover, we find that the lncRNA and promoter-spanning transcript interaction are based on a combination of structural and sequence components of the antisense lncRNA. These observations suggest, on the basis of this one example, that evolutionary pressures may be placed on RNA structure more so than sequence conservation. Collectively, the observations presented here suggest a much more complex and vibrant RNA regulatory world may be operative in the regulation of gene expression. PMID:28847966
The Evolution of Ribosomal DNA: Divergent Paralogues and Phylogenetic Implications
Buckler-IV, E. S.; Ippolito, A.; Holtsford, T. P.
1997-01-01
Although nuclear ribosomal DNA (rDNA) repeats evolve together through concerted evolution, some genomes contain a considerable diversity of paralogous rDNA. This diversity includes not only multiple functional loci but also putative pseudogenes and recombinants. We examined the occurrence of divergent paralogues and recombinants in Gossypium, Nicotiana, Tripsacum, Winteraceae, and Zea ribosomal internal transcribed spacer (ITS) sequences. Some of the divergent paralogues are probably rDNA pseudogenes, since they have low predicted secondary structure stability, high substitution rates, and many deamination-driven substitutions at methylation sites. Under standard PCR conditions, the low stability paralogues amplified well, while many high-stability paralogues amplified poorly. Under highly denaturing PCR conditions (i.e., with dimethylsulfoxide), both low- and high-stability paralogues amplified well. We also found recombination between divergent paralogues. For phylogenetics, divergent ribosomal paralogues can aid in reconstructing ancestral states and thus serve as good outgroups. Divergent paralogues can also provide companion rDNA phylogenies. However, phylogeneticists must discriminate among families of divergent paralogues and recombinants or suffer from muddled and inaccurate organismal phylogenies. PMID:9055091
Chi, Sylvia Ighem; Urbarova, Ilona; Johansen, Steinar D
2018-04-30
The mitochondrial genomes of sea anemones are dynamic in structure. Invasion by genetic elements, such as self-catalytic group I introns or insertion-like sequences, contribute to sea anemone mitochondrial genome expansion and complexity. By using next generation sequencing we investigated the complete mtDNAs and corresponding transcriptomes of the temperate sea anemone Anemonia viridis and its closer tropical relative Anemonia majano. Two versions of fused homing endonuclease gene (HEG) organization were observed among the Actiniidae sea anemones; in-frame gene fusion and pseudo-gene fusion. We provided support for the pseudo-gene fusion organization in Anemonia species, resulting in a repressed HEG from the COI-884 group I intron. orfA, a putative protein-coding gene with insertion-like features, was present in both Anemonia species. Interestingly, orfA and COI expression were significantly up-regulated upon long-term environmental stress corresponding to low seawater pH conditions. This study provides new insights to the dynamics of sea anemone mitochondrial genome structure and function. Copyright © 2018 Elsevier B.V. All rights reserved.
Generation and reactivation of T-cell receptor A joining region pseudogenes in primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thiel, C.; Lanchbury, J.S.; Otting, N.
1996-06-01
Tandemly duplicated T-cell receptor (Tcr) AJ (Jα) segments contribute significantly to TCRA chain junctional region diversity in mammals. Since only limited data exists on TCRA diversity in nonhuman primates, we examined the TCRAJ regions of 37 chimpanzee and 71 rhesus macaque TCRA cDNA clones derived from inverse polymerase chain reaction on peripheral blood mononuclear cell cDNA of healthy animals. Twenty-five different TCRAJ regions were characterized in the chimpanzee and 36 in the rhesus macaque. Each bears a close structural relationship to an equivalent human TCRAJ region. Conserved amino acid motifs are shared between all three species. There are indications that differences between nonhuman primates and humans exist in the generation of TCRAJ pseudogenes. The nucleotide and amino acid sequences of the various characterized TCRAJ of each species are reported and we compare our results to the available information on human genomic sequences. Although we provide evidence of dynamic processes modifying TCRAJ segments during primate evolution, their repertoire and primary structure appears to be relatively conserved. 21 refs., 2 figs.
Kishida, Takushi; Kubota, Shin; Shirayama, Yoshihisa; Fukami, Hironobu
2007-08-22
An olfactory receptor (OR) multigene family is responsible for the well-developed sense of smell possessed by terrestrial tetrapods. Mammalian OR genes diverged greatly in the terrestrial environment after the fish-tetrapod split, indicating their importance to land habitation. In this study, we analysed OR genes of marine tetrapods (minke whale Balaenoptera acutorostrata, dwarf sperm whale Kogia sima, Dall's porpoise Phocoenoides dalli, Steller's sea lion Eumetopias jubatus and loggerhead sea turtle Caretta caretta) and revealed that the pseudogene proportions of OR gene repertoires in whales were significantly higher than those in their terrestrial relative, cattle, and also higher than those in sea lion and sea turtle. On the other hand, the pseudogene proportion of OR sequences in sea lion was not significantly higher than that in its terrestrial relative (dog). This indicates that secondarily fully adapted marine vertebrates (cetaceans) have lost a large number of their OR genes, whereas secondarily semi-adapted marine vertebrates (sea lions and sea turtles) have maintained their OR genes, reflecting the importance of the terrestrial environment for these animals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Guo-Shun; Grabowski, G.A.
1992-10-01
Gaucher disease is the most frequent lysosomal storage disease and the most prevalent Jewish genetic disease. About 30 identified missense mutations are causal to the defective activity of acid β-glucosidase in this disease. cDNAs were characterized from a moderately affected 9-year-old Ashkenazi Jewish Gaucher disease type 1 patient whose 80-year-old, enzyme-deficient, 1226G (Asn³⁷⁰→Ser [N370S]) homozygous grandfather was nearly asymptomatic. Sequence analyses revealed four populations of cDNAs with either the 1226G mutation, an exact exon 2 (ΔEX2) deletion, a deletion of exon 2 and the first 115 bp of exon 3 (ΔEX2-3), or a completely normal sequence. About 50% of the cDNAs were the ΔEX2, the ΔEX2-3, and the normal cDNAs, in a ratio of 6:3:1. Specific amplification and characterization of exon 2 and 5′ and 3′ intronic flanking sequences from the structural gene demonstrated clones with either the normal sequence or with a G⁺¹→A⁺¹ transition at the exon 2/intron 2 boundary. This mutation destroyed the splice donor consensus site (U1 binding site) for mRNA processing. This transition also was present at the corresponding exon/intron boundary of the highly homologous pseudogene. This new mutation, termed "IVS2 G⁺¹," is the first in the Ashkenazi Jewish population. The occurrence of this "pseudogene"-type mutation in the structural gene indicates the role of acid β-glucosidase pseudogene and structural gene rearrangements in the pathogenesis of this disease. 33 refs., 8 figs., 1 tab.
2011-01-01
Background: One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results: Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions: Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues. PMID:21226900
Characterization of trh2 harbouring Vibrio parahaemolyticus strains isolated in Germany.
Bechlars, Silke; Jäckel, Claudia; Diescher, Susanne; Wüstenhagen, Doreen A; Kubick, Stefan; Dieckmann, Ralf; Strauch, Eckhard
2015-01-01
Vibrio parahaemolyticus is a recognized human enteropathogen. Thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH) as well as the type III secretion system 2 (T3SS2) are considered as major virulence factors. As tdh positive strains are not detected in coastal waters of Germany, we focused on the characterization of trh positive strains, which were isolated from mussels, seawater and patients in Germany. Ten trh harbouring V. parahaemolyticus strains from Germany were compared to twenty-one trh positive strains from other countries. The complete trh sequences revealed clustering into three different types: trh1 and trh2 genes and a pseudogene Ψtrh. All German isolates possessed alleles of the trh2 gene. MLST analysis indicated a close relationship to Norwegian isolates suggesting that these strains belong to the autochthonous microflora of Northern Europe seawaters. Strains carrying the pseudogene Ψtrh were negative for T3SS2β effector vopC. Transcription of trh and vopC genes was analyzed under different growth conditions. Trh2 gene expression was not altered by bile while trh1 genes were inducible. VopC could be induced by urea in trh2 bearing strains. Most trh1 carrying strains were hemolytic against sheep erythrocytes while all trh2 positive strains did not show any hemolytic activity. TRH variants were synthesized in a prokaryotic cell-free system and their hemolytic activity was analyzed. TRH1 was active against sheep erythrocytes while TRH2 variants were not active at all. Our study reveals a high diversity among trh positive V. parahaemolyticus strains. The function of TRH2 hemolysins and the role of the pseudogene Ψtrh as pathogenicity factors are questionable. To assess the pathogenic potential of V. parahaemolyticus strains a differentiation of trh variants and the detection of T3SS2β components like vopC would improve the V. parahaemolyticus diagnostics and could lead to a refinement of the risk assessment in food analyses and clinical diagnostics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ploos van Amstel, H.; Reitsma, P.H.; van der Logt, C.P.
The human protein S locus on chromosome 3 consists of two protein S genes, PSα and PSβ. Here the authors report the cloning and characterization of both genes. Fifteen exons of the PSα gene were identified that together code for protein S mRNA as derived from the reported protein S cDNAs. Analysis by primer extension of liver protein S mRNA, however, reveals the presence of two mRNA forms that differ in the length of their 5′-noncoding region. Both transcripts contain a 5′-noncoding region longer than found in the protein S cDNAs. The two products may arise from alternative splicing of an additional intron in this region or from the usage of two start sites for transcription. The intron-exon organization of the PSα gene fully supports the hypothesis that the protein S gene is the product of an evolutional assembling process in which gene modules coding for structural/functional protein units also found in other coagulation proteins have been put upstream of the ancestral gene of a steroid hormone binding protein. The PSβ gene is identified as a pseudogene. It contains a large variety of detrimental aberrations, viz., the absence of exon I, a splice site mutation, three stop codons, and a frame shift mutation. Overall the two genes PSα and PSβ show between their exonic sequences 96.5% homology. Southern analysis of primate DNA showed that the duplication of the ancestral protein S gene has occurred after the branching of the orangutan from the African apes. A nonsense mutation that is present in the pseudogene of man also could be identified in one of the two protein S genes of both chimpanzee and gorilla. This implies that silencing of one of the two protein S genes must have taken place before the divergence of the three African apes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shipley, J.M.; Klinkenberg, M.; Wu, B.M.
1993-03-01
PCR of cDNA produced from patient fibroblasts allowed the authors to determine the paternal mutation in the first patient reported with β-glucuronidase-deficiency mucopolysaccharidosis type VII (MPS VII). The G→T transversion 1,881 bp downstream of the ATG translation initiation codon destroys an MboII restriction site and converts Trp627 to Cys (W627C). Digestion of genomic DNA PCR fragments with MboII indicated that the patient and the father were heterozygous for this missense mutation in exon 12. Failure to find cDNAs from patient RNA which did not contain this mutation suggested that the maternal mutation leads to greatly reduced synthesis or reduced stability of mRNA from the mutant allele. In order to identify the maternal mutation, it was necessary to analyze genomic sequences. This approach was complicated by the finding of multiple unprocessed pseudogenes and/or closely related genes. Using PCR with a panel of human/rodent hybrid cell lines, the authors found that these pseudogenes were present over chromosomes 5-7, 20, and 22 and the Y chromosome. Conditions were defined which allowed them to amplify and characterize genomic sequences for the true β-glucuronidase gene despite this background of related sequences. The patient proved to be heterozygous for a second mutation, in which a C→T transition introduces a termination codon (R356STOP) in exon 7. The mother was also heterozygous for this mutation. Expression of a cDNA containing the maternal mutation produced no enzyme activity, as expected. Expression of the paternal mutation in COS-7 cells produced a surprisingly high (65% of control) level of activity. However, activity was 13% of control in transiently transfected murine MPS VII cells. The level of activity of this mutant allele appears to correlate with the level of overexpression. 39 refs., 5 figs., 1 tab.
Callebaut, Isabelle; Laurin, Michel; Pascal, Géraldine; Poupon, Anne; Goudet, Ghylène; Monget, Philippe
2012-01-01
Genes encoding proteins involved in sperm-egg interaction and fertilization exhibit a particularly fast evolution and may participate in prezygotic species isolation [1], [2]. Some of them (ZP3, ADAM1, ADAM2, ACR and CD9) have individually been shown to evolve under positive selection [3], [4], suggesting a role of positive Darwinian selection on sperm-egg interaction. However, the genes involved in this biological function have not been systematically and exhaustively studied with an evolutionary perspective, in particular across vertebrates with internal and external fertilization. Here we show that 33 genes among the 69 that have been experimentally shown to be involved in fertilization in at least one taxon in vertebrates are under positive selection. Moreover, we identified 17 pseudogenes and 39 genes that have at least one duplicate in one species. For 15 genes, we found neither positive selection, nor gene copies or pseudogenes. Genes of teleosts, especially genes involved in sperm-oolemma fusion, appear to be more frequently under positive selection than genes of birds and eutherians. In contrast, pseudogenization, gene loss and gene gain are more frequent in eutherians. Thus, each of the 19 studied vertebrate species exhibits a unique signature characterized by gene gain and loss, as well as position of amino acids under positive selection. Reflecting these clade-specific signatures, teleosts and eutherian mammals are recovered as clades in a parsimony analysis. Interestingly the same analysis places Xenopus apart from teleosts, with which it shares the primitive external fertilization, and locates it along with amniotes (which share internal fertilization), suggesting that external or internal environmental conditions of germ cell interaction may not be the unique factors that drive the evolution of fertilization genes. 
Our work should improve our understanding of the fertilization process and on the establishment of reproductive barriers, for example by offering new leads for experiments on genes identified as positively selected. PMID:22957080
Largest vertebrate vomeronasal type 1 receptor gene repertoire in the semiaquatic platypus.
Grus, Wendy E; Shi, Peng; Zhang, Jianzhi
2007-10-01
Vertebrate vomeronasal chemoreception plays important roles in many aspects of an organism's daily life, such as mating, territoriality, and foraging. Vomeronasal type 1 receptors (V1Rs) and vomeronasal type 2 receptors (V2Rs), 2 large families of G protein-coupled receptors, serve as vomeronasal receptors to bind to various pheromones and odorants. Contrary to the previous observations of reduced olfaction in aquatic and semiaquatic mammals, we here report the surprising finding that the platypus, a semiaquatic monotreme, has the largest V1R repertoire and nearly largest combined repertoire of V1Rs and V2Rs of all vertebrates surveyed, with 270 intact genes and 579 pseudogenes in the V1R family and 15 intact genes, 55 potentially intact genes, and 57 pseudogenes in the V2R family. Phylogenetic analysis shows a remarkable expansion of the V1R repertoire and a moderate expansion of the V2R repertoire in platypus since the separation of monotremes from placentals and marsupials. Our results challenge the view that olfaction is unimportant to aquatic mammals and call for further study into the role of vomeronasal reception in platypus physiology and behavior.
Shapiro, Lori R.; Scully, Erin D.; Straub, Timothy J.; Park, Jihye; Stephenson, Andrew G.; Beattie, Gwyn A.; Gleason, Mark L.; Kolter, Roberto; Coelho, Miguel C.; De Moraes, Consuelo M.; Mescher, Mark C.; Zhaxybayeva, Olga
2016-01-01
Modern industrial agriculture depends on high-density cultivation of genetically similar crop plants, creating favorable conditions for the emergence of novel pathogens with increased fitness in managed compared with ecologically intact settings. Here, we present the genome sequence of six strains of the cucurbit bacterial wilt pathogen Erwinia tracheiphila (Enterobacteriaceae) isolated from infected squash plants in New York, Pennsylvania, Kentucky, and Michigan. These genomes exhibit a high proportion of recent horizontal gene acquisitions, invasion and remarkable amplification of mobile genetic elements, and pseudogenization of approximately 20% of the coding sequences. These genome attributes indicate that E. tracheiphila recently emerged as a host-restricted pathogen. Furthermore, chromosomal rearrangements associated with phage and transposable element proliferation contribute to substantial differences in gene content and genetic architecture between the six E. tracheiphila strains and other Erwinia species. Together, these data lead us to hypothesize that E. tracheiphila has undergone recent evolution through both genome decay (pseudogenization) and genome expansion (horizontal gene transfer and mobile element amplification). Despite evidence of dramatic genomic changes, the six strains are genetically monomorphic, suggesting a recent population bottleneck and emergence into E. tracheiphila’s current ecological niche. PMID:26992913
Viral unmasking of cellular 5S rRNA pseudogene transcripts induces RIG-I-mediated immunity.
Chiang, Jessica J; Sparrer, Konstantin M J; van Gent, Michiel; Lässig, Charlotte; Huang, Teng; Osterrieder, Nikolaus; Hopfner, Karl-Peter; Gack, Michaela U
2018-01-01
The sensor RIG-I detects double-stranded RNA derived from RNA viruses. Although RIG-I is also known to have a role in the antiviral response to DNA viruses, physiological RNA species recognized by RIG-I during infection with a DNA virus are largely unknown. Using next-generation RNA sequencing (RNAseq), we found that host-derived RNAs, most prominently 5S ribosomal RNA pseudogene 141 (RNA5SP141), bound to RIG-I during infection with herpes simplex virus 1 (HSV-1). Infection with HSV-1 induced relocalization of RNA5SP141 from the nucleus to the cytoplasm, and virus-induced shutoff of host protein synthesis downregulated the abundance of RNA5SP141-interacting proteins, which allowed RNA5SP141 to bind RIG-I and induce the expression of type I interferons. Silencing of RNA5SP141 strongly dampened the antiviral response to HSV-1 and the related virus Epstein-Barr virus (EBV), as well as influenza A virus (IAV). Our findings reveal that antiviral immunity can be triggered by host RNAs that are unshielded following depletion of their respective binding proteins by the virus.
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome.
Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O; Alawad, Abdullah O; Al-Sadi, Abdullah M; Hu, Songnian; Yu, Jun
2016-01-01
Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower as compared to that of the nuclear genome and neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provides the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants.
Venditti, C P; Lawlor, D A; Sharma, P; Chorney, M J
1996-09-01
The origins of the functional class I genes predated human speciation, a phenomenon known as trans-speciation. The retention of class Ia orthologues within the great apes, however, has not been paralleled by studies designed to examine the pseudogene content, organization, and structure of their class I regions. Therefore, we have begun the systematic characterization of the Old World primate MHCs. The numbers and sizes of fragments harboring class I sequences were similar among the chimpanzee, gorilla, and human genomes tested. Both of the gorillas included in our study possessed genomic fragments carrying orthologues of the recently evolved HLA-H pseudogene identical to those found in the human. The overall megabase restriction fragment patterns of humans and chimpanzees appeared slightly more similar to each other, although the HLA-A subregional megabase variants may have been generated following the emergence of Homo sapiens. Based on the results of this initial study, it is difficult to generate a firm species tree and to determine human's closest evolutionary neighbor. Nevertheless, an analysis of MHC subregional similarities and differences in the hominoid apes may ultimately aid in localizing and identifying MHC haplotype-associated disease genes such as idiopathic hemochromatosis.
Extensive gene conversion at the PMS2 DNA mismatch repair locus.
Hayward, Bruce E; De Vos, Michel; Valleley, Elizabeth M A; Charlton, Ruth S; Taylor, Graham R; Sheridan, Eamonn; Bonthron, David T
2007-05-01
Mutations of the PMS2 DNA repair gene predispose to a characteristic range of malignancies, with either childhood onset (when both alleles are mutated) or a partially penetrant adult onset (if heterozygous). These mutations have been difficult to detect, due to interference from a family of pseudogenes located on chromosome 7. One of these, the PMS2CL pseudogene, lies within a 100-kb inverted duplication (inv dup), 700 kb centromeric to PMS2 itself on 7p22. Here, we show that the reference genomic sequences cannot be relied upon to distinguish PMS2 from PMS2CL, because of sequence transfer between the two loci. The 7p22 inv dup occurred prior to the divergence of modern ape species (15 million years ago [Mya]), but has undergone extensive sequence homogenization. This process appears to be ongoing, since there is considerable allelic diversity within the duplicated region, much of it derived from sequence exchange between PMS2 and PMS2CL. This sequence diversity can result in both false-positive and false-negative mutation analysis at this locus. Great caution is still needed in the design and interpretation of PMS2 mutation screens. 2007 Wiley-Liss, Inc.
Das, Subhadeep; Singh, Deeksha; Madduluri, Madhavi; Chandrababunaidu, Mathu Malar; Gupta, Akash
2015-01-01
We report here the draft genome sequence of Tolypothrix campylonemoides VB511288, isolated from building facades in Santiniketan, India. The members of this genus produce several compounds of commercial importance. The draft assembly is 10,627,177 bases in 135 scaffolds, and it contains 7,886 protein-coding genes, 994 pseudogenes, 18 rRNA genes, and 76 tRNA genes. PMID:25838485
Characterization of trh2 Harbouring Vibrio parahaemolyticus Strains Isolated in Germany
Bechlars, Silke; Jäckel, Claudia; Diescher, Susanne; Wüstenhagen, Doreen A.; Kubick, Stefan; Dieckmann, Ralf; Strauch, Eckhard
2015-01-01
Background: Vibrio parahaemolyticus is a recognized human enteropathogen. Thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH) as well as the type III secretion system 2 (T3SS2) are considered as major virulence factors. As tdh positive strains are not detected in coastal waters of Germany, we focused on the characterization of trh positive strains, which were isolated from mussels, seawater and patients in Germany. Results: Ten trh harbouring V. parahaemolyticus strains from Germany were compared to twenty-one trh positive strains from other countries. The complete trh sequences revealed clustering into three different types: trh1 and trh2 genes and a pseudogene Ψtrh. All German isolates possessed alleles of the trh2 gene. MLST analysis indicated a close relationship to Norwegian isolates suggesting that these strains belong to the autochthonous microflora of Northern Europe seawaters. Strains carrying the pseudogene Ψtrh were negative for T3SS2β effector vopC. Transcription of trh and vopC genes was analyzed under different growth conditions. Trh2 gene expression was not altered by bile while trh1 genes were inducible. VopC could be induced by urea in trh2 bearing strains. Most trh1 carrying strains were hemolytic against sheep erythrocytes while all trh2 positive strains did not show any hemolytic activity. TRH variants were synthesized in a prokaryotic cell-free system and their hemolytic activity was analyzed. TRH1 was active against sheep erythrocytes while TRH2 variants were not active at all. Conclusion: Our study reveals a high diversity among trh positive V. parahaemolyticus strains. The function of TRH2 hemolysins and the role of the pseudogene Ψtrh as pathogenicity factors are questionable. To assess the pathogenic potential of V. parahaemolyticus strains a differentiation of trh variants and the detection of T3SS2β components like vopC would improve the V. parahaemolyticus diagnostics and could lead to a refinement of the risk assessment in food analyses and clinical diagnostics. PMID:25799574
Coon, Keith D; Valla, Jon; Szelinger, Szabolics; Schneider, Lonnie E; Niedzielko, Tracy L; Brown, Kevin M; Pearson, John V; Halperin, Rebecca; Dunckley, Travis; Papassotiropoulos, Andreas; Caselli, Richard J; Reiman, Eric M; Stephan, Dietrich A
2006-08-01
The role of mitochondrial dysfunction in the pathogenesis of Alzheimer's disease (AD) has been well documented. Though evidence for the role of mitochondria in AD seems incontrovertible, the impact of mitochondrial DNA (mtDNA) mutations in AD etiology remains controversial. Though mutations in mitochondrially encoded genes have repeatedly been implicated in the pathogenesis of AD, many of these studies have been plagued by lack of replication as well as potential contamination by nuclear-encoded mitochondrial pseudogenes. To assess the role of mtDNA mutations in the pathogenesis of AD, while avoiding the pitfalls of nuclear-encoded mitochondrial pseudogenes encountered in previous investigations and showcasing the benefits of a novel resequencing technology, we sequenced the entire coding region (15,452 bp) of mtDNA from 19 extremely well-characterized AD patients and 18 age-matched, unaffected controls utilizing a new, reliable, high-throughput array-based resequencing technique, the Human MitoChip. High-throughput, array-based DNA resequencing of the entire mtDNA coding region from platelets of 37 subjects revealed the presence of 208 loci displaying a total of 917 sequence variants. There were no statistically significant differences in overall mutational burden between cases and controls; however, 265 independent sites of statistically significant change between cases and controls were identified. Changed sites were found in genes associated with complexes I (30.2%), III (3.0%), IV (33.2%), and V (9.1%) as well as tRNA (10.6%) and rRNA (14.0%). Despite their statistical significance, the subtle nature of the observed changes makes it difficult to determine whether they represent true functional variants involved in AD etiology or merely naturally occurring dissimilarity. 
Regardless, this study demonstrates the tremendous value of this novel mtDNA resequencing platform, which avoids the pitfalls of erroneously amplifying nuclear-encoded mtDNA pseudogenes, and our proposed analysis paradigm, which utilizes the availability of raw signal intensity values for each of the four potential alleles to facilitate quantitative estimates of mtDNA heteroplasmy. This information provides a potential new target for burgeoning diagnostics and therapeutics that could truly assist those suffering from this devastating disorder.
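The idea of estimating heteroplasmy from per-allele signal intensities can be illustrated with a minimal sketch. The abstract does not describe the authors' actual calculation, so the normalization below, the function names, and the example intensities are all assumptions chosen for illustration only.

```python
from typing import Dict

ALLELES = ("A", "C", "G", "T")


def allele_fractions(intensities: Dict[str, float]) -> Dict[str, float]:
    """Normalize raw per-allele intensities at one site to fractions summing to 1.

    Assumption: signal is roughly proportional to allele abundance; a real
    array analysis would also need background correction and calibration.
    """
    total = sum(intensities[a] for a in ALLELES)
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {a: intensities[a] / total for a in ALLELES}


def minor_allele_fraction(intensities: Dict[str, float]) -> float:
    """Share of the signal that does not come from the strongest allele."""
    fractions = allele_fractions(intensities)
    return 1.0 - max(fractions.values())


# Hypothetical intensities for a single mtDNA position (not real data).
site = {"A": 800.0, "C": 50.0, "G": 100.0, "T": 50.0}
print(allele_fractions(site))
print(round(minor_allele_fraction(site), 3))
```

A value near zero would suggest a homoplasmic site under these assumptions, while larger values would flag candidate heteroplasmy for follow-up.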
Raju, Hemalatha B.; Tsinoremas, Nicholas F.; Capobianco, Enrico
2016-01-01
Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. 
While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches. PMID:27803687
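The contextual-analysis step, in which neighboring protein-coding genes are defined by a fixed genomic distance from an ncRNA locus, can be sketched as follows. The abstract does not give the distance used or any implementation details, so the `Locus` type, the 10 kb window, and the example coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Locus:
    name: str
    chrom: str
    start: int  # start coordinate in bp
    end: int    # end coordinate in bp, with end >= start


def gap_bp(a: Locus, b: Locus) -> int:
    """Distance in bp between two loci on the same chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def neighbors(ncrna: Locus, genes: List[Locus], max_distance: int) -> List[Locus]:
    """Protein-coding genes on the same chromosome within max_distance of the ncRNA."""
    return [
        g for g in genes
        if g.chrom == ncrna.chrom and gap_bp(ncrna, g) <= max_distance
    ]


# Hypothetical loci; names and coordinates are illustrative, not from the study.
lnc = Locus("lncRNA_1", "chr1", 1_000, 2_000)
coding = [
    Locus("geneA", "chr1", 2_500, 3_500),    # 500 bp away
    Locus("geneB", "chr1", 20_000, 21_000),  # 18,000 bp away
    Locus("geneC", "chr2", 1_500, 2_500),    # different chromosome
]
print([g.name for g in neighbors(lnc, coding, max_distance=10_000)])
```

The differential-expression status of each returned neighbor could then be looked up to assign relevance to the ncRNA, as the abstract describes.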
A novel polymorphic cytochrome P450 formed by splicing of CYP3A7 and the pseudogene CYP3AP1.
Rodriguez-Antona, Cristina; Axelson, Magnus; Otter, Charlotta; Rane, Anders; Ingelman-Sundberg, Magnus
2005-08-05
The cytochrome P450 3A7 (CYP3A7) is the most abundant CYP in human liver during fetal development and first months of postnatal age, playing an important role in the metabolism of endogenous hormones, drugs, differentiation factors, and potentially toxic and teratogenic substrates. Here we describe and characterize a novel enzyme, CYP3A7.1L, encompassing the CYP3A7.1 protein with the last four carboxyl-terminal amino acids replaced by a unique sequence of 36 amino acids, generated by splicing of CYP3A7 with CYP3AP1 RNA. The corresponding CYP3A7-3AP1 mRNA had a significant expression in liver, kidney, and gastrointestinal tract, and its presence was found to be tissue-specific and dependent on the developmental stage. Heterologous expression in yeast revealed that CYP3A7.1L was a functional enzyme with a specific activity similar to that of CYP3A7.1 and, in some conditions, a different hydroxylation specificity than CYP3A7.1 using dehydroepiandrosterone as a substrate. CYP3A7.1L was found to be polymorphic due to a mutation at position -6 of the first splicing site of CYP3AP1 (CYP3A7_39256T-->A), which abrogates the pseudogene splicing. This polymorphism had pronounced interethnic differences and was in linkage disequilibrium with other functional polymorphisms described in the CYP3A locus: CYP3A7*2 and CYP3A5*1. Therefore, the resulting CYP3A haplotypes express different sets of enzymes within the population. In conclusion, a novel mechanism, consisting of the splicing of the pseudogene CYP3AP1 to CYP3A7, causes the formation of the novel CYP3A7.1L having a different tissue distribution and functional properties than the parent CYP3A7 enzyme, with possible developmental, physiological, and toxicological consequences.
Riveros-Mckay, Fernando; Campos, Itzia; Giles-Gómez, Martha; Bolívar, Francisco
2014-01-01
Leuconostoc mesenteroides P45 was isolated from the traditional Mexican pulque beverage. We report its draft genome sequence, assembled in 6 contigs consisting of 1,874,188 bp and no plasmids. Genome annotation predicted a total of 1,800 genes, 1,687 coding sequences, 52 pseudogenes, 9 rRNAs, 51 tRNAs, 1 noncoding RNA, and 44 frameshifted genes. PMID:25377708
Das, Subhadeep; Singh, Deeksha; Madduluri, Madhavi; Chandrababunaidu, Mathu Malar; Gupta, Akash; Adhikary, Siba Prasad; Tripathy, Sucheta
2015-04-02
We report here the draft genome sequence of Tolypothrix campylonemoides VB511288, isolated from building facades in Santiniketan, India. The members of this genus produce several compounds of commercial importance. The draft assembly is 10,627,177 bases in 135 scaffolds, and it contains 7,886 protein-coding genes, 994 pseudogenes, 18 rRNA genes, and 76 tRNA genes. Copyright © 2015 Das et al.
Moreno-Avitia, Fabian; Lozano, Luis; Utrilla, Jose; Bolívar, Francisco; Escalante, Adelfo
2017-06-08
Pseudomonas chlororaphis strain ATCC 9446 is a biocontrol-related organism. We report here its draft genome sequence assembled into 35 contigs consisting of 6,783,030 bp. Genome annotation predicted a total of 6,200 genes, 6,128 coding sequences, 81 pseudogenes, 58 tRNAs, 4 noncoding RNAs (ncRNAs), and 41 frameshifted genes. Copyright © 2017 Moreno-Avitia et al.
Dezene P. W. Huber; Melissa Erickson; Christian Leutenegger; Joerg Bohlmann; Steven J. Seybold
2007-01-01
Cytochromes P450 family genes (P450s) are found in a diverse array of organisms ranging from bacteria to mammals to plants to arthropods. Although there are exceptions to this rule, organisms generally contain a fairly large number of P450 genes and pseudogenes in their genomes. For instance, among arthropods whose genomes are well characterized, the mosquito,
Evolution of the bovine lysozyme gene family: changes in gene expression and reversion of function.
Irwin, D M
1995-09-01
Recruitment of lysozyme to a digestive function in ruminant artiodactyls is associated with amplification of the gene. At least four of the approximately ten genes are expressed in the stomach, and several are expressed in nonstomach tissues. Characterization of additional lysozymelike sequences in the bovine genome has identified most, if not all, of the members of this gene family. There are at least six stomachlike lysozyme genes, two of which are pseudogenes. The stomach lysozyme pseudogenes show a pattern of concerted evolution similar to that of the functional stomach genes. At least four nonstomach lysozyme genes exist. The nonstomach lysozyme genes are not monophyletic. A gene encoding a tracheal lysozyme was isolated, and the stomach lysozyme of advanced ruminants was found to be more closely related to the tracheal lysozyme than to the stomach lysozyme of the camel or other nonstomach lysozyme genes of ruminants. The tracheal lysozyme shares with stomach lysozymes of advanced ruminants the deletion of amino acid 103, and several other adaptive sequence characteristics of stomach lysozymes. I suggest here that tracheal lysozyme has reverted from a functional stomach lysozyme. Tracheal lysozyme then represents a second instance of a change in lysozyme gene expression and function within ruminants.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Day, D.J.; Barany, F.; Speiser, P.W.
Steroid 21-hydroxylase deficiency is the most common cause of congenital adrenal hyperplasia, an inherited inability to synthesize cortisol that occurs in 1 in 10,000-15,000 births. Affected females are born with ambiguous genitalia, a condition that can be ameliorated by administering dexamethasone to the mother for most of gestation. Prenatal diagnosis is required for accurate treatment of affected females as well as for genetic counseling purposes. Approximately 95% of mutations causing this disorder result from recombinations between the gene encoding the 21-hydroxylase enzyme (CYP21) and a linked, highly homologous pseudogene (CYP21P). Approximately 20% of these mutations are gene deletions, and the remainder are gene conversions that transfer any of nine deleterious mutations from the CYP21P pseudogene to CYP21. We describe a methodology for genetic diagnosis of 21-hydroxylase deficiency that utilizes gene-specific PCR amplification in conjunction with thermostable DNA ligase to discriminate single nucleotide variations in a multiplexed ligation detection assay. The assay has been designed to be used with either fluorescent or radioactive detection of ligation products by electrophoresis on denaturing acrylamide gels and is readily adaptable for use in other disease systems. 30 refs., 5 figs.
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome
Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O.; Alawad, Abdullah O.; Al-Sadi, Abdullah M.; Hu, Songnian; Yu, Jun
2016-01-01
Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower as compared to that of the nuclear genome and neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provides the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants. PMID:27736909
The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus
Matsuda, Fumihiko; Ishii, Kazuo; Bourvagnet, Patrice; Kuma, Kei-ichi; Hayashida, Hidenori; Miyata, Takashi; Honjo, Tasuku
1998-01-01
The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region. PMID:9841928
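The segment counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract (variable names are illustrative).
total_vh_segments = 123
pseudogenes = 79

# Segments with an open reading frame are those that are not pseudogenes.
orf_segments = total_vh_segments - pseudogenes

# Breakdown of the ORF-bearing segments as reported.
expressed_as_protein = 39
expressed_as_mrna_only = 1
not_found_in_cdnas = 4

# The two tallies should agree with the 44 ORF segments stated in the abstract.
assert orf_segments == 44
assert expressed_as_protein + expressed_as_mrna_only + not_found_in_cdnas == orf_segments

pseudogene_fraction = pseudogenes / total_vh_segments
print(f"ORF segments: {orf_segments}")
print(f"Pseudogene fraction: {pseudogene_fraction:.1%}")
```

By this tally, roughly two-thirds of the VH segments in the locus are pseudogenes.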
Tiwari, Sandeep; Jamal, Syed Babar; Oliveira, Leticia Castro; Clermont, Dominique; Bizet, Chantal; Mariano, Diego; de Carvalho, Paulo Vinicius Sanches Daltro; Souza, Flavia; Pereira, Felipe Luiz; de Castro Soares, Siomar; Guimarães, Luis C; Dorella, Fernanda; Carvalho, Alex; Leal, Carlos; Barh, Debmalya; Figueiredo, Henrique; Hassan, Syed Shah; Azevedo, Vasco; Silva, Artur
2016-08-11
In this work, we describe a set of features of Corynebacterium auriscanis CIP 106629 and details of the draft genome sequence and annotation. The genome comprises a 2.5-Mbp-long single circular genome with 1,797 protein-coding genes, 5 rRNA, 50 tRNA, and 403 pseudogenes, with a G+C content of 58.50%. Copyright © 2016 Tiwari et al.
2008-04-01
small-cell lung cancer; TFP1, transferrin pseudogene 1; TGFβ, transforming growth factor-β; TNF, tumour- necrosis factor; uPA, urokinase-type plasminogen...Cell 79, 315–328 (1994). 23. Frater-Schroder, M., Risau, W., Hallmann, R., Gautschi, P. & Bohlen, P. Tumor necrosis factor type α, a potent inhibitor...relatively thin (skin) or avascular (cartilage) tissues, where post- implantation vascularization from the host is sufficient. To overcome the problem
Riveros-Mckay, Fernando; Campos, Itzia; Giles-Gómez, Martha; Bolívar, Francisco; Escalante, Adelfo
2014-11-06
Leuconostoc mesenteroides P45 was isolated from the traditional Mexican pulque beverage. We report its draft genome sequence, assembled in 6 contigs consisting of 1,874,188 bp and no plasmids. Genome annotation predicted a total of 1,800 genes, 1,687 coding sequences, 52 pseudogenes, 9 rRNAs, 51 tRNAs, 1 noncoding RNA, and 44 frameshifted genes. Copyright © 2014 Riveros-Mckay et al.
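The annotation counts quoted above are internally consistent if the 44 frameshifted genes are not counted as a separate category in the gene total. That reading is an assumption made here, not something the abstract states. A short Python check of the arithmetic:

```python
# Feature counts as quoted in the abstract.
annotation = {
    "coding sequences": 1687,
    "pseudogenes": 52,
    "rRNAs": 9,
    "tRNAs": 51,
    "noncoding RNA": 1,
}
reported_total_genes = 1800

# ASSUMPTION: the 44 frameshifted genes are already included in one of the
# categories above rather than being an additional class.
computed_total = sum(annotation.values())
print(computed_total, computed_total == reported_total_genes)
```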
Ni, ZhouXian; Ye, YouJu; Bai, Tiandao; Xu, Meng; Xu, Li-An
2017-09-11
The chloroplast genome (CPG) of Pinus massoniana, a member of the genus Pinus (Pinaceae) and a primary source of turpentine, was sequenced and analyzed in terms of gene rearrangements, ndh gene loss, and the contraction and expansion of short inverted repeats (IRs). The P. massoniana CPG has a typical quadripartite structure that includes a large single copy (LSC) region (65,563 bp), a small single copy (SSC) region (53,230 bp) and two IRs (IRa and IRb, 485 bp). A total of 108 unique genes were identified, including 73 protein-coding genes, 31 tRNAs, and 4 rRNAs. Most of the 81 simple sequence repeats (SSRs) identified in the CPG were mononucleotide motifs of the A/T type and were located in non-coding regions. Comparisons with related species revealed an inversion (21,556 bp) in the LSC region; the P. massoniana CPG lacks all 11 intact ndh genes (four ndh genes were lost completely; five remained as truncated pseudogenes; and the other two remain as pseudogenes because of short insertions or deletions). A pair of short IRs was found instead of large IRs, and size variations among pine species were observed, which resulted from short insertions or deletions and non-synchronized variations between "IRa" and "IRb". The results of phylogenetic analyses based on whole CPG sequences of 16 conifers indicated that whole CPG sequences could be used as a powerful tool in phylogenetic analyses.
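For a quadripartite plastome, the total length is the sum of the LSC, the SSC, and the two IR copies. The sketch below applies that relation to the region sizes quoted in the abstract, assuming that the 485 bp figure applies to each IR copy; the abstract does not report the total length, so the result is an implied value rather than a published one. The ndh bookkeeping is checked the same way.

```python
# Region sizes as quoted in the abstract (bp).
lsc_bp = 65_563
ssc_bp = 53_230
ir_bp = 485  # ASSUMPTION: this length applies to each of IRa and IRb


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length for one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


implied_total_bp = quadripartite_length(lsc_bp, ssc_bp, ir_bp)
print(implied_total_bp)

# ndh gene fates listed in the abstract should cover all 11 ndh genes.
ndh_lost, ndh_truncated, ndh_indel_pseudogenes = 4, 5, 2
print(ndh_lost + ndh_truncated + ndh_indel_pseudogenes)
```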
RNA-Mediated Gene Duplication and Retroposons: Retrogenes, LINEs, SINEs, and Sequence Specificity
2013-01-01
A substantial number of “retrogenes” that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3′-end sequences of various SINEs originated from a corresponding LINE. As the 3′-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template. However, the 3′-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3′-poly(A) repeats. Since the 3′-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution. PMID:23984183
Choi, Kyoung Su; Park, Kyu Tae; Park, SeonJoo
2017-11-16
Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region.
Fitz-Gibbon, Sorel; Tomida, Shuta; Li, Huiying
2013-01-01
The human skin harbors a diverse community of bacteria, including the Gram-positive, anaerobic bacterium Propionibacterium acnes. P. acnes has historically been linked to the pathogenesis of acne vulgaris, a common skin disease affecting over 80% of all adolescents in the US. To gain insight into potential P. acnes pathogenic mechanisms, we previously sequenced the complete genome of a P. acnes strain HL096PA1 that is highly associated with acne. In this study, we compared its genome to the first published complete genome KPA171202. HL096PA1 harbors a linear plasmid, pIMPLE-HL096PA1. This is the first described P. acnes plasmid. We also observed a five-fold increase of pseudogenes in HL096PA1, several of which encode proteins in carbohydrate transport and metabolism. In addition, our analysis revealed a few island-like genomic regions that are unique to HL096PA1 and a large genomic inversion spanning the ribosomal operons. Together, these findings offer a basis for understanding P. acnes virulent properties, host adaptation mechanisms, and its potential role in acne pathogenesis at the strain level. Furthermore, the plasmid identified in HL096PA1 may potentially provide a new opportunity for P. acnes genetic manipulation and targeted therapy against specific disease-associated strains. PMID:23762865
Yan, Bin; Wubuli, Aikepaer; Liu, Yidong; Wang, Xin
2018-06-01
Osteosarcoma is a common type of human carcinoma, which exhibits a high metastasis and recurrence rate. Previous studies have indicated that long non-coding RNA phosphatase and tensin homolog pseudogene 1 (lnPTENP1) has tumor suppressive action by modulating PTEN expression in different types of tumor cells. However, the potential mechanism by which lnPTENP1 has an effect in osteosarcoma cells remains elusive. In the present study, the role of lnPTENP1 in osteosarcoma cells was investigated and the possible mechanisms by which it functions were explored. It was revealed that lnPTENP1 transfection significantly inhibited osteosarcoma cell growth, proliferation, migration and invasion. LnPTENP1 transfection also significantly promoted apoptosis in Mg63 cells treated with tunicamycin. Further analysis revealed that lnPTENP1 transfection regulated osteosarcoma cell growth via the PI3K/AKT signaling pathway. In vivo assays revealed that lnPTENP1 transfection significantly inhibited osteosarcoma tumor growth and significantly increased the protein expression and phosphorylation levels of PI3K and AKT. In conclusion, the results of the present study indicated that lnPTENP1 may inhibit osteosarcoma cell growth via the PI3K/AKT signaling pathway, which may be a potential novel target for human osteosarcoma therapy.
Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas
2017-01-01
B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.
Kuwahara, Hirokazu; Yuki, Masahiro; Izawa, Kazuki; Ohkuma, Moriya; Hongoh, Yuichi
2017-01-01
The cellulolytic protist Trichonympha agilis in the termite gut permanently hosts two symbiotic bacteria, ‘Candidatus Endomicrobium trichonymphae’ and ‘Candidatus Desulfovibrio trichonymphae’. The former is an intracellular symbiont, and the latter is almost intracellular but still connected to the outside via a small pore. The complete genome of ‘Ca. Endomicrobium trichonymphae’ has previously been reported, and we here present the complete genome of ‘Ca. Desulfovibrio trichonymphae’. The genome is small (1,410,056 bp), has many pseudogenes, and retains biosynthetic pathways for various amino acids and cofactors, which are partially complementary to those of ‘Ca. Endomicrobium trichonymphae’. An amino acid permease gene has apparently been transferred between the ancestors of these two symbionts; a lateral gene transfer has affected their metabolic capacity. Notably, ‘Ca. Desulfovibrio trichonymphae’ retains the complex system to oxidize hydrogen by sulfate and/or fumarate, while genes for utilizing other substrates common in desulfovibrios are pseudogenized or missing. Thus, ‘Ca. Desulfovibrio trichonymphae’ is specialized to consume hydrogen that may otherwise inhibit fermentation processes in both T. agilis and ‘Ca. Endomicrobium trichonymphae’. The small pore may be necessary to take up sulfate. This study depicts a genome-based model of a multipartite symbiotic system within a cellulolytic protist cell in the termite gut. PMID:27801909
Toscano-Morales, Roberto; Xoconostle-Cázares, Beatriz; Cabrera-Ponce, José L.; Hinojosa-Moya, Jesús; Ruiz-Salas, Jorge L.; Galván-Gordillo, Santiago V.; Guevara-González, Ramón G.; Ruiz-Medrano, Roberto
2015-01-01
The Translationally Controlled Tumor Protein (TCTP) is a central regulator of cell proliferation and differentiation in animals, and probably also in plants. Arabidopsis harbors two TCTP genes, AtTCTP1 (At3g16640), which is an important mitotic regulator, and AtTCTP2 (At3g05540), which is considered a pseudogene. Nevertheless, we have obtained evidence suggesting that this gene is functional. Indeed, a T-DNA insertion mutant, SALK_045146, displays a lethal phenotype during early rosette stage. Also, both the AtTCTP2 promoter and structural gene are functional, and heterozygous plants show delayed development. AtTCTP1 cannot compensate for the loss of AtTCTP2, since the accumulation levels of the AtTCTP1 transcript are even higher in heterozygous plants than in wild-type plants. Leaf explants transformed with Agrobacterium rhizogenes harboring AtTCTP2, but not AtTCTP1, led to whole plant regeneration with a high frequency. Insertion of a sequence present in AtTCTP1 but absent in AtTCTP2 demonstrates that it suppresses the capacity for plant regeneration; also, this phenomenon is enhanced by the presence of TCTP (AtTCTP1 or 2) in the nuclei of root cells. This confirms that AtTCTP2 is not a pseudogene and suggests the involvement of certain TCTP isoforms in vegetative reproduction in some plant species. PMID:26191065
Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti
2016-08-01
The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at the chromosomal and molecular levels, evolving in a concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location of this sequence has been established for some species, but little molecular information has been obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis of 5S rDNA in two Abracris species, aiming to identify evolutionary dynamics. For both species, two arrays were identified: a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus the NTS (non-transcribed spacer), and a smaller sequence (named type-II) with a truncated 5S rDNA gene plus a short NTS, which was considered a pseudogene. For type-I sequences, the gene-corresponding region contained the internal control region and poly-T motif, and the NTS presented partial transposable elements. Between the species, nucleotide differences were noticed for type-I, while type-II was identical, suggesting pseudogenization in a common ancestor. From the chromosomal point of view, type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.
Fu, Lili; Han, Bingying; Tan, Deguan; Wang, Meng; Ding, Mei; Zhang, Jiaming
2016-02-22
Myrosinases are β-thioglucoside glucohydrolases and serve as defense mechanisms against insect pests and pathogens by producing toxic compounds. AtTGG6 in Arabidopsis thaliana was previously reported to be a myrosinase pseudogene but specifically expressed in pollen. However, we found that AlTGG6, an ortholog to AtTGG6 in A. lyrata (an outcrossing relative of A. thaliana) was functional, suggesting that functional AtTGG6 alleles may still exist in A. thaliana. AtTGG6 alleles in 29 A. thaliana ecotypes were cloned and sequenced. Results indicate that ten alleles were functional and encoded Myr II type myrosinase of 512 amino acids, and myrosinase activity was confirmed by overexpressing AtTGG6 in Pichia pastoris. However, the 19 other ecotypes had disabled alleles with highly polymorphic frame-shift mutations and diversified sequences. Thirteen frame-shift mutation types were identified, which occurred independently many times in the evolutionary history within a few thousand years. The functional allele was expressed specifically in pollen similar to the disabled alleles but at a higher expression level, suggesting its role in defense of pollen against insect pests such as pollen beetles. However, the defense function may have become less critical after A. thaliana evolved to self-fertilization, and thus resulted in loss of function in most ecotypes.
Dong, Suomeng; Kong, Guanghui; Qutob, Dinah; Yu, Xiaoli; Tang, Junli; Kang, Jixiong; Dai, Tingting; Wang, Hai; Gijzen, Mark; Wang, Yuanchao
2012-07-01
Necrosis- and ethylene-inducing-like proteins (NLP) are widely distributed in eukaryotic and prokaryotic plant pathogens and are considered to be important virulence factors. We identified, in total, 70 potential Phytophthora sojae NLP genes but 37 were designated as pseudogenes. Sequence alignment of the remaining 33 NLP delineated six groups. Three of these groups include proteins with an intact heptapeptide (Gly-His-Arg-His-Asp-Trp-Glu) motif, which is important for necrosis-inducing activity, whereas the motif is not conserved in the other groups. In total, 19 representative NLP genes were assessed for necrosis-inducing activity by heterologous expression in Nicotiana benthamiana. Surprisingly, only eight genes triggered cell death. The expression of the NLP genes in P. sojae was examined, distinguishing 20 expressed and 13 nonexpressed NLP genes. Real-time reverse-transcriptase polymerase chain reaction results indicate that most NLP are highly expressed during cyst germination and infection stages. Amino acid substitution ratios (Ka/Ks) of 33 NLP sequences from four different P. sojae strains resulted in identification of positive selection sites in a distinct NLP group. Overall, our study indicates that expansion and pseudogenization of the P. sojae NLP family results from an ongoing birth-and-death process, and that varying patterns of expression, necrosis-inducing activity, and positive selection suggest that NLP have diversified in function.
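The gene-family bookkeeping and the Ka/Ks statistic mentioned in this abstract can be sketched in Python. The ratio below uses uncorrected proportions of substitutions per site, which is a deliberate simplification; published analyses normally apply a substitution model and dedicated software, and the abstract does not say which method was used. The function name and the example numbers passed to it are hypothetical.

```python
# Counts stated in the abstract.
candidate_nlp_genes = 70
pseudogenes = 37
remaining = candidate_nlp_genes - pseudogenes
expressed, nonexpressed = 20, 13
assert remaining == 33
assert expressed + nonexpressed == remaining


def ka_ks(nonsyn_subs: float, nonsyn_sites: float,
          syn_subs: float, syn_sites: float) -> float:
    """Simplified Ka/Ks from raw substitution and site counts.

    Ka = nonsynonymous substitutions per nonsynonymous site,
    Ks = synonymous substitutions per synonymous site.
    No multiple-hit correction is applied (illustrative only).
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    ka = nonsyn_subs / nonsyn_sites
    ks = syn_subs / syn_sites
    if ks == 0:
        raise ValueError("Ks is zero; ratio is undefined")
    return ka / ks


# Hypothetical counts: a ratio above 1 is the usual signature of positive selection.
print(ka_ks(nonsyn_subs=12, nonsyn_sites=300, syn_subs=2, syn_sites=100))
```

A ratio near 1 is conventionally read as neutral evolution and a ratio below 1 as purifying selection, which is why site-level ratios are used to flag candidate positively selected positions.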
Isolation of CYP3A5P cDNA from human liver: a reflection of a novel cytochrome P-450 pseudogene.
Schuetz, J D; Guzelian, P S
1995-03-14
We have isolated, from a human liver cDNA library, a 1627 bp CYP3A5 cDNA variant (CYP3A5P) that contains several large insertions, deletions, and in-frame termination codons. By comparison with the genomic structure of other CYP3A genes, the major insertions in CYP3A5P cDNA demarcate the inferred sites of several CYP3A5 exons. The segments inserted in CYP3A5P have no homology with splice donor acceptor sites. It is unlikely that CYP3A5P cDNA represents an artifact of the cloning procedures since Southern blot analysis of human genomic DNA disclosed that CYP3A5P cDNA hybridized with a DNA fragment distinct from fragments that hybridized with either CYP3A5, CYP3A3 or CYP3A4. Moreover, analysis of adult human liver RNA on Northern blots hybridized with a CYP3A5P cDNA fragment revealed the presence of an mRNA with the predicted size of CYP3A5P. We conclude that CYP3A5P cDNA was derived from a separate gene, CYP3A5P, most likely a pseudogene evolved from CYP3A5.
Allele Specific shRNA for Nanog, and Its Use to Treat Cancer | NCI Technology Transfer Center | TTC
The National Cancer Institute announced positive study results indicating that the expression of NanogP8, a pseudogene of Nanog, is upregulated in human colorectal cancer spheroids formed in serum-free medium. The National Cancer Institute's Laboratory of Experimental Carcinogenesis seeks parties of interest to co-develop the use of shRNAs incorporated into a lentiviral vector as a gene therapy to inhibit NanogP8, a retrogene upregulated in several carcinomas.
Ikram, Sobia; Durandet, Monique; Vesa, Simona; Pereira, Serge; Guerche, Philippe; Bonhomme, Sandrine
2014-06-01
The F-box protein gene family is one of the largest gene families in plants, with almost 700 predicted genes in the model plant Arabidopsis. F-box proteins are key components of the ubiquitin proteasome system that allows targeted protein degradation. Transcriptome analyses indicate that half of these F-box protein genes are expressed in microspores and/or pollen, i.e., during male gametogenesis. To assess the role of F-box protein genes during this crucial developmental step, we selected 34 F-box protein genes recorded as highly and specifically expressed in pollen and isolated corresponding insertion mutants. We checked the expression level of each selected gene by RT-PCR and confirmed pollen expression for 25 genes, but specific expression for only 10 of the 34 F-box protein genes. In addition, we tested the expression level of selected F-box protein genes in 24 mutant lines and showed that 11 of them were null mutants. Transmission analysis of the mutations to the progeny showed that none of the single mutations was gametophytic lethal. These unaffected transmission efficiencies suggested leaky mutations or functional redundancy among F-box protein genes. Cytological observation of the gametophytes in the mutants confirmed these results. Combinations of mutations in F-box protein genes from the same subfamily did not lead to a transmission defect either, further highlighting functional redundancy and/or a high proportion of pseudogenes among these F-box protein genes.
Emerling, Christopher A
2017-10-01
Regressive evolution of anatomical traits often corresponds with the regression of genomic loci underlying such characters. As such, studying patterns of gene loss can be instrumental in addressing questions of gene function, resolving conflicting results from anatomical studies, and understanding the evolutionary history of clades. The evolutionary origins of snakes involved the regression of a number of anatomical traits, including limbs, taste buds and the visual system, and by analyzing serpent genomes, I was able to test three hypotheses associated with the regression of these features. The first concerns two keratins that are putatively specific to claws. Both genes that encode these keratins are pseudogenized/deleted in snake genomes, providing additional evidence of claw-specificity. The second hypothesis is that snakes lack taste buds, an issue complicated by conflicting results in the literature. I found evidence that different snakes have lost one or more taste receptors, but all snakes examined retained at least one gustatory channel. The final hypothesis addressed is that the earliest snakes were adapted to a dim light niche. I found evidence of deleted and pseudogenized genes with light-associated functions in snakes, demonstrating a pattern of gene loss similar to other dim light-adapted clades. Molecular dating estimates suggest that dim light adaptation preceded the loss of limbs, providing some bearing on interpretations of the ecological origins of snakes. Copyright © 2017 Elsevier Inc. All rights reserved.
Park, Kyu Tae
2017-01-01
Symplocarpus renifolius is a member of Araceae family that is extraordinarily diverse in appearance. Previous studies on chloroplast genomes in Araceae were focused on duckweeds (Lemnoideae) and root crops (Colocasia, commonly known as taro). Here, we determined the chloroplast genome of Symplocarpus renifolius and compared the factors, such as genes and inverted repeat (IR) junctions and performed phylogenetic analysis using other Araceae species. The chloroplast genome of S. renifolius is 158,521 bp and includes 113 genes. A comparison among the Araceae chloroplast genomes showed that infA in Lemna, Spirodela, Wolffiella, Wolffia, Dieffenbachia and Colocasia has been lost or has become a pseudogene and has only been retained in Symplocarpus. In the Araceae chloroplast DNA (cpDNA), psbZ is retained. However, psbZ duplication occurred in Wolffia species and tandem repeats were noted around the duplication regions. A comparison of the IR junction in Araceae species revealed the presence of ycf1 and rps15 in the small single copy region, whereas duckweed species contained ycf1 and rps15 in the IR region. The phylogenetic analyses of the chloroplast genomes revealed that Symplocarpus are a basal group and are sister to the other Araceae species. Consequently, infA deletion or pseudogene events in Araceae occurred after the divergence of Symplocarpus and aquatic plants (duckweeds) in Araceae and duplication events of rps15 and ycf1 occurred in the IR region. PMID:29144427
Kim, Hyoung Tae; Kim, Jung Sung; Moore, Michael J; Neubig, Kurt M; Williams, Norris H; Whitten, W Mark; Kim, Joo-Hwan
2015-01-01
Earlier research has revealed that the ndh loci have been pseudogenized, truncated, or deleted from most orchid plastomes sequenced to date, including in all available plastomes of the two most species-rich subfamilies, Orchidoideae and Epidendroideae. This study sought to resolve deeper-level phylogenetic relationships among major orchid groups and to refine the history of gene loss in the ndh loci across orchids. The complete plastomes of seven orchids, Oncidium sphacelatum (Epidendroideae), Masdevallia coccinea (Epidendroideae), Sobralia callosa (Epidendroideae), Sobralia aff. bouchei (Epidendroideae), Elleanthus sodiroi (Epidendroideae), Paphiopedilum armeniacum (Cypripedioideae), and Phragmipedium longifolium (Cypripedioideae) were sequenced and analyzed in conjunction with all other available orchid and monocot plastomes. Most ndh loci were found to be pseudogenized or lost in Oncidium, Paphiopedilum and Phragmipedium, but surprisingly, all ndh loci were found to retain full, intact reading frames in Sobralia, Elleanthus and Masdevallia. Character mapping suggests that the ndh genes were present in the common ancestor of orchids but have experienced independent, significant losses at least eight times across four subfamilies. In addition, ndhF gene loss was correlated with shifts in the position of the junction of the inverted repeat (IR) and small single-copy (SSC) regions. The Orchidaceae have unprecedented levels of homoplasy in ndh gene presence/absence, which may be correlated in part with the unusual life history of orchids. These results also suggest that ndhF plays a role in IR/SSC junction stability.
Yang, Dong-Dong; de Billerbeck, Gustavo M; Zhang, Jin-Jing; Rosenzweig, Frank; Francois, Jean-Marie
2018-01-01
Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5' sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family.
Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. Copyright © 2017 Yang et al.
de Billerbeck, Gustavo M.; Zhang, Jin-jing; Rosenzweig, Frank
2017-01-01
ABSTRACT Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5′ sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family. 
Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization. PMID:29079624
2012-01-01
Background Recent studies in humans have highlighted the importance of the monocyte chemotactic proteins (MCP) in leukocyte trafficking and their effects in inflammatory processes, tumor progression, and HIV-1 infection. In European rabbit (Oryctolagus cuniculus), one of the prime MCP targets, the chemokine receptor CCR5, underwent a unique structural alteration. Until now, no homologue of the MCP-2/CCL8a, MCP-3/CCL7 or MCP-4/CCL13 genes has been reported for this species. This is interesting, because at least the first two genes are expressed in most, if not all, mammals studied, and appear to be implicated in a variety of important chemokine ligand-receptor interactions. By assessing the Rabbit Whole Genome Sequence (WGS) data we have searched for orthologs of the mammalian genes of the MCP-Eotaxin cluster. Results We have localized the orthologs of these chemokine genes in the genome of European rabbit and compared them to those of leporid genera which do (i.e. Oryctolagus and Bunolagus) or do not share the CCR5 alteration with European rabbit (i.e. Lepus and Sylvilagus). Of the rabbit orthologs of the CCL8, CCL7, and CCL13 genes, only the last two were potentially functional, although showing some structural anomalies at the protein level. The ortholog of MCP-2/CCL8 appeared to be pseudogenized by deleterious nucleotide substitutions affecting exon 1 and exon 2. By analyzing both genomic and cDNA products, these studies were extended to wild specimens of four genera of the Leporidae family: Oryctolagus, Bunolagus, Lepus, and Sylvilagus. It appeared that the anomalies of the MCP-3/CCL7 and MCP-4/CCL13 proteins are shared among the different species of leporids. In contrast, whereas MCP-2/CCL8 was pseudogenized in every studied specimen of the Oryctolagus - Bunolagus lineage, this gene was intact in species of the Lepus - Sylvilagus lineage, and was, at least in Lepus, correctly transcribed.
Conclusion The biological function of a gene is often revealed in situations of dysfunction or gene loss. Infections with Myxoma virus (MYXV) tend to be fatal in the European rabbit (genus Oryctolagus), while being harmless in hares (genus Lepus) and benign in the cottontail rabbit (genus Sylvilagus), the natural host of the virus. This communication should stimulate research on a possible role of MCP-2/CCL8 in poxvirus-related pathogenicity. PMID:22894773
Structure of novel rat major histocompatibility complex class II genes RT1.Ha and Hb
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arimura, Yutaka; Tang, Wei Ran; Koda, Toshiaki
1995-03-01
We have cloned the novel rat MHC class II genes, RT1.Ha and Hb, which are homologous to human HLA-DPA and DPB. RT1.Hb is a pseudogene, whereas RT1.Ha is apparently intact and may have transcriptional potential. In addition, with an RT1.Ha probe, we detected a single Southern hybridization band in the genome of the mouse. This finding may afford an opportunity to analyze the HLA-DPA homologue in the mouse genome. 18 refs., 4 figs., 1 tab.
Mouse HLA-DPA homologue H2-Pa: A pseudogene that maps between H2-Pb and H2-Oa
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arimura, Y.; Koda, T.; Kishi, M.
1996-12-31
The major histocompatibility complex (MHC) class II subregion contains several subclasses of genes. The classical class II genes, HLA-DP, DQ, and DR homologues, present antigens directly to CD4+ T cells. HLA-DM homologues facilitate the efficacy and transport of antigens to the cell surface by removing the CLIP peptides from the classical class II molecules. HLA-DNA/DOB homologues show unusual expression patterns and limited polymorphism, but their function is yet to be elucidated. 15 refs., 2 figs.
A Search for Gene Fusions/Translocations in Breast Cancer
2012-10-01
specific pseudogenes. 15. SUBJECT TERMS Gene fusions, sequencing, MAST, Notch 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18 ...Figure 5B) as well as in in vivo intravasation and metastasis in chicken chorioallantoic membrane xenograft assay (Figure 5C). In contrast... [Figure residue; recoverable labels: ...mphoblastoid (n = 8), Pancreatic Benign (n = 3), Prostate Benign (n = 18), Cancer Specific; axis: Sample Frequency (%), 0 to 100]
Targeting Tumor Oct4 to Deplete Prostate Tumor and Metastasis Initiating Cells
2016-10-01
Award Number: W81XWH-13-1-0461 TITLE: Targeting Tumor Oct4 to Deplete Prostate Tumor- and Metastasis-Initiating Cells PRINCIPAL INVESTIGATOR: Daotai...29 2016 4. TITLE AND SUBTITLE Targeting Tumor Oct4 to Deplete Prostate Tumor- and Metastasis-Initiating Cells 5a. CONTRACT NUMBER 5b. GRANT NUMBER...the c-MYC oncogene. POU5F1B is a pseudogene of embryonic Oct4 (POU5F1). A recent study found that tumor Oct4 found in prostate cancer cells is due
Evolutionary constraints and the neutral theory. [mutation-caused nucleotide substitutions in DNA]
NASA Technical Reports Server (NTRS)
Jukes, T. H.; Kimura, M.
1984-01-01
The neutral theory of molecular evolution postulates that nucleotide substitutions inherently take place in DNA as a result of point mutations followed by random genetic drift. In the absence of selective constraints, the substitution rate reaches the maximum value set by the mutation rate. The rate in globin pseudogenes is about 5 x 10^-9 substitutions per site per year in mammals. Rates slower than this indicate the presence of constraints imposed by negative (natural) selection, which rejects and discards deleterious mutations.
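The arithmetic implied by this abstract can be sketched as follows. This is a minimal illustration, not taken from the report: the function names, the 10-million-year split time, and the observed rate of 1 x 10^-9 are hypothetical, and the factor of two assumes both lineages accumulate substitutions independently since their split.

```python
# Neutral substitution rate quoted in the abstract (globin pseudogenes, mammals).
NEUTRAL_RATE = 5e-9  # substitutions per site per year


def expected_divergence(rate_per_year, years_since_split):
    """Expected substitutions per site separating two lineages.

    Assumption: both branches accumulate changes at the same rate,
    hence the factor of 2.
    """
    return 2.0 * rate_per_year * years_since_split


def constraint_fraction(observed_rate, neutral_rate=NEUTRAL_RATE):
    """Fraction of mutations removed by negative selection.

    Assumption: the pseudogene rate is the unconstrained ceiling, so any
    shortfall relative to it is attributed to selective constraint.
    """
    return 1.0 - observed_rate / neutral_rate


# Hypothetical inputs, for illustration only.
divergence = expected_divergence(NEUTRAL_RATE, 10e6)  # 10 million years
constraint = constraint_fraction(1e-9)                # a slowly evolving site class
print(f"divergence per site: {divergence:.3f}")
print(f"fraction of mutations rejected: {constraint:.2f}")
```

Under these assumptions, a sequence evolving at one-fifth of the pseudogene rate would have roughly four-fifths of its mutations rejected by selection.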
Massive Losses of Taste Receptor Genes in Toothed and Baleen Whales
Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J.; Wang, Ding; Zhao, Huabin
2014-01-01
Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. PMID:24803572
Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M. James C; Li, Jianqiang; Zhong, Yang
2013-01-01
Background The central function of chloroplasts is to carry out photosynthesis, and their gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. Principal Findings/Significance Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, indicating that all genes required for photosynthesis suffer from gene loss and pseudogenization, except for psbM. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than another close holoparasitic plant, E. virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. Its own copy is much shortened and turned out to be a pseudogene. Another copy, which was not located in its cp genome, was a homolog of the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from its host via a horizontal gene transfer. PMID:23554920
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nagao, Yoshiro; Sly, W.S.; Batanian, J.R.
1995-08-10
Carbonic anhydrase V (CA V) is expressed in mitochondrial matrix in liver and several other tissues. It is of interest for its putative roles in providing bicarbonate to carbamoyl phosphate synthetase for ureagenesis and to pyruvate carboxylase for gluconeogenesis and its possible importance in explaining certain inherited metabolic disorders with hyperammonemia and hypoglycemia. Following the recent characterization of the cDNA for human CA V, we report the isolation of the human gene from two λ genomic libraries and its characterization. The CA V gene (CA5) is approximately 50 kb long and contains 7 exons and 6 introns. The exon-intron boundaries are found in positions identical to those determined for the previously described CA II, CA III, and CA VII genes. Like the CA VII gene, CA5 does not contain typical TATA and CAAT promoter elements in the 5′ flanking region but does contain a TTTAA sequence 147 nucleotides upstream of the initiation codon. CA5 also contains a 12-bp GT-rich segment beginning 13 bp downstream of the polyadenylation signal in the 3′ untranslated region of exon 7. FISH analysis allowed CA5 to be assigned to chromosome 16q24.3. An unprocessed pseudogene containing sequence homologous to exons 3-7 and introns 3-6 was also isolated and was assigned by FISH analysis to chromosome 16p11.2-p12. 22 refs., 4 figs., 1 tab.
2016-01-01
CYTIDINE DEAMINASE (CDA) catalyzes the deamination of cytidine to uridine and ammonia in the catabolic route of C nucleotides. The Arabidopsis (Arabidopsis thaliana) CDA gene family comprises nine members, one of which (AtCDA) was shown previously in vitro to encode an active CDA. A possible role in C-to-U RNA editing or in antiviral defense has been discussed for other members. A comprehensive bioinformatic analysis of plant CDA sequences, combined with biochemical functionality tests, strongly suggests that all Arabidopsis CDA family members except AtCDA are pseudogenes and that most plants only require a single CDA gene. Soybean (Glycine max) possesses three CDA genes, but only two encode functional enzymes and just one has very high catalytic efficiency. AtCDA and soybean CDAs are located in the cytosol. The functionality of AtCDA in vivo was demonstrated with loss-of-function mutants accumulating high amounts of cytidine but also CMP, cytosine, and some uridine in seeds. Cytidine hydrolysis in cda mutants is likely caused by NUCLEOSIDE HYDROLASE1 (NSH1) because cytosine accumulation is strongly reduced in a cda nsh1 double mutant. Altered responses of the cda mutants to fluorocytidine and fluorouridine indicate that a dual specific nucleoside kinase is involved in cytidine as well as uridine salvage. CDA mutants display a reduction in rosette size and have fewer leaves compared with the wild type, which is probably not caused by defective pyrimidine catabolism but by the accumulation of pyrimidine catabolism intermediates reaching toxic concentrations. PMID:27208239
DOE Office of Scientific and Technical Information (OSTI.GOV)
Malik, Afshan N., E-mail: afshan.malik@kcl.ac.uk; Shahni, Rojeen; Rodriguez-de-Ledesma, Ana
2011-08-19
Highlights: Mitochondrial dysfunction is central to many diseases of oxidative stress. 95% of the mitochondrial genome is duplicated in the nuclear genome. Dilution of untreated genomic DNA leads to dilution bias. Unique primers and template pretreatment are needed to accurately measure mitochondrial DNA content. -- Abstract: Circulating mitochondrial DNA (MtDNA) is a potential non-invasive biomarker of cellular mitochondrial dysfunction, the latter known to be central to a wide range of human diseases. Changes in MtDNA are usually determined by quantification of MtDNA relative to nuclear DNA (Mt/N) using real time quantitative PCR. We propose that the methodology for measuring Mt/N needs to be improved and we have identified that current methods have at least one of the following three problems: (1) As much of the mitochondrial genome is duplicated in the nuclear genome, many commonly used MtDNA primers co-amplify homologous pseudogenes found in the nuclear genome; (2) use of regions from genes such as β-actin and 18S rRNA which are repetitive and/or highly variable for qPCR of the nuclear genome leads to errors; and (3) the size difference of mitochondrial and nuclear genomes causes a 'dilution bias' when template DNA is diluted. We describe a PCR-based method that uses unique regions in the human mitochondrial genome not duplicated in the nuclear genome, a unique single-copy region in the nuclear genome, and template treatment to remove dilution bias, to accurately quantify MtDNA from human samples.
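For orientation, a Mt/N ratio is commonly derived from qPCR threshold cycles with the generic delta-Ct relation. The sketch below is an assumption-laden illustration and not the authors' protocol: the function names and Ct values are hypothetical, amplification efficiency is assumed to be perfect doubling for both targets, and the nuclear target is assumed to be single-copy in a diploid genome. It does not model the dilution bias or pseudogene co-amplification problems the abstract describes.

```python
def mt_to_nuclear_ratio(ct_mito, ct_nuclear, efficiency=2.0):
    """Relative abundance of the mitochondrial target versus the nuclear target.

    Assumption: both primer pairs amplify with the same per-cycle
    efficiency (2.0 means perfect doubling).
    """
    return efficiency ** (ct_nuclear - ct_mito)


def mtdna_copies_per_cell(ct_mito, ct_nuclear,
                          nuclear_copies_per_cell=2, efficiency=2.0):
    """MtDNA copies per cell, assuming a single-copy nuclear locus in a diploid cell."""
    ratio = mt_to_nuclear_ratio(ct_mito, ct_nuclear, efficiency)
    return nuclear_copies_per_cell * ratio


# Hypothetical Ct values, for illustration only.
ratio = mt_to_nuclear_ratio(ct_mito=18.0, ct_nuclear=26.0)
copies = mtdna_copies_per_cell(ct_mito=18.0, ct_nuclear=26.0)
print(f"Mt/N = {ratio:.0f}, MtDNA copies per cell = {copies:.0f}")
```

If the mitochondrial primers also amplify nuclear pseudogene copies, the mitochondrial Ct is lowered artificially and this ratio is overestimated, which is the first problem listed in the abstract.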
A-to-I RNA editing promotes developmental stage–specific gene and lncRNA expression
Goldstein, Boaz; Agranat-Tamir, Lily; Light, Dean; Ben-Naim Zgayer, Orna; Fishman, Alla; Lamm, Ayelet T.
2017-01-01
A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. Previous studies in C. elegans indicated competition between RNA interference (RNAi) and RNA editing mechanisms, based on the observation that worms that lack both mechanisms do not exhibit defects, in contrast to the developmental defects observed when only RNA editing is absent. To study the effects of RNA editing on gene expression and function, we established a novel screen that enabled us to identify thousands of RNA editing sites in nonrepetitive regions in the genome. These include dozens of genes that are edited at their 3′ UTR region. We found that these genes are mainly germline and neuronal genes, and that they are down-regulated in the absence of ADAR enzymes. Moreover, we discovered that almost half of these genes are edited in a developmental-specific manner, indicating that RNA editing is a highly regulated process. We found that many pseudogenes and other lncRNAs are also extensively down-regulated in the absence of ADARs in the embryo but not in the fourth larval (L4) stage. This down-regulation is not observed upon additional knockout of RNAi. Furthermore, levels of siRNAs aligned to pseudogenes in ADAR mutants are enhanced. Taken together, our results suggest a role for RNA editing in normal growth and development by regulating silencing via RNAi. PMID:28031250
Evolutionary maintenance of selfish homing endonuclease genes in the absence of horizontal transfer.
Yahara, Koji; Fukuyo, Masaki; Sasaki, Akira; Kobayashi, Ichizo
2009-11-03
Homing endonuclease genes are "selfish" mobile genetic elements whose endonuclease promotes the spread of its own gene by creating a break at a specific target site and using the host machinery to repair the break by copying and inserting the gene at this site. Horizontal transfer across the boundary of a species or population within which mating takes place has been thought to be necessary for their evolutionary persistence. This is based on the assumption that they will become fixed in a host population, where opportunities of homing will disappear, and become susceptible to degeneration. To test this hypothesis, we modeled behavior of a homing endonuclease gene that moves during meiosis through double-strand break repair. We mathematically explored conditions for persistence of the homing endonuclease gene and elucidated their parameter dependence as phase diagrams. We found that, if the cost of the pseudogene is lower than that of the homing endonuclease gene, the 2 forms can persist in a population through autonomous periodic oscillation. If the cost of the pseudogene is higher, 2 types of dynamics appear that enable evolutionary persistence: bistability dependent on initial frequency or fixation irrespective of initial frequency. The prediction of long persistence in the absence of horizontal transfer was confirmed by stochastic simulations in finite populations. The average time to extinction of the endonuclease gene was found to be thousands of meiotic generations or more based on realistic parameter values. These results provide a solid theoretical basis for an understanding of these and other extremely selfish elements.
New Implications on Genomic Adaptation Derived from the Helicobacter pylori Genome Comparison
Lara-Ramírez, Edgar Eduardo; Segura-Cabrera, Aldo; Guo, Xianwu; Yu, Gongxin; García-Pérez, Carlos Armando; Rodríguez-Pérez, Mario A.
2011-01-01
Background Helicobacter pylori has a reduced genome and lives in a tough environment for long-term persistence. It evolved with its particular characteristics for biological adaptation. Because several H. pylori genome sequences are available, comparative analysis could help to better understand genomic adaptation of this particular bacterium. Principal Findings We analyzed nine H. pylori genomes with emphasis on microevolution from a different perspective. Inversion was an important factor in shaping the genome structure. Illegitimate recombination led not only to genomic inversion but also to inverted fragment duplication, both of which contributed to the creation of new genes and gene families; in addition, homologous recombination contributed to inversion events. Based on the information on genomic rearrangement, the first genome scaffold structure of the H. pylori last common ancestor was produced. The core genome consists of 1186 genes, of which 22 genes could be specifically adapted to the human stomach niche. H. pylori contains a high proportion of pseudogenes whose genesis was principally caused by homopolynucleotide (HPN) mutations. Such mutations are reversible and facilitate the control of gene expression through changes in DNA structure. The reversible mutations and a quasi-panmictic feature could allow such genes or gene fragments to be frequently transferred within or between populations. Hence, pseudogenes could be a reservoir of adaptation materials and the HPN mutations could be favorable to H. pylori adaptation, leading to HPN accumulation in the genomes, which corresponds to a special feature of Helicobacter species: extremely high HPN composition of the genome. Conclusion Our research demonstrated that both genome content and structure of H. pylori have been highly adapted to its particular life style. PMID:21387011
Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster.
Robertson, Hugh M; Warr, Coral G; Carlson, John R
2003-11-25
The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods.
Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster
Robertson, Hugh M.; Warr, Coral G.; Carlson, John R.
2003-01-01
The insect chemoreceptor superfamily in Drosophila melanogaster is predicted to consist of 62 odorant receptor (Or) and 68 gustatory receptor (Gr) proteins, encoded by families of 60 Or and 60 Gr genes through alternative splicing. We include two previously undescribed Or genes and two previously undescribed Gr genes; two previously predicted Or genes are shown to be alternative splice forms. Three polymorphic pseudogenes and one highly defective pseudogene are recognized. Phylogenetic analysis reveals deep branches connecting multiple highly divergent clades within the Gr family, and the Or family appears to be a single highly expanded lineage within the superfamily. The genes are spread throughout the Drosophila genome, with some relatively recently diverged genes still clustered in the genome. The Gr5a gene on the X chromosome, which encodes a receptor for the sugar trehalose, has transposed from one such tandem cluster of six genes at cytological location 64, as has Gr61a, and all eight of these receptors might bind sugars. Analysis of intron evolution suggests that the common ancestor consisted of a long N-terminal exon encoding transmembrane domains 1-5 followed by three exons encoding transmembrane domains 6-7. As many as 57 additional introns have been acquired idiosyncratically during the evolution of the superfamily, whereas the ancestral introns and some of the older idiosyncratic introns have been lost at least 48 times independently. Altogether, these patterns of molecular evolution suggest that this is an ancient superfamily of chemoreceptors, probably dating back at least to the origin of the arthropods. PMID:14608037
Li, Xi; Zhang, Ti-Cao; Qiao, Qin; Ren, Zhumei; Zhao, Jiayuan; Yonezawa, Takahiro; Hasegawa, Masami; Crabbe, M James C; Li, Jianqiang; Zhong, Yang
2013-01-01
The central function of chloroplasts is to carry out photosynthesis, and their gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. PRINCIPAL FINDINGS/SIGNIFICANCE: Here we report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae. The cp genome of C. deserticola is greatly reduced both in size (102,657 bp) and in gene content, indicating that all genes required for photosynthesis suffer from gene loss and pseudogenization, except for psbM. The striking difference from other holoparasitic plants is that it retains almost a full set of tRNA genes, and it has lower dN/dS for most genes than another close holoparasitic plant, E. virginiana, suggesting that Cistanche deserticola has undergone fewer losses, either due to a reduced level of holoparasitism, or to a recent switch to this life history. We also found that the rpoC2 gene was present in two copies within C. deserticola. Its own copy is much shortened and turned out to be a pseudogene. Another copy, which was not located in its cp genome, was a homolog of the host plant, Haloxylon ammodendron (Chenopodiaceae), suggesting that it was acquired from its host via a horizontal gene transfer.
An evolutionary insight into the hatching strategies of pipefish and seahorse embryos.
Kawaguchi, Mari; Nakano, Yuko; Kawahara-Miki, Ryouka; Inokuchi, Mayu; Yorifuji, Makiko; Okubo, Ryohei; Nagasawa, Tatsuki; Hiroi, Junya; Kono, Tomohiro; Kaneko, Toyoji
2016-03-01
Syngnathiform fishes carry their eggs in a brood structure found in males. The brood structure differs from species to species: seahorses carry eggs within an enclosed brood pouch, messmate pipefish carry eggs in a semi-brood pouch, and alligator pipefish carry eggs in an egg compartment on the abdomen. These egg protection strategies were established during syngnathiform evolution. In the present study, we compared the hatching mode of protected embryos of three species. Electron microscopic observations revealed that alligator pipefish and messmate pipefish egg envelopes were thicker than those of seahorses, suggesting that the seahorse produces a weaker envelope. Furthermore, molecular genetic analysis revealed that these two pipefishes possessed the egg envelope-digesting enzymes, high choriolytic enzyme (HCE) and low choriolytic enzyme (LCE), as do many euteleosts. In seahorses, however, only HCE gene expression was detected. When searching the entire seahorse genome by high-throughput DNA sequencing, we did not find a functional LCE gene and only a trace of the LCE gene exon was found, confirming that the seahorse LCE gene was pseudogenized during evolution. Finally, we estimated the size and number of hatching gland cells expressing hatching enzyme genes by whole-mount in situ hybridization. The seahorse cells were the smallest of the three species, while they had the greatest number. These results suggest that the isolation of eggs from the external environment by paternal bearing might have led to a thinner egg envelope, after which the hatching enzyme genes became pseudogenized. J. Exp. Zool. (Mol. Dev. Evol.) 9999B:XX-XX, 2016. © 2016 Wiley Periodicals, Inc.
Khan, Imran; Maldonado, Emanuel; Vasconcelos, Vítor; O'Brien, Stephen J; Johnson, Warren E; Antunes, Agostinho
2014-09-10
Adaptation of mammals to terrestrial life was facilitated by the unique vertebrate trait of body hair, which occurs in a range of morphological patterns. Keratin associated proteins (KRTAPs), the major structural hair shaft proteins, are largely responsible for hair variation. We exhaustively characterized the KRTAP gene family in 22 mammalian genomes, confirming the existence of 30 KRTAP subfamilies evolving at different rates with varying degrees of diversification and homogenization. Within the two major classes of KRTAPs, the high cysteine (HS) subfamily experienced strong concerted evolution, high rates of gene conversion/recombination and high GC content. In contrast, high glycine-tyrosine (HGT) KRTAPs showed evidence of positive selection and low rates of gene conversion/recombination. Species with more hair and of higher complexity tended to have more KRTAP genes (gene expansion). The sloth, with long and coarse hair, had the most KRTAP genes (175 with 141 being intact). By contrast, the "hairless" dolphin had 35 KRTAPs and the highest pseudogenization rate (74% relative to the 19% mammalian average). Unique hair-related phenotypes, such as scales (armadillo) and spines (hedgehog), were correlated with changes in KRTAPs. Gene expression variation probably also influences hair diversification patterns; for example, humans have the same KRTAP repertoire as apes, but much less hair. We hypothesize that differences in KRTAP gene repertoire and gene expression, together with distinct rates of gene conversion/recombination, pseudogenization and positive selection, are likely responsible for micro and macro-phenotypic hair diversification among mammals in response to adaptations to ecological pressures.
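The pseudogenization rates quoted in this abstract are simple proportions. As a small worked example using only the sloth counts given above, and assuming (this is an interpretation, not stated in the abstract) that every KRTAP gene not counted as intact is a pseudogene, the rate can be computed with a hypothetical helper:

```python
def pseudogenization_rate(total_genes, intact_genes):
    """Fraction of a gene family that is not intact.

    Assumption: every gene not counted as intact is treated as a pseudogene.
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    if not 0 <= intact_genes <= total_genes:
        raise ValueError("intact_genes must lie between 0 and total_genes")
    return (total_genes - intact_genes) / total_genes


# Sloth counts from the abstract: 175 KRTAP genes, 141 of them intact.
sloth_rate = pseudogenization_rate(175, 141)
print(f"sloth pseudogenization rate: {sloth_rate:.1%}")
```

Under that assumption the sloth sits close to the 19% mammalian average reported in the abstract, far below the 74% reported for the dolphin.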
Suárez-Esquivel, Marcela; Baker, Kate S.; Ruiz-Villalobos, Nazareth; Hernández-Mora, Gabriela; Barquero-Calvo, Elías; González-Barrientos, Rocío; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Chacón-Díaz, Carlos; Cloeckaert, Axel; Chaves-Olarte, Esteban; Thomson, Nicholas R.; Moreno, Edgardo
2017-01-01
Abstract Intracellular bacterial pathogens probably arose when their ancestor adapted from a free-living environment to an intracellular one, leading to clonal bacteria with smaller genomes and fewer sources of genetic plasticity. Still, this plasticity is needed to respond to the challenges posed by the host. Members of the Brucella genus are facultative-extracellular intracellular bacteria responsible for causing brucellosis in a variety of mammals. The various species keep different host preferences, virulence, and zoonotic potential despite having 97–99% similarity at genome level. Here, we describe elements of genetic variation in Brucella ceti isolated from wild dolphins inhabiting the Pacific Ocean and the Mediterranean Sea. Comparison with isolates obtained from marine mammals from the Atlantic Ocean and the broader Brucella genus showed distinctive traits according to oceanic distribution and preferred host. Marine mammal isolates display genetic variability, represented by an important number of IS711 elements as well as specific IS711 and SNP genomic distribution clustering patterns. Extensive pseudogenization was found among isolates from marine mammals as compared with terrestrial ones, causing degradation in pathways related to energy, transport of metabolites, and regulation/transcription. Brucella ceti isolates infecting dolphin hosts in particular showed further degradation of metabolite transport pathways as well as pathways related to cell wall/membrane/envelope biogenesis and motility. Thus, gene loss through pseudogenization is a source of genetic variation in Brucella, which in turn relates to adaptation to different hosts. This is relevant to understanding the natural history of bacterial diseases, their zoonotic potential, and the impact of human interventions such as domestication. PMID:28854602
Evolutionary maintenance of selfish homing endonuclease genes in the absence of horizontal transfer
Yahara, Koji; Fukuyo, Masaki; Sasaki, Akira; Kobayashi, Ichizo
2009-01-01
Homing endonuclease genes are “selfish” mobile genetic elements whose endonuclease promotes the spread of its own gene by creating a break at a specific target site and using the host machinery to repair the break by copying and inserting the gene at this site. Horizontal transfer across the boundary of a species or population within which mating takes place has been thought to be necessary for their evolutionary persistence. This is based on the assumption that they will become fixed in a host population, where opportunities of homing will disappear, and become susceptible to degeneration. To test this hypothesis, we modeled behavior of a homing endonuclease gene that moves during meiosis through double-strand break repair. We mathematically explored conditions for persistence of the homing endonuclease gene and elucidated their parameter dependence as phase diagrams. We found that, if the cost of the pseudogene is lower than that of the homing endonuclease gene, the 2 forms can persist in a population through autonomous periodic oscillation. If the cost of the pseudogene is higher, 2 types of dynamics appear that enable evolutionary persistence: bistability dependent on initial frequency or fixation irrespective of initial frequency. The prediction of long persistence in the absence of horizontal transfer was confirmed by stochastic simulations in finite populations. The average time to extinction of the endonuclease gene was found to be thousands of meiotic generations or more based on realistic parameter values. These results provide a solid theoretical basis for an understanding of these and other extremely selfish elements. PMID:19837694
Moore, Michael J.; Neubig, Kurt M.; Williams, Norris H.; Whitten, W. Mark; Kim, Joo-Hwan
2015-01-01
Earlier research has revealed that the ndh loci have been pseudogenized, truncated, or deleted from most orchid plastomes sequenced to date, including in all available plastomes of the two most species-rich subfamilies, Orchidoideae and Epidendroideae. This study sought to resolve deeper-level phylogenetic relationships among major orchid groups and to refine the history of gene loss in the ndh loci across orchids. The complete plastomes of seven orchids, Oncidium sphacelatum (Epidendroideae), Masdevallia coccinea (Epidendroideae), Sobralia callosa (Epidendroideae), Sobralia aff. bouchei (Epidendroideae), Elleanthus sodiroi (Epidendroideae), Paphiopedilum armeniacum (Cypripedioideae), and Phragmipedium longifolium (Cypripedioideae) were sequenced and analyzed in conjunction with all other available orchid and monocot plastomes. Most ndh loci were found to be pseudogenized or lost in Oncidium, Paphiopedilum and Phragmipedium, but surprisingly, all ndh loci were found to retain full, intact reading frames in Sobralia, Elleanthus and Masdevallia. Character mapping suggests that the ndh genes were present in the common ancestor of orchids but have experienced independent, significant losses at least eight times across four subfamilies. In addition, ndhF gene loss was correlated with shifts in the position of the junction of the inverted repeat (IR) and small single-copy (SSC) regions. The Orchidaceae have unprecedented levels of homoplasy in ndh gene presence/absence, which may be correlated in part with the unusual life history of orchids. These results also suggest that ndhF plays a role in IR/SSC junction stability. PMID:26558895
Whole-genome analyses of speciation events in pathogenic Brucellae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S. G.; Comerci, Diego J.; Tolmasky, Marcelo E.
Despite their high DNA identity and a proposal to group classical Brucella species as biovars of Brucella melitensis, the commonly recognized Brucella species can be distinguished by distinct biochemical and fatty acid characters, as well as by a marked host range (e.g., Brucella suis for swine, B. melitensis for sheep and goats, and Brucella abortus for cattle). Here we present the genome of B. abortus 2308, the virulent prototype biovar 1 strain, and its comparison to the two other human pathogenic Brucella species and to B. abortus field isolate 9-941. The global distribution of pseudogenes, deletions, and insertions supports previous indications that B. abortus and B. melitensis share a common ancestor that diverged from B. suis. With the exception of a dozen genes, the genetic complements of both B. abortus strains are identical, whereas the three species differ in gene content and pseudogenes. The pattern of species-specific gene inactivations affecting transcriptional regulators and outer membrane proteins suggests that these inactivations may play an important role in the establishment of host specificity and may have been a primary driver of speciation in the genus Brucella. Despite being nonmotile, the brucellae contain flagellum gene clusters and display species-specific flagellar gene inactivations, which lead to the putative generation of different versions of flagellum-derived structures and may contribute to differences in host specificity and virulence. Metabolic changes such as the lack of complete metabolic pathways for the synthesis of numerous compounds (e.g., glycogen, biotin, NAD, and choline) are consistent with adaptation of brucellae to an intracellular life-style.
Whole-genome analyses of the speciation events in the pathogenic Brucellae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, P; Comerci, D; Tolmasky, M
Despite their high DNA identity and a proposal to group classical Brucella species as biovars of B. melitensis, the commonly recognized Brucella species can be distinguished by distinct biochemical and fatty acid characters as well as by a marked host range (e.g. B. suis for swine, B. melitensis for sheep and goats, B. abortus for cattle). Here we present the genome of B. abortus 2308, the virulent prototype biovar 1 strain, and its comparison to the two other human pathogenic Brucellae species and to the B. abortus field isolate 9-941. The global distribution of pseudogenes, deletions and insertions supports previous indications that B. abortus and B. melitensis share a common ancestor that diverged from B. suis. With the exception of a dozen genes, the genetic complement of both B. abortus strains is identical, whereas the three species differ in gene content and pseudogenes. The pattern of species-specific gene inactivations affecting transcriptional regulators and outer membrane proteins suggests that these inactivations may play an important role in the establishment of host-specificity and may have been a primary driver of speciation in the Brucellae. Despite being non-motile, the Brucellae contain flagellum gene clusters and display species-specific flagellar gene inactivations, which lead to the putative generation of different versions of flagellum-derived structures, and may contribute to differences in host-specificity and virulence. Metabolic changes such as the lack of complete metabolic pathways for the synthesis of numerous compounds (e.g. glycogen, biotin, NAD, and choline) are consistent with adaptation of Brucellae to an intracellular lifestyle.
Evolution of Olfactory Receptor Genes in Primates Dominated by Birth-and-Death Process
Dong, Dong; He, Guimei; Zhang, Shuyi
2009-01-01
Olfactory receptors (ORs) are a large family of G protein–coupled receptors that detect odorants to generate the sense of smell. They constitute one of the largest multigene families in animals, including primates. To better understand the variation in odor perception and the evolution of OR genes among primates, we computationally identified OR gene repertoires in orangutans, marmosets, and mouse lemurs and investigated the birth-and-death process of OR genes in the primate lineage. The results showed that 1) all the primate species studied have no more than 400 intact OR genes, fewer than rodents and canines; 2) despite the similar number of OR genes in the genome, the makeup of the OR gene repertoires differs considerably between primate species, as they have undergone dramatic birth-and-death evolution with extensive gene losses in the lineages leading to current species; 3) apes and Old World monkeys (OWMs) have similar fractions of pseudogenes, whereas New World monkeys (NWMs) have fewer pseudogenes. To measure the selective pressure that has affected the OR gene repertoires in primates, we compared the ratio of nonsynonymous to synonymous substitution rates by using 70 one-to-one orthologous quintets among five primate species. We found that OR genes showed more relaxed selective constraints in apes (humans, chimpanzees, and orangutans) than in OWMs (macaques) and NWMs (marmosets). We concluded that OR gene repertoires in primates have evolved to adapt to their respective living environments. Differential selective constraints might play an important role in OR gene evolution in each primate species. PMID:20333195
Carelle-Calmels, Nadège; Girard-Lemaire, Françoise; Guérin, Eric; Bieth, Eric; Rudolf, Gabrielle; Biancalana, Valérie; Pecheur, Hélène; Demil, Houria; Schneider, Thierry; de Saint-Martin, Anne; Caron, Olivier; Legrain, Michèle; Gaston, Valérie; Flori, Elisabeth
2008-01-01
Cytogenetically detectable elongation of the 15q proximal region can be associated with Prader-Willi/Angelman critical region interstitial duplications or with inherited juxtacentromeric euchromatic variants. The first category has been reported in association with developmental delay and autistic disorders. These pathogenic recurrent duplications are more frequently of maternal origin and originate from unequal meiotic crossovers between chromosome 15 low-copy repeats. 15q juxtacentromeric euchromatic variants reflect polymorphic copy number variations of segments containing pseudogenes and usually segregate without apparent phenotypic consequence. Pathogenically relevant 15q11-q13 duplications are not distinguishable from the innocuous euchromatic variants with conventional cytogenetic methods. We report cytogenetic and molecular studies of a patient with hypotonia, developmental delay and epilepsy, carrying, on the same chromosome 15, both a de novo 15q11-q13 interstitial duplication and an inherited 15q juxtacentromeric amplification of maternal origin. The duplication, initially suspected by fluorescent in situ hybridization (FISH), has been confirmed by molecular studies. The 15q juxtacentromeric region amplification, which has segregated in the family for at least three generations, has been confirmed by FISH using BAC probes overlapping the NF1 and GABRA5 pseudogenes. This report emphasizes the importance of distinguishing proximal 15q polymorphic variants from clinically significant duplications. In any patient with an inherited 15q proximal variant but unexplained developmental delay suggesting 15q11-q13 pathology, a pathogenic rearrangement must be searched for with adapted strategies, in order to detect deletions as well as duplications of this region.
Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina
2012-01-01
Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5′- and the 3′-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14–60% of hybrid alleles carry PMS2CL-specific sequences in exons 13–15, the remainder only in exon 15. We show that exons 13–15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles, we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. PMID:20186689
Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina
2010-05-01
Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5'- and the 3'-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14-60% of hybrid alleles carry PMS2CL-specific sequences in exons 13-15, the remainder only in exon 15. We show that exons 13-15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles, we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. (c) 2010 Wiley-Liss, Inc.
Herrera, Victoria L M; Steffen, Martin; Moran, Ann Marie; Tan, Glaiza A; Pasion, Khristine A; Rivera, Keith; Pappin, Darryl J; Ruiz-Opazo, Nelson
2016-06-14
In contrast to rat and mouse databases, the NCBI gene database lists the human dual-endothelin1/VEGFsp receptor (DEspR, formerly Dear) as a unitary transcribed pseudogene due to a stop [TGA]-codon at codon#14 in automated DNA and RNA sequences. However, re-analysis is needed because prior single-gene studies detected a tryptophan [TGG]-codon#14 by manual Sanger sequencing and demonstrated DEspR translatability and functionality, and because the demonstration of actual non-translatability through expression studies, the standard-of-excellence for pseudogene designation, has not been performed. Re-analysis must meet UNIPROT criteria for demonstration of a protein's existence at the highest (protein) level, which, a priori, would override DNA- or RNA-based deductions. To dissect the nucleotide sequence discrepancy, we performed Maxam-Gilbert sequencing and reviewed 727 RNA-seq entries. To comply with the highest-level multiple UNIPROT criteria for determining DEspR's existence, we performed various experiments using multiple anti-DEspR monoclonal antibodies (mAbs) targeting distinct DEspR epitopes, with one spanning the contested tryptophan [TGG]-codon#14, assessing: (a) DEspR protein expression, (b) predicted full-length protein size, (c) sequence-predicted protein-specific properties beyond codon#14: receptor glycosylation and internalization, (d) protein-partner interactions, and (e) DEspR functionality via DEspR-inhibition effects. Maxam-Gilbert sequencing and some RNA-seq entries demonstrate two guanines, hence a tryptophan [TGG]-codon#14 within a compression site spanning an error-prone compression sequence motif. Western blot analysis using anti-DEspR mAbs targeting distinct DEspR epitopes detects the identical glycosylated 17.5 kDa pull-down protein. A decrease in DEspR-protein size after PNGase-F digest demonstrates post-translational glycosylation, concordant with the consensus-glycosylation site beyond codon#14.
As with other small single-transmembrane proteins, mass spectrometry analysis of anti-DEspR mAb pull-down proteins does not detect DEspR, but it detects DEspR-protein interactions with proteins implicated in intracellular trafficking and cancer. FACS analyses also detect DEspR-protein in different human cancer stem-like cells (CSCs). DEspR-inhibition studies identify DEspR-roles in CSC survival and growth. Live cell imaging detects fluorescently-labeled anti-DEspR mAb targeted-receptor internalization, concordant with the single internalization-recognition sequence also located beyond codon#14. Data confirm translatability of DEspR, the full-length DEspR protein beyond codon#14, and elucidate DEspR-specific functionality. Along with detection of the tryptophan [TGG]-codon#14 within an error-prone compression site, cumulative data demonstrating DEspR protein existence fulfill multiple UNIPROT criteria, thus refuting its pseudogene designation.
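The codon#14 discrepancy described above comes down to a single base: TGA is a stop codon in the standard genetic code, whereas TGG encodes tryptophan. The following is a minimal Python sketch of how such a position could be checked in an in-frame coding sequence. The function names (`codon_at`, `first_premature_stop`) are hypothetical, and the two toy sequences are invented for illustration only; they are not DEspR sequence.

```python
from typing import Optional

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(cds: str, codon_number: int) -> str:
    """Return the codon at a 1-based codon position of an in-frame sequence."""
    start = (codon_number - 1) * 3
    codon = cds[start:start + 3].upper()
    if codon_number < 1 or len(codon) != 3:
        raise ValueError("codon_number is outside the sequence")
    return codon


def first_premature_stop(cds: str) -> Optional[int]:
    """Return the 1-based position of the first in-frame stop codon that
    occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    for i in range(n_codons - 1):  # the final codon is the expected stop
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i + 1
    return None


# Toy sequences (assumptions for illustration): identical except for the
# third base of codon 14, G in one and A in the other.
TOY_TGG = "ATG" + "GCT" * 12 + "TGG" + "GCT" * 3 + "TAA"
TOY_TGA = "ATG" + "GCT" * 12 + "TGA" + "GCT" * 3 + "TAA"

print(codon_at(TOY_TGG, 14), first_premature_stop(TOY_TGG))
print(codon_at(TOY_TGA, 14), first_premature_stop(TOY_TGA))
```

In this toy case a single G-to-A difference turns an open reading frame into one with a premature stop at codon 14, which is why a base-calling error at that position would be enough to change a gene-versus-pseudogene call.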
Massive losses of taste receptor genes in toothed and baleen whales.
Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J; Wang, Ding; Zhao, Huabin
2014-05-06
Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Schroll, Casper; Christensen, Jens P; Christensen, Henrik; Pors, Susanne E; Thorndahl, Lotte; Jensen, Peter R; Olsen, John E; Jelsbak, Lotte
2014-05-14
Serovars of Salmonella enterica exhibit different host-specificities where some have broad host-ranges and others, like S. Gallinarum and S. Typhi, are host-specific for poultry and humans, respectively. With the recent availability of whole genome sequences it has been reported that host-specificity coincides with accumulation of pseudogenes, indicating adaptation of host-restricted serovars to their narrow niches. Polyamines are small cationic amines and in Salmonella they can be synthesized through two alternative pathways directly from l-ornithine to putrescine and from l-arginine via agmatine to putrescine. The first pathway is not active in S. Gallinarum and S. Typhi, and this prompted us to investigate the importance of polyamines for virulence in S. Gallinarum. Bioinformatic analysis of all sequenced genomes of Salmonella revealed that pseudogene formation of the speC gene was exclusive for S. Typhi and S. Gallinarum and happened through independent events. The remaining polyamine biosynthesis pathway was found to be essential for oral infection with S. Gallinarum since single and double mutants in speB and speE, encoding the pathways from agmatine to putrescine and from putrescine to spermidine, were attenuated. In contrast, speB was dispensable after intraperitoneal challenge, suggesting that putrescine was less important for the systemic phase of the disease. In support of this hypothesis, a ΔspeE;ΔpotCD mutant, unable to synthesize and import spermidine, but with retained ability to import and synthesize putrescine, was attenuated after intraperitoneal infection. We therefore conclude that polyamines are essential for virulence of S. Gallinarum. Furthermore, our results point to distinct roles for putrescine and spermidine during systemic infection. Copyright © 2014 Elsevier B.V. All rights reserved.
Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng
2017-01-01
Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences. PMID:28617867
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.
Powell, Bradford C; Hutchison, Clyde A
2006-01-19
Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
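The first screening step described above, collecting every ORF of at least 30 codons, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function name `find_orfs` is hypothetical, and the ORF definition used here (a run of non-stop codons that ends at an in-frame stop, scanned in all six reading frames with the standard stop codons) is an assumption. The COG clustering step is not shown.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, int, int]]:
    """Find stop-terminated open stretches of at least `min_codons` codons.

    Returns tuples (strand, frame, start, end). Coordinates are 0-based and
    relative to the strand that was scanned; `end` is the index of the
    terminating stop codon, which is not counted in the length.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame  # start of the current stop-free run
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOP_CODONS:
                    if (pos - run_start) // 3 >= min_codons:
                        orfs.append((strand, frame, run_start, pos))
                    run_start = pos + 3  # next run begins after the stop
    return orfs


# Toy input (invented for illustration): a 31-codon open stretch plus a stop.
toy = "ATG" + "GCT" * 30 + "TAA"
print(find_orfs(toy))
```

The `min_codons` argument stands in for the 30-codon threshold mentioned in the abstract. Runs that reach the end of the sequence without a stop are deliberately ignored in this sketch.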
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs
Powell, Bradford C; Hutchison, Clyde A
2006-01-01
Background: Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results: "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion: Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288
Minaya, Miguel; Díaz-Pérez, Antonio; Mason-Gamer, Roberta; Pimentel, Manuel; Catalán, Pilar
2015-10-01
Low-copy nuclear genes (LCNGs) have complex genetic architectures and evolutionary dynamics. However, unlike multicopy nuclear genes, LCNGs are rarely subject to gene conversion or concerted evolution, and they have higher mutation rates than organellar or nuclear ribosomal DNA markers, so they have great potential for improving the robustness of phylogenetic reconstructions at all taxonomic levels. In this study, our first objective is to evaluate the evolutionary dynamics of the LCNG β-amylase by testing for potential pseudogenization, paralogy, homeology, recombination, and phylogenetic incongruence within a broad representation of the main Pooideae lineages. Our second objective is to determine whether β-amylase shows sufficient phylogenetic signal to reconstruct the evolutionary history of the Pooid grasses. A multigenic (ITS, matK, ndhF, trnTL, and trnLF) tree of the study group provided a framework for assessing the β-amylase phylogeny. Eight accessions showed complete absence of selection, suggesting putative pseudogenic copies or other relaxed selection pressures; resolution of Vulpia alopecuros 2x clones indicated its potential (semi) paralogy; and homeologous copies of allopolyploid species Festuca simensis, F. fenas, and F. arundinacea tracked their Mediterranean origin. Two recombination events were found within early-diverged Pooideae lineages, and five within the PACCMAD clade. The unexpected phylogenetic relationships of 37 grass species (26% of the sampled species) highlight the frequent occurrence of non-treelike evolutionary events, so this LCNG should be used with caution as a phylogenetic marker. However, once the pitfalls are identified and removed, the phylogenetic reconstruction of the grasses based on the β-amylase exon+intron positions is optimal at all taxonomic levels. Copyright © 2015 Elsevier Inc. All rights reserved.
Shrestha, Binu; Reed, J. Michael; Starks, Philip T.; Kaufman, Gretchen E.; Goldstone, Jared V.; Roelke, Melody E.; O'Brien, Stephen J.; Koepfli, Klaus-Peter; Frank, Laurence G.; Court, Michael H.
2011-01-01
The domestic cat (Felis catus) shows remarkable sensitivity to the adverse effects of phenolic drugs, including acetaminophen and aspirin, as well as structurally-related toxicants found in the diet and environment. This idiosyncrasy results from pseudogenization of the gene encoding UDP-glucuronosyltransferase (UGT) 1A6, the major species-conserved phenol detoxification enzyme. Here, we established the phylogenetic timing of disruptive UGT1A6 mutations and explored the hypothesis that gene inactivation in cats was enabled by minimal exposure to plant-derived toxicants. Fixation of the UGT1A6 pseudogene was estimated to have occurred between 35 and 11 million years ago with all extant Felidae having dysfunctional UGT1A6. Out of 22 additional taxa sampled, representative of most Carnivora families, only brown hyena (Parahyaena brunnea) and northern elephant seal (Mirounga angustirostris) showed inactivating UGT1A6 mutations. A comprehensive literature review of the natural diet of the sampled taxa indicated that all species with defective UGT1A6 were hypercarnivores (>70% dietary animal matter). Furthermore those species with UGT1A6 defects showed evidence for reduced amino acid constraint (increased dN/dS ratios approaching the neutral selection value of 1.0) as compared with species with intact UGT1A6. In contrast, there was no evidence for reduced amino acid constraint for these same species within UGT1A1, the gene encoding the enzyme responsible for detoxification of endogenously generated bilirubin. Our results provide the first evidence suggesting that diet may have played a permissive role in the devolution of a mammalian drug metabolizing enzyme. Further work is needed to establish whether these preliminary findings can be generalized to all Carnivora. PMID:21464924
An efficient and comprehensive strategy for genetic diagnostics of polycystic kidney disease.
Eisenberger, Tobias; Decker, Christian; Hiersche, Milan; Hamann, Ruben C; Decker, Eva; Neuber, Steffen; Frank, Valeska; Bolz, Hanno J; Fehrenbach, Henry; Pape, Lars; Toenshoff, Burkhard; Mache, Christoph; Latta, Kay; Bergmann, Carsten
2015-01-01
Renal cysts are clinically and genetically heterogeneous conditions. Autosomal dominant polycystic kidney disease (ADPKD) is the most frequent life-threatening genetic disease and is mainly caused by mutations in PKD1. The presence of six PKD1 pseudogenes and tremendous allelic heterogeneity make molecular genetic testing challenging, requiring laborious locus-specific amplification. Increasing evidence suggests a major role for PKD1 in early and severe cases of ADPKD and some patients with a recessive form. Furthermore, it is becoming obvious that clinical manifestations can be mimicked by mutations in a number of other genes, creating the necessity for broader genetic testing. We established and validated a sequence capture based NGS testing approach for all genes known for cystic and polycystic kidney disease including PKD1. Thereby, we demonstrate that the applied standard mapping algorithm specifically aligns reads to the PKD1 locus and overcomes the complication of unspecific capture of pseudogenes. Employing careful and experienced assessment of NGS data, the method is shown to be very specific and as sensitive as established methods. An additional advantage over conventional Sanger sequencing is the detection of copy number variations (CNVs). Sophisticated bioinformatic read simulation increased the high analytical depth of the validation study and further demonstrated the strength of the approach. We further raise some awareness of limitations and pitfalls of common NGS workflows when applied in complex regions like PKD1, demonstrating that quality of NGS needs more than high coverage of the target region. By this, we propose a time- and cost-efficient diagnostic strategy for comprehensive molecular genetic testing of polycystic kidney disease which is highly automatable and will be of particular value when therapeutic options for PKD emerge and genetic testing is needed for larger numbers of patients.
Vomeronasal and Olfactory Structures in Bats Revealed by DiceCT Clarify Genetic Evidence of Function
Yohe, Laurel R.; Hoffmann, Simone; Curtis, Abigail
2018-01-01
The degree to which molecular and morphological loss of function occurs synchronously during the vestigialization of traits is not well understood. The mammalian vomeronasal system, a sense critical for mediating many social and reproductive behaviors, is highly conserved across mammals. New World Leaf-nosed bats (Phyllostomidae) are under strong selection to maintain a functional vomeronasal system such that most phyllostomids possess a distinct vomeronasal organ and an intact TRPC2, a gene encoding a protein primarily involved in vomeronasal sensory neuron signal transduction. Recent genetic evidence, however, shows that TRPC2 is a pseudogene in some Caribbean nectarivorous phyllostomids. The loss-of-function mutations suggest the sensory neural tissue of the vomeronasal organ is absent in these species despite strong selection on this gene in its mainland relatives, but the anatomy was unknown in most Caribbean nectarivorous phyllostomids until this study. We used diffusible iodine-based contrast-enhanced computed tomography (diceCT) to test whether the vomeronasal and main olfactory anatomy of several phyllostomid species matched genetic evidence of function, providing insight into whether loss of a structure is linked to pseudogenization of a molecular component of the system. The vomeronasal organ is indeed rudimentary or absent in species with a disrupted TRPC2 gene. Caribbean nectar-feeders also exhibit derived olfactory turbinal morphology and a large olfactory recess that differs from closely related bats that have an intact vomeronasal organ, which may hint that the main olfactory system may compensate for loss. We emphasize non-invasive diceCT is capable of detecting the vomeronasal organ, providing a feasible approach for quantifying mammalian chemosensory anatomy across species. PMID:29867373
Magnacca, Karl N; Brown, Mark J F
2010-06-11
The past several years have seen a flurry of papers seeking to clarify the utility and limits of DNA barcoding, particularly in areas such as species discovery and paralogy due to nuclear pseudogenes. Heteroplasmy, the coexistence of multiple mitochondrial haplotypes in a single organism, has been cited as a potentially serious problem for DNA barcoding but its effect on identification accuracy has not been tested. In addition, few studies of barcoding have tested a large group of closely-related species with a well-established morphological taxonomy. In this study we examine both of these issues, by densely sampling the Hawaiian Hylaeus bee radiation. Individuals from 21 of the 49 a priori morphologically-defined species exhibited coding sequence heteroplasmy at levels of 1-6% or more. All homoplasmic species were successfully identified by COI using standard methods of analysis, but only 71% of heteroplasmic species. The success rate in identifying heteroplasmic species was increased to 86% by treating polymorphisms as character states rather than ambiguities. Nuclear pseudogenes (numts) were also present in four species, and were distinguishable from heteroplasmic sequences by patterns of nucleotide and amino acid change. Heteroplasmy significantly decreased the reliability of species identification. In addition, the practical issue of dealing with large numbers of polymorphisms, and the resulting increase in time and labor required, makes the development of DNA barcode databases considerably more complex than has previously been suggested. The impact of heteroplasmy on the utility of DNA barcoding as a bulk specimen identification tool will depend upon its frequency across populations, which remains unknown. However, DNA barcoding is still likely to remain an important identification tool for those species that are difficult or impossible to identify through morphology, as is the case for the ecologically important solitary bee fauna.
PMID:20540728
A-to-I RNA editing promotes developmental stage-specific gene and lncRNA expression.
Goldstein, Boaz; Agranat-Tamir, Lily; Light, Dean; Ben-Naim Zgayer, Orna; Fishman, Alla; Lamm, Ayelet T
2017-03-01
A-to-I RNA editing is a conserved widespread phenomenon in which adenosine (A) is converted to inosine (I) by adenosine deaminases (ADARs) in double-stranded RNA regions, mainly noncoding. Mutations in ADAR enzymes in Caenorhabditis elegans cause defects in normal development but are not lethal as in human and mouse. Previous studies in C. elegans indicated competition between RNA interference (RNAi) and RNA editing mechanisms, based on the observation that worms that lack both mechanisms do not exhibit defects, in contrast to the developmental defects observed when only RNA editing is absent. To study the effects of RNA editing on gene expression and function, we established a novel screen that enabled us to identify thousands of RNA editing sites in nonrepetitive regions in the genome. These include dozens of genes that are edited at their 3' UTR region. We found that these genes are mainly germline and neuronal genes, and that they are down-regulated in the absence of ADAR enzymes. Moreover, we discovered that almost half of these genes are edited in a developmental-specific manner, indicating that RNA editing is a highly regulated process. We found that many pseudogenes and other lncRNAs are also extensively down-regulated in the absence of ADARs in the embryo but not in the fourth larval (L4) stage. This down-regulation is not observed upon additional knockout of RNAi. Furthermore, levels of siRNAs aligned to pseudogenes in ADAR mutants are enhanced. Taken together, our results suggest a role for RNA editing in normal growth and development by regulating silencing via RNAi. © 2017 Goldstein et al.; Published by Cold Spring Harbor Laboratory Press.
Genome-wide A-to-I RNA editing in fungi independent of ADAR enzymes
Liu, Huiquan; Wang, Qinhu; He, Yi; Chen, Lingfeng; Hao, Chaofeng; Jiang, Cong; Li, Yang; Dai, Yafeng; Kang, Zhensheng; Xu, Jin-Rong
2016-01-01
Yeasts and filamentous fungi do not have adenosine deaminase acting on RNA (ADAR) orthologs and are believed to lack A-to-I RNA editing, which is the most prevalent editing of mRNA in animals. However, during this study with the PUK1 (FGRRES_01058) pseudokinase gene important for sexual reproduction in Fusarium graminearum, we found that two tandem stop codons, UA1831GUA1834G, in its kinase domain were changed to UG1831GUG1834G by RNA editing in perithecia. To confirm A-to-I editing of PUK1 transcripts, strand-specific RNA-seq data were generated with RNA isolated from conidia, hyphae, and perithecia. PUK1 was almost specifically expressed in perithecia, and 90% of transcripts were edited to UG1831GUG1834G. Genome-wide analysis identified 26,056 perithecium-specific A-to-I editing sites. Unlike those in animals, 70.5% of A-to-I editing sites in F. graminearum occur in coding regions, and more than two-thirds of them result in amino acid changes, including editing of 69 PUK1-like pseudogenes with stop codons in ORFs. PUK1 orthologs and other pseudogenes also displayed stage-specific expression and editing in Neurospora crassa and F. verticillioides. Furthermore, F. graminearum differs from animals in the sequence preference and structure selectivity of A-to-I editing sites. Whereas A's embedded in RNA stems are targeted by ADARs, RNA editing in F. graminearum preferentially targets A's in hairpin loops, which is similar to the anticodon loop of tRNA targeted by adenosine deaminases acting on tRNA (ADATs). Overall, our results showed that A-to-I RNA editing occurs specifically during sexual reproduction and mainly in the coding regions in filamentous ascomycetes, involving adenosine deamination mechanisms distinct from metazoan ADARs. PMID:26934920
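The effect of the editing event described in this abstract can be illustrated with a minimal sketch. It relies on the well-established fact that inosine is read as guanosine during translation, so an edited UAG stop codon is decoded as UGG (tryptophan). The function names and the 0-based position argument are illustrative assumptions, not part of the study.

```python
# Sketch: how A-to-I editing converts a UAG stop codon into a sense codon.
# Inosine (I) is decoded as guanosine (G), so the edited A is written as G.
# Function names and the 0-based position convention are hypothetical.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_a_to_i(codon: str, position: int) -> str:
    """Return the codon as the ribosome reads it after editing one adenosine."""
    if codon[position] != "A":
        raise ValueError(f"no adenosine at position {position} of {codon}")
    return codon[:position] + "G" + codon[position + 1:]


def is_stop(codon: str) -> bool:
    """True if the codon is one of the three standard stop codons."""
    return codon in STOP_CODONS


# The two tandem stop codons reported for PUK1 (A1831 and A1834 are the
# middle bases of their codons).
tandem_stops = ["UAG", "UAG"]
edited = [edit_a_to_i(codon, 1) for codon in tandem_stops]
print(edited)  # ['UGG', 'UGG']
```

Editing the middle adenosine of each codon removes both premature stops, which is consistent with the reported change from UAGUAG to UGGUGG in perithecia.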
Ngoc, Phuong Cao Thi; Greenhalgh, Robert; Dermauw, Wannes; Rombauts, Stephane; Bajda, Sabina; Zhurov, Vladimir; Grbić, Miodrag; Van de Peer, Yves; Van Leeuwen, Thomas; Rouzé, Pierre; Clark, Richard M
2016-12-14
While mechanisms to detoxify plant-produced anti-herbivore compounds have been associated with plant host use by herbivores, less is known about the role of chemosensory perception in their life histories. This is especially true for generalists, including chelicerate herbivores that evolved herbivory independently from the more studied insect lineages. To shed light on chemosensory perception in a generalist herbivore, we characterized the chemosensory receptors (CRs) of the chelicerate two-spotted spider mite, Tetranychus urticae, an extreme generalist. Strikingly, T. urticae has more CRs than reported in any other arthropod to date. Including pseudogenes, 689 gustatory receptors were identified, as were 136 degenerin/Epithelial Na+ Channels (ENaCs) that have also been implicated as CRs in insects. The genomic distribution of T. urticae gustatory receptors indicates recurring bursts of lineage-specific proliferations, with the extent of receptor clusters reminiscent of those observed in the CR-rich genomes of vertebrates or C. elegans. Although pseudogenization of many gustatory receptors within clusters suggests relaxed selection, a subset of receptors is expressed. Consistent with functions as CRs, the genomic distribution and expression of ENaCs in lineage-specific T. urticae expansions mirror those observed for gustatory receptors. The expansion of ENaCs in T. urticae to > 3-fold that reported in other animals was unexpected, raising the possibility that ENaCs in T. urticae have been co-opted to fulfill a major role performed by unrelated CRs in other animals. More broadly, our findings suggest an elaborate role for chemosensory perception in generalist herbivores that are of key ecological and agricultural importance. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Raman, Gurusamy; Park, SeonJoo
2015-01-01
Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared with closely related chloroplast (cp) genomes of the Caryophyllaceae family, such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, Solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. The cp genomes of Dianthus and Spinacia have two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, whereas Lychnis has lost the introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.
PMID:26513163
Differential Retention of Gene Functions in a Secondary Metabolite Cluster.
Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W
2017-08-01
In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Suárez-Esquivel, Marcela; Baker, Kate S; Ruiz-Villalobos, Nazareth; Hernández-Mora, Gabriela; Barquero-Calvo, Elías; González-Barrientos, Rocío; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Chacón-Díaz, Carlos; Cloeckaert, Axel; Chaves-Olarte, Esteban; Thomson, Nicholas R; Moreno, Edgardo; Guzmán-Verri, Caterina
2017-07-01
Intracellular bacterial pathogens probably arose when their ancestor adapted from a free-living environment to an intracellular one, leading to clonal bacteria with smaller genomes and fewer sources of genetic plasticity. Still, this plasticity is needed to respond to the challenges posed by the host. Members of the Brucella genus are facultative-extracellular intracellular bacteria responsible for causing brucellosis in a variety of mammals. The various species maintain different host preferences, virulence, and zoonotic potential despite having 97-99% similarity at genome level. Here, we describe elements of genetic variation in Brucella ceti isolated from wild dolphins inhabiting the Pacific Ocean and the Mediterranean Sea. Comparison with isolates obtained from marine mammals from the Atlantic Ocean and the broader Brucella genus showed distinctive traits according to oceanic distribution and preferred host. Marine mammal isolates display genetic variability, represented by an important number of IS711 elements as well as specific IS711 and SNPs genomic distribution clustering patterns. Extensive pseudogenization was found among isolates from marine mammals as compared with terrestrial ones, causing degradation in pathways related to energy, transport of metabolites, and regulation/transcription. Brucella ceti isolates infecting dolphin hosts in particular showed further degradation of metabolite transport pathways as well as pathways related to cell wall/membrane/envelope biogenesis and motility. Thus, gene loss through pseudogenization is a source of genetic variation in Brucella, which in turn relates to adaptation to different hosts. This is relevant to understanding the natural history of bacterial diseases, their zoonotic potential, and the impact of human interventions such as domestication. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Emerling, Christopher A
2018-01-01
Carotenoids have important roles in bird behavior, including pigmentation for sexual signaling and improving color vision via retinal oil droplets. Yellow carotenoids are diet-derived, but red carotenoids (ketocarotenoids) are typically synthesized from yellow precursors via a carotenoid ketolase. Recent research on passerines has provided evidence that a cytochrome p450 enzyme, CYP2J19, is responsible for this reaction, though it is unclear if this function is phylogenetically restricted. Here I provide evidence that CYP2J19 is the carotenoid ketolase common to Aves using the genomes of 65 birds and the retinal transcriptomes of 15 avian taxa. CYP2J19 is functionally intact and robustly transcribed in all taxa except for several species adapted to foraging in dim light conditions. Two penguins, an owl and a kiwi show evidence of genetic lesions and relaxed selection in their genomic copy of CYP2J19, and six owls show evidence of marked reduction in CYP2J19 retinal transcription compared to nine diurnal avian taxa. Furthermore, one of the owls appears to transcribe a CYP2J19 pseudogene. Notably, none of these taxa are known to use red carotenoids for sexual signaling and several species of owls and penguins represent the only birds known to completely lack red retinal oil droplets. The remaining avian taxa belong to groups known to possess red oil droplets, are known or expected to deposit red carotenoids in skin and/or plumage, and/or frequently forage in bright light. The loss and reduced expression of CYP2J19 is likely an adaptation to maximize retinal sensitivity, given that oil droplets reduce the amount of light available to the retina. Copyright © 2017 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Villa, A.; Strina, D.; Frattini, A.
We have previously reported the characterization of the human ZNF75 gene located on Xq26, which has only limited homology (less than 65%) to other ZF genes in the databases. Here, we describe three human zinc finger genes with 86 to 95% homology to ZNF75 at the nucleotide level, which represent all the members of the human ZNF75 subfamily. One of these, ZNF75B, is a pseudogene mapped to chromosome 12q13. The other two, ZNF75A and ZNF75C, maintain an ORF in the sequenced region, and at least the latter is expressed in the U937 cell line. They were mapped to chromosomes 16 and 11, respectively. All these genes are conserved in chimpanzees, gorillas, and orangutans. The ZNF75B homologue is a pseudogene in all three great apes, and in chimpanzee it is located on chromosome 10 (phylogenetic XII), at p13 (corresponding to the human 12q13). The chimpanzee homologue of ZNF75 is also located on the Xq26 chromosome, in the same region, as detected by in situ hybridization. As expected, nucleotide changes were clearly more abundant between human and orangutan than between human and chimpanzee or gorilla homologues. Members of the same class were more similar to each other than to the other homologues within the same species. This suggests that the duplication and/or retrotranscription events occurred in a common ancestor long before great ape speciation. This, together with the existence of at least two genes in cows and horses, suggests a relatively high conservation of this gene family. 20 refs., 5 figs., 1 tab.
Galián, J A; Rosato, M; Rosselló, J A
2012-06-01
In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S-5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncated variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S-5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S-5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S-5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary.
Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo
2007-01-01
Background Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gersuk, V.H.; Rose, T.M.; Todaro, G.J.
The acyl-CoA binding protein (ACBP) and the diazepam binding inhibitor (DBI) or endozepine are independent isolates of a single 86-amino-acid, 10-kDa protein. ACBP/DBI is highly conserved between species and has been identified in several diverse organisms, including human, cow, rat, frog, duck, insects, plants, and yeast. Although the genomic locus has not yet been cloned in humans, complementary DNA clones with different 5′ ends have been isolated and characterized. These cDNA clones appear to be encoded by a single gene. However, Southern blot analyses, in situ hybridizations, and somatic cell hybrid chromosomal mapping all suggest that there are multiple ACBP/DBI-related sequences in the genome. To identify potential members of this gene family, degenerate oligonucleotides corresponding to highly conserved regions of ACBP/DBI were used to screen a human genomic DNA library using the polymerase chain reaction. A novel gene, DBIP1, that is closely related to ACBP/DBI but is clearly distinct was identified. DBIP1 bears extensive sequence homology to ACBP/DBI but lacks the introns predicted by rat and duck genomic sequence studies. A 1-base deletion in the coding region results in a frameshift and, along with the absence of introns and the lack of a detectable transcript, suggests that DBIP1 is a pseudogene. ACBP/DBI has previously been mapped to chromosome 2, although this was recently disputed, and a chromosome 6 location was suggested. We show that ACBP/DBI is correctly placed on chromosome 2 and that the gene identified on chromosome 6 is DBIP1. 33 refs., 3 figs., 1 tab.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Monaco, L.; Murtagh, J.J.; Newman, K.B.
1990-03-01
ADP-ribosylation factors (ARFs) are ~20-kDa proteins that act as GTP-dependent allosteric activators of cholera toxin. With deoxyinosine-containing degenerate oligonucleotide primers corresponding to conserved GTP-binding domains in ARFs, the polymerase chain reaction (PCR) was used to amplify simultaneously from human DNA portions of three ARF genes that include codons for 102 amino acids, with intervening sequences. Amplification products that differed in size because of differences in intron sizes were separated by agarose gel electrophoresis. One amplified DNA contained no introns and had a sequence different from those of known ARFs. Based on this sequence, selective oligonucleotide probes were prepared and used to isolate clone ΨARF 4, a putative ARF pseudogene, from a human genomic library in λ phage EMBL3. Reverse transcription-PCR was then used to clone from human poly(A)+ RNA the cDNA corresponding to the expressed homolog of ΨARF 4, referred to as human ARF 4. It appears that ΨARF 4 arose during human evolution by integration of processed ARF 4 mRNA into the genome. Human ARF 4 differs from previously identified mammalian ARFs 1, 2, and 3. Hybridization of ARF 4-specific oligonucleotide probes with human, bovine, and rat RNA revealed a single 1.8-kilobase mRNA, which was clearly distinguished from the 1.9-kilobase mRNA for ARF 1 in these tissues. The PCR provides a powerful tool for investigating diversity in this and other multigene families, especially with primers targeted at domains believed to have functional significance.
Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng
2017-01-01
Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp each). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusually long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.
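The figures reported in this abstract are internally consistent, which a short check makes explicit. The sketch below assumes the usual quadripartite plastome layout (one LSC, one SSC, two IR copies) and that the 21,022 bp figure is the length of a single IR copy; the variable names are illustrative.

```python
# Consistency check of the reported plastome figures, assuming a
# quadripartite layout: LSC + SSC + two copies of the inverted repeat.
lsc_bp = 79_732   # large single copy region
ssc_bp = 12_521   # small single copy region
ir_bp = 21_022    # one inverted repeat copy (assumed per-copy length)

total_bp = lsc_bp + ssc_bp + 2 * ir_bp
print(total_bp)  # 134297, the genome length given in the abstract

# Gene counts by category as reported in the annotation.
gene_counts = {"protein_coding": 82, "tRNA": 38, "rRNA": 8}
total_genes = sum(gene_counts.values())
print(total_genes)  # 128
```

Both totals match the abstract, which supports reading the IR length as a per-copy value.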
Shipitsyna, E; Zolotoverkhaya, E; Hjelmevoll, S O; Maximova, A; Savicheva, A; Sokolovsky, E; Skogen, V; Domeika, M; Unemo, M
2009-11-01
In Russia, laboratory diagnosis of gonorrhoea has been based mainly on microscopy alone and, in some settings, on relatively rare and suboptimal culturing. In recent years, nucleic acid amplification tests (NAATs) developed and manufactured in Russia have been implemented for routine diagnosis of Neisseria gonorrhoeae. However, these NAATs have never been validated against any internationally recognized diagnostic NAAT. This study aims to evaluate the performance characteristics of six Russian NAATs for N. gonorrhoeae diagnostics. In total, 496 symptomatic patients were included. Five polymerase chain reaction (PCR) assays and one real-time nucleic acid sequence based amplification (NASBA) assay, developed by three Russian companies, were evaluated on urogenital samples, i.e. cervical and first voided urine (FVU) samples from females (n = 319), urethral and FVU samples from males (n = 127), and extragenital samples, i.e. rectal and pharyngeal samples, from 50 additional female patients with suspicion of gonorrhoea. As the reference method, an internationally and strictly validated real-time porA pseudogene PCR was applied. The prevalence of N. gonorrhoeae was 2.7% and 16% among the patients providing urogenital and extragenital samples, respectively. The Russian NAATs and the reference method displayed a high level of concordance (99.4-100%). The sensitivities, specificities, positive predictive values and negative predictive values of the Russian tests in different specimens were 66.7-100%, 100%, 100%, and 99.4-100%, respectively. Russian N. gonorrhoeae diagnostic NAATs display relatively good performance characteristics. However, larger studies are crucial and, beneficially, the Russian assays should also be evaluated against other international highly sensitive and specific, and ideally Food and Drug Administration approved, NAATs such as Aptima Combo 2 (Gen-Probe).
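The four performance measures quoted in this abstract follow from a 2x2 comparison of each test against the reference method. A minimal sketch of the standard definitions is given below; the function name is illustrative and the counts in the example are hypothetical, chosen only to show the calculation, not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard performance measures of a test versus a reference method.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly excluded
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not data from the study).
example = diagnostic_metrics(tp=8, fp=0, fn=4, tn=434)
for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

With no false positives, specificity and positive predictive value are both 100% regardless of sensitivity, which is the pattern the abstract reports; at low prevalence the negative predictive value stays high even when some positives are missed.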
Holmes, Roger S; Wright, Matthew W; Laulederkind, Stanley J F; Cox, Laura A; Hosokawa, Masakiyo; Imai, Teruko; Ishibashi, Shun; Lehner, Richard; Miyazaki, Masao; Perkins, Everett J; Potter, Phillip M; Redinbo, Matthew R; Robert, Jacques; Satoh, Tetsuo; Yamashita, Tetsuro; Yan, Bingfan; Yokoi, Tsuyoshi; Zechner, Rudolf; Maltais, Lois J
2010-10-01
Mammalian carboxylesterase (CES or Ces) genes encode enzymes that participate in xenobiotic, drug, and lipid metabolism in the body and are members of at least five gene families. Tandem duplications have added more genes for some families, particularly for mouse and rat genomes, which has caused confusion in naming rodent Ces genes. This article describes a new nomenclature system for human, mouse, and rat carboxylesterase genes that identifies homolog gene families and allocates a unique name for each gene. The guidelines of human, mouse, and rat gene nomenclature committees were followed and "CES" (human) and "Ces" (mouse and rat) root symbols were used followed by the family number (e.g., human CES1). Where multiple genes were identified for a family or where a clash occurred with an existing gene name, a letter was added (e.g., human CES4A; mouse and rat Ces1a) that reflected gene relatedness among rodent species (e.g., mouse and rat Ces1a). Pseudogenes were named by adding "P" and a number to the human gene name (e.g., human CES1P1) or by using a new letter followed by ps for mouse and rat Ces pseudogenes (e.g., Ces2d-ps). Gene transcript isoforms were named by adding the GenBank accession ID to the gene symbol (e.g., human CES1_AB119995 or mouse Ces1e_BC019208). This nomenclature improves our understanding of human, mouse, and rat CES/Ces gene families and facilitates research into the structure, function, and evolution of these gene families. It also serves as a model for naming CES genes from other mammalian species.
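The naming rules summarized in this abstract can be expressed as a small pattern-based classifier. The sketch below is an interpretation of the examples given in the abstract only; the regular expressions, the accession pattern, and the function name are assumptions and do not reproduce the official nomenclature guidelines.

```python
import re

# Patterns inferred from the examples in the abstract (assumptions, not the
# official guidelines). Pseudogene patterns are listed before gene patterns.
PATTERNS = [
    ("human", "pseudogene", re.compile(r"CES\d+P\d+")),      # e.g. CES1P1
    ("human", "gene", re.compile(r"CES\d+[A-Z]?")),          # e.g. CES1, CES4A
    ("rodent", "pseudogene", re.compile(r"Ces\d+[a-z]-ps")),  # e.g. Ces2d-ps
    ("rodent", "gene", re.compile(r"Ces\d+[a-z]?")),         # e.g. Ces1a
]
# Loose stand-in for a GenBank accession: letters followed by digits.
ACCESSION = re.compile(r"[A-Z]+\d+")


def classify(symbol: str):
    """Return (species_group, kind) for a CES/Ces symbol, or None if unmatched."""
    base, sep, accession = symbol.partition("_")
    if sep and not ACCESSION.fullmatch(accession):
        return None
    for species, kind, pattern in PATTERNS:
        if pattern.fullmatch(base):
            # A symbol with an accession suffix names a transcript isoform.
            return (species, "transcript" if sep else kind)
    return None


for s in ["CES1", "CES4A", "CES1P1", "Ces1a", "Ces2d-ps", "CES1_AB119995"]:
    print(s, classify(s))
```

Case carries the species information here: upper-case roots are treated as human and mixed-case roots as mouse or rat, following the root symbols given in the abstract.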
Schiavo, G; Strillacci, M G; Ribani, A; Bovo, S; Roman-Ponce, S I; Cerolini, S; Bertolini, F; Bagnato, A; Fontanesi, L
2018-06-01
Mitochondrial DNA (mtDNA) insertions have been detected in the nuclear genome of many eukaryotes. These sequences are pseudogenes that originated by horizontal transfer of mtDNA fragments into the nuclear genome, producing nuclear DNA sequences of mitochondrial origin (numt). In this study we determined the frequency and distribution of mtDNA-originated pseudogenes in the turkey (Meleagris gallopavo) nuclear genome. The turkey reference genome (Turkey_2.01) was aligned with the reference linearized mtDNA sequence using LAST. A total of 32 numt sequences (corresponding to 18 numt regions derived by unique insertional events) were identified in the turkey nuclear genome (size ranging from 66 to 1415 bp; identity against the corresponding region of modern turkey mtDNA ranging from 62% to 100%). Numts were distributed in nine chromosomes and in one scaffold. They derived from parts of 10 mtDNA protein-coding genes, ribosomal genes, the control region and 10 tRNA genes. Seven numt regions reported in the turkey genome were identified in orthologous positions in the Gallus gallus genome and therefore were present in the ancestral genome that, in the Cretaceous, gave rise to the lineages of the modern crown Galliformes. Five recently integrated turkey numts were validated by PCR in 168 turkeys of six different domestic populations. None of the analysed numts were polymorphic (i.e. absence of the inserted sequence, as reported in numts of recent integration in other species), suggesting that the reticulate speciation model is not useful for explaining the origin of the domesticated turkey lineage. © 2018 Stichting International Foundation for Animal Genetics.
Regional differences in mitochondrial DNA methylation in human post-mortem brain tissue.
Devall, Matthew; Smith, Rebecca G; Jeffries, Aaron; Hannon, Eilis; Davies, Matthew N; Schalkwyk, Leonard; Mill, Jonathan; Weedon, Michael; Lunnon, Katie
2017-01-01
DNA methylation is an important epigenetic mechanism involved in gene regulation, with alterations in DNA methylation in the nuclear genome being linked to numerous complex diseases. Mitochondrial DNA methylation is a phenomenon that is receiving ever-increasing interest, particularly in diseases characterized by mitochondrial dysfunction; however, most studies have been limited to the investigation of specific target regions. Analyses spanning the entire mitochondrial genome have been limited, potentially due to the amount of input DNA required. Further, mitochondrial genetic studies have been previously confounded by nuclear-mitochondrial pseudogenes. Methylated DNA Immunoprecipitation Sequencing is a technique widely used to profile DNA methylation across the nuclear genome; however, reads mapped to mitochondrial DNA are often discarded. Here, we have developed an approach to control for nuclear-mitochondrial pseudogenes within Methylated DNA Immunoprecipitation Sequencing data. We highlight the utility of this approach in identifying differences in mitochondrial DNA methylation across regions of the human brain and pre-mortem blood. We were able to correlate mitochondrial DNA methylation patterns between the cortex, cerebellum and blood. We identified 74 nominally significant differentially methylated regions (p < 0.05) in the mitochondrial genome, between anatomically separate cortical regions and the cerebellum in matched samples (N = 3 matched donors). Further analysis identified eight significant differentially methylated regions between the total cortex and cerebellum after correcting for multiple testing. Using unsupervised hierarchical clustering analysis of the mitochondrial DNA methylome, we were able to identify tissue-specific patterns of mitochondrial DNA methylation between blood, cerebellum and cortex.
Our study represents a comprehensive analysis of the mitochondrial methylome using pre-existing Methylated DNA Immunoprecipitation Sequencing data to identify brain region-specific patterns of mitochondrial DNA methylation.
Galián, J A; Rosato, M; Rosselló, J A
2012-01-01
In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S–5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncated variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S–5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S–5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S–5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary. PMID:22354111
The Genomics of Microbial Domestication in the Fermented Food Environment
Gibbons, John G; Rinker, David C
2015-01-01
Shortly after the agricultural revolution, the domestication of bacteria, yeasts, and molds played an essential role in enhancing the stability, quality, flavor, and texture of food products. These domestication events were likely the result of human food production practices that entailed the continual recycling of isolated microbial communities in the presence of abundant agricultural food sources. We suggest that within these novel agrarian food niches the metabolic requirements of those microbes became regular and predictable, resulting in rapid genomic specialization through such mechanisms as pseudogenization, genome decay, interspecific hybridization, gene duplication, and horizontal gene transfer. The ultimate result was domesticated strains of microorganisms with enhanced fermentative capacities. PMID:26338497
Complete plastid genome of Astragalus mongholicus var. nakaianus (Fabaceae).
Choi, In-Su; Kim, Joo-Hwan; Choi, Byoung-Hee
2016-07-01
The first complete plastid genome (plastome) of the largest angiosperm genus, Astragalus, was sequenced for the Korean endangered endemic species A. mongholicus var. nakaianus. Its genome is relatively short (123,633 bp) because it lacks an Inverted Repeat (IR) region. It comprises 110 genes, including four unique rRNAs, 30 tRNAs, and 76 protein-coding genes. Similar to other closely related plastomes, rpl22 and rps16 are absent. The putative pseudogene with abnormal stop codons is atpE. This plastome has no additional inversions when compared with highly variable plastomes from IRLC tribes Fabeae and Trifolieae. Our phylogenetic analysis confirms the non-monophyly of Galegeae.
Approaches to Fungal Genome Annotation
Haas, Brian J.; Zeng, Qiandong; Pearson, Matthew D.; Cuomo, Christina A.; Wortman, Jennifer R.
2011-01-01
Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center’s production genome annotation environment. PMID:22059117
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steeghs, K.; Wieringa, B.; Merkx, G.
1994-11-01
Members of the creatine kinase isoenzyme family (CKs; EC 2.7.3.2) are found in mitochondria and specialized subregions of the cytoplasm and catalyze the reversible exchange of high-energy phosphoryl between ATP and phosphocreatine. At least four functionally active genes, which encode the distinct CK subunits CKB, CKM, CKMT1 (ubiquitous), and CKMT2 (sarcomeric), and a variable number of CKB pseudogenes have been identified. Here, we report the use of a CKMT1-containing phage to map the CKMT1 gene by in situ hybridization on both human and mouse chromosomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boccaccio, C.; Deshatrette, J.; Meunier-Rotival, M.
1994-05-01
The genomic fragment carrying the human activator of liver function, previously described as an episome capable of inducing differentiation upon transfection into a dedifferentiated rat hepatoma cell line, was mapped on human chromosome 12q24.2-12q24.3. This chromosomal location was indistinguishable by in situ hybridization from that of the gene coding for the hepatic transcription factor HNF1. The sequence of the integrated form of the episome as well as its flanking sequences show that it is rich in retroposons. It contains a human ribosomal protein L21 processed pseudogene, one truncated L1Hs sequence, and 10 Alu repeats, which belong to different subfamilies.
Linkage of genes for laminin B1 and B2 subunits on chromosome 1 in mouse.
Elliott, R W; Barlow, D; Hogan, B L
1985-08-01
We have used cDNA clones for the B1 and B2 subunits of laminin to find restriction fragment length DNA polymorphisms for the genes encoding these polypeptides in the mouse. Three alleles were found for LamB2 and two for LamB1 among the inbred mouse strains. The segregation of these polymorphisms among recombinant inbred strains showed that these genes are tightly linked in the central region of mouse Chromosome 1 between Sas-1 and Ly-m22, 7.4 +/- 3.2 cM distal to the Pep-3 locus. There is no evidence in the mouse for pseudogenes for these proteins.
Nagano, Hironori; Clark, Lindsay V.; Zhao, Hua; ...
2015-06-18
The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor-Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1-MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. In conclusion, these differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan.
Nagano, Hironori; Clark, Lindsay V.; Zhao, Hua; Peng, Junhua; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Anzoua, Kossonou Guillaume; Matsuo, Tomoaki; Sacks, Erik J.; Yamada, Toshihiko
2015-01-01
The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor–Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1–MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. These differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan. PMID:26089536
Nagano, Hironori; Clark, Lindsay V; Zhao, Hua; Peng, Junhua; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Anzoua, Kossonou Guillaume; Matsuo, Tomoaki; Sacks, Erik J; Yamada, Toshihiko
2015-07-01
The genus Miscanthus is a perennial C4 grass native to eastern Asia and is a promising candidate bioenergy crop for cool temperate areas. Flowering time is a crucial factor governing regional and seasonal adaptation; in addition, it is also a key target trait for extending the vegetative phase to improve biomass potential. Homologues of CONSTANS (CO)/Heading date 1(Hd1) were cloned from Miscanthus sinensis and named MsiHd1. Sequences of MsiHd1 homologues were compared among 24 wild M. sinensis accessions from Japan, 14 from China, and three from South Korea. Two to five MsiHd1 alleles in each accession were identified, suggesting that MsiHd1 consists of at least three loci in the Miscanthus genome. Verifying the open reading frame in MsiHd1, they were classified as putative functional alleles without mutations or non-functional alleles caused by indels. The Neighbor-Joining tree indicated that one of the multiple MsiHd1 loci is a pseudogene locus without any functional alleles. The pseudogene locus was named MsiHd1b, and the other loci were considered to be part of the MsiHd1a multi-locus family. Interestingly, in most Japanese accessions 50% or more of the MsiHd1a alleles were non-functional, whereas accessions from the East Asian mainland harboured only functional alleles. Five novel miniature inverted transposable elements (MITEs) (MsiMITE1-MsiMITE5) were observed in MsiHd1a/b. MsiMITE1, detected in exon 1 of MsiHd1a, was only observed in Japanese accessions and its revertant alleles derived from retransposition were predominantly in Chinese accessions. These differences in MsiHd1a show that the dependency on functional MsiHd1a alleles is different between accessions from the East Asian mainland and Japan. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Transcriptomic signatures in cartilage ageing
2013-01-01
Introduction Age is an important factor in the development of osteoarthritis. Microarray studies provide insight into cartilage aging but do not reveal the full transcriptomic phenotype of chondrocytes such as small noncoding RNAs, pseudogenes, and microRNAs. RNA-Seq is a powerful technique for the interrogation of large numbers of transcripts including nonprotein coding RNAs. The aim of the study was to characterise molecular mechanisms associated with age-related changes in gene signatures. Methods RNA for gene expression analysis using RNA-Seq and real-time PCR analysis was isolated from macroscopically normal cartilage of the metacarpophalangeal joints of eight horses; four young donors (4 years old) and four old donors (>15 years old). RNA sequence libraries were prepared following ribosomal RNA depletion and sequencing was undertaken using the Illumina HiSeq 2000 platform. Differentially expressed genes were defined using Benjamini-Hochberg false discovery rate correction with a generalised linear model likelihood ratio test (P < 0.05, expression ratios ± 1.4 log2 fold-change). Ingenuity pathway analysis enabled networks, functional analyses and canonical pathways from differentially expressed genes to be determined. Results In total, the expression of 396 transcribed elements including mRNAs, small noncoding RNAs, pseudogenes, and a single microRNA was significantly different in old compared with young cartilage (± 1.4 log2 fold-change, P < 0.05). Of these, 93 were at higher levels in the older cartilage and 303 were at lower levels in the older cartilage. There was an over-representation of genes with reduced expression relating to extracellular matrix, degradative proteases, matrix synthetic enzymes, cytokines and growth factors in cartilage derived from older donors compared with young donors. In addition, there was a reduction in Wnt signalling in ageing cartilage. 
Conclusion There was an age-related dysregulation of matrix, anabolic and catabolic cartilage factors. This study has increased our knowledge of transcriptional networks in cartilage ageing by providing a global view of the transcriptome. PMID:23971731
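The Methods above combine a Benjamini-Hochberg false discovery rate correction with a fold-change cut-off. A minimal sketch of that filtering step follows. It is not the authors' pipeline: the function names, the input record layout, and the way the two thresholds are combined (adjusted P < 0.05 and absolute log2 fold-change of at least 1.4) are assumptions made for illustration, and the example values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differentially_expressed(records, alpha=0.05, min_abs_log2fc=1.4):
    """Keep names whose adjusted p-value and fold-change pass both cut-offs.

    `records` is a list of (name, log2_fold_change, raw_p_value) tuples.
    """
    adjusted = benjamini_hochberg([p for _, _, p in records])
    return [
        name
        for (name, log2fc, _), q in zip(records, adjusted)
        if q < alpha and abs(log2fc) >= min_abs_log2fc
    ]


# Invented example values, for illustration only.
records = [
    ("geneA", 2.0, 0.005),
    ("geneB", -1.5, 0.01),
    ("geneC", 0.5, 0.03),
    ("geneD", -3.0, 0.04),
]
selected = select_differentially_expressed(records)
print(selected)
```

In this example geneC passes the FDR threshold but is dropped by the fold-change filter, which shows why the two criteria are reported together.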
Liu, Guangjian; Walter, Lutz; Tang, Suni; Tan, Xinxin; Shi, Fanglei; Pan, Huijuan; Roos, Christian; Liu, Zhijin; Li, Ming
2014-01-01
Umami and sweet tastes are two important basic taste perceptions that allow animals to recognize diets with nutritious proteins and carbohydrates, respectively. Until recently, analyses of umami and sweet taste were performed on various domestic and wild animals. While most of these studies focused on the pseudogenization of taste genes, which occurs mostly in carnivores and species with absolute feeding specialization, omnivores and herbivores were more or less neglected. Catarrhine primates are a group of herbivorous animals (feeding mostly on plants) with significant divergence in dietary preference, especially the specialized folivorous Colobinae. Here, we conducted the most comprehensive investigation to date of selection pressure on sweet and umami taste genes (TAS1Rs) in catarrhine primates to test whether specific adaptive evolution occurred during their diversification, in association with particular plant diets. We documented significant relaxation of selective constraints on sweet taste gene TAS1R2 in the ancestral branch of Colobinae, which might correlate with their unique ingestion and digestion of leaves. Additionally, we identified positive selection acting on Cercopithecidae lineages for the umami taste gene TAS1R1, on the Cercopithecinae and extant Colobinae and Hylobatidae lineages for TAS1R2, and on Macaca lineages for TAS1R3. Our research further identified several site mutations in Cercopithecidae, Colobinae and Pygathrix that previous studies found to alter the sensitivity of the receptors. The positively selected sites were located mostly on the extra-cellular region of TAS1Rs. Among these positively selected sites, two vital sites for TAS1R1 and four vital sites for TAS1R2 in the extra-cellular region were identified, through molecular modelling and docking, as being responsible for the binding of certain sweet and umami taste molecules.
Our results suggest that episodic and differentiated adaptive evolution of TAS1Rs pervasively occurred in catarrhine primates, most concentrated upon the extra-cellular region of TAS1Rs.
Evolution of Siglec-11 and Siglec-16 Genes in Hominins
Wang, Xiaoxia; Mitra, Nivedita; Cruz, Pedro; Deng, Liwen; Varki, Nissi; Angata, Takashi; Green, Eric D.; Mullikin, Jim; Hayakawa, Toshiyuki; Varki, Ajit
2012-01-01
We previously reported a human-specific gene conversion of SIGLEC11 by an adjacent paralogous pseudogene (SIGLEC16P), generating a uniquely human form of the Siglec-11 protein, which is expressed in the human brain. Here, we show that Siglec-11 is expressed exclusively in microglia in all human brains studied—a finding of potential relevance to brain evolution, as microglia modulate neuronal survival, and Siglec-11 recruits SHP-1, a tyrosine phosphatase that modulates microglial biology. Following the recent finding of a functional SIGLEC16 allele in human populations, further analysis of the human SIGLEC11 and SIGLEC16/P sequences revealed an unusual series of gene conversion events between two loci. Two tandem and likely simultaneous gene conversions occurred from SIGLEC16P to SIGLEC11 with a potentially deleterious intervening short segment happening to be excluded. One of the conversion events also changed the 5′ untranslated sequence, altering predicted transcription factor binding sites. Both of the gene conversions have been dated to ∼1–1.2 Ma, after the emergence of the genus Homo, but prior to the emergence of the common ancestor of Denisovans and modern humans about 800,000 years ago, thus suggesting involvement in later stages of hominin brain evolution. In keeping with this, recombinant soluble Siglec-11 binds ligands in the human brain. We also address a second-round more recent gene conversion from SIGLEC11 to SIGLEC16, with the latter showing an allele frequency of ∼0.1–0.3 in a worldwide population study. Initial pseudogenization of SIGLEC16 was estimated to occur at least 3 Ma, which thus preceded the gene conversion of SIGLEC11 by SIGLEC16P. As gene conversion usually disrupts the converted gene, the fact that ORFs of hSIGLEC11 and hSIGLEC16 have been maintained after an unusual series of very complex gene conversion events suggests that these events may have been subject to hominin-specific selection forces. PMID:22383531
Alexander, Stephen P. H.; Sharman, Joanna L.; Pawson, Adam J.; Benson, Helen E.; Monaghan, Amy E.; Liew, Wen Chiy; Mpamhanga, Chidochangu P.; Bonner, Tom I.; Neubig, Richard R.; Pin, Jean Philippe; Spedding, Michael; Harmar, Anthony J.
2013-01-01
In 2005, the International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR) published a catalog of all of the human gene sequences known or predicted to encode G protein-coupled receptors (GPCRs), excluding sensory receptors. This review updates the list of orphan GPCRs and describes the criteria used by NC-IUPHAR to recommend the pairing of an orphan receptor with its cognate ligand(s). The following recommendations are made for new receptor names based on 11 pairings for class A GPCRs: hydroxycarboxylic acid receptors [HCA1 (GPR81) with lactate, HCA2 (GPR109A) with 3-hydroxybutyric acid, HCA3 (GPR109B) with 3-hydroxyoctanoic acid]; lysophosphatidic acid receptors [LPA4 (GPR23), LPA5 (GPR92), LPA6 (P2Y5)]; free fatty acid receptors [FFA4 (GPR120) with omega-3 fatty acids]; chemerin receptor (CMKLR1; ChemR23) with chemerin; CXCR7 (CMKOR1) with chemokines CXCL12 (SDF-1) and CXCL11 (ITAC); succinate receptor (SUCNR1) with succinate; and oxoglutarate receptor [OXGR1 with 2-oxoglutarate]. Pairings are highlighted for an additional 30 receptors in class A where further input is needed from the scientific community to validate these findings. Fifty-seven human class A receptors (excluding pseudogenes) are still considered orphans; information has been provided where there is a significant phenotype in genetically modified animals. In class B, six pairings have been reported by a single publication, with 28 (excluding pseudogenes) still classified as orphans. Seven orphan receptors remain in class C, with one pairing described by a single paper. The objective is to stimulate research into confirming pairings of orphan receptors where there is currently limited information and to identify cognate ligands for the remaining GPCRs. Further information can be found on the IUPHAR Database website (http://www.iuphar-db.org). PMID:23686350
Springer, Mark S; Gatesy, John
2017-04-01
Various toothed whales (Odontoceti) are unique among mammals in lacking olfactory bulbs as adults and are thought to be anosmic (lacking the olfactory sense). At the molecular level, toothed whales have high percentages of pseudogenic olfactory receptor genes, but species that have been investigated to date retain an intact copy of the olfactory marker protein gene (OMP), which is highly expressed in olfactory receptor neurons and may regulate the temporal resolution of olfactory responses. One hypothesis for the retention of intact OMP in diverse odontocete lineages is that this gene is pleiotropic with additional functions that are unrelated to olfaction. Recent expression studies provide some support for this hypothesis. Here, we report OMP sequences for representatives of all extant cetacean families and provide the first molecular evidence for inactivation of this gene in vertebrates. Specifically, OMP exhibits independent inactivating mutations in six different odontocete lineages: four river dolphin genera (Platanista, Lipotes, Pontoporia, Inia), sperm whale (Physeter), and harbor porpoise (Phocoena). These results suggest that the only essential role of OMP that is maintained by natural selection is in olfaction, although a non-olfactory role for OMP cannot be ruled out for lineages that retain an intact copy of this gene. Available genome sequences from cetaceans and close outgroups provide evidence of inactivating mutations in two additional genes (CNGA2, CNGA4), which imply further pseudogenization events in the olfactory cascade of odontocetes. Selection analyses demonstrate that evolutionary constraints on all three genes (OMP, CNGA2, CNGA4) have been greatly reduced in Odontoceti, but retain a signature of purifying selection on the stem Cetacea branch and in Mysticeti (baleen whales). 
This pattern is compatible with the 'echolocation-priority' hypothesis for the evolution of OMP, which posits that negative selection was maintained in the common ancestor of Cetacea and was not relaxed significantly until the evolution of echolocation in Odontoceti. Copyright © 2017 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clines, G.; Lovett, M.
1994-09-01
Diastrophic dysplasia (DTD) is an autosomal recessive disorder of unknown pathogenesis that is characterized by abnormal skeletal and cartilage growth. Phenotypic characteristics of the disorder include short stature, scoliosis, and deformation of the first metacarpal. The diastrophic dysplasia gene has been localized to chromosome 5q31-33, within ~60 kb of the colony stimulating factor 1 receptor gene (CSF1R). We have used direct cDNA selection to build a transcription map across ~250 kb surrounding and including the CSF1R locus. cDNA pools from human placenta, activated T cells, cerebellum, HeLa cells, fetal brain, chondrocytes, chondrosarcomas and osteosarcomas were multiplexed in these selections. After two rounds of selection, an analysis revealed that ~70% of the selected cDNAs were contained within the contig. DNA sequencing and cosmid mapping data from a collection of 310 clones revealed the presence of three new genes in this region that show no appreciable homologies on sequence database searches, as well as cDNA clones from the CSF1R and the PDGFRB loci (another of the known genes in the region). An additional cDNA was found with 100% homology to the gene encoding human ribosomal protein L7 (RPL7). This cDNA comprised ~25% of all selected clones. However, further analysis of the genomic contig revealed the presence of an RPL7 processed pseudogene in very close proximity to the CSF1R and PDGFRB genes. The selection of processed pseudogenes is one previously anticipated artifact of selection methodologies, but has not been previously observed. Mutational analysis of the three new genes is underway in diastrophic dysplasia families, as is derivation of full length cDNA clones and the expansion of this detailed transcription map into a larger genomic contig.
Manen, Jean-François
2004-01-01
Background Intra-specific and intra-individual polymorphism is frequently observed in nuclear markers of Ilex (Aquifoliaceae) and discrepancy between plastid and nuclear phylogenies is the rule in this genus. These observations suggest that inter-specific plastid and/or nuclear introgression played an important role in the process of evolution of Ilex. With the aim of a precise understanding of the evolution of this genus, two distantly related sympatric species collected in Tenerife (Canary Islands), I. perado and I. canariensis, were studied in detail. Introgression between these two species has never previously been reported. One plastid marker (the atpB-rbcL spacer) and two nuclear markers, the ribosomal internal transcribed spacer (ITS) and the nuclear encoded plastid glutamine synthetase (nepGS), were analyzed for 13 and 27 individuals of I. perado and I. canariensis, respectively. Results The plastid marker is intra-specifically constant and correlated with species identity. On the other hand, whereas the nuclear markers are conserved in I. perado, they are highly polymorphic in I. canariensis. The presence of pseudogenes and recombination in ITS sequences of I. canariensis explains this polymorphism. Ancestral sequence polymorphism with incomplete lineage sorting, or past or recent hybridization with an unknown species, could explain this polymorphism, not resolved by concerted evolution. However, as already reported for many other plants, past or recent introgression of an alien genotype seems the most probable explanation for such a tremendous polymorphism. Conclusions Data do not allow the determination with certitude of the putative species introgressing I. canariensis, but I. perado is suspected. The introgression would be unilateral, with I. perado as the male donor, and the paternal sequences would be rapidly converted into highly divergent and consequently unidentifiable pseudogenes.
At least, this study allows the establishment of precautionary measures when nuclear markers are used in phylogenetic studies of genera having experienced introgression such as the genus Ilex. PMID:15550175
Lee, Yi Chuan; Chan, Soh Ha; Ren, Ee Chee
2008-11-01
Killer cell immunoglobulin-like receptors (KIR) gene frequencies have been shown to be distinctly different between populations and contribute to functional variation in the immune response. We have investigated KIR gene frequencies in 370 individuals representing three Asian populations in Singapore and report here the distribution of 14 KIR genes (2DL1, 2DL2, 2DL3, 2DL4, 2DL5, 2DS1, 2DS2, 2DS3, 2DS4, 2DS5, 3DL1, 3DL2, 3DL3, 3DS1) with two pseudogenes (2DP1, 3DP1) among Singapore Chinese (n = 210); Singapore Malay (n = 80), and Singapore Indian (n = 80). Four framework genes (KIR3DL3, 3DP1, 2DL4, 3DL2) and a nonframework pseudogene 2DP1 were detected in all samples while KIR2DS2, 2DL2, 2DL5, and 2DS5 had the greatest significant variation across the three populations. Fifteen significant linkage patterns, consistent with associations between genes of A and B haplotypes, were observed. Eighty-four distinct KIR profiles were determined in our populations, 38 of which had not been described in other populations. KIR haplotype studies were performed using nine Singapore Chinese families comprising 34 individuals. All genotypes could be resolved into corresponding pairs of existing haplotypes with eight distinct KIR genotypes and eight different haplotypes. The haplotype A2 with frequency of 63.9% was dominant in Singapore Chinese, comparable to that reported in Korean and Chinese Han. The A haplotypes predominate in Singapore Chinese, with ratio of A to B haplotypes of approximately 3:1. Comparison with KIR frequencies in other populations showed that Singapore Chinese shared similar distributions with Chinese Han, Japanese, and Korean; Singapore Indian was found to be comparable with North Indian Hindus while Singapore Malay resembled the Thai.
Liu, Xia; Li, Yuan; Yang, Hongyuan; Zhou, Boyang
2018-04-09
The complete chloroplast (cp) genome of Talinum paniculatum (Caryophyllales), a plant with pharmaceutical efficacy similar to ginseng and a widely distributed and cultivated edible vegetable, was sequenced and analyzed. The cp genome size of T. paniculatum is 156,929 bp, with a pair of inverted repeats (IRs) of 25,751 bp separated by a large single copy (LSC) region of 86,898 bp and a small single copy (SSC) region of 18,529 bp. The genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, eight ribosomal RNA (rRNA) genes and four pseudogenes. Fifty-one (51) repeat units and ninety-two (92) simple sequence repeats (SSRs) were found in the genome. Sequence alignment showed that the pseudogene rpl23 (ribosomal protein L23), which is located in the IR region, carries an AATT insertion relative to other Caryophyllales species. The trnK-UUU (tRNA-Lys) and rpl16 (ribosomal protein L16) genes have larger introns in T. paniculatum, and the matK (maturase K) gene, which is usually located in the intron of trnK-UUU, shows rich sequence divergence in Caryophyllales. Complete cp genome comparison with eight other Caryophyllales species indicated that the differences between T. paniculatum and P. oleracea were very slight, and the most highly divergent regions occurred in intergenic spacers. Comparisons of IR boundaries among nine Caryophyllales species showed that T. paniculatum has a larger IR region and that the contraction is relatively slight. The phylogenetic analysis among 35 Caryophyllales species and two outgroup species revealed that T. paniculatum and P. oleracea do not belong to the same family. These results provide opportunities for future identification and barcoding of Talinum species, for understanding the evolutionary mode of the Caryophyllales cp genome, and for molecular breeding of T. paniculatum with high pharmaceutical efficacy.
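The region sizes reported in this abstract can be cross-checked arithmetically, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. The following is a minimal Python sketch of that check; the function name `quadripartite_length` is hypothetical and chosen only for illustration.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome built from LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes in bp, as reported in the abstract above.
total = quadripartite_length(lsc=86_898, ssc=18_529, ir=25_751)
print(total)  # 156929, which agrees with the reported genome size
```

The same check can be applied to any record that reports LSC, SSC and IR lengths together with a total genome size.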
Bak, Søren; Beisson, Fred; Bishop, Gerard; Hamberger, Björn; Höfer, René; Paquette, Suzanne; Werck-Reichhart, Danièle
2011-01-01
There are 244 cytochrome P450 genes (and 28 pseudogenes) in the Arabidopsis genome. P450s thus form one of the largest gene families in plants. Contrary to what was initially thought, this family diversification results in very limited functional redundancy and seems to mirror the complexity of plant metabolism. P450s sometimes share less than 20% identity and catalyze extremely diverse reactions leading to the precursors of structural macromolecules such as lignin, cutin, suberin and sporopollenin, or are involved in biosynthesis or catabolism of all hormone and signaling molecules, of pigments, odorants, flavors, antioxidants, allelochemicals and defense compounds, and in the metabolism of xenobiotics. The mechanisms of gene duplication and diversification are getting better understood and together with co-expression data provide leads to functional characterization. PMID:22303269
Y chromosome of D. pseudoobscura is not homologous to the ancestral Drosophila Y.
Carvalho, Antonio Bernardo; Clark, Andrew G
2005-01-07
We report a genome-wide search of Y-linked genes in Drosophila pseudoobscura. All six identifiable orthologs of the D. melanogaster Y-linked genes have autosomal inheritance in D. pseudoobscura. Four orthologs were investigated in detail and proved to be Y-linked in D. guanche and D. bifasciata, which shows that less than 18 million years ago the ancestral Drosophila Y chromosome was translocated to an autosome in the D. pseudoobscura lineage. We found 15 genes and pseudogenes in the current Y of D. pseudoobscura, and none are shared with the D. melanogaster Y. Hence, the Y chromosome in the D. pseudoobscura lineage appears to have arisen de novo and is not homologous to the D. melanogaster Y.
Molecular evolution tracks macroevolutionary transitions in Cetacea.
McGowen, Michael R; Gatesy, John; Wildman, Derek E
2014-06-01
Cetacea (whales, dolphins, and porpoises) is a model group for investigating the molecular signature of macroevolutionary transitions. Recent research has begun to reveal the molecular underpinnings of the remarkable anatomical and behavioral transformation in this clade. This shift from terrestrial to aquatic environments is arguably the best-understood major morphological transition in vertebrate evolution. The ancestral body plan and physiology were extensively modified and, in many cases, these crucial changes are recorded in cetacean genomes. Recent studies have highlighted cetaceans as central to understanding adaptive molecular convergence and pseudogene formation. Here, we review current research in cetacean molecular evolution and the potential of Cetacea as a model for the study of other macroevolutionary transitions from a genomic perspective. Copyright © 2014 Elsevier Ltd. All rights reserved.
The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis.
Duan, Naibin; Sun, Honghe; Wang, Nan; Fei, Zhangjun; Chen, Xuesen
2016-07-01
The complete mitochondrial genome sequence of Malus hupehensis var. pinyiensis, a widely used apple rootstock, was determined using the Illumina high-throughput sequencing approach. The genome is 422,555 bp in length and has a GC content of 45.21%. It is separated by a pair of inverted repeats of 32,504 bp, to form a large single copy region of 213,055 bp and a small single copy region of 144,492 bp. The genome contains 38 protein-coding genes, four pseudogenes, 25 tRNA genes, and three rRNA genes. The genome is 25,608 bp longer than that of M. domestica, and several structural variations between these two mitogenomes were detected.
Khan, Abdul Latif; Asaf, Sajjad; Khan, Abdur Rahim; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung
2016-05-10
Preussia sp. BSL10, family Sporormiaceae, was actively producing phytohormone (indole-3-acetic acid) and extra-cellular enzymes (phosphatases and glucosidases). The fungus was also promoting the growth of the arid-land tree Boswellia sacra. In view of these prospects, we sequenced its draft genome for the first time. The Illumina-based sequence analysis reveals an approximate genome size of 31.4 Mbp for Preussia sp. BSL10. Based on ab initio gene prediction, a total of 32,312 coding sequences were annotated, consisting of 11,967 coding genes, pseudogenes, and 221 tRNA genes. Furthermore, 321 carbohydrate-active enzymes were predicted and classified into many functional families. Copyright © 2016 Elsevier B.V. All rights reserved.
Patience, C; Wilkinson, D A; Weiss, R A
1997-03-01
Darwin could not have foretold that we are descended from viruses as well as from apes. While there is clear evidence that viral diseases, such as polio and rabies, affected ancient civilizations, viruses were not defined until the early years of this century, shortly after the rediscovery of mendelian genetics. That retroviral genomes can oscillate between infectious and genetic modes of transmission seemed preposterous before the discovery of reverse transcription in 1970. Those of us who had earlier provided mendelian evidence for germ-line transmission of retroviruses were the subject of friendly ridicule. Today, the shunting of genetic elements between chromosomes and RNA, and the generation of processed pseudogenes, seems commonplace. It is timely, however, to revisit the topic of human endogenous retroviruses, the subject of this article.
Real Time Optima Tracking Using Harvesting Models of the Genetic Algorithm
NASA Technical Reports Server (NTRS)
Baskaran, Subbiah; Noever, D.
1999-01-01
Tracking optima in real time propulsion control, particularly for non-stationary optimization problems, is a challenging task. Several approaches have been put forward for such a study, including the numerical method called the genetic algorithm. In brief, this approach is built upon Darwinian-style competition between numerical alternatives displayed in the form of binary strings, or by analogy to 'pseudogenes'. Breeding of improved solutions is an often-cited parallel to natural selection in evolutionary or soft computing. In this report we present our results of applying a novel model of a genetic algorithm for tracking optima in propulsion engineering and in real time control. We specialize the algorithm to mission profiling and planning optimizations, both to select reduced propulsion needs through trajectory planning and to explore time or fuel conservation strategies.
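The competition between binary strings described in this abstract can be illustrated with a generic textbook genetic algorithm. The sketch below is written under stated assumptions: it maximizes the number of 1-bits in a string (a stationary toy problem), and it does not implement the report's harvesting model or any real-time tracking. The function name `evolve` and all parameter values are hypothetical.

```python
import random


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Toy genetic algorithm that maximizes the number of 1-bits in a string."""
    rng = random.Random(seed)
    fitness = sum  # fitness of a bit list is its count of ones
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]       # truncation selection
        children = [pop[0][:]]               # elitism: copy the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)   # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print(sum(best), history[0], history[-1])
```

Because the best individual is always copied into the next generation, the recorded best fitness never decreases; tracking a moving optimum, as in the report, would need additional machinery that is not shown here.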
Finke, J; Fritzen, R; Ternes, P; Lange, W; Dölken, G
1993-03-01
Specific amplification of nucleic acid sequences by PCR has been extensively used for the detection of gene rearrangements and gene expression. Although successful amplification of DNA sequences has been carried out with DNA prepared from formalin-fixed, paraffin-embedded (FFPE) tissues, there are only a few reports regarding RNA analysis in this kind of material. We describe a procedure for RNA extraction from different types of FFPE tissues, involving digestion with proteinase K followed by guanidinium-thiocyanate acid phenol extraction and DNase I digestion. These RNA preparations are suitable for PCR analysis of mRNA and even of intronless genes. Furthermore, the universally expressed porphobilinogen deaminase mRNA proved to be useful as a positive control because of the lack of pseudogenes.
Comparative genomic analysis of three Leishmania species that cause diverse human disease
Peacock, Christopher S; Seeger, Kathy; Harris, David; Murphy, Lee; Ruiz, Jeronimo C; Quail, Michael A; Peters, Nick; Adlem, Ellen; Tivey, Adrian; Aslett, Martin; Kerhornou, Arnaud; Ivens, Alasdair; Fraser, Audrey; Rajandream, Marie-Adele; Carver, Tim; Norbertczak, Halina; Chillingworth, Tracey; Hance, Zahra; Jagels, Kay; Moule, Sharon; Ormond, Doug; Rutter, Simon; Squares, Rob; Whitehead, Sally; Rabbinowitsch, Ester; Arrowsmith, Claire; White, Brian; Thurston, Scott; Bringaud, Frédéric; Baldauf, Sandra L; Faulconbridge, Adam; Jeffares, Daniel; Depledge, Daniel P; Oyola, Samuel O; Hilley, James D; Brito, Loislene O; Tosi, Luiz R O; Barrell, Barclay; Cruz, Angela K; Mottram, Jeremy C; Smith, Deborah F; Berriman, Matthew
2008-01-01
Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only ∼200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader–associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage. PMID:17572675
Hughes, Linda; Carton, Robert; Minguzzi, Stefano; McEntee, Gráinne; Deinum, Eva E; O'Connell, Mary J; Parle-McDermott, Anne
2015-07-08
The identification of a second functional dihydrofolate reductase enzyme in humans, DHFRL1, led us to consider whether this is also a feature of rodents. We demonstrate that dihydrofolate reductase activity is also a feature of the mitochondria in both rat and mouse but this is not due to a second enzyme. While our phylogenetic analysis revealed that RNA-mediated DHFR duplication events did occur across the mammal tree, the duplicates in brown rat and mouse are likely to be processed pseudogenes. Humans have evolved the need for two separate enzymes while laboratory rats and mice have just one. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Hirawake, H; Taniwaki, M; Tamura, A; Amino, H; Tomitsuka, E; Kita, K
1999-08-04
We have mapped large (cybL) and small (cybS) subunits of cytochrome b in the succinate-ubiquinone oxidoreductase (complex II) of human mitochondria to chromosome 1q21 and 11q23, respectively (H. Hirawake et al., Cytogenet. Cell Genet. 79 (1997) 132-138). In the present study, the human SDHD gene encoding cybS was cloned and characterized. The gene comprises four exons and three introns extending over 19 kb. Sequence analysis of the 5' promoter region showed several motifs for the binding of transcription factors including nuclear respiratory factors NRF-1 and NRF-2 at positions -137 and -104, respectively. In addition to this gene, six pseudogenes of cybS were isolated and mapped on the chromosome.
Characterisation of the subtelomeric regions of Giardia lamblia genome isolate WBC6.
Prabhu, Anjali; Morrison, Hilary G; Martinez, Charles R; Adam, Rodney D
2007-04-01
Giardia trophozoites are polyploid and have five chromosomes. The chromosome homologues demonstrate considerable size heterogeneity due to variation in the subtelomeric regions. We used clones from the genome project with telomeric sequence at one end to identify six subtelomeric regions in addition to previously identified subtelomeric regions, to study the telomeric arrangement of the chromosomes. The subtelomeric regions included two retroposons, one retroposon pseudogene, and two vsp genes, in addition to the previously identified subtelomeric regions that include ribosomal DNA repeats. The presence of vsp genes in a subtelomeric region suggests that telomeric rearrangements may contribute to the generation of vsp diversity. These studies of the subtelomeric regions of Giardia may contribute to our understanding of the factors that maintain stability, while allowing diversity in chromosome structure.
Structural and functional partitioning of bread wheat chromosome 3B.
Choulet, Frédéric; Alberti, Adriana; Theil, Sébastien; Glover, Natasha; Barbe, Valérie; Daron, Josquin; Pingault, Lise; Sourdille, Pierre; Couloux, Arnaud; Paux, Etienne; Leroy, Philippe; Mangenot, Sophie; Guilhot, Nicolas; Le Gouis, Jacques; Balfourier, Francois; Alaux, Michael; Jamilloux, Véronique; Poulain, Julie; Durand, Céline; Bellec, Arnaud; Gaspin, Christine; Safar, Jan; Dolezel, Jaroslav; Rogers, Jane; Vandepoele, Klaas; Aury, Jean-Marc; Mayer, Klaus; Berges, Hélène; Quesneville, Hadi; Wincker, Patrick; Feuillet, Catherine
2014-07-18
We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits. Copyright © 2014, American Association for the Advancement of Science.
Dynamics of actin evolution in dinoflagellates.
Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F
2011-04-01
Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' untranslated region (UTR), although there was still limited amino acid variation between most sequences.
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
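The criterion used in this abstract for flagging potential pseudogenes (an incomplete open reading frame caused by a frameshift or an early stop codon) can be sketched in a few lines of Python. This is an illustrative check only, assuming the standard genetic code and a sequence that starts in frame; the function name `orf_problems` and the example sequences are hypothetical and are not taken from the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def orf_problems(cds: str) -> list[str]:
    """Return reasons why a coding sequence lacks a complete ORF ([] if intact)."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # A stop codon before the last complete codon truncates the protein.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop {codon} at codon {index + 1}")
            break
    return problems


print(orf_problems("ATGGCTGAAGAATAA"))  # []
print(orf_problems("ATGGCTTAAGAATAA"))  # ['premature stop TAA at codon 3']
print(orf_problems("ATGGCGAAGAATAA"))   # one base deleted: length problem only
```

A real annotation pipeline would also compare each copy against an intact reference to locate the frameshift itself; the sketch only reports that the reading frame is broken.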
Unit-length line-1 transcripts in human teratocarcinoma cells.
Skowronski, J; Fanning, T G; Singer, M F
1988-01-01
We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF 2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. PMID:2454389
Lorenzon, S; Wesonga, H; Ygesu, Laikemariam; Tekleghiorgis, Tesfaalem; Maikano, Y; Angaya, M; Hendrikx, P; Thiaucourt, F
2002-03-01
Contagious caprine pleuropneumonia (CCPP) is a major threat to goat farming in developing countries. Its exact distribution is not well known, despite the fact that new diagnostic tools such as PCR and competitive ELISA are now available. The authors developed a study of the molecular epidemiology of the disease, based on the amplification of a 2400 bp long fragment containing two duplicated genes coding for a putative membrane protein. The sequence of this fragment, obtained on 19 Mycoplasma capricolum subsp. capripneumoniae (Mccp) strains from various geographical locations, gave 11 polymorphic positions. The three mutations found on gene H2prim were silent and did not appear to induce any amino acid modifications in the putative translated protein. The second gene may be a pseudogene not translated in vivo, as it bore a deletion of the ATG codon found in the other members of the "Mycoplasma mycoides cluster" and as the six mutations evidenced in the Mccp strains would induce modifications in the translated amino acids. In addition, an Mccp strain isolated in the United Arab Emirates showed a deletion of the whole pseudogene, a further indication that this gene is not compulsory for mycoplasma growth. Four lineages were defined, based on the nucleotide sequence. These correlated relatively well with the geographical origin of the strains: North, Central or East Africa. The strain of Turkish origin had a sequence similar to that found in North African strains, while strains isolated in Oman had sequences similar to those of North or East African strains. The latter is possibly due to the regular import of goats of various origins. Similar molecular epidemiology tools have been developed by sequencing the two operons of the 16S rRNA gene or by AFLP. All these various techniques give complementary results.
One (16S rRNA) offers the likelihood of a finer identification of strains circulating in a region, another (H2) of determining the geographical origin of the strains. These tools can make a very useful contribution to understanding the epidemiology of CCPP.
Engel, Pablo; Angulo, Ana
2016-01-01
Since the discovery of the high abundance of Alu elements in the human genome, interest in the functional significance of these retrotransposons has been increasing. Primate Alu and rodent Alu-like elements are retrotransposed by a mechanism driven by the LINE1 (L1) encoded proteins, the same machinery that generates the L1 repeats, the processed pseudogenes (PPs), and other retroelements. Apart from free Alu RNAs, Alus are also transcribed and retrotranscribed as part of cellular gene transcripts, generally embedded inside 3’ untranslated regions (UTRs). Despite different proposed hypotheses, the functional implication of the presence of Alus inside 3’UTRs remains elusive. In this study we hypothesized that Alu elements in 3’UTRs could be involved in the genesis of PPs. By analyzing human genome data we discovered that 3’UTR-embedded Alu elements are overrepresented in genes that are the source of PPs. In contrast, the presence of other retrotransposable elements in 3’UTRs does not show this PP-linked overrepresentation. This research was extended to mouse and rat genomes, and the results accordingly reveal overrepresentation of 3’UTR-embedded B1 (Alu-like) elements in PP parent genes. Interestingly, we also demonstrated that the overrepresentation of 3’UTR-embedded Alus is particularly significant in PP parent genes with low germline gene expression level. Finally, we provide data that support the hypothesis that the L1 machinery is also the system that herpesviruses, and possibly other large DNA viruses, use to capture host genes expressed in germline or somatic cells. Altogether our results suggest a novel role for Alu or Alu-like elements inside 3’UTRs as facilitators of the genesis of PPs, particularly in lowly expressed genes. Moreover, we propose that this L1-driven mechanism, aided by the presence of 3’UTR-embedded Alus, may also be exploited by DNA viruses to incorporate host genes into their viral genomes. PMID:28033411
DOE Office of Scientific and Technical Information (OSTI.GOV)
de Wit, Pierre J. G. M.; van der Burgt, Ate; Okmen, Bilal
2012-05-04
We sequenced and compared the genomes of the Dothideomycete fungal plant pathogens Cladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse) that are closely related phylogenetically, but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of gene content in both genomes are homologs), but differ significantly in size (Cfu >61.1-Mb; Dse 31.2-Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains a tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation.
Xie, Zhenze; Wang, Congyan; Wang, Ke; Wang, Shunli; Li, Xiaohui; Zhang, Zhao; Ma, Wujun; Yan, Yueming
2010-11-01
Nineteen novel full-ORF α-gliadin genes and 32 pseudogenes containing at least one stop codon were cloned and sequenced from three Aegilops tauschii accessions (T15, T43 and T26) and two bread wheat cultivars (Gaocheng 8901 and Zhongyou 9507). Analysis of three typical α-gliadin genes (Gli-At4, Gli-G1 and Gli-Z4) revealed some InDels and a considerable number of SNPs among them. Most of the pseudogenes resulted from C-to-T changes that generated a TAG or TAA in-frame stop codon. The putative proteins of both Gli-At3 and Gli-Z7 genes contained an extra cysteine residue in the unique domain II. Analysis of toxic epitopes among 19 deduced α-gliadins demonstrated that 14 of these contained 1-5 T cell stimulatory toxic epitopes while the other 5 did not contain any toxic epitopes. The glutamine residues in two specific polyglutamine domains ranged from 7 to 27, indicating a high variation in length. According to the numbers of 4 T cell stimulatory toxic epitopes and glutamine residues in the two polyglutamine domains among the 19 α-gliadin genes, 2 were assigned to chromosome 6A, 5 to chromosome 6B and 12 to chromosome 6D. These results were consistent with those from wheat cv. Chinese Spring nulli-tetrasomic and phylogenetic analysis. Secondary structure prediction showed that all α-gliadins had a high content of β-strands and most of the α-helices and β-strands were present in two unique domains. Phylogenetic analysis demonstrated that α-gliadin genes had a high homology with γ-gliadin, B-hordein, and LMW-GS genes and that they diverged approximately 39 MYA. Finally, the five α-gliadin genes were successfully expressed in E. coli, and their expression reached its maximum 4 h after IPTG induction, indicating that the α-gliadin genes can be expressed at a high level under the control of the T7 promoter.
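The observation that C-to-T changes produce TAG or TAA stop codons can be made concrete by listing which sense codons become a stop after a single C-to-T substitution. The sketch below assumes the standard genetic code; the function name `c_to_t_stop_sources` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def c_to_t_stop_sources() -> dict[str, str]:
    """Map each sense codon to the stop codon it becomes after one C->T change."""
    sources = {}
    for stop in sorted(STOP_CODONS):
        for pos, base in enumerate(stop):
            if base == "T":
                # Reverse the mutation: put a C where the stop codon has a T.
                codon = stop[:pos] + "C" + stop[pos + 1:]
                if codon not in STOP_CODONS:
                    sources[codon] = stop
    return sources


print(c_to_t_stop_sources())
# {'CAA': 'TAA', 'CAG': 'TAG', 'CGA': 'TGA'}
```

CAA and CAG both encode glutamine, so a glutamine-rich coding sequence offers many positions where one C-to-T transition yields TAA or TAG, which fits the pattern reported in the abstract.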
Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana
2012-03-27
Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) were used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, and a transcriptional regulator, among other proteins, most of which are annotated as hypothetical, that were missed during annotation.
Hyland, Catherine A; Millard, Glenda M; O'Brien, Helen; Schoeman, Elizna M; Lopez, Genghis H; McGowan, Eunike C; Tremellen, Anne; Puddephatt, Rachel; Gaerty, Kirsten; Flower, Robert L; Hyett, Jonathan A; Gardener, Glenn J
2017-12-01
Non-invasive fetal RHD genotyping in Australia to reduce anti-D usage will need to accommodate both prolonged sample transport times and a diverse population demographic harbouring a range of RHD blood group gene variants. We compared RHD genotyping accuracy using two blood sample collection tube types for RhD negative women stratified into deleted RHD gene haplotype and RHD gene variant cohorts. Maternal blood samples were collected into EDTA and cell-free (cf)DNA stabilising (BCT) tubes from two sites, one interstate. Automated DNA extraction and polymerase chain reaction (PCR) were used to amplify RHD exons 5 and 10 and CCR5. Automated analysis flagged maternal RHD variants, which were classified by genotyping. Time between sample collection and processing ranged from 2.9 to 187.5 hours. cfDNA levels increased with time for EDTA (range 0.03-138 ng/μL) but not BCT samples (0.01-3.24 ng/μL). For the 'deleted' cohort (n=647) all fetal RHD genotyping outcomes were concordant, excepting for one unexplained false negative EDTA sample. Matched against cord RhD serology, negative predictive values using BCT and EDTA tubes were 100% and 99.6%, respectively. Positive predictive values were 99.7% for both types. Overall 37.2% of subjects carried an RhD negative baby. The 'variant' cohort (n=15) included one novel RHD and eight hybrid or African pseudogene variants. Review for fetal RHD specific signals, based on one exon, showed three EDTA samples discordant to BCT, attributed to high maternal cfDNA levels arising from prolonged transport times. For the deleted haplotype cohort, fetal RHD genotyping accuracy was comparable for samples collected in EDTA and BCT tubes despite higher cfDNA levels in the EDTA tubes. Capacity to predict fetal RHD genotype for maternal carriers of hybrid or pseudogene RHD variants requires stringent control of cfDNA levels. 
We conclude that fetal RHD genotyping is feasible in the Australian environment to avoid unnecessary anti-D immunoglobulin prophylaxis. Copyright © 2017. Published by Elsevier B.V.
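The predictive values quoted in this abstract come from comparing fetal genotype calls with cord RhD serology in a 2x2 table. As a minimal sketch of that calculation, the Python below uses hypothetical counts, since the abstract does not report the underlying table; the function name `predictive_values` is likewise an illustration, not part of the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) as fractions from a 2x2 confusion table.

    tp/fp: RHD-positive genotype calls confirmed / not confirmed by serology.
    tn/fn: RHD-negative genotype calls confirmed / not confirmed by serology.
    """
    ppv = tp / (tp + fp)  # share of positive calls that are truly positive
    npv = tn / (tn + fn)  # share of negative calls that are truly negative
    return ppv, npv


# Hypothetical counts chosen only for illustration; not taken from the study.
ppv, npv = predictive_values(tp=400, fp=1, tn=240, fn=1)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

A single false negative moves the negative predictive value below 100%, which is the pattern the abstract describes for the EDTA tubes.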
Genomic assessment of the evolution of the prion protein gene family in vertebrates.
Harrison, Paul M; Khachane, Amit; Kumar, Manish
2010-05-01
Prion diseases are devastating neurological disorders caused by the propagation of particles containing an alternative beta-sheet-rich form of the prion protein (PrP). Genes paralogous to PrP, called Doppel and Shadoo, have been identified, that also have neuropathological relevance. To aid in the further functional characterization of PrP and its relatives, we annotated completely the PrP gene family (PrP-GF), in the genomes of 42 vertebrates, through combined strategic application of gene prediction programs and advanced remote homology detection techniques (such as HMMs, PSI-TBLASTN and pGenThreader). We have uncovered several previously undescribed paralogous genes and pseudogenes. We find that current high-quality genomic evidence indicates that the PrP relative Doppel, was likely present in the last common ancestor of present-day Tetrapoda, but was lost in the bird lineage, since its divergence from reptiles. Using the new gene annotations, we have defined the consensus of structural features that are characteristic of the PrP and Doppel structures, across diverse Tetrapoda clades. Furthermore, we describe in detail a transcribed pseudogene derived from Shadoo that is conserved across primates, and that overlaps the meiosis gene, SYCE1, thus possibly regulating its expression. In addition, we analysed the locus of PRNP/PRND for significant conservation across the genomic DNA of eleven mammals, and determined the phylogenetic penetration of non-coding exons. The genomic evidence indicates that the second PRNP non-coding exon found in even-toed ungulates and rodents, is conserved in all high-coverage genome assemblies of primates (human, chimp, orang utan and macaque), and is, at least, likely to have fallen out of use during primate speciation. 
Furthermore, we have demonstrated that the PRNT gene (at the PRNP human locus) is conserved across at least sixteen mammals, and evolves like a long non-coding RNA, fashioned from fragments of ancient, long, interspersed elements. These annotations and evolutionary analyses will be of further use for functional characterisation of the PrP-GF, and will be updatable in a semi-automated fashion as more genomes accumulate. Copyright 2010 Elsevier Inc. All rights reserved.
New consensus nomenclature for mammalian keratins
Schweizer, Jürgen; Bowden, Paul E.; Coulombe, Pierre A.; Langbein, Lutz; Lane, E. Birgitte; Magin, Thomas M.; Maltais, Lois; Omary, M. Bishr; Parry, David A.D.; Rogers, Michael A.; Wright, Mathew W.
2006-01-01
Keratins are intermediate filament–forming proteins that provide mechanical support and fulfill a variety of additional functions in epithelial cells. In 1982, a nomenclature was devised to name the keratin proteins that were known at that point. The systematic sequencing of the human genome in recent years uncovered the existence of several novel keratin genes and their encoded proteins. Their naming could not be adequately handled in the context of the original system. We propose a new consensus nomenclature for keratin genes and proteins that relies upon and extends the 1982 system and adheres to the guidelines issued by the Human and Mouse Genome Nomenclature Committees. This revised nomenclature accommodates functional genes and pseudogenes, and although designed specifically for the full complement of human keratins, it offers the flexibility needed to incorporate additional keratins from other mammalian species. PMID:16831889
Bhattacharya, Pamela; Barnebey, Adam; Zemla, Marcin; ...
2015-10-05
Thermoanaerobacter thermohydrosulfuricus BSB-33 is a thermophilic gram positive obligate anaerobe isolated from a hot spring in West Bengal, India. Unlike other T. thermohydrosulfuricus strains, BSB-33 is able to anaerobically reduce Fe(III) and Cr(VI) optimally at 60 °C. BSB-33 is the first Cr(VI) reducing T. thermohydrosulfuricus genome sequenced and of particular interest for bioremediation of environmental chromium contaminations. Here we discuss features of T. thermohydrosulfuricus BSB-33 and the unique genetic elements that may account for the peculiar metal reducing properties of this organism. The T. thermohydrosulfuricus BSB-33 genome comprises 2597606 bp encoding 2581 protein genes, 12 rRNA, 193 pseudogenes and has a G + C content of 34.20 %. Lastly, putative chromate reductases were identified by comparative analyses with other Thermoanaerobacter and chromate-reducing bacteria.
Spiridonova, L N; Red'kin, Ya A; Valchuk, O P
2016-01-01
First evidence for the presence of copies of mitochondrial cytochrome b gene of the subspecies group Luscinia calliope anadyrensis-L. c. camtschatkensis in the nuclear genome of nominative L. c. calliope was obtained, which indirectly indicates the nuclear origin of the subspecies-specific mitochondrial haplotypes in Siberian rubythroat. This fact clarifies the appearance of mitochondrial haplotypes of eastern subspecies by exchange between the homologous regions of the nuclear and mitochondrial genomes followed by fixation by the founder effect. This is the first study to propose a mechanism of DNA fragment exchange between the nucleus and mitochondria (intergenomic recombination) and to show the role of nuclear copies of mtDNA as a source of new taxon-specific mitochondrial haplotypes, which implies their involvement in the microevolutionary processes and morphogenesis.
Organization of the SUC gene family in Saccharomyces.
Carlson, M; Botstein, D
1983-01-01
The SUC gene family of yeast (Saccharomyces) includes six structural genes for invertase (SUC1 through SUC5 and SUC7) found at unlinked chromosomal loci. A given yeast strain does not usually carry SUC+ alleles at all six loci; the natural negative alleles are called suc0 alleles. Cloned SUC2 DNA probes were used to investigate the physical structure of the SUC gene family in laboratory strains, commercial wine strains, and different Saccharomyces species. The active SUC+ genes are homologous. The suc0 allele at the SUC2 locus (suc2(0)) in some strains is a silent gene or pseudogene. Other SUC loci carrying suc0 alleles appear to lack SUC DNA sequences. These findings imply that SUC genes have transposed to different chromosomal locations in closely related Saccharomyces strains. PMID:6843548
[Structural organization of 5S ribosomal DNA of Rosa rugosa].
Tynkevych, Iu O; Volkov, R A
2014-01-01
In order to clarify the molecular organization of the genomic region encoding 5S rRNA in the diploid species Rosa rugosa, several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active, is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of the coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found, indicating comparatively recent divergence of these species.
Molecular basis of length polymorphism in the human zeta-globin gene complex.
Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J
1983-01-01
The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
Clendenning, Mark; Walsh, Michael D; Gelpi, Judith Balmana; Thibodeau, Stephen N; Lindor, Noralane; Potter, John D; Newcomb, Polly; LeMarchand, Loic; Haile, Robert; Gallinger, Steve; Hopper, John L; Jenkins, Mark A; Rosty, Christophe; Young, Joanne P; Buchanan, Daniel D
2013-09-01
Current screening practices have been able to identify PMS2 mutations in 78 % of cases of colorectal cancer from the Colorectal Cancer Family Registry (Colon CFR) which showed solitary loss of the PMS2 protein. However the detection of large-scale deletions in the 3' end of the PMS2 gene has not been possible due to technical difficulties associated with pseudogene sequences. Here, we utilised a recently described MLPA/long-range PCR-based approach to screen the remaining 22 % (n = 16) of CRC-affected probands for mutations in the 3' end of the PMS2 gene. No deletions encompassing any or all of exons 12 through 15 were identified; therefore, our results suggest that 3' deletions in PMS2 are not a frequent occurrence in such families.
The Ftx Noncoding Locus Controls X Chromosome Inactivation Independently of Its RNA Products.
Furlan, Giulia; Gutierrez Hernandez, Nancy; Huret, Christophe; Galupa, Rafael; van Bemmel, Joke Gerarda; Romito, Antonio; Heard, Edith; Morey, Céline; Rougeulle, Claire
2018-05-03
Accumulation of the Xist long noncoding RNA (lncRNA) on one X chromosome is the trigger for X chromosome inactivation (XCI) in female mammals. Xist expression, which needs to be tightly controlled, involves a cis-acting region, the X-inactivation center (Xic), containing many lncRNA genes that evolved concomitantly to Xist from protein-coding ancestors through pseudogenization and loss of coding potential. Here, we uncover an essential role for the Xic-linked noncoding gene Ftx in the regulation of Xist expression. We show that Ftx is required in cis to promote Xist transcriptional activation and establishment of XCI. Importantly, we demonstrate that this function depends on Ftx transcription and not on the RNA products. Our findings illustrate the multiplicity of layers operating in the establishment of XCI and highlight the diversity in the modus operandi of the noncoding players. Copyright © 2018 Elsevier Inc. All rights reserved.
Structure and expression of the attacin genes in Hyalophora cecropia.
Sun, S C; Lindström, I; Lee, J Y; Faye, I
1991-02-26
To study the regulation of the immune genes in insects, we have cloned and sequenced the attacin gene locus of the giant silk moth Hyalophora cecropia. The locus contains one acidic and one basic attacin gene as well as two pseudogenes, which are remnants of basic attacin genes. A small insertion element was found within the locus. The two functional attacin genes are transcribed in opposite directions and have two introns inserted at homologous positions. A common sequence, GGGGATTCCT, is found at nucleotide position -48 in the acidic gene and at nucleotide position -58 in the basic gene. Interestingly, this decanucleotide is similar to the consensus of the NF-k B-binding site. Expression studies revealed that both attacins are strongly induced by phorbol 12-myristate 13-acetate, lipopolysaccharide and bacteria. However, only the acidic attacin gene showed a clear response to injury.
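The similarity between the attacin decanucleotide and the NF-kB binding site can be checked position by position. The sketch below assumes the commonly cited consensus GGGRNNYYCC (R = purine, Y = pyrimidine, N = any base); the abstract does not state which consensus the authors compared against, so that string and the helper name `mismatches` are assumptions made for illustration.

```python
# IUPAC nucleotide codes needed for this example only.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def mismatches(site, consensus):
    """Return 1-based positions where `site` does not satisfy `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(site, consensus))
        if base not in IUPAC[code]
    ]


ATTACIN_MOTIF = "GGGGATTCCT"      # decanucleotide reported in the abstract
KAPPA_B_CONSENSUS = "GGGRNNYYCC"  # assumed consensus, not given in the abstract

print(mismatches(ATTACIN_MOTIF, KAPPA_B_CONSENSUS))
```

Under this assumed consensus the motif differs only at the last position (T instead of C), which fits the abstract's wording that the sequence is similar to, rather than identical with, the consensus.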
Molecular diagnostics for hereditary hearing loss in children.
Sommen, Manou; Wuyts, Wim; Van Camp, Guy
2017-08-01
Hearing loss (HL) is the most common birth defect in industrialized countries with far-reaching social, psychological and cognitive implications. It is an extremely heterogeneous disease, complicating molecular testing. The introduction of next-generation sequencing (NGS) has resulted in great progress in diagnostics, allowing all known HL genes to be studied in a single assay. The diagnostic yield is currently still limited, but has the potential to increase substantially. Areas covered: In this review the utility of NGS and the problems for comprehensive molecular testing for HL are evaluated and discussed. Expert commentary: Different publications have proven the appropriateness of NGS for molecular testing of heterogeneous diseases such as HL. However, several problems still exist, such as the pseudogenic background of some genes and problematic copy number variant analysis on targeted NGS data. Another main challenge for the future will be the establishment of population-specific mutation spectra to achieve accurate personalized comprehensive molecular testing for HL.
Lozano, Roberto; Ponce, Olga; Ramirez, Manuel; Mostajo, Nelly; Orjeda, Gisella
2012-01-01
The majority of disease resistance (R) genes identified to date in plants encode a nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domain containing protein. Additional domains such as coiled-coil (CC) and TOLL/interleukin-1 receptor (TIR) domains can also be present. In the recently sequenced Solanum tuberosum group phureja genome we used HMM models and manual curation to annotate 435 NBS-encoding R gene homologs and 142 NBS-derived genes that lack the NBS domain. Highly similar homologs for most previously documented Solanaceae R genes were identified. A surprising ∼41% (179) of the 435 NBS-encoding genes are pseudogenes primarily caused by premature stop codons or frameshift mutations. Alignment of 81.80% of the 577 homologs to S. tuberosum group phureja pseudomolecules revealed non-random distribution of the R-genes; 362 of 470 genes were found in high density clusters on 11 chromosomes. PMID:22493716
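The counts in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers stated in the abstract; the variable names are illustrative.

```python
nbs_encoding = 435   # NBS-encoding R gene homologs
nbs_derived = 142    # NBS-derived genes lacking the NBS domain
pseudogenes = 179    # pseudogenes among the NBS-encoding genes
clustered = 362      # genes found in high-density clusters
cluster_denominator = 470  # denominator quoted for the cluster count

all_homologs = nbs_encoding + nbs_derived
pseudo_pct = 100 * pseudogenes / nbs_encoding
clustered_pct = 100 * clustered / cluster_denominator
aligned_estimate = round(0.8180 * all_homologs)  # 81.80% of all homologs

print(f"total homologs: {all_homologs}")
print(f"pseudogene share of NBS-encoding genes: {pseudo_pct:.1f}%")
print(f"clustered share of {cluster_denominator} genes: {clustered_pct:.1f}%")
print(f"homologs implied by 81.80% alignment: {aligned_estimate}")
```

The total of 577 and the roughly 41% pseudogene share match the text. The 81.80% figure implies about 472 aligned homologs, slightly more than the 470 used as the denominator for the cluster count; the abstract does not explain the difference.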
Krak, Karol; Alvarez, Inés; Caklová, Petra; Costa, Andrea; Chrtek, Jindrich; Fehrer, Judith
2012-02-01
The development of three low-copy nuclear markers for low taxonomic level phylogenies in Asteraceae with emphasis on the subtribe Hieraciinae is reported. Marker candidates were selected by comparing a Lactuca complementary DNA (cDNA) library with public DNA sequence databases. Interspecific variation and phylogenetic signal of the selected genes were investigated for diploid taxa from the subtribe Hieraciinae and compared to a reference phylogeny. Their ability to cross-amplify was assessed for other Asteraceae tribes. All three markers had higher variation (2.1-4.5 times) than the internal transcribed spacer (ITS) in Hieraciinae. Cross-amplification was successful in at least seven other tribes of the Asteraceae. Only three cases indicating the presence of paralogs or pseudogenes were detected. The results demonstrate the potential of these markers for phylogeny reconstruction in the Hieraciinae as well as in other Asteraceae tribes, especially for very closely related species.
Gene flow contributes to diversification of the major fungal pathogen Candida albicans.
Ropars, Jeanne; Maufrais, Corinne; Diogo, Dorothée; Marcet-Houben, Marina; Perin, Aurélie; Sertour, Natacha; Mosca, Kevin; Permal, Emmanuelle; Laval, Guillaume; Bouchier, Christiane; Ma, Laurence; Schwartz, Katja; Voelz, Kerstin; May, Robin C; Poulain, Julie; Battail, Christophe; Wincker, Patrick; Borman, Andrew M; Chowdhary, Anuradha; Fan, Shangrong; Kim, Soo Hyun; Le Pape, Patrice; Romeo, Orazio; Shin, Jong Hee; Gabaldon, Toni; Sherlock, Gavin; Bougnoux, Marie-Elisabeth; d'Enfert, Christophe
2018-06-08
Elucidating population structure and levels of genetic diversity and recombination is necessary to understand the evolution and adaptation of species. Candida albicans is the second most frequent agent of human fungal infections worldwide, causing high-mortality rates. Here we present the genomic sequences of 182 C. albicans isolates collected worldwide, including commensal isolates, as well as ones responsible for superficial and invasive infections, constituting the largest dataset to date for this major fungal pathogen. Although C. albicans shows a predominantly clonal population structure, we find evidence of gene flow between previously known and newly identified genetic clusters, supporting the occurrence of (para)sexuality in nature. A highly clonal lineage, which experimentally shows reduced fitness, has undergone pseudogenization in genes required for virulence and morphogenesis, which may explain its niche restriction. Candida albicans thus takes advantage of both clonality and gene flow to diversify.
Epigenetics, chromatin and genome organization: recent advances from the ENCODE project.
Siggens, L; Ekwall, K
2014-09-01
The organization of the genome into functional units, such as enhancers and active or repressed promoters, is associated with distinct patterns of DNA and histone modifications. The Encyclopedia of DNA Elements (ENCODE) project has advanced our understanding of the principles of genome, epigenome and chromatin organization, identifying hundreds of thousands of potential regulatory regions and transcription factor binding sites. Part of the ENCODE consortium, GENCODE, has annotated the human genome with novel transcripts including new noncoding RNAs and pseudogenes, highlighting transcriptional complexity. Many disease variants identified in genome-wide association studies are located within putative enhancer regions defined by the ENCODE project. Understanding the principles of chromatin and epigenome organization will help to identify new disease mechanisms, biomarkers and drug targets, particularly as ongoing epigenome mapping projects generate data for primary human cell types that play important roles in disease. © 2014 The Association for the Publication of the Journal of Internal Medicine.
Creating reference gene annotation for the mouse C57BL6/J genome assembly.
Mudge, Jonathan M; Harrow, Jennifer
2015-10-01
Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.
yadBC of Yersinia pestis, a new virulence determinant for bubonic plague.
Forman, Stanislav; Wulff, Christine R; Myers-Morales, Tanya; Cowan, Clarissa; Perry, Robert D; Straley, Susan C
2008-02-01
In all Yersinia pestis strains examined, the adhesin/invasin yadA gene is a pseudogene, yet Y. pestis is invasive for epithelial cells. To identify potential surface proteins that are structurally and functionally similar to YadA, we searched the Y. pestis genome for open reading frames with homology to yadA and found three: the bicistronic operon yadBC (YPO1387 and YPO1388 of Y. pestis CO92; y2786 and y2785 of Y. pestis KIM5), which encodes two putative surface proteins, and YPO0902, which lacks a signal sequence and likely is nonfunctional. In this study we characterized yadBC regulation and tested the importance of this operon for Y. pestis adherence, invasion, and virulence. We found that loss of yadBC caused a modest loss of invasiveness for epithelioid cells and a large decrease in virulence for bubonic plague but not for pneumonic plague in mice.
Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori
2013-01-01
Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703
The complete mitochondrial genome of Chinese green hydra, Hydra sinensis (Hydroida: Hydridae).
Pan, Hong-Chun; Qian, Xiao-Cheng; Li, Ping; Li, Xiao-Fei; Wang, An-Tai
2014-02-01
The complete mitochondrial genome of Chinese green hydra, Hydra sinensis (Hydroida: Hydridae) is a linear molecule of 16,189 bp in length, containing 13 protein-coding genes, small and large subunit ribosomal RNAs, methionine and tryptophan transfer RNAs, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mitochondrial DNA. The A + T content of the overall base composition of H-strand is 77.2% (T: 41.7%; C: 10.9%; A: 35.5%; and G: 11.9%). COI and ND1 genes begin with GTG as start codon, while other 11 protein-coding genes start with a typical ATG initiation codon. COII, ATP8, ATP6, COIII, ND5, ND6, ND3, ND1, ND4 and COI genes are terminated with TAA as stop codon, ND4L ends with TAG, ND2 ends with TA and Cyt b ends with T.
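The reported A + T content follows directly from the per-base percentages. A short Python check, using only the values quoted in the abstract (the helper `at_content` is an illustrative addition for raw sequences, not something the authors describe):

```python
# Per-base composition of the H-strand, in percent, as quoted in the abstract.
composition = {"T": 41.7, "C": 10.9, "A": 35.5, "G": 11.9}

at_pct = composition["A"] + composition["T"]
gc_pct = composition["G"] + composition["C"]
total_pct = sum(composition.values())

print(f"A+T = {at_pct:.1f}%, G+C = {gc_pct:.1f}%, total = {total_pct:.1f}%")


def at_content(seq):
    """Percentage of A and T bases in a DNA sequence string."""
    seq = seq.upper()
    return 100 * (seq.count("A") + seq.count("T")) / len(seq)
```

The four percentages sum to 100% and A + T comes to 77.2%, in agreement with the stated figure.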
Kawamura, Norihiko; Nimura, Keisuke; Nagano, Hiromichi; Yamaguchi, Sohei; Nonomura, Norio; Kaneda, Yasufumi
2015-09-08
NANOG expression in prostate cancer is highly correlated with cancer stem cell characteristics and resistance to androgen deprivation. However, it is not clear whether NANOG or its pseudogenes contribute to the malignant potential of cancer. We established NANOG- and NANOGP8-knockout DU145 prostate cancer cell lines using the CRISPR/Cas9 system. Knockouts of NANOG and NANOGP8 significantly attenuated malignant potential, including sphere formation, anchorage-independent growth, migration capability, and drug resistance, compared to parental DU145 cells. NANOG and NANOGP8 knockout did not inhibit in vitro cell proliferation, but in vivo tumorigenic potential decreased significantly. These phenotypes were recovered in NANOG- and NANOGP8-rescued cell lines. These results indicate that NANOG and NANOGP8 proteins are expressed in prostate cancer cell lines, and NANOG and NANOGP8 equally contribute to the high malignant potential of prostate cancer.
Clendenning, Mark; Walsh, Michael D; Gelpi, Judith Balmana; Thibodeau, Stephen N.; Lindor, Noralane; Potter, John D.; Newcomb, Polly; LeMarchand, Loic; Haile, Robert; Gallinger, Steve; Hopper, John L.; Jenkins, Mark A.; Rosty, Christophe; Young, Joanne P.; Buchanan, Daniel D.
2013-01-01
Current screening practices have been able to identify PMS2 mutations in 78% of cases of colorectal cancer from the Colorectal Cancer Family Registry (Colon CFR) which showed solitary loss of the PMS2 protein. However the detection of large-scale deletions in the 3′ end of the PMS2 gene has not been possible due to technical difficulties associated with pseudogene sequences. Here, we utilised a recently described MLPA/long-range PCR-based approach to screen the remaining 22% (n = 16) of CRC-affected probands for mutations in the 3′ end of the PMS2 gene. No deletions encompassing any or all of exons 12 through 15 were identified; therefore, our results suggest that 3′ deletions in PMS2 are not a frequent occurrence in such families. PMID:23288611
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferrari, S.; Finelli, P.; Rocchi, M.
The human genome contains a large number of sequences related to the cDNA for High Mobility Group 1 protein (HMG1), which so far has hampered the cloning and mapping of the active HMG1 gene. We show that the human HMG1 gene contains introns, while the HMG1-related sequences do not and most likely are retrotransposed pseudogenes. We identified eight YACs from the ICI and CEPH libraries that contain the human HMG1 gene. The HMG1 gene is similar in structure to the previously characterized murine homologue and maps to human chromosome 13q12, as determined by in situ hybridization. The mouse Hmg1 gene maps to the telomeric region of murine Chromosome 5, which is syntenic to the human 13q12 band. 18 refs., 3 figs.
Folle, Ana Maite; Kitano, Eduardo S.; Lima, Analía; Gil, Magdalena; Cucher, Marcela; Mourglia-Ettlin, Gustavo; Iwai, Leo K.; Rosenzvit, Mara; Batthyány, Carlos
2017-01-01
The larva of cestodes belonging to the Echinococcus granulosus sensu lato (s.l.) complex causes cystic echinococcosis (CE). It is a globally distributed zoonosis with significant economic and public health impact. The most immunogenic and specific Echinococcus-genus antigen for human CE diagnosis is antigen B (AgB), an abundant lipoprotein of the hydatid cyst fluid (HF). The AgB protein moiety (apolipoprotein) is encoded by five genes (AgB1-AgB5), which generate mature 8 kDa proteins (AgB8/1-AgB8/5). These genes seem to be differentially expressed among Echinococcus species. Since AgB immunogenicity lies on its protein moiety, differences in AgB expression within E. granulosus s.l. complex might have diagnostic and epidemiological relevance for discriminating the contribution of distinct species to human CE. Interestingly, AgB2 was proposed as a pseudogene in E. canadensis, which is the second most common cause of human CE, but proteomic studies for verifying it have not been performed yet. Herein, we analysed the protein and lipid composition of AgB obtained from fertile HF of swine origin (E. canadensis G7 genotype). AgB apolipoproteins were identified and quantified using mass spectrometry tools. Results showed that AgB8/1 was the major protein component, representing 71% of total AgB apolipoproteins, followed by AgB8/4 (15.5%), AgB8/3 (13.2%) and AgB8/5 (0.3%). AgB8/2 was not detected. As a methodological control, a parallel analysis detected all AgB apolipoproteins in bovine fertile HF (G1/3/5 genotypes). Overall, E. canadensis AgB comprised mostly AgB8/1 together with a heterogeneous mixture of lipids, and AgB8/2 was not detected despite using high sensitivity proteomic techniques. This endorses genomic data supporting that AgB2 behaves as a pseudogene in G7 genotype. 
Since recombinant AgB8/2 has been found to be diagnostically valuable for human CE, our findings indicate that its use as antigen in immunoassays could contribute to false negative results in areas where E. canadensis circulates. Furthermore, the presence of anti-AgB8/2 antibodies in serum may represent a useful parameter to rule out E. canadensis infection when human CE is diagnosed. PMID:28045899
2009-01-01
Background Polymerase chain reaction (PCR) is very useful in many areas of molecular biology research. It is commonly observed that PCR success is critically dependent on design of an effective primer pair. Current tools for primer design do not adequately address the problem of PCR failure due to mis-priming on target-related sequences and structural variations in the genome. Methods We have developed an integrated graphical web-based application for primer design, called RExPrimer, which was written in Python language. The software uses Primer3 as the primer designing core algorithm. Locally stored sequence information and genomic variant information were hosted on MySQLv5.0 and were incorporated into RExPrimer. Results RExPrimer provides many functionalities for improved PCR primer design. Several databases, namely annotated human SNP databases, insertion/deletion (indel) polymorphisms database, pseudogene database, and structural genomic variation databases were integrated into RExPrimer, enabling an effective without-leaving-the-website validation of the resulting primers. By incorporating these databases, the primers reported by RExPrimer avoid mis-priming to related sequences (e.g. pseudogene, segmental duplication) as well as possible PCR failure because of structural polymorphisms (SNP, indel, and copy number variation (CNV)). To prevent mismatching caused by unexpected SNPs in the designed primers, in particular the 3' end (SNP-in-Primer), several SNP databases covering the broad range of population-specific SNP information are utilized to report SNPs present in the primer sequences. Population-specific SNP information also helps customize primer design for a specific population. Furthermore, RExPrimer offers a graphical user-friendly interface through the use of scalable vector graphic image that intuitively presents resulting primers along with the corresponding gene structure. 
In this study, we demonstrated the program effectiveness in successfully generating primers for strong homologous sequences. Conclusion The improvements for primer design incorporated into RExPrimer were demonstrated to be effective in designing primers for challenging PCR experiments. Integration of SNP and structural variation databases allows for robust primer design for a variety of PCR applications, irrespective of the sequence complexity in the region of interest. This software is freely available at http://www4a.biotec.or.th/rexprimer. PMID:19958502
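The SNP-in-Primer check described in this abstract can be illustrated with a minimal sketch. It is not RExPrimer's code: the `Primer` record, the `snps_in_primer` function, the 0-based coordinate convention and the 5-base 3' window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    """Hypothetical primer record; coordinates are 0-based on the forward strand."""
    name: str
    start: int    # leftmost reference position covered by the primer
    length: int
    strand: str   # "+" for a forward primer, "-" for a reverse primer


def snps_in_primer(primer, snp_positions, three_prime_window=5):
    """Return (position, near_3prime_end) for every SNP that overlaps the primer.

    The window size is an illustrative assumption, not a value from the abstract.
    """
    end = primer.start + primer.length  # exclusive end coordinate
    hits = []
    for pos in sorted(snp_positions):
        if primer.start <= pos < end:
            if primer.strand == "+":
                # Forward primer: the 3' end is the rightmost base.
                distance = end - 1 - pos
            else:
                # Reverse primer: the 3' end is the leftmost reference base.
                distance = pos - primer.start
            hits.append((pos, distance < three_prime_window))
    return hits


forward = Primer("F1", start=100, length=20, strand="+")
print(snps_in_primer(forward, {105, 118, 130}))
```

A primer with any hit flagged `True` would be the first candidate to redesign, since a mismatch close to the 3' end is the case the abstract singles out.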
Pseudogenization of a Sweet-Receptor Gene Accounts for Cats' Indifference toward Sugar
Li, Xia; Li, Weihua; Wang, Hong; Cao, Jie; Maehashi, Kenji; Huang, Liquan; Bachmanov, Alexander A; Reed, Danielle R; Legrand-Defretin, Véronique; Beauchamp, Gary K; Brand, Joseph G
2005-01-01
Although domestic cats (Felis silvestris catus) possess an otherwise functional sense of taste, they, unlike most mammals, do not prefer and may be unable to detect the sweetness of sugars. One possible explanation for this behavior is that cats lack the sensory system to taste sugars and therefore are indifferent to them. Drawing on work in mice, demonstrating that alleles of sweet-receptor genes predict low sugar intake, we examined the possibility that genes involved in the initial transduction of sweet perception might account for the indifference to sweet-tasting foods by cats. We characterized the sweet-receptor genes of domestic cats as well as those of other members of the Felidae family of obligate carnivores, tiger and cheetah. Because the mammalian sweet-taste receptor is formed by the dimerization of two proteins (T1R2 and T1R3; gene symbols Tas1r2 and Tas1r3), we identified and sequenced both genes in the cat by screening a feline genomic BAC library and by performing PCR with degenerate primers on cat genomic DNA. Gene expression was assessed by RT-PCR of taste tissue, in situ hybridization, and immunohistochemistry. The cat Tas1r3 gene shows high sequence similarity with functional Tas1r3 genes of other species. Message from Tas1r3 was detected by RT-PCR of taste tissue. In situ hybridization and immunohistochemical studies demonstrate that Tas1r3 is expressed, as expected, in taste buds. However, the cat Tas1r2 gene shows a 247-base pair microdeletion in exon 3 and stop codons in exons 4 and 6. There was no evidence of detectable mRNA from cat Tas1r2 by RT-PCR or in situ hybridization, and no evidence of protein expression by immunohistochemistry. Tas1r2 in tiger and cheetah and in six healthy adult domestic cats all show the similar deletion and stop codons. We conclude that cat Tas1r3 is an apparently functional and expressed receptor but that cat Tas1r2 is an unexpressed pseudogene. 
A functional sweet-taste receptor heteromer cannot form, and thus the cat lacks the receptor likely necessary for detection of sweet stimuli. This molecular change was very likely an important event in the evolution of the cat's carnivorous behavior. PMID:16103917
Folle, Ana Maite; Kitano, Eduardo S; Lima, Analía; Gil, Magdalena; Cucher, Marcela; Mourglia-Ettlin, Gustavo; Iwai, Leo K; Rosenzvit, Mara; Batthyány, Carlos; Ferreira, Ana María
2017-01-01
The larva of cestodes belonging to the Echinococcus granulosus sensu lato (s.l.) complex causes cystic echinococcosis (CE). It is a globally distributed zoonosis with significant economic and public health impact. The most immunogenic and specific Echinococcus-genus antigen for human CE diagnosis is antigen B (AgB), an abundant lipoprotein of the hydatid cyst fluid (HF). The AgB protein moiety (apolipoprotein) is encoded by five genes (AgB1-AgB5), which generate mature 8 kDa proteins (AgB8/1-AgB8/5). These genes seem to be differentially expressed among Echinococcus species. Since AgB immunogenicity lies on its protein moiety, differences in AgB expression within E. granulosus s.l. complex might have diagnostic and epidemiological relevance for discriminating the contribution of distinct species to human CE. Interestingly, AgB2 was proposed as a pseudogene in E. canadensis, which is the second most common cause of human CE, but proteomic studies for verifying it have not been performed yet. Herein, we analysed the protein and lipid composition of AgB obtained from fertile HF of swine origin (E. canadensis G7 genotype). AgB apolipoproteins were identified and quantified using mass spectrometry tools. Results showed that AgB8/1 was the major protein component, representing 71% of total AgB apolipoproteins, followed by AgB8/4 (15.5%), AgB8/3 (13.2%) and AgB8/5 (0.3%). AgB8/2 was not detected. As a methodological control, a parallel analysis detected all AgB apolipoproteins in bovine fertile HF (G1/3/5 genotypes). Overall, E. canadensis AgB comprised mostly AgB8/1 together with a heterogeneous mixture of lipids, and AgB8/2 was not detected despite using high sensitivity proteomic techniques. This endorses genomic data supporting that AgB2 behaves as a pseudogene in G7 genotype. 
Since recombinant AgB8/2 has been found to be diagnostically valuable for human CE, our findings indicate that its use as antigen in immunoassays could contribute to false negative results in areas where E. canadensis circulates. Furthermore, the presence of anti-AgB8/2 antibodies in serum may represent a useful parameter to rule out E. canadensis infection when human CE is diagnosed.
Castillo, Jonathan; Stueve, Theresa R.; Marconett, Crystal N.
2017-01-01
Previously thought of as junk transcripts and pseudogene remnants, long non-coding RNAs (lncRNAs) have come into their own over the last decade as an essential component of cellular activity, regulating a plethora of functions within multicellular organisms. lncRNAs are now known to participate in development, cellular homeostasis, immunological processes, and the development of disease. With the advent of next generation sequencing technology, hundreds of thousands of lncRNAs have been identified. However, movement beyond mere discovery to the understanding of molecular processes has been stymied by the complicated genomic structure, tissue-restricted expression, and diverse regulatory roles lncRNAs play. In this review, we will focus on lncRNAs involved in lung cancer, the most common cause of cancer-related death in the United States and worldwide. We will summarize their various methods of discovery, provide consensus rankings of deregulated lncRNAs in lung cancer, and describe in detail the limited functional analysis that has been undertaken so far. PMID:29113413
Evolution of the Class IV HD-Zip Gene Family in Streptophytes
Zalewski, Christopher S.; Floyd, Sandra K.; Furumizu, Chihiro; Sakakibara, Keiko; Stevenson, Dennis W.; Bowman, John L.
2013-01-01
Class IV homeodomain leucine zipper (C4HDZ) genes are plant-specific transcription factors that, based on phenotypes in Arabidopsis thaliana, play an important role in epidermal development. In this study, we sampled all major extant lineages and their closest algal relatives for C4HDZ homologs and phylogenetic analyses result in a gene tree that mirrors land plant evolution with evidence for gene duplications in many lineages, but minimal evidence for gene losses. Our analysis suggests an ancestral C4HDZ gene originated in an algal ancestor of land plants and a single ancestral gene was present in the last common ancestor of land plants. Independent gene duplications are evident within several lineages including mosses, lycophytes, euphyllophytes, seed plants, and, most notably, angiosperms. In recently evolved angiosperm paralogs, we find evidence of pseudogenization via mutations in both coding and regulatory sequences. The increasing complexity of the C4HDZ gene family through the diversification of land plants correlates to increasing complexity in epidermal characters. PMID:23894141
Gentle Masking of Low-Complexity Sequences Improves Homology Search
Frith, Martin C.
2011-01-01
Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search. PMID:22205972
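A minimal sketch of the gentle-masking rule, taking the masked-letter score to be min(0, S) with S the unmasked score. The lowercase-means-masked convention, the +1/-1 scores and the function names are assumptions for illustration, not taken from the paper or from any alignment tool.

```python
MATCH_SCORE = 1      # illustrative values only
MISMATCH_SCORE = -1


def pair_score(a, b):
    """Score one aligned letter pair; lowercase letters are treated as masked."""
    unmasked = MATCH_SCORE if a.upper() == b.upper() else MISMATCH_SCORE
    if a.islower() or b.islower():
        # Gentle masking: a masked letter can never contribute a positive
        # score, but it still pays the full penalty for a mismatch.
        return min(0, unmasked)
    return unmasked


def ungapped_score(x, y):
    """Sum of pair scores for two equal-length, already-aligned sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(x, y))


print(ungapped_score("ACGTatat", "ACGTatat"))
```

With this rule a masked tract such as `atat` cannot by itself produce a high-scoring match, while an alignment that is anchored in unmasked sequence can still extend through it at no gain.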
Unusual loss of chymosin in mammalian lineages parallels neo-natal immune transfer strategies.
Lopes-Marques, Mónica; Ruivo, Raquel; Fonseca, Elza; Teixeira, Ana; Castro, L Filipe C
2017-11-01
Gene duplication and loss are powerful drivers of evolutionary change. The role of loss in phenotypic diversification is notably illustrated by the variable enzymatic repertoire involved in vertebrate protein digestion. Among these we find the pepsin family of aspartic proteinases, including chymosin (Cmy). Previous studies demonstrated that Cmy, a neo-natal digestive pepsin, is inactivated in some primates, including humans. This pseudogenization event was hypothesized to result from the acquisition of maternal immune immunoglobulin G (IgG) transfer. By investigating 94 mammalian subgenomes we reveal an unprecedented level of Cmy erosion in placental mammals, with numerous independent events of gene loss taking place in Primates, Dermoptera, Rodentia, Cetacea and Perissodactyla. Our findings strongly suggest that the recurrent inactivation of Cmy correlates with the evolution of the passive transfer of IgG and uncovers a noteworthy case of evolutionary cross-talk between the digestive and the immune system, modulated by gene loss. Copyright © 2017 Elsevier Inc. All rights reserved.
Breaux, Breanna; Hunter, Margaret; Cruz-Schneider, Maria Paula; Sena, Leonardo; Bonde, Robert K.; Criscitiello, Michael F.
2018-01-01
The Florida manatee (Trichechus manatus latirostris) has limited diversity in the immunoglobulin heavy chain. We therefore investigated the antigen receptor loci of the other arm of the adaptive immune system: the T cell receptor. Manatees are the first species from Afrotheria, a basal eutherian superorder, to have an in-depth characterization of all T cell receptor loci. By annotating the genome and expressed transcripts, we found that each chain has distinct features that correlate with its individual function. The genomic organization also plays a role in modulating sequence conservation between species. There were extensive V subgroup synteny blocks in the TRA and TRB loci between T. m. latirostris and human. Increased genomic locus complexity correlated with increased locus synteny. We also identified evidence for a VHD pseudogene for the first time in a eutherian mammal. These findings emphasize the value of including species within this basal eutherian radiation in comparative studies.
Chromosomal arrangement of leghemoglobin genes in soybean.
Lee, J S; Brown, G G; Verma, D P
1983-01-01
A cluster of four different leghemoglobin (Lb) genes was isolated from AluI-HaeIII and EcoRI genomic libraries of soybean in a set of overlapping clones which together include 45 kilobases (kb) of contiguous DNA. These four genes, including a pseudogene, are present in the same orientation and are arranged in the order: 5'-Lba-Lbc1-Lb psi-Lbc3-3'. The intergenic regions average 2.5 kb. In addition to this main Lb locus, there are other Lb genes which do not appear to be contiguous to this locus. A sequence probably common to the 3' region of Lb loci was found flanking the Lbc3 gene. The 3' flanking region of the main Lb locus also contains a sequence that appears to be expressed more abundantly in root tissue. Another sequence which is primarily expressed in root and leaf is found 5' to two Lb loci. Overall, the main leghemoglobin locus is similar in structure to the mammalian globin gene loci. PMID:6310504
Current Research on Non-Coding Ribonucleic Acid (RNA).
Wang, Jing; Samuels, David C; Zhao, Shilin; Xiang, Yu; Zhao, Ying-Yong; Guo, Yan
2017-12-05
Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, signal recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.
Bacteriological and genetic assessment of game meat from Japanese wild boars.
Naya, Yuka; Horiuchi, Motohiro; Ishiguro, Naotaka; Shinagawa, Morikazu
2003-01-15
Bacterial tests were used to assess bacterial contamination of game meat from Japanese wild boars. The bacterial contamination of wild boar meat was less than that of domestic pork, as determined by aerobic plate counts (APC) and coliform counts. None of the meat examined in this study was contaminated by Salmonella or E. coli O-157. To detect adulteration by domestic pig meat or European wild boar meat, 46 samples of game meat sold as Japanese wild boar were examined genetically. A total of 17 samples showed genetic haplotypes of European and Asian domestic pigs in the D-loop of mitochondrial DNA (mtDNA), and 16 samples showed nuclear glucosephosphate isomerase-processed pseudogene (GPIP) genotypes of European domestic pigs. The European GPIP genotypes of these samples were confirmed by PCR-RFLP analysis. These results indicate that some game meat sold as Japanese wild boar is adulterated by cross-breeding between pigs and wild boars or by contamination with meat from domestic pigs or European wild boars.
The evolution of vertebrate Toll-like receptors
Roach, J.C.; Glusman, G.; Rowen, L.; Kaur, A.; Purcell, M.K.; Smith, K.D.; Hood, L.E.; Aderem, A.
2005-01-01
The complete sequences of Takifugu Toll-like receptor (TLR) loci and gene predictions from many draft genomes enable comprehensive molecular phylogenetic analysis. Strong selective pressure for recognition of and response to pathogen-associated molecular patterns has maintained a largely unchanging TLR recognition in all vertebrates. There are six major families of vertebrate TLRs. This repertoire is distinct from that of invertebrates. TLRs within a family recognize a general class of pathogen-associated molecular patterns. Most vertebrates have exactly one gene ortholog for each TLR family. The family including TLR1 has more species-specific adaptations than other families. A major family including TLR11 is represented in humans only by a pseudogene. Coincidental evolution plays a minor role in TLR evolution. The sequencing phase of this study produced finished genomic sequences for the 12 Takifugu rubripes TLRs. In addition, we have produced > 70 gene models, including sequences from the opossum, chicken, frog, dog, sea urchin, and sea squirt. © 2005 by The National Academy of Sciences of the USA.
Genes in one megabase of the HLA class I region
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, H.; Fan, Wu-Fang; Xu, Hongxia
1993-11-15
To define the gene content of the HLA class I region, cDNA selection was applied to three overlapping yeast artificial chromosomes (YACs) that spanned 1 megabase (Mb) of this region of the human major histocompatibility complex. These YACs extended from the region centromeric to HLA-E to the region telomeric to HLA-F. In addition to the recognized class I genes and pseudogenes and the anonymous non-class-I genes described recently by the authors and others, 20 additional anonymous cDNA clones were identified from this 1-Mb region. They also identified a long repetitive DNA element in the region between HLA-B and HLA-E. Homologues of this outside of the HLA complex. The portion of the HLA class I region represented by these YACs shows an average gene density as high as the class II and class III regions. Thus, the high gene density portion of the HLA complex is extended to more than 3 Mb.
Underlying mathematics in diversification of human olfactory receptors in different loci.
Hassan, Sk Sarif; Choudhury, Pabitra Pal; Goswami, Arunava
2013-12-01
By a conservative estimate, approximately 51-105 olfactory receptor (OR) loci are present in the human genome, occurring in clusters. These clusters are apparently unevenly spread as mosaics over 21 pairs of human chromosomes. OR gene families, which are thought to have expanded to provide recognition capability for a huge number of pure and complex odorants, form the largest known multigene family in the human genome. Recent studies have shown that 388 full-length ORs and 414 OR pseudogenes are present in these OR genomic clusters. In this paper, the authors report a classification method for all human ORs based on quantitative sequence information such as the presence of poly-strings of nucleotide bases, long-range correlation and so on. An L-System generated sequence has been taken as an input into a star-model of specific subfamily members, and the resultant sequence has been mapped to a specific OR based on the classification scheme using fractal parameters such as the Hurst exponent and fractal dimensions.
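One of the fractal parameters named in this abstract, the Hurst exponent, can be estimated from a nucleotide sequence with classical rescaled-range (R/S) analysis. The sketch below is not the authors' procedure: the purine/pyrimidine encoding, the window sizes and the function names are assumptions chosen for illustration.

```python
import math

# Assumed numeric encoding (not from the paper): purines +1, pyrimidines -1.
# Letters other than A, C, G, T raise KeyError.
ENCODING = {"A": 1.0, "G": 1.0, "C": -1.0, "T": -1.0}


def rescaled_range(chunk):
    """Return R/S for one window, or None when the window has zero variance."""
    n = len(chunk)
    mean = sum(chunk) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    if std == 0.0:
        return None
    running, cumulative = 0.0, []
    for x in chunk:
        running += x - mean          # cumulative deviation from the window mean
        cumulative.append(running)
    return (max(cumulative) - min(cumulative)) / std


def hurst_exponent(sequence, window_sizes=(8, 16, 32, 64)):
    """Slope of log(mean R/S) against log(window size), fitted by least squares."""
    values = [ENCODING[base] for base in sequence.upper()]
    points = []
    for n in window_sizes:
        ratios = []
        for start in range(0, len(values) - n + 1, n):   # non-overlapping windows
            rs = rescaled_range(values[start:start + n])
            if rs is not None:
                ratios.append(rs)
        if ratios:
            points.append((math.log(n), math.log(sum(ratios) / len(ratios))))
    if len(points) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


print(hurst_exponent("AT" * 64))
```

Under this encoding a strictly alternating purine/pyrimidine tract gives the same R/S at every window size, so the fitted slope is zero; estimates from so few window sizes are rough and are meant only to show the shape of the calculation.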
Oshima, Junko; Lee, Jennifer A; Breman, Amy M; Fernandes, Priscilla H; Babovic-Vuksanovic, Dusica; Ward, Patricia A; Wolfe, Lynne A; Eng, Christine M; Del Gaudio, Daniela
2011-07-01
Mucopolysaccharidosis type II (MPS II) is caused by mutations in the IDS gene, which encodes the lysosomal enzyme iduronate-2-sulfatase. In ∼20% of MPS II patients the disorder is caused by gross IDS structural rearrangements. We identified two male cases harboring complex rearrangements involving the IDS gene and the nearby pseudogene, IDSP1, which has been annotated as a low-copy repeat (LCR). In both cases the rearrangement included a partial deletion of IDS and an inverted insertion of the neighboring region. In silico analyses revealed the presence of repetitive elements as well as LCRs at the junctions of rearrangements. Our models illustrate two alternative consequences of rearrangements initiated by non-allelic homologous recombination of LCRs: resolution by a second recombination event (that is, Alu-mediated recombination), or resolution by non-homologous end joining repair. These complex rearrangements have the potential to be recurrent and may be present among those MPS II cases with previously uncharacterized aberrations involving IDS.
Huang, Xiaomei; Zhou, Xi; Hu, Qing; Sun, Binyu; Deng, Mingming; Qi, Xiaolong; Lü, Muhan
2018-01-28
Esophageal cancer is a malignant digestive tract cancer with high mortality. Although studies have found that esophageal cancer is involved in a complex and important gene regulation network, the pathogenesis remains unclear. The recently described long non-coding RNAs (lncRNAs) are one effective part of the gene regulation network. However, in past decades, lncRNAs were thought to be "transcript noise" or "pseudogenes" and were thus ignored. Early studies indicated that lncRNAs play pivotal roles during evolution. However, in recent years, increasing research has revealed that many lncRNAs are associated with tumorigenesis. In particular, lncRNAs may act as important elements for epigenetic regulation, transcription, post-transcriptional regulation and post-translational modification of proteins. Additionally, they may be novel biomarkers for tumors and therapeutic targets in cancer. Here, we summarize the functions of lncRNAs in esophageal cancer, with an emphasis on lncRNA-mediated regulatory mechanisms that affect the biological characteristics of esophageal cancer. Copyright © 2017 Elsevier B.V. All rights reserved.
The developmental proteome of Drosophila melanogaster
Casas-Vila, Nuria; Bluhm, Alina; Sayols, Sergi; Dinges, Nadja; Dejung, Mario; Altenhein, Tina; Kappei, Dennis; Altenhein, Benjamin; Roignant, Jean-Yves; Butter, Falk
2017-01-01
Drosophila melanogaster is a widely used genetic model organism in developmental biology. While this model organism has been intensively studied at the RNA level, a comprehensive proteomic study covering the complete life cycle is still missing. Here, we apply label-free quantitative proteomics to explore proteome remodeling across Drosophila’s life cycle, resulting in 7952 proteins, and provide a high temporal-resolved embryogenesis proteome of 5458 proteins. Our proteome data enabled us to monitor isoform-specific expression of 34 genes during development, to identify the pseudogene Cyp9f3Ψ as a protein-coding gene, and to obtain evidence of 268 small proteins. Moreover, the comparison with available transcriptomic data uncovered examples of poor correlation between mRNA and protein, underscoring the importance of proteomics to study developmental progression. Data integration of our embryogenesis proteome with tissue-specific data revealed spatial and temporal information for further functional studies of yet uncharacterized proteins. Overall, our high resolution proteomes provide a powerful resource and can be explored in detail in our interactive web interface. PMID:28381612
Breaux, Breanna; Hunter, Margaret E; Cruz-Schneider, Maria Paula; Sena, Leonardo; Bonde, Robert K; Criscitiello, Michael F
2018-08-01
The Florida manatee (Trichechus manatus latirostris) has limited diversity in the immunoglobulin heavy chain. We therefore investigated the antigen receptor loci of the other arm of the adaptive immune system: the T cell receptor. Manatees are the first species from Afrotheria, a basal eutherian superorder, to have an in-depth characterization of all T cell receptor loci. By annotating the genome and expressed transcripts, we found that each chain has distinct features that correlate with its individual function. The genomic organization also plays a role in modulating sequence conservation between species. There were extensive V subgroup synteny blocks in the TRA and TRB loci between T. m. latirostris and human. Increased genomic locus complexity correlated with increased locus synteny. We also identified evidence for a VHD pseudogene for the first time in a eutherian mammal. These findings emphasize the value of including species within this basal eutherian radiation in comparative studies. Copyright © 2018. Published by Elsevier Ltd.
Shendre, Aditi; Wiener, Howard W.; Irvin, Marguerite R.; Aouizerat, Bradley E.; Overton, Edgar T.; Lazar, Jason; Liu, Chenglong; Hodis, Howard N.; Limdi, Nita A.; Weber, Kathleen M.; Zhi, Degui; Floris-Moore, Michelle A.; Ofotokun, Ighovwerha; Qi, Qibin; Hanna, David B.; Kaplan, Robert C.
2017-01-01
Cardiovascular disease (CVD) is a major comorbidity among HIV-infected individuals. Common carotid artery intima-media thickness (cCIMT) is a valid and reliable subclinical measure of atherosclerosis and is known to predict CVD. We performed genome-wide association (GWA) and admixture analysis among 682 HIV-positive and 288 HIV-negative Black, non-Hispanic women from the Women’s Interagency HIV study (WIHS) cohort using a combined and stratified analysis approach. We found some suggestive associations but none of the SNPs reached genome-wide statistical significance in our GWAS analysis. The top GWAS SNPs were rs2280828 in the region intergenic to mediator complex subunit 30 and exostosin glycosyltransferase 1 (MED30 | EXT1) among all women, rs2907092 in the catenin delta 2 (CTNND2) gene among HIV-positive women, and rs7529733 in the region intergenic to family with sequence similarity 5, member C and regulator of G-protein signaling 18 (FAM5C | RGS18) genes among HIV-negative women. The most significant local European ancestry associations were in the region intergenic to the zinc finger and SCAN domain containing 5D gene and NADH: ubiquinone oxidoreductase complex assembly factor 1 (ZSCAN5D | NDUF1) pseudogene on chromosome 19 among all women, in the region intergenic to vomeronasal 1 receptor 6 pseudogene and zinc finger protein 845 (VN1R6P | ZNF845) gene on chromosome 19 among HIV-positive women, and in the region intergenic to the SEC23-interacting protein and phosphatidic acid phosphatase type 2 domain containing 1A (SEC23IP | PPAPDC1A) genes located on chromosome 10 among HIV-negative women. A number of previously identified SNP associations with cCIMT were also observed and included rs2572204 in the ryanodine receptor 3 (RYR3) and an admixture region in the secretion-regulating guanine nucleotide exchange factor (SERGEF) gene. 
We report several SNPs and gene regions in the GWAS and admixture analysis, some of which are common across HIV-positive and HIV-negative women as demonstrated using meta-analysis, and also across the two analytic approaches (i.e., GWA and admixture). These findings suggest that local European ancestry plays an important role in genetic associations of cCIMT among black women from WIHS along with other environmental factors that are related to CVD and may also be triggered by HIV. These findings warrant confirmation in independent samples. PMID:29206233
Inverse PCR-based method for isolating novel SINEs from genome.
Han, Yawei; Chen, Liping; Guan, Lihong; He, Shunping
2014-04-01
Short interspersed elements (SINEs) are moderately repetitive DNA sequences in eukaryotic genomes. Although eukaryotic genomes contain numerous SINE copies, isolating and identifying them by previously reported methods is difficult and laborious. In this study, inverse PCR was successfully applied to isolate SINEs from the genome of Opsariichthys bidens, an East Asian cyprinid. A group of SINEs derived from tRNA(Ala) molecules was identified and named Opsar after Opsariichthys. Opsar exhibits the characteristic SINE structure: a tRNA(Ala)-derived region at the 5' end, a tRNA-unrelated region, and an AT-rich region at the 3' end. The tRNA-derived region of Opsar shares 76% sequence similarity with the tRNA(Ala) gene, indicating that Opsar may derive from an inactive copy or pseudogene of tRNA(Ala). The reliability of the method was tested by obtaining C-SINE, Ct-SINE, and M-SINEs from the Ctenopharyngodon idellus, Megalobrama amblycephala, and Cyprinus carpio genomes. This method is simpler than previously reported ones because it omits many steps, such as preparation of probes, construction of genomic libraries, and hybridization.
Homoeologous cloning of omega-secalin gene family in a wheat 1BL/1RS translocation.
Chai, Jian Fang; Liu, Xu; Jia, Ji Zeng
2005-08-01
Wheat 1BL/1RS translocations are widely planted in China and in most of the world's wheat-producing areas for their disease resistance and high yield. However, 1BL/1RS translocations have poor bread-making quality, caused in part by a family of small monomeric proteins, omega-secalins, which are encoded by genes on 1RS. Based on the published sequence of a rye omega-secalin gene, we designed a pair of primers to cover the whole mature protein coding sequence. A major band could be amplified from 1BL/1RS translocations but not from euploid wheat. Using this primer set, we conducted PCR amplification with high-fidelity Pfu polymerase on genomic DNA and cDNA purified from the 1BL/1RS translocation line Lankao 906. Sequencing analysis indicated that this gene family contains several members of 1150 bp, 1076 bp, 1075 bp, 1052 bp and 1004 bp, including two pseudogenes and three active genes. The gene transcripts were differentially expressed in developing seeds.
Nucleolar Association and Transcriptional Inhibition through 5S rDNA in Mammals
Fedoriw, Andrew M.; Starmer, Joshua; Yee, Della; Magnuson, Terry
2012-01-01
Changes in the spatial positioning of genes within the mammalian nucleus have been associated with transcriptional differences and thus have been hypothesized as a mode of regulation. In particular, the localization of genes to the nuclear and nucleolar peripheries is associated with transcriptional repression. However, the mechanistic basis, including the pertinent cis- elements, for such associations remains largely unknown. Here, we provide evidence that demonstrates a 119 bp 5S rDNA can influence nucleolar association in mammals. We found that integration of transgenes with 5S rDNA significantly increases the association of the host region with the nucleolus, and their degree of association correlates strongly with repression of a linked reporter gene. We further show that this mechanism may be functional in endogenous contexts: pseudogenes derived from 5S rDNA show biased conservation of their internal transcription factor binding sites and, in some cases, are frequently associated with the nucleolus. These results demonstrate that 5S rDNA sequence can significantly contribute to the positioning of a locus and suggest a novel, endogenous mechanism for nuclear organization in mammals. PMID:22275877
Sheep skeletal muscle transcriptome analysis reveals muscle growth regulatory lncRNAs.
Chao, Tianle; Ji, Zhibin; Hou, Lei; Wang, Jin; Zhang, Chunlan; Wang, Guizhi; Wang, Jianmin
2018-01-01
As widely distributed domestic animals, sheep are an important species and the source of mutton. In this study, we aimed to evaluate the regulatory lncRNAs associated with muscle growth and development between high production mutton sheep (Dorper sheep and Qianhua Mutton Merino sheep) and low production mutton sheep (Small-tailed Han sheep). In total, 39 lncRNAs were found to be differentially expressed. Using co-expression analysis and functional annotation, 1,206 co-expression interactions were found between 32 lncRNAs and 369 genes, and 29 of these lncRNAs were found to be associated with muscle development, metabolism, cell proliferation and apoptosis. lncRNA-mRNA interactions revealed 6 lncRNAs as hub lncRNAs. Moreover, three lncRNAs and their associated co-expressed genes were demonstrated by cis-regulatory gene analyses, and we also found a potential regulatory relationship between the pseudogene lncRNA LOC101121401 and its parent gene FTH1. This study provides a genome-wide resolution of lncRNA and mRNA regulation in muscles from mutton sheep.
Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T
2003-08-14
The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a species closely related to L. semifasciata, by PCR. A putative matrix attachment region (MAR) sequence was found in intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K; Doyle, C Kuyler; Lykidis, A
2006-01-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
Extracellular RNA profiles with human age.
Dluzen, Douglas F; Noren Hooten, Nicole; De, Supriyo; Wood, William H; Zhang, Yongqing; Becker, Kevin G; Zonderman, Alan B; Tanaka, Toshiko; Ferrucci, Luigi; Evans, Michele K
2018-05-24
Circulating extracellular RNAs (exRNAs) are potential biomarkers of disease. We thus hypothesized that age-related changes in exRNAs can identify age-related processes. We profiled both large and small RNAs in human serum to investigate changes associated with normal aging. exRNA was sequenced in 13 young (30-32 years) and 10 old (80-85 years) African American women to identify all RNA transcripts present in serum. We identified age-related differences in several RNA biotypes, including mitochondrial transfer RNAs, mitochondrial ribosomal RNA, and unprocessed pseudogenes. Age-related differences in unique RNA transcripts were further validated in an expanded cohort. Pathway analysis revealed that EIF2 signaling, oxidative phosphorylation, and mitochondrial dysfunction were among the top pathways shared between young and old. Protein interaction networks revealed distinct clusters of functionally-related protein-coding genes in both age groups. These data provide timely and relevant insight into the exRNA repertoire in serum and its change with aging. Published 2018. This article is a U.S. Government work and is in the public domain in the USA. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.
Comparative genomics reveals convergent evolution between the bamboo-eating giant and red pandas.
Hu, Yibo; Wu, Qi; Ma, Shuai; Ma, Tianxiao; Shan, Lei; Wang, Xiao; Nie, Yonggang; Ning, Zemin; Yan, Li; Xiu, Yunfang; Wei, Fuwen
2017-01-31
Phenotypic convergence between distantly related taxa often mirrors adaptation to similar selective pressures and may be driven by genetic convergence. The giant panda (Ailuropoda melanoleuca) and red panda (Ailurus fulgens) belong to different families in the order Carnivora, but both have evolved a specialized bamboo diet and adaptive pseudothumb, representing a classic model of convergent evolution. However, the genetic bases of these morphological and physiological convergences remain unknown. Through de novo sequencing the red panda genome and improving the giant panda genome assembly with added data, we identified genomic signatures of convergent evolution. Limb development genes DYNC2H1 and PCNT have undergone adaptive convergence and may be important candidate genes for pseudothumb development. As evolutionary responses to a bamboo diet, adaptive convergence has occurred in genes involved in the digestion and utilization of bamboo nutrients such as essential amino acids, fatty acids, and vitamins. Similarly, the umami taste receptor gene TAS1R1 has been pseudogenized in both pandas. These findings offer insights into genetic convergence mechanisms underlying phenotypic convergence and adaptation to a specialized bamboo diet.
Toxin gene determination and evolution in scorpaenoid fish.
Chuang, Po-Shun; Shiao, Jen-Chieh
2014-09-01
In this study, we determine the toxin genes from both cDNA and genomic DNA of four scorpaenoid fish and reconstruct their evolutionary relationship. The deduced protein sequences of the two toxin subunits in Sebastapistes strongia, Scorpaenopsis oxycephala, and Sebastiscus marmoratus are about 700 amino acids long, similar to the sizes of the stonefish (Synanceia horrida and Synanceia verrucosa) and lionfish (Pterois antennata and Pterois volitans) toxins previously published. The intron positions are highly conserved among these species, which indicates the applicability of gene finding by using genomic DNA template. The phylogenetic analysis shows that the two toxin subunits were duplicated prior to the speciation of Scorpaenoidei. The precedence of the gene duplication over speciation indicates that the toxin genes may be common to the whole family of Scorpaeniform. Furthermore, one additional toxin gene has been determined in the genomic DNA of Dendrochirus zebra. The phylogenetic analysis suggests that an additional gene duplication occurred before the speciation of the lionfish (Pteroinae) and a pseudogene may be generally present in the lineage of lionfish. Copyright © 2014 Elsevier Ltd. All rights reserved.
Genes on B chromosomes: old questions revisited with new tools.
Banaei-Moghaddam, Ali M; Martis, Mihaela M; Macas, Jiří; Gundlach, Heidrun; Himmelbach, Axel; Altschmied, Lothar; Mayer, Klaus F X; Houben, Andreas
2015-01-01
B chromosomes are supernumerary dispensable parts of the karyotype which appear in some individuals of some populations in some species. Often, they have been considered as 'junk DNA' or genomic parasites without functional genes. Due to recent advances in sequencing technologies, it became possible to investigate their DNA composition, transcriptional activity and effects on the host transcriptome profile in detail. Here, we review the most recent findings regarding the gene content of B chromosomes and their transcriptional activities and discuss these findings in the context of comparable biological phenomena, like sex chromosomes, aneuploidy and pseudogenes. Recent data suggest that B chromosomes carry transcriptionally active genic sequences which could affect the transcriptome profile of their host genome. These findings are gradually changing our view that B chromosomes are solely genetically inert selfish elements without any functional genes. On the one hand, this could partly explain the deleterious effects which are associated with their presence. On the other hand, it makes B chromosomes a useful model for studying regulatory mechanisms of duplicated genes and their evolutionary consequences. Copyright © 2014 Elsevier B.V. All rights reserved.
Morley, Laura; McNally, Alan; Paszkiewicz, Konrad; Corander, Jukka; Méric, Guillaume; Sheppard, Samuel K.; Blom, Jochen
2015-01-01
Campylobacter jejuni is a highly diverse species of bacteria commonly associated with infectious intestinal disease of humans and zoonotic carriage in poultry, cattle, pigs, and other animals. The species contains a large number of distinct clonal complexes that vary from host generalist lineages commonly found in poultry, livestock, and human disease cases to host-adapted specialized lineages primarily associated with livestock or poultry. Here, we present novel data on the ST403 clonal complex of C. jejuni, a lineage that has not been reported in avian hosts. Our data show that the lineage exhibits a distinctive pattern of intralineage recombination that is accompanied by the presence of lineage-specific restriction-modification systems. Furthermore, we show that the ST403 complex has undergone gene decay at a number of loci. Our data provide a putative link between the lack of association with avian hosts of C. jejuni ST403 and both gene gain and gene loss through nonsense mutations in coding sequences of genes, resulting in pseudogene formation. PMID:25795671
Pangolin genomes and the evolution of mammalian scales and immunity
Rayko, Mike; Tan, Tze King; Hari, Ranjeev; Komissarov, Aleksey; Wee, Wei Yee; Yurchenko, Andrey A.; Kliver, Sergey; Tamazian, Gaik; Antunes, Agostinho; Wilson, Richard K.; Warren, Wesley C.; Koepfli, Klaus-Peter; Minx, Patrick; Krasheninnikova, Ksenia; Kotze, Antoinette; Dalton, Desire L.; Vermaak, Elaine; Paterson, Ian C.; Dobrynin, Pavel; Sitam, Frankie Thomas; Rovie-Ryan, Jeffrine J.; Johnson, Warren E.; Yusoff, Aini Mohamed; Luo, Shu-Jin; Karuppannan, Kayal Vizi; Fang, Gang; Zheng, Deyou; Gerstein, Mark B.; Lipovich, Leonard; O'Brien, Stephen J.; Wong, Guat Jah
2016-01-01
Pangolins, unique mammals with scales over most of their body, no teeth, poor vision, and an acute olfactory system, comprise the only placental order (Pholidota) without a whole-genome map. To investigate pangolin biology and evolution, we developed genome assemblies of the Malayan (Manis javanica) and Chinese (M. pentadactyla) pangolins. Strikingly, we found that interferon epsilon (IFNE), exclusively expressed in epithelial cells and important in skin and mucosal immunity, is pseudogenized in all African and Asian pangolin species that we examined, perhaps impacting resistance to infection. We propose that scale development was an innovation that provided protection against injuries or stress and reduced pangolin vulnerability to infection. Further evidence of specialized adaptations was evident from positively selected genes involving immunity-related pathways, inflammation, energy storage and metabolism, muscular and nervous systems, and scale/hair development. Olfactory receptor gene families are significantly expanded in pangolins, reflecting their well-developed olfaction system. This study provides insights into mammalian adaptation and functional diversification, new research tools and questions, and perhaps a new natural IFNE-deficient animal model for studying mammalian immunity. PMID:27510566
Limited mitogenomic degradation in response to a parasitic lifestyle in Orobanchaceae
Fan, Weishu; Zhu, Andan; Kozaczek, Melisa; Shah, Neethu; Pabón-Mora, Natalia; González, Favio; Mower, Jeffrey P.
2016-01-01
In parasitic plants, the reduction in plastid genome (plastome) size and content is driven predominantly by the loss of photosynthetic genes. The first completed mitochondrial genomes (mitogenomes) from parasitic mistletoes also exhibit significant degradation, but the generality of this observation for other parasitic plants is unclear. We sequenced the complete mitogenome and plastome of the hemiparasite Castilleja paramensis (Orobanchaceae) and compared them with additional holoparasitic, hemiparasitic and nonparasitic species from Orobanchaceae. Comparative mitogenomic analysis revealed minimal gene loss among the seven Orobanchaceae species, indicating the retention of typical mitochondrial function among Orobanchaceae species. Phylogenetic analysis demonstrated that the mobile cox1 intron was acquired vertically from a nonparasitic ancestor, arguing against a role for Orobanchaceae parasites in the horizontal acquisition or distribution of this intron. The C. paramensis plastome has retained nearly all genes except for the recent pseudogenization of four subunits of the NAD(P)H dehydrogenase complex, indicating a very early stage of plastome degradation. These results lend support to the notion that loss of ndh gene function is the first step of plastome degradation in the transition to a parasitic lifestyle. PMID:27808159
Taylor, Robert W.; Taylor, Geoffrey A.; Durham, Steve E.; Turnbull, Douglass M.
2001-01-01
Studies of single cells have previously shown intracellular clonal expansion of mitochondrial DNA (mtDNA) mutations to levels that can cause a focal cytochrome c oxidase (COX) defect. Whilst techniques are available to study mtDNA rearrangements at the level of the single cell, recent interest has focused on the possible role of somatic mtDNA point mutations in ageing, neurodegenerative disease and cancer. We have therefore developed a method that permits the reliable determination of the entire mtDNA sequence from single cells without amplifying contaminating, nuclear-embedded pseudogenes. Sequencing and PCR–RFLP analyses of individual COX-negative muscle fibres from a patient with a previously described heteroplasmic COX II (T7587C) mutation indicate that mutant loads as low as 30% can be reliably detected by sequencing. This technique will be particularly useful in identifying the mtDNA mutational spectra in age-related COX-negative cells and will increase our understanding of the pathogenetic mechanisms by which they occur. PMID:11470889
Noncoding RNA:RNA Regulatory Networks in Cancer
Chan, Jia Jia; Tay, Yvonne
2018-01-01
Noncoding RNAs (ncRNAs) constitute the majority of the human transcribed genome. This largest class of RNA transcripts plays diverse roles in a multitude of cellular processes, and has been implicated in many pathological conditions, especially cancer. The different subclasses of ncRNAs include microRNAs, a class of short ncRNAs; and a variety of long ncRNAs (lncRNAs), such as lincRNAs, antisense RNAs, pseudogenes, and circular RNAs. Many studies have demonstrated the involvement of these ncRNAs in competitive regulatory interactions, known as competing endogenous RNA (ceRNA) networks, whereby lncRNAs can act as microRNA decoys to modulate gene expression. These interactions are often interconnected, thus aberrant expression of any network component could derail the complex regulatory circuitry, culminating in cancer development and progression. Recent integrative analyses have provided evidence that new computational platforms and experimental approaches can be harnessed together to distinguish key ceRNA interactions in specific cancers, which could facilitate the identification of robust biomarkers and therapeutic targets, and hence, more effective cancer therapies and better patient outcome and survival. PMID:29702599
Organization of the human ζ-crystallin/quinone reductase gene (CRYZ)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gonzalez, P.; Rao, P.V.; Zigler, J.S. Jr.
1994-05-15
ζ-Crystallin is a protein highly expressed in the lens of guinea pigs and camels, where it comprises about 10% of the total soluble protein. It has recently been characterized as a novel quinone oxidoreductase present in a variety of mammalian tissues. The authors report here the isolation and characterization of the human ζ-crystallin gene (CRYZ) and its processed pseudogene. The functional gene is composed of nine exons and spans about 20 kb. The 5′-flanking region of the gene is rich in G and C (58%) and lacks TATA and CAAT boxes. Previous analysis of the guinea pig gene revealed the presence of two different promoters, one responsible for the high lens-specific expression and the other for expression at the enzymatic level in numerous tissues. Comparative analysis with the guinea pig gene shows that a region of ~2.5 kb that includes the promoter responsible for the high expression in the lens in guinea pig is not present in the human gene. 34 refs., 6 figs., 1 tab.
Comparative genomics reveals convergent evolution between the bamboo-eating giant and red pandas
Hu, Yibo; Wu, Qi; Ma, Shuai; Ma, Tianxiao; Shan, Lei; Wang, Xiao; Nie, Yonggang; Ning, Zemin; Yan, Li; Xiu, Yunfang; Wei, Fuwen
2017-01-01
Phenotypic convergence between distantly related taxa often mirrors adaptation to similar selective pressures and may be driven by genetic convergence. The giant panda (Ailuropoda melanoleuca) and red panda (Ailurus fulgens) belong to different families in the order Carnivora, but both have evolved a specialized bamboo diet and adaptive pseudothumb, representing a classic model of convergent evolution. However, the genetic bases of these morphological and physiological convergences remain unknown. Through de novo sequencing the red panda genome and improving the giant panda genome assembly with added data, we identified genomic signatures of convergent evolution. Limb development genes DYNC2H1 and PCNT have undergone adaptive convergence and may be important candidate genes for pseudothumb development. As evolutionary responses to a bamboo diet, adaptive convergence has occurred in genes involved in the digestion and utilization of bamboo nutrients such as essential amino acids, fatty acids, and vitamins. Similarly, the umami taste receptor gene TAS1R1 has been pseudogenized in both pandas. These findings offer insights into genetic convergence mechanisms underlying phenotypic convergence and adaptation to a specialized bamboo diet. PMID:28096377
Looking back on a decade of barcoding crustaceans
Raupach, Michael J.; Radulovici, Adriana E.
2015-01-01
Species identification represents a pivotal component for large-scale biodiversity studies and conservation planning but represents a challenge for many taxa when using morphological traits only. Consequently, alternative identification methods based on molecular markers have been proposed. In this context, DNA barcoding has become a popular and accepted method for the identification of unknown animals across all life stages by comparison to a reference library. In this review we examine the progress of barcoding studies for the Crustacea using the Web of Science database from 2003 to 2014. All references were classified in terms of taxonomy covered, subject area (identification/library, genetic variability, species descriptions, phylogenetics, methods, pseudogenes/numts), habitat, geographical area, authors, journals, citations, and the use of the Barcode of Life Data Systems (BOLD). Our analysis revealed a total number of 164 barcoding studies for crustaceans with a preference for malacostracan crustaceans, in particular Decapoda, and for building reference libraries in order to identify organisms. So far, BOLD has not established itself as a popular informatics platform among carcinologists, although it offers many advantages for standardized data storage, analyses and publication. PMID:26798245
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.
2005-09-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, and 17 putative pseudogenes, and a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified including a small group of proteins (12) with tandem repeats and another with eukaryotic-like ankyrin domains (7).
Anazi, Shamsa; Alshamekh, Shomoukh; Alkuraya, Fowzan S.
2013-01-01
The use of autozygosity as a mapping tool in the search for autosomal recessive disease genes is well established. We hypothesized that autozygosity not only unmasks the recessiveness of disease-causing variants, but can also reveal natural knockouts of genes with less obvious phenotypic consequences. To test this hypothesis, we exome sequenced 77 well-phenotyped individuals born to first-cousin parents in search of genes that are biallelically inactivated. Using a very conservative estimate, we show that each of these individuals carries biallelic inactivation of 22.8 genes on average. For many of the 169 genes that appear to be biallelically inactivated, available data support involvement in modulating metabolism, immunity, perception, external appearance and other phenotypic aspects, and these genes therefore appear to contribute to human phenotypic variation. Other genes with biallelic inactivation may contribute through as yet unknown mechanisms or may be on their way to conversion into pseudogenes due to true recent dispensability. We conclude that sequencing the autozygome is an efficient way to map the contribution of genes to human phenotypic variation that goes beyond the classical definition of disease. PMID:24367280
Sheep skeletal muscle transcriptome analysis reveals muscle growth regulatory lncRNAs
Chao, Tianle; Ji, Zhibin; Hou, Lei; Wang, Jin; Zhang, Chunlan
2018-01-01
As widely distributed domestic animals, sheep are an important species and the source of mutton. In this study, we aimed to evaluate the regulatory lncRNAs associated with muscle growth and development between high production mutton sheep (Dorper sheep and Qianhua Mutton Merino sheep) and low production mutton sheep (Small-tailed Han sheep). In total, 39 lncRNAs were found to be differentially expressed. Using co-expression analysis and functional annotation, 1,206 co-expression interactions were found between 32 lncRNAs and 369 genes, and 29 of these lncRNAs were found to be associated with muscle development, metabolism, cell proliferation and apoptosis. lncRNA–mRNA interactions revealed 6 lncRNAs as hub lncRNAs. Moreover, three lncRNAs and their associated co-expressed genes were demonstrated by cis-regulatory gene analyses, and we also found a potential regulatory relationship between the pseudogene lncRNA LOC101121401 and its parent gene FTH1. This study provides a genome-wide resolution of lncRNA and mRNA regulation in muscles from mutton sheep. PMID:29666768
Kolesnikov, N N; Elisafenko, E A
2010-10-01
After the radiation of primates and rodents, the evolution of X-chromosome inactivation centers in human and mouse (XIC/Xic) followed two different directions. Human XIC followed the pathway towards transposon accumulation (the repeat proportion in the center constitutes 72%), especially LINEs, which prevail in the center. In contrast, mouse Xic eliminated long repeats and accumulated species-specific SINEs (the repeat proportion in the center constitutes 35%). The mechanism underlying inactivation of one of the X chromosomes in female mammals appeared on the basis of transposons. The key gene of the inactivation process, XIST/Xist, similarly to other long noncoding RNA genes, like TSIX/Tsix, JPX/Jpx, and FTX/Ftx, was formed with the involvement of different transposon sequences. Furthermore, two clusters of microRNA genes from the inactivation center originated from L2 [1]. In mouse, one of these clusters has been preserved in the form of microRNA pseudogenes. Thus, long ncRNA genes and microRNAs appeared during the period of transposable element expansion in this locus, 140 to 105 Myr ago, after the radiation of marsupial and placental mammal lineages.
A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements
Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.
2008-01-01
X-chromosome inactivation, which occurs in female eutherian mammals is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons including those with simple tandem repeats detectable in their structure have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625
Bacillus anthracis genome organization in light of whole transcriptome sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.
2010-03-22
Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.
Genomic and Genetic Evidence for the Loss of Umami Taste in Bats
Zhao, Huabin; Xu, Dong; Zhang, Shuyi; Zhang, Jianzhi
2012-01-01
Umami taste is responsible for sensing monosodium glutamate, nucleotide enhancers, and other amino acids that are appetitive to vertebrates and is one of the five basic tastes that also include sour, salty, sweet, and bitter. To study how ecological factors, especially diets, impact the evolution of the umami taste, we examined the umami taste receptor gene Tas1r1 in a phylogenetically diverse group of bats including fruit eaters, insect eaters, and blood feeders. We found that Tas1r1 is absent, unamplifiable, or pseudogenized in each of the 31 species examined, including the genome sequences of two species, suggesting the loss of the umami taste in most, if not all, bats regardless of their food preferences. Most strikingly, vampire bats have also lost the sweet taste receptor gene Tas1r2 and the gene required for both umami and sweet tastes (Tas1r3), being the first known mammalian group to lack two of the five tastes. The puzzling absence of the umami taste in bats calls for a better understanding of the roles that this taste plays in the daily life of vertebrates. PMID:22117084
Plastome Evolution in Hemiparasitic Mistletoes
Petersen, Gitte; Cuenca, Argelia; Seberg, Ole
2015-01-01
Santalales is an order of plants consisting almost entirely of parasites. Some, such as Osyris, are facultative root parasites whereas others, such as Viscum, are obligate stem parasitic mistletoes. Here, we report the complete plastome sequences of one species of Osyris and three species of Viscum, and we investigate the evolutionary aspects of structural changes and changes in gene content in relation to parasitism. Compared with typical angiosperm plastomes, the four Santalales plastomes are all reduced in size (10–22% compared with Vitis), and they have experienced rearrangements, mostly but not exclusively in the border areas of the inverted repeats. Additionally, a number of protein-coding genes (matK, infA, ccsA, rpl33, and all 11 ndh genes) as well as two transfer RNA genes (trnG-UCC and trnV-UAC) have been pseudogenized or completely lost. Most of the remaining plastid genes have a significantly changed selection pattern compared with other dicots, and the relaxed selection of photosynthesis genes is noteworthy. Although gene loss obviously reduces plastome size, intergenic regions were also shortened. As plastome modifications are generally most prominent in Viscum, they are most likely correlated with the increased nutritional dependence on the host compared with Osyris. PMID:26319577
Intra-isolate genome variation in arbuscular mycorrhizal fungi persists in the transcriptome.
Boon, E; Zimmerman, E; Lang, B F; Hijri, M
2010-07-01
Arbuscular mycorrhizal fungi (AMF) are heterokaryotes with an unusual genetic makeup. Substantial genetic variation occurs among nuclei within a single mycelium or isolate. AMF reproduce through spores that contain varying fractions of this heterogeneous population of nuclei. It is not clear whether this genetic variation on the genome level actually contributes to the AMF phenotype. To investigate the extent to which polymorphisms in nuclear genes are transcribed, we analysed the intra-isolate genomic and cDNA sequence variation of two genes, the large subunit ribosomal RNA (LSU rDNA) of Glomus sp. DAOM-197198 (previously known as G. intraradices) and the POL1-like sequence (PLS) of Glomus etunicatum. For both genes, we find high sequence variation at the genome and transcriptome level. Reconstruction of LSU rDNA secondary structure shows that all variants are functional. Patterns of PLS sequence polymorphism indicate that there is one functional gene copy, PLS2, which is preferentially transcribed, and one gene copy, PLS1, which is a pseudogene. This is the first study that investigates AMF intra-isolate variation at the transcriptome level. In conclusion, it is possible that, in AMF, multiple nuclear genomes contribute to a single phenotype.
Ulusal, SD; Gürkan, H; Atlı, E; Özal, SA; Çiftdemir, M; Tozkır, H; Karal, Y; Güçlü, H; Eker, D; Görker, I
2017-01-01
Neurofibromatosis Type I (NF1) is a multisystemic autosomal dominant neurocutaneous disorder predisposing patients to have benign and/or malignant lesions predominantly of the skin, nervous system and bone. Loss-of-function mutations or deletions of the NF1 gene are responsible for NF1 disease. The involvement of various pathogenic variants, the size of the gene and the presence of pseudogenes make the gene difficult to analyze. We aimed to report the results of 2 years of multiplex ligation-dependent probe amplification (MLPA) and next generation sequencing (NGS) for genetic diagnosis of NF1 applied at our genetic diagnosis center. The MLPA, semiconductor sequencing and Sanger sequencing were performed in genomic DNA samples from 24 unrelated patients and their affected family members referred to our center suspected of having NF1. In total, three novel and 12 known pathogenic variants and a whole gene deletion were determined. We suggest that next generation sequencing is a practical tool for genetic analysis of NF1. Deletion/duplication analysis with MLPA may also be helpful for patients who are clinically diagnosed with NF1 but do not have a detectable mutation in NGS. PMID:28924536
Decoding the similarities and differences among mycobacterial species
Vedithi, Sundeep Chaitanya; Blundell, Tom L.
2017-01-01
Mycobacteriaceae comprises pathogenic species such as Mycobacterium tuberculosis, M. leprae and M. abscessus, as well as non-pathogenic species, for example, M. smegmatis and M. thermoresistibile. Genome comparison and annotation studies provide insights into genome evolutionary relatedness, identify unique and pathogenicity-related genes in each species, and explore new targets that could be used for developing new diagnostics and therapeutics. Here, we present a comparative analysis of ten mycobacterial genomes with the objective of identifying similarities and differences between pathogenic and non-pathogenic species. We identified 1080 core orthologous clusters that were enriched in proteins involved in amino acid and purine/pyrimidine biosynthetic pathways, DNA-related processes (replication, transcription, recombination and repair), RNA-methylation and modification, and cell-wall polysaccharide biosynthetic pathways. For their pathogenicity and survival in the host cell, pathogenic species have gained specific sets of genes involved in repair and protection of their genomic DNA. M. leprae is of special interest owing to its smallest genome (1600 genes and ~1300 pseudogenes), yet poor genome annotation. More than 75% of the pseudogenes were found to have a functional ortholog in the other mycobacterial genomes and belong to protein families such as transferases, oxidoreductases and hydrolases. PMID:28854187
Chiang, Shih-Chieh; Veldhuizen, Edwin J.A.; Barnes, Frances A.; Craven, C. Jeremy; Haagsman, Henk P.; Bingle, Colin D.
2011-01-01
Palate, lung and nasal epithelial clone (PLUNC) proteins are structural homologues to the innate defence molecules LPS-binding protein (LBP) and bactericidal/permeability-increasing protein (BPI). PLUNCs make up the largest portion of the wider BPI/LBP/PLUNC-like protein family and are amongst the most rapidly evolving mammalian genes. In this study we systematically identified and characterised BPI/LBP/PLUNC-like protein-encoding genes in the chicken genome. We identified eleven complete genes (and a pseudogene). Five of them are clustered on a >50 kb locus on chromosome 20, immediately adjacent to BPI. In addition to BPI, we have identified presumptive orthologues LPLUNCs 2, 3, 4 and 6, and BPIL-2. We find no evidence for the existence of single domain containing proteins in birds. Strikingly our analysis also suggests that there is no LBP orthologue in chicken. This observation may in part account for the relative resistance to LPS toxicity observed in birds. Our results indicate significant differences between the avian and mammalian repertoires of BPI/LBP/PLUNC-like genes at the genomic and transcriptional levels and provide a framework for further functional analyses of this gene family in chickens. PMID:20959152
Mandelker, Diana; Schmidt, Ryan J; Ankala, Arunkanth; McDonald Gibson, Kristin; Bowser, Mark; Sharma, Himanshu; Duffy, Elizabeth; Hegde, Madhuri; Santani, Avni; Lebo, Matthew; Funke, Birgit
2016-12-01
Next-generation sequencing (NGS) is now routinely used to interrogate large sets of genes in a diagnostic setting. Regions of high sequence homology continue to be a major challenge for short-read technologies and can lead to false-positive and false-negative diagnostic errors. At the scale of whole-exome sequencing (WES), laboratories may be limited in their knowledge of genes and regions that pose technical hurdles due to high homology. We have created an exome-wide resource that catalogs highly homologous regions that is tailored toward diagnostic applications. This resource was developed using a mappability-based approach tailored to current Sanger and NGS protocols. Gene-level and exon-level lists delineate regions that are difficult or impossible to analyze via standard NGS. These regions are ranked by degree of affectedness, annotated for medical relevance, and classified by the type of homology (within-gene, different functional gene, known pseudogene, uncharacterized noncoding region). Additionally, we provide a list of exons that cannot be analyzed by short-amplicon Sanger sequencing. This resource can help guide clinical test design, supplemental assay implementation, and results interpretation in the context of high homology. Genet Med 18(12), 1282-1289.
Massive gene loss in mistletoe (Viscum, Viscaceae) mitochondria
Petersen, G.; Cuenca, A.; Møller, I. M.; Seberg, O.
2015-01-01
Parasitism is a successful survival strategy across all kingdoms and has evolved repeatedly in angiosperms. Parasitic plants obtain nutrients from other plants and some are agricultural pests. Obligate parasites, which cannot complete their lifecycle without a host, may lack functional photosystems (holoparasites), or have retained photosynthesis (hemiparasites). Plastid genomes are often reduced in parasites, but complete mitochondrial genomes have not been sequenced and their mitochondrial respiratory capacities are largely unknown. The hemiparasitic European mistletoe (Viscum album), known from folklore and postulated therapeutic properties, is a pest in plantations and forestry. We compare the mitochondrial genomes of three Viscum species based on the complete mitochondrial genome of V. album, the first from a parasitic plant. We show that mitochondrial genes encoding proteins of all respiratory complexes are lacking or pseudogenized, raising several questions relevant to all parasitic plants: Are any mitochondrial gene functions essential? Do any genes need to be located in the mitochondrial genome or can they all be transferred to the nucleus? Can parasitic plants survive without oxidative phosphorylation by using alternative respiratory pathways? More generally, our study is a step towards understanding how host- and self-perception, host integration and nucleic acid transfer have modified ancestral mitochondrial genomes. PMID:26625950
Regha, Kakkad; Sloane, Mathew A.; Huang, Ru; Pauler, Florian M.; Warczok, Katarzyna E.; Melikant, Balázs; Radolf, Martin; Martens, Joost H.A.; Schotta, Gunnar; Jenuwein, Thomas; Barlow, Denise P.
2010-01-01
The Igf2r imprinted cluster is an epigenetic silencing model in which expression of a ncRNA silences multiple genes in cis. Here, we map a 250 kb region in mouse embryonic fibroblast cells to show that histone modifications associated with expressed and silent genes are mutually exclusive and localized to discrete regions. Expressed genes were modified at promoter regions by H3K4me3 + H3K4me2 + H3K9Ac and on putative regulatory elements flanking active promoters by H3K4me2 + H3K9Ac. Silent genes showed two types of nonoverlapping profile. One type spread over large domains of tissue-specific silent genes and contained H3K27me3 alone. A second type formed localized foci on silent imprinted gene promoters and a nonexpressed pseudogene and contained H3K9me3 + H4K20me3 ± HP1. Thus, mammalian chromosome arms contain active chromatin interspersed with repressive chromatin resembling the type of heterochromatin previously considered a feature of centromeres, telomeres, and the inactive X chromosome. PMID:17679087
Gao, Ling; Ren, Wenhao; Zhang, Linmei; Li, Shaoming; Kong, Xinjuan; Zhang, Hao; Dong, Jianwei; Cai, Guangfeng; Jin, Changxiong; Zheng, Danqing; Zhi, Keqian
2017-04-01
PTENp1, a non-coding RNA (ncRNA) pseudogene, is involved in oral squamous cell carcinoma (OSCC). The precise effects mediated by PTENp1 transcripts within intricate regulatory networks involving molecular interactions with ancestral gene PTEN and tumorigenicity in OSCC remain unclear. Here, we found that PTENp1 was aberrantly expressed in OSCC. There was a positive correlation between the expression levels of PTENp1 and PTEN. Further, we showed that PTENp1 acted as a competing endogenous RNA that protects PTEN transcripts from being inhibited by miR-21, and consequently inhibited proliferation and colony formation and triggered S-G2/M cell cycle arrest through the AKT pathway. Also, the homogeneous relationship between expression of PTENp1 and PTEN was confirmed in OSCC tumor xenografts. Finally, low expression of PTENp1 and PTEN was negatively associated with histological differentiation and OSCC prognosis. The present work provided the first evidence for the extraordinary crosstalk among PTENp1, PTEN, and miR-21, and rendered a new light on the treatment of OSCC. © 2016 Wiley Periodicals, Inc.
Birth of a new gene on the Y chromosome of Drosophila melanogaster
Carvalho, Antonio Bernardo; Vicoso, Beatriz; Russo, Claudia A. M.; Swenor, Bonnielin; Clark, Andrew G.
2015-01-01
Contrary to the pattern seen in mammalian sex chromosomes, where most Y-linked genes have X-linked homologs, the Drosophila X and Y chromosomes appear to be unrelated. Most of the Y-linked genes have autosomal paralogs, so autosome-to-Y transposition must be the main source of Drosophila Y-linked genes. Here we show how these genes were acquired. We found a previously unidentified gene (flagrante delicto Y, FDY) that originated from a recent duplication of the autosomal gene vig2 to the Y chromosome of Drosophila melanogaster. Four contiguous genes were duplicated along with vig2, but they became pseudogenes through the accumulation of deletions and transposable element insertions, whereas FDY remained functional, acquired testis-specific expression, and now accounts for ∼20% of the vig2-like mRNA in testis. FDY is absent in the closest relatives of D. melanogaster, and DNA sequence divergence indicates that the duplication to the Y chromosome occurred ∼2 million years ago. Thus, FDY provides a snapshot of the early stages of the establishment of a Y-linked gene and demonstrates how the Drosophila Y has been accumulating autosomal genes. PMID:26385968
Current knowledge of microRNA-mediated regulation of drug metabolism in humans.
Nakano, Masataka; Nakajima, Miki
2018-05-01
Understanding the factors causing inter- and intra-individual differences in drug metabolism potencies is required for the practice of personalized or precision medicine, as well as for the promotion of efficient drug development. The expression of drug-metabolizing enzymes is controlled by transcriptional regulation by nuclear receptors and transcriptional factors, epigenetic regulation, such as DNA methylation and histone acetylation, and post-translational modification. In addition to such regulation mechanisms, recent studies revealed that microRNAs (miRNAs), endogenous ~22-nucleotide non-coding RNAs that regulate gene expression through the translational repression and degradation of mRNAs, significantly contribute to post-transcriptional regulation of drug-metabolizing enzymes. Areas covered: This review summarizes the current knowledge regarding miRNA-dependent regulation of drug-metabolizing enzymes and transcriptional factors and its physiological and clinical significance. We also describe recent advances in miRNA-dependent regulation research, showing that the presence of pseudogenes, single-nucleotide polymorphisms, and RNA editing affects miRNA targeting. Expert opinion: It is an unwavering fact that miRNAs are critical factors causing inter- and intra-individual differences in the expression of drug-metabolizing enzymes. Consideration of miRNA-dependent regulation would be a helpful tool for optimizing personalized and precision medicine.
The problems and promise of DNA barcodes for species diagnosis of primate biomaterials
Lorenz, Joseph G; Jackson, Whitney E; Beck, Jeanne C; Hanner, Robert
2005-01-01
The Integrated Primate Biomaterials and Information Resource (www.IPBIR.org) provides essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing DNA and RNA derived from primate cell cultures. The IPBIR uses mitochondrial cytochrome c oxidase subunit I sequences to verify the identity of samples for quality control purposes in the accession, cell culture, DNA extraction processes and prior to shipping to end users. As a result, IPBIR is accumulating a database of ‘DNA barcodes’ for many species of primates. However, this quality control process is complicated by taxon specific patterns of ‘universal primer’ failure, as well as the amplification or co-amplification of nuclear pseudogenes of mitochondrial origins. To overcome these difficulties, taxon specific primers have been developed, and reverse transcriptase PCR is utilized to exclude these extraneous sequences from amplification. DNA barcoding of primates has applications to conservation and law enforcement. Depositing barcode sequences in a public database, along with primer sequences, trace files and associated quality scores, makes this species identification technique widely accessible. Reference DNA barcode sequences should be derived from, and linked to, specimens of known provenance in web-accessible collections in order to validate this system of molecular diagnostics. PMID:16214744
Evolutionary diversification of type-2 HDAC structure, function and regulation in Nicotiana tabacum.
Nicolas-Francès, Valérie; Grandperret, Vincent; Liegard, Benjamin; Jeandroz, Sylvain; Vasselon, Damien; Aimé, Sébastien; Klinguer, Agnès; Lamotte, Olivier; Julio, Emilie; de Borne, François Dorlhac; Wendehenne, David; Bourque, Stéphane
2018-04-01
Type-2 HDACs (HD2s) are plant-specific histone deacetylases that play diverse roles during development and in responses to biotic and abiotic stresses. In this study we characterized the six tobacco genes encoding HD2s that mainly differ by the presence or the absence of a typical zinc finger in their C-terminal part. Of particular interest, these HD2 genes exhibit a highly conserved intron/exon structure. We then further investigated the phylogenetic relationships among the HD2 gene family, and proposed a model of the genetic events that led to the organization of the HD2 family in Solanaceae. Absolute quantification of HD2 mRNAs in N. tabacum and in its precursors, N. tomentosiformis and N. sylvestris, did not reveal any pseudogenization of any of the HD2 genes, but rather specific regulation of HD2 expression in these three species. Functional complementation approaches in Arabidopsis thaliana demonstrated that the four zinc finger-containing HD2 proteins exhibit the same biological function in response to salt stress, whereas the two HD2 proteins without a zinc finger have a different biological function. Copyright © 2018 Elsevier B.V. All rights reserved.
The ENCODE Project at UC Santa Cruz.
Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James
2007-01-01
The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.
McClelland, Michael; Sanderson, Kenneth E; Clifton, Sandra W; Latreille, Phil; Porwollik, Steffen; Sabo, Aniko; Meyer, Rekha; Bieri, Tamberlyn; Ozersky, Phil; McLellan, Michael; Harkins, C Richard; Wang, Chunyan; Nguyen, Christine; Berghoff, Amy; Elliott, Glendoria; Kohlberg, Sara; Strong, Cindy; Du, Feiyu; Carter, Jason; Kremizki, Colin; Layman, Dan; Leonard, Shawn; Sun, Hui; Fulton, Lucinda; Nash, William; Miner, Tracie; Minx, Patrick; Delehaunty, Kim; Fronick, Catrina; Magrini, Vincent; Nhan, Michael; Warren, Wesley; Florea, Liliana; Spieth, John; Wilson, Richard K
2004-12-01
Salmonella enterica serovars often have a broad host range, and some cause both gastrointestinal and systemic disease. But the serovars Paratyphi A and Typhi are restricted to humans and cause only systemic disease. It has been estimated that Typhi arose in the last few thousand years. The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin. Both genomes have independently accumulated many pseudogenes among their approximately 4,400 protein coding sequences: 173 in Paratyphi A and approximately 210 in Typhi. The recent convergence of these two similar genomes on a similar phenotype is subtly reflected in their genotypes: only 30 genes are degraded in both serovars. Nevertheless, these 30 genes include three known to be important in gastroenteritis, which does not occur in these serovars, and four for Salmonella-translocated effectors, which are normally secreted into host cells to subvert host functions. Loss of function also occurs by mutation in different genes in the same pathway (e.g., in chemotaxis and in the production of fimbriae).
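The extent of the overlap described in the abstract above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original study: the function name `overlap_stats` is hypothetical, and the Typhi count of 210 is the approximate figure quoted in the abstract, so the resulting fractions are approximate as well.

```python
def overlap_stats(n_a, n_b, n_shared):
    """Summarize how much two pseudogene sets overlap.

    n_a, n_b  -- number of pseudogenes in each genome
    n_shared  -- number of genes degraded in both genomes
    """
    union = n_a + n_b - n_shared  # genes degraded in at least one genome
    return {
        "frac_a": n_shared / n_a,      # share of genome A's pseudogenes that are shared
        "frac_b": n_shared / n_b,      # share of genome B's pseudogenes that are shared
        "jaccard": n_shared / union,   # shared / (degraded in either)
    }


# Counts quoted in the abstract: 173 (Paratyphi A), ~210 (Typhi), 30 shared.
stats = overlap_stats(173, 210, 30)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these counts, roughly one in six Paratyphi A pseudogenes and one in seven Typhi pseudogenes fall in the shared set, which is consistent with the abstract's description of largely independent gene degradation.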
Immunoglobulin isotypes in Atlantic salmon, Salmo salar.
Hordvik, Ivar
2015-02-27
There are three major immunoglobulin (Ig) isotypes in salmonid fish: IgM, IgD and IgT, defined by the heavy chains μ, δ and τ, respectively. As a result of whole genome duplication in the ancestor of the salmonid fish family, Atlantic salmon (Salmo salar) possess two highly similar Ig heavy chain gene complexes (A and B), comprising two μ genes, two δ genes, three intact τ genes and five τ pseudogenes. The μA and μB genes correspond to two distinct sub-populations of serum IgM. The IgM-B sub-variant has a characteristic extra cysteine near the C-terminal part of the heavy chain and exhibits a higher degree of polymer disulfide cross-linking compared to IgM-A. The IgM-B:IgM-A ratio in serum is typically 60:40, but skewed ratios are also observed. The IgT isotype appears to be specialized to mucosal immune responses in salmonid fish. The concentration of IgT in serum is 100 to 1000 times lower than IgM. Secreted forms of IgD have been detected in rainbow trout, but not yet in Atlantic salmon.
Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection
Vesperini, Fabio; Schuller, Björn
2017-01-01
In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short term frame are predicted from the previous frames by means of Long-Short Term Memory recurrent denoising autoencoders. The reconstruction error between the input and the output of the autoencoder is used as activation signal to detect novel events. There is no evidence of studies focused on comparing previous efforts to automatically recognize novel events from audio signals and giving a broad and in depth evaluation of recurrent neural network-based autoencoders. The present contribution aims to consistently evaluate our recent novel approaches to fill this white spot in the literature and provide insight by extensive evaluations carried out on three databases: A3Novelty, PASCAL CHiME, and PROMETHEUS. Besides providing an extensive analysis of novel and state-of-the-art methods, the article shows how RNN-based autoencoders outperform statistical approaches up to an absolute improvement of 16.4% average F-measure over the three databases. PMID:28182121
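The detection step described in the abstract above, in which reconstruction error serves as the activation signal, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the autoencoder itself is replaced by synthetic reconstructions, the Euclidean error and the quantile-based threshold are choices made here for simplicity, and all function names are hypothetical.

```python
import numpy as np


def reconstruction_error(frames, reconstructed):
    """Per-frame Euclidean distance between input features and model output."""
    frames = np.asarray(frames, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return np.linalg.norm(frames - reconstructed, axis=1)


def fit_threshold(normal_errors, quantile=0.99):
    """Pick a threshold from errors on known-normal data (assumed rule)."""
    return float(np.quantile(normal_errors, quantile))


def detect_novel(errors, threshold):
    """Flag frames whose reconstruction error exceeds the threshold."""
    return np.asarray(errors) > threshold


def demo():
    """Toy run: synthetic 'reconstructions' stand in for an autoencoder."""
    rng = np.random.default_rng(0)
    # Normal frames are reconstructed almost perfectly (small residual noise).
    normal = rng.normal(0.0, 1.0, size=(200, 8))
    normal_recon = normal + rng.normal(0.0, 0.05, size=normal.shape)
    threshold = fit_threshold(reconstruction_error(normal, normal_recon))

    test = rng.normal(0.0, 1.0, size=(5, 8))
    test_recon = test + rng.normal(0.0, 0.05, size=test.shape)
    test_recon[2] += 3.0  # frame 2 is reconstructed badly, i.e. it is "novel"
    return detect_novel(reconstruction_error(test, test_recon), threshold)


if __name__ == "__main__":
    print(demo())
```

In a real system the reconstructions would come from the trained recurrent autoencoder, and the threshold rule would be tuned on held-out data; the quantile used here is only a placeholder.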
Recolonization and radiation in Larix (Pinaceae): evidence from nuclear ribosomal DNA paralogues.
Wei, Xiao-Xin; Wang, Xiao-Quan
2004-10-01
Gene paralogy frequently causes conflict between gene trees and species trees, but sometimes the coexistence of a few paralogous copies can provide more markers for tracing the phylogeographical process of some organisms. In the present study, nrDNA ITS paralogues were cloned from all but one species of Larix, an Eocene genus having two sections, Larix and Multiserialis, with a huge circumboreal distribution and an Eastern Asia-Western North America disjunction, respectively. A total of 96 distinct clones, excluding five putative pseudogenes or recombinants, were obtained and used in the gene genealogy analysis. The clones from all Eurasian species of section Larix are mixed together, suggesting that recolonization and recent morphological differentiation could have played important roles in the evolution of this section. In contrast, the species diversification of the Eurasian section Multiserialis may result from radiation in the east Himalayas and its vicinity, considering extensive nrDNA founder effects in this group. Our study also suggests that analyzing the distribution patterns of members of multigene families would be very useful in tracking the evolutionary history of some taxa with recent origin or rapid radiation that cannot be resolved by other molecular markers.
Why do we like sweet taste: A bitter tale?
Beauchamp, Gary K.
2016-01-01
Sweet is widely considered to be one of a small number of basic or primary taste qualities. Liking for sweet tasting substances is innate, although postnatal experiences can shape responses. The power of sweet taste to induce consumption and to motivate behavior is profound, suggesting the importance of this sense for many species. Most investigators presume that the ability to identify sweet molecules through the sense of taste evolved to allow organisms to detect sources of readily available glucose from plants. Perhaps the best evidence supporting this presumption comes from recent discoveries in comparative biology demonstrating that species in the order Carnivora that do not consume plants also do not perceive sweet taste due to the pseudogenization of a component of the primary sweet taste receptor. However, arguing against this idea is the observation that the sweetness of a plant, or the amount of easily metabolizable sugars contained in the plant, provides little quantitative indication of the plant’s energy or broadly conceived food value. Here it is suggested that the perceptual ratio of sweet taste to bitter taste (a signal for toxicity) may be a better gauge of a plant’s broadly conceived food value than sweetness alone and that it is this ratio that helps guide selection or rejection of a potential plant food. PMID:27174610
Mahelka, Václav; Krak, Karol; Kopecký, David; Fehrer, Judith; Šafář, Jan; Bartoš, Jan; Hobza, Roman; Blavet, Nicolas; Blattner, Frank R
2017-02-14
The movement of nuclear DNA from one vascular plant species to another in the absence of fertilization is thought to be rare. Here, nonnative rRNA gene [ribosomal DNA (rDNA)] copies were identified in a set of 16 diploid barley (Hordeum) species; their origin was traceable via their internal transcribed spacer (ITS) sequence to five distinct Panicoideae genera, a lineage that split from the Pooideae about 60 Mya. Phylogenetic, cytogenetic, and genomic analyses implied that the nonnative sequences were acquired between 1 and 5 Mya after a series of multiple events, with the result that some current Hordeum sp. individuals harbor up to five different panicoid rDNA units in addition to the native Hordeum rDNA copies. There was no evidence that any of the nonnative rDNA units were transcribed; some showed indications of having been silenced via pseudogenization. A single copy of a Panicum sp. rDNA unit present in H. bogdanii had been interrupted by a native transposable element and was surrounded by about 70 kbp of mostly noncoding sequence of panicoid origin. The data suggest that horizontal gene transfer between vascular plants is not a rare event, that it is not necessarily restricted to one or a few genes only, and that it can be selectively neutral.
Shark Ig light chain junctions are as diverse as in heavy chains.
Fleurant, Marshall; Changchien, Lily; Chen, Chin-Tung; Flajnik, Martin F; Hsu, Ellen
2004-11-01
We have characterized a small family of four genes encoding one of the three nurse shark Ig L chain isotypes, called NS5. All NS5 cDNA sequences are encoded by three loci, of which two are organized as conventional clusters, each consisting of a V and J gene segment that can recombine and one C region exon; the third contains a germline-joined VJ in-frame and the fourth locus is a pseudogene. This is the second nurse shark L chain type where both germline-joined and split V-J organizations have been found. Since there are only two rearranging Ig loci, it was possible for the first time to examine junctional diversity in defined fish Ig genes, comparing productive vs nonproductive rearrangements. N region addition was found to be considerably more extensive in length and in frequency than any other vertebrate L chain so far reported and rivals that in H chain. We put forth the speculation that the unprecedented efficiency of N region addition (87-93% of NS5 sequences) may be a result not only of simultaneous H and L chain rearrangement in the shark but also of processing events that afford greater accessibility of the V or J gene coding ends to terminal deoxynucleotidyltransferase.
Epigenetic Regulation of the Sex Determination Gene MeGI in Polyploid Persimmon
Kawai, Takashi; Tao, Ryutaro
2016-01-01
Epigenetic regulation can add a flexible layer to genetic variation, potentially enabling long-term but reversible cis-regulatory changes to an allele while maintaining its DNA sequence. Here, we present a case in which alternative epigenetic states lead to reversible sex determination in the hexaploid persimmon Diospyros kaki. Previously, we elucidated the molecular mechanism of sex determination in diploid persimmon and demonstrated the action of a Y-encoded sex determinant pseudogene called OGI, which produces small RNAs targeting the autosomal gene MeGI, resulting in separate male and female individuals (dioecy). We contrast these findings with the discovery, in hexaploid persimmon, of an additional layer of regulation in the form of DNA methylation of the MeGI promoter associated with the production of both male and female flowers in genetically male trees. Consistent with this model, developing male buds exhibited higher methylation levels across the MeGI promoter than developing female flowers from either monoecious or female trees. Additionally, a DNA methylation inhibitor induced developing male buds to form feminized flowers. Concurrently, in Y-chromosome-carrying trees, the expression of OGI is silenced by the presence of a SINE (short interspersed nuclear element)-like insertion in the OGI promoter. Our findings provide an example of an adaptive scenario involving epigenetic plasticity. PMID:27956470
Liu, Xia; Zhao, Bo; Zheng, Hua-Jun; Hu, Yan; Lu, Gang; Yang, Chang-Qing; Chen, Jie-Dan; Chen, Jun-Jian; Chen, Dian-Yang; Zhang, Liang; Zhou, Yan; Wang, Ling-Jian; Guo, Wang-Zhen; Bai, Yu-Lin; Ruan, Ju-Xin; Shangguan, Xiao-Xia; Mao, Ying-Bo; Shan, Chun-Min; Jiang, Jian-Ping; Zhu, Yong-Qiang; Jin, Lei; Kang, Hui; Chen, Shu-Ting; He, Xu-Lin; Wang, Rui; Wang, Yue-Zhu; Chen, Jie; Wang, Li-Jun; Yu, Shu-Ting; Wang, Bi-Yun; Wei, Jia; Song, Si-Chao; Lu, Xin-Yan; Gao, Zheng-Chao; Gu, Wen-Yi; Deng, Xiao; Ma, Dan; Wang, Sen; Liang, Wen-Hua; Fang, Lei; Cai, Cai-Ping; Zhu, Xie-Fei; Zhou, Bao-Liang; Jeffrey Chen, Z; Xu, Shu-Hua; Zhang, Yu-Gao; Wang, Sheng-Yue; Zhang, Tian-Zhen; Zhao, Guo-Ping; Chen, Xiao-Ya
2015-09-30
Of the two cultivated species of allopolyploid cotton, Gossypium barbadense produces extra-long fibers for the production of superior textiles. We sequenced its genome (AD)2 and performed a comparative analysis. We identified three bursts of retrotransposons from 20 million years ago (Mya) and a genome-wide uneven pseudogenization peak at 11-20 Mya, which likely contributed to genomic divergences. Among the 2,483 genes preferentially expressed in fiber, a cell elongation regulator, PRE1, is strikingly At biased and fiber specific, echoing the A-genome origin of spinnable fiber. The expansion of the PRE members implies a genetic factor that underlies fiber elongation. Mature cotton fiber consists of nearly pure cellulose. G. barbadense and G. hirsutum contain 29 and 30 cellulose synthase (CesA) genes, respectively; whereas most of these genes (>25) are expressed in fiber, genes for secondary cell wall biosynthesis exhibited a delayed and higher degree of up-regulation in G. barbadense compared with G. hirsutum, conferring an extended elongation stage and highly active secondary wall deposition during extra-long fiber development. The rapid diversification of sesquiterpene synthase genes in the gossypol pathway exemplifies the chemical diversity of lineage-specific secondary metabolites. The G. barbadense genome advances our understanding of allopolyploidy, which will help improve cotton fiber quality.
Phylogenetic appearance of Neuropeptide S precursor proteins in tetrapods
Reinscheid, Rainer K.
2007-01-01
Sleep and emotional behavior are two hallmarks of vertebrate animal behavior, implying that specialized neuronal circuits and dedicated neurochemical messengers may have been developed during evolution to regulate such complex behaviors. Neuropeptide S (NPS) is a newly identified peptide transmitter that activates a typical G protein-coupled receptor. Central administration of NPS produces profound arousal, enhances wakefulness and suppresses all stages of sleep. In addition, NPS can alleviate behavioral responses to stress by producing anxiolytic-like effects. A bioinformatic analysis of current genome databases revealed that the NPS peptide precursor gene is present in all vertebrates with the exception of fish. A high level of sequence conservation, especially of aminoterminal structures was detected, indicating stringent requirements for agonist-induced receptor activation. Duplication of the NPS precursor gene was only found in one out of two marsupial species with sufficient genome coverage (Monodelphis domestica; opossum), indicating that the duplicated opossum NPS sequence might have arisen as an isolated event. Pharmacological analysis of both Monodelphis NPS peptides revealed that only the closely related NPS peptide retained agonistic activity at NPS receptors. The duplicated precursor might be either a pseudogene or could have evolved different receptor selectivity. Together, these data show that NPS is a relatively recent gene in vertebrate evolution whose appearance might coincide with its specialized physiological functions in terrestrial vertebrates. PMID:17293003
Terpene Specialized Metabolism in Arabidopsis thaliana
Tholl, Dorothea; Lee, Sungbeom
2011-01-01
Terpenes constitute the largest class of plant secondary (or specialized) metabolites, which are compounds of ecological function in plant defense or the attraction of beneficial organisms. Using biochemical and genetic approaches, nearly all Arabidopsis thaliana (Arabidopsis) enzymes of the core biosynthetic pathways producing the 5-carbon building blocks of terpenes have been characterized and closer insight has been gained into the transcriptional and posttranscriptional/translational mechanisms regulating these pathways. The biochemical function of most prenyltransferases, the downstream enzymes that condense the C5-precursors into central 10-, 15-, and 20-carbon prenyldiphosphate intermediates, has been described, although the function of several isoforms of C20-prenyltransferases is not well understood. Prenyl diphosphates are converted to a variety of C10-, C15-, and C20-terpene products by enzymes of the terpene synthase (TPS) family. Genomic organization of the 32 Arabidopsis TPS genes indicates a species-specific divergence of terpene synthases with tissue- and cell-type specific expression profiles that may have emerged under selection pressures by different organisms. Pseudogenization, differential expression, and subcellular segregation of TPS genes and enzymes contribute to the natural variation of terpene biosynthesis among Arabidopsis accessions (ecotypes) and species. Arabidopsis will remain an important model to investigate the metabolic organization and molecular regulatory networks of terpene specialized metabolism in relation to the biological activities of terpenes. PMID:22303268
Analyses of sweet receptor gene (Tas1r2) and preference for sweet stimuli in species of Carnivora.
Li, Xia; Glaser, Dieter; Li, Weihua; Johnson, Warren E; O'Brien, Stephen J; Beauchamp, Gary K; Brand, Joseph G
2009-01-01
The extent to which taste receptor specificity correlates with, or even predicts, diet choice is not known. We recently reported that the insensitivity to sweeteners shown by species of Felidae can be explained by their lack of a functional Tas1r2 gene. To broaden our understanding of the relationship between the structure of the sweet receptors and preference for sugars and artificial sweeteners, we measured responses to 12 sweeteners in 6 species of Carnivora and sequenced the coding regions of Tas1r2 in these same or closely related species. The lion showed no preference for any of the 12 sweet compounds tested, and it possesses a pseudogenized Tas1r2. All other species preferred some of the natural sugars, and their Tas1r2 sequences, having complete open reading frames, predict functional sweet receptors. In addition to preferring natural sugars, the lesser panda also preferred 3 (neotame, sucralose, and aspartame) of the 6 artificial sweeteners. Heretofore, it had been reported that among vertebrates, only Old World simians could taste aspartame. The observation that the lesser panda highly preferred aspartame could be an example of evolutionary convergence in the identification of sweet stimuli.
Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.
McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael
2014-08-01
Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.
Organization and transient expression of the gene for human U11 snRNA
Clemens, Suter-Crazzolara; Walter, Keller
1991-01-01
The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverted PCR. This gene contains the U11 RNA coding sequence and several sequence elements unique for the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63; and a 3′ box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functioning in vivo and can produce U11 RNA. PMID:1820214
Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs
Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L
2016-01-01
Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336
A Novel Method for Accurate Operon Predictions in All Sequenced Prokaryotes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.
2004-12-01
We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
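The distance-only idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' method, which tailors itself to each genome and also uses comparative genomic measures. The `Gene` record, the `predict_operons` function and the fixed `max_gap` of 50 bp are all hypothetical choices made for illustration: adjacent genes on the same strand are grouped into one predicted operon when the intergenic gap is short.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def predict_operons(genes, max_gap=50):
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap bp.

    max_gap is a hypothetical fixed cutoff, not a value from the abstract.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    operons = []
    for gene in ordered:
        if operons:
            prev = operons[-1][-1]
            gap = gene.start - prev.end - 1  # bases between the two genes
            if gene.strand == prev.strand and gap <= max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return [[g.name for g in operon] for operon in operons]


# Toy coordinates, invented for this example.
genes = [
    Gene("a", 1, 900, "+"),
    Gene("b", 920, 1800, "+"),    # 19 bp after a, same strand
    Gene("c", 2300, 3000, "+"),   # 499 bp after b
    Gene("d", 3010, 3500, "-"),   # close to c but on the opposite strand
]
print(predict_operons(genes))
```

A single cutoff is the simplest possible choice. The abstract's observation that operon genes in Halobacterium NRC-1 and H. pylori sit closer together than in E. coli suggests that any such cutoff would need to be set per genome.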
The zebrafish reference genome sequence and its relationship to the human genome.
Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L
2013-04-25
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
Characterization of interleukin-8 receptors in non-human primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alvarez, V.; Coto, E.; Gonzalez-Roces, S.
Interleukin-8 is a chemokine with a potent neutrophil chemoattractant activity. In humans, two different cDNAs encoding human IL8 receptors designated IL8RA and IL8RB have been cloned. IL8RA binds IL8, while IL8RB binds IL8 as well as other α-chemokines. Both human IL8Rs are encoded by two genes physically linked on chromosome 2. The IL8RA and IL8RB genes have open reading frames (ORF) lacking introns. By direct sequencing of the polymerase chain reaction products, we sequenced the IL8R genes of cell lines from four non-human primates: chimpanzee, gorilla, orangutan, and macaca. The IL8RB encodes an ORF in the four non-human primates, showing 95%-99% similarity to the human IL8RB sequence. The IL8RA homologue in gorilla and chimpanzee consisted of two ORF 98%-99% identical to the human sequence. The macaca and orangutan IL8RA homologues are pseudogenes: a 2 base pair insertion generated a sequence with several stop codons. In addition, we describe the physical linkage of these genes in the four non-human primates and discuss the evolutionary implications of these findings. 25 refs., 5 figs., 3 tabs.
Emerling, Christopher A.; Springer, Mark S.
2015-01-01
Rod monochromacy is a rare condition in vertebrates characterized by the absence of cone photoreceptor cells. The resulting phenotype is colourblindness and low-acuity vision in dim light and blindness in bright light. Early reports of xenarthrans (armadillos, sloths and anteaters) suggest that they are rod monochromats, but this has not been tested with genomic data. We searched the genomes of Dasypus novemcinctus (nine-banded armadillo), Choloepus hoffmanni (Hoffmann's two-toed sloth) and Mylodon darwinii (extinct ground sloth) for retinal photoreceptor genes and examined them for inactivating mutations. We performed PCR and Sanger sequencing on cone phototransduction genes of 10 additional xenarthrans to test for shared inactivating mutations and estimated the timing of inactivation for photoreceptor pseudogenes. We concluded that a stem xenarthran became a long-wavelength-sensitive cone monochromat following a missense mutation at a critical residue in SWS1, and a stem cingulate (armadillos, glyptodonts and pampatheres) and stem pilosan (sloths and anteaters) independently acquired rod monochromacy early in their evolutionary history following the inactivation of LWS and PDE6C, respectively. We hypothesize that rod monochromacy in armadillos and pilosans evolved as an adaptation to a subterranean habitat in the early history of Xenarthra. The presence of rod monochromacy has major implications for understanding xenarthran behavioural ecology and evolution. PMID:25540280
Patil, Yogita; Müller, Nicolai; Schink, Bernhard; ...
2017-02-20
Anaerobium acetethylicum strain GluBS11T belongs to the family Lachnospiraceae within the order Clostridiales. It is a Gram-positive, non-motile and strictly anaerobic bacterium isolated from biogas slurry that was originally enriched with gluconate as carbon source (Patil, et al., Int J Syst Evol Microbiol 65:3289-3296, 2015). Here we describe the draft genome sequence of strain GluBS11T and provide a detailed insight into its physiological and metabolic features. The draft genome sequence generated 4,609,043 bp, distributed among 105 scaffolds assembled using the SPAdes genome assembler method. It comprises in total 4,132 genes, of which 4,008 were predicted to be protein-coding genes, 124 RNA genes and 867 pseudogenes. The G+C content was 43.51 mol %. The annotated genome of strain GluBS11T contains putative genes coding for the pentose phosphate pathway, the Embden-Meyerhof-Parnas pathway, the Entner-Doudoroff pathway and the tricarboxylic acid cycle. The genome revealed the presence of most of the necessary genes required for the fermentation of glucose and gluconate to acetate, ethanol, and hydrogen gas. However, a candidate gene for production of formate was not identified.
Jiao, Jian-Yu; Carro, Lorena; Liu, Lan; ...
2017-02-03
Jiangella gansuensis strain YIM 002T is the type strain of the type species of the genus Jiangella, which is at the present time composed of five species, and was isolated from a desert soil sample in Gansu Province (China). The five strains of this genus are clustered in a monophyletic group when closer actinobacterial genera are used to infer a 16S rRNA gene sequence phylogeny. The study of this genome is part of the Genomic Encyclopedia of Bacteria and Archaea project, and here we describe the complete genome sequence and annotation of this taxon. The genome of J. gansuensis strain YIM 002T contains a single scaffold of size 5,585,780 bp, which involves 149 pseudogenes, 4905 protein-coding genes and 50 RNA genes, including 2520 hypothetical proteins and 4 rRNA genes. From the investigation of genome sizes of Jiangella species, J. gansuensis shows a smaller size, which indicates this strain might have discarded too much genetic information to adapt to desert environment. Seven new compounds from this bacterium have recently been described; however, its potential should be higher, as secondary metabolite gene cluster analysis predicted 60 gene clusters, including the potential to produce pristinamycin.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Souza, B; Stoutland, P; Derbise, A
2004-01-24
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons to available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 new pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive IS-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of pre-existing gene expression pathways, appear to be more important than acquisition of new genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
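The gene-loss figures in this abstract can be checked with a short calculation. The sketch below assumes that the 13% refers to the 149 new pseudogenes and the 317 absent genes taken together. The abstract does not state the total Y. pseudotuberculosis gene count, so the denominator computed here is only what the quoted numbers imply, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
new_pseudogenes = 149
absent_genes = 317
quoted_fraction = 0.13  # "as many as 13%"

# Genes present in Y. pseudotuberculosis that no longer function in Y. pestis,
# assuming the two categories are simply added together.
lost = new_pseudogenes + absent_genes

# Gene total implied if the lost genes were exactly 13% of the genome.
# Because the abstract says "as many as", this is a rough lower bound.
implied_total = lost / quoted_fraction

print(f"non-functional in Y. pestis: {lost}")
print(f"implied Y. pseudotuberculosis gene count: about {implied_total:.0f}")
```

Against the 32 chromosomal genes gained, the 466 genes lost under this reading is more than an order of magnitude larger, which is the contrast the abstract draws between reductive evolution and gene acquisition.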
Bereczky, Zsuzsanna; Kovács, Kitti B; Muszbek, László
2010-12-01
Protein C (PC) and protein S (PS) are vitamin K-dependent glycoproteins that play an important role in the regulation of blood coagulation as natural anticoagulants. PC is activated by thrombin and the resulting activated PC (APC) inactivates membrane-bound activated factor VIII and factor V. The free form of PS is an important cofactor of APC. Deficiencies in these proteins lead to an increased risk of venous thromboembolism; a few reports have also associated these deficiencies with arterial diseases. The degree of risk and the prevalence of PC and PS deficiency among patients with thrombosis and in those in the general population have been examined by several population studies with conflicting results, primarily due to methodological variability. The molecular genetic background of PC and PS deficiencies is heterogeneous. Most of the mutations cause type I deficiency (quantitative disorder). Type II deficiency (dysfunctional molecule) is diagnosed in approximately 5%-15% of cases. The diagnosis of PC and PS deficiencies is challenging; functional tests are influenced by several pre-analytical and analytical factors, and the diagnosis using molecular genetics also has special difficulties. Large gene segment deletions often remain undetected by DNA sequencing methods. The presence of the PS pseudogene makes genetic diagnosis even more complicated.
Zhou, Jindan; Rudd, Kenneth E.
2013-01-01
EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySQL relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to GenBank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection. PMID:23197660
The complete sequence and promoter activity of the human A-raf-1 gene (ARAF1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, J.E.; Beck, T.W.; Brennscheidt, U.
1994-03-01
The raf proto-oncogenes encode cytoplasmic protein serine/threonine kinases, which play a critical role in cell growth and development. One of these, A-raf-1 (human gene symbol, ARAF1), which is predominantly expressed in mouse urogenital tissues, has been mapped to an evolutionarily conserved linkage group composed of ARAF1, SYN1, TIMP, and properdin located at human chromosome Xp11.2. The authors have isolated human genomic DNA clones containing the expressed gene (ARAF1) on the X chromosome and a pseudogene (ARAF2) on chromosome 7p12-q11.21. Analysis of the nucleotide sequence from the ARAF1 genomic clones demonstrated that it consists of 16 exons encoded by minimally 10,776 nucleotides. The major transcriptional start site (+1) was determined by RNase protection and primer extension assays. Promoter activity was confirmed by functional assays using DNA fragments fused to a CAT reporter gene. The ARAF1 minimal promoter, located between nucleotides -59 and +93, has a low G + C content and lacks consensus TATA and Inr sequences but shows sequence similarity at position -1 to the E box that is known to interact with USF and TFII-I transcription factors. 65 refs., 7 figs., 1 tab.
Constitutive heterochromatin of chromosome 1 and Duffy blood group alleles in schizophrenia
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kosower, N.S.; Gerad, L.; Goldstein, M.
1995-04-24
Cytogenetic analysis was carried out in unrelated schizophrenic patients, unrelated controls and patients and family members in multiplex families. The size-distribution of chromosome 1 heterochromatic region (1qH, C-band variants) among 21 unrelated schizophrenic patients was different from that found in a group of 46 controls. The patient group had 1qH variants of smaller size than the control group (P < 0.01). Incubation of phytohemagglutinin-treated blood lymphocytes with 5-azacytidine (which causes decondensation and extension of the heterochromatin) led to a lesser degree of heterochromatin decondensation in a group of patients than in the controls (7 schizophrenic, 9 controls, P < 0.01). The distribution of phenotypes of Duffy blood group system (whose locus is linked to the 1qH region) among 28 schizophrenic patients was also different from that in the general population. Cosegregation of schizophrenia with a 1qH (C-band) variant and Duffy blood group allele was observed in one of six multiplex families. The overall results suggest that alterations within the Duffy/1qH region are involved in schizophrenia in some cases. This region contains the locus of D5 dopamine receptor pseudogene 2 (1q21.1), which is transcribed in normal lymphocytes. 33 refs., 1 fig., 2 tabs.
Copy number polymorphism of the salivary amylase gene: implications in human nutrition research.
Santos, J L; Saus, E; Smalley, S V; Cataldo, L R; Alberti, G; Parada, J; Gratacòs, M; Estivill, X
2012-01-01
The salivary α-amylase is a calcium-binding enzyme that initiates starch digestion in the oral cavity. The α-amylase genes are located in a cluster on the chromosome that includes salivary amylase genes (AMY1), two pancreatic α-amylase genes (AMY2A and AMY2B) and a related pseudogene. The AMY1 genes show extensive copy number variation which is directly proportional to the salivary α-amylase content in saliva. The α-amylase amount in saliva is also influenced by other factors, such as hydration status, psychosocial stress level, and short-term dietary habits. It has been shown that the average copy number of the AMY1 gene is higher in populations that evolved under high-starch diets versus low-starch diets, reflecting an intense positive selection imposed by diet on amylase copy number during evolution. In this context, a number of different aspects can be considered in evaluating the possible impact of copy number variation of the AMY1 gene on nutrition research, such as issues related to human diet gene evolution, action on starch digestion, effect on glycemic response after starch consumption, modulation of the action of α-amylase inhibitors, effect on taste perception and satiety, influence on psychosocial stress and relation to oral health. Copyright © 2012 S. Karger AG, Basel.
The zebrafish reference genome sequence and its relationship to the human genome
Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, 
Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.
2013-01-01
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743
Diversity, classification and function of the plant protein kinase superfamily
Lehti-Shiu, Melissa D.; Shiu, Shin-Han
2012-01-01
Eukaryotic protein kinases belong to a large superfamily with hundreds to thousands of copies and are components of essentially all cellular functions. The goals of this study are to classify protein kinases from 25 plant species and to assess their evolutionary history in conjunction with consideration of their molecular functions. The protein kinase superfamily has expanded in the flowering plant lineage, in part through recent duplications. As a result, the flowering plant protein kinase repertoire, or kinome, is in general significantly larger than other eukaryotes, ranging in size from 600 to 2500 members. This large variation in kinome size is mainly due to the expansion and contraction of a few families, particularly the receptor-like kinase/Pelle family. A number of protein kinases reside in highly conserved, low copy number families and often play broadly conserved regulatory roles in metabolism and cell division, although functions of plant homologues have often diverged from their metazoan counterparts. Members of expanded plant kinase families often have roles in plant-specific processes and some may have contributed to adaptive evolution. Nonetheless, non-adaptive explanations, such as kinase duplicate subfunctionalization and insufficient time for pseudogenization, may also contribute to the large number of seemingly functional protein kinases in plants. PMID:22889912
Shao, Renfu; Mitani, Harumi; Barker, Stephen C; Takahashi, Mamoru; Fukunaga, Masahito
2005-06-01
To better understand the evolution of mitochondrial (mt) genomes in the Acari (mites and ticks), we sequenced the mt genome of the chigger mite, Leptotrombidium pallidum (Arthropoda: Acari: Acariformes). This genome is highly rearranged relative to that of the hypothetical ancestor of the arthropods and the other species of Acari studied. The mt genome of L. pallidum has two genes for large subunit rRNA, a pseudogene for small subunit rRNA, and four nearly identical large noncoding regions. Nineteen of the 22 tRNAs encoded by this genome apparently lack either a T-arm or a D-arm. Further, the mt genome of L. pallidum has two distantly separated sections with identical sequences but opposite orientations of transcription. This arrangement cannot be accounted for by homologous recombination or by previously known mechanisms of mt gene rearrangement. The most plausible explanation for the origin of this arrangement is illegitimate inter-mtDNA recombination, which has not been reported previously in animals. In light of the evidence from previous experiments on recombination in nuclear and mt genomes of animals, we propose a model of illegitimate inter-mtDNA recombination to account for the novel gene content and gene arrangement in the mt genome of L. pallidum.
Kaltenegger, Elisabeth; Eich, Eckart; Ober, Dietrich
2013-01-01
Homospermidine synthase (HSS), the first pathway-specific enzyme of pyrrolizidine alkaloid biosynthesis, is known to have its origin in the duplication of a gene encoding deoxyhypusine synthase. To study the processes that followed this gene duplication event and gave rise to HSS, we identified sequences encoding HSS and deoxyhypusine synthase from various species of the Convolvulaceae. We show that HSS evolved only once in this lineage. This duplication event was followed by several losses of a functional gene copy attributable to gene loss or pseudogenization. Statistical analyses of sequence data suggest that, in those lineages in which the gene copy was successfully recruited as HSS, the gene duplication event was followed by phases of various selection pressures, including purifying selection, relaxed functional constraints, and possibly positive Darwinian selection. Site-specific mutagenesis experiments have confirmed that the substitution of sites predicted to be under positive Darwinian selection is sufficient to convert a deoxyhypusine synthase into a HSS. In addition, analyses of transcript levels have shown that HSS and deoxyhypusine synthase have also diverged with respect to their regulation. The impact of protein–protein interaction on the evolution of HSS is discussed with respect to current models of enzyme evolution. PMID:23572540
Evolution of cholinesterases in the animal kingdom.
Pezzementi, Leo; Chatonnet, Arnaud
2010-09-06
Cholinesterases emerged from a family of enzymes and proteins with adhesion properties. This family is absent in plants and expanded in multicellular animals. True cholinesterases appeared in triploblastic animals together with the cholinergic system. Lineage specific duplications resulted in two acetylcholinesterases in most hexapods and in up to four genes in nematodes. In vertebrates the duplication leading to acetylcholinesterase (AChE) and butyrylcholinesterase (BChE) is now considered to be an ancient event which occurred before the split of osteichthyes. The product of one or the other of the paralogues is responsible for the physiological hydrolysis of acetylcholine, depending on the species lineage and tissue considered. The BChE gene seems to have been lost in some fish lineages. The complete genome of amphioxus (Branchiostoma floridae: cephalochordate) contains a large number of duplicated genes or pseudogenes of cholinesterases. Sequence comparison and tree constructions raise the question of considering the atypical ChE studied in this organism as a representative of ancient BChE. Thus nematodes, arthropods, annelids, molluscs, and vertebrates typically possess two paralogous genes coding for cholinesterases. The origin of the duplication(s) is discussed. The mode of attachment through alternative C-terminal coding exons seems to have evolved independently from the catalytic part of the gene. Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.
Coincidence of synteny breakpoints with malignancy-related deletions on human chromosome 3
Kost-Alimova, Maria; Kiss, Hajnalka; Fedorova, Ludmila; Yang, Ying; Dumanski, Jan P.; Klein, George; Imreh, Stefan
2003-01-01
We have found previously that during tumor growth intact human chromosome 3 transferred into tumor cells regularly loses certain 3p regions, among them the ≈1.4-Mb common eliminated region 1 (CER1) at 3p21.3. Fluorescence in situ hybridization analysis of 12 mouse orthologous loci revealed that CER1 splits into two segments in mouse and therefore contains a murine/human conservation breakpoint region (CBR). Several breaks occurred in tumors within the region surrounding the CBR, and this sequence has features that characterize unstable chromosomal regions: deletions in yeast artificial chromosome clones, late replication, gene and segment duplications, and pseudogene insertions. Sequence analysis of the entire 3p12-22 revealed that other cancer-associated deletions (regions eliminated from monochromosomal hybrids carrying an intact chromosome 3 during tumor growth and homozygous deletions found in human tumors) colocalized nonrandomly with murine/human CBRs and were characterized by an increased number of local gene duplications and murine/human conservation mismatches (single genes that do not match into the conserved chromosomal segment). The CBR within CER1 contains a simple tandem TATAGA repeat capable of forming a 40-bp-long secondary hairpin-like structure. This repeat is nonrandomly localized within the other tumor-associated deletions and in the vicinity of 3p12-22 CBRs. PMID:12738884
Ogier, Jean-Claude; Pagès, Sylvie; Bisch, Gaëlle; Chiapello, Hélène; Médigue, Claudine; Rouy, Zoé; Teyssier, Corinne; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie
2014-01-01
Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Unlike other Xenorhabdus species, Xenorhabdus poinarii is avirulent when injected into insects in the absence of its nematode host. We sequenced the genome of the X. poinarii strain G6 and the closely related but virulent X. doucetiae strain FRM16. G6 had a smaller genome (500–700 kb smaller) than virulent Xenorhabdus strains and lacked genes encoding potential virulence factors (hemolysins, type 5 secretion systems, enzymes involved in the synthesis of secondary metabolites, and toxin–antitoxin systems). The genomes of all the X. poinarii strains analyzed here had a similar small size. We did not observe the accumulation of pseudogenes, insertion sequences or decrease in coding density usually seen as a sign of genomic erosion driven by genetic drift in host-adapted bacteria. Instead, genome reduction of X. poinarii seems to have been mediated by the excision of genomic blocks from the flexible genome, as reported for the genomes of attenuated free pathogenic bacteria and some facultative mutualistic bacteria growing exclusively within hosts. This evolutionary pathway probably reflects the adaptation of X. poinarii to a specific host. PMID:24904010
Yu, Liying; Tang, Weiqi; He, Weiyi; Ma, Xiaoli; Vasseur, Liette; Baxter, Simon W; Yang, Guang; Huang, Shiguo; Song, Fengqin; You, Minsheng
2015-03-10
Cytochrome P450 monooxygenases are present in almost all organisms and can play vital roles in hormone regulation, metabolism of xenobiotics and in biosynthesis or inactivation of endogenous compounds. In the present study, a genome-wide approach was used to identify and analyze the P450 gene family of diamondback moth, Plutella xylostella, a destructive worldwide pest of cruciferous crops. We identified 85 putative cytochrome P450 genes from the P. xylostella genome, including 84 functional genes and 1 pseudogene. These genes were classified into 26 families and 52 subfamilies. A phylogenetic tree constructed with three additional insect species shows extensive gene expansions of P. xylostella P450 genes from clans 3 and 4. Gene expression of cytochrome P450s was quantified across multiple developmental stages (egg, larva, pupa and adult) and tissues (head and midgut) using P. xylostella strains susceptible or resistant to insecticides chlorpyrifos and fipronil. Expression of the lepidopteran-specific CYP367s predominantly occurred in head tissue suggesting a role in either olfaction or detoxification. CYP340s with abundant transposable elements and relatively high expression in the midgut probably contribute to the detoxification of insecticides or plant toxins in P. xylostella. This study will facilitate future functional studies of the P. xylostella P450s in detoxification.
Yu, Liying; Tang, Weiqi; He, Weiyi; Ma, Xiaoli; Vasseur, Liette; Baxter, Simon W.; Yang, Guang; Huang, Shiguo; Song, Fengqin; You, Minsheng
2015-01-01
Cytochrome P450 monooxygenases are present in almost all organisms and can play vital roles in hormone regulation, metabolism of xenobiotics and in biosynthesis or inactivation of endogenous compounds. In the present study, a genome-wide approach was used to identify and analyze the P450 gene family of diamondback moth, Plutella xylostella, a destructive worldwide pest of cruciferous crops. We identified 85 putative cytochrome P450 genes from the P. xylostella genome, including 84 functional genes and 1 pseudogene. These genes were classified into 26 families and 52 subfamilies. A phylogenetic tree constructed with three additional insect species shows extensive gene expansions of P. xylostella P450 genes from clans 3 and 4. Gene expression of cytochrome P450s was quantified across multiple developmental stages (egg, larva, pupa and adult) and tissues (head and midgut) using P. xylostella strains susceptible or resistant to insecticides chlorpyrifos and fipronil. Expression of the lepidopteran-specific CYP367s predominantly occurred in head tissue suggesting a role in either olfaction or detoxification. CYP340s with abundant transposable elements and relatively high expression in the midgut probably contribute to the detoxification of insecticides or plant toxins in P. xylostella. This study will facilitate future functional studies of the P. xylostella P450s in detoxification. PMID:25752830
Zhou, Jindan; Rudd, Kenneth E
2013-01-01
EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the best-understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySQL relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to GenBank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection.
Draft genome of the red harvester ant Pogonomyrmex barbatus.
Smith, Chris R; Smith, Christopher D; Robertson, Hugh M; Helmkampf, Martin; Zimin, Aleksey; Yandell, Mark; Holt, Carson; Hu, Hao; Abouheif, Ehab; Benton, Richard; Cash, Elizabeth; Croset, Vincent; Currie, Cameron R; Elhaik, Eran; Elsik, Christine G; Favé, Marie-Julie; Fernandes, Vilaiwan; Gibson, Joshua D; Graur, Dan; Gronenberg, Wulfila; Grubbs, Kirk J; Hagen, Darren E; Viniegra, Ana Sofia Ibarraran; Johnson, Brian R; Johnson, Reed M; Khila, Abderrahman; Kim, Jay W; Mathis, Kaitlyn A; Munoz-Torres, Monica C; Murphy, Marguerite C; Mustard, Julie A; Nakamura, Rin; Niehuis, Oliver; Nigam, Surabhi; Overson, Rick P; Placek, Jennifer E; Rajakumar, Rajendhran; Reese, Justin T; Suen, Garret; Tao, Shu; Torres, Candice W; Tsutsui, Neil D; Viljakainen, Lumi; Wolschin, Florian; Gadau, Jürgen
2011-04-05
We report the draft genome sequence of the red harvester ant, Pogonomyrmex barbatus. The genome was sequenced using 454 pyrosequencing, and the current assembly and annotation were completed in less than 1 y. Analyses of conserved gene groups (more than 1,200 manually annotated genes to date) suggest a high-quality assembly and annotation comparable to recently sequenced insect genomes using Sanger sequencing. The red harvester ant is a model for studying reproductive division of labor, phenotypic plasticity, and sociogenomics. Although the genome of P. barbatus is similar to other sequenced hymenopterans (Apis mellifera and Nasonia vitripennis) in GC content and compositional organization, and possesses a complete CpG methylation toolkit, its predicted genomic CpG content differs markedly from the other hymenopterans. Gene networks involved in generating key differences between the queen and worker castes (e.g., wings and ovaries) show signatures of increased methylation and suggest that ants and bees may have independently co-opted the same gene regulatory mechanisms for reproductive division of labor. Gene family expansions (e.g., 344 functional odorant receptors) and pseudogene accumulation in chemoreception and P450 genes compared with A. mellifera and N. vitripennis are consistent with major life-history changes during the adaptive radiation of Pogonomyrmex spp., perhaps in parallel with the development of the North American deserts.
Drury, Suzanne; Mason, Sarah; McKay, Fiona; Lo, Kitty; Boustred, Christopher; Jenkins, Lucy; Chitty, Lyn S
2016-01-01
Our UK National Health Service regional genetics laboratory offers non-invasive prenatal diagnosis (NIPD) for autosomal dominant and de novo conditions (achondroplasia, thanatophoric dysplasia, Apert syndrome), paternal mutation exclusion for cystic fibrosis and a range of bespoke tests. NIPD avoids the risks associated with invasive testing, making prenatal diagnosis more accessible to families at high genetic risk. However, the challenge remains in offering definitive diagnosis for autosomal recessive diseases, which is complicated by the predominance of the maternal mutant allele in the cell-free DNA sample and thus requires a variety of different approaches. Validation and diagnostic implementation for NIPD of congenital adrenal hyperplasia (CAH) is further complicated by the presence of a pseudogene that requires a different approach. We have used an assay targeting approximately 6700 heterozygous SNPs around the CAH gene (CYP21A2) to construct the high-risk parental haplotypes and tested this approach in five cases, showing that inheritance of the parental alleles can be correctly identified using NIPD. We are evaluating various measures of the fetal fraction to help determine inheritance of parental mutations. We are currently exploring the utility of an NIPD multi-disorder panel for autosomal recessive disease, to make testing more widely applicable to families with a variety of serious genetic conditions.
A cluster of novel serotonin receptor 3-like genes on human chromosome 3.
Karnovsky, Alla M; Gotow, Lisa F; McKinley, Denise D; Piechan, Julie L; Ruble, Cara L; Mills, Cynthia J; Schellin, Kathleen A B; Slightom, Jerry L; Fitzgerald, Laura R; Benjamin, Christopher W; Roberds, Steven L
2003-11-13
The ligand-gated ion channel family includes receptors for serotonin (5-hydroxytryptamine, 5-HT), acetylcholine, GABA, and glutamate. Drugs targeting subtypes of these receptors have proven useful for the treatment of various neuropsychiatric and neurological disorders. To identify new ligand-gated ion channels as potential therapeutic targets, drafts of human genome sequence were interrogated. Portions of four novel genes homologous to 5-HT(3A) and 5-HT(3B) receptors were identified within human sequence databases. We named the genes 5-HT(3C1)-5-HT(3C4). Radiation hybrid (RH) mapping localized these genes to chromosome 3q27-28. All four genes shared similar intron-exon organizations and predicted protein secondary structure with 5-HT(3A) and 5-HT(3B). Orthologous genes were detected by Southern blotting in several species including dog, cow, and chicken, but not in rodents, suggesting that these novel genes are not present in rodents or are very poorly conserved. Two of the novel genes are predicted to be pseudogenes, but two other genes are transcribed and spliced to form appropriate open reading frames. The 5-HT(3C1) transcript is expressed almost exclusively in small intestine and colon, suggesting a possible role in the serotonin-responsiveness of the gut.
"Orphan" retrogenes in the human genome.
Ciomborowska, Joanna; Rosikiewicz, Wojciech; Szklarczyk, Damian; Makałowski, Wojciech; Makałowska, Izabela
2013-02-01
Gene duplicates generated via retroposition were long thought to be pseudogenized and consequently decayed. However, a significant number of these genes escaped their evolutionary destiny and evolved into functional genes. Despite multiple studies, the number of functional retrogenes in human and other genomes remains unclear. We performed a comparative analysis of human, chicken, and worm genomes to identify "orphan" retrogenes, that is, retrogenes that have replaced their progenitors. We located 25 such candidates in the human genome. All of these genes were previously known, and the majority have been intensively studied. Despite this, they have never been recognized as retrogenes. Analysis revealed that the phenomenon of replacing parental genes with their retrocopies has been taking place over the entire span of animal evolution. This process was often species specific and contributed to interspecies differences. Surprisingly, these retrogenes, which should evolve in a more relaxed mode, are subject to very strong purifying selection, which is, on average, two and a half times stronger than that acting on other human genes. Also, they do not show the overall tendency toward testis-specific expression that is typical of retrogenes. Notably, seven of them are associated with human diseases. Recognizing them as "orphan" retrocopies, which have different regulatory machinery than their parents, is important for any disease studies in model organisms, especially when discoveries made in one species are transferred to humans.
Jheng, Cheng-Fong; Chen, Tien-Chih; Lin, Jhong-Yi; Chen, Ting-Chieh; Wu, Wen-Luan; Chang, Ching-Chun
2012-07-01
The chloroplast genome of Phalaenopsis equestris was determined and compared to those of Phalaenopsis aphrodite and Oncidium Gower Ramsey in Orchidaceae. The chloroplast genome of P. equestris is 148,959 bp, and a pair of inverted repeats (25,846 bp) separates the genome into large single-copy (85,967 bp) and small single-copy (11,300 bp) regions. The genome encodes 109 genes, including 4 rRNA, 30 tRNA and 75 protein-coding genes, but loses four ndh genes (ndhA, E, F and H) and seven other ndh genes are pseudogenes. The rate of inter-species variation between the two moth orchids was 0.74% (1107 sites) for single nucleotide substitution and 0.24% for insertions (161 sites; 1388 bp) and deletions (189 sites; 1393 bp). The IR regions have a lower rate of nucleotide substitution (3.5-5.8-fold) and indels (4.3-7.1-fold) than single-copy regions. The intergenic spacers are the most divergent, and based on the length variation of the three intergenic spacers, 11 native Phalaenopsis orchids could be successfully distinguished. The coding genes, IR junction and RNA editing sites are relatively more conserved between the two moth orchids than between those of Phalaenopsis and Oncidium spp. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
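The region lengths quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes the usual quadripartite plastome layout (one large single-copy region, one small single-copy region, and two copies of the inverted repeat); the function and variable names are illustrative and do not come from the paper.

```python
# Region lengths in bp, copied from the abstract above.
LSC_BP = 85_967          # large single-copy region
SSC_BP = 11_300          # small single-copy region
IR_BP = 25_846           # one copy of the inverted repeat
REPORTED_TOTAL_BP = 148_959


def plastome_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


total_bp = plastome_length(LSC_BP, SSC_BP, IR_BP)

# Gene tally from the abstract: 4 rRNA + 30 tRNA + 75 protein-coding genes.
gene_count = 4 + 30 + 75

print(f"computed length: {total_bp} bp (matches reported: {total_bp == REPORTED_TOTAL_BP})")
print(f"gene count: {gene_count}")
```

Under that assumption the four regions sum to the reported 148,959 bp, and the three gene classes sum to the reported 109 genes.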
Bordetella pertussis risA, but Not risS, Is Required for Maximal Expression of Bvg-Repressed Genes
Stenson, Trevor H.; Allen, Andrew G.; al-Meer, Jehan A.; Maskell, Duncan; Peppler, Mark S.
2005-01-01
Expression of virulence determinants by Bordetella pertussis, the primary etiological agent of whooping cough, is regulated by the BvgAS two-component regulatory system. The role of a second two-component regulatory system, encoded by risAS, in this process is not defined. Here, we show that mutation of B. pertussis risA does not affect Bvg-activated genes or proteins. However, mutation of risA resulted in greatly diminished expression of Bvg-repressed antigens and decreased transcription of Bvg-repressed genes. In contrast, mutation of risS had no effect on the expression of Bvg-regulated molecules. Mutation of risA also resulted in decreased bacterial invasion in a HeLa cell model. However, decreased invasion could not be attributed to the decreased expression of Bvg-repressed products, suggesting that mutation of risA may affect the expression of a variety of genes. Unlike the risAS operons in B. parapertussis and B. bronchiseptica, B. pertussis risS is a pseudogene that encodes a truncated RisS sensor. Deletion of the intact part of the B. pertussis risS gene does not affect the expression of risA-dependent, Bvg-repressed genes. These observations suggest that RisA activation occurs through cross-regulation by a heterologous system. PMID:16113320
PTESFinder: a computational method to identify post-transcriptional exon shuffling (PTES) events.
Izuogu, Osagie G; Alhasan, Abd A; Alafghani, Hani M; Santibanez-Koref, Mauro; Elliott, David J; Jackson, Michael S
2016-01-13
Transcripts that have been subject to post-transcriptional exon shuffling (PTES) have an exon order inconsistent with the underlying genomic sequence. These have been identified in a wide variety of tissues and cell types from many eukaryotes, and are now known to be mostly circular, cytoplasmic, and non-coding. Although there is no uniformly ascribed function, several have been shown to be involved in gene regulation. Accurate identification of these transcripts can, however, be difficult due to artefacts from a wide variety of sources. Here, we present a computational method, PTESFinder, to identify these transcripts from high-throughput RNAseq data. Uniquely, it systematically excludes potential artefacts emanating from pseudogenes, segmental duplications, and template switching, and outputs both PTES and canonical exon junction counts to facilitate comparative analyses. In comparison with four existing methods, PTESFinder achieves the highest specificity and comparable sensitivity at a variety of read depths. PTESFinder also identifies between 13% and 41.6% more structures, compared to publicly available methods recently used to identify human circular RNAs. With high sensitivity and specificity, user-adjustable filters that target known sources of false positives, and tailored output to facilitate comparison of transcript levels, PTESFinder will facilitate the discovery and analysis of these poorly understood transcripts.
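The defining property stated in this abstract, an exon order inconsistent with the genomic order, can be illustrated with a toy junction classifier. This is a minimal sketch and not PTESFinder's actual algorithm: the `Junction` type, its field names, and the rule that exons are numbered 1, 2, 3, ... in genomic transcription order are all assumptions made for illustration, and none of the artefact filters described above are modelled.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A spliced junction between two exons of one gene (hypothetical type)."""
    gene: str
    donor_exon: int     # exon on the 5' side of the junction
    acceptor_exon: int  # exon on the 3' side of the junction


def is_ptes(junction: Junction) -> bool:
    """True when the acceptor exon does not lie downstream of the donor exon.

    A canonical junction joins exon i to some exon j > i; a junction such as
    3 -> 2 (or 2 -> 2) contradicts the genomic exon order.
    """
    return junction.acceptor_exon <= junction.donor_exon


def count_junctions(junctions: list[Junction]) -> Counter:
    """Tally PTES and canonical junctions per gene."""
    counts: Counter = Counter()
    for junction in junctions:
        kind = "ptes" if is_ptes(junction) else "canonical"
        counts[(junction.gene, kind)] += 1
    return counts


example = [
    Junction("geneA", 1, 2),  # canonical
    Junction("geneA", 2, 3),  # canonical
    Junction("geneA", 3, 2),  # exon order reversed: PTES candidate
    Junction("geneB", 4, 4),  # exon joined to itself: PTES candidate
]
counts = count_junctions(example)
for key in sorted(counts):
    print(key, counts[key])
```

Reporting both tallies side by side mirrors the abstract's point that PTES and canonical junction counts are needed together for comparative analysis; a real tool would work from read alignments rather than pre-labelled exon ranks.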
Ermakov, Oleg A.; Simonov, Evgeniy; Surin, Vadim L.; Titov, Sergey V.; Brandler, Oleg V.; Ivanova, Natalia V.; Borisenko, Alex V.
2015-01-01
The utility of DNA Barcoding for species identification and discovery has catalyzed a concerted effort to build the global reference library; however, many animal groups of economical or conservational importance remain poorly represented. This study aims to contribute DNA barcode records for all ground squirrel species (Xerinae, Sciuridae, Rodentia) inhabiting Eurasia and to test the efficiency of this approach for species discrimination. Cytochrome c oxidase subunit 1 (COI) gene sequences were obtained for 97 individuals representing 16 ground squirrel species of which 12 were correctly identified. Taxonomic allocation of some specimens within four species was complicated by geographically restricted mtDNA introgression. Exclusion of individuals with introgressed mtDNA allowed reaching a 91.6% identification success rate. Significant COI divergence (3.5–4.4%) was observed within the most widespread ground squirrel species (Spermophilus erythrogenys, S. pygmaeus, S. suslicus, Urocitellus undulatus), suggesting the presence of cryptic species. A single putative NUMT (nuclear mitochondrial pseudogene) sequence was recovered during molecular analysis; mitochondrial COI from this sample was amplified following re-extraction of DNA. Our data show high discrimination ability of 100 bp COI fragments for Eurasian ground squirrels (84.3%) with no incorrect assessments, underscoring the potential utility of the existing reference library for the development of diagnostic ‘mini-barcodes’. PMID:25617768
Wu, Chung-Shien; Wang, Ting-Jen; Wu, Chia-Wen; Wang, Ya-Nan
2017-01-01
To date, little is known about the evolution of plastid genomes (plastomes) in Lauraceae. As one of the top five largest families in tropical forests, the Lauraceae contain many species that are important ecologically and economically. Lauraceous species also provide wonderful materials to study the evolutionary trajectory in response to parasitism because they contain both nonparasitic and parasitic species. This study compared the plastomes of nine Lauraceous species, including the sole hemiparasitic and herbaceous genus Cassytha (laurel dodder; here represented by Cassytha filiformis). We found differential contractions of the canonical inverted repeat (IR), resulting in two IR types present in Lauraceae. These two IR types reinforce Cryptocaryeae and Neocinnamomum—Perseeae–Laureae as two separate clades. Our data reveal several traits unique to Cas. filiformis, including loss of IRs, loss or pseudogenization of 11 ndh and rpl23 genes, richness of repeats, and accelerated rates of nucleotide substitutions in protein-coding genes. Although Cas. filiformis is low in chlorophyll content, our analysis based on dN/dS ratios suggests that both its plastid house-keeping and photosynthetic genes are under strong selective constraints. Hence, we propose that short generation time and herbaceous lifestyle rather than reduced photosynthetic ability drive the accelerated rates of nucleotide substitutions in Cas. filiformis. PMID:28985306
Law, MeiYee; Childs, Kevin L.; Campbell, Michael S.; Stein, Joshua C.; Olson, Andrew J.; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M.; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark
2015-01-01
The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. PMID:25384563
Penguins reduced olfactory receptor genes common to other waterbirds
Lu, Qin; Wang, Kai; Lei, Fumin; Yu, Dan; Zhao, Huabin
2016-01-01
The sense of smell, or olfaction, is fundamental in the life of animals. However, penguins (Aves: Sphenisciformes) possess relatively small olfactory bulbs compared with most other waterbirds such as Procellariiformes and Gaviiformes. To test whether penguins have a reduced reliance on olfaction, we analyzed the draft genome sequences of the two penguins, which diverged at the origin of the order Sphenisciformes; we also examined six closely related species with available genomes, and identified 29 one-to-one orthologous olfactory receptor genes (i.e. ORs) that are putatively functionally conserved and important across the eight birds. To survey the 29 one-to-one orthologous ORs in penguins and their relatives, we newly generated 34 sequences that are missing from the draft genomes. Through the analysis of a total of 378 OR sequences, we found that, of these functionally important ORs common to other waterbirds, penguins have a significantly greater percentage of OR pseudogenes than other waterbirds, suggesting a reduction of olfactory capability. The penguin-specific reduction of olfactory capability arose in the common ancestor of penguins between 23 and 60 Ma, which may have resulted from the aquatic specializations for underwater vision. Our study provides genetic evidence for a possible reduction of reliance on olfaction in penguins. PMID:27527385
Kishida, Takushi; Thewissen, J G M
2012-01-25
Odontocetes and mysticetes are two extant suborders of cetaceans. It is reported that the former have no sense of olfaction, while the latter can smell in air. To explain the ecological reason why mysticetes still retain their sense of smell, two hypotheses have been proposed: the echolocation-priority hypothesis, which assumes that the acquisition of echolocation reduced the importance of olfaction, and the filter-feeder hypothesis, which assumes that olfactory ability is important for filter-feeders to locate their prey because clouds of plankton give off a peculiar odor. The olfactory marker protein (OMP) is almost exclusively expressed in vertebrate olfactory receptor neurons, and is considered to play important roles in olfactory systems. In this study, we identified full-length open reading frames of OMP genes in 6 cetacean species and analyzed the nonsynonymous to synonymous substitution rate ratio using the maximum likelihood method. The evolutionary changes in the selective pressures on OMP genes fit the filter-feeder hypothesis better than the echolocation-priority hypothesis. In addition, no pseudogenization mutations were found in any of the five odontocete OMP genes investigated in this study, which may suggest that OMP retains some function even in 'anosmic' odontocetes. Copyright © 2011 Elsevier B.V. All rights reserved.
Analyses of Sweet Receptor Gene (Tas1r2) and Preference for Sweet Stimuli in Species of Carnivora
Glaser, Dieter; Li, Weihua; Johnson, Warren E.; O'Brien, Stephen J.; Beauchamp, Gary K.; Brand, Joseph G.
2009-01-01
The extent to which taste receptor specificity correlates with, or even predicts, diet choice is not known. We recently reported that the insensitivity to sweeteners shown by species of Felidae can be explained by their lack of a functional Tas1r2 gene. To broaden our understanding of the relationship between the structure of the sweet receptors and preference for sugars and artificial sweeteners, we measured responses to 12 sweeteners in 6 species of Carnivora and sequenced the coding regions of Tas1r2 in these same or closely related species. The lion showed no preference for any of the 12 sweet compounds tested, and it possesses the pseudogenized Tas1r2. All other species preferred some of the natural sugars, and their Tas1r2 sequences, having complete open reading frames, predict functional sweet receptors. In addition to preferring natural sugars, the lesser panda also preferred 3 (neotame, sucralose, and aspartame) of the 6 artificial sweeteners. Heretofore, it had been reported that among vertebrates, only Old World simians could taste aspartame. The observation that the lesser panda highly preferred aspartame could be an example of evolutionary convergence in the identification of sweet stimuli. PMID:19366814
Cold-adapted tubulins in the glacier ice worm, Mesenchytraeus solifugus.
Tartaglia, Lawrence J; Shain, Daniel H
2008-11-01
Glacier ice worms, Mesenchytraeus solifugus and related species, are the only known annelids that survive obligately in glacier ice and snow. One fundamental component of cold temperature adaptation is the ability to polymerize tubulin, which typically depolymerizes at low physiological temperatures (e.g., <10 degrees C) in most temperate species. In this study, we isolated two alpha-tubulin (Msalpha1, Msalpha2) and two beta-tubulin (Msbeta1, Msbeta2) subunits from an ice worm cDNA library, and compared their predicted amino acid sequences with homologues from other cold-adapted organisms (e.g., Antarctic fish, ciliate) in an effort to identify species-specific amino acid substitutions that contribute to cold temperature-dependent tubulin polymerization. Our comparisons and predicted protein structures suggest that ice worm-specific amino acid substitutions stabilize lateral contact associations, particularly between beta-tubulin protofilaments, but these substitutions occur at different positions in comparison with other cold-adapted tubulins. The ice worm tubulin gene family appears relatively small, comprising one primary alpha- and one primary beta-tubulin monomers, though minor isoforms and pseudogenes were identified. Our analyses suggest that variation occurs in the strategies (i.e., species-specific amino acid substitutions, gene number) by which cold-adapted taxa have evolved the ability to polymerize tubulin at low physiological temperatures.
Recurrent and founder mutations in the PMS2 gene
Tomsic, Jerneja; Senter, Leigha; Liyanarachchi, Sandya; Clendenning, Mark; Vaughn, Cecily P.; Jenkins, Mark A.; Hopper, John L.; Young, Joanne; Samowitz, Wade; de la Chapelle, Albert
2012-01-01
Germline mutations in PMS2 are associated with Lynch syndrome (LS), the most common known cause of hereditary colorectal cancer. Mutation detection in PMS2 has been difficult due to the presence of several pseudogenes, but a custom-designed long-range PCR strategy now allows adequate mutation detection. Many mutations are unique. However some mutations are observed repeatedly, across individuals not known to be related, due to the mutation being either recurrent, arising multiple times de novo at hot spots for mutations, or of founder origin, having occurred once in an ancestor. Previously, we observed 36 distinct mutations in a sample of 61 independently ascertained Caucasian probands of mixed European background with PMS2 mutations. Eleven of these mutations were detected in more than one individual not known to be related and of these, six were detected more than twice. These six mutations accounted for 31 (51%) ostensibly unrelated probands. Here we performed genotyping and haplotype analysis in four mutations observed in multiple probands and found two (c.137G>T and exon 10 deletion) to be founder mutations, one (c.903G>T) a probable founder, and one (c.1A>G) where founder mutation status could not be evaluated. We discuss possible explanations for the frequent occurrence of founder mutations in PMS2. PMID:22577899
A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes
Hansen, Maren F; Neckmann, Ulrike; Lavik, Liss A S; Vold, Trine; Gilde, Bodil; Toft, Ragnhild K; Sjursen, Wenche
2014-01-01
The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes (MLH1, PMS2, MSH6, and MSH2) is the cause of Lynch Syndrome (LS), which mainly predisposes to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels) were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing. PMID:24689082
Recurrent and founder mutations in the PMS2 gene.
Tomsic, J; Senter, L; Liyanarachchi, S; Clendenning, M; Vaughn, C P; Jenkins, M A; Hopper, J L; Young, J; Samowitz, W; de la Chapelle, A
2013-03-01
Germline mutations in PMS2 are associated with Lynch syndrome (LS), the most common known cause of hereditary colorectal cancer. Mutation detection in PMS2 has been difficult due to the presence of several pseudogenes, but a custom-designed long-range PCR strategy now allows adequate mutation detection. Many mutations are unique. However, some mutations are observed repeatedly across individuals not known to be related due to the mutation being either recurrent, arising multiple times de novo at hot spots for mutations, or of founder origin, having occurred once in an ancestor. Previously, we observed 36 distinct mutations in a sample of 61 independently ascertained Caucasian probands of mixed European background with PMS2 mutations. Eleven of these mutations were detected in more than one individual not known to be related and of these, six were detected more than twice. These six mutations accounted for 31 (51%) ostensibly unrelated probands. Here, we performed genotyping and haplotype analysis in four mutations observed in multiple probands and found two (c.137G>T and exon 10 deletion) to be founder mutations and one (c.903G>T) a probable founder. One (c.1A>G) could not be evaluated for founder mutation status. We discuss possible explanations for the frequent occurrence of founder mutations in PMS2. © 2012 John Wiley & Sons A/S.
Kostygov, Alexei Y.; Butenko, Anzhelika; Nenarokova, Anna; Tashyreva, Daria; Flegontov, Pavel; Lukeš, Julius; Yurchenko, Vyacheslav
2017-01-01
We have sequenced, annotated, and analyzed the genome of Ca. Pandoraea novymonadis, a recently described bacterial endosymbiont of the trypanosomatid Novymonas esmeraldas. When compared with genomes of its free-living relatives, it has all the hallmarks of the endosymbionts’ genomes, such as significantly reduced size, extensive gene loss, low GC content, numerous gene rearrangements, and low codon usage bias. In addition, Ca. P. novymonadis lacks mobile elements, has a strikingly low number of pseudogenes, and almost all genes are single copied. This suggests that it already passed the intensive period of host adaptation, which still can be observed in the genome of Polynucleobacter necessarius, a certainly recent endosymbiont. Phylogenetically, Ca. P. novymonadis is more related to P. necessarius, an intracytoplasmic bacterium of free-living ciliates, than to Ca. Kinetoplastibacterium spp., the only other known endosymbionts of trypanosomatid flagellates. As judged by the extent of the overall genome reduction and the loss of particular metabolic abilities correlating with the increasing dependence of the symbiont on its host, Ca. P. novymonadis occupies an intermediate position between P. necessarius and Ca. Kinetoplastibacterium spp. We conclude that the relationships between Ca. P. novymonadis and N. esmeraldas are well-established, although not as fine-tuned as in the case of Strigomonadinae and their endosymbionts. PMID:29046673
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S. G.; Carniel, E.; Larimer, Frank W
2004-09-01
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here, we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons with available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, to our knowledge, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 other pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive insertion sequence-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of preexisting gene expression pathways, appear to be more important than acquisition of genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
Analysis of ZP1 gene reveals differences in zona pellucida composition in carnivores.
Moros-Nicolás, C; Leza, A; Chevret, P; Guillén-Martínez, A; González-Brusi, L; Boué, F; Lopez-Bejar, M; Ballesta, J; Avilés, M; Izquierdo-Rico, M J
2018-01-01
The zona pellucida (ZP) is an extracellular envelope that surrounds mammalian oocytes. This coat participates in the interaction between gametes, induction of the acrosome reaction, block of polyspermy and protection of the oviductal embryo. Previous studies suggested that carnivore ZP was formed by three glycoproteins (ZP2, ZP3 and ZP4), with ZP1 being a pseudogene. However, a recent study in the cat found that all four proteins were expressed. In the present study, in silico and molecular analyses were performed in several carnivores to clarify the ZP composition in this order of mammals. The in silico analysis demonstrated the presence of the ZP1 gene in five carnivores: cheetah, panda, polar bear, tiger and walrus, whereas in the Antarctic fur seal and the Weddell seal there was evidence of pseudogenisation. Molecular analysis showed the presence of four ZP transcripts in ferret ovaries (ZP1, ZP2, ZP3 and ZP4) and three in fox ovaries (ZP2, ZP3 and ZP4). Analysis of the fox ZP1 gene showed the presence of a stop codon. The results strongly suggest that all four ZP genes are expressed in most carnivores, whereas ZP1 pseudogenisation seems to have independently affected three families (Canidae, Otariidae and Phocidae) of the carnivore tree.
Detecting long tandem duplications in genomic sequences.
Audemard, Eric; Schiex, Thomas; Faraut, Thomas
2012-05-08
Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. In this paper, we introduce ReD Tandem, software using a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non-coding elements such as pseudogenes or RNA genes. ReD Tandem allows large tandem duplications to be identified without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene based approach, which ignores duplications involving non-coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.
The Plasmodium PHIST and RESA-Like Protein Families of Human and Rodent Malaria Parasites
Moreira, Cristina K.; Naissant, Bernina; Coppi, Alida; Bennett, Brandy L.; Aime, Elena; Franke-Fayard, Blandine; Janse, Chris J.; Coppens, Isabelle; Sinnis, Photini; Templeton, Thomas J.
2016-01-01
The phist gene family has members identified across the Plasmodium genus, defined by the presence of a domain of roughly 150 amino acids having conserved aromatic residues and an all alpha-helical structure. The family is highly amplified in P. falciparum, with 65 predicted genes in the genome of the 3D7 isolate. In contrast, in the rodent malaria parasite P. berghei 3 genes are identified, one of which is an apparent pseudogene. Transcripts of the P. berghei phist genes are predominant in schizonts, whereas in P. falciparum transcript profiles span different asexual blood stages and gametocytes. We pursued targeted disruption of P. berghei phist genes in order to characterize a simplistic model for the expanded phist gene repertoire in P. falciparum. Unsuccessful attempts to disrupt P. berghei PBANKA_114540 suggest that this phist gene is essential, while knockout of phist PBANKA_122900 shows an apparent normal progression and non-essential function throughout the life cycle. Epitope-tagging of P. falciparum and P. berghei phist genes confirmed protein export to the erythrocyte cytoplasm and localization with a punctate pattern. Three P. berghei PEXEL/HT-positive exported proteins exhibit at least partial co-localization, in support of a common vesicular compartment in the cytoplasm of erythrocytes infected with rodent malaria parasites. PMID:27022937
Gernandt, D S; Liston, A; Piñero, D
2001-12-01
The pinyon pines (Pinus subsection Cembroides), distributed in semiarid regions of the western United States and Mexico, include a mixture of relictual and more recently evolved taxa. To investigate relationships among the pinyons, we screened and partially sequenced 3000-bp clones of the nuclear ribosomal DNA internal transcribed spacer (ITS) region for 16 taxa from subsect. Cembroides and nine representatives from four other subsections of subgenus Strobus. Restriction digests of clones reveal within-individual heterogeneity, suggesting that concerted evolution is operating slowly on the ITS in pine species. Two ITS clones were identified as pseudogenes. Tandem subrepeats in the ITS1 form stem loops comparable to those in other genera of Pinaceae and may be promoting recombination between rDNA repeats, resulting in ITS1 chimeras. Within the pinyon clade, phylogenetic structure is present, but different clones from the same (or different) individuals of a species are polyphyletic, indicating that coalescence of ITS copies within individual genomes predates evolutionary divergence in the group. At the level of subsection and above, the ITS region corresponds well with morphological and cpDNA evidence. Except for P. nelsonii, the pinyons are monophyletic, with both subsect. Cembroides and P. nelsonii forming a clade with the foxtail and bristlecone pines (subsect. Balfourianae) of western North America.
Functional analysis and transcriptional output of the Göttingen minipig genome.
Heckel, Tobias; Schmucki, Roland; Berrera, Marco; Ringshandl, Stephan; Badi, Laura; Steiner, Guido; Ravon, Morgane; Küng, Erich; Kuhn, Bernd; Kratochwil, Nicole A; Schmitt, Georg; Kiialainen, Anna; Nowaczyk, Corinne; Daff, Hamina; Khan, Azinwi Phina; Lekolool, Isaac; Pelle, Roger; Okoth, Edward; Bishop, Richard; Daubenberger, Claudia; Ebeling, Martin; Certa, Ulrich
2015-11-14
In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of pharmacology and cross-reactivity of human drugs in animal models thereby improving drug attrition which is an important challenge in the process of drug development. Here we present a new chromosome level based version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as basis for translational research. We relied on mapping and assembly of WGS (whole-genome-shotgun sequencing) derived reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes compared to other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African Bushpig, and the Warthog show sequence conservation of all inactivating HEPN1 mutations suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar suggesting functional significance. Using a new minipig specific microarray we show high conservation of gene expression signatures in 13 tissues with biomedical relevance between humans and adult minipigs.
We underline this relationship for minipig and human liver where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and for phase II drug metabolizing enzymes in minipig as compared to human. The variability of gene expression in equivalent human and minipig tissues is considerably higher in minipig organs, which is important for study design in case a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparative age categories and further supports the minipig as model for pediatric drug safety studies. Genome based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as model for biomedical research or commercial breeding. Potential impact of our data for comparative genomics, translational research, and experimental medicine are discussed.
Costa, Elísio; Duque, Frederico; Oliveira, Jorge; Garcia, Paula; Gonçalves, Isabel; Diogo, Luísa; Santos, Rosário
2007-01-01
Shwachman-Diamond syndrome (SDS) is caused by mutations in the SBDS gene, most of which are the result of gene conversion events involving its highly homologous pseudogene SBDSP. Here we describe the molecular characterization of the first documented gross deletion in the SBDS gene, in a 4-year-old Portuguese girl with SDS. The clinical diagnosis was based on the presence of hematological symptoms (severe anemia and cyclic neutropenia), pancreatic exocrine insufficiency and skeletal abnormalities. Routine molecular screening revealed heterozygosity for the common splicing mutation c.258+2T>C, and a further step-wise approach led to the detection of a large deletion encompassing exon 3, the endpoints of which were subsequently delineated at the gDNA level. This novel mutation (c.258+374_459+250del), predictably giving rise to an internally deleted polypeptide (p.Ile87_Gln153del), appears to have arisen from an excision event mediated by AluSx elements which are present in introns 2 and 3. Our case illustrates the importance of including gross deletion screening in the SDS diagnostic setting, especially in cases where only one deleterious mutation is detected by routine screening methods. In particular, deletional rearrangements involving exon 3 should be considered, since Alu sequences are known to be an important cause of recurrent mutations.
One Health and Food-Borne Disease: Salmonella Transmission between Humans, Animals, and Plants.
Silva, Claudia; Calva, Edmundo; Maloy, Stanley
2014-02-01
There are >2,600 recognized serovars of Salmonella enterica. Many of these Salmonella serovars have a broad host range and can infect a wide variety of animals, including mammals, birds, reptiles, amphibians, fish, and insects. In addition, Salmonella can grow in plants and can survive in protozoa, soil, and water. Hence, broad-host-range Salmonella can be transmitted via feces from wild animals, farm animals, and pets or by consumption of a wide variety of common foods: poultry, beef, pork, eggs, milk, fruit, vegetables, spices, and nuts. Broad-host-range Salmonella pathogens typically cause gastroenteritis in humans. Some Salmonella serovars have a more restricted host range that is associated with changes in the virulence plasmid pSV, accumulation of pseudogenes, and chromosome rearrangements. These changes in host-restricted Salmonella alter pathogen-host interactions such that host-restricted Salmonella organisms commonly cause systemic infections and are transmitted between host populations by asymptomatic carriers. The secondary consequences of efforts to eliminate host-restricted Salmonella serovars demonstrate that basic ecological principles govern the environmental niches occupied by these pathogens, making it impossible to thwart Salmonella infections without a clear understanding of the human, animal, and environmental reservoirs of these pathogens. Thus, transmission of S. enterica provides a compelling example of the One Health paradigm because reducing human infections will require the reduction of Salmonella in animals and limitation of transmission from the environment.
Transposon identification using profile HMMs
2010-01-01
Background Transposons are "jumping genes" that account for large quantities of repetitive content in genomes. They are known to affect transcriptional regulation in several different ways, and are implicated in many human diseases. Transposons are related to microRNAs and viruses, and many genes, pseudogenes, and gene promoters are derived from transposons or have origins in transposon-induced duplication. Modeling transposon-derived genomic content is difficult because they are poorly conserved. Profile hidden Markov models (profile HMMs), widely used for protein sequence family modeling, are rarely used for modeling DNA sequence families. The algorithm commonly used to estimate the parameters of profile HMMs, Baum-Welch, is prone to prematurely converge to local optima. The DNA domain is especially problematic for the Baum-Welch algorithm, since it has only four letters as opposed to the twenty residues of the amino acid alphabet. Results We demonstrate with a simulation study and with an application to modeling the MIR family of transposons that two recently introduced methods, Conditional Baum-Welch and Dynamic Model Surgery, achieve better estimates of the parameters of profile HMMs across a range of conditions. Conclusions We argue that these new algorithms expand the range of potential applications of profile HMMs to many important DNA sequence family modeling problems, including that of searching for and modeling the virus-like transposons that are found in all known genomes. PMID:20158867
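As a rough illustration of the parameter-estimation step discussed above, the following sketch runs standard Baum-Welch on a small, fully connected HMM over the four-letter DNA alphabet. It is not a profile HMM and it does not implement Conditional Baum-Welch or Dynamic Model Surgery; the two-state model, the starting parameters, and the toy sequence are assumptions made only for this example. Because Baum-Welch climbs to a nearby local optimum, rerunning it from different starting parameters can end at different likelihoods, which is the sensitivity the abstract refers to.

```python
import math

ALPHABET = "ACGT"


def encode(seq):
    """Map a DNA string to a list of symbol indices (A=0, C=1, G=2, T=3)."""
    return [ALPHABET.index(ch) for ch in seq]


def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns alpha, beta, scale, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    alpha[0] = [pi[i] * B[i][obs[0]] for i in range(n)]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]
    loglik = sum(math.log(s) for s in scale)
    return alpha, beta, scale, loglik


def baum_welch_step(obs, pi, A, B):
    """One EM update; returns new (pi, A, B) and the log-likelihood of the OLD parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale, loglik = forward_backward(obs, pi, A, B)
    # Posterior probability of being in state i at position t.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_pi = gamma[0][:]
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # Expected number of i -> j transitions.
            num = sum(
                alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / scale[t + 1]
                for t in range(T - 1)
            )
            new_A[i][j] = num / denom
    new_B = [[0.0] * m for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T))
        for k in range(m):
            new_B[i][k] = sum(gamma[t][i] for t in range(T) if obs[t] == k) / denom
    return new_pi, new_A, new_B, loglik


# Hypothetical starting point: one AT-rich state and one GC-rich state.
obs = encode("ACGCGCGCATATATATTTACGCGCGCATTA")
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.4, 0.1, 0.1, 0.4], [0.1, 0.4, 0.4, 0.1]]
history = []
for _ in range(10):
    pi, A, B, ll = baum_welch_step(obs, pi, A, B)
    history.append(ll)
```

The `history` list records the log-likelihood before each update; for a correct EM implementation it should never decrease. Comparing the final values obtained from several different starting `A` and `B` matrices is a simple way to see local optima in practice.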
Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.
Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent
2002-06-01
We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.
Epigenetic Regulation of the Sex Determination Gene MeGI in Polyploid Persimmon.
Akagi, Takashi; Henry, Isabelle M; Kawai, Takashi; Comai, Luca; Tao, Ryutaro
2016-12-01
Epigenetic regulation can add a flexible layer to genetic variation, potentially enabling long-term but reversible cis-regulatory changes to an allele while maintaining its DNA sequence. Here, we present a case in which alternative epigenetic states lead to reversible sex determination in the hexaploid persimmon Diospyros kaki. Previously, we elucidated the molecular mechanism of sex determination in diploid persimmon and demonstrated the action of a Y-encoded sex determinant pseudogene called OGI, which produces small RNAs targeting the autosomal gene MeGI, resulting in separate male and female individuals (dioecy). We contrast these findings with the discovery, in hexaploid persimmon, of an additional layer of regulation in the form of DNA methylation of the MeGI promoter associated with the production of both male and female flowers in genetically male trees. Consistent with this model, developing male buds exhibited higher methylation levels across the MeGI promoter than developing female flowers from either monoecious or female trees. Additionally, a DNA methylation inhibitor induced developing male buds to form feminized flowers. Concurrently, in Y-chromosome-carrying trees, the expression of OGI is silenced by the presence of a SINE (short interspersed nuclear element)-like insertion in the OGI promoter. Our findings provide an example of an adaptive scenario involving epigenetic plasticity. © 2016 American Society of Plant Biologists. All rights reserved.
Liu, Yang; Wang, Bin; Cui, Peng; Li, Libo; Xue, Jia-Yu; Yu, Jun; Qiu, Yin-Long
2012-01-01
Mitochondrial genomes have maintained some bacterial features despite their residence within eukaryotic cells for approximately two billion years. One of these features is the frequent presence of polycistronic operons. In land plants, however, it has been shown that all sequenced vascular plant chondromes lack large polycistronic operons while bryophyte chondromes have many of them. In this study, we provide the completely sequenced mitochondrial genome of a lycophyte, from Huperzia squarrosa, which is a member of the sister group to all other vascular plants. The genome, at a size of 413,530 base pairs, contains 66 genes and 32 group II introns. In addition, it has 69 pseudogene fragments for 24 of the 40 protein- and rRNA-coding genes. It represents the most archaic form of mitochondrial genomes of all vascular plants. In particular, it has one large conserved gene cluster containing up to 10 ribosomal protein genes, which likely represents a polycistronic operon but has been disrupted and greatly reduced in the chondromes of other vascular plants. It also has the least rearranged gene order in comparison to the chondromes of other vascular plants. The genome is ancestral in vascular plants in several other aspects: the gene content resembling those of charophytes and most bryophytes, all introns being cis-spliced, a low level of RNA editing, and lack of foreign DNA of chloroplast or nuclear origin.
Kauzlaric, Annamaria; Ecco, Gabriela; Cassano, Marco; Duc, Julien; Imbeault, Michael; Trono, Didier
2017-01-01
KRAB-containing poly-zinc finger proteins (KZFPs) constitute the largest family of transcription factors encoded by mammalian genomes, and growing evidence indicates that they fulfill functions critical to both embryonic development and maintenance of adult homeostasis. KZFP genes underwent broad and independent waves of expansion in many higher vertebrate lineages, yet comprehensive studies of members harbored by a given species are scarce. Here we present a thorough analysis of KZFP genes and related units in the murine genome. We first identified about twice as many elements as previously annotated as either KZFP genes or pseudogenes, notably by assigning to this family an entity formerly considered as a large group of Satellite repeats. We then could delineate an organization in clusters distributed throughout the genome, with signs of recombination, translocation, duplication and seeding of new sites by retrotransposition of KZFP genes and related genetic units (KZFP/rGUs). Moreover, we harvested evidence indicating that closely related paralogs had evolved through both drifting and shifting of sequences encoding zinc finger arrays. Finally, we could demonstrate that the KAP1-SETDB1 repressor complex tames the expression of KZFP/rGUs within clusters, yet that the primary targets of this regulation are not the KZFP/rGUs themselves but enhancers contained in neighboring endogenous retroelements and that, underneath, KZFPs conserve highly individualized patterns of expression. PMID:28334004
Grandi, Nicole; Tramontano, Enzo
2017-06-27
Human Endogenous Retroviruses (HERVs) are ancient infection relics constituting ~8% of our DNA. While HERVs’ genomic characterization is still ongoing, impressive amounts of data have been obtained regarding their general expression across tissues. Among HERVs, one of the most studied is the W group, which is the sole HERV group specifically mobilized by the long interspersed element-1 (LINE-1) machinery, providing a source of novel insertions by retrotransposition of HERV-W processed pseudogenes, and comprising a member encoding a functional envelope protein coopted for human placentation. The HERV-W group has been intensively investigated for its putative role in several diseases, such as cancer, inflammation, and autoimmunity. Despite major interest in the link between HERV-W expression and human pathogenesis, no conclusive correlation has been demonstrated so far. In general, (i) the absence of a proper identification of the specific HERV-W sequences expressed in a given condition; and (ii) the lack of studies attempting to connect the various observations in the same experimental conditions are the major problems preventing the definitive assessment of the HERV-W impact on human physiopathology. In this review, we summarize the current knowledge on the HERV-W group presence within the human genome and its expression in physiological tissues as well as in the main pathological contexts. PMID:28653997
Silva, Marcio Roberto; Rocha, Adalgiza da Silva; da Costa, Ronaldo Rodrigues; de Alencar, Andrea Padilha; de Oliveira, Vania Maria; Fonseca Júnior, Antônio Augusto; Sales, Mariana Lázaro; Issa, Marina de Azevedo; Filho, Paulo Martins Soares; Pereira, Omara Tereza Vianello; dos Santos, Eduardo Calazans; Mendes, Rejane Silva; Ferreira, Angela Maria de Jesus; Mota, Pedro Moacyr Pinto Coelho; Suffys, Philip Noel; Guimarães, Mark Drew Crosland
2013-05-01
In this cross-sectional study, mycobacteria specimens from 189 tuberculosis (TB) patients living in an urban area in Brazil were characterised from 2008-2010 using phenotypic and molecular speciation methods (pncA gene and oxyR pseudogene analysis). Of these samples, 174 isolates simultaneously grew on Löwenstein-Jensen (LJ) and Stonebrink (SB)-containing media and presented phenotypic and molecular profiles of Mycobacterium tuberculosis, whereas 12 had molecular profiles of M. tuberculosis based on the DNA analysis of formalin-fixed paraffin wax-embedded tissue samples (paraffin blocks). One patient produced two sputum isolates, the first of which simultaneously grew on LJ and SB media and presented phenotypic and molecular profiles of M. tuberculosis, and the second of which only grew on SB media and presented phenotypic profiles of Mycobacterium bovis. One patient provided a bronchial lavage isolate, which simultaneously grew on LJ and SB media and presented phenotypic and molecular profiles of M. tuberculosis, but had molecular profiles of M. bovis from paraffin block DNA analysis, and one sample had molecular profiles of M. tuberculosis and M. bovis identified from two distinct paraffin blocks. Moreover, we found a low prevalence (1.6%) of M. bovis among these isolates, which suggests that local health service procedures likely underestimate its real frequency and that it deserves more attention from public health officials.
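As a rough consistency check on the counts reported above, the sketch below tallies the patient groups and recomputes the M. bovis prevalence. Treating the three patients with any M. bovis evidence as the numerator is an assumption made here for illustration; the abstract does not state the numerator explicitly.

```python
# Patient groups as described in the abstract (labels are paraphrased).
groups = {
    "M. tuberculosis by culture and molecular profile": 174,
    "M. tuberculosis by paraffin-block DNA analysis": 12,
    "second sputum isolate with M. bovis phenotype": 1,
    "bronchial lavage patient with M. bovis in paraffin block": 1,
    "both species found in two distinct paraffin blocks": 1,
}

total_patients = sum(groups.values())

# Assumption: the three single-patient groups are the M. bovis cases.
bovis_cases = 3
prevalence_pct = 100 * bovis_cases / total_patients

print(total_patients, round(prevalence_pct, 1))
```

Under that reading, the groups add up to the 189 patients in the study and the recomputed prevalence rounds to the reported 1.6%.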
Vampire bats exhibit evolutionary reduction of bitter taste receptor genes common to other bats
Hong, Wei; Zhao, Huabin
2014-01-01
The bitter taste serves as an important natural defence against the ingestion of poisonous foods and is thus believed to be indispensable in animals. However, vampire bats are obligate blood feeders that show a reduced behavioural response towards bitter-tasting compounds. To test whether bitter taste receptor genes (T2Rs) have been relaxed from selective constraint in vampire bats, we sampled all three vampire bat species and 11 non-vampire bats, and sequenced nine one-to-one orthologous T2Rs that are assumed to be functionally conserved in all bats. We generated 85 T2R sequences and found that vampire bats have a significantly greater percentage of pseudogenes than other bats. These results strongly suggest a relaxation of selective constraint and a reduction of bitter taste function in vampire bats. We also found that vampire bats retain many intact T2Rs, and that the taste signalling pathway gene Calhm1 remains complete and intact with strong functional constraint. These results suggest the presence of some bitter taste function in vampire bats, although it is not likely to play a major role in food selection. Together, our study suggests that the evolutionary reduction of bitter taste function in animals is more pervasive than previously believed, and highlights the importance of extra-oral functions of taste receptor genes. PMID:24966321
A benchmark study of scoring methods for non-coding mutations.
Drubay, Damien; Gautheret, Daniel; Michiels, Stefan
2018-05-15
Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been performed to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 Genomes Project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. Contact: damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.
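To make the "low precision" remark concrete, here is a minimal sketch of how precision and recall could be computed for a deleteriousness score at a fixed cutoff. It is not the authors' pipeline (their code is in the repository cited above); the function name, the cutoff and the toy scores are all hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when variants scoring >= threshold are called pathogenic.

    scores: deleteriousness scores, one per variant
    labels: True for assumed pathogenic, False for assumed benign
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data, not taken from ClinVar or COSMIC.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [True, False, True, False, False, True]

precision, recall = precision_recall(toy_scores, toy_labels, threshold=0.5)
print(precision, recall)
```

In this toy case half of the variants called pathogenic are actually benign, which is the kind of outcome a low-precision tool produces even when recall is reasonable.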
Schuster, W; Wissinger, B; Unseld, M; Brennicke, A
1990-01-01
A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, which is also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. PMID:1688531
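The distinction between silent and codon-altering editing sites can be illustrated with a short sketch that compares a genomic coding sequence with its cDNA and reports each C-to-U change (read as C-to-T in cDNA). The sequences below are invented toy data, not nad3, and the use of the standard genetic code for plant mitochondria is an assumption stated here rather than something the abstract specifies.

```python
# Standard genetic code, built in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def editing_sites(genomic, cdna):
    """List C-to-T differences between an in-frame genomic CDS and its cDNA.

    Returns tuples (position, genomic_aa, edited_aa, is_silent), with
    0-based positions. The edited codon is taken from the cDNA as a whole,
    so two edits in one codon are classified jointly.
    """
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("sequences must be equal-length and in frame")
    sites = []
    for pos, (g, c) in enumerate(zip(genomic, cdna)):
        if g == "C" and c == "T":
            start = pos - pos % 3
            before = CODON_TABLE[genomic[start:start + 3]]
            after = CODON_TABLE[cdna[start:start + 3]]
            sites.append((pos, before, after, before == after))
    return sites


# Hypothetical toy sequences: TCA->TTA changes Ser to Leu, TTC->TTT stays Phe.
for site in editing_sites("TCATTC", "TTATTT"):
    print(site)
```

The first toy site changes the encoded amino acid, while the second falls at a third codon position and is silent, mirroring the two classes of sites described above.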
Gao, Bin; Zhu, Shunyi
2016-01-01
Drosomycin (DRS) is a strictly antifungal peptide in Drosophila melanogaster, which contains four disulfide bridges (DBs), with three buried in the molecular interior and one exposed on the molecular surface to tie the amino- and carboxyl-termini of the molecule together (called the wrapper disulfide bridge, WDB). Based on computational analysis of genomes of Drosophila species belonging to the Oriental lineage, we identified a new multigene family of DRS in Drosophila takahashii that includes a total of 11 DRS-encoding genes (termed DtDRS-1 to DtDRS-11) and a pseudogene. Phylogenetic tree and synteny analyses reveal orthologous relationships between DtDRSs and DRSs, indicating that orthologous genes of DRS-1, DRS-2, DRS-3 and DRS-6 have undergone duplication in D. takahashii and that three amplifications (DtDRS-9 to DtDRS-11) of DRS-3 have lost the WDB. Among the 11 genes, five are transcriptionally active in adult fruitflies. The ortholog of DRS (DtDRS-1) shows high structural and functional similarity to DRS, while two WDB-deficient members display antibacterial activity accompanied by complete loss or remarkable reduction of antifungal activity. To the best of our knowledge, this is the first report on the presence of three-disulfide antibacterial DRSs in a specific Drosophila species, suggesting a potential role of DB loss in neofunctionalization of a protein via structural adjustment. PMID:27562645
Kohzuma, Kaori; Chiba, Motoko; Nagano, Soichiro; Anai, Toyoaki; Ueda, Miki U.; Oguchi, Riichi; Shirai, Kazumasa; Hanada, Kousuke; Hikosaka, Kouki; Fujii, Nobuharu
2017-01-01
Radish (Raphanus sativus L. var. sativus), a widely cultivated root vegetable crop, possesses a large sink organ (the root), implying that photosynthetic activity in radish can be enhanced by altering both the source and sink capacity of the plant. However, since radish is a self-incompatible plant, improved mutation-breeding strategies are needed for this crop. TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful method used for reverse genetics. In this study, we developed a new TILLING strategy involving a two-step mutant selection process for mutagenized radish plants: the first selection is performed to identify a BC1M1 line, that is, progenies of M1 plants crossed with wild-type, and the second step is performed to identify BC1M1 individuals with mutations. We focused on Rubisco as a target, since Rubisco is the most abundant plant protein and a key photosynthetic enzyme. We found that the radish genome contains six RBCS genes and one pseudogene encoding small Rubisco subunits. We screened 955 EMS-induced BC1M1 lines using our newly developed TILLING strategy and obtained six mutant lines for the six RsRBCS genes, encoding proteins with four different types of amino acid substitutions. Finally, we selected a homozygous mutant and subjected it to physiological measurements. PMID:28744180
Choi, Kyoung Su; Park, SeonJoo
2015-11-10
Aster spathulifolius, a member of the Asteraceae family, is distributed along the coasts of Japan and Korea. This plant is used for medicinal and ornamental purposes. The complete chloroplast (cp) genome of A. spathulifolius consists of 149,473 bp that include a pair of inverted repeats of 24,751 bp each, separated by a large single copy region of 81,998 bp and a small single copy region of 17,973 bp. The chloroplast genome contains 78 coding genes, four rRNA genes and 29 tRNA genes. When compared to other cpDNA sequences of Asteraceae, A. spathulifolius showed the closest relationship with Jacobaea vulgaris, and its atpB gene was found to be a pseudogene, unlike that of J. vulgaris. Furthermore, evaluation of the gene compositions of J. vulgaris, Helianthus annuus, Guizotia abyssinica and A. spathulifolius revealed that a 13.6-kb region from ndhF to rps15 is inverted, unlike in Lactuca of Asteraceae. Comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates with J. vulgaris revealed that synonymous rates were highest in genes related to the small subunit of the ribosome (0.1558), while nonsynonymous rates were highest in ATP synthase genes (0.0118). These findings revealed that substitution has occurred at similar rates in most genes, and the substitution rates suggested that most genes are under purifying selection. Copyright © 2015 Elsevier B.V. All rights reserved.
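The region sizes quoted above can be checked against the reported genome length, since a chloroplast genome with the usual quadripartite layout is the sum of the large single copy region, the small single copy region and two copies of the inverted repeat. The following sketch does that arithmetic; the variable and function names are illustrative only.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 81_998            # large single copy region
SSC = 17_973            # small single copy region
IR = 24_751             # one inverted repeat copy
REPORTED_TOTAL = 149_473


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

The sum matches the reported 149,473 bp, which confirms that the 24,751 bp figure refers to a single repeat copy.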
Expression of human placental lactogen and variant growth hormone genes in placentas.
Martinez-Rodriguez, H G; Guerra-Rodriguez, N E; Iturbe-Cantu, M A; Martinez-Torres, A; Barrera-Saldaña, H A
1997-01-01
Previous studies comparing the expression levels of human placental lactogen (hPL) genes have shown varying results, perhaps because each of them analyzed only one placenta. Here, the expression of hPL and growth hormone variant (hGH-V) genes in fifteen term placentas was comparatively analyzed at the RNA level, using reverse transcription coupled to polymerase chain reaction (RT-PCR). The abundance of the combined RNA transcripts derived from these genes varied from one placenta to another. The authors found that hPL-4 transcripts were more abundant than those of hPL-3 in most samples (ratios from 1:1 to 6:1), and that transcripts from the putative hPL-1 pseudogene were more abundant at the unprocessed stage while those of the hGH-V gene were mostly processed. Again, the authors of this study observed wide variation from placenta to placenta in the abundance of both of these types of transcripts. The same was observed when a group of six placentas from abortuses and nine from pregnancies complicated by preeclampsia, diabetes and hypertension was studied. The authors conclude that the conflicting results reported in the literature concerning the expression levels of hPL genes could be explained by normal variations of their expression levels among the different placentas analyzed.
Fukami, Maki; Iso, Manami; Sato, Naoko; Igarashi, Maki; Seo, Misuzu; Kazukawa, Itsuro; Kinoshita, Eiichi; Dateki, Sumito; Ogata, Tsutomu
2013-01-01
Combined pituitary hormone deficiency (CPHD), isolated hypogonadotropic hypogonadism (IHH), Kallmann syndrome (KS), and septo-optic dysplasia (SOD) are genetically related conditions caused by abnormal development of the anterior midline in the forebrain. Although mutations in the fibroblast growth factor receptor 1 (FGFR1) gene have been implicated in the development of IHH, KS, and SOD, the relevance of FGFR1 abnormalities to CPHD remains to be elucidated. Here, we report a Japanese female patient with CPHD and FGFR1 haploinsufficiency. The patient was identified through copy-number analyses and direct sequencing of FGFR1 performed for 69 patients with CPHD. The patient presented with a combined deficiency of GH, LH and FSH, and multiple neurological abnormalities. In addition, normal TSH values along with a low free T4 level indicated the presence of central hypothyroidism. Molecular analyses identified a heterozygous ~8.5 Mb deletion involving 56 genes and pseudogenes. None of these genes except FGFR1 has been associated with brain development. No FGFR1 abnormalities were identified in the remaining 68 patients, although two patients carried nucleotide substitutions (p.V102I and p.S107L) that were assessed as benign polymorphisms by in vitro functional assays. These results indicate a possible role of FGFR1 in anterior pituitary function and the rarity of FGFR1 abnormalities in patients with CPHD.
Parvari, R; Ziv, E; Lentner, F; Tel-Or, S; Burstein, Y; Schechter, I
1987-01-01
cDNA libraries of chicken spleen and Harder gland (a gland enriched with immunocytes) constructed in pBR322 were screened by differential hybridization and by mRNA hybrid-selected translation. Eleven L-chain cDNA clones were identified from which VL probes were prepared and each was annealed with kidney DNA restriction digests. All VL probes revealed the same set of bands, corresponding to about 15 germline VL genes of one subgroup. The nucleotide sequences of six VL clones showed ≥85% homology, and the predicted amino acid sequences were identical or nearly identical to the major N-terminal sequence of L-chains in chicken serum. These findings, and the fact that the VL clones were randomly selected from normal lymphoid tissues, strongly indicate that the bulk of chicken L-chains is encoded by a few germline VL genes, probably far fewer than 15 since many of the VL genes are known to be pseudogenes. Therefore, it is likely that somatic mechanisms operating prior to specific triggering by antigen play a major role in the generation of antibody diversity in chicken. Analysis of the constant region locus (sequencing of CL gene and cDNAs) demonstrates a single CL isotype and suggests the presence of CL allotypes.
Sawicki, Rafał; Singh, Sharda P; Mondal, Ashis K; Benes, Helen; Zimniak, Piotr
2003-01-01
From the fruitfly, Drosophila melanogaster, ten members of the cluster of Delta-class glutathione S-transferases (GSTs; formerly denoted as Class I GSTs) and one member of the Epsilon-class cluster (formerly GST-3) have been cloned, expressed in Escherichia coli, and their catalytic properties have been determined. In addition, nine more members of the Epsilon cluster have been identified through bioinformatic analysis but not further characterized. Of the 11 expressed enzymes, seven accepted the lipid peroxidation product 4-hydroxynonenal as substrate, and nine were active in glutathione conjugation of 1-chloro-2,4-dinitrobenzene. Since the enzymically active proteins included the gene products of DmGSTD3 and DmGSTD7 which were previously deemed to be pseudogenes, we investigated them further and determined that both genes are transcribed in Drosophila. Thus our present results indicate that DmGSTD3 and DmGSTD7 are probably functional genes. The existence and multiplicity of insect GSTs capable of conjugating 4-hydroxynonenal, in some cases with catalytic efficiencies approaching those of mammalian GSTs highly specialized for this function, indicates that metabolism of products of lipid peroxidation is a highly conserved biochemical pathway with probable detoxification as well as regulatory functions. PMID:12443531
Lipoxygenase pathways in Homo neanderthalensis: functional comparison with Homo sapiens isoforms
Chaitidis, Pavlos; Adel, Susan; Anton, Monika; Heydeck, Dagmar; Kuhn, Hartmut; Horn, Thomas
2013-01-01
Lipoxygenases (LOX) have been implicated in biosynthesis of pro- and anti-inflammatory mediators, and a previous report suggested compromised leukotriene signaling in H. neanderthalensis. To search for corresponding differences in leukotriene biosynthesis, we screened the Neandertal genome for LOX genes and found that, as modern humans, this archaic hominid contains six LOX genes (nALOX15, nALOX12, nALOX5, nALOX15B, nALOX12B, and nALOXE3) and one pseudogene. In the Neandertal genome, 60–75% of the amino acids of the different human LOX isoforms have been identified, and the degree of identity varies between 96 and 99%. Most functional amino acids (iron ligands, specificity determinants, calcium and ATP-binding sites, membrane-binding determinants, and phosphorylation sites) are well conserved in the Neandertal LOX isoforms, and expression of selected neandertalized human LOX mutants revealed no major functional defects. However, in nALOX12 and nALOXE3, two premature stop codons were found, leading to inactive enzyme species. These data suggest that ALOX15, ALOX5, ALOX15B, and ALOX12B should have been present as functional enzymes in H. neanderthalensis and that in contrast to lower nonhuman primates (M. mulatta) and other mammals (mice, rats), this ancient hominid expressed a 15-lipoxygenating ALOX15. Expression of ALOXE3 and ALOX12 was compromised, which might have caused problems in epidermal differentiation. PMID:23475662
Insights into natural products biosynthesis from analysis of 490 polyketide synthases from Fusarium.
Brown, Daren W; Proctor, Robert H
2016-04-01
Species of the fungus Fusarium collectively cause disease on almost all crop plants and produce numerous natural products (NPs), including some of the mycotoxins of greatest concern to agriculture. Many Fusarium NPs are derived from polyketide synthases (PKSs), large multi-domain enzymes that catalyze sequential condensation of simple carboxylic acids to form polyketides. To gain insight into the biosynthesis of polyketide-derived NPs in Fusarium, we retrieved 488 PKS gene sequences from genome sequences of 31 species of the fungus. In addition to these apparently functional PKS genes, the genomes collectively included 81 pseudogenized PKS genes. Phylogenetic analysis resolved the PKS genes into 67 clades, and based on multiple lines of evidence, we propose that homologs in each clade are responsible for synthesis of a polyketide that is distinct from those synthesized by PKSs in other clades. The presence and absence of PKS genes among the species examined indicated marked differences in distribution of PKS homologs. Comparisons of Fusarium PKS genes and genes flanking them to those from other Ascomycetes provided evidence that Fusarium has the genetic potential to synthesize multiple NPs that are the same or similar to those reported in other fungi, but that have not yet been reported in Fusarium. The results also highlight ways in which such analyses can help guide identification of novel Fusarium NPs and differences in NP biosynthetic capabilities that exist among fungi. Published by Elsevier Inc.
Detection of novel NF1 mutations and rapid mutation prescreening with Pyrosequencing.
Brinckmann, Anja; Mischung, Claudia; Bässmann, Ingelore; Kühnisch, Jirko; Schuelke, Markus; Tinschert, Sigrid; Nürnberg, Peter
2007-12-01
Neurofibromatosis type 1 (NF1) is caused by mutations in the neurofibromin (NF1) gene. Mutation analysis of NF1 is complicated by its large size, the lack of mutation hotspots, pseudogenes and frequent de novo mutations. Additionally, the search for NF1 mutations on the mRNA level is often hampered by nonsense-mediated mRNA decay (NMD) of the mutant allele. In this study we searched for mutations in a cohort of 38 patients and investigated the relationship between mutation type and allele-specific transcription from the wild-type versus mutant alleles. Quantification of relative mRNA transcript numbers was done by Pyrosequencing, a novel real-time sequencing method whose signals can be quantified very accurately. We identified 21 novel mutations comprising various mutation types. Pyrosequencing detected a definite relationship between allelic NF1 transcript imbalance due to NMD and mutation type in 24 of 29 patients who all carried frame-shift or nonsense mutations. NMD was absent in 5 patients with missense and silent mutations, as well as in 4 patients with splice-site mutations that did not disrupt the reading frame. Pyrosequencing was capable of detecting NMD even when the effects were only moderate. Diagnostic laboratories could thus exploit this effect for rapid prescreening for NF1 mutations as more than 60% of the mutations in this gene disrupt the reading frame and are prone to NMD.
Law, MeiYee; Childs, Kevin L; Campbell, Michael S; Stein, Joshua C; Olson, Andrew J; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M; Lawrence, Carolyn J; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark
2015-01-01
The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. © 2015 American Society of Plant Biologists. All Rights Reserved.
Sloan, Daniel B; Müller, Karel; McCauley, David E; Taylor, Douglas R; Storchová, Helena
2012-12-01
In angiosperms, mitochondrial-encoded genes can cause cytoplasmic male sterility (CMS), resulting in the coexistence of female and hermaphroditic individuals (gynodioecy). We compared four complete mitochondrial genomes from the gynodioecious species Silene vulgaris and found unprecedented amounts of intraspecific diversity for plant mitochondrial DNA (mtDNA). Remarkably, only about half of overall sequence content is shared between any pair of genomes. The four mtDNAs range in size from 361 to 429 kb and differ in gene complement, with rpl5 and rps13 being intact in some genomes but absent or pseudogenized in others. The genomes exhibit essentially no conservation of synteny and are highly repetitive, with evidence of reciprocal recombination occurring even across short repeats (< 250 bp). Some mitochondrial genes exhibit atypically high degrees of nucleotide polymorphism, while others are invariant. The genomes also contain a variable number of small autonomously mapping chromosomes, which have only recently been identified in angiosperm mtDNA. Southern blot analysis of one of these chromosomes indicated a complex in vivo structure consisting of both monomeric circles and multimeric forms. We conclude that S. vulgaris harbors an unusually large degree of variation in mtDNA sequence and structure and discuss the extent to which this variation might be related to CMS. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.
Bentley, Stephen D.; Corton, Craig; Brown, Susan E.; Barron, Andrew; Clark, Louise; Doggett, Jon; Harris, Barbara; Ormond, Doug; Quail, Michael A.; May, Georgiana; Francis, David; Knudson, Dennis; Parkhill, Julian; Ishimaru, Carol A.
2008-01-01
Clavibacter michiganensis subsp. sepedonicus is a plant-pathogenic bacterium and the causative agent of bacterial ring rot, a devastating agricultural disease under strict quarantine control and zero tolerance in the seed potato industry. This organism appears to be largely restricted to an endophytic lifestyle, proliferating within plant tissues and unable to persist in the absence of plant material. Analysis of the genome sequence of C. michiganensis subsp. sepedonicus and comparison with the genome sequences of related plant pathogens revealed a dramatic recent evolutionary history. The genome contains 106 insertion sequence elements, which appear to have been active in extensive rearrangement of the chromosome compared to that of Clavibacter michiganensis subsp. michiganensis. There are 110 pseudogenes with overrepresentation in functions associated with carbohydrate metabolism, transcriptional regulation, and pathogenicity. Genome comparisons also indicated that there is substantial gene content diversity within the species, probably due to differential gene acquisition and loss. These genomic features and evolutionary dating suggest that there was recent adaptation for life in a restricted niche where nutrient diversity and perhaps competition are low, correlated with a reduced ability to exploit previously occupied complex niches outside the plant. Toleration of factors such as multiplication and integration of insertion sequence elements, genome rearrangements, and functional disruption of many genes and operons seems to indicate that there has been general relaxation of selective pressure on a large proportion of the genome. PMID:18192393
van de Guchte, M; Penaud, S; Grimaldi, C; Barbe, V; Bryson, K; Nicolas, P; Robert, C; Oztas, S; Mangenot, S; Couloux, A; Loux, V; Dervyn, R; Bossy, R; Bolotin, A; Batto, J-M; Walunas, T; Gibrat, J-F; Bessières, P; Weissenbach, J; Ehrlich, S D; Maguin, E
2006-06-13
Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is a representative of the group of lactic acid-producing bacteria, mainly known for its worldwide application in yogurt production. The genome sequence of this bacterium has been determined and shows signs of ongoing specialization, with a substantial number of pseudogenes and incomplete metabolic pathways and relatively few regulatory functions. Several unique features of the L. bulgaricus genome support the hypothesis that the genome is in a phase of rapid evolution. (i) Exceptionally high numbers of rRNA and tRNA genes with regard to genome size may indicate that the L. bulgaricus genome has undergone a recent phase of substantial size reduction, in agreement with the observed high frequency of gene inactivation and elimination; (ii) a much higher GC content at codon position 3 than expected on the basis of the overall GC content suggests that the composition of the genome is evolving toward a higher GC content; and (iii) the presence of a 47.5-kbp inverted repeat in the replication termination region, an extremely rare feature in bacterial genomes, may be interpreted as a transient stage in genome evolution. The results indicate the adaptation of L. bulgaricus from a plant-associated habitat to the stable protein and lactose-rich milk environment through the loss of superfluous functions and protocooperation with Streptococcus thermophilus.
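Point (ii) compares GC content at the third codon position (often called GC3) with the overall GC content. A minimal sketch of how the two quantities could be computed for a single in-frame coding sequence follows; the function names and the toy sequence are hypothetical, and no L. bulgaricus values are reproduced here.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """GC fraction at third codon positions of an in-frame coding sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    third_positions = cds[2:usable:3]
    return gc_fraction(third_positions)


# Hypothetical toy CDS: codons ATG GCC AAG TTC.
toy_cds = "ATGGCCAAGTTC"
print(gc_fraction(toy_cds), gc3_fraction(toy_cds))
```

In the toy sequence every third position is G or C although only half of all bases are, which is the direction of skew the abstract describes for the genome as a whole.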
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mushegian, Arcady R., E-mail: mushegian2@gmail.com; Elena, Santiago F., E-mail: sfelena@ibmcp.upv.es; The Santa Fe Institute, Santa Fe, NM 87501
Homologs of Tobacco mosaic virus 30K cell-to-cell movement protein are encoded by diverse plant viruses. Mechanisms of action and evolutionary origins of these proteins remain obscure. We expand the picture of conservation and evolution of the 30K proteins, producing sequence alignment of the 30K superfamily with the broadest phylogenetic coverage thus far and illuminating structural features of the core all-beta fold of these proteins. Integrated copies of pararetrovirus 30K movement genes are prevalent in euphyllophytes, with at least one copy intact in nearly every examined species, and mRNAs detected for most of them. Sequence analysis suggests repeated integrations, pseudogenizations, and positive selection in those provirus genes. An unannotated 30K-superfamily gene in Arabidopsis thaliana genome is likely expressed as a fusion with the At1g37113 transcript. This molecular background of endopararetrovirus gene products in plants may change our view of virus infection and pathogenesis, and perhaps of cellular homeostasis in the hosts. - Highlights: • Sequence region shared by plant virus “30K” movement proteins has an all-beta fold. • Most euphyllophyte genomes contain integrated copies of pararetroviruses. • These integrated virus genomes often include intact movement protein genes. • Molecular evidence suggests that these “30K” genes may be selected for function.
Jin, Ke; Xue, Chenyi; Wu, Xiaoli; Qian, Jinyi; Zhu, Yong; Yang, Zhen; Yonezawa, Takahiro; Crabbe, M. James C.; Cao, Ying; Hasegawa, Masami; Zhong, Yang; Zheng, Yufang
2011-01-01
Background The giant panda has an interesting bamboo diet unlike the other species in the order of Carnivora. The umami taste receptor gene T1R1 has been identified as a pseudogene during its genome sequencing project and confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. Such mutation coincided with the giant panda's dietary change and also reinforced its herbivorous life style. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda's diet switch. Methodology/Principal Findings Since taste is part of the reward properties of food related to its energy and nutrition contents, we did a systematic analysis on those genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared with the human sequence first and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Conclusions/Significance Our results revealed an interesting dopamine metabolic involvement in the panda's food choice. This finding suggests a new direction for molecular evolution studies behind the panda's dietary switch. PMID:21818345
Non-coding RNA in cystic fibrosis.
Glasgow, Arlene M A; De Santi, Chiara; Greene, Catherine M
2018-05-09
Non-coding RNAs (ncRNAs) are an abundant class of RNAs that include small ncRNAs, long non-coding RNAs (lncRNA) and pseudogenes. The human ncRNA atlas includes thousands of these specialised RNA molecules that are further subcategorised based on their size or function. Two of the more well-known and widely studied ncRNA species are microRNAs (miRNAs) and lncRNAs. These are regulatory RNAs and their altered expression has been implicated in the pathogenesis of a variety of human diseases. Failure to express a functional cystic fibrosis (CF) transmembrane conductance regulator (CFTR) chloride ion channel in epithelial cells underpins CF. Secondary to the CFTR defect, it is known that other pathways can be altered and these may contribute to the pathophysiology of CF lung disease in particular. For example, quantitative alterations in expression of some ncRNAs are associated with CF. In recent years, there has been a series of published studies exploring ncRNA expression and function in CF. The majority have focussed principally on miRNAs, with just a handful of reports to date on lncRNAs. The present study reviews what is currently known about ncRNA expression and function in CF, and discusses the possibility of applying this knowledge to the clinical management of CF in the near future. © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
A double-strand break can trigger immunoglobulin gene conversion
Bastianello, Giulia; Arakawa, Hiroshi
2017-01-01
All three B cell-specific activities of the immunoglobulin (Ig) gene re-modeling system—gene conversion, somatic hypermutation and class switch recombination—require activation-induced deaminase (AID). AID-induced DNA lesions must be further processed and dissected into different DNA recombination pathways. In order to characterize potential intermediates for Ig gene conversion, we inserted an I-SceI recognition site into the complementarity determining region 1 (CDR1) of the Ig light chain locus of the AID knockout DT40 cell line, and conditionally expressed I-SceI endonuclease. Here, we show that a double-strand break (DSB) in CDR1 is sufficient to trigger Ig gene conversion in the absence of AID. The pattern and pseudogene usage of DSB-induced gene conversion were comparable to those of AID-induced gene conversion; surprisingly, sometimes a single DSB induced multiple gene conversion events. These constitute direct evidence that a DSB in the V region can be an intermediate for gene conversion. The fate of the DNA lesion downstream of a DSB had more flexibility than that of AID, suggesting two alternative models: (i) DSBs during the physiological gene conversion are in the minority compared to single-strand breaks (SSBs), which are frequently generated following DNA deamination, or (ii) the physiological gene conversion is mediated by a tightly regulated DSB that is locally protected from non-homologous end joining (NHEJ) or other non-homologous DNA recombination machineries. PMID:27701075
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).
Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar
2016-12-01
In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp long coding sequence with a variable non-transcribed spacer (NTS). The 5S rDNA repeats were initially considered to have evolved by concerted evolution. However, some recent reports, including ones on teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear species-specific clustering. A stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the conclusion that 5S rDNA repeats in the genus Channa evolved by a birth-and-death process.
KIR-HLA distribution in a Vietnamese population from Hanoi.
Amorim, Leonardo Maldaner; van Tong, Hoang; Hoan, Nghiem Xuan; Vargas, Luciana de Brito; Ribeiro, Enilze Maria de Souza Fonseca; Petzl-Erler, Maria Luiza; Boldt, Angelica B W; Toan, Nguyen Linh; Song, Le Huu; Velavan, Thirumalaisamy P; Augusto, Danillo G
2018-02-01
The KIR (killer cell immunoglobulin-like receptors) gene family encodes a group of receptors that recognize human leukocyte antigens (HLA) and modulate natural killer (NK) cell responses. Genetic diversity of KIR genes and HLA ligands has not yet been deeply investigated in South East Asia. Here, we characterized KIR gene presence and absence polymorphism of 14 KIR genes and two pseudogenes, as well as the frequencies of the ligands HLA-Bw4, HLA-C1 and HLA-C2 in a Vietnamese population from Hanoi (n = 140). Genotyping was performed by polymerase chain reaction with specific sequence primers (PCR-SSP). We compared KIR frequencies and performed principal component analysis with 43 worldwide populations of different ancestries. KIR carrier frequencies in Vietnamese were similar to those reported for Thai and Chinese Han, but differed significantly from other geographically close populations such as Japanese and South Korean. This similarity was also observed in KIR gene-content genotypes and is in accordance with the origin from Southern China and Thailand proposed for the Vietnamese population. The frequencies of HLA ligands observed in Vietnamese did not differ from those reported for other East-Asian populations (p > .05). Studies regarding KIR-HLA in populations are of prime importance to understand their evolution, function and role in diseases. Copyright © 2017 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
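As a rough illustration of how carrier frequencies follow from presence/absence genotyping such as PCR-SSP, the sketch below counts, for each gene, the proportion of individuals in whom it was detected. The toy genotypes and function names are hypothetical, and the gene-frequency estimate assumes Hardy-Weinberg proportions; neither is stated in the abstract.

```python
from math import sqrt
from typing import Dict, Iterable, Set


def carrier_frequencies(genotypes: Iterable[Set[str]]) -> Dict[str, float]:
    """Proportion of individuals carrying each gene.

    Each genotype is the set of genes detected as present in one individual.
    """
    genotypes = list(genotypes)
    if not genotypes:
        raise ValueError("at least one individual is required")
    counts: Dict[str, int] = {}
    for present in genotypes:
        for gene in present:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: n / len(genotypes) for gene, n in counts.items()}


def estimated_gene_frequency(carrier_frequency: float) -> float:
    """Gene frequency implied by a carrier frequency under Hardy-Weinberg
    proportions (an assumption of this sketch): 1 - sqrt(1 - f)."""
    return 1.0 - sqrt(1.0 - carrier_frequency)


# Hypothetical toy data: four individuals, not from the study.
toy_genotypes = [
    {"KIR2DL1", "KIR3DL1"},
    {"KIR2DL1"},
    {"KIR2DL1", "KIR2DS2"},
    {"KIR3DL1"},
]
for gene, freq in sorted(carrier_frequencies(toy_genotypes).items()):
    print(gene, freq, round(estimated_gene_frequency(freq), 3))
```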
miRSponge: a manually curated database for experimentally supported miRNA sponges and ceRNAs
Wang, Peng; Zhi, Hui; Zhang, Yunpeng; Liu, Yue; Zhang, Jizhou; Gao, Yue; Guo, Maoni; Ning, Shangwei; Li, Xia
2015-01-01
In this study, we describe miRSponge, a manually curated database, which aims at providing an experimentally supported resource for microRNA (miRNA) sponges. Recent evidence suggests that miRNAs are themselves regulated by competing endogenous RNAs (ceRNAs) or ‘miRNA sponges’ that contain miRNA binding sites. These competitive molecules can sequester miRNAs to prevent them interacting with their natural targets to play critical roles in various biological and pathological processes. It has become increasingly important to develop a high quality database to record and store ceRNA data to support future studies. To this end, we have established the experimentally supported miRSponge database that contains data on 599 miRNA-sponge interactions and 463 ceRNA relationships from 11 species following manual curating from nearly 1200 published articles. Database classes include endogenously generated molecules including coding genes, pseudogenes, long non-coding RNAs and circular RNAs, along with exogenously introduced molecules including viral RNAs and artificial engineered sponges. Approximately 70% of the interactions were identified experimentally in disease states. miRSponge provides a user-friendly interface for convenient browsing, retrieval and downloading of dataset. A submission page is also included to allow researchers to submit newly validated miRNA sponge data. Database URL: http://www.bio-bigdata.net/miRSponge. PMID:26424084
DNA Methylation of T1R1 Gene in the Vegetarian Adaptation of Grass Carp Ctenopharyngodon idella.
Cai, Wenjing; He, Shan; Liang, Xu-Fang; Yuan, Xiaochen
2018-05-02
Although previous studies have indicated the importance of taste receptors in the formation of food habits in mammals, little is known about their role in fish. Grass carp is an excellent model for studying vegetarian adaptation, as it shows a food habit transition from carnivore to herbivore. In the present study, pseudogenization or frameshift mutations of the umami receptors, which are hypothesized to be related to dietary switches in vertebrates, were not found in grass carp, suggesting other mechanisms for vegetarian adaptation in this species. T1R1 and T1R3 strongly responded to L-Arg and L-Lys, differing from those of zebrafish and medaka and contributing to high species specificity in the amino acid preferences and diet selection of grass carp. After the food habit transition of grass carp, DNA methylation levels were higher in the CPG1 and CPG3 islands of the upstream control region of the T1R1 gene. A luciferase activity assay of the upstream regulatory region of T1R1 (-2500-0 bp) without CPG1 or CPG3 indicated that CPG1 and CPG3 might be involved in transcriptional regulation of the T1R1 gene. Subsequently, high DNA methylation decreased expression of T1R1 in the intestinal tract. Regulation of umami receptor expression via epigenetic modification could be a new mechanism that explains, at least partially, the vegetarian adaptation of grass carp.
2009-01-01
Olfaction is essential for the survival of animals. Versatile odour molecules in the environment are received by olfactory receptors (ORs), which form the largest multigene family in vertebrates. Identification of the entire repertoires of OR genes using bioinformatics methods from the whole-genome sequences of diverse organisms revealed that the numbers of OR genes vary enormously, ranging from ~1,200 in rats and ~400 in humans to ~150 in zebrafish and ~15 in pufferfish. Most species have a considerable fraction of pseudogenes. Extensive phylogenetic analyses have suggested that the numbers of gene gains and losses are extremely large in the OR gene family, which is a striking example of the birth-and-death evolution. It appears that OR gene repertoires change dynamically, depending on each organism's living environment. For example, higher primates equipped with a well-developed vision system have lost a large number of OR genes. Moreover, two groups of OR genes for detecting airborne odorants greatly expanded after the time of terrestrial adaptation in the tetrapod lineage, whereas fishes retain diverse repertoires of genes that were present in aquatic ancestral species. The origin of vertebrate OR genes can be traced back to the common ancestor of all chordate species, but insects, nematodes and echinoderms utilise distinctive families of chemoreceptors, suggesting that chemoreceptor genes have evolved many times independently in animal evolution. PMID:20038498
Extensive Gains and Losses of Olfactory Receptor Genes in Mammalian Evolution
Niimura, Yoshihito; Nei, Masatoshi
2007-01-01
Odor perception in mammals is mediated by a large multigene family of olfactory receptor (OR) genes. The number of OR genes varies extensively among different species of mammals, and most species have a substantial number of pseudogenes. To gain some insight into the evolutionary dynamics of mammalian OR genes, we identified the entire set of OR genes in platypuses, opossums, cows, dogs, rats, and macaques and studied the evolutionary change of the genes together with those of humans and mice. We found that platypuses and primates have <400 functional OR genes while the other species have 800–1,200 functional OR genes. We then estimated the numbers of gains and losses of OR genes for each branch of the phylogenetic tree of mammals. This analysis showed that (i) gene expansion occurred in the placental lineage each time after it diverged from monotremes and from marsupials and (ii) hundreds of gains and losses of OR genes have occurred in an order-specific manner, making the gene repertoires highly variable among different orders. It appears that the number of OR genes is determined primarily by the functional requirement for each species, but once the number reaches the required level, it fluctuates by random duplication and deletion of genes. This fluctuation seems to have been aided by the stochastic nature of OR gene expression. PMID:17684554
Small interfering RNA-producing loci in the ancient parasitic eukaryote Trypanosoma brucei
2012-01-01
Background At the core of the RNA interference (RNAi) pathway in Trypanosoma brucei is a single Argonaute protein, TbAGO1, with an established role in controlling retroposon and repeat transcripts. Recent evidence from higher eukaryotes suggests that a variety of genomic sequences with the potential to produce double-stranded RNA are sources for small interfering RNAs (siRNAs). Results To test whether such endogenous siRNAs are present in T. brucei and to probe the individual role of the two Dicer-like enzymes, we affinity purified TbAGO1 from wild-type procyclic trypanosomes, as well as from cells deficient in the cytoplasmic (TbDCL1) or nuclear (TbDCL2) Dicer, and subjected the bound RNAs to Illumina high-throughput sequencing. In wild-type cells the majority of reads originated from two classes of retroposons. We also considerably expanded the repertoire of trypanosome siRNAs to encompass a family of 147-bp satellite-like repeats, many of the regions where RNA polymerase II transcription converges, large inverted repeats and two pseudogenes. Production of these newly described siRNAs is strictly dependent on the nuclear DCL2. Notably, our data indicate that putative centromeric regions, excluding the CIR147 repeats, are not a significant source for endogenous siRNAs. Conclusions Our data suggest that endogenous RNAi targets may be as evolutionarily old as the mechanism itself. PMID:22925482
Anti-digoxin Fab variants generated by phage display.
Murata, Viviane Midori; Schmidt, Mariana Costa Braga; Kalil, Jorge; Tsuruta, Lilian Rumi; Moro, Ana Maria
2013-06-01
Digoxin is a pharmaceutical used in the control of cardiac dysfunction. Its therapeutic window is narrow, with the effective dosage very close to the toxic dosage. To counteract the toxic effect, polyclonal Fab fragments are commercially available. Our study is based on a monoclonal anti-digoxin antibody, which would provide a product with a specific potency and more precise dosage for the detoxification of patients under digoxin treatment. Phage display technology was used to select variants with high affinity. From an anti-digoxin hybridoma, RNA was extracted for subsequent cDNA synthesis. Specific primers were used for the LC and Fd amplifications, and the products were then cloned sequentially in a phagemid vector (pComb3X) for the combinatorial Fab library construction. Clones were selected for their ability to bind to digoxin-BSA. The presence of light and heavy chains was checked, and randomly selected clones were then sequenced, induced to produce soluble Fabs, and subsequently analyzed for anti-digoxin expression. Out of ten randomly chosen clones, six showed positive expression of the product. Sequencing of these revealed two identical clones and one presenting a pseudogene in the LC. Four clones presenting variations in framework 1 showed binding to digoxin-BSA by ELISA and western blotting. The specific binding was further confirmed by Biacore®, which allowed ranking of the clones. The development of these clones allowed the selection of variants with higher affinity than the original version.
Liu, Tingli; Ye, Wenwu; Ru, Yanyan; Yang, Xinyu; Gu, Biao; Tao, Kai; Lu, Shan; Dong, Suomeng; Zheng, Xiaobo; Shan, Weixing; Wang, Yuanchao; Dou, Daolong
2011-01-01
Phytophthora sojae encodes hundreds of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling- and necrosis-inducing proteins (CRN) or Crinkler. Their functions and mechanisms in pathogenesis are mostly unknown. Here, we identify a group of five P. sojae-specific CRN-like genes with high levels of sequence similarity, of which three are putative pseudogenes. Functional analysis shows that the two functional genes encode proteins with predicted nuclear localization signals that induce contrasting responses when expressed in Nicotiana benthamiana and soybean (Glycine max). PsCRN63 induces cell death, while PsCRN115 suppresses cell death elicited by the P. sojae necrosis-inducing protein (PsojNIP) or PsCRN63. Expression of CRN fragments with deleted signal peptides and FLAK motifs demonstrates that the carboxyl-terminal portions of PsCRN63 or PsCRN115 are sufficient for their activities. However, the predicted nuclear localization signal is required for PsCRN63 to induce cell death but not for PsCRN115 to suppress cell death. Furthermore, silencing of the PsCRN63 and PsCRN115 genes in P. sojae stable transformants leads to a reduction of virulence on soybean. Intriguingly, the silenced transformants lose the ability to suppress host cell death and callose deposition on inoculated plants. These results suggest a role for CRN effectors in the suppression of host defense responses.
Shen, Danyu; Liu, Tingli; Ye, Wenwu; Liu, Li; Liu, Peihan; Wu, Yuren; Wang, Yuanchao; Dou, Daolong
2013-01-01
Phytophthora and other oomycetes secrete a large number of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling and necrosis inducing proteins (CRN), or Crinkler. Here, we first investigated the evolutionary patterns and mechanisms of CRN effectors in Phytophthora sojae and compared them to two other Phytophthora species. The genes encoding CRN effectors could be divided into 45 orthologous gene groups (OGG), and most OGGs were unequally distributed among the three species, each of which underwent a large number of gene gains or losses, indicating that the CRN genes expanded after species evolution in Phytophthora and evolved through pathoadaptation. The 134 expanded genes in P. sojae encoded family proteins, included 82 functional genes and were expressed at higher levels, while the other 68 genes, which encoded orphan proteins, were less expressed and contained 50 pseudogenes. Furthermore, we demonstrated that most expanded genes underwent gene duplication and/or fragment recombination. Three different mechanisms that drove gene duplication or recombination were identified. Finally, the expanded CRN effectors exhibited varying pathogenic functions, including induction of programmed cell death (PCD) and suppression of PCD through PAMP-triggered immunity and/or effector-triggered immunity. Overall, these results suggest that gene duplication and fragment recombination may be two mechanisms that drive the expansion and neofunctionalization of the CRN family in P. sojae, which aids in understanding the roles of CRN effectors within each oomycete pathogen.
Sugano, Kokichi; Nakajima, Takeshi; Sekine, Shigeki; Taniguchi, Hirokazu; Saito, Shinya; Takahashi, Masahiro; Ushiama, Mineko; Sakamoto, Hiromi; Yoshida, Teruhiko
2016-11-01
Germline PMS2 gene mutations were detected by RT-PCR/direct sequencing of total RNA extracted from puromycin-treated peripheral blood lymphocytes (PBL) and by multiplex ligation-dependent probe amplification (MLPA) in Japanese patients with colorectal cancer (CRC) who either fulfilled the revised Bethesda Guidelines or were younger than 70 years at disease onset, and who had been screened by mismatch repair protein immunohistochemistry (IHC) of formalin-fixed paraffin-embedded sections. Of the 501 subjects examined, 7 (1.40%) showed downregulated expression of the PMS2 protein alone and were referred to the genetic counseling clinic. Germline PMS2 mutations were detected in 6 (85.7%), including 3 nonsense mutations and 1 frameshift mutation by RT-PCR/direct sequencing and 2 genomic deletions by MLPA. No mutations were identified in the other MMR genes (i.e. MSH2, MLH1 and MSH6). The prevalence of downregulated expression of the PMS2 protein alone was 1.40% among the subjects examined, and IHC results predicted the presence of PMS2 germline mutations. RT-PCR from puromycin-treated PBL and MLPA may be employed as the first screening step to detect PMS2 mutations without pseudogene interference, followed by long-range PCR/nested PCR validation using genomic DNA. © 2016 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.
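The percentages quoted in this abstract can be rechecked from the reported counts. The sketch below is only a worked arithmetic example; the helper name `percent` and the variable names are hypothetical.

```python
def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to the given number of digits."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, digits)


# Counts as reported in the abstract.
subjects_examined = 501
isolated_pms2_loss = 7          # downregulated PMS2 protein alone on IHC
germline_mutation_found = 6     # among the 7 subjects above
mutations_by_type = {"nonsense": 3, "frameshift": 1, "genomic deletion": 2}

# The mutation types should add up to the number of mutation carriers.
assert sum(mutations_by_type.values()) == germline_mutation_found

print(percent(isolated_pms2_loss, subjects_examined, digits=2))   # prevalence of isolated PMS2 loss
print(percent(germline_mutation_found, isolated_pms2_loss))       # detection rate among those subjects
```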
van der Klift, Heleen M; Tops, Carli M J; Bik, Elsa C; Boogaard, Merel W; Borgstein, Anne-Marijke; Hansson, Kerstin B M; Ausems, Margreet G E M; Gomez Garcia, Encarna; Green, Andrew; Hes, Frederik J; Izatt, Louise; van Hest, Liselotte P; Alonso, Angel M; Vriends, Annette H J T; Wagner, Anja; van Zelst-Stams, Wendy A G; Vasen, Hans F A; Morreau, Hans; Devilee, Peter; Wijnen, Juul T
2010-05-01
Heterozygous mutations in PMS2 are involved in Lynch syndrome, whereas biallelic mutations are found in Constitutional mismatch repair-deficiency syndrome patients. Mutation detection is complicated by the occurrence of sequence exchange events between the duplicated regions of PMS2 and PMS2CL. We investigated the frequency of such events with a nonspecific polymerase chain reaction (PCR) strategy, co-amplifying both PMS2 and PMS2CL sequences. This allowed us to score ratios between gene and pseudogene-specific nucleotides at 29 PSV sites from exon 11 to the end of the gene. We found sequence transfer at all investigated PSVs from intron 12 to the 3' end of the gene in 4 to 52% of DNA samples. Overall, sequence exchange between PMS2 and PMS2CL was observed in 69% (83/120) of individuals. We demonstrate that mutation scanning with PMS2-specific PCR primers and MLPA probes, designed on PSVs, in the 3' duplicated region is unreliable, and present an RNA-based mutation detection strategy to improve reliability. Using this strategy, we found 19 different putative pathogenic PMS2 mutations. Four of these (21%) are lying in the region with frequent sequence transfer and are missed or called incorrectly as homozygous with several PSV-based mutation detection methods. © 2010 Wiley-Liss, Inc.
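One way to picture the ratio scoring at a PSV site is sketched below. It rests on an assumption not spelled out in the abstract: when PMS2 and PMS2CL are co-amplified from a diploid sample without sequence exchange, the gene-specific nucleotide is expected on 2 of the 4 amplified copies, so a ratio closer to 3:1 or 1:3 would suggest sequence transfer. The function names and signal values are hypothetical.

```python
def gene_copies_at_psv(gene_signal: float, pseudo_signal: float,
                       total_copies: int = 4) -> int:
    """Estimate how many of the co-amplified copies carry the gene-specific
    nucleotide, from the relative signal of the two nucleotides at one PSV."""
    total = gene_signal + pseudo_signal
    if total <= 0:
        raise ValueError("signals must sum to a positive value")
    return round(total_copies * gene_signal / total)


def shows_exchange(gene_signal: float, pseudo_signal: float) -> bool:
    """True when the estimated copy number departs from the 2:2 ratio
    expected without sequence exchange (an assumption of this sketch)."""
    return gene_copies_at_psv(gene_signal, pseudo_signal) != 2


# Hypothetical signal intensities, for illustration only.
print(shows_exchange(52.0, 48.0))   # close to 2:2
print(shows_exchange(75.0, 25.0))   # close to 3:1

# Percentages reported in the abstract, rechecked from the counts.
print(round(100 * 83 / 120), round(100 * 4 / 19))
```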
Khan, Abdur Rahim; Park, Gun-Seok; Asaf, Sajjad; Hong, Sung-Jun; Jung, Byung Kwon
2017-01-01
Serratia marcescens RSC-14 is a Gram-negative bacterium that was previously isolated from the surface-sterilized roots of the Cd-hyperaccumulator Solanum nigrum. The strain stimulates plant growth and alleviates Cd stress in host plants. To investigate the genetic basis for these traits, the complete genome of RSC-14 was obtained by single-molecule real-time sequencing. The genome of S. marcescens RSC-14 comprised a 5.12-Mbp-long circular chromosome containing 4,593 predicted protein-coding genes, 22 rRNA genes, 88 tRNA genes, and 41 pseudogenes. It contained genes with potential functions in plant growth promotion, including genes involved in indole-3-acetic acid (IAA) biosynthesis, acetoin synthesis, and phosphate solubilization. Moreover, annotation using NCBI and Rapid Annotation using Subsystem Technology identified several genes that encode antioxidant enzymes as well as genes involved in antioxidant production, supporting the observed resistance towards heavy metals, such as Cd. The presence of IAA pathway-related genes and oxidative stress-responsive enzyme genes may explain the plant growth-promoting potential and Cd tolerance, respectively. This is the first report of a complete genome sequence of Cd-tolerant S. marcescens and its plant growth promotion pathway. The whole-genome analysis of this strain clarified the genetic basis underlying its phenotypic and biochemical characteristics, underpinning the beneficial interactions between RSC-14 and plants. PMID:28187139
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.
Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.
Maintenance and Loss of Duplicated Genes by Dosage Subfunctionalization.
Gout, Jean-Francois; Lynch, Michael
2015-08-01
Whole-genome duplications (WGDs) have contributed to gene-repertoire enrichment in many eukaryotic lineages. However, most duplicated genes are eventually lost and it is still unclear why some duplicated genes are evolutionarily successful whereas others quickly turn into pseudogenes. Here, we show that dosage constraints are major factors opposing post-WGD gene loss in several Paramecium species that share a common ancestral WGD. We propose a model where a majority of WGD-derived duplicates preserve their ancestral function and are retained to produce enough of the proteins performing this same ancestral function. Under this model, the expression level of individual duplicated genes can evolve neutrally as long as they maintain a roughly constant summed expression, and this allows random genetic drift toward uneven contributions of the two copies to total expression. Our analysis suggests that once a high level of imbalance is reached, which can require substantial lengths of time, the copy with the lowest expression level contributes a small enough fraction of the total expression that selection no longer opposes its loss. Extension of our analysis to yeast species sharing a common ancestral WGD yields similar results, suggesting that duplicated-gene retention for dosage constraints followed by divergence in expression level and eventual deterministic gene loss might be a universal feature of post-WGD evolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
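The drift model described in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' analysis: it assumes a fixed summed expression of 1.0, a uniform random step per generation, and an arbitrary 5% threshold below which the weaker copy is treated as dispensable. The function name and all parameter values are hypothetical.

```python
import random

def steps_until_imbalance(threshold=0.05, step=0.02, seed=0, max_steps=1_000_000):
    """Random-walk the share of total expression contributed by copy A.

    The summed expression of the two copies is held constant (1.0), so only
    the split between them drifts. Returns the number of steps taken before
    the weaker copy contributes less than `threshold` of the total, or None
    if `max_steps` is reached first.
    """
    rng = random.Random(seed)
    share_a = 0.5  # assumption: both copies start with equal expression
    for t in range(1, max_steps + 1):
        share_a += rng.uniform(-step, step)      # neutral drift of the split
        share_a = min(1.0, max(0.0, share_a))    # a share stays within [0, 1]
        if min(share_a, 1.0 - share_a) < threshold:
            return t
    return None

if __name__ == "__main__":
    waits = [steps_until_imbalance(seed=s) for s in range(5)]
    print("steps until the weaker copy drops below 5%:", waits)
```

Because each step moves the split by at most 0.02, reaching the threshold from an even split always takes many steps, which mirrors the abstract's point that a high level of imbalance can take a long time to arise.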
Reduced MHC and neutral variation in the Galápagos hawk, an island endemic
2011-01-01
Background Genes at the major histocompatibility complex (MHC) are known for high levels of polymorphism maintained by balancing selection. In small or bottlenecked populations, however, genetic drift may be strong enough to overwhelm the effect of balancing selection, resulting in reduced MHC variability. In this study we investigated MHC evolution in two recently diverged bird species: the endemic Galápagos hawk (Buteo galapagoensis), which occurs in small, isolated island populations, and its widespread mainland relative, the Swainson's hawk (B. swainsoni). Results We amplified at least two MHC class II B gene copies in each species. We recovered only three different sequences from 32 Galápagos hawks, while we amplified 20 unique sequences in 20 Swainson's hawks. Most of the sequences clustered into two groups in a phylogenetic network, with one group likely representing pseudogenes or nonclassical loci. Neutral genetic diversity at 17 microsatellite loci was also reduced in the Galápagos hawk compared to the Swainson's hawk. Conclusions The corresponding loss in neutral diversity suggests that the reduced variability present at Galápagos hawk MHC class II B genes compared to the Swainson's hawk is primarily due to a founder event followed by ongoing genetic drift in small populations. However, purifying selection could also explain the low number of MHC alleles present. This lack of variation at genes involved in the adaptive immune response could be cause for concern should novel diseases reach the archipelago. PMID:21612651
Dufresne, Karine; Saulnier-Bellemare, Julie; Daigle, France
2018-01-01
The human-specific pathogen Salmonella enterica serovar Typhi causes typhoid, a major public health issue in developing countries. Several aspects of its pathogenesis are still poorly understood. S. Typhi possesses 14 fimbrial gene clusters including 12 chaperone-usher fimbriae (stg, sth, bcf, fim, saf, sef, sta, stb, stc, std, ste, and tcf). These fimbriae are weakly expressed in laboratory conditions and only a few have actually been characterized. In this study, expression of all S. Typhi chaperone-usher fimbriae and their potential roles in pathogenesis, such as interaction with host cells, motility, or biofilm formation, were assessed. All S. Typhi fimbriae were better expressed in minimal broth. Each system was overexpressed and only the fimbrial gene clusters without pseudogenes demonstrated a putative major subunit of about 17 kDa on SDS-PAGE. Six of these (Fim, Saf, Sta, Stb, Std, and Tcf) also showed extracellular structures by electron microscopy. The impact of fimbrial deletion in a wild-type strain or addition of each individual fimbrial system to an S. Typhi afimbrial strain was tested for interactions with host cells, biofilm formation and motility. Several fimbriae modified bacterial interactions with human cells (THP-1 and INT-407) and biofilm formation. However, only Fim fimbriae had a deleterious effect on motility when overexpressed. Overall, chaperone-usher fimbriae seem to be an important part of the balance between the different steps (motility, adhesion, host invasion and persistence) of S. Typhi pathogenesis.
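Willems, Patrick; Ndah, Elvis; Jonckheere, Veronique; Stael, Simon; Sticker, Adriaan; Martens, Lennart; Van Breusegem, Frank; Gevaert, Kris; Van Damme, Petra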
2017-01-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195
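The filter on "N-terminal methionine excision specificity" mentioned in this abstract can be sketched as a simple rule check. The code below is a simplified illustration, not the authors' pipeline. It assumes the commonly cited rule that the initiator methionine is removed when the second residue has a small side chain (Ala, Gly, Ser, Thr, Cys, Pro or Val). The function names and example sequences are hypothetical.

```python
# Residues after which the initiator Met is typically removed (simplified rule).
SMALL_RESIDUES = set("AGSTCPV")

def expected_n_terminus(protein_seq):
    """Return the mature N-terminal sequence predicted by the simplified rule."""
    if len(protein_seq) >= 2 and protein_seq[0] == "M" and protein_seq[1] in SMALL_RESIDUES:
        return protein_seq[1:]   # initiator Met is excised
    return protein_seq           # initiator Met is retained

def is_nme_compliant(peptide, protein_seq):
    """True if an observed N-terminal peptide matches the predicted mature start."""
    return expected_n_terminus(protein_seq).startswith(peptide)

if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(is_nme_compliant("ASKQ", "MASKQLE"))  # Met removed before Ala
    print(is_nme_compliant("MKLV", "MKLVDE"))   # Met retained before Lys
```

A real pipeline would also have to handle N-terminal modifications and score each peptide-spectrum match, which this sketch leaves out.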
Shen, Bin; Fang, Tao; Dai, Mengyao; Jones, Gareth; Zhang, Shuyi
2013-01-01
A trade-off between the sensory modalities of vision and hearing is likely to have occurred in echolocating bats as the sophisticated mechanism of laryngeal echolocation requires considerable neural processing and has reduced the reliance of echolocating bats on vision for perceiving the environment. If such a trade-off exists, it is reasonable to hypothesize that some genes involved in visual function may have undergone relaxed selection or even functional loss in echolocating bats. The Gap junction protein, alpha 10 (Gja10, encoded by the Gja10 gene) is expressed abundantly in mammalian retinal horizontal cells and plays an important role in horizontal cell coupling. The interphotoreceptor retinoid-binding protein (Irbp, encoded by the Rbp3 gene) is mainly expressed in the interphotoreceptor matrix and is known to be critical for normal functioning of the visual cycle. We sequenced Gja10 and Rbp3 genes in a taxonomically wide range of bats with divergent auditory characteristics (35 and 18 species for Gja10 and Rbp3, respectively). Both genes have become pseudogenes in species from the families Hipposideridae and Rhinolophidae that emit constant frequency echolocation calls with Doppler shift compensation at high-duty-cycles (the most sophisticated form of biosonar known), and in some bat species that emit echolocation calls at low-duty-cycles. Our study thus provides further evidence for the hypothesis that a trade-off occurs at the genetic level between vision and echolocation in bats. PMID:23874796
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Mingqun; Kikuchi, Takane; Brewer, Heather M.
2011-02-17
Anaplasma phagocytophilum and Ehrlichia chaffeensis are obligatory intracellular α-proteobacteria that infect human leukocytes and cause potentially fatal emerging zoonoses. In the present study, we determined global protein expression profiles of these bacteria cultured in the human promyelocytic leukemia cell line, HL-60. Mass spectrometric (MS) analyses identified a total of 1,212 A. phagocytophilum and 1,021 E. chaffeensis proteins, representing 89.3 and 92.3% of the predicted bacterial proteomes, respectively. Nearly all bacterial proteins (~99%) with known functions were expressed, whereas only approximately 80% of hypothetical proteins were detected in infected human cells. Quantitative MS/MS analyses indicated that highly expressed proteins in both bacteria included chaperones, enzymes involved in biosynthesis and metabolism, and outer membrane proteins, such as A. phagocytophilum P44 and E. chaffeensis P28/OMP-1. Among 113 A. phagocytophilum p44 paralogous genes, 110 of them were expressed and 88 of them were encoded by pseudogenes. In addition, bacterial infection of HL-60 cells up-regulated the expression of human proteins involved mostly in cytoskeleton components, vesicular trafficking, cell signaling, and energy metabolism, but down-regulated some pattern recognition receptors involved in innate immunity. Our proteomics data represent a comprehensive analysis of A. phagocytophilum and E. chaffeensis proteomes, and provide a quantitative view of human host protein expression profiles regulated by bacterial infection. The availability of these proteomic data will provide new insights into biology and pathogenesis of these obligatory intracellular pathogens.
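The coverage figures in this abstract imply approximate sizes for the predicted proteomes, and a short calculation makes that explicit. The sketch below uses only the numbers stated in the abstract. The helper name is hypothetical, and the totals are back-calculated estimates, not values reported by the authors.

```python
def implied_total(detected, percent_of_proteome):
    """Back-calculate proteome size from a detected count and its coverage."""
    return detected / (percent_of_proteome / 100.0)

# Figures as stated in the abstract.
a_phago_total = implied_total(1212, 89.3)   # A. phagocytophilum
e_chaff_total = implied_total(1021, 92.3)   # E. chaffeensis
p44_expressed_fraction = 110 / 113          # expressed p44 paralogs

if __name__ == "__main__":
    print(f"Implied A. phagocytophilum proteome: ~{a_phago_total:.0f} proteins")
    print(f"Implied E. chaffeensis proteome: ~{e_chaff_total:.0f} proteins")
    print(f"p44 paralogs expressed: {p44_expressed_fraction:.1%}")
```

Under these figures the predicted proteomes come to roughly 1,357 and 1,106 proteins, and about 97% of the p44 paralogs were detected as expressed.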
Up-regulation of heat shock proteins is essential for cold survival during insect diapause
Rinehart, Joseph P.; Li, Aiqing; Yocum, George D.; Robich, Rebecca M.; Hayward, Scott A. L.; Denlinger, David L.
2007-01-01
Diapause, the dormancy common to overwintering insects, evokes a unique pattern of gene expression. In the flesh fly, most, but not all, of the fly's heat shock proteins (Hsps) are up-regulated. The diapause up-regulated Hsps include two members of the Hsp70 family, one member of the Hsp60 family (TCP-1), at least four members of the small Hsp family, and a small Hsp pseudogene. Expression of an Hsp70 cognate, Hsc70, is uninfluenced by diapause, and Hsp90 is actually down-regulated during diapause; thus, diapause differs from common stress responses that elicit synchronous up-regulation of all Hsps. Up-regulation of the Hsps begins at the onset of diapause, persists throughout the overwintering period, and ceases within hours after the fly receives the signal to reinitiate development. The up-regulation of Hsps appears to be common to diapause in species representing diverse insect orders including Diptera, Lepidoptera, Coleoptera, and Hymenoptera as well as in diapauses that occur in different developmental stages (embryo, larva, pupa, adult). Suppressing expression of Hsp23 and Hsp70 in flies by using RNAi did not alter the decision to enter diapause or the duration of diapause, but it had a profound effect on the pupa's ability to survive low temperatures. We thus propose that up-regulation of Hsps during diapause is a major factor contributing to cold-hardiness of overwintering insects. PMID:17522254
White, J.K.; Shaw, M.A.; Barton, C.H.
1994-11-15
Recent interest has focused on the region of conserved synteny between mouse chromosome 1 and human 2q33-q37, particularly over the region encoding the murine macrophage resistance gene Ity/Lsh/Bcg (candidate Nramp) and members of the Il8r interleukin-8 (IL8) receptor gene cluster. In this paper, identification of a restriction fragment length polymorphism in the Il8RB gene in 35 pedigrees previously typed for markers in the 2q33-37 interval provided evidence (lod scores > 3) for linkage between Il8RB and the 2q34-q35 markers FN1, TNP1, VIL1, and DES. Physical mapping, using yeast artificial chromosomes isolated with VIL1, confirmed that IL8RA, IL8RB and the IL8RB pseudogene map within the NRAMP-VIL1 interval, with the physical distance (155 kb) from 5′ LSH to 3′ VIL1 representing ~3-fold that observed in the mouse. Partial sequencing of NRAMP confirmed the presence of the N-terminal proline/serine-rich putative SH3 binding domain in exon 2 of the human gene. Further analysis of Brazilian leprosy and visceral leishmaniasis pedigrees identified a rare second allele varying in a 9-nucleotide repeat motif of the exon 2 sequence but segregating independently of the disease phenotype. 38 refs., 4 figs., 3 tabs.
Euarchontan Opsin Variation Brings New Focus to Primate Origins
Melin, Amanda D.; Wells, Konstans; Moritz, Gillian L.; Kistler, Logan; Orkin, Joseph D.; Timm, Robert M.; Bernard, Henry; Lakim, Maklarin B.; Perry, George H.; Kawamura, Shoji; Dominy, Nathaniel J.
2016-01-01
Debate on the adaptive origins of primates has long focused on the functional ecology of the primate visual system. For example, it is hypothesized that variable expression of short- (SWS1) and middle-to-long-wavelength sensitive (M/LWS) opsins, which confer color vision, can be used to infer ancestral activity patterns and therefore selective ecological pressures. A problem with this approach is that opsin gene variation is incompletely known in the grandorder Euarchonta, that is, the orders Scandentia (treeshrews), Dermoptera (colugos), and Primates. The ancestral state of primate color vision is therefore uncertain. Here, we report on the genes (OPN1SW and OPN1LW) that encode SWS1 and M/LWS opsins in seven species of treeshrew, including the sole nocturnal scandentian Ptilocercus lowii. In addition, we examined the opsin genes of the Central American woolly opossum (Caluromys derbianus), an enduring ecological analogue in the debate on primate origins. Our results indicate: 1) retention of ultraviolet (UV) visual sensitivity in C. derbianus and a shift from UV to blue spectral sensitivities at the base of Euarchonta; 2) ancient pseudogenization of OPN1SW in the ancestors of P. lowii, but a signature of purifying selection in those of C. derbianus; and, 3) the absence of OPN1LW polymorphism among diurnal treeshrews. These findings suggest functional variation in the color vision of nocturnal mammals and a distinctive visual ecology of early primates, perhaps one that demanded greater spatial resolution under light levels that could support cone-mediated color discrimination. PMID:26739880
Allying with armored snails: the complete genome of gammaproteobacterial endosymbiont.
Nakagawa, Satoshi; Shimamura, Shigeru; Takaki, Yoshihiro; Suzuki, Yohey; Murakami, Shun-ichi; Watanabe, Tamaki; Fujiyoshi, So; Mino, Sayaka; Sawabe, Tomoo; Maeda, Takahiro; Makita, Hiroko; Nemoto, Suguru; Nishimura, Shin-Ichiro; Watanabe, Hiromi; Watsuji, Tomo-o; Takai, Ken
2014-01-01
Deep-sea vents harbor dense populations of various animals that have their specific symbiotic bacteria. Scaly-foot gastropods, which are snails with mineralized scales covering the sides of their foot, have a gammaproteobacterial endosymbiont in their enlarged esophageal glands and diverse epibionts on the surface of their scales. In this study, we report the complete genome sequence of the gammaproteobacterial endosymbiont. The endosymbiont genome displays features consistent with ongoing genome reduction such as large proportions of pseudogenes and insertion elements. The genome encodes functions commonly found in deep-sea vent chemoautotrophs such as sulfur oxidation and carbon fixation. Stable carbon isotope ((13)C)-labeling experiments confirmed the endosymbiont chemoautotrophy. The genome also includes an intact hydrogenase gene cluster that potentially has been horizontally transferred from phylogenetically distant bacteria. Notable findings include the presence and transcription of genes for flagellar assembly, through which proteins are potentially exported from the bacterium to the host. Symbionts of snail individuals exhibited extreme genetic homogeneity, showing only two synonymous changes in 19 different genes (13,810 positions in total) determined for 32 individual gastropods collected from a single colony at one time. The extremely low genetic individuality in endosymbionts probably indicates that stringent symbiont selection by the host prevents random genetic drift in the small population of horizontally transmitted symbionts. This study is the first complete genome analysis of a gastropod endosymbiont and offers an opportunity to study genome evolution in a recently evolved endosymbiont.
Stanko, Vera; Giuliani, Concetta; Retzer, Katarzyna; Djamei, Armin; Wahl, Vanessa; Wurzinger, Bernhard; Wilson, Cathal; Heberle-Bors, Erwin; Teige, Markus; Kragler, Friedrich
2014-01-01
Mitogen-activated protein kinase (MAPK) cascades are universal signal transduction modules present in all eukaryotes. In plants, MAPK cascades were shown to regulate cell division, developmental processes, stress responses, and hormone pathways. The subgroup A of Arabidopsis MAPKs consists of AtMPK3, AtMPK6, and AtMPK10. AtMPK3 and AtMPK6 are activated by their upstream MAP kinase kinases (MKKs) AtMKK4 and AtMKK5 in response to biotic and abiotic stress. In addition, they were identified as key regulators of stomatal development and patterning. AtMPK10 has long been considered a pseudogene, derived from a gene duplication of AtMPK6. Here we show that AtMPK10 is expressed highly but very transiently in seedlings and at sites of local auxin maxima in leaves. MPK10 encodes a functional kinase and interacts with the upstream MAP kinase kinase (MAPKK) AtMKK2. mpk10 mutants are delayed in flowering in long-day conditions and in continuous light. Moreover, cotyledons of mpk10 and mkk2 mutants have reduced vein complexity, which can be reversed by inhibiting polar auxin transport (PAT). Auxin does not affect AtMPK10 expression while treatment with the PAT inhibitor HFCA extends the expression in leaves and reverses the mpk10 mutant phenotype. These results suggest that the AtMKK2–AtMPK10 MAPK module regulates venation complexity by altering PAT efficiency. PMID:25064848
Rose, Laura E.; Langley, Charles H.; Bernal, Adriana J.; Michelmore, Richard W.
2005-01-01
Disease resistance to the bacterial pathogen Pseudomonas syringae pv. tomato (Pst) in the cultivated tomato, Lycopersicon esculentum, and the closely related L. pimpinellifolium is triggered by the physical interaction between plant disease resistance protein, Pto, and the pathogen avirulence protein, AvrPto. To investigate the extent to which variation in the Pto gene is responsible for naturally occurring variation in resistance to Pst, we determined the resistance phenotype of 51 accessions from seven species of Lycopersicon to isogenic strains of Pst differing in the presence of avrPto. One-third of the plants displayed resistance specifically when the pathogen expressed AvrPto, consistent with a gene-for-gene interaction. To test whether this resistance in these species was conferred specifically by the Pto gene, alleles of Pto were amplified and sequenced from 49 individuals and a subset (16) of these alleles was tested in planta using Agrobacterium-mediated transient assays. Eleven alleles conferred a hypersensitive resistance response (HR) in the presence of AvrPto, while 5 did not. Ten amino acid substitutions associated with the absence of AvrPto recognition and HR were identified, none of which had been identified in previous structure-function studies. Additionally, 3 alleles encoding putative pseudogenes of Pto were isolated from two species of Lycopersicon. Therefore, a large proportion, but not all, of the natural variation in the reaction to strains of Pst expressing AvrPto can be attributed to sequence variation in the Pto gene. PMID:15944360
Yu, Dandan; Wu, Yong; Xu, Ling; Fan, Yu; Peng, Li; Xu, Min; Yao, Yong-Gang
2016-07-01
In mammals, the toll-like receptors (TLRs) play a major role in initiating innate immune responses against pathogens. Comparison of the TLRs in different mammals may help in understanding the TLR-mediated responses and in developing animal models and efficient therapeutic measures for infectious diseases. The Chinese tree shrew (Tupaia belangeri chinensis), a small mammal with a close relationship to primates, is a viable experimental animal for studying viral and bacterial infections. In this study, we characterized the TLR genes (tTLRs) in the Chinese tree shrew and identified 13 putative TLRs, which are orthologs of mammalian TLR1-TLR9 and TLR11-TLR13; TLR10 was a pseudogene in the tree shrew. Positive selection analyses using the maximum likelihood (ML) method showed that tTLR8 and tTLR9 were under positive selection, which might be associated with adaptation to pathogen challenge. The mRNA expression levels of tTLRs presented an overall low and tissue-specific pattern, and were significantly upregulated upon Hepatitis C virus (HCV) infection. tTLR4 and tTLR9 underwent alternative splicing, which leads to different transcripts. Phylogenetic analysis and TLR structure prediction indicated that tTLRs were evolutionarily conserved, which might reflect an ancient mechanism and structure in the innate immune response system. Taken together, TLRs had both conserved and unique features in the Chinese tree shrew. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kim, Hee Jin; Prithiviraj, Kalyani; Groathouse, Nathan; Brennan, Patrick J; Spencer, John S
2013-02-01
The cell-mediated immunity (CMI)-based in vitro gamma interferon release assay (IGRA) of Mycobacterium leprae-specific antigens has potential as a promising diagnostic means to detect those individuals in the early stages of M. leprae infection. Diagnosis of leprosy is a major obstacle toward ultimate disease control and has been compromised in the past by the lack of specific markers. Comparative bioinformatic analysis among mycobacterial genomes identified potential M. leprae-specific proteins called "hypothetical unknowns." Due to massive gene decay and the prevalence of pseudogenes, it is unclear whether any of these proteins are expressed or are immunologically relevant. In this study, we performed cDNA-based quantitative real-time PCR to investigate the expression status of 131 putative open reading frames (ORFs) encoding hypothetical unknowns. Twenty-six of the M. leprae-specific antigen candidates showed significant levels of gene expression compared to that of ESAT-6 (ML0049), which is an important T cell antigen of low abundance in M. leprae. Fifteen of 26 selected antigen candidates were expressed and purified in Escherichia coli. The seroreactivity to these proteins of pooled sera from lepromatous leprosy patients and cavitary tuberculosis patients revealed that 9 of 15 recombinant hypothetical unknowns elicited M. leprae-specific immune responses. These nine proteins may be good diagnostic reagents to improve both the sensitivity and specificity of detection of individuals with asymptomatic leprosy.
[The ENCODE project and functional genomics studies].
Ding, Nan; Qu, Hongzhu; Fang, Xiangdong
2014-03-01
Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it has identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated databases for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatical technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.
Emerling, Christopher A.
2018-01-01
The end-Cretaceous extinction led to a massive faunal turnover, with placental mammals radiating in the wake of nonavian dinosaurs. Fossils indicate that Cretaceous stem placentals were generally insectivorous, whereas their earliest Cenozoic descendants occupied a variety of dietary niches. It is hypothesized that this dietary radiation resulted from the opening of niche space, following the extinction of dinosaurian carnivores and herbivores. We provide the first genomic evidence for the occurrence and timing of this dietary radiation in placental mammals. By comparing the genomes of 107 placental mammals, we robustly infer that chitinase genes (CHIAs), encoding enzymes capable of digesting insect exoskeletal chitin, were present as five functional copies in the ancestor of all placental mammals, and the number of functional CHIAs in the genomes of extant species positively correlates with the percentage of invertebrates in their diets. The diverse repertoire of CHIAs in early placental mammals corroborates fossil evidence of insectivory in Cretaceous eutherians, with descendant lineages repeatedly losing CHIAs beginning at the Cretaceous/Paleogene (K/Pg) boundary as they radiated into noninsectivorous niches. Furthermore, the timing of gene loss suggests that interordinal diversification of placental mammals in the Cretaceous predates the dietary radiation in the early Cenozoic, helping to reconcile a long-standing debate between molecular timetrees and the fossil record. Our results demonstrate that placental mammal genomes, including humans, retain a molecular record of the post-K/Pg placental adaptive radiation in the form of numerous chitinase pseudogenes. PMID:29774238
Consugar, Mark B.; Wong, Wai C.; Lundquist, Patrick A.; Rossetti, Sandro; Kubly, Vickie J.; Walker, Denise L.; Rangel, Laureano J.; Aspinwall, Richard; Niaudet, W. Patrick; Özen, Seza; David, Albert; Velinov, Milen; Bergstralh, Eric J.; Bae, Kyongtae T.; Chapman, Arlene B.; Guay-Woodford, Lisa M.; Grantham, Jared J.; Torres, Vicente E.; Sampson, Julian R.; Dawson, Brian D.; Harris, Peter C.
2009-01-01
Large DNA rearrangements account for about 8% of disease mutations and are more common in duplicated genomic regions, where they are difficult to detect. Autosomal dominant polycystic kidney disease (ADPKD) is caused by mutations in either PKD1 or PKD2. PKD1 is located in an intrachromosomally duplicated region. A tuberous sclerosis gene, TSC2, lies immediately adjacent to PKD1 and large deletions can result in the PKD1/TSC2 contiguous gene deletion syndrome. To rapidly identify large rearrangements, a multiplex ligation-dependent probe amplification assay was developed employing base-pair differences between PKD1 and the six pseudogenes to generate PKD1-specific probes. All changes in a set of 25 previously defined deletions in PKD1, PKD2 and PKD1/TSC2 were detected by this assay and we also found 14 new mutations at these loci. About 4% of the ADPKD patients in the CRISP study were found to have gross rearrangements, and these accounted for about a third of base-pair mutation negative families. Sensitivity of the assay showed that about 40% of PKD1/TSC contiguous gene deletion syndrome families contained mosaic cases. Characterization of a family found to be mosaic for a PKD1 deletion is discussed here to illustrate family risk and donor selection considerations. Our assay improves detection levels and the reliability of molecular testing of patients with ADPKD. PMID:18818683
QuickMap: a public tool for large-scale gene therapy vector insertion site mapping and analysis.
Appelt, J-U; Giordano, F A; Ecker, M; Roeder, I; Grund, N; Hotz-Wagenblatt, A; Opelz, G; Zeller, W J; Allgayer, H; Fruehauf, S; Laufs, S
2009-07-01
Several events of insertional mutagenesis in pre-clinical and clinical gene therapy studies have created intense interest in assessing the genomic insertion profiles of gene therapy vectors. For the construction of such profiles, vector-flanking sequences detected by inverse PCR, linear amplification-mediated-PCR or ligation-mediated-PCR need to be mapped to the host cell's genome and compared to a reference set. Although remarkable progress has been achieved in mapping gene therapy vector insertion sites, public reference sets are lacking, as are the possibilities to quickly detect non-random patterns in experimental data. We developed a tool termed QuickMap, which uniformly maps and analyzes human and murine vector-flanking sequences within seconds (available at www.gtsg.org). Besides information about hits in chromosomes and fragile sites, QuickMap automatically determines insertion frequencies in +/- 250 kb adjacency to genes, cancer genes, pseudogenes, transcription factor and (post-transcriptional) miRNA binding sites, CpG islands and repetitive elements (short interspersed nuclear elements (SINE), long interspersed nuclear elements (LINE), Type II elements and LTR elements). Additionally, all experimental frequencies are compared with the data obtained from a reference set, containing 1 000 000 random integrations ('random set'). Thus, for the first time a tool allowing high-throughput profiling of gene therapy vector insertion sites is available. It provides a basis for large-scale insertion site analyses, which is now urgently needed to discover novel gene therapy vectors with 'safe' insertion profiles.
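The window-based counting described in this abstract can be illustrated with a minimal sketch. It is not QuickMap's implementation: the helper name `fraction_near_features`, the single-chromosome simplification, and all coordinates below are assumptions invented for illustration. Only the +/- 250 kb window and the idea of comparing experimental sites against a random reference set come from the abstract.

```python
from bisect import bisect_left

WINDOW = 250_000  # +/- 250 kb adjacency window, as stated in the abstract


def fraction_near_features(sites, feature_positions, window=WINDOW):
    """Return the fraction of insertion sites within `window` bp of any feature.

    Hypothetical helper: all positions are integer coordinates on one chromosome.
    """
    feats = sorted(feature_positions)
    if not sites:
        return 0.0
    hits = 0
    for pos in sites:
        # First feature at or beyond the left edge of the window around `pos`.
        i = bisect_left(feats, pos - window)
        # It counts as a hit only if it also lies before the right edge.
        if i < len(feats) and feats[i] <= pos + window:
            hits += 1
    return hits / len(sites)


# Toy coordinates (invented for illustration, not real data).
genes = [1_000_000, 5_000_000]
experimental = [900_000, 1_200_000, 3_000_000, 5_100_000]
reference = [100_000, 2_000_000, 3_500_000, 4_900_000]

exp_frac = fraction_near_features(experimental, genes)
ref_frac = fraction_near_features(reference, genes)
print(f"experimental: {exp_frac:.2f}, random reference: {ref_frac:.2f}")
```

In a real analysis the reference would be a large random set (the abstract mentions 1 000 000 random integrations) and the comparison would be made per chromosome and per feature class; the sketch only shows the counting step.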
Will, Jessica L; Kim, Hyun Seok; Clarke, Jessica; Painter, John C; Fay, Justin C; Gasch, Audrey P
2010-04-01
A major goal in evolutionary biology is to understand how adaptive evolution has influenced natural variation, but identifying loci subject to positive selection has been a challenge. Here we present the adaptive loss of a pair of paralogous genes in specific Saccharomyces cerevisiae subpopulations. We mapped natural variation in freeze-thaw tolerance to two water transporters, AQY1 and AQY2, previously implicated in freeze-thaw survival. However, whereas freeze-thaw-tolerant strains harbor functional aquaporin genes, the set of sensitive strains lost aquaporin function at least 6 independent times. Several genomic signatures at AQY1 and/or AQY2 reveal low variation surrounding these loci within strains of the same haplotype, but high variation between strain groups. This is consistent with recent adaptive loss of aquaporins in subgroups of strains, leading to incipient balancing selection. We show that, although aquaporins are critical for surviving freeze-thaw stress, loss of both genes provides a major fitness advantage on high-sugar substrates common to many strains' natural niche. Strikingly, strains with non-functional alleles have also lost the ancestral requirement for aquaporins during spore formation. Thus, the antagonistic effect of aquaporin function (providing an advantage in freeze-thaw tolerance but a fitness defect for growth in high-sugar environments) contributes to the maintenance of both functional and nonfunctional alleles in S. cerevisiae. This work also shows that gene loss through multiple missense and nonsense mutations, hallmarks of pseudogenization presumed to emerge after loss of constraint, can arise through positive selection.
Colonizing the world in spite of reduced MHC variation
Gangoso, L.; Alcaide, M.; Grande, J.M.; Muñoz, J.; Talbot, Sandra L.; Sonsthagen, Sarah A.; Sage, Kevin; Figuerola, J.
2012-01-01
Reduced immune gene diversity is thought to negatively affect the capacity of organisms to adapt to pathogen challenges, which represent a major force in natural selection. Genes of the Major Histocompatibility Complex (MHC) are the most widely invoked adaptive loci in conservation biology, and have become the most popular genetic markers to investigate pathogen-host interactions in vertebrates. Although MHC genes are the most polymorphic genes described in the vertebrate genome, the extent to which MHC diversity determines the long-term persistence of populations is unclear and often debated, as recent studies have documented the occurrence of natural populations thriving even after a depletion of MHC diversity caused by genetic drift. Here, we show that some phylogenetically related species belonging to the Falco genus (Aves: Falconidae) present a dramatically low MHC variability that has not precluded, nevertheless, the successful colonization of almost all existing regions and habitats worldwide. We found evidence for two remarkably different patterns of MHC variation within the genus. While kestrels show a high MHC variation according to the general theory, falcons exhibit an ancestrally low intra- and inter-specific MHC allelic diversity. We provide compelling evidence that this pattern is not caused by the degeneration of functional genes into pseudogenes, the inadvertent analyses of paralogous MHC genes, or the devastating action of genetic drift. Instead, our results strongly support the idea of an evolutionary transition driven and maintained by natural selection from primarily highly variable towards low polymorphic, but functional and expressed, MHC genes with species-specific pathogen-recognition capabilities.
Santo, Evan E; Paik, Jihye
2018-06-17
The rapid development of CRISPR technology is revolutionizing molecular approaches to the dissection of complex biological phenomena. Here we describe an alternative generally applicable implementation of the CRISPR-Cas9 system that allows for selective knockdown of extremely homologous genes. This strategy employs the lentiviral delivery of paired sgRNAs and nickase Cas9 (Cas9D10A) to achieve targeted deletion of splice junctions. This general strategy offers several advantages over standard single-guide exon-targeting CRISPR-Cas9 such as greatly reduced off-target effects, more restricted genomic editing, routine disruption of target gene mRNA expression and the ability to differentiate between closely related genes. Here we demonstrate the utility of this strategy by achieving selective knockdown of the highly homologous human genes FOXO3A and suspected pseudogene FOXO3B. We find the spJCRISPR strategy to efficiently and selectively disrupt FOXO3A and FOXO3B mRNA and protein expression; thus revealing that the human FOXO3B locus encodes a bona fide human gene. Unlike FOXO3A, we find the FOXO3B protein to be cytosolically localized in both the presence and absence of active Akt. The ability to selectively target and efficiently disrupt the expression of the closely-related FOXO3A and FOXO3B genes demonstrates the efficacy of the spJCRISPR approach. Copyright © 2018. Published by Elsevier B.V.
Apostolopoulos, J; Sparrow, R L; McLeod, J L; Collier, F M; Darcy, P K; Slater, H R; Ngu, C; Gregorio-King, C C; Kirkland, M A
2001-10-01
Evidence is presented for a family of mammalian homologs of ependymin, which we have termed the mammalian ependymin-related proteins (MERPs). Ependymins are secreted glycoproteins that form the major component of the cerebrospinal fluid in many teleost fish. We have cloned the entire coding region of human MERP-1 and mapped the gene to chromosome 7p14.1 by fluorescence in situ hybridization. In addition, three human MERP pseudogenes were identified on chromosomes 8, 16, and X. We have also cloned the mouse MERP-1 homolog and an additional family member, mouse MERP-2. Then, using bioinformatics, the mouse MERP-2 gene was localized to chromosome 13, and we identified the monkey MERP-1 homolog and frog ependymin-related protein (ERP). Despite relatively low amino acid sequence conservation between piscine ependymins, toad ERP, and MERPs, several amino acids (including four key cysteine residues) are strictly conserved, and the hydropathy profiles are remarkably alike, suggesting the possibilities of similar protein conformation and function. As with fish ependymins, frog ERP and MERPs contain a signal peptide typical of secreted proteins. The MERPs were found to be expressed at high levels in several hematopoietic cell lines and in nonhematopoietic tissues such as brain, heart, and skeletal muscle, as well as several malignant tissues and malignant cell lines. These findings suggest that MERPs have several potential roles in a range of cells and tissues.
Dynamic evolution at pericentromeres.
Hall, Anne E; Kettler, Gregory C; Preuss, Daphne
2006-03-01
Pericentromeres are exceptional genomic regions: in animals they contain extensive segmental duplications implicated in gene creation, and in plants they sustain rearrangements and insertions uncommon in euchromatin. To examine the mechanisms and patterns of plant pericentromere evolution, we compared pericentromere sequence from four Brassicaceae species separated by <15 million years (Myr). This flowering plant family is ideal for studying relationships between genome reorganization and pericentromere evolution: its members have undergone recent polyploidization and hybridization, with close relatives changing in genome size and chromosome number. Through sequence and hybridization analyses, we examined regions from Arabidopsis arenosa, Capsella rubella, and Olimarabidopsis pumila that are homologous to Arabidopsis thaliana pericentromeres (peri-CENs) III and V, and used FISH to demonstrate they have been maintained near centromere satellite arrays in each species. Sequence analysis revealed a set of highly conserved genes, yet we discovered substantial differences in intergenic length and species-specific changes in sequence content and gene density. We discovered that A. thaliana has undergone recent, significant expansions within its pericentromeres, in some cases measuring hundreds of kilobases; these findings are in marked contrast to euchromatic segments in these species that exhibit only minor length changes. While plant pericentromeres do contain some duplications, we did not find evidence of extensive segmental duplications, as has been documented in primates. Our data support a model in which plant pericentromeres may experience selective pressures distinct from euchromatin, tolerating rapid, dynamic changes in structure and sequence content, including large insertions of mobile elements, 5S rDNA arrays and pseudogenes.
Nock, Tanya G; Chand, Dhan; Lovejoy, David A
2011-04-01
The gonadotropin-releasing hormone (GnRH) and corticotropin-releasing factor (CRF) families are two neuropeptide families that are strongly conserved throughout evolution. Recently, the genome of the holocephalan Callorhinchus milii (elephant shark) has been sequenced. The phylogenetic position of C. milii, along with the relatively slow evolution of the cartilaginous fish, suggests that neuropeptides in this species may resemble the earliest gnathostome forms. The genome of the elephant shark was screened, in silico, using the various conserved motifs of both the vertebrate CRF paralogs and the insect diuretic hormone sequences to identify the structure of the C. milii CRF/DH-like peptides. A similar approach was taken to identify the GnRH peptides using conserved motifs in both vertebrate and invertebrate forms. Two CRF peptides, a urotensin-1 peptide and a urocortin 3 peptide were found in the genome. There was only about 50% sequence identity between the two CRF peptides, suggesting an early divergence. In addition, the urocortin 2 peptide seems to have been lost and was identified as a pseudogene in C. milii. In contrast to the number of CRF family peptides, only a GnRH-II preprohormone with the conserved mature decapeptide was found. This confirms early studies about the identity of GnRH in the Holocephali, and suggests that the Holocephali and Elasmobranchii differ with respect to GnRH structure and function. Copyright © 2011 Elsevier Inc. All rights reserved.
Recurrent gene loss correlates with the evolution of stomach phenotypes in gnathostome history
Castro, L. Filipe C.; Gonçalves, Odete; Mazan, Sylvie; Tay, Boon-Hui; Venkatesh, Byrappa; Wilson, Jonathan M.
2014-01-01
The stomach, a hallmark of gnathostome evolution, represents a unique anatomical innovation characterized by the presence of acid- and pepsin-secreting glands. However, the occurrence of these glands in gnathostome species is not universal; in the nineteenth century the French zoologist Cuvier first noted that some teleosts lacked a stomach. Strikingly, Holocephali (chimaeras), dipnoids (lungfish) and monotremes (egg-laying mammals) also lack acid secretion and a gastric cellular phenotype. Here, we test the hypothesis that loss of the gastric phenotype is correlated with the loss of key gastric genes. We investigated species from all the main gnathostome lineages and show the specific contribution of gene loss to the widespread distribution of the agastric condition. We establish that the stomach loss correlates with the persistent and complete absence of the gastric function gene kit—H+/K+-ATPase (Atp4A and Atp4B) and pepsinogens (Pga, Pgc, Cym)—in the analysed species. We also find that in gastric species the pepsinogen gene complement varies significantly (e.g. two to four in teleosts and tens in some mammals) with multiple events of pseudogenization identified in various lineages. We propose that relaxation of purifying selection in pepsinogen genes and possibly proton pump genes in response to dietary changes led to the numerous independent events of stomach loss in gnathostome history. Significantly, the absence of the gastric genes predicts that reinvention of the stomach in agastric lineages would be highly improbable, in line with Dollo's principle. PMID:24307675
Recurrent gene loss correlates with the evolution of stomach phenotypes in gnathostome history.
Castro, L Filipe C; Gonçalves, Odete; Mazan, Sylvie; Tay, Boon-Hui; Venkatesh, Byrappa; Wilson, Jonathan M
2014-01-22
The stomach, a hallmark of gnathostome evolution, represents a unique anatomical innovation characterized by the presence of acid- and pepsin-secreting glands. However, the occurrence of these glands in gnathostome species is not universal; in the nineteenth century the French zoologist Cuvier first noted that some teleosts lacked a stomach. Strikingly, Holocephali (chimaeras), dipnoids (lungfish) and monotremes (egg-laying mammals) also lack acid secretion and a gastric cellular phenotype. Here, we test the hypothesis that loss of the gastric phenotype is correlated with the loss of key gastric genes. We investigated species from all the main gnathostome lineages and show the specific contribution of gene loss to the widespread distribution of the agastric condition. We establish that the stomach loss correlates with the persistent and complete absence of the gastric function gene kit--H(+)/K(+)-ATPase (Atp4A and Atp4B) and pepsinogens (Pga, Pgc, Cym)--in the analysed species. We also find that in gastric species the pepsinogen gene complement varies significantly (e.g. two to four in teleosts and tens in some mammals) with multiple events of pseudogenization identified in various lineages. We propose that relaxation of purifying selection in pepsinogen genes and possibly proton pump genes in response to dietary changes led to the numerous independent events of stomach loss in gnathostome history. Significantly, the absence of the gastric genes predicts that reinvention of the stomach in agastric lineages would be highly improbable, in line with Dollo's principle.
Lee, Kyeong-Ryeol; In Sohn, Soo; Jung, Jin Hee; Kim, Sun Hee; Roh, Kyung Hee; Kim, Jong-Bum; Suh, Mi Chung; Kim, Hyun Uk
2013-12-01
Fatty acid desaturase 2 (FAD2), which resides in the endoplasmic reticulum (ER), plays a crucial role in producing linoleic acid (18:2) through catalyzing the desaturation of oleic acid (18:1) by double bond formation at the delta 12 position. FAD2 catalyzes the first step needed for the production of polyunsaturated fatty acids found in the glycerolipids of cell membranes and the triacylglycerols in seeds. In this study, four FAD2 genes from the amphidiploid Brassica napus genome were isolated by PCR amplification, with their enzymatic functions predicted by sequence analysis of the cDNAs. Fatty acid analysis of budding yeast transformed with each of the FAD2 genes showed that whereas BnFAD2-1, BnFAD2-2, and BnFAD2-4 are functional enzymes, BnFAD2-3 is nonfunctional. The four FAD2 genes of B. napus originated from synthetic hybridization of its diploid progenitors Brassica rapa and Brassica oleracea, each of which has two FAD2 genes identical to those of B. napus. The BnFAD2-3 gene of B. napus, a nonfunctional pseudogene mutated by multiple nucleotide deletions and insertions, was inherited from B. rapa. All BnFAD2 isozymes except BnFAD2-3 localized to the ER. Nonfunctional BnFAD2-3 localized to the nucleus and chloroplasts. The four BnFAD2 genes can be classified on the basis of their expression patterns. © 2013.
Mu, Wenbo; Lu, Hsiao-Mei; Chen, Jefferey; Li, Shuwei; Elliott, Aaron M
2016-11-01
Next-generation sequencing (NGS) has rapidly replaced Sanger sequencing as the method of choice for diagnostic gene-panel testing. For hereditary-cancer testing, the technical sensitivity and specificity of the assay are paramount as clinicians use results to make important clinical management and treatment decisions. There is significant debate within the diagnostics community regarding the necessity of confirming NGS variant calls by Sanger sequencing, considering that numerous laboratories report having 100% specificity from the NGS data alone. Here we report our results from 20,000 hereditary-cancer NGS panels spanning 47 genes, in which all 7845 nonpolymorphic variants were Sanger-sequenced. Of these, 98.7% were concordant between NGS and Sanger sequencing and 1.3% were identified as NGS false-positives, located mainly in complex genomic regions (A/T-rich regions, G/C-rich regions, homopolymer stretches, and pseudogene regions). Simulating a false-positive rate of zero by adjusting the variant-calling quality-score thresholds decreased the sensitivity of the assay from 100% to 97.8%, resulting in the missed detection of 176 Sanger-confirmed variants, the majority in complex genomic regions (n = 114) and mosaic mutations (n = 7). The data illustrate the importance of setting quality thresholds for panel testing only after thousands of samples have been processed and the necessity of Sanger confirmation of NGS variants to maintain the highest possible sensitivity. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
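The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers reported in the abstract; the intermediate counts are back-calculated from rounded percentages, so they are approximations rather than values taken from the paper, and the variable names are illustrative.

```python
# Figures reported in the abstract.
total_variants = 7845        # nonpolymorphic variants, all Sanger-sequenced
false_positive_rate = 0.013  # 1.3% identified as NGS false positives
missed_at_zero_fp = 176      # Sanger-confirmed variants lost at the stricter threshold

# Back-calculated counts (approximate, because the percentages are rounded).
false_positives = round(total_variants * false_positive_rate)
confirmed = total_variants - false_positives

# Sensitivity once thresholds are raised until no false positives remain.
sensitivity_strict = 1 - missed_at_zero_fp / confirmed

print(f"estimated false positives: {false_positives}")
print(f"estimated Sanger-confirmed variants: {confirmed}")
print(f"estimated sensitivity at zero false positives: {sensitivity_strict:.1%}")
```

The estimate lands within a fraction of a percentage point of the reported 97.8%; the small gap is expected given that the false-positive count is reconstructed from a rounded rate.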
Uric acid in plants and microorganisms: Biological applications and genetics - A review.
Hafez, Rehab M; Abdel-Rahman, Tahany M; Naguib, Rasha M
2017-09-01
Increased accumulation and/or reduced excretion of uric acid in the human body is closely related to the pathogenesis of gout and hyperuricemia. It is strongly affected by a high intake of purine-rich food. Uric acid is present in both higher plants and microorganisms at species-dependent concentrations. Urate-degrading enzymes are found in both plants and microorganisms, but the mechanisms by which they degrade uric acid differ. Higher plants produce various metabolites that can inhibit xanthine oxidase and xanthine oxidoreductase, thereby preventing the oxidation of hypoxanthine to xanthine and then to uric acid in purine metabolism. Microorganisms, however, produce a group of degrading enzymes (uricase, allantoinase, allantoicase and urease) that catalyze the degradation of uric acid to ammonia. In humans, researchers found that several mutations in ancestral apes caused pseudogenization (silencing) of the gene for uricase, an enzyme that exists as an insoluble crystalloid in peroxisomes. This is in contrast to microorganisms, in which uricases are soluble and exist in either the cytoplasm or peroxisomes. Moreover, many recombinant uricases with higher activity than the wild-type uricases have been successfully induced in many microorganisms. The present review deals with the occurrence of uric acid in plants and other organisms, especially microorganisms, in addition to the mechanisms by which plant extracts, metabolites and enzymes could reduce uric acid in blood. The genetics and genes involved in uric acid metabolism in plants and microorganisms are also presented.
Kerkhof, Jennifer; Schenkel, Laila C; Reilly, Jack; McRobbie, Sheri; Aref-Eshghi, Erfan; Stuart, Alan; Rupar, C Anthony; Adams, Paul; Hegele, Robert A; Lin, Hanxin; Rodenhiser, David; Knoll, Joan; Ainsworth, Peter J; Sadikovic, Bekim
2017-11-01
Next-generation sequencing (NGS) technology has rapidly replaced Sanger sequencing in the assessment of sequence variations in clinical genetics laboratories. One major limitation of current NGS approaches is their limited ability to detect copy number variations (CNVs) larger than approximately 50 bp. Because these represent a major mutational burden in many genetic disorders, parallel CNV assessment using alternate supplemental methods, along with the NGS analysis, is normally required, resulting in increased labor, costs, and turnaround times. The objective of this study was to clinically validate a novel CNV detection algorithm using targeted clinical NGS gene panel data. We have applied this approach in a retrospective cohort of 391 samples and a prospective cohort of 2375 samples and found a 100% sensitivity (95% CI, 89%-100%) for 37 unique events and a high degree of specificity to detect CNVs across nine distinct targeted NGS gene panels. This NGS CNV pipeline enables stand-alone first-tier assessment for CNV and sequence variants in a clinical laboratory setting, dispensing with the need for parallel CNV analysis using classic techniques, such as microarray, long-range PCR, or multiplex ligation-dependent probe amplification. This NGS CNV pipeline can also be applied to the assessment of complex genomic regions, including pseudogenic DNA sequences, such as the PMS2CL gene, and to mitochondrial genome heteroplasmy detection. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
TLR10 is a Negative Regulator of Both MyD88-Dependent and Independent TLR Signaling
Jiang, Song; Li, Xinyan; Hess, Nicholas J.; Guan, Yue; Tapping, Richard I.
2016-01-01
Toll-like receptors are central components of the innate immune system which, upon recognition of bacterial, fungal or viral components, activate intracellular signals that lead to protective inflammatory responses. Among the ten-member human TLR family, TLR10 is the only remaining orphan receptor without a known ligand or signaling function. Murine TLR10 is a disrupted pseudogene, which precludes investigation using classic gene knock-out approaches. We report here that TLR10 suppressed the production of an array of cytokines in stably transfected human myelomonocytic U937 cells in response to other TLR agonists. This broad TLR suppressive activity affects both MyD88 and TRIF-mediated signaling pathways upstream of IκB and MAPK activation. Compared to non-transgenic littermate controls, monocytes of TLR10 transgenic mice exhibited blunted IL-6 production following ex vivo blood stimulation with other TLR agonists. After intraperitoneal injection of LPS, lower levels of TNFα, IL-6 and Type 1 IFN were measured in the serum of TLR10 transgenic mice compared to non-transgenic mice, but TLR10 expression did not affect mouse survival in an LPS-induced septic shock model. Finally, treatment of human mononuclear cells with a monoclonal anti-TLR10 antibody suppressed pro-inflammatory cytokines released by LPS stimulation. These results demonstrate that TLR10 functions as a broad negative regulator of TLR signaling and suggest that TLR10 may have a role in controlling immune responses in vivo. PMID:27022193
Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.
Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean
2012-12-01
Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine whether the coding regions of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes, of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.
Genes encoding the vacuolar Na+/H+ exchanger and flower coloration.
Yamaguchi, T; Fukada-Tanaka, S; Inagaki, Y; Saito, N; Yonekura-Sakakibara, K; Tanaka, Y; Kusumi, T; Iida, S
2001-05-01
Vacuolar pH plays an important role in flower coloration: an increase in the vacuolar pH causes blueing of flower color. In the Japanese morning glory (Ipomoea nil or Pharbitis nil), a shift from reddish-purple buds to blue open flowers correlates with an increase in the vacuolar pH. We describe details of the characterization of a mutant that carries a recessive mutation in the Purple (Pr) gene encoding a vacuolar Na+/H+ exchanger termed InNHX1. The genome of I. nil carries one copy of the Pr (or InNHX1) gene and its pseudogene, and InNHX1 showed functional complementation to the yeast nhx1 mutation. The mutant of I. nil, called purple (pr), showed a partial increase in the vacuolar pH during flower-opening, and its reddish-purple buds change into purple open flowers. The vacuolar pH in the purple open flowers of the mutant was significantly lower than that in the blue open flowers. The InNHX1 gene is most abundantly expressed in the petals at around 12 h before flower-opening, accompanying the increase in the vacuolar pH for the blue flower coloration. No such massive expression was observed in the petunia flowers. Since the NHX1 genes that promote the transport of Na+ into the vacuoles have been regarded as being involved in salt tolerance by accumulating Na+ in the vacuoles, we can add a new biological role for them: blue flower coloration in the Japanese morning glory through vacuolar alkalization.
Makeyev, A V; Chkheidze, A N; Liebhaber, S A
1999-08-27
Gene families normally expand by segmental genomic duplication and subsequent sequence divergence. Although copies of partially or fully processed mRNA transcripts are occasionally retrotransposed into the genome, they are usually nonfunctional ("processed pseudogenes"). The two major cytoplasmic poly(C)-binding proteins in mammalian cells, alphaCP-1 and alphaCP-2, are implicated in a spectrum of post-transcriptional controls. These proteins are highly similar in structure and are encoded by closely related mRNAs. Based on this close relationship, we were surprised to find that one of these proteins, alphaCP-2, was encoded by a multiexon gene, whereas the second gene, alphaCP-1, was identical to and colinear with its mRNA. The alphaCP-1 and alphaCP-2 genes were shown to be single copy and were mapped to separate chromosomes. The linkage groups encompassing each of the two loci were concordant between mice and humans. These data suggested that the alphaCP-1 gene was generated by retrotransposition of a fully processed alphaCP-2 mRNA and that this event occurred well before the mammalian radiation. The stringent structural conservation of alphaCP-1 and its ubiquitous tissue distribution suggested that the retrotransposed alphaCP-1 gene was rapidly recruited to a function critical to the cell and distinct from that of its alphaCP-2 progenitor.
Velickovic, M; Velickovic, Z; Panigoro, R; Dunckley, H
2009-01-01
Killer cell immunoglobulin-like receptors (KIRs) regulate the activity of natural killer and T cells through interactions with specific human leucocyte antigen class I molecules on target cells. Population studies performed over the last several years have established that KIR gene frequencies (GFs) and genotype content vary considerably among different ethnic groups, indicating the extent of KIR diversity; some of these studies have also shown the effect of the presence or absence of specific KIR genes in human disease. We have determined the frequencies of 16 KIR genes and pseudogenes, and of KIR genotypes, in 193 Indonesian individuals from Java, East Timor, Irian Jaya (western half of the island of New Guinea) and Kalimantan provinces of Indonesian Borneo. All 16 KIR genes were observed in all four populations. Variation in GFs between populations was observed, except for KIR2DL4, KIR3DL2, KIR3DL3, KIR2DP1 and KIR3DP1 genes, which were present in every individual tested. When comparing KIR GFs between populations, both principal component analysis and a phylogenetic tree showed close clustering of the Kalimantan and Javanese populations, while the Irianese population was clearly separated from the other three populations. Our results indicate a high level of KIR polymorphism in Indonesian populations that probably reflects the large geographical spread of the Indonesian archipelago and the complex evolutionary history and population migration in this region.
Ecological adaptation determines functional mammalian olfactory subgenomes
Hayden, Sara; Bekaert, Michaël; Crider, Tess A.; Mariani, Stefano; Murphy, William J.; Teeling, Emma C.
2010-01-01
The ability to smell is governed by the largest gene family in mammalian genomes, the olfactory receptor (OR) genes. Although these genes are well annotated in the finished human and mouse genomes, we still do not understand which receptors bind specific odorants or how they fully function. Previous comparative studies have been taxonomically limited and mostly focused on the percentage of OR pseudogenes within species. No study has investigated the adaptive changes of functional OR gene families across phylogenetically and ecologically diverse mammals. To determine the extent to which OR gene repertoires have been influenced by habitat, sensory specialization, and other ecological traits, to better understand the functional importance of specific OR gene families and thus the odorants they bind, we compared the functional OR gene repertoires from 50 mammalian genomes. We amplified more than 2000 OR genes in aquatic, semi-aquatic, and flying mammals and coupled these data with 48,000 OR genes from mostly terrestrial mammals, extracted from genomic projects. Phylogenomic, Bayesian assignment, and principal component analyses partitioned species by ecotype (aquatic, semi-aquatic, terrestrial, flying) rather than phylogenetic relatedness, and identified OR families important for each habitat. Functional OR gene repertoires were reduced independently in the multiple origins of aquatic mammals and were significantly divergent in bats. We reject recent neutralist views of olfactory subgenome evolution and correlate specific OR gene families with physiological requirements, a preliminary step toward unraveling the relationship between specific odors and respective OR gene families. PMID:19952139
Reiman, Mario; Laan, Maris; Rull, Kristiina; Sõber, Siim
2017-08-01
RNA degradation is a ubiquitous process that occurs in living and dead cells, as well as during handling and storage of extracted RNA. Reduced RNA quality caused by degradation is an established source of uncertainty for all RNA-based gene expression quantification techniques. RNA sequencing is an increasingly preferred method for transcriptome analyses, and dependence of its results on input RNA integrity is of significant practical importance. This study aimed to characterize the effects of varying input RNA integrity [estimated as RNA integrity number (RIN)] on transcript level estimates and delineate the characteristic differences between transcripts that differ in degradation rate. The study used ribodepleted total RNA sequencing data from a real-life clinically collected set (n = 32) of human solid tissue (placenta) samples. RIN-dependent alterations in gene expression profiles were quantified by using DESeq2 software. Our results indicate that small differences in RNA integrity affect gene expression quantification by introducing a moderate and pervasive bias in expression level estimates that significantly affected 8.1% of studied genes. The rapidly degrading transcript pool was enriched in pseudogenes, short noncoding RNAs, and transcripts with extended 3' untranslated regions. Typical slowly degrading transcripts (median length, 2389 nt) represented protein coding genes with 4-10 exons and high guanine-cytosine content.-Reiman, M., Laan, M., Rull, K., Sõber, S. Effects of RNA integrity on transcript quantification by total RNA sequencing of clinically collected human placental samples. © FASEB.
Chen, Xiao-Ren; Huang, Shen-Xin; Zhang, Ye; Sheng, Gui-Lin; Li, Yan-Peng; Zhu, Feng
2018-03-23
Phytophthora capsici is a hemibiotrophic, phytopathogenic oomycete that infects a wide range of crops, resulting in significant economic losses worldwide. By means of a diverse arsenal of secreted effector proteins, hemibiotrophic pathogens may manipulate plant cell death to establish a successful infection and colonization. In this study, we describe the analysis of the gene family encoding necrosis- and ethylene-inducing peptide 1 (Nep1)-like proteins (NLPs) in P. capsici, and identify 39 real NLP genes and 26 NLP pseudogenes. Out of the 65 predicted NLP genes, 48 occur in groups of two or more genes, whereas the remainder appear to be singletons distributed randomly across the genome. Phylogenetic analysis of the 39 real NLPs delineated three groups. Key residues and motifs important for effector activity are degenerate in most NLPs, including the nlp24 peptide consisting of the conserved region I (11-aa immunogenic part) and conserved region II (the heptapeptide GHRHDWE motif) that is important for phytotoxic activity. Transcriptional profiling of eight selected NLP genes indicated that they were differentially expressed during the developmental and plant infection phases of P. capsici. Functional analysis of ten cloned NLPs demonstrated that Pc11951, Pc107869, Pc109174 and Pc118548 were capable of inducing cell death in the Solanaceae, including Nicotiana benthamiana and hot pepper. This study provides an overview of the P. capsici NLP gene family, laying a foundation for further elucidating the pathogenicity mechanism of this devastating pathogen.
Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.
Subhadra, Bindu; Kim, Dong Ho; Kim, Jaeseok; Woo, Kyungho; Sohn, Kyung Mok; Kim, Hwa-Jung; Han, Kyudong; Oh, Man Hwan; Choi, Chul Hee
2018-06-01
Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions comprising 5 intact, 2 incomplete and 1 questionable ones and 63 genomic islands, suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 provides major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the total GIs were unique to UPEC 26-1 compared to CFT073 and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.
Gingerich, Derek J.; Hanada, Kousuke; Shiu, Shin-Han; Vierstra, Richard D.
2007-01-01
Selective ubiquitination of proteins is directed by diverse families of ubiquitin-protein ligases (or E3s) in plants. One important type uses Cullin-3 as a scaffold to assemble multisubunit E3 complexes containing one of a multitude of bric-a-brac/tramtrack/broad complex (BTB) proteins that function as substrate recognition factors. We previously described the 80-member BTB gene superfamily in Arabidopsis thaliana. Here, we describe the complete BTB superfamily in rice (Oryza sativa spp japonica cv Nipponbare) that contains 149 BTB domain–encoding genes and 43 putative pseudogenes. Amino acid sequence comparisons of the rice and Arabidopsis superfamilies revealed a near equal repertoire of putative substrate recognition module types. However, phylogenetic comparisons detected numerous gene duplication and/or loss events since the rice and Arabidopsis BTB lineages split, suggesting possible functional specialization within individual BTB families. In particular, a major expansion and diversification of a subset of BTB proteins containing Meprin and TRAF homology (MATH) substrate recognition sites was evident in rice and other monocots that likely occurred following the monocot/dicot split. The MATH domain of a subset appears to have evolved significantly faster than those in a smaller core subset that predates flowering plants, suggesting that the substrate recognition module in many monocot MATH-BTB E3s are diversifying to ubiquitinate a set of substrates that are themselves rapidly changing. Intriguing possibilities include pathogen proteins attempting to avoid inactivation by the monocot host. PMID:17720868
Wernstedt, Annekatrin; Valtorta, Emanuele; Armelao, Franco; Togni, Roberto; Girlando, Salvatore; Baudis, Michael; Heinimann, Karl; Messiaen, Ludwine; Staehli, Noemie; Zschocke, Johannes; Marra, Giancarlo; Wimmer, Katharina
2012-09-01
Heterozygous PMS2 germline mutations are associated with Lynch syndrome. Up to one third of these mutations are genomic deletions. Their detection is complicated by a pseudogene (PMS2CL), which, owing to extensive interparalog sequence exchange, closely resembles PMS2 downstream of exon 12. A recently redesigned multiplex ligation-dependent probe amplification (MLPA) assay identifies PMS2 copy number alterations with improved reliability when used with reference DNAs containing equal numbers of PMS2- and PMS2CL-specific sequences. We selected eight such reference samples, all publicly available, and used them with this assay to study 13 patients with PMS2-defective colorectal tumors. Three presented deleterious alterations: an Alu-mediated exon deletion; a 125-kb deletion encompassing PMS2 and four additional genes (two with tumor-suppressing functions); and a novel deleterious hybrid PMS2 allele produced by recombination with crossover between PMS2 and PMS2CL, with the breakpoint in intron 10 (the most 5' breakpoint of its kind reported thus far). We discuss mechanisms that might generate this allele in different chromosomal configurations (and their diagnostic implications) and describe an allele-specific PCR assay that facilitates its detection. Our data indicate that the redesigned PMS2 MLPA assay is a valid first-line option. In our series, it identified roughly a quarter of all PMS2 mutations. Copyright © 2012 Wiley Periodicals, Inc.
Wernstedt, Annekatrin; Valtorta, Emanuele; Armelao, Franco; Togni, Roberto; Girlando, Salvatore; Baudis, Michael; Heinimann, Karl; Messiaen, Ludwine; Staehli, Noemie; Zschocke, Johannes; Marra, Giancarlo; Wimmer, Katharina
2012-01-01
Heterozygous PMS2 germline mutations are associated with Lynch syndrome. Up to one third of these mutations are genomic deletions. Their detection is complicated by a pseudogene (PMS2CL), which – owing to extensive interparalog sequence exchange – closely resembles PMS2 downstream of exon 12. A recently redesigned multiplex ligation-dependent probe amplification (MLPA) assay identifies PMS2 copy number alterations with improved reliability when used with reference DNAs containing equal numbers of PMS2- and PMS2CL-specific sequences. We selected eight such reference samples – all publicly available – and used them with this assay to study 13 patients with PMS2-defective colorectal tumors. Three presented deleterious alterations: an Alu-mediated exon deletion; a 125-kb deletion encompassing PMS2 and four additional genes (two with tumor-suppressing functions); and a novel deleterious hybrid PMS2 allele produced by recombination with crossover between PMS2 and PMS2CL, with the breakpoint in intron 10 (the most 5′ breakpoint of its kind reported thus far). We discuss mechanisms that might generate this allele in different chromosomal configurations (and their diagnostic implications) and describe an allele-specific PCR assay that facilitates its detection. Our data indicate that the redesigned PMS2 MLPA assay is a valid first-line option. In our series, it identified roughly a quarter of all PMS2 mutations. © 2012 Wiley Periodicals, Inc. PMID:22585707
van der Klift, Heleen M; Mensenkamp, Arjen R; Drost, Mark; Bik, Elsa C; Vos, Yvonne J; Gille, Hans J J P; Redeker, Bert E J W; Tiersma, Yvonne; Zonneveld, José B M; García, Encarna Gómez; Letteboer, Tom G W; Olderode-Berends, Maran J W; van Hest, Liselotte P; van Os, Theo A; Verhoef, Senno; Wagner, Anja; van Asperen, Christi J; Ten Broeke, Sanne W; Hes, Frederik J; de Wind, Niels; Nielsen, Maartje; Devilee, Peter; Ligtenberg, Marjolijn J L; Wijnen, Juul T; Tops, Carli M J
2016-11-01
Monoallelic PMS2 germline mutations cause 5%-15% of Lynch syndrome, a midlife cancer predisposition, whereas biallelic PMS2 mutations cause approximately 60% of constitutional mismatch repair deficiency (CMMRD), a rare childhood cancer syndrome. Recently improved DNA- and RNA-based strategies are applied to overcome problematic PMS2 mutation analysis due to the presence of pseudogenes and frequent gene conversion events. Here, we determined PMS2 mutation detection yield and mutation spectrum in a nationwide cohort of 396 probands. Furthermore, we studied concordance between tumor IHC/MSI (immunohistochemistry/microsatellite instability) profile and mutation carrier state. Overall, we found 52 different pathogenic PMS2 variants explaining 121 Lynch syndrome and nine CMMRD patients. In vitro mismatch repair assays suggested pathogenicity for three missense variants. Ninety-one PMS2 mutation carriers (70%) showed isolated loss of PMS2 in their tumors, for 31 (24%) no or inconclusive IHC was available, and eight carriers (6%) showed discordant IHC (presence of PMS2 or loss of both MLH1 and PMS2). Ten cases with isolated PMS2 loss (10%; 10/97) harbored MLH1 mutations. We confirmed that recently improved mutation analysis provides a high yield of PMS2 mutations in patients with isolated loss of PMS2 expression. Application of universal tumor prescreening methods will however miss some PMS2 germline mutation carriers. © 2016 WILEY PERIODICALS, INC.
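The carrier counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable and function names are hypothetical, and every number is taken directly from the abstract.

```python
# Cross-check of the carrier counts reported in the abstract above.
# All names are illustrative; only the numbers come from the abstract.

lynch_patients = 121  # Lynch syndrome patients explained by PMS2 variants
cmmrd_patients = 9    # CMMRD patients explained by PMS2 variants
total_carriers = lynch_patients + cmmrd_patients

# Tumor immunohistochemistry (IHC) outcome per PMS2 mutation carrier.
ihc_groups = {
    "isolated PMS2 loss": 91,
    "no or inconclusive IHC": 31,
    "discordant IHC": 8,
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


ihc_percentages = {
    name: percent(count, total_carriers) for name, count in ihc_groups.items()
}

# Ten of the 97 cases with isolated PMS2 loss harbored MLH1 mutations.
mlh1_share = percent(10, 97)

if __name__ == "__main__":
    print(total_carriers, ihc_percentages, mlh1_share)
```

The three IHC groups sum to the same total as the Lynch syndrome and CMMRD patients combined, so the quoted percentages are all fractions of the same set of carriers.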
Campo, Vanina A.; Patenaude, Anne-Marie; Kaden, Svenja; Horb, Lori; Firka, Daniel; Jiricny, Josef; Di Noia, Javier M.
2013-01-01
The mammalian antibody repertoire is shaped by somatic hypermutation (SHM) and class switch recombination (CSR) of the immunoglobulin (Ig) loci of B lymphocytes. SHM and CSR are triggered by non-canonical, error-prone processing of G/U mismatches generated by activation-induced deaminase (AID). In birds, AID does not trigger SHM, but it triggers Ig gene conversion (GC), a ‘homeologous’ recombination process involving the Ig variable region and proximal pseudogenes. Because recombination fidelity is controlled by the mismatch repair (MMR) system, we investigated whether MMR affects GC in the chicken B cell line DT40. We show here that Msh6−/− and Pms2−/− DT40 cells display cell cycle defects, including genomic re-replication. However, although IgVλ GC tracts in MMR-deficient cells were slightly longer than in normal cells, Ig GC frequency, donor choice or the number of mutations per sequence remained unaltered. The finding that the avian MMR system, unlike that of mammals, does not seem to contribute towards the processing of G/U mismatches in vitro could explain why MMR is unable to initiate Ig GC in this species, despite initiating SHM and CSR in mammalian cells. Moreover, as MMR does not counteract or govern Ig GC, we report a rare example of ‘homeologous’ recombination insensitive to MMR. PMID:23314153
High incidence of large deletions in the PMS2 gene in Spanish Lynch syndrome families.
Brea-Fernández, A J; Cameselle-Teijeiro, J M; Alenda, C; Fernández-Rozadilla, C; Cubiella, J; Clofent, J; Reñé, J M; Anido, U; Milá, M; Balaguer, F; Castells, A; Castellvi-Bel, S; Jover, R; Carracedo, A; Ruiz-Ponte, C
2014-06-01
Lynch syndrome (LS) is caused by germline mutations in one of the four mismatch repair (MMR) genes. Defects in this pathway lead to microsatellite instability (MSI) in tumor DNA, which constitutes the molecular hallmark of this disease. Selection of patients for genetic testing in LS is usually based on fulfillment of diagnostic clinical criteria (i.e. Amsterdam criteria or the revised Bethesda guidelines). However, under these criteria PMS2 mutations have probably been underestimated, as their penetrance appears to be lower than that of the other MMR genes. The use of universal MMR study-based strategies, using MSI testing and immunohistochemical (IHC) staining, is one proposed alternative. In addition, germline mutation detection in PMS2 is complicated by the presence of highly homologous pseudogenes. Nevertheless, specific amplification of PMS2 by long-range polymerase chain reaction (PCR) and improved analysis of large deletions/duplications by multiplex ligation-dependent probe amplification (MLPA) overcome this difficulty. Using both approaches, we analyzed 19 suspected PMS2 carriers who had been selected by clinical or universal strategies and found five large deletions and one frameshift mutation in PMS2 in six patients (31%). Owing to the high incidence of large deletions found in our cohort, we recommend MLPA analysis as the first-line method for searching for germline mutations in PMS2. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe
2016-02-15
Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species for which genetic and molecular data are scarce. Here we report the complete chloroplast (cp) genome sequence of G. straminea, the first sequenced member of the family Gentianaceae. The cp genome is 148,991 bp in length, including a large single copy (LSC) region of 81,240 bp, a small single copy (SSC) region of 17,085 bp and a pair of inverted repeats (IRs) of 25,333 bp each. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon 2 between trnK-UUU and trnQ-UUG, making it the first rps16 pseudogene found in nonparasitic plants of the Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
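The region lengths and gene counts in the abstract above can be checked against the usual quadripartite layout of a chloroplast genome, in which the total length is LSC + SSC + two inverted-repeat copies. The following is a minimal sketch; the names are hypothetical, and it assumes the quoted inverted-repeat length refers to a single copy.

```python
# Consistency check for the G. straminea chloroplast genome figures.
# Names are illustrative; the numbers are those quoted in the abstract.

LSC_BP = 81_240             # large single copy region
SSC_BP = 17_085             # small single copy region
IR_BP = 25_333              # one inverted repeat (assumed to be per copy)
REPORTED_TOTAL_BP = 148_991


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length for a layout with two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Unique genes by category, as listed in the abstract.
gene_counts = {"protein-coding": 78, "tRNA": 30, "rRNA": 4}
REPORTED_UNIQUE_GENES = 112

if __name__ == "__main__":
    print(quadripartite_length(LSC_BP, SSC_BP, IR_BP) == REPORTED_TOTAL_BP)
    print(sum(gene_counts.values()) == REPORTED_UNIQUE_GENES)
```

Both checks agree with the reported totals, which supports reading the inverted-repeat length as the length of each copy.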
Macrophage-expressed perforins mpeg1 and mpeg1.2 have an anti-bacterial function in zebrafish.
Benard, Erica L; Racz, Peter I; Rougeot, Julien; Nezhinsky, Alexander E; Verbeek, Fons J; Spaink, Herman P; Meijer, Annemarie H
2015-01-01
Macrophage-expressed gene 1 (MPEG1) encodes an evolutionarily conserved protein with a predicted membrane attack complex/perforin domain associated with host defence against invading pathogens. In vertebrates, MPEG1/perforin-2 is an integral membrane protein of macrophages, suspected to be involved in the killing of intracellular bacteria by pore-forming activity. Zebrafish have 3 copies of MPEG1; 2 are expressed in macrophages, whereas the third could be a pseudogene. The mpeg1 and mpeg1.2 genes show differential regulation during infection of zebrafish embryos with the bacterial pathogens Mycobacterium marinum and Salmonella typhimurium. While mpeg1 is downregulated during infection with both pathogens, mpeg1.2 is infection inducible. Upregulation of mpeg1.2 is partially dependent on the presence of functional Mpeg1 and requires the Toll-like receptor adaptor molecule MyD88 and the transcription factor NFκB. Knockdown of mpeg1 alters the immune response to M. marinum infection and results in an increased bacterial burden. In Salmonella typhimurium infection, both mpeg1 and mpeg1.2 knockdown increase the bacterial burdens, but mpeg1 morphants show increased survival times. The combined results of these two in vivo infection models support the anti-bacterial function of the MPEG1/perforin-2 family and indicate that the intricate cross-regulation of the two mpeg1 copies aids the zebrafish host in combatting infection of various pathogens. © 2014 S. Karger AG, Basel.
Mollusk genes encoding lysine tRNA (UUU) contain introns.
Matsuo, M; Abe, Y; Saruta, Y; Okada, N
1995-11-20
New intron-containing genes encoding tRNAs were discovered when genomic DNA isolated from various animal species was amplified by the polymerase chain reaction (PCR) with primers based on sequences of rabbit tRNA(Lys). From sequencing analysis of the products of PCR, we found that introns are present in several genes encoding tRNA(Lys) in mollusks, such as Loligo bleekeri (squid) and Octopus vulgaris (octopus). These introns were specific to genes encoding tRNA(Lys)(UUU) and were not present in genes encoding tRNA(Lys)(CUU). In addition, the sequences of the introns were different from one another. To confirm the results of our initial experiments, we isolated and sequenced genes encoding tRNA(Lys)(CUU) and tRNA(Lys)(UUU). The gene for tRNA(Lys)(UUU) from squid contained an intron, whose sequence was the same as that identified by PCR, and the gene formed a cluster with a corresponding pseudogene. Several DNA regions of 2.1 kb containing this cluster appeared to be tandemly arrayed in the squid genome. By contrast, the gene encoding tRNA(Lys)(CUU) did not contain an intron, as shown also by PCR. The tRNA(Lys)(UUU) that corresponded to the analyzed gene was isolated and characterized. The present study provides the first example of an intron-containing gene encoding a tRNA in mollusks and suggests the universality of introns in such genes in higher eukaryotes.
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution
Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.
2015-01-01
The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691
Isolated p.H62L Mutation in the CYP21A2 Gene in a Simple Virilizing 21-Hydroxylase Deficient Patient
Fernández, Cecilia; Belli, Susana; Buzzalino, Noemi; Dain, Liliana
2013-01-01
Congenital adrenal hyperplasia due to 21-hydroxylase deficiency accounts for 90%–95% of cases. This autosomal recessive disorder has a broad spectrum of clinical forms, ranging from severe or classical, which includes the salt-wasting and simple virilizing forms, to the mild late onset or nonclassical form. Most of the disease-causing mutations described are likely to be the consequence of nonhomologous recombination or gene conversion events between the active CYP21A2 gene and its homologous CYP21A1P pseudogene. Nevertheless, an increasing number of naturally occurring mutations have been found. The change p.H62L is one of the most frequent rare mutations of the CYP21A2 gene. It was suggested that the p.H62L represents a mild mutation that may be responsible for a more severe enzymatic impairment when presented with another mild mutation on the same allele. In this report, a 20-year-old woman carrying an isolated p.H62L mutation in compound heterozygosity with c.283-13A/C>G mutation is described. Although a mildly nonclassical phenotype was expected, clinical signs and hormonal profile of the patient are consistent with a more severe simple virilizing form of 21-hydroxylase deficiency. The study of genotype-phenotype correlation in additional patients would help in defining the role of p.H62L in disease manifestation. PMID:23936690
Taboas, Melisa; Fernández, Cecilia; Belli, Susana; Buzzalino, Noemi; Alba, Liliana; Dain, Liliana
2013-01-01
Congenital adrenal hyperplasia due to 21-hydroxylase deficiency accounts for 90%-95% of cases. This autosomal recessive disorder has a broad spectrum of clinical forms, ranging from severe or classical, which includes the salt-wasting and simple virilizing forms, to the mild late onset or nonclassical form. Most of the disease-causing mutations described are likely to be the consequence of nonhomologous recombination or gene conversion events between the active CYP21A2 gene and its homologous CYP21A1P pseudogene. Nevertheless, an increasing number of naturally occurring mutations have been found. The change p.H62L is one of the most frequent rare mutations of the CYP21A2 gene. It was suggested that the p.H62L represents a mild mutation that may be responsible for a more severe enzymatic impairment when presented with another mild mutation on the same allele. In this report, a 20-year-old woman carrying an isolated p.H62L mutation in compound heterozygosity with c.283-13A/C>G mutation is described. Although a mildly nonclassical phenotype was expected, clinical signs and hormonal profile of the patient are consistent with a more severe simple virilizing form of 21-hydroxylase deficiency. The study of genotype-phenotype correlation in additional patients would help in defining the role of p.H62L in disease manifestation.
Fernández, Cecilia S; Bruque, Carlos D; Taboas, Melisa; Buzzalino, Noemí D; Espeche, Lucia D; Pasqualini, Titania; Charreau, Eduardo H; Alba, Liliana G; Ghiringhelli, Pablo D; Dain, Liliana
2015-09-01
The aim of the current study was to search for genetic variants in the CYP21A2 Z promoter regulatory region in patients with congenital adrenal hyperplasia due to 21-hydroxylase deficiency. Screening of the 10 most frequent pseudogene-derived mutations was followed by direct sequencing of the entire coding sequence, the proximal promoter, and a distal regulatory region in DNA samples from patients with at least one non-determined allele. We report three non-classical patients who presented a novel genetic variant, g.15626A>G, within the Z promoter regulatory region. In all the patients, the novel variant was found in cis with the mild, less frequent p.P482S mutation located in exon 10 of the CYP21A2 gene. The putative pathogenic implication of the novel variant was assessed by in silico analyses and in vitro assays. Topological analyses showed differences in the curvature and bendability of the DNA region bearing the novel variant. In functional studies, the G variant significantly decreased the activity of a reporter gene placed downstream from the regulatory region. Our results suggest that the activity of an allele bearing the p.P482S mutation may be influenced by the misregulated CYP21A2 transcriptional activity exerted by the Z promoter A>G variation.
Minutolo, Carolina; Nadra, Alejandro D; Fernández, Cecilia; Taboas, Melisa; Buzzalino, Noemí; Casali, Bárbara; Belli, Susana; Charreau, Eduardo H; Alba, Liliana; Dain, Liliana
2011-01-11
Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90-95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located together with a 98% nucleotide sequence identity CYP21A1P pseudogene, on chromosome 6p21.3. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease causing alleles were found in the last years. In the present work, we describe five CYP21A2 novel mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical and one salt wasting patients from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling and the protein design algorithm FoldX was used to calculate changes in stability of CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestation found in patients.
Fernández, Cecilia; Taboas, Melisa; Buzzalino, Noemí; Casali, Bárbara; Belli, Susana; Charreau, Eduardo H.; Alba, Liliana; Dain, Liliana
2011-01-01
Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency is the most frequent inborn error of metabolism, and accounts for 90–95% of CAH cases. The affected enzyme, P450C21, is encoded by the CYP21A2 gene, located together with a 98% nucleotide sequence identity CYP21A1P pseudogene, on chromosome 6p21.3. Even though most patients carry CYP21A1P-derived mutations, an increasing number of novel and rare mutations in disease causing alleles were found in the last years. In the present work, we describe five CYP21A2 novel mutations, p.R132C, p.149C, p.M283V, p.E431K and a frameshift g.2511_2512delGG, in four non-classical and one salt wasting patients from Argentina. All novel point mutations are located in CYP21 protein residues that are conserved throughout mammalian species, and none of them were found in control individuals. The putative pathogenic mechanisms of the novel variants were analyzed in silico. A three-dimensional CYP21 structure was generated by homology modeling and the protein design algorithm FoldX was used to calculate changes in stability of CYP21A2 protein. Our analysis revealed changes in protein stability or in the surface charge of the mutant enzymes, which could be related to the clinical manifestation found in patients. PMID:21264314
Karpe, Snehal D.; Jain, Rikesh; Brockmann, Axel; Sowdhamini, Ramanathan
2016-01-01
We developed a computational pipeline for homology based identification of the complete repertoire of olfactory receptor (OR) genes in the Asian honey bee species, Apis florea. Apis florea is phylogenetically the most basal honey bee species and also the most distant sister species to the Western honey bee Apis mellifera, for which all OR genes had been identified before. Using our pipeline, we identified 180 OR genes in A. florea, which is very similar to the number of ORs identified in A. mellifera (177 ORs). Many characteristics of the ORs including gene structure, synteny of tandemly repeated ORs and basic phylogenetic clustering are highly conserved. The composite phylogenetic tree of A. florea and A. mellifera ORs could be divided into 21 clades which are in harmony with the existing Hymenopteran tree. However, we found a few nonorthologous OR relationships between both species as well as independent pseudogenization of ORs suggesting separate evolutionary changes. Particularly, a subgroup of the OR gene clade XI, which had been hypothesized to code cuticular hydrocarbon receptors showed a high number of species-specific ORs. RNAseq analysis detected a total number of 145 OR transcripts in male and 162 in female antennae. Most of the OR genes were highly expressed on the female antennae. However, we detected five distinct male-biased OR genes, out of which three genes (AfOr11, AfOr18, AfOr170P) were shown to be male-biased in A. mellifera, too, thus corroborating a behavioral function in sex-pheromone communication. PMID:27540087